Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Claims 1-20 are pending and are being examined in this application.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes claim limitations that do not use the word “means” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because each limitation uses a generic placeholder coupled with functional language without reciting sufficient structure to perform the recited function, and the generic placeholder is not preceded by a structural modifier. These claim limitations are: “at least one index creation module configured to generate...”; “an index similarity scoring module configured to compare...”; “a candidate filtering module configured to reduce...”; and “an ensembling module configured to compile...” as recited in claim 1.
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid their being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid their being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 20 is rejected under 35 U.S.C. 101 because the claim encompasses non-statutory subject matter. The claim recites “at least one computer readable medium,” which is construed to cover both transitory and non-transitory media under the broadest reasonable interpretation consistent with the specification.[1] Transitory, propagating signals per se constitute non-statutory subject matter.[2] Because the full scope of the claim encompasses non-statutory subject matter, the claim as a whole is non-statutory.
It is suggested that claim 20 be amended to recite “at least one non-transitory computer readable medium comprising computer program instructions...” Appropriate correction is required.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-11 and 15-20 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Vadlamani et al. (US Pub. 20180357569).
Referring to claim 1, Vadlamani discloses an apparatus for determining whether one or more labelers from a set of candidate machine learning labelers should be combined with a target machine learning labeler, said apparatus comprising: 
at least one index creation module [fig. 7, processor 702] configured to generate an index for each of the candidate labelers and for the target labeler [par. 34-36; test outputs are generated for each of a plurality of classification models, and a golden set is generated by a Universal Human Relevance System (UHRS) or other entity (i.e., a target model)]; 
coupled to the at least one index creation module, an index similarity scoring module [fig. 7, processor 702] configured to compare the indices associated with each candidate labeler against the index associated with the target labeler, and to produce a similarity score for each candidate labeler [par. 36; the test outputs are compared against the golden set, and the classification models are scored based on resulting metrics such as precision, recall, and F1 statistics]; 
coupled to the index similarity scoring module, a candidate filtering module [fig. 7, processor 702] configured to reduce the set of candidate labelers based upon their similarity scores, thereby producing a set of filtered candidate labelers [par. 36; the best classification models are selected based on the scores]; and 
coupled to the candidate filtering module and to the target labeler, an ensembling module [fig. 7, processor 702] configured to compile an ensemble of labelers comprising the target labeler and the filtered labelers [pars. 34-36; the raw data is classified by combining the golden set generated by the target model with the results of the selected classification models].
Referring to claim 2, Vadlamani discloses wherein at least one index creation module uses a topic-modeling based method to generate indices [par. 38; note classification into topics].
Referring to claim 3, Vadlamani discloses wherein at least one index creation module uses a probabilistic approach to generate indices, said approach comparing the probability for a given label versus an alternative label [par. 38; note confidence scores].
Referring to claim 4, Vadlamani discloses wherein the index similarity scoring module is configured to detect an under-addressed sub-domain associated with a candidate labeler [par. 38; note use of more than one classification model to address overlapping or mutually exclusive classes].
Referring to claim 5, Vadlamani discloses wherein the index similarity scoring module automatically adds a new labeler to the ensemble to compensate for the under-addressed sub-domain [par. 38; note the use of more than one classification model to address overlapping or mutually exclusive classes].
Referring to claim 6, Vadlamani discloses wherein the index similarity scoring module is configured to define a specification for a new labeler that will compensate for the under-addressed sub-domain [pars. 38, 45, and 46; a domain or class taxonomy (i.e., classification or specification of a classification model) may be defined by a client, a classification engine, or human annotators].
Referring to claim 7, Vadlamani discloses wherein the specification is used by a human curator to obtain relevant datasets to generate labelers from said datasets [pars. 38, 45, and 46; a domain or class taxonomy (i.e., classification or specification of a classification model) may be defined by a client, a classification engine, or human annotators].
Referring to claim 8, Vadlamani discloses wherein the specification is used to drive an automated crawler or search engine to find appropriate data and then to generate an appropriate labeler from said data [pars. 38, 45, and 46; a domain or class taxonomy (i.e., classification or specification of a classification model) may be defined by a client, a classification engine, or human annotators].
Referring to claim 9, Vadlamani discloses a method for creating an ensemble of machine learning labelers, said method comprising the steps of: 
selecting a set of candidate labelers associated with a dataset in an existing archive [par. 33; a repository includes multiple classification models that have already been computed to conduct classification on at least one set of raw data (i.e., existing data)]; 
generating an index for each candidate labeler [pars. 34-36; test outputs are generated for each of the classification models]; 
selecting at least one target labeler from a new dataset; generating an index for said target labeler [pars. 34-36; a golden set is generated by a Universal Human Relevance System (UHRS) or other entity (i.e., a target model); the golden set represents a set of true classifications about a subset of raw data to be classified (i.e., new raw data)]; and 
comparing the indices of each candidate labeler against the index for the target labeler, thereby producing a similarity score for each candidate labeler [par. 36; the test outputs are compared against the golden set, and the classification models are scored based on resulting metrics such as precision, recall, and F1 statistics].
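For illustration only, the kind of scoring cited above (test outputs compared against a golden set and scored by precision, recall, and F1) could be sketched as follows. This is a minimal sketch under stated assumptions, not a characterization of Vadlamani's implementation or of the claimed index comparison: labels are assumed to be binary, and the names precision_recall_f1, score_candidates, test_outputs, and golden are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def precision_recall_f1(
    predicted: Sequence[int], golden: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted labels against a golden set."""
    if len(predicted) != len(golden):
        raise ValueError("predicted and golden must have the same length")
    pairs = list(zip(predicted, golden))
    tp = sum(1 for p, g in pairs if p == positive and g == positive)
    fp = sum(1 for p, g in pairs if p == positive and g != positive)
    fn = sum(1 for p, g in pairs if p != positive and g == positive)
    # Guard against division by zero when a model predicts no positives
    # or the golden set contains none.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def score_candidates(
    test_outputs: Dict[str, Sequence[int]], golden: Sequence[int]
) -> Dict[str, float]:
    """Assign each candidate model one score (here, its F1 against the golden set)."""
    return {
        name: precision_recall_f1(predicted, golden)[2]
        for name, predicted in test_outputs.items()
    }
```

Using F1 as the single per-candidate score is an assumption made for brevity; any of the metrics named in the cited paragraph could be substituted.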
Referring to claim 10, Vadlamani discloses producing a subset of high scoring labelers from among the set of candidate labelers [par. 36; the best classification models are selected based on the scores]; and combining the high scoring labelers with the target labeler to produce a labeling ensemble [pars. 34-36; the new raw data is classified by combining the golden set generated by the target model with the results of the selected classification models].
Referring to claim 11, Vadlamani discloses wherein the scoring is based on a configured similarity threshold [par. 36; note that the best classification models are selected based on configured thresholds].
Referring to claim 15, Vadlamani discloses wherein the new dataset, the target labeler, and the target index are added to the existing archive [par. 41; note storing of classified output and methods and models used to produce the classified output].
Referring to claim 16, Vadlamani discloses wherein the index similarity scoring module detects an under-addressed sub-domain associated with a candidate labeler [par. 38; note use of more than one classification model to address overlapping or mutually exclusive classes].
Referring to claim 17, Vadlamani discloses wherein the index similarity scoring module automatically adds a labeler to the ensemble to compensate for the under-addressed sub-domain [par. 38; note the use of more than one classification model to address overlapping or mutually exclusive classes].
Referring to claim 18, Vadlamani discloses wherein the index similarity scoring module defines a specification for a new labeler that will compensate for the under-addressed sub-domain [pars. 38, 45, and 46; a domain or class taxonomy (i.e., classification or specification of a classification model) may be defined by a client, a classification engine, or human annotators].
Referring to claim 19, Vadlamani discloses wherein the index generating step comprises a combination of the following two methods to generate indices: a topic-modeling based method [par. 38; note classification into topics]; and a probabilistic method wherein the probability that a candidate labeler will produce a given label is compared with the probability that the candidate labeler will produce an alternative label [par. 38; note confidence scores].
Referring to claim 20, see at least the rejection for claim 9. Vadlamani further discloses at least one computer readable medium comprising computer program instructions for creating an ensemble of machine learning labelers, said instructions comprising the claimed steps [fig. 7, processor 702, memory 704, instructions 724].

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Vadlamani in view of Barthur (US Pub. 20200250477).
Referring to claim 12, Vadlamani does not appear to explicitly disclose wherein high scoring candidate labelers are further filtered on a Top-N basis as an upper limit, while still meeting the configured similarity threshold.
However, Barthur discloses wherein high scoring candidate labelers are further filtered on a Top-N basis as an upper limit, while still meeting the configured similarity threshold [pars. 79 and 80; performance metrics (e.g., a number of true positives) are determined for each machine learning model, and a top tier (e.g., top 5) of the machine learning models is selected].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the selecting of classification models taught by Vadlamani so that the classification models are selected only if they are in the top tier as taught by Barthur. The motivation for doing so would have been to limit the number of classification models selected to improve processing time.
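For illustration only, the combination discussed above (a configured threshold applied first, with a Top-N cap as an upper limit) could be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from Vadlamani or Barthur: the function name filter_candidates and its parameters are hypothetical, and ties are broken alphabetically by model name purely to keep the result deterministic.

```python
from typing import Dict, List


def filter_candidates(
    scores: Dict[str, float], threshold: float, top_n: int
) -> List[str]:
    """Keep models scoring at or above threshold, capped at the top_n best."""
    if top_n < 0:
        raise ValueError("top_n must be non-negative")
    # Step 1: every surviving model must meet the configured threshold.
    passing = [(name, score) for name, score in scores.items() if score >= threshold]
    # Step 2: order by descending score (name as tie-breaker) and apply the cap.
    passing.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in passing[:top_n]]
```

Because the threshold is applied before the cap, fewer than top_n models are returned whenever fewer than top_n meet the threshold, which is the sense in which Top-N acts only as an upper limit.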

Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Vadlamani in view of Fukushima et al. (US Pub. 20180189242).
Referring to claim 13, Vadlamani does not appear to explicitly disclose wherein the combining step comprises a majority vote method, wherein the same example input data is presented to each candidate labeler, with the candidate labeler associated with the most common predicted label being selected for inclusion in the ensemble.
However, Fukushima discloses wherein the combining step comprises a majority vote method, wherein the same example input data is presented to each candidate labeler, with the candidate labeler associated with the most common predicted label being selected for inclusion in the ensemble [par. 188; classification models are selected based on the largest (i.e., most common) judgment result determined by majority vote].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the selecting of classification models taught by Vadlamani so that the classification models are selected using majority vote as taught by Fukushima. The motivation for doing so would have been to improve the correct answer rate [Fukushima, par. 188].
Referring to claim 14, Fukushima discloses wherein the majority vote method is modified by weighting votes for candidate labelers based upon at least one of the following two criteria: confidence scores or sub-domain relevance; abstention of votes for low-confidence predictions by individual candidate labelers [par. 188; the classification models are selected in descending order of identification rate until a probability of the judgment result being a correct answer reaches target performance].
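For illustration only, the limitations recited in claims 13 and 14 (a majority vote over the same input, votes weighted by confidence, and abstention for low-confidence predictions) could be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or Fukushima's implementation: each labeler is assumed to return one (label, confidence) pair, the weight is taken to be the confidence itself, and the names weighted_vote, predictions, and min_confidence are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def weighted_vote(
    predictions: Dict[str, Tuple[str, float]], min_confidence: float = 0.5
) -> Tuple[Optional[str], List[str]]:
    """Return the winning label and the labelers that voted for it.

    predictions maps a labeler name to its (label, confidence) for one input.
    """
    totals: Dict[str, float] = defaultdict(float)
    for label, confidence in predictions.values():
        if confidence < min_confidence:
            continue  # low-confidence prediction: the labeler abstains
        totals[label] += confidence  # vote weighted by confidence
    if not totals:
        return None, []  # every labeler abstained
    # Iterating over sorted labels makes ties resolve alphabetically.
    winner = max(sorted(totals), key=lambda label: totals[label])
    supporters = sorted(
        name
        for name, (label, confidence) in predictions.items()
        if label == winner and confidence >= min_confidence
    )
    return winner, supporters
```

The returned list of supporters corresponds to the candidate labelers associated with the most common predicted label, which claim 13 recites as the labelers selected for inclusion in the ensemble.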

Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Odaibo et al. (US Pub. 20190043193) discloses weighting and ranking of an ensemble of machine learning models.
Wang et al. (US Pub. 20080249762) discloses using several different models to classify documents. 


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GRACE PARK whose telephone number is (571) 270-7727.  The examiner can normally be reached on M-F 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JAMES TRUJILLO, can be reached at (571) 272-3677. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at (866) 217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786-9199 (IN USA OR CANADA) or (571) 272-1000.

/Grace Park/
Primary Examiner, Art Unit 2157

[1] Official Gazette Notice 1351 OG 212, dated February 23, 2010, states “the broadest reasonable interpretation of a claim drawn to a computer readable medium…typically covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of computer readable media, particularly when the specification is silent.”
[2] See In re Nuijten, 500 F.3d 1346, 1356-57 (Fed. Cir. 2007) (transitory embodiments are not directed to statutory subject matter) and Interim Examination Instructions for Evaluating Subject Matter Eligibility Under 35 U.S.C. § 101, Aug. 24, 2009; p. 2.