DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 16 is objected to because of the following informalities: Lines 1-2 recite “the combined training subsets… has a number NA”. The word “has” in this phrase should be “have” for grammatical correctness. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-28 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation “the error subset ER(i)” in line 16. The antecedent basis for this limitation in the claim is unclear, since the error subset ER(i) in line 7 is defined where i=1 and the error subset ER(i) in line 15 is defined where i would be incremented to a value of 2 or more. A way to make this clearer would be to call the error subset ER(i) where i=1 in line 7 “a first error subset ER(i)”, similar to how the evaluation subset where i=1 is called a “first evaluation subset SE(i)”. The language “the error subset ER(i)” also occurs in claim 3 and claim 28, although it is noted that the suggested fix to claim 1 would also fix claim 3 even if claim 3 were not amended. A fix similar to that suggested for claim 1 would be appropriate for line 8 of claim 28.
Claim 15, line 1 recites “the number N2” and claim 15, lines 1-2 recite “the number N1”. There is insufficient antecedent basis for these limitations in the claim. To correct this, the dependence of claim 15 could be changed so that the claim is dependent on claim 14 rather than claim 1.
Claim 25, line 4 and claim 26, lines 2-3 recite “the error subsets ER(i)”. There is insufficient antecedent basis for these limitations in the claim. To correct this, applicant could delete “the” and define error subsets ER(i) for i between 1 and i inclusive, similar to how training subsets are defined in claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 2, 27, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (U.S. Publication 2020/0027207) in view of Merhav (U.S. Publication 2017/0300811), Paul (U.S. Publication 2022/0215541), and McLane (U.S. Publication 2019/0007433). 

As to claim 1, Zhang discloses a method for generating a classification model to classify objects in a plurality of categories using a training data set S of objects, comprising one or more programmed computers: for an index i=1, accessing a first training subset ST(i) including some of the objects in the training data set S (p. 2, sections 0031-0034; a number of training examples are used to train a model to identify objects of a number of categories, such as lesions and background); training a first model M(i) using the first training subset ST(i) (p. 2, sections 0031-0034; it is a detection/classification model that is being trained); using the first model M(i) to classify a first evaluation subset SE(i) of the training data set S, and identifying an error subset ER(i) of objects in the first evaluation subset SE(i) classified erroneously (p. 2, sections 0031-0034; a subset of objects that were identified as lesions but not actually lesions are identified); (a) incrementing the index i, and accessing another training subset ST(i) including some of the objects in error subset ER(i−1) (p. 2, sections 0031-0034; in a subsequent iteration, which would read on an iteration with an incremented index, training is performed using these incorrectly classified objects); (b) training a model M(i) (p. 2, sections 0031-0034; it is a detection/classification model that is being trained iteratively; for a second iteration, this would read on M(2), for a third, M(3), etc.); (c) using the model M(i) to classify an evaluation subset SE(i) of the training data set S, and identifying an error subset ER(i) of objects in the evaluation subset SE(i) classified erroneously (p. 2, sections 0031-0034; the method is performed iteratively, such that each iteration would have samples classified and have an error subset identified as described above).
Zhang does not disclose, but Merhav does disclose using a combination of the training subsets ST(i), for i between 1 and i inclusive (p. 9, section 0101; the error/loss function uses a combination of subsets of current and previous samples). The motivation for this is to extend the loss/error function and obtain extreme samples that may or may not all be in a single batch (p. 8-9, section 0096). It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang to use a combination of training subsets in order to extend the loss/error function and obtain extreme samples that may or may not all be in a single batch as taught by Merhav.
Zhang does not disclose, but Paul does disclose excluding the first training subset ST(i) and excluding the training subsets ST(i), for i between 1 and i inclusive (p. 8, section 0234; any evaluation images are excluded from the training process). While not explicitly stated, the use of evaluation images for training would be known to have a disadvantage, since overfitting of the data could not be properly tested. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang and Merhav to use evaluation subsets that exclude the training subsets as taught by Paul in order to avoid situations where overfitting data cannot be tested.
Zhang does not disclose, but McLane does disclose (d) evaluating the error subset ER(i) to estimate performance of the model M(i), and if performance is satisfactory, save model M(i), and that if performance is not satisfactory, then repeat steps (a) to (d) (p. 5, section 0049; if a performance threshold is not met, another iteration of training is performed). The motivation for this is to iteratively reduce error. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, and Paul to save a model if performance is satisfactory and repeat training steps if performance is not satisfactory in order to iteratively reduce error as taught by McLane.
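For illustration only, the iterative procedure to which the references are mapped above can be sketched in Python as follows. This is a minimal sketch and is not taken from the application or from any cited reference. The names iterative_training and train_majority, the (feature, label) sample representation, the toy lookup model, and the choice to carry half of each error subset into the next training subset are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

Sample = Tuple[int, int]  # (feature, true label); hypothetical representation


def train_majority(samples: Sequence[Sample]) -> Callable[[int], int]:
    """Toy stand-in for training: memorize seen features, else predict the majority label."""
    table = dict(samples)
    default = Counter(label for _, label in samples).most_common(1)[0][0]
    return lambda feature: table.get(feature, default)


def iterative_training(first_subset, eval_pool, max_errors, max_iters=10):
    """Train, evaluate on held-out objects, and retrain on misclassified objects."""
    combined = list(first_subset)  # ST(1)
    model, errors = None, []
    for i in range(1, max_iters + 1):
        # M(i) is trained on the combination of ST(1)..ST(i).
        model = train_majority(combined)
        seen = set(combined)
        # SE(i) excludes every object already used for training.
        evaluation = [s for s in eval_pool if s not in seen]
        # ER(i): objects in SE(i) that M(i) classifies erroneously.
        errors = [(x, y) for x, y in evaluation if model(x) != y]
        if len(errors) <= max_errors:  # step (d): performance is satisfactory
            return model, i, errors
        # Step (a): the next training subset takes some of ER(i); half is an assumed choice.
        combined.extend(errors[: max(1, len(errors) // 2)])
    return model, max_iters, errors
```

In this sketch the returned tuple holds the saved model, the index i at which the loop stopped, and the final error subset.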

As to claim 2, McLane discloses wherein said evaluating includes determining a number of objects erroneously classified, and comparing the number to a threshold (p. 5, section 0049).

As to claim 27, see the rejection to claim 1. Further, Zhang discloses a computer system, comprising one or more processors including or having access to memory storing a classification engine trained according to the method (p. 5, section 0081-p. 6, section 0085).

As to claim 28, see the rejection to claim 1. Further, Zhang discloses a computer program product comprising: non-transitory computer readable memory, storing a computer program including logic to execute a procedure (p. 5, section 0081-p. 6, section 0085).

Claims 3-5, 7-13, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Merhav, Paul, and McLane and further in view of Zhang ‘691 (U.S. Patent 11,182,691).

As to claim 3, Zhang does not disclose but Zhang ‘691 does disclose wherein said evaluating includes determining a number of objects erroneously classified in the error subset ER(i), and comparing the number with a number of objects erroneously classified in previous error subset ER(i−1) (col. 62, lines 26-62; an evaluation takes place to determine if a number of erroneously classified objects is decreasing). The motivation for this is to determine if further iterations will improve results or if training must be concluded. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to determine a number of objects erroneously classified in the error subset ER(i), and compare the number with a number of objects erroneously classified in previous error subset ER(i−1) in order to determine if further iterations will improve results or if training must be concluded as taught by Zhang ‘691.
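The two evaluation tests discussed for claims 2 and 3 (comparing an error count to a threshold, and comparing it to the error count of the previous iteration) can be illustrated with a short sketch. The function name should_stop and its parameters are hypothetical, and treating a non-decreasing error count as a reason to conclude training is an assumption made for the example, not a statement of what any reference requires.

```python
from typing import Optional


def should_stop(num_errors: int, threshold: int,
                prev_num_errors: Optional[int] = None) -> bool:
    """Return True when training may conclude under either illustrative test."""
    # Claim 2 style test: the number of erroneously classified objects meets a threshold.
    if num_errors <= threshold:
        return True
    # Claim 3 style test: the count is no longer decreasing relative to ER(i-1).
    if prev_num_errors is not None and num_errors >= prev_num_errors:
        return True
    return False
```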

As to claim 4, Zhang does not expressly disclose but Zhang ‘691 does disclose wherein the first training subset ST(i) where i=1, includes 10% or less, of the objects in the training data set S (col. 61, line 66-col. 62, line 25; 1% of a training set is used in the first training iteration). While not explicitly disclosed in the reference, the motivation for using a small portion of a training set initially would be known in the art to be that processing time for the subset can be reduced compared to using a larger subset. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to have the first training subset ST(i) where i=1, include 10% or less, of the objects in the training data set S as taught by Zhang ‘691 in order to have processing time for the subset be reduced compared to using a larger subset.

As to claim 5, Zhang ‘691 discloses wherein the first training subset ST(i) where i=1 includes 1% or less, of the objects in the training data set S (col. 61, line 66-col. 62, line 25; 1% of a training set is used in the first training iteration). Motivation for the combination is given in the rejection to claim 4.

As to claim 7, Zhang does not disclose, but Zhang ‘691 does disclose segmenting the training data set S into a plurality of blocks of training data, and wherein said first training subset ST(1) is accessed from a first block of the plurality of blocks, and the first evaluation subset includes some or all of a second block of the plurality of blocks, and excludes the first block (col. 47, line 41-col. 48, line 56; col. 53, lines 7-22; the training data is segmented into chunks/blocks, with training data including subsets for iterations being chunks 7, 2, 4, 5, 9, 1, 10, and 8 and subsets for test/evaluation being chunks 3 and 6; for a first iteration, a portion of chunk C1 is used). The motivation for this is to allow work on a large total data set without reading or writing a prohibitive amount of data for each operation (col. 35, lines 7-20). It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to segment the training data set S into a plurality of blocks of training data, wherein said first training subset ST(1) is accessed from a first block of the plurality of blocks, and wherein the first evaluation subset includes some or all of a second block of the plurality of blocks, and excludes the first block in order to allow work on a large total data set without reading or writing a prohibitive amount of data for each operation as taught by Zhang ‘691.

As to claim 8, Zhang ‘691 discloses wherein the first and second blocks have uniform sizes (col. 36, lines 11-32; each chunk/block can have the same size for all requests in one embodiment of the invention). Motivation for the combination of references is given in the rejection to claim 7.

As to claim 9, Zhang does not disclose, but Zhang ‘691 does disclose segmenting the training data set S into a plurality of blocks of training data having uniform sizes (col. 36, lines 11-32; each chunk/block can have the same size for all requests in one embodiment of the invention), and wherein the training subset ST(i) for a given value of i, is accessed from a different block in the plurality of blocks than the evaluation subset SE(i) for the given value of i (col. 47, line 41-col. 48, line 56; col. 53, lines 7-22; the training data is segmented into chunks/blocks, with training data including subsets for iterations being chunks 7, 2, 4, 5, 9, 1, 10, and 8 and subsets for test/evaluation being chunks 3 and 6). Motivation for the combination of references is given in the rejection to claim 7.
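As a hedged illustration of the block segmentation discussed for claims 7 through 9, the sketch below splits a data set into uniform-size blocks and draws the training and evaluation subsets from different blocks. The function name segment_into_blocks is hypothetical, and discarding a trailing partial block to keep the sizes uniform is an assumption made for the example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def segment_into_blocks(items: Sequence[T], block_size: int) -> List[List[T]]:
    """Split items into consecutive blocks of exactly block_size objects each."""
    # A trailing remainder shorter than block_size is dropped (assumed behavior).
    last_start = len(items) - block_size + 1
    return [list(items[k:k + block_size]) for k in range(0, last_start, block_size)]


blocks = segment_into_blocks(list(range(10)), 3)
training_subset = blocks[0]    # ST(1) accessed from a first block
evaluation_subset = blocks[1]  # SE(1) taken from a second block, excluding the first
```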

As to claim 10, Zhang does not disclose, but Zhang ‘691 does disclose including determining a distribution of objects over the plurality of categories in the training set, and said training set is segmented so that some or all of the blocks in the plurality of blocks have the determined distribution (col. 59, line 50-col. 60, line 4; sampling is performed equally at chunk/block level for multiple chunks/blocks; the result of this is that after sampling, the chunks/blocks would have the same distribution of categories in the sample groups; for example, the unlabeled data set in each chunk is sampled at 10%). Motivation for the combination of references is given in the rejection to claim 7.

As to claim 11, Zhang does not disclose, but Zhang ‘691 does disclose accessing a database including objects classified according to the plurality of categories and filtering the database as a function of the plurality of categories to produce the training set S (col. 55, line 56-col. 56, line 34; col. 57, line 5-col. 58, line 14; sampling/filtering to create a training set is done as a function of each category). The motivation for this is to compensate for imbalances in the data set and make sure each category is adequately represented so as to improve accuracy (col. 54, line 59-col. 55, line 5). It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to access a database including objects classified according to the plurality of categories and filter the database as a function of the plurality of categories to produce the training set S in order to compensate for imbalances in the data set and make sure each category is adequately represented so as to improve accuracy as taught by Zhang ‘691.

As to claim 12, Zhang ‘691 discloses wherein said filtering includes setting a maximum limit on a number of objects classified in a given category accessed for inclusion in the training set S (col. 55, line 56-col. 56, line 34; col. 57, line 5-col. 58, line 14; a percentage is set for each category, this percentage would correspond to a particular number of objects and correspond to a maximum and minimum number that can be included in the training set for that category). Motivation for the combination of references is given in the rejection to claim 11.

As to claim 13, Zhang ‘691 discloses wherein said filtering includes setting a minimum limit on a number of objects classified in a given category accessed for inclusion in the training set S (col. 55, line 56-col. 56, line 34; col. 57, line 5-col. 58, line 14; a percentage is set for each category, this percentage would correspond to a particular number of objects and correspond to a maximum and minimum number that can be included in the training set for that category; also a minimum population constraint can be defined as a number, such as 1000). Motivation for the combination of references is given in the rejection to claim 11.
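The category-based filtering discussed for claims 11 through 13 can be illustrated with the following sketch. The function name filter_by_category and the (object, category) record format are hypothetical. Reading the minimum limit as a reason to leave out a category that has too few objects is only one possible interpretation and is assumed here for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Record = Tuple[str, str]  # (object identifier, category); hypothetical format


def filter_by_category(records: Sequence[Record], min_per_category: int,
                       max_per_category: int) -> List[Record]:
    """Build a training set with per-category minimum and maximum limits."""
    by_category: Dict[str, List[str]] = defaultdict(list)
    for obj, category in records:
        by_category[category].append(obj)
    training_set: List[Record] = []
    for category, objs in by_category.items():
        if len(objs) < min_per_category:
            continue  # assumed: categories below the minimum are not included
        # Keep at most max_per_category objects from this category.
        training_set.extend((obj, category) for obj in objs[:max_per_category])
    return training_set
```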

As to claim 18, Zhang does not disclose but Zhang ‘691 does disclose wherein accessing another training subset ST(i), for i>1, including some of the objects in the error subset ER(i−1) includes accessing a target number of the objects in the error subset ER(i−1) without regard to categories for inclusion in the training subset (col. 62, lines 16-62; the target number of error subset objects accessed for a next training subset is based on a weighted sampling; error objects are more likely to be selected than non-error objects; in some embodiments, this is also done taking into account a category ratio, and thus in other embodiments, it is done without regard to a category ratio). The motivation for this is that training using observation records for which large prediction errors occurred may enhance the prediction accuracy of subsequent iterations to a greater extent than training using observation records for which the model has already been able to make good predictions. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to have accessing another training subset ST(i), for i>1, including some of the objects in the error subset ER(i−1) include accessing a target number of the objects in the error subset ER(i−1) without regard to categories for inclusion in the training subset in order to enhance prediction accuracy as taught by Zhang ‘691.

As to claim 19, Zhang does not disclose but Zhang ‘691 does disclose wherein accessing another training subset ST(i), for i>1, including some of the objects in the error subset ER(i−1) includes accessing objects so that for each category in the plurality of categories no more than a maximum number M of objects classified erroneously for each category are included in the training subset (col. 55, line 56-col. 56, line 34; col. 57, line 5-col. 58, line 14; col. 62, lines 16-62; the target number of error subset objects accessed for a next training subset is based on a weighted sampling; error objects are more likely to be selected than non-error objects; in some embodiments, this is also done taking into account a category ratio; a percentage is set for each category, this percentage would correspond to a particular number of objects and correspond to a maximum and minimum number that can be included in the training set for that category; also a minimum population constraint can be defined as a number, such as 1000). Motivation for the combination of references is given in the rejections to claims 11 and 18. 

As to claim 20, Zhang does not disclose but Zhang ‘691 does disclose wherein accessing another training subset ST(i), for i>1, including some of the objects in the error subset ER(i−1) includes accessing objects so that for each category in the plurality of categories at least a minimum number M of objects classified erroneously for each category are included in the training subset (col. 55, line 56-col. 56, line 34; col. 57, line 5-col. 58, line 14; col. 62, lines 16-62; the target number of error subset objects accessed for a next training subset is based on a weighted sampling; error objects are more likely to be selected than non-error objects; in some embodiments, this is also done taking into account a category ratio; a percentage is set for each category, this percentage would correspond to a particular number of objects and correspond to a maximum and minimum number that can be included in the training set for that category; also a minimum population constraint can be defined as a number, such as 1000). Motivation for the combination of references is given in the rejections to claims 11 and 18.
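To make the distinction between the selections discussed for claims 18 and 19 concrete, the sketch below contrasts taking a target number of error objects without regard to category against taking error objects subject to a per-category maximum. The function names take_target and take_capped are hypothetical. The sketch selects objects in list order for simplicity, which is an assumption; it does not reproduce the weighted sampling described for Zhang ‘691.

```python
from collections import Counter
from typing import List, Sequence, Tuple

ErrorObject = Tuple[str, str]  # (object identifier, category); hypothetical format


def take_target(errors: Sequence[ErrorObject], target: int) -> List[ErrorObject]:
    """Take up to a target number of error objects, ignoring categories."""
    return list(errors[:target])


def take_capped(errors: Sequence[ErrorObject], max_per_category: int) -> List[ErrorObject]:
    """Take error objects so that no category contributes more than the maximum."""
    counts: Counter = Counter()
    chosen: List[ErrorObject] = []
    for obj, category in errors:
        if counts[category] < max_per_category:
            chosen.append((obj, category))
            counts[category] += 1
    return chosen
```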

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Merhav, Paul, and McLane and further in view of Oh (KR 20200045023A, herein represented by a translation).

As to claim 6, Zhang does not disclose, but Oh does disclose wherein the training subset ST(i) for i=2, includes less than one half of the objects in the error subset ER(1) (p. 5, 30% of an error subset is selected for learning in a second iteration of training). The motivation for this is to avoid overfitting (p. 4). It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to have the second training subset include less than one half of error subset objects in order to avoid overfitting as taught by Oh.

Claims 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Merhav, Paul, and McLane and further in view of Packes (U.S. Publication 2015/0242747).

As to claim 14, Zhang does not disclose, but Packes does disclose wherein the training subset ST(i), for i=1, has a number N1 of objects, and the training subset ST(i), for i=2, has a number N2 of objects, and the number N2 is between 50% and 3% of the number N1 (p. 21-23, section 0120; the number of objects in the second training subset is a predetermined percentage of the first, for example 20%). The motivation for this is to retrain a network and reduce errors. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to use a second training subset of 20% of first training subset objects in order to retrain a network and reduce errors as taught by Packes.

As to claim 15, Packes discloses wherein the number N2 is between 20% and 5% of the number N1 (p. 21-23, section 0120; the number of objects in the second training subset is a predetermined percentage of the first, for example 20%). Motivation for the combination is given in the rejection to claim 14.

As to claim 16, Packes discloses wherein the combined training subsets ST(i), for i between 1 and A−1 inclusive has a number NA of objects; and the training subset ST(i), for i=A, has a number NB of objects, and the number NB is between 50% and 3% of the number NA (p. 21-23, section 0120; the number of objects in the second training subset, and each further retraining subset is a predetermined percentage of the first, for example 20%; since each object of the second and further retraining subset is also a member of the first training subset, the number of objects in second and further training subsets would be 20% of the objects in the total combined training subsets). Motivation for the combination is given in the rejection to claim 14.

As to claim 17, Packes discloses wherein the number NB is between 20% and 5% of the number NA (p. 21-23, section 0120; the number of objects in the second training subset, and each further retraining subset is a predetermined percentage of the first, for example 20%; since each object of the second and further retraining subset is also a member of the first training subset, the number of objects in second and further training subsets would be 20% of the objects in the total combined training subsets). Motivation for the combination is given in the rejection to claim 14.
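The percentage ranges discussed for claims 14 through 17 reduce to a simple arithmetic check, sketched below. The function name within_percent_range and the sample counts are hypothetical. Treating the range endpoints as inclusive is an assumption made for the example only and is not a statement about how the claimed ranges should be construed.

```python
def within_percent_range(n_small: int, n_base: int, low_pct: int, high_pct: int) -> bool:
    """Check whether n_small lies between low_pct% and high_pct% of n_base (inclusive)."""
    # Integer arithmetic avoids floating-point rounding at the range endpoints.
    return low_pct * n_base <= 100 * n_small <= high_pct * n_base


n1 = 1000      # hypothetical size of the first training subset
n2 = n1 // 5   # a second subset sized at 20% of the first
in_claim_14_range = within_percent_range(n2, n1, 3, 50)
in_claim_15_range = within_percent_range(n2, n1, 5, 20)
```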

Claims 23 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Merhav, Paul, and McLane and further in view of Chao (U.S. Publication 2020/0089130).

As to claim 23, Zhang does not disclose but Chao does disclose wherein the objects in the training set include images of defects on integrated circuit assemblies sensed in an integrated circuit fabrication process, the defects including a plurality of categories of defect (p. 2, section 0015; p. 2, section 0017; p. 4, section 0034; p. 5, section 0038; images of IC defects are used to train a model to detect and classify the defects by type in a fabrication process). The motivation for this is to improve efficiency in an otherwise time-consuming process (p. 1, section 0003). It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to have the objects in the training set include images of defects on integrated circuit assemblies sensed in an integrated circuit fabrication process, the defects including a plurality of categories of defect in order to improve efficiency in an otherwise time-consuming process as taught by Chao.

As to claim 24, Zhang does not disclose, but Chao does disclose a method including applying the saved model M(i) in an inference engine to detect and classify defects in an integrated circuit fabrication process (p. 2, section 0015; p. 2, section 0017; p. 5, section 0035; p. 7, section 0048; the model with modified parameters is used to detect and classify defects). Motivation to combine the references is given in the rejection to claim 23. 

Claims 25 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Merhav, Paul, and McLane and further in view of Beck (U.S. Patent 10,650,929).

As to claim 25, Zhang does not disclose but Beck does disclose a method including executing a user interface providing interactive tools to display information about categories of objects in the training data set S, to set parameters for configuring the training data set S, and to set parameters for accessing the training subsets ST(i) from the error subsets ER(i) (col. 13, line 31-col. 14, line 34; a user can select categories and see sets of training images in them; parameters for accessing training sets from error sets are set by a user by annotating, verifying, or performing quality control; error images are then accessed as training images for retraining by the network). The motivation for this is to improve the ability of the statistical model to make accurate predictions. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify Zhang, Merhav, Paul, and McLane to provide interactive tools to display information about categories of objects in the training data set S, to set parameters for configuring the training data set S, and to set parameters for accessing the training subsets ST(i) from the error subsets ER(i) in order to improve the ability of the statistical model to make accurate predictions as taught by Beck.

As to claim 26, Beck discloses wherein the user interface provides interactive tools to display information about categories of objects in the training subsets ST(i), and about objects in the error subsets ER(i) (col. 13, line 51-col. 14, line 34; a user can select categories and see sets of training images in them; the user can annotate images adding them to error subsets which would include displaying information about these images). Motivation for the combination of references is given in the rejection to claim 25. 

Conclusion
Claims 21 and 22 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

As to claim 21, Zhang ‘691 discloses wherein accessing training subsets includes accessing objects so that for each category in the plurality of categories at least a minimum number M of objects classified erroneously for each category are included in the training subset, as discussed in the rejection to claim 20. However, Zhang ‘691 would not appear to disclose adding objects from one or more of the error subset or error subsets ER(i), for i=i−2 to 1, to establish the minimum number M in the given category in the training subset along with the other limitations of claim 21.

As to claim 22, Zhang ‘691 discloses wherein accessing another training subset ST(i), for i>1, including some of the objects in the error subset ER(i−1) includes accessing a part of a target number of the objects in the error subset ER(i−1) without regard to categories (assuming a target number is zero or one), or accessing a balance of the target number (assuming a target number is zero or one) so that for each category no more than a maximum number M of objects classified erroneously for each category are included in the balances of the target number, and including said part and said balance in the training subset, as discussed in the rejections to claims 18 and 19. However, Zhang ‘691 would not appear to disclose using the target number of objects such that the method would involve both accessing without regard to categories and accessing with regard to a maximum number, as required by claim 22, along with the other limitations of claim 22.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AARON M RICHER whose telephone number is (571)272-7790. The examiner can normally be reached 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached at (571) 272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AARON M RICHER/Primary Examiner, Art Unit 2612