DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to the Amendment filed 09/16/2022.
No priority date is claimed.  Therefore, the effective filing date of this application is 04/13/2020.
Claims 1, 11, and 19 have been amended, and claims 5 and 15 have been canceled.  Currently, claims 1-4, 6-14, and 16-20 are pending.

Information Disclosure Statement

The Information Disclosure Statement (IDS) filed by Applicant on 09/12/2022 has been considered.  A copy of the considered IDS as initialed, signed and dated by Examiner is included with this Office action.  

Response to Amendment

The amendments to claim 11 are effective to overcome the 35 U.S.C. 101 rejection of claims 11-18 presented in the previous Office action.  Therefore, the previous 35 U.S.C. 101 rejection of claims 11-18 has been withdrawn.

Response to Arguments

Applicant’s arguments with respect to independent claims 1, 11 and 19 (see Remarks, pages 7-8) have been considered but are moot in view of new ground(s) of rejection in view of Zhou et al. (CN 109657793, Publication date 04/19/2019).

Claim Objections

Claim 1 is objected to because of the following informalities:  

Regarding claim 1, the comma (,) at the end of line 7 should be a semicolon (;).

Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1-4, 6-14, and 16-18 (effective filing date 04/13/2020) are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (U.S. Publication No. 2019/0370955, Publication date 12/05/2019), and further in view of Zhou et al. (CN Publication No. 109657793, Publication date 04/19/2019).

As to claim 1, Zhang et al. teaches:
“A method for machine learning using sparse training data” (see Zhang et al., Abstract and Fig. 3), comprising:
“training, by one or more processors, a machine learning model using a first training data point of a plurality of training data points to generate a candidate machine learning model” (see Zhang et al., Fig. 3 and [0097] for training the defect classifier using a set of labeled data, wherein each modified version of the defect classifier during training is interpreted as a candidate machine learning model as recited);
“selecting, by the one or more processors, a first input to provide to the candidate machine learning model” (see Zhang et al., [0097] for selecting data point(s) to input into the defect classifier);
“sampling, by the one or more processors, a first output of the candidate machine learning model by providing the first input as input to the candidate machine learning model” (see Zhang et al., [0097] for identifying output of the defect classifier for the input data point(s) inputted into the defect classifier).
Zhang et al. further teaches that the defect classifier is modified and trained until the output of the defect classifier for the input data points matches the labels acquired for the data points (see [0097]) or until converged to an optimal model (see [0110]), wherein the condition to stop the training or get the optimal model (e.g., whether the output matches the labels acquired for the data point) as disclosed can be interpreted as equivalent to a convergence condition as recited.  In addition, Zhang et al. teaches random sampling for selecting data points (see [0065]).
To the extent that Zhang et al. does not explicitly teach the random selection of the input and the convergence condition, which are recited as follows:
“the first input selected from the plurality of training data points by performing at least one of a random process or a Monte Carlo process”;
“determining, by the one or more processors, whether the first output satisfies a convergence condition”;
“responsive to the first output not satisfying the convergence condition, modifying, by the one or more processors, the candidate machine learning model using a second training data point of the plurality of training data points, the second training data point corresponding to the first output”; and
“responsive to the first output satisfying the convergence condition, outputting, by the one or more processors, the candidate machine learning model”.
Zhou et al. explicitly teaches:
“the first input selected from the plurality of training data points by performing at least one of a random process or a Monte Carlo process” (see Zhou et al., [page 3, lines 2-4] for randomly selecting each of the training sample images for inputting to the training model);
“determining, by the one or more processors, whether the first output satisfies a convergence condition” (see Zhou et al., [page 3, lines 27-32] for determining whether the output value of the loss function corresponding to the basic training module satisfies a preset convergence condition);
“responsive to the first output not satisfying the convergence condition, modifying, by the one or more processors, the candidate machine learning model using a second training data point of the plurality of training data points, the second training data point corresponding to the first output” (see Zhou et al., [page 3, lines 27-32] for adjusting the weighting parameter of the training sub-models if the output value does not satisfy the preset convergence condition); and
“responsive to the first output satisfying the convergence condition, outputting, by the one or more processors, the candidate machine learning model” (see Zhou et al., [page 3, lines 34-36] for stopping the selection of training data images from the training sample set (i.e., stopping the training process) and outputting the training sub-model currently trained by the GPU when the output value satisfies the preset convergence condition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zhou et al.’s teaching into Zhang et al.’s system by implementing a random selection of training data for inputting into a trained model and a preset convergence condition for controlling the model learning/training.  An ordinarily skilled artisan would have been motivated to do so to provide Zhang et al.’s system with an alternative, effective way to select sample data for training and to determine when to stop the training and obtain the optimal model.  In addition, both references (Zhang et al. and Zhou et al.) teach features that are directed to analogous art and to the same field of endeavor, namely, combining deep learning and active learning in training a deep learning/machine learning classifier/model.  The close relation between the two references strongly suggests an expectation of success when they are combined.
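For illustration only, the sequence of steps recited in claim 1 (train on a first data point, randomly select an input, sample the output, test a convergence condition, then either modify or output the candidate model) can be sketched as follows.  This sketch is not taken from Zhang et al., Zhou et al., or the application.  The one-parameter linear model, the function name train_until_converged, the tolerance, and the learning rate are all assumptions chosen to keep the example small.

```python
import random


def train_until_converged(points, tol=1e-3, lr=0.1, max_iters=10_000, seed=0):
    """Hypothetical sketch: fit y ~ w * x from a small list of (x, y) pairs.

    Returns (w, converged). The model form, tol, and lr are illustrative
    assumptions, not details of any cited reference.
    """
    rng = random.Random(seed)

    # Train on a first training data point to obtain a candidate model.
    x0, y0 = points[0]
    w = 0.0
    w -= lr * 2 * (w * x0 - y0) * x0

    for _ in range(max_iters):
        # Select an input from the training data points by a random process.
        x, y = rng.choice(points)

        # Sample the candidate model's output for that input.
        output = w * x

        # Convergence condition (assumed): output within tol of the label.
        if abs(output - y) <= tol:
            return w, True  # condition satisfied: output the candidate model

        # Condition not satisfied: modify the candidate model using the
        # training data point that corresponds to the sampled output.
        w -= lr * 2 * (output - y) * x

    return w, False


if __name__ == "__main__":
    # Three labeled points drawn from y = 2x (made-up data).
    w, converged = train_until_converged([(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)])
    print(converged, round(w, 2))
```

In this sketch the convergence test is applied to a single sampled output per iteration, which mirrors the claim language; a practical trainer would more likely test a loss value over several samples, as in the preset convergence condition described in Zhou et al.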

As to claim 2, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein modifying the candidate machine learning model comprises training, by the one or more processors, the machine learning model using the first training data point and the second training data point” (see Zhang et al., [0097] for modifying one or more parameters of the defect classifier (i.e., candidate machine learning model) using the set of labeled data (i.e., labeled data points)).

As to claim 3, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the first training data point is retrieved from a dataset comprising less than one thousand training data points” (see Zhang et al., [0083] and Fig. 3 for using limited data points for training; also see [0059]).

As to claim 4, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the machine learning model comprises a neural network comprising an input layer, one or more hidden layers, and an output layer” (see Zhang et al., [0072] for neural networks; also see Zhou et al., Abstract for neural network model).

As to claim 6, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein determining whether the first output satisfies the convergence condition comprises comparing the first output with a reference output of a reference high-fidelity model” (see Zhang et al., [0097] for comparing output of the defect classifier for the input data points with the labels acquired for the data points, wherein the label can be acquired/outputted from a tool/user (i.e., a reference high-fidelity model)).

As to claim 7, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein training the machine learning model using the first training data point comprises retraining the machine learning model using less than ten training data points” (see Zhang et al., [0059] wherein the data points for the specimen can include a combination of fewer than ten ground truth data points for any one defect type and unlabeled data, wherein the ground truth data points are interpreted as training data points).

As to claim 8, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the first training data point comprises at least one of an experimental data point or a synthetic data point from a reference high-fidelity model” (see Zhang et al., [0059] wherein ground truth data points are generated by a “ground truth” method, e.g., a tool/user (i.e., a reference high-fidelity model)).

As to claim 9, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“selecting the first training data point from a first subset of data points of a training database, and selecting inputs for sampling the candidate machine learning model from a second subset of data points of the training database” (see Zhang et al., [0059] for selecting one or more data points (i.e., subset) from data points (i.e., database); also see [0097] for inputting the data points from the set of labeled data (i.e., subset) into the defect classifier to generate output (i.e., training or sampling)).

As to claim 10, this claim is rejected for the same reasons set forth above for claim 1 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“sampling a plurality of first outputs of the candidate machine learning model” (see Zhang et al., [0097] for determining output of the defect classifier for the input data points),
“determining, by the one or more processors, whether each first output of the plurality of first outputs satisfies a respective convergence condition” (see Zhang et al., [0097] for determining whether the output of the defect classifier for the input data points matches the labels acquired for the data points; also see Zhou et al., [page 3, last paragraph] for determining whether the model result/output satisfies the preset convergence condition), and
“selecting one or more second training data points to modify the candidate machine learning model responsive to the one or more second training data points corresponding to one or more first outputs that did not satisfy the respective convergence condition” (see Zhang et al., [0097] for continuing to train the defect classifier by selecting and inputting selected data points to the defect classifier and modifying one or more parameters of the defect classifier if the output of the defect classifier for the input data points does not match the labels acquired for the data points; also see Zhou et al., [page 3, last paragraph and page 4, lines 1-3] for repeating the training and adjusting the weighting parameter of the training sub-model if the model result/output does not satisfy the preset convergence condition).

As to claim 11, Zhang et al. teaches:
“A system” (see Zhang et al., Abstract, Fig. 3 and [0113]), comprising:
“one or more processors configured to” (see Zhang et al., [0043]):
“train a machine learning model using a first training data point of a plurality of training data points to generate a candidate machine learning model” (see Zhang et al., Fig. 3 and [0097] for training the defect classifier using a set of labeled data, wherein each modified version of the defect classifier during training is interpreted as a candidate machine learning model as recited);
“select a first input to provide to the candidate machine learning model” (see Zhang et al., [0097] for selecting data point(s) to input into the defect classifier);
“sample a first output of the candidate machine learning model by providing the first input as input to the candidate machine learning model” (see Zhang et al., [0097] for identifying output of the defect classifier for the input data point(s) inputted into the defect classifier).
Zhang et al. further teaches that the defect classifier is modified and trained until the output of the defect classifier for the input data points matches the labels acquired for the data points (see [0097]) or until converged to an optimal model (see [0110]), wherein the condition to stop the training or get the optimal model (e.g., whether the output matches the labels acquired for the data point) as disclosed can be interpreted as equivalent to a convergence condition as recited.  In addition, Zhang et al. teaches random sampling for selecting data points (see [0065]).
To the extent that Zhang et al. does not explicitly teach the random selection of the input and the convergence condition, which are recited as follows:
“the first input selected from the plurality of training data points by performing at least one of a random process or a Monte Carlo process”;
“determine whether the first output satisfies a convergence condition”;
“modify, responsive to the first output not satisfying the convergence condition, the candidate machine learning model using a second training data point of the plurality of training data points, the second training data point corresponding to the first output”; and
“output, responsive to the first output satisfying the convergence condition, the candidate machine learning model”.
Zhou et al. explicitly teaches:
“the first input selected from the plurality of training data points by performing at least one of a random process or a Monte Carlo process” (see Zhou et al., [page 3, lines 2-4] for randomly selecting each of the training sample images for inputting to the training model);
“determine whether the first output satisfies a convergence condition” (see Zhou et al., [page 3, lines 27-32] for determining whether the output value of the loss function corresponding to the basic training module satisfies a preset convergence condition);
“modify, responsive to the first output not satisfying the convergence condition, the candidate machine learning model using a second training data point of the plurality of training data points, the second training data point corresponding to the first output” (see Zhou et al., [page 3, lines 27-32] for adjusting the weighting parameter of the training sub-models if the output value does not satisfy the preset convergence condition); and
“output, responsive to the first output satisfying the convergence condition, the candidate machine learning model” (see Zhou et al., [page 3, lines 34-36] for stopping the selection of training data images from the training sample set (i.e., stopping the training process) and outputting the training sub-model currently trained by the GPU when the output value satisfies the preset convergence condition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zhou et al.’s teaching into Zhang et al.’s system by implementing a random selection of training data for inputting into a trained model and a preset convergence condition for controlling the model learning/training.  An ordinarily skilled artisan would have been motivated to do so to provide Zhang et al.’s system with an alternative, effective way to select sample data for training and to determine when to stop the training and obtain the optimal model.  In addition, both references (Zhang et al. and Zhou et al.) teach features that are directed to analogous art and to the same field of endeavor, namely, combining deep learning and active learning in training a deep learning/machine learning classifier/model.  The close relation between the two references strongly suggests an expectation of success when they are combined.

As to claim 12, this claim is rejected for the same reasons set forth above for claim 11 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the one or more processors are configured to modify the candidate machine learning model by training the machine learning model using the first training data point and the second training data point” (see Zhang et al., [0097] for modifying one or more parameters of the defect classifier (i.e., candidate machine learning model) using the set of labeled data (i.e., labeled data points)).

As to claim 13, this claim is rejected for the same reasons set forth above for claim 11 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the first training data point is retrieved from a training database comprising less than one thousand training data points” (see Zhang et al., [0083] and Fig. 3 for using limited data points for training; also see [0059]).

As to claim 14, this claim is rejected for the same reasons set forth above for claim 11 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the machine learning model comprises a neural network comprising an input layer, one or more hidden layers, and an output layer” (see Zhang et al., [0072] for neural networks; also see Zhou et al., Abstract for neural network model).

As to claim 16, this claim is rejected for the same reasons set forth above for claim 11 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the one or more processors are configured to determine whether the first output satisfies the convergence condition comprises comparing the first output with a predetermined output value” (see Zhang et al., [0097] for comparing output of the defect classifier for the input data points with the labels acquired for the data points, wherein each acquired label can be interpreted as a predetermined output value as recited).

As to claim 17, this claim is rejected for the same reasons set forth above for claim 11 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the one or more processors are configured to train the machine learning model using the first training data point by training the machine learning model using less than ten training data points” (see Zhang et al., [0059] wherein the data points for the specimen can include a combination of fewer than ten ground truth data points for any one defect type and unlabeled data, wherein the ground truth data points are interpreted as training data points).

As to claim 18, this claim is rejected for the same reasons set forth above for claim 11 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. teaches:
“wherein the first training data point comprises at least one of an experimental data point or a synthetic data point from a reference high-fidelity model” (see Zhang et al., [0059] wherein ground truth data points are generated by a “ground truth” method, e.g., a tool/user (i.e., a reference high-fidelity model)).

Claims 19 and 20 (effective filing date 04/13/2020) are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (U.S. Publication No. 2019/0370955, Publication date 12/05/2019), in view of Zhou et al. (CN Publication No. 109657793, Publication date 04/19/2019), and further in view of Baughman et al. (U.S. Publication No. 2019/0279094, Publication date 09/12/2019).

As to claim 19, Zhang et al. teaches:
“A method” (see Zhang et al., Abstract and Fig. 3), comprising:
“training a machine learning model using less than ten first training data points of a plurality of training data points to generate a candidate machine learning model” (see Zhang et al., Fig. 3 and [0097] for training the defect classifier using a set of labeled data, wherein each modified version of the defect classifier during training is interpreted as a candidate machine learning model as recited; also see [0059] wherein the data points for the specimen can include a combination of fewer than ten ground truth data points for any one defect type and unlabeled data, wherein the ground truth data points are interpreted as training data points);
“select a first input from the plurality of training data points” (see Zhang et al., [0097] for selecting data point(s) to input into the defect classifier);
“applying the first input as input to the candidate machine learning model to sample one or more first outputs of the candidate machine learning model” (see Zhang et al., [0097] for identifying output of the defect classifier for the inputted data point(s)).
Zhang et al. further teaches that the defect classifier is modified and trained until the output of the defect classifier for the input data points matches the labels acquired for the data points (see [0097]) or until converged to an optimal model (see [0110]), wherein the condition to stop the training or get the optimal model (e.g., whether the output matches the labels acquired for the data point) as disclosed can be interpreted as equivalent to a convergence condition as recited.  In addition, Zhang et al. teaches random sampling for selecting data points (see [0065]).
To the extent that Zhang et al. does not explicitly teach the convergence condition, which is recited as follows:
“testing the one or more first outputs to determine if each of the one or more the first outputs satisfies a respective convergence condition”;
“responsive to at least one first output not satisfying the respective convergence condition, training the candidate machine learning model using at least one second training data point of the plurality of training data points corresponding to the at least one first output”; and
“responsive to the one or more first outputs each satisfying the convergence condition, outputting the candidate machine learning model”.
Zhou et al. explicitly teaches:
“testing the one or more first outputs to determine if each of the one or more the first outputs satisfies a respective convergence condition” (see Zhou et al., [page 3, lines 27-32] for determining whether the output value of the loss function corresponding to the basic training module satisfies a preset convergence condition);
“responsive to at least one first output not satisfying the respective convergence condition, training the candidate machine learning model using at least one second training data point of the plurality of training data points corresponding to the at least one first output” (see Zhou et al., [page 3, lines 27-32] for adjusting the weighting parameter of the training sub-models if the output value does not satisfy the preset convergence condition); and
“responsive to the one or more first outputs each satisfying the convergence condition, outputting the candidate machine learning model” (see Zhou et al., [page 3, lines 34-36] for stopping the selection of training data images from the training sample set (i.e., stopping the training process) and outputting the training sub-model currently trained by the GPU when the output value satisfies the preset convergence condition).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zhou et al.’s teaching into Zhang et al.’s system by implementing a random selection of training data for inputting into a trained model and a preset convergence condition for controlling the model learning/training.  An ordinarily skilled artisan would have been motivated to do so to provide Zhang et al.’s system with an alternative, effective way to select sample data for training and to determine when to stop the training and obtain the optimal model.  In addition, both references (Zhang et al. and Zhou et al.) teach features that are directed to analogous art and to the same field of endeavor, namely, combining deep learning and active learning in training a deep learning/machine learning classifier/model.  The close relation between the two references strongly suggests an expectation of success when they are combined.
However, Zhang et al. as modified by Zhou et al. does not explicitly teach a feature of selecting/sampling using a Monte Carlo method as recited as follows:
“performing a Monte Carlo process to select a first input from the plurality of training data points”.
On the other hand, Baughman et al. teaches a feature of using a Monte Carlo method for selecting a data subset from a set of data (see Baughman et al., [0039]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Baughman et al.’s teaching into Zhang et al.’s system (as modified by Zhou et al.) by implementing a feature of selecting data point(s) as input using a Monte Carlo method.  An ordinarily skilled artisan would have been motivated to do so to provide Zhang et al.’s system with an alternative, effective way to select one or more data points as input because, as suggested by Baughman et al. (see [0039]), Monte Carlo methods are well known and widely used in the art for selecting/sampling data.

As to claim 20, this claim is rejected for the same reasons set forth above for claim 19 and is further rejected as follows:
Zhang et al. as modified by Zhou et al. and Baughman et al. teaches:
“wherein the machine learning model comprises a neural network” (see Zhang et al., [0072] for neural networks; also see Zhou et al., Abstract for neural network model).


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUONG THAO CAO whose telephone number is (571)272-2735. The examiner can normally be reached Monday - Friday: 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Phuong Thao Cao/Primary Examiner, Art Unit 2164