DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claim 10 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for pre-AIA applications, the applicant) regards as the invention.
As to claim 10, the term “minimizing” is a relative term which renders the claim indefinite. The term “minimizing” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 8-9, 11-15, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Karlinsky et al. (US Pub. No. 2017/0177997), hereinafter referred to as Karlinsky, in view of Edwards et al. (US Patent No. 6125105), hereinafter referred to as Edwards.
Referring to claims 1 and 11, Karlinsky discloses a method for using a neural network (fig. 2), comprising: initially training the neural network with an original training data (DNN is trained using a training set...the training set can comprise a plurality of first training samples, [0008-0010]) over a training period (training process can be cyclic, and can be repeated several times until the DNN is sufficiently trained, [0086]); applying the neural network to a task and generating initial predictions from live data (data processing using deep neural network(s) for outputting application-specific data (e.g. classification), [0040]; prediction indicating the family type or general class; [0065]); receiving incremental training data to update the trained neural network (the training set can comprise a plurality of first training samples and a plurality of augmented training samples obtained by augmenting at least part of the first training samples, [0010]; training can be repeated several times (optionally, with an updated training set) until the DNN is sufficiently trained; [0134]); generating a ground truth for the original training data using the trained neural network (augmented ground truth data can be generated by FPEI system by processing the first ground truth data in correspondence with provided augmentation of the images in respective first training samples when deriving the augmented training samples; [0081]); training the neural network using the original training data along with generated ground truth, and the incremental training data (DNN is trained using a training set...the training set can comprise a plurality of first training samples and a plurality of augmented training samples obtained by augmenting at least part of the first training samples.
The training set can further comprise ground truth data associated with the first training samples and augmented ground truth data associated with the augmented training samples; [0008-0010]) to generate the neural network, wherein the neural network is incrementally updated (training can be repeated several times (optionally, with an updated training set) until the DNN is sufficiently trained; [0134]); and applying the incrementally trained neural network to generate predictions from live data (data processing using deep neural network(s) for outputting application-specific data (e.g. classification), [0040]; prediction indicating the family type or general class; [0065]).
While Karlinsky discloses initially training the neural network and generating initial predictions, Karlinsky is silent regarding further training the neural network, and therefore does not appear to explicitly disclose “after generating initial predictions” further performing training “after the initial training to update the trained neural network”, and the neural network is “updated after the initial predictions.”
However, Edwards teaches “after generating initial predictions” further performing training “after the initial training to update the trained neural network”, and the neural network is “updated after the initial predictions” (after the neural network has been trained input data is provided to the input units 32, 33, 34 and an output is produced at output unit 36. The output comprises a predicted time series value, col. 7, lines 30-35; During the prediction phase, the engine 23 monitors its performance to determine when retraining is required...Retraining involves making a copy of the trends analysis engine 23 incorporating the neural network and retraining the copy. After retraining has taken place the performance of the copy (or daughter engine) is validated. If validation is successful then the original engine is replaced by the daughter engine, col. 9, lines 15-35).
Karlinsky and Edwards are analogous art because they are from the same field of endeavor, neural network training.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Karlinsky and Edwards before him or her, to modify the neural network training of Karlinsky to employ the retraining technique of Edwards because the retraining would be beneficial to the neural network when the performance deteriorates over time.
The suggestion/motivation for doing so would have been to keep performance within an acceptable level (Edwards: col. 9, lines 55-60).
Therefore, it would have been obvious to combine Karlinsky and Edwards to obtain the invention as specified in the instant claim.
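
By way of illustration only, the monitor, copy, retrain, validate, and replace sequence quoted above from Edwards (col. 9, lines 15-35) can be sketched as follows. The sketch is not taken from either reference. The class MeanModel, the mean-difference drift test, the threshold value, and the "no worse than the original" validation rule are all assumptions made here for the sake of a runnable example.

```python
import copy
from statistics import mean


class MeanModel:
    """Hypothetical stand-in for a trained network: predicts the mean of its training values."""

    def __init__(self):
        self.value = 0.0

    def train(self, data):
        self.value = mean(data)

    def error(self, data):
        # Mean absolute error of the constant prediction on the given data.
        return mean(abs(x - self.value) for x in data)


def needs_retraining(training_data, recent_data, threshold):
    # Compare recently presented input data against data from the training set
    # (assumed criterion: difference of means exceeds a predefined threshold).
    return abs(mean(recent_data) - mean(training_data)) > threshold


def retrain_and_maybe_replace(engine, training_data, recent_data, validation_data):
    # Retrain a copy (the "daughter"), leaving the live engine untouched.
    daughter = copy.deepcopy(engine)
    daughter.train(training_data + recent_data)
    # Assumed validation rule: replace only if the daughter is no worse.
    if daughter.error(validation_data) <= engine.error(validation_data):
        return daughter
    return engine


original = [1.0, 1.2, 0.8, 1.0]
recent = [3.0, 3.2, 2.8, 3.0]

engine = MeanModel()
engine.train(original)
if needs_retraining(original, recent, threshold=0.5):
    engine = retrain_and_maybe_replace(engine, original, recent, validation_data=recent)
```

In this toy run the recent data differ markedly from the training data, so the copy is retrained on the combined data and replaces the original engine after validation.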

As to claims 2 and 12, while Karlinsky discloses initializing a training model (generalized model of an exemplified deep neural network usable as DNN 112 is illustrated in FIG. 2, [0047]; weighting and/or threshold values of a deep neural network can be initially selected prior to training, [0050]), Karlinsky does not appear to explicitly disclose the training model being initialized to “an original model previously trained on the original training data.”
However, Edwards teaches the training model being initialized to “an original model previously trained on the original training data”  (After the engine 23 has been trained it is used to make predictions by presenting further input data. During the prediction phase, the engine 23 monitors its performance to determine when retraining is required. This is done by comparing recently presented input data against data from the training set. When the difference is significant, according to a predefined criterion or threshold, then retraining takes place. Retraining involves making a copy of the trends analysis engine 23 incorporating the neural network and retraining the copy; col. 9, lines 10-25).
The suggestion/motivation to combine remains as indicated above.

As to claims 3 and 13, Karlinsky discloses finding the original data classification probabilities, by classification of the original data using an original model (obtained ground truth data associated with the first training samples is informative of classes (e.g. particles, pattern deformation, bridges, etc.) and/or of class distribution (e.g. probability of belonging to each of the classes)...Upon generating (503) the classification training set, PMB trains (504) the DNN to extract classification-related features and to provide classification-related attributes enabling minimal classification error. The training process yields the trained DNN with classification-related training parameters; [0100-0102]).

As to claims 4 and 14, Karlinsky discloses assigning an original data classification probability as the ground truth data for the original data (obtained ground truth data associated with the first training samples is informative of classes (e.g. particles, pattern deformation, bridges, etc.) and/or of class distribution (e.g. probability of belonging to each of the classes); [0010]).
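
For illustration only, the limitations addressed in claims 3-4 and 13-14 (finding classification probabilities for the original data with an original model, then assigning those probabilities as ground truth) can be sketched as below. This is an assumed reading, not code from Karlinsky or from the application. The function original_model_logits is a hypothetical placeholder for a previously trained model.

```python
import math


def softmax(logits):
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def original_model_logits(x):
    # Hypothetical stand-in for the previously trained original model;
    # returns one score per class for a scalar input.
    return [x, 0.0, -x]


def generate_ground_truth(original_data):
    # Pair each original data point with the class probabilities that the
    # original model assigns to it; those probabilities serve as its ground truth.
    return [(x, softmax(original_model_logits(x))) for x in original_data]


labelled = generate_ground_truth([0.0, 2.0])
```

Each entry of labelled pairs an original data point with a probability distribution over the classes rather than a single hard label.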

As to claims 5 and 15, Karlinsky discloses manually classifying the incremental train data into the target classes and assigning that as the ground truth (the ground truth data can be informative of classes, [0016]; Ground truth data can be synthetically produced (e.g. CAD-based images), actually produced (e.g. captured images), produced by machine-learning annotation (e.g. labels based on feature extracting and analysis); produced by human annotation, [0075]) for the incremental train data; and classifying the incremental and original data using the training model to predict the current classification probabilities (obtained ground truth data associated with the first training samples is informative of classes (e.g. particles, pattern deformation, bridges, etc.) and/or of class distribution (e.g. probability of belonging to each of the classes); [0010]).

As to claims 8 and 18, Karlinsky discloses the incrementally trained neural network comprises a revised model consistent with the original model regarding the original training data and with the incremental training data (training process can be cyclic...The process can start from an initially generated training set, while a user provides a feedback for the results reached by the DNN based on the initial training set...PMB can adjust the next training cycle based on the received feedback. Adjusting can include at least one of: updating the training set (e.g. updating ground truth data and/or augmentation algorithms, obtaining additional first training samples and/or augmented training samples, etc.); [0092-0093]).

As to claims 9 and 19, Karlinsky discloses one or more of: a neural network model, a convolutional neural network model, a deep learning neural network model, a recurrent neural network model (generalized model of an exemplified deep neural network usable as DNN 112 is illustrated in FIG. 2, [0047]; the layers in DNN can be convolutional, fully connected, locally connected, pooling/subsampling, recurrent; [0053]).


Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Karlinsky, in view of Edwards, further in view of Hazan et al. (US Pub. No. 2020/0286221), hereinafter referred to as Hazan.
Referring to claim 10, Karlinsky discloses a method, comprising: providing original train data to a neural network (fig. 2) and training the neural network (DNN is trained using a training set...the training set can comprise a plurality of first training samples, [0008-0010]) based on a plurality of classes in the sets of training data (a training set comprising ground truth data...the ground truth data can be informative of classes; [0015-0016]), wherein connected representation and weights of the neural network comprises a model of the neural network (Each layer of DNN module 114 can include multiple basic computational elements (CE)...Each connection 205 between CE of preceding layer and CE of subsequent layer is associated with a weighting value; [0048], fig. 2); updating the model with an incremental training data (training can be repeated several times (optionally, with an updated training set) until the DNN is sufficiently trained; [0134]) and creating a ground truth for the original training data (first ground truth data corresponding to the first training samples, [0072]; Ground truth data can be synthetically produced (e.g. CAD-based images), actually produced (e.g. captured images), produced by machine-learning annotation (e.g. labels based on feature extracting and analysis); produced by human annotation, [0075]); incrementally training the neural network on a combined set of original train data and the incremental train data (adjust the next training cycle based on the received feedback. Adjusting can include at least one of: updating the training set (e.g. updating ground truth data and/or augmentation algorithms, obtaining additional first training samples and/or augmented training samples); [0093]).
While Karlinsky discloses “deploying the trained model for operation”, “generating initial predictions” (data processing using deep neural network(s) for outputting application-specific data (e.g. classification), [0040]; prediction indicating the family type or general class; [0065]), and “generating an incremental trained model” (training can be repeated several times (optionally, with an updated training set) until the DNN is sufficiently trained; [0134]), Karlinsky does not appear to explicitly disclose “after generating the initial predictions, updating the trained model” and “training the neural network after generating initial predictions.” Furthermore, Karlinsky does not appear to explicitly disclose testing the neural network on a test data, minimizing a change in the loss of the original data.
However, Edwards teaches “after generating the initial predictions, updating the trained model” and “training the neural network after generating initial predictions” (after the neural network has been trained input data is provided to the input units 32, 33, 34 and an output is produced at output unit 36. The output comprises a predicted time series value, col. 7, lines 30-35; During the prediction phase, the engine 23 monitors its performance to determine when retraining is required...Retraining involves making a copy of the trends analysis engine 23 incorporating the neural network and retraining the copy. After retraining has taken place the performance of the copy (or daughter engine) is validated. If validation is successful then the original engine is replaced by the daughter engine, col. 9, lines 15-35).
Furthermore, Hazan discloses testing the neural network on a test data (modify the weights following initialization so that characteristics of the detected images correspond to characteristics of the test images used to train the neural network, [0026]), minimizing a change in the loss of the original data (the neural network can be used as a loss function to train the adapter network. The adapter network, when trained, can modify the set of images to match characteristics of the test images used to train the neural network, [0026]; the instruction executor 132 can execute or process an instruction based on the trained deep neural network and the adapter network...instruction executor 132 can also execute an operation that analyzes the adapter network to determine that the adapter network did not perform any destructive or permanent modifications to the detected set of images. Accordingly, the set of images can be viewed in their original unmodified state, [0029]).
Karlinsky, Edwards, and Hazan are analogous art because they are from the same field of endeavor, neural network training.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Karlinsky, Edwards, and Hazan before him or her, to modify the neural network training of Karlinsky to employ the retraining technique of Edwards and the test data training and initialization technique of Hazan because the retraining and testing would be beneficial to the neural network when the performance deteriorates over time.
The suggestion/motivation for doing so would have been to keep performance within an acceptable level (Edwards: col. 9, lines 55-60) and achieve a lower learning rate (Hazan: [0012]).
Therefore, it would have been obvious to combine Karlinsky, Edwards, and Hazan to obtain the invention as specified in the instant claim.

Claims 6-7 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Karlinsky in view of Edwards, as applied to claims 1-5, 8-9, 11-15, and 18-19 above, further in view of Berthelot et al. (US Pub. No. 2021/0407042), hereinafter referred to as Berthelot.
As to claims 6 and 16, while Karlinsky discloses original data probabilities as a ground truth for the data points in original training data (obtained ground truth data associated with the first training samples is informative of...probability of belonging to each of the classes; [0010]) and the ground truth of incremental training data based on a manual classification (the ground truth data can be informative of classes, [0016]; Ground truth data can be synthetically produced (e.g. CAD-based images), actually produced (e.g. captured images), produced by machine-learning annotation (e.g. labels based on feature extracting and analysis); produced by human annotation, [0075]), Karlinsky does not appear to explicitly disclose computing a loss using a custom loss function, wherein the custom loss function takes original data, which specifies the loss proximal to ideal and takes the ground truth of incremental training data which specifies the loss to incrementally train for the incremental training data alone.
However, Berthelot discloses computing a loss using a custom loss function, wherein the custom loss function takes original data, which specifies the loss proximal to ideal and takes the ground truth of incremental training data which specifies the loss to incrementally train for the incremental training data alone (first loss function may satisfy, for example, Equation 1 below, and the first trainer 110 may update the training parameters of the feature extractor 210 and the plurality of classifiers 220 such that the first loss function is minimized, thereby training the deep neural network model 220 such that the estimated label value output from each of the plurality of classifiers 220 is close to the ground-truth label value allocated to the data input to the deep neural network model; [0054]).
Karlinsky and Berthelot are analogous art because they are from the same field of endeavor, neural network training.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Karlinsky and Berthelot before him or her, to modify the neural network training of Karlinsky to employ the loss function of Berthelot, thereby reducing the training time.
The suggestion/motivation for doing so would have been to reduce the training time for the neural network (Berthelot: [0028]).
Therefore, it would have been obvious to combine Karlinsky and Berthelot to obtain the invention as specified in the instant claim.

As to claims 7 and 17, Karlinsky does not appear to explicitly disclose updating the training model based on a loss computed using a custom loss function.
However, Berthelot discloses updating the training model based on the loss computed using a custom loss function (perform the first global update using a first loss function based on a ground-truth label value assigned to the data input to the deep neural network model 200 from among the data included in the first data set and an estimated label value of each of the plurality of classifiers 220 for the corresponding input data; [0053]).
The suggestion/motivation to combine remains as indicated above.
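
For illustration only, one possible reading of a loss computed over both the original data (with model-generated probabilities as ground truth) and the incremental data (with manually assigned labels as ground truth) is sketched below. The sketch is an assumption and is not drawn from Berthelot, Karlinsky, or the application. The names cross_entropy and custom_loss, the equal weighting of the two terms, and the eps constant are hypothetical choices.

```python
import math


def cross_entropy(target, predicted, eps=1e-12):
    # Cross-entropy between a target distribution and a predicted distribution.
    # eps guards against log(0); its value is an arbitrary assumption.
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def custom_loss(original_batch, incremental_batch):
    """Average loss over two batches of (target, predicted) distribution pairs.

    original_batch: targets are probabilities produced by the original model.
    incremental_batch: targets are one-hot vectors from manual classification.
    """
    loss_original = sum(cross_entropy(t, p) for t, p in original_batch)
    loss_incremental = sum(cross_entropy(t, p) for t, p in incremental_batch)
    n = len(original_batch) + len(incremental_batch)
    return (loss_original + loss_incremental) / n


# The prediction matches the original model's probabilities and the manual label.
consistent = custom_loss(
    [([0.5, 0.5], [0.5, 0.5])],
    [([1.0, 0.0], [1.0, 0.0])],
)
# The prediction on the original data has drifted away from the original model.
drifted = custom_loss(
    [([0.5, 0.5], [0.9, 0.1])],
    [([1.0, 0.0], [1.0, 0.0])],
)
```

Under this reading, the term over the original data penalizes departures from the original model's behavior, while the term over the incremental data fits the new labels.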

Response to Arguments
The Applicant’s arguments with respect to claims 1-2 and 11-12 have been considered but are moot in view of the new grounds of rejection.
The Applicant’s arguments with respect to claims 6-7 and 16-17 have been fully considered but are not persuasive. The Applicant’s arguments generally identify the teachings of Berthelot, reproducing language related to paragraphs [0034], [0035], [0069], and [0070] of the disclosure of Berthelot, and concluding that “Berthelot only addresses training of super-resolution images using a perceptual loss,” without addressing the portion of paragraph [0054] which was indicated in the rejection. Therefore, the Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references. 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
The examiner has cited particular column, line, and/or paragraph numbers in the references as applied to the claims above for the convenience of the applicant.  Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. The applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.
The examiner requests that, in response to this office action, support be shown for language added to any original claims on amendment and any new claims.  That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s).  This will assist the examiner in prosecuting the application.  When responding to this office action, applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made.  He or she must also show how the amendments avoid such references or objections.  See 37 C.F.R. 1.111(c).
Applicants seeking an interview with the examiner, including WebEx Video Conferencing, are encouraged to fill out the online Automated Interview Request (AIR) form (http://www.uspto.gov/patent/uspto-automated-interview-request-air-form.html). See MPEP §502.03, §713.01(11) and Interview Practice for additional details.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC T OBERLY whose telephone number is (571)272-6991.  The examiner can normally be reached on M-F 8:00am-4:30pm (MT).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Dr. Henry Tsai, can be reached on (571) 272-4176.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ERIC T OBERLY/             Primary Examiner, Art Unit 2184