DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-2, 4, 6-9, 11, 13-16, 18, and 20 are presented for examination.

Response to Amendment
Applicant’s amendment has obviated the remaining objections to the specification and drawings.  Therefore, those objections are withdrawn.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-2, 7-9, 14-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Liao et al. (WO 2020048594) (“Liao”) in view of Bakker et al., “Clustering Ensembles of Neural Network Models,” in 16 Neural Networks 261-69 (2003) (“Bakker”) and further in view of Son et al. (EP 3276540) (“Son”).
Regarding claim 8, Liao discloses “[a] system comprising: 
a hardware processor (blocks and combinations thereof may be implemented by various means or their combinations, such as hardware, firmware, software, one or more processors and/or circuitry – p. 37, ll. 8-22); 
a memory device coupled with the hardware processor (see Fig. 9, memory 404 coupled to processor/controller 401); 
the hardware processor operable to at least: 
receive a similarity estimate between a sample data set and a source data set associated with a prior-trained neural network model, wherein the source data set associated with the prior-trained neural network model includes a training data set used to train the prior-trained neural network model (in a distributed scheme of self-transfer optimization, each system having sufficient data [source data] collection derives [trains] a pre-trained deep reinforcement learning model for network optimization – Liao, p. 28, l. 14-p. 29, l. 2; see also Fig. 4, esp. ref. char. 400 (showing the training data being fed into the deep reinforcement learning model)), wherein a plurality of similarity estimates is received corresponding to a plurality of source data sets associated with a plurality of prior-trained neural network models (in a distributed scheme of self-transfer optimization, each system having sufficient data [source data] collection derives a pre-trained deep reinforcement learning model for network optimization [note that the language “each system” implies that there may in general be more than one system, each of which has collected source data]; when the optimization model is fully prepared, the system containing the pre-trained network optimization model may send a request for similarity data from the connected systems; the systems which have received the message respond with the similarity data requested for the similarity analysis between the two systems [such similarity analysis being based on a sample data set] – Liao, p. 28, l. 14-p. 29, l. 2; deep reinforcement learning algorithm is based on a plurality of convolutional layers and fully connected layers [i.e., it is a neural network model] – id. at p. 32, l. 28-p. 33, l. 8), …; 
determine, at least based on the similarity estimates, whether to train a new neural network model (after having received the similarity data, the system offering the pre-trained network optimization model conducts a similarity analysis; on the basis of the similarity analysis, the parts of the pre-trained network optimization model to be transferred to the corresponding systems are determined; the systems receiving the partial pre-trained model adapt [train] the received model according to their needs [to make new neural networks; similarity analysis = determining whether to transfer data to the other systems so that they may be trained] – Liao, p. 29, ll. 4-23); …
determine a set of training data (the systems receiving the partial pre-trained network optimization model adapt the received model according to their needs; the model is fine-tuned or updated on the basis of their own collected data [training data] using transfer learning – Liao, p. 29, ll. 20-23) …; and 
train the new neural network model based on the set of training data (the systems receiving the partial pre-trained network optimization model adapt the received model according to their needs; the model is fine-tuned or updated [i.e., trained] on the basis of their own collected data [training data] using transfer learning – Liao, p. 29, ll. 20-23).”
Liao appears not to disclose explicitly the further limitations of the claim.  However, Bakker discloses “responsive to determining to train the new neural network model, creat[ing] a cluster among the plurality of prior-trained neural network models by at least running the plurality of prior-trained neural network models using the sample data set (a method of clustering may be used to find a workable representation of an oversized ensemble of neural network models – Bakker, last paragraph on p. 261; model-free clustering of elements involves determining cluster centers that are found directly in the space of model outputs; since such cluster centers do not have a model of the same architecture as the models in the ensemble, a retraining set is needed to translate the cluster centers back to actual models [i.e., the clustering is done responsive to the need to train a new model that will be representative of the cluster] – id. at p. 264, paragraph entitled “When to retrain”; distance function between the full ensemble and the clustered subset ensemble is based on model outputs instead of model parameters [i.e., it is based on running the models using data sets and observing their outputs] – id. at p. 262, first full paragraph); [and]
determin[ing] a set of training data based on the cluster (the clustering process may be performed solely in terms of model outputs and the actual representative models can be trained at the end when all cluster centers are found [i.e., a set of training data is determined in order to train the cluster center] – Bakker, p. 264, paragraph entitled “Retrain after clustering”)….”
Bakker and the instant application both relate to the clustering of neural network models and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Liao to create a cluster of models and determine a set of training data based on the cluster, as disclosed by Bakker, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow the system to determine which models are sufficiently similar to each other to justify representing each group with a single model, thereby saving computer power that would otherwise be spent running every model individually.  See Bakker, sec. 1, first three paragraphs.
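As an illustrative sketch only, output-based grouping of models of the general kind discussed above may be expressed as follows. The greedy threshold rule, the function names, and the toy models are assumptions made for illustration; they are a simplified stand-in and are not drawn from Bakker or from the claims.

```python
import math


def model_outputs(model, sample_set):
    """Run one model on every element of the sample data set."""
    return [model(x) for x in sample_set]


def euclidean(u, v):
    """Euclidean distance between two equal-length output vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_models(models, sample_set, threshold):
    """Greedy grouping (an assumed, simplified rule): a model joins the first
    cluster whose first member produced outputs within `threshold` of its own;
    otherwise it starts a new cluster."""
    outputs = {name: model_outputs(fn, sample_set) for name, fn in models.items()}
    clusters = []
    for name in models:
        for cluster in clusters:
            if euclidean(outputs[name], outputs[cluster[0]]) <= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical stand-ins for prior-trained models (scalar in, scalar out).
toy_models = {
    "a": lambda x: 2.0 * x,
    "b": lambda x: 2.0 * x + 0.01,
    "c": lambda x: -3.0 * x,
}
toy_sample_set = [0.0, 1.0, 2.0]
toy_clusters = cluster_models(toy_models, toy_sample_set, threshold=0.5)
```

In this sketch the distance is computed in the space of model outputs rather than model parameters, so models of different architectures could in principle be compared on the same sample data set.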
Neither Liao nor Bakker appears to disclose explicitly the further limitations of the claim.  However, Son discloses that “the similarity estimate [is] determined based on outputs of a hidden layer of the prior-trained neural network model generated using the sample data set and outputs of the hidden layer of the prior-trained neural network model generated using the source data set (in an iterative regularization process of a lightened neural network, features are extracted from verification data in a verification database, and each verification datum may include a data pair [one being from a sample data set and the other from a source data set]; a feature of each verification datum is extracted through operation of the neural network configured according to the original trained parameters; the extracted features may then be matched to each other; for example, if the extracted features are each a feature vector of an output layer of the neural network configured [prior-trained] according to the original trained parameters, the matching operation may determine a similarity between the two extracted feature vectors; features may also be extracted for each, or select, hidden layers of the neural network configured according to the original trained parameters – Son, paragraphs 86-87).”  
Son and the instant application both relate to neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Liao and Bakker to develop a similarity estimate between a feature vector output from a hidden layer of a neural network based on input from a first dataset to another feature vector output by the hidden layer of the same neural network based on input from a second dataset, as disclosed by Son, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a compact representation of the input data that can be used to determine which instances of the input data are similar to each other, thereby saving processing power.  See Son, paragraphs 86-87.
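As an illustrative sketch only, a similarity estimate computed from hidden-layer outputs of a single network for two data sets may be expressed as follows. The single tanh hidden layer, the mean pooling over each data set, the cosine measure, and all names and values are assumptions made for illustration; they are not drawn from Son or from the claims.

```python
import math


def hidden_outputs(inputs, weights, biases):
    """Return tanh hidden-layer activations for each input row."""
    return [
        [math.tanh(sum(w * x for w, x in zip(unit_w, row)) + b)
         for unit_w, b in zip(weights, biases)]
        for row in inputs
    ]


def mean_vector(rows):
    """Average the activation vectors over a data set."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_estimate(sample_set, source_set, weights, biases):
    """Compare mean hidden activations of the same network on two data sets."""
    sample_feature = mean_vector(hidden_outputs(sample_set, weights, biases))
    source_feature = mean_vector(hidden_outputs(source_set, weights, biases))
    return cosine_similarity(sample_feature, source_feature)


# Hypothetical fixed parameters standing in for a prior-trained hidden layer.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_biases = [0.0, 0.0]
toy_source = [[1.0, 0.0], [0.9, 0.1]]
toy_sample_near = [[1.0, 0.1], [0.8, 0.0]]
toy_sample_far = [[-1.0, 0.0], [-0.9, -0.1]]

near = similarity_estimate(toy_sample_near, toy_source, toy_weights, toy_biases)
far = similarity_estimate(toy_sample_far, toy_source, toy_weights, toy_biases)
```

Under these assumptions, a sample data set resembling the source data set yields an estimate near 1, while a sample data set with opposite-signed inputs yields an estimate near -1.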

	Claim 1 is a method claim corresponding to system claim 8 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 15 is a computer program product claim corresponding to system claim 8 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 9, Liao, as modified by Bakker and Son, discloses that “the new neural network model is trained as a base model for transfer learning (in a fully distributed scheme of self-transfer optimization, a system derives a pre-trained network optimization model, and on the basis of a similarity analysis between the system and the other systems, the system having the pre-trained model sends parts of the model to other systems, which adapt [train] the received model according to their needs on the basis of their own collected data using transfer learning – Liao, p. 28, l. 15-p. 29, l. 23; see also Fig. 2 (showing that both the pre-trained model and the adapted models are each associated with a system and that the systems communicate with each other, such that the adapted models can themselves be used as base models for transfer learning of models further adapted to other systems)).”  

	Claim 2 is a method claim corresponding to system claim 9 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 16 is a computer program product claim corresponding to system claim 9 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 20, the rejection of claim 15 is incorporated.  Liao further discloses that “the set of training data comprises a combination of source data sets used in training prior-trained neural network models (in a fully distributed scheme of self-transfer optimization, each system [possibly multiple systems] having sufficient data [source data, possibly comprising multiple sets] collection derives a pre-trained deep reinforcement learning model for network optimization; when the model is prepared, a request for similarity data is sent to connected systems, and the systems respond with similarity data; on the basis of the similarity analysis, parts of the pre-trained model are transferred to the corresponding systems, and the systems receiving the pre-trained model fine-tune the model on the basis of their own collected data [part of training data] – Liao, p. 28, l. 14-p. 29, l. 18 [note that since the pre-trained model that the other systems adapt was ultimately derived from the data collected by the first system, the first system’s data may also be regarded as part of the training data])….”
Liao appears not to disclose explicitly the further limitations of the claim.  However, Bakker discloses that the “neural network models [are] identified in the cluster (the method of clustering may be used to find a workable representation of any oversized ensemble of neural network models – Bakker, last full paragraph on p. 261).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Liao/Son to use a clustering technique to identify neural network models, as disclosed by Bakker, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow the system to determine which models are sufficiently similar to each other to justify representing each group with a single model, thereby saving computer power that would otherwise be spent running every model individually.  See Bakker, sec. 1, first three paragraphs.

	Claim 7 is a method claim corresponding to computer program product claim 20 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 14 is a system claim corresponding to computer program product claim 20 and is rejected for the same reasons as given in the rejection of that claim.

Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Liao in view of Bakker and Son and further in view of Arora et al. (US 20200090009) (“Arora”).
Regarding claim 11, the rejection of claim 8 is incorporated.  Liao and Bakker appear not to disclose explicitly the further limitations of the claim.  However, Son discloses that the “feature vectors [come from] hidden layers (though the extracted features may be output results of an output layer of the neural network, similar features may be extracted for each, or select, hidden layers of the neural network configured according to the original trained parameters – Son, paragraph 87)….”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Liao and Bakker to derive the feature vectors from hidden layers, as disclosed by Son, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would expand the pool of resources from which the feature vectors can be extracted and analyzed.  See Son, paragraph 87.
Neither Liao, Bakker, nor Son appears to disclose explicitly the further limitations of the claim.  However, Arora discloses that “the hardware processor creates the cluster based on feature vectors … produced by passing, in forward propagation, the sample data through the plurality of prior-trained neural network models (machine learning training module reduces the number of dimensions of each feature vector using a feedforward neural network prior to using the feature vector to train the machine learning models [unreduced feature vector = sample data; reduced feature vector = feature vector]; after training the machine learning models, the machine learning training module can provide machine learning model data that include the machine learning models to a machine learning module that clusters the feature vectors using the models – Arora, paragraph 47; see also Fig. 1, esp. ref. chars. 145-46 and 155).”  
Arora and the instant application both relate to clustering of data produced by neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Liao, Son, and Bakker to develop a cluster based on feature vectors produced by passing the data through multiple models, as disclosed by Arora, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would simplify the clustering process by providing a compact representation of the input data.  See Arora, paragraph 47.
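As an illustrative sketch only, producing one feature vector per model by forward propagation of the same sample data, and then measuring how far apart the models lie, may be expressed as follows. The single ReLU layer, the concatenation of activations, and all names and values are assumptions made for illustration; they are not drawn from Arora, Son, or the claims.

```python
import math


def forward_hidden(x, layer):
    """One forward-propagation step: ReLU(W x + b) for a single input vector."""
    weights, biases = layer
    return [
        max(0.0, sum(w * xi for w, xi in zip(unit_w, x)) + b)
        for unit_w, b in zip(weights, biases)
    ]


def model_feature_vector(sample_set, layer):
    """Concatenate the hidden activations obtained for every sample input."""
    features = []
    for x in sample_set:
        features.extend(forward_hidden(x, layer))
    return features


def pairwise_distances(models, sample_set):
    """Distance between every pair of per-model feature vectors."""
    vectors = {
        name: model_feature_vector(sample_set, layer)
        for name, layer in models.items()
    }
    names = list(vectors)
    return {
        (a, b): math.dist(vectors[a], vectors[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical (weights, biases) pairs standing in for prior-trained models.
toy_layers = {
    "m1": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "m2": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "m3": ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]),
}
toy_inputs = [[1.0, 2.0], [3.0, 4.0]]
toy_distances = pairwise_distances(toy_layers, toy_inputs)
```

A distance table of this kind could serve as the input to any clustering routine; the choice of routine is left open in this sketch.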

	Claim 4 is a method claim corresponding to system claim 11 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 18 is a computer program product claim corresponding to system claim 11 and is rejected for the same reasons as given in the rejection of that claim.

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Liao in view of Bakker and Son and further in view of Flanagan et al. (US 20200272805) (“Flanagan”).
Regarding claim 13, Liao, as modified by Bakker, Son, and Flanagan, discloses that “the plurality of prior-trained neural network models are stored as a library of pre-existing models (a local convolutional network library is stored in a computer-readable medium that is coupled to the processor and contains instructions and storage of values that define one or more CNNs – Flanagan, paragraph 25).”
Flanagan and the instant application both relate to neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Liao, Son, and Bakker to store the models as a library of models, as disclosed by Flanagan, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would provide a central repository of models, the best of which can be selected based on the analysis needs called for by the problem to be solved.  See Flanagan, paragraph 31 (disclosing that the user may select individual CNNs from the library).

	Claim 6 is a method claim corresponding to system claim 13 and is rejected for the same reasons as given in the rejection of that claim.

Response to Arguments
Applicant's arguments filed December 6, 2022 (“Remarks”) have been fully considered but they are not persuasive.
	Applicant first alleges that Liao fails to disclose a similarity estimate determined based on outputs of a hidden layer of a prior-trained neural network model generated using a sample data set and outputs of the hidden layer of the prior-trained neural network model generated using the source data set because Liao allegedly discloses only similarity data and not a similarity estimate.  Remarks at 9-10.  However, in the absence of a special definition of the term “similarity estimate” in the specification, the term must be construed in accordance with its plain meaning.  Here, the term is being construed to refer to any data that indicate how similar two neural network models are.  Given that the cited portions of Liao disclose that a similarity analysis is conducted on the similarity data and the parts of the pre-trained model to be transferred to other systems are determined on the basis of this similarity analysis, the results of the similarity analysis must be a similarity estimate as that term is construed under its broadest reasonable interpretation in light of the specification.
Applicant then alleges that neither Bakker nor Arora discloses feature vectors of hidden layers used to create a cluster among a plurality of prior-trained neural network models.  Remarks at 11.  Applicant makes no substantive argument in connection with this bare assertion; as such, Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.  Examiner assumes for purposes of this discussion that Applicant is alleging that the prior art of record does not specifically disclose that the cluster is created based on feature vectors of hidden layers.  While it is true that Arora does not appear explicitly to teach that the feature vectors may come from hidden layers, Son does appear to disclose this element, as indicated above.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7:50a-5:50p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/R.C.V./Examiner, Art Unit 2125

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125