Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 21 is objected to because of the following informalities: claim 21 is duplicated. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 21 and 23 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
MPEP 2163 allows the examiner to reject originally filed claims that lack written description support.

[Image: media_image1.png]

Claim 21: “... generating the different model by performing a second hyperparameter tuning of the received model according to the fixed training hyperparameter; and determining a brittleness score of the different model...” The specification does not disclose how to perform a second tuning of the received model according to the fixed training hyperparameter or how to determine a brittleness score of the different model.
[0077] At step 314, model optimizer 104 generates a preferred model, consistent with disclosed embodiments. The preferred model may be based on the recommendation (step 312). In some embodiments, the preferred model may be the preliminary model or may be based on the preliminary model. The preferred model may be the reference model or may be based on the reference model. Generating a preferred model may include training the preferred model and/or setting a hyperparameter of the preferred model.

Paragraph [0077] simply states that a preferred model is generated based on a recommendation. The specification does not disclose how to perform a second tuning of the received model according to the fixed training hyperparameter or how to determine a brittleness score of the different model.
Claim 23: “... generating a plurality of parameter seeds for the tuned model; and generating the plurality of convergence outcomes of the tuned model, based on the parameter seeds, wherein determining the brittleness score of the tuned model is based on the convergence outcomes...”
The specification does not disclose how to a) generate parameter seeds for the model, b) generate convergence outcomes of the tuned model, or c) determine the brittleness score of the tuned model.


The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 37 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 37: “... wherein the received model is a synthetic data generation model...” It is unclear what a “synthetic data generation model” is and what functions a “synthetic data generation model” performs.
[0005] Model “brittleness” can cause problems when training a model to perform a new task. A “brittle” model is a model that may fail to converge during training. For example, a brittle model may work well for identifying faces in one person's photo album but may not work well for another person's photo album, without extensive retraining. In some cases, it can be difficult or impossible to train brittle models without human supervision (e.g., training models to generate synthetic data from sensitive data that human users cannot access). During training, brittle models may converge to a sub-optimal state and/or may converge slowly. For example, a model may converge to a model accuracy that is too low. In some cases, brittle models may fail to converge during training (e.g., the model may oscillate between two model states at each training step). Brittle models may need to be retrained to each newly received dataset. In many cases, it may not be apparent whether a model is brittle, without time consuming and costly training efforts.

<examiner note: [0005] simply discusses a “brittle model” that may fail to converge, converge to a sub-optimal state, or converge slowly>
[0021] Systems and methods of disclosed embodiments may involve datasets comprising actual data reflecting real-world conditions, events, or measurement. However, in some embodiments, disclosed systems and methods may fully or partially involve synthetic data (e.g., anonymized actual data or fake data). Datasets of disclosed embodiments may have a respective data schema (i.e, structure), including a data type, key-value pair, label, metadata, field, relationship, view, index, package, procedure, function, trigger, sequence, synonym, link, directory, queue, or the like. Datasets of the embodiments may contain foreign keys, i.e. data elements that appear in multiple datasets and may be used to cross-reference data and determine relationships between datasets. Foreign keys may be unique (e.g., a personal identifier) or shared (e.g., a postal code). Datasets of the embodiments may be “clustered,” i.e., a group of datasets may share common features, such as overlapping data, shared statistical properties, etc. Clustered datasets may share hierarchical relationships (i.e., data lineage).

<examiner note: no line in [0021] discloses that a received model is a “synthetic data generation model”>

[0031] Interface 106 can be configured to manage interactions between system 100 and other systems using network 112. In some aspects, interface 106 can be configured to publish data received from other components of system 100. This data can be published in a publication and subscription framework (e.g., using APACHE KAFKA), through a network socket, in response to queries from other systems, or using other known methods. The data can be synthetic data, as described herein. As an additional example, interface 106 can be configured to provide information received from model storage 108 regarding available datasets. In various aspects, interface 106 can be configured to provide data or instructions received from other systems to components of system 100. For example, interface 106 can be configured to receive instructions for generating data models (e.g., type of data model, data model parameters, training data indicators, training hyperparameters, or the like) from another system and provide this information to model optimizer 104. As an additional example, interface 106 can be configured to receive data including sensitive portions from another system (e.g., in a file, a message in a publication and subscription framework, a network socket, or the like) and provide that components of system 100.
[0042] Programs 235 may include a model-training module 236, a dataset-clustering module 237, a model-clustering module 238, a model-optimization module 239, and/or other modules not depicted to perform methods of the disclosed embodiments. In some embodiments, modules of programs 235 may be configured to generate (“spin up”) one or more ephemeral container instances to perform a task and/or to assign a task to a running (warm) container instance, consistent with disclosed embodiments. Modules of programs 235 may be configured to receive, retrieve, and/or generate models, consistent with disclosed embodiments. Modules of programs 235 may be configured to receive, retrieve, and/or generate datasets (e.g., to generate synthetic datasets, data samples, or other datasets), consistent with disclosed embodiments. Modules of programs 235 may be configured to perform operations in coordination with one another. For example, model-optimization module 239 may send a model training request to model-training module 236 and receive a trained model in return, consistent with disclosed embodiments.
[0055] Dataset-clustering module 237 may include or be configured to implement a data classification model. The data classification model may include machine learning models to classify datasets based on the data schema, statistical profile, foreign keys, and/or edges. The data classification model may be configured to segment datasets, consistent with disclosed embodiments. Segmenting may include classifying some or all data within a dataset, marking or labeling data (e.g., as duplicate), cleaning a dataset, formatting a dataset, or eliminating some or all data within a dataset based on classification. The models may be configured to classify data elements as actual data, synthetic data, relevant data for an analysis goal or topic, data derived from another dataset, or any other data category. The data classification model may include a CNN, a random forest model, an RNN model, a support vector machine model, or another machine learning model.
[0067] At step 302, model optimizer 104 receives a modeling request, consistent with disclosed embodiments. The request may be received from, for example, client device 102 and/or via interface 106. The request may include a preliminary model and/or a dataset. In some embodiments, the preliminary model is a machine learning model. The request may include a reference model, consistent with disclosed embodiments. The dataset may include real (actual) data and/or synthetic data, consistent with disclosed embodiments. In some embodiments, the request includes instructions to generate a model and may include model parameters, hyperparameters, or other model characteristics. In some embodiments, the request includes instructions to retrieve a model and/or a dataset from a data storage (e.g., data 231, model storage 108, and/or database 110). The request may include instructions to generate or retrieve a model based on a desired outcome and a dataset (or a dataset cluster or other dataset characteristic), consistent with disclosed embodiments. The request may include one or more parameter seed properties. For example, the request may include an instruction to generate a random parameter seed, to generate a grid of parameter seeds, to generate a predetermined number of parameter seeds, or the like.
[0082] At step 402, model optimizer 104 receives model information, consistent with disclosed embodiments. The information may be received from, for example, client device 102 and/or via interface 106. The information may include a model and/or a dataset. In some embodiments, the model is a machine learning model. The dataset may include real (actual) data and/or synthetic data, consistent with disclosed embodiments. In some embodiments, the information includes instructions to generate a model and may include model parameters, hyperparameters, or other model characteristics. In some embodiments, the information includes instructions to retrieve a model and/or a dataset from a data storage (e.g., data 231, model storage 108, and/or database 110). The information may include instructions to generate or retrieve a model based on a desired outcome and a dataset (or a dataset cluster or other dataset characteristic), consistent with disclosed embodiments. The information may include one or more parameter seed properties (e.g., an instruction to generate a random parameter seed, to generate a grid of parameter seeds, to generate a predetermined number of parameter seeds, or the like).
[0100] At step 602, model optimizer 104 receives a plurality of datasets, consistent with disclosed embodiments. For example, model optimizer 104 may receive datasets from at least one of client device 102, data 231, database 110, another component of system 100, or another remote device. Step 602 may be a triggering event that causes model optimizer 104 to generate an ephemeral container instance to perform other steps of process 600. Step 602 may include receiving a dataset index, a data label, a foreign key, or a foreign key index. The label may indicate whether one or more data elements are actual data, synthetic data, relevant data, or another category of data. The dataset index may include metadata, an indicator of whether data element is actual data or synthetic data, a data schema, a statistical profile, a data label, a relationship between datasets (e.g., node and edge data), or other descriptive information.
[0103] Also at step 606, model optimizer 104 may implement a data classification model, consistent with disclosed embodiments. The data classification model may segment a cluster of connected datasets comprising the selected dataset based on the plurality of edges. In some embodiments, the segmenting may be based on at least one of a statistical metric, a data schema, a foreign key, a data label, an analysis goal, or an analysis topic. The label may indicate that a data element is actual data, synthetic data, or another category of data.
[0104] In some embodiments, segmenting the cluster of connected datasets at step 606 includes labelling data in the cluster of connected datasets, and or removing data based on a label. For example, step 606 may include removing data that is labelled as at least one of synthetic data, derived data, or irrelevant data. In some embodiments, a received dataset in the cluster of connected datasets may comprise labelled data, and segmenting may be based on the received, labelled data.
[0108] At step 702, model optimizer 104 receives model information, consistent with disclosed embodiments. The information may be received from, for example, client device 102 and/or via interface 106. The information may include a model and/or a dataset. In some embodiments, the model is a machine learning model. The dataset may include real (actual) data and/or synthetic data, consistent with disclosed embodiments. In some embodiments, the information includes instructions to generate a model and may include model parameters, hyperparameters, or other model characteristics. In some embodiments, the information includes instructions to retrieve a model and/or a dataset from a data storage (e.g., data 231, model storage 108, and/or database 110). The information may include instructions to generate or retrieve a model based on a desired outcome and a dataset (or a dataset cluster or other dataset characteristic), consistent with disclosed embodiments.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 21-38 and 40 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because claims 21 and 40 recite a system comprising a memory and one or more processors that, under the broadest reasonable interpretation, are software per se.

[Image: media_image2.png]

System 100, as shown in fig. 1, comprises Model Optimizer 104.

[Image: media_image3.png]

Given the broadest reasonable interpretation, the Model Optimizer is implemented in software.

[Image: media_image4.png]

Fig. 2 shows that the Model Optimizer houses processors and memory.

[Image: media_image5.png]

Given the broadest reasonable interpretation, the processor is a virtual processor.

The Model Optimizer is implemented in software and the processor is a virtual processor. It follows that memory 230 is a virtual/software memory, because a hardware memory cannot be implemented in a software Model Optimizer.
The claims recite "a system" with various items configured to perform operations, but recite no hardware in the system to perform the claimed steps. Claims 21 and 40 are nothing more than software per se (see specification paragraphs [026], [038], [040], fig. 1 and fig. 2). The claims lack the necessary physical articles or objects to constitute a machine or manufacture within the meaning of 35 USC 101. They are clearly not a series of steps or acts to be a process nor are they a combination of chemical compounds to be a composition of matter. As such, they fail to fall within a statutory category. They are, at best, functional descriptive material.
The dependent claims of claim 21 are rejected for failing to cure these deficiencies.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tseng (U.S. Pub 2020/0302292 A1), in view of Li (U.S. Pub 2016/0078339 A1).
Claim 21
Tseng discloses a system for generating a preferred model, the system comprising (fig. 2, apparatus 200, fig. 5):
at least one memory storing instructions (fig. 5, memory 604); and 
one or more processors that execute the instructions to perform operations comprising (fig. 5, processing apparatus 502):
receiving a model characteristic of a received model, the received model comprising a machine learning model ([0052], “... provide model information 204 defining one or more models to the user device for storage. The model information 204 may define parameters of the model. The parameters may include hyperparameters for the model such as a number of layers in the model, a number of kernels in each of the layers, identifiers of the orthogonal binary basis vectors used to represent the kernels, or any combination thereof. In addition, the parameters may include a set of coefficients for each kernel. In some examples, the model information may also include information describing the performance of the model, such as accuracy and/or computational resource use...” [0059], line 4-7, “... The plurality of neural network models may be referred to as a candidate list and may have been received from the neural network training apparatus 200...” <examiner note: the received model is a neural network model (i.e., a machine learning model). A neural network has model characteristics such as the number of layers, kernels, coefficients, and so on>); 
classifying the received model based on the model characteristic ([0083], line 1-3, “... S4-1, the apparatus 200 may receive information indicative of one or more constraints for the neural network model...” [0084], line 10-12, “... the received information may identify a particular implementation or class implementations for the neural network model...” <examiner note: using the received information, the apparatus identifies a class, type, classification of the received neural network>); 
identifying a fixed training hyperparameter based on the classification ([0085], line 1-2, “... operation S4-2, the apparatus 200 determines at least one set of hyperparameters for a neural network model...”); 
generating a tuned model by performing hyperparameter tuning of the received model according to the fixed training hyperparameter ([0089], line 1-4, “... the apparatus 200 selects the sets of hyperparameters from the model grid iteratively, and trains a neural network based on each set of hyperparameters...” <examiner note: the neural network is trained using the hyperparameters. The trained neural network corresponds to the tuned model>); 
However, Tseng does not explicitly disclose:
determining a brittleness score of the tuned model;
comparing the brittleness score of the tuned model to a brittleness score of a different model; and 
generating the preferred model based on the comparison and the tuned model or the different model.
Li discloses:
determining a brittleness score of the tuned model ([0036], “... Evaluating component 128... for evaluating the student DNN model... evaluating component 128 evaluates the output distributions of the student and teacher DNNs... determines whether the student is continuing to improve or whether the student is no longer improving...” <examiner note: the output distribution of the student model is considered a brittleness score because it shows whether or not the student model has reached a convergence state>);
comparing the brittleness score of the tuned model to a brittleness score of a different model ([0036], “... evaluating component 128 evaluates the output distributions of the student and teacher DNNs, determines the difference (which may be determined as an error signal) between the outputs and also determines whether the student is continuing to improve or whether the student is no longer improving (i.e. the student output distribution shows no further trend towards convergence with the teacher output)...” <examiner note: the output distributions of student model and teacher/different model are compared>); and 
generating the preferred model based on the comparison and the tuned model or the different model ([0065], “... At step 560, the student DNN is updated based on the evaluation determined at step 550... In one embodiment, the difference between the output distribution of the student DNN and teacher DNN determined in step 550 is used to update the parameters or node weights of the student DNN, which may be performed using back propagation. Updating the student DNN in this way facilitates training the output distribution of the student DNN to more closely approximate the output distribution of the teacher DNN...” <examiner note: an updated student model is a preferred model. It is generated based on the comparison and the student model or the teacher model>).

Tseng discloses that a neural network model is trained and that a local minimum of the loss function is calculated; however, the performance of the neural network is not compared with a reference model. Li discloses that the performance/output distribution of a student neural network is compared with that of a teacher model, and that a better student model is generated using the comparison result, the student model, and the teacher model. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to calculate and compare the performance of one model with the performance of another/reference model to identify the difference/divergence between the models in order to retrain the model to a higher accuracy with a smaller error rate.
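For illustration only, the comparison of student and teacher output distributions relied upon above can be sketched as follows. This is a minimal sketch under stated assumptions: the use of KL divergence as the difference measure, the function names `kl_divergence` and `has_converged`, the threshold value, and the sample distributions are all hypothetical and are not taken from the cited passages of Li, which refer only to a difference (error signal) between the outputs and to a threshold for convergence.

```python
import math


def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student) for two discrete output distributions.

    Assumption: KL divergence stands in for the "difference" between
    the teacher and student outputs; eps guards against log(0).
    """
    if len(teacher) != len(student):
        raise ValueError("distributions must have the same length")
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher, student)
        if t > 0
    )


def has_converged(teacher, student, threshold=0.01):
    """True when the student output is within the (hypothetical) threshold."""
    return kl_divergence(teacher, student) <= threshold


# Hypothetical output distributions over three classes.
teacher_out = [0.7, 0.2, 0.1]
student_early = [0.4, 0.4, 0.2]    # early in training, far from the teacher
student_late = [0.69, 0.21, 0.10]  # later in training, close to the teacher

print(has_converged(teacher_out, student_early))
print(has_converged(teacher_out, student_late))
```

Under these assumptions, a student whose output stays above the threshold would continue to be trained, while one at or below the threshold would be treated as having converged toward the teacher.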
Claim 21/22
Claim 21 is included, Tseng discloses wherein the hyperparameter tuning is a first hyperparameter tuning ([0085], line 1-2, “... operation S4-2, the apparatus 200 determines at least one set of hyperparameters for a neural network model...”) and the operations further comprise: generating the different model by performing a second hyperparameter tuning of the received model according to the fixed training hyperparameter ([0086], “... In some implementations, the apparatus 200 determines plural different sets of hyperparameters, all of which is satisfy the imposed constraints. The different sets of hyperparameters may be referred to as sets of neural network model information...”); and determining a brittleness score of the different model (Li, [0036], “... evaluating component 128 evaluates the output distributions of the student and teacher DNNs, determines the difference (which may be determined as an error signal) between the outputs and also determines whether the student is continuing to improve or whether the student is no longer improving (i.e. the student output distribution shows no further trend towards convergence with the teacher output)...”)
Claim 23
Claim 21 is included, Tseng discloses the operations further comprising: generating a plurality of parameter seeds for the tuned model; and generating the plurality of convergence outcomes of the tuned model, based on the parameter seeds, wherein determining the brittleness score of the tuned model is based on the convergence outcomes ([0088], “... Subsequently, in operation S4-3, one or more neural networks based on each of the sets of hyperparameters are trained. The training data may be retrieved from a larger set of data (labelled examples) from which some of the data is used as training examples for training the neural network models and other data of the set is used as validation examples, for validating trained neural network models...” [0089], “... In some examples, the apparatus 200 selects the sets of hyperparameters from the model grid iteratively, and trains a neural network based on each set of hyperparameters in turn. In other examples, two or more of the neural networks may be trained concurrently...” [0090], “... During training, the kernels in each of the layers are implemented using superposition of the products of the binary basis vectors and corresponding coefficients. The coefficients for each of the kernels may be initiated randomly or in any other suitable way. The coefficients are then updated/refined during training using gradient descent and back propagation. In some examples, stochastic gradient descent may be used...” [0091], “... Once a local minimum has been found and/or a maximum number of iterations have been performed, the coefficients are (in operation S4-4) stored in the appropriate set of neural network model information along with the corresponding hyperparameters. The coefficients may be stored in sets, each corresponding to a different kernel. Each of the coefficients may also be stored in a manner which allows the corresponding orthogonal binary basis vector to be identified. For instance, each coefficient may be stored in association with an identifier of the associated orthogonal binary basis vector...”)
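For illustration only, the relationship recited in claim 23 between parameter seeds, convergence outcomes, and a brittleness score can be sketched as follows. This is a hypothetical sketch and not a disclosure of the application, Tseng, or Li: the toy objective f(w) = w**2, the names `train_once` and `brittleness_score`, the default learning rate, step limit, and tolerance, and the definition of the score as the fraction of seeded runs that fail to converge are all assumptions made for the example.

```python
import random


def train_once(seed, lr=0.1, max_steps=30, tol=1e-3):
    """Toy training run: gradient descent on f(w) = w**2.

    The seed fixes the random initial weight (the "parameter seed").
    Returns True if |w| falls below tol within max_steps (converged).
    """
    rng = random.Random(seed)
    w = rng.uniform(-2.0, 2.0)   # randomly initialised parameter
    for _ in range(max_steps):
        if abs(w) < tol:
            return True
        w -= lr * 2.0 * w        # gradient of w**2 is 2*w
    return abs(w) < tol


def brittleness_score(seeds, **train_kwargs):
    """Hypothetical score: fraction of seeded runs that did not converge."""
    seeds = list(seeds)
    if not seeds:
        raise ValueError("at least one parameter seed is required")
    outcomes = [train_once(seed, **train_kwargs) for seed in seeds]
    failures = sum(1 for converged in outcomes if not converged)
    return failures / len(outcomes)


# Usage: one convergence outcome per seed, summarised as a single score.
print(brittleness_score(range(20)))
```

Under this assumed definition, a score of 0.0 means every seeded run converged and a score of 1.0 means none did; other definitions (for example, one based on convergence speed) would be equally possible.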
Claim 24
Claim 23 is included, Tseng discloses wherein the parameter seeds comprise randomly-generated parameter seeds ([0090] During training, the kernels in each of the layers are implemented using superposition of the products of the binary basis vectors and corresponding coefficients. The coefficients for each of the kernels may be initiated randomly or in any other suitable way. The coefficients are then updated/refined during training using gradient descent and back propagation. In some examples, stochastic gradient descent may be used...”)
Claim 25
Claim 21 is included, Li discloses wherein the different model is a reference model ([0036], “... Evaluating component 128... for evaluating the student DNN model... evaluating component 128 evaluates the output distributions of the student and teacher DNNs... determines whether the student is continuing to improve or whether the student is no longer improving...” <examiner note: teacher model is considered as reference model>)
Claim 26
Claim 21 is included, Li discloses the operations further comprising providing the brittleness score of the tuned model ([0036], “... Evaluating component 128... for evaluating the student DNN model... evaluating component 128 evaluates the output distributions of the student and teacher DNNs...” <examiner note: the output distribution of the student model is the brittleness score of the tuned/trained model>)
Claim 27
Claim 21 is included, Li discloses wherein generating the preferred model comprises selecting the tuned model ([0065] At step 560, the student DNN is updated based on the evaluation determined at step 550...”)
Claim 28
Claim 21 is included, Li discloses wherein generating the preferred model comprises training the tuned model ([0065] At step 560, the student DNN is updated based on the evaluation determined at step 550. The student DNN may be updated by a training component ... Updating the student DNN in this way facilitates training the output distribution of the student DNN to more closely approximate the output distribution of the teacher DNN...”)
Claim 29
Claim 21 is included, Tseng discloses wherein classifying the received model includes determining that the model belongs to a cluster of models, and the fixed hyperparameter is associated with the cluster ([0059], “...select a neural network model from a plurality of neural network models locally stored on the device 202. The plurality of neural network models may be referred to as a candidate list...”[0015], “... plural sets of hyperparameters, each set of hyperparameters defining a different neural network model...” [0085], line 1-2, “... operation S4-2, the apparatus 200 determines at least one set of hyperparameters for a neural network model...” <examiner note: a model is selected from a group/cluster of models and a set of hyperparameters of the group of models are selected>)
Claim 30
Claim 21 is included, Tseng discloses the operations further comprising generating an accuracy score of the tuned model, and wherein generating the preferred model is further based on the accuracy score ([0095], “... determines whether each of the trained neural networks satisfies the one or more imposed constraints... For instance, the apparatus 200 may determine whether the accuracy of the trained neural network satisfies the minimum acceptable accuracy constraint and/or whether the computational resource usage of the trained neural network satisfies the computational resource constraint...”)
Claim 31
Claim 21 is included, Tseng discloses the operations further comprising generating a training-time score of the tuned model, and wherein generating the preferred model is further based on the training-time score ([0094], “... Validation may allow the apparatus 200 to determine the accuracy of the model. (since the validation examples are labelled). In addition, the apparatus 200 may monitor the computational resource use during validation. The monitored computational resource use may include energy consumption, CPU usage and memory used to execute the neural network model...” <examiner note: CPU usage is the percentage of time a CPU spends processing non-idle tasks. In this application, the task is training the model. Therefore, CPU usage relates to training time>)
Claim 32
Claim 21 is included, Li discloses wherein determining the brittleness score of the tuned model is based on a plurality of convergence outcomes associated with one or more training runs ([0037], “... In particular, some embodiments of evaluating component 128 apply a threshold to determine convergence of the teacher DNN and student DNN output distributions. Where the threshold is not satisfied, iteration may continue, thereby further training the student to approximate the teacher...”)
Claim 33
Claim 21 is included, Li discloses wherein comparing the brittleness score of the tuned model to a brittleness score of the different model comprises retrieving the brittleness score of the different model from storage ([0042], “... teacher DNN 302 comprises a trained DNN model... In the embodiment shown in FIG. 3, student DNN 301 has output distribution 351, and teacher DNN 302 has output distribution 302 of the same size, although the student DNN 301 itself is smaller than teacher DNN 302...” <examiner note: the teacher model has already been trained, so the output distribution of the teacher model is simply a historical output distribution>)
Claim 34
Claim 21 is included, Li discloses the operations further comprising identifying the different model, and determining a reference brittleness score of the different model ([0045], “... a teacher DNN 402 are provided. Teacher DNN 402 comprises an ensemble teacher DNN model...” [0046], “... initialization component 124 of FIG. 1 (or a similar service) may determine the specific sub-DNNs to be included in the ensemble...” [0036], “... evaluating component 128 evaluates the output distributions of the student and teacher DNNs...”)
Claim 35
Claim 21 is included, Tseng discloses wherein the model characteristic comprises a model parameter ([0052], “... provide model information 204 defining one or more models to the user device for storage. The model information 204 may define parameters of the model. The parameters may include hyperparameters for the model such as a number of layers in the model, a number of kernels in each of the layers, identifiers of the orthogonal binary basis vectors used to represent the kernels, or any combination thereof. In addition, the parameters may include a set of coefficients for each kernel. In some examples, the model information may also include information describing the performance of the model, such as accuracy and/or computational resource use...”)
Claim 36
Claim 21 is included, Tseng discloses wherein the model characteristic comprises a model type ([0052], “... provide model information 204 defining one or more models to the user device for storage. The model information 204 may define parameters of the model. The parameters may include hyperparameters for the model such as a number of layers in the model, a number of kernels in each of the layers, identifiers of the orthogonal binary basis vectors used to represent the kernels, or any combination thereof. In addition, the parameters may include a set of coefficients for each kernel. In some examples, the model information may also include information describing the performance of the model, such as accuracy and/or computational resource use...” <examiner note: the underlined characteristics are associated with a type of machine learning model, i.e., a neural network>)
Claim 37
Claim 21 is included, Tseng discloses wherein the received model is a synthetic data generation model ([0059], lines 4-7, “... The plurality of neural network models may be referred to as a candidate list and may have been received from the neural network training apparatus 200...”)
Claim 38
Claim 21 is included, Li discloses wherein classifying the received model is based on a training dataset previously used to train the received model ([0037], “... Where the threshold is not satisfied, iteration may continue, thereby further training the student to approximate the teacher. Where the threshold is satisfied, then convergence is determined (indicating the student output distribution is sufficiently close enough to the teacher DNN's output distribution) and the student DNN may be considered trained...” <examiner note: the student model is classified as an untrained or a trained model based on the training dataset>)
Claims 39 and 40 are similar to claim 1. The claims are rejected for similar reasons.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAU HAI HOANG whose telephone number is (571) 270-5894. The examiner can normally be reached 1st biwk: Mon-Thurs 7:00 AM-5:00 PM; 2nd biwk: Mon-Thurs 7:00 AM-5:00 PM, Fri 7:00 AM-4:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel, can be reached at 571 262 3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

HAU HAI HOANG
Primary Examiner
Art Unit 2167



/HAU H HOANG/Primary Examiner, Art Unit 2167