Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

Response to Amendment
This communication is in response to the amendment filed on 2/18/2022 for Application No. 16/386,700. Claims 1-14, 16-18 and 20 are currently pending and have been examined. Claims 1-14, 16-18 and 20 have been rejected.

Examiner’s Note

Regarding 35 U.S.C. 101 compliance, Claims 1-14, 16-18 and 20 are compliant with 35 U.S.C. 101 in accordance with the “2019 Revised Patent Subject Matter Eligibility Guidance” (2019 PEG), published in the Federal Register, Vol. 84, No. 4, Monday, January 7, 2019. Examiner’s analysis of the independent claims is presented below:
Claim 1. Step 1 of the 2019 PEG: Does the claim fall within a Statutory Category? Yes. The claim recites a device.
Step 2A - Prong 1: Is a Judicial Exception recited in the claim? No.
The claim does not recite any of the judicial exceptions enumerated in the 2019 PEG. For instance, the claim does not recite a mental process because the claim, under its broadest reasonable interpretation, does not cover performance in the mind but for the recitation of generic computer components. For example, the claim recites the limitations of “A device for training a model, the device comprising at least one sensor configured to acquire a plurality of data instances; a communication interface configured to communicate with a parameter server; and a device processor configured to train the model using a threshold quantity of the data instances of the plurality of data instances; the device processor configured to over sample or under sample the plurality of data instances to equal the threshold quantity; the device processor further configured to transmit a parameter vector of the trained model to the parameter server and receive in response, an updated central parameter vector from the parameter server derived from the model; the device processor further configured to retrain the model using the updated central parameter vector;” these limitations require action by a device comprising a sensor and a device processor that communicates with a parameter server, which cannot practically be performed in the mind at least because it requires a device processor. Furthermore, the claim does not recite any method of organizing human activity, such as a fundamental economic concept or managing interactions between people. Finally, the claim does not recite a mathematical relationship, formula, or calculation. Thus, the claim is eligible because it does not recite a judicial exception. The claim requires at least a worker device.
Next, in light of DDR Holdings, LLC v. Hotels.com, L.P., 773 F.3d 1245 (Fed. Cir. 2014), the claimed structures and processes of this invention are "necessarily rooted in computer technology" (a method for training a model using a plurality of distributed worker devices) and "overcome a problem specifically arising in the realm of computer networks." Id. at 1257.
Step 2A - Prong 2: Integrated into a Practical Application? n/a.
Step 2B: Claim provides an Inventive Concept? n/a.

Claim 14. Step 1 of the 2019 PEG: Does the claim fall within a Statutory Category? Yes. A method.
Step 2A - Prong 1: Is a Judicial Exception recited in the claim? No, for the same reasons given above for claim 1. Thus, the claim does not recite a judicial exception. The claim requires at least a device processor.
Step 2A - Prong 2: Integrated into a Practical Application? n/a.
Step 2B: Claim provides an Inventive Concept? n/a.

Claim 17. Step 1 of the 2019 PEG: Does the claim fall within a Statutory Category? Yes. A computer-readable non-transitory medium.
Step 2A - Prong 1: Is a Judicial Exception recited in the claim? No, for the same reasons given above for claim 1. Thus, the claim does not recite a judicial exception. The claim requires at least a worker device.
Step 2A - Prong 2: Integrated into a Practical Application? n/a.
Step 2B: Claim provides an Inventive Concept? n/a.
As to claims 2-13, 16-17 and 20: they depend on the independent claims analyzed above and add features such as “wherein the device processor is configured to over sample or under sample the plurality of data instances so that when a number of data instances available to the device processor is larger than the threshold quantity…”; “wherein the device processor is further configured to receive in response to the transmission of the parameter vector to the parameter server, an updated threshold quantity from the parameter server, wherein the device processor is further configured to retrain using the updated threshold quantity of the data instances of the plurality of data instances”; “wherein the at least one sensor is coupled with a vehicle”; “wherein the updated central parameter is transmitted to the device prior to the updated central parameter being altered again”, etc., which emphasize that it is not practical to perform the instant invention in the human mind.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.

Claims 1-7, 9-14, 16-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US PG. Pub. No. 20180336481 (Guttmann) in view of US PG. Pub. No. 20190197404 (WANG).

As to claims 14 and 18, Guttmann discloses a method for training a model using a plurality of distributed worker devices, the method comprising: 
a)  identifying, by a worker device (see for example Figs 1A and 1B and associated disclosure),
a plurality of data instances  (see at least element 710 in Fig. 7;  “[0056] In some embodiments, the one or more communication modules 230 may be configured to receive and transmit information. For example, control signals may be transmitted and/or received through communication modules 230. In another example, information received though communication modules 230 may be stored in memory units 210. In an additional example, information retrieved from memory units 210 may be transmitted using communication modules 230. In another example, input data may be transmitted and/or received using communication modules 230. Examples of such input data may include: input data inputted by a user using user input devices; information captured using one or more sensors; and so forth. Examples of such sensors may include: audio sensors 250; image sensors 260; motion sensors 270; positioning sensors 275; chemical sensors; temperature sensors; barometers; pressure sensors; proximity sensors; electrical impedance sensors; electrical voltage sensors; electrical current sensors; and so forth”, paragraph 56); 

b) selecting, by the worker device, a first set of data instances from the plurality of data instances as a function of a threshold quantity received from a parameter server
(“[0038] Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing”, “calculating”, “computing”, “determining”, “generating”, “setting”, “configuring”, “selecting”, “defining”, “applying”, “obtaining”, “monitoring”, “providing”, “identifying”, “segmenting”, “classifying”, “analyzing”, “associating”, “extracting”, “storing”, “receiving”, “transmitting”, or the like, include action and/or processes of a computer that manipulate and/or transform data into other data, said data represented as physical quantities, for example such as electronic quantities, and/or said data representing the physical objects….”, paragraph 38.
“[0085] In some embodiments, dataset 610 may comprise data and information arranged in data-points. For example, a data-point may correspond to an individual, to an object, to a geographical location, to a geographical region, to a species, and so forth. For example, dataset 610 may comprise a table, and each row or slice may represent a data-point. For example, dataset 610 may comprise several tables, and each data-point may correspond to entries in one or more tables. For example, a data-point may comprise a text document, a portion of a text document, a corpus of text documents, and so forth….”, paragraph 85. 
“…Step 750 may compare the updated information associated with the external devices obtained by Step 740 with the original information associated with the external devices 710 to determine if the magnitude of the update is above a selected threshold…”, paragraph 132 and Fig. 7.
“….the number of samples in the set may be selected according to the available memory size [under the broadest reasonable interpretation, this is a threshold quantity]. In some examples, training examples may be sampled (for example, according to the available processing resources information, to available memory size, etc. [under the broadest reasonable interpretation, this is a threshold quantity].), …”, paragraph 152);

wherein selecting comprises over sampling or under sampling the plurality of data instances 
(Guttmann teaches a system that samples according to different rules: “…In some examples, the training examples may be selected from a plurality of …training examples (for example from datasets 610 and/or annotations 620 and/or views 630) ….. For example, the training examples may be selected according to their size and according to rules chosen …. in response to the available processing resources information …. Some examples of such rules may include the selection of training examples with size that is below a selected threshold, above a selected threshold, and so forth….”, paragraph 157)
so that when a number of data instances available to the worker device is larger than the threshold quantity, the worker device samples the threshold quantity of data instances 


and when the number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances
(“[0128] Additionally or alternatively to Step 720, process 700 may generate synthetic examples using the information associated with external devices (for example, the information obtained by Step 710). For example, an artificial neural network trained to produce synthetic examples from information associated with external devices may be used. In another example, using the information associated with external devices, some examples may be selected as described above, and additional synthetic examples may be generated, for example using the Synthetic Minority Over-sampling Technique (SMOTE)”, paragraph 128.
“[0206] FIG. 13 illustrates an example of a process 1300 for enriching datasets while learning. In this example, process 1300 may comprise: obtaining intermediate results of training machine learning algorithms (Step 1310); obtaining additional training examples … (Step 1320); and training the machine learning algorithms using the obtained additional training example …”, paragraph 206)
and then resamples one or more of the data instances until the threshold quantity is reached
(“…batch size, momentum, random seed [method for resampling], and so forth…”, paragraph 198);

Guttmann does not expressly disclose that when the number of
data instances available to the worker device is larger than the threshold quantity, the worker device samples the threshold quantity of data instances 
data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances
and then resamples one or more of the data instances until the threshold quantity is reached
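
For illustration only, the sampling rule recited in these limitations can be sketched as follows. This is a minimal sketch and is not drawn from Guttmann, WANG, or the application: the function name select_instances, the seed parameter, and the choice of uniform random draws (with replacement when resampling) are all assumptions.

```python
import random


def select_instances(instances, threshold, seed=None):
    """Return exactly `threshold` instances by under- or over-sampling.

    Hypothetical helper: the name, the uniform random draws, and the
    seed parameter are illustrative assumptions, not claim language.
    """
    if threshold > 0 and not instances:
        raise ValueError("cannot over-sample an empty set of instances")
    rng = random.Random(seed)
    if len(instances) >= threshold:
        # Under-sample: draw `threshold` distinct instances.
        return rng.sample(instances, threshold)
    # Over-sample: take all instances first, then resample
    # (with replacement) until the threshold quantity is reached.
    selected = list(instances)
    while len(selected) < threshold:
        selected.append(rng.choice(instances))
    return selected
```

In this sketch the returned set always has the threshold quantity of elements, whether the device holds more or fewer data instances than the threshold.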

However, giving the claims their broadest reasonable interpretation (MPEP 2111), these limitations are no more than rules, and from the teaching of Guttmann it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine rules as disclosed in Guttmann (“…Some examples of such rules may include the selection of training examples with size that is below a selected threshold, above a selected threshold, and so forth….”, paragraph 157), since the technical ability exists to combine the elements as claimed and the results of the combination are predictable; consequently, known variations or principles would 

c) training, by the worker device, the model using the first set of data instances and a set of first parameters
(“…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms, deep learning algorithms for artificial neural networks, AdaBoost, linear regression, and so forth. For example, process 1200 may be used to select hyper-parameters for the machine learning algorithm and/or to cause a selected device to train the machine learning algorithm…”, paragraph 129 and Fig. 12.
“…selecting a device (Step 1250); and causing the … device to perform the training task (Step 1260)…”, paragraph 196, see at least Fig. 12 element 1260.
“[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set [Examiner interprets as first set of data instances and a set of first parameters], properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another…”, paragraph 198); 

d) transmitting, by the worker device, 
(“[0081] In some embodiments, the one or more external communication modules 450 may be configured to receive and/or to transmit information. For 450….”, paragraph 81. See also Fig. 3 at least element 230 and Fig. 4B elements 440 and 450),
a set of second parameters of the trained model to the parameter server
(“[0098] In some embodiments, algorithm 640 may comprise one or more decision rules. For example, a decision rule may compare a computed value to a threshold, and in some cases the threshold may be set based on a parameter and/or a hyper-parameter. In some embodiments, algorithm 640 may be preprogrammed manually. For example, a manually preprogrammed algorithm may implement a heuristic algorithm that has zero or more parameters and/or hyper-parameters. In some embodiments, algorithm 640 may comprise a machine learning algorithm configured to train on training examples,…”, paragraph 98.
“…In some examples, the plurality of examples and/or the corresponding assigned weights may be used as… a validation set  [Examiner interprets as second parameters]…”, paragraph 129.
 “…[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set, properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another….”, paragraph 198.
“…Similarly, some examples of properties of the validation set may include samples of the validation examples of the validation set, the entire validation set, the number of validation examples in the validation set, information about the size of 
e) receiving, by the worker device, a set of third parameters from the parameter server and an updated threshold quantity, wherein the set of third parameters is calculated at least partially as a function of the set of second parameters
(“…Further, Step 1030 may be repeated for different groups of data items associated with different groups of devices, for example comparing the results of applying a first group of data items associated with a first group of devices to a first inference model with the results of applying the first group of data items to a second inference model, comparing the results of applying a second group of data items associated with a second group of devices to a first inference model with the results of applying the second group of data items to a second inference model, and so forth. In some examples, comparing the results may comprise comparing loss function values associated with the results, comparing values of a function that summarizes the results, comparing the distributions of the results, comparing the distributions of errors, comparing the distributions of the results where the distributions are with respect to an input space, comparing the distributions of errors where the distributions are with respect to an input space, and so forth….”, paragraph 168 and Fig. 10.
“…In yet another example, using process 1700 it may be determined that process 1200 has insufficient quota  [Examiner interprets as threshold quantity] to use some devices, and as a result different devices may be selected. In another a cost function may be used to assign cost for each one of the plurality of devices according to their corresponding estimated processing resources requirements, and the device corresponding to the lowest cost (possibly out of the devices that satisfy the constraints as described above) may be selected. An example of such a cost function may include c1*t+c2*s, where c1 and c2 are positive constants which may represent cost per processing time and cost per memory size respectively, t may represent the estimated processing time, and s may represent the estimated memory size. In some examples, the estimated processing resources requirements may comprise an estimated range of processing resources requirements together with a distribution that assigns probabilities to the estimations. In such cases, the constraints may specify a required certainty that some other conditions hold [Examiner interprets as threshold quantity]. Further, the cost function may comprise a function that sums values over the different estimations according to the probabilities [Examiner interprets as updated threshold quantity]…”, paragraph 203);
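
For illustration only, the device selection quoted from paragraph 203 (a cost of c1*t+c2*s per device, with the lowest-cost device that satisfies the constraints being selected) can be sketched as follows. The function name select_device, the dictionary keys "t" and "s", the default constants, and the max_time and max_memory constraint parameters are assumptions introduced for this sketch and do not appear in Guttmann.

```python
def select_device(devices, c1=1.0, c2=0.5, max_time=None, max_memory=None):
    """Pick the lowest-cost device among those satisfying the constraints.

    Each device is a dict with an estimated processing time "t" and an
    estimated memory size "s" (hypothetical field names). The cost is
    c1 * t + c2 * s, with c1 and c2 positive constants.
    """
    candidates = [
        d for d in devices
        if (max_time is None or d["t"] <= max_time)
        and (max_memory is None or d["s"] <= max_memory)
    ]
    if not candidates:
        return None  # no device satisfies the constraints
    return min(candidates, key=lambda d: c1 * d["t"] + c2 * d["s"])
```

Tightening a constraint in this sketch can change which device is selected, which mirrors the quoted statement that different devices may be selected when a quota is insufficient.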

f) selecting, by the worker device, a second set of data instances from the plurality of data instances as a function of the updated threshold quantity received from a parameter server
(“[0208] In some examples, the intermediate results may comprise values of parameters of the machine learning algorithm. In some examples, the intermediate results may comprise values measured using at least part of the training examples and/or using at least part of the validation examples and/or using at least part of the test 
“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, …. In some examples, the progress update may be related to an action comprising minimizing and/or maximizing an objective function (for example, an objective function based on data from datasets 610 and/or annotations 620 and/or views 630), and the progress update may comprise indications of the status of the minimization and/or maximization…”, paragraph 270); and
g) training, by the worker device, the model using the second set of data instances and the set of third parameters
(“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, for example comprising applying information to the inference model, and the progress update may comprise indications of the status of the action. For example, the information to be applied to the inference model may comprise a plurality of data-points, and the status may comprise the number and/or ratio of data-points already applied to the inference model, the number and/or ratio of data-points waiting to be applied to the inference model, the outputs (and/or statistics about the outputs) of the inference model for the data-points already applied, and so forth. In some examples, the progress update may be related to an action comprising minimizing and/or maximizing an objective function (for example, an objective function based on data from datasets 610 and/or annotations 620 and/or views 630), and the progress update may comprise indications of the status of the minimization and/or maximization. For example, the progress update may comprise intermediate results and/or intermediate status of minimization and/or maximization, such as objective value, iteration number, gradient at the intermediate result, last step size, rate of convergence, and so forth”, paragraph 270 and Fig. 13).
Although Guttmann teaches “…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms, deep learning algorithms for artificial neural networks, AdaBoost, linear regression, and so forth. For example, process 1200 may be used to select hyper-parameters for the machine learning algorithm and/or to cause a selected device to train the machine learning algorithm…” (paragraph 129 and Fig. 12) and
“[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set [Examiner interprets as first set of data instances and a set of first parameters], properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another…” (paragraph 198), Guttmann does not expressly teach the words “first set,” “a set of second,” and “a set of third.”

However, WANG discloses
training of a machine learning model,  “…Various implementations relate to asynchronous training of a machine learning model. A server receives feedback data generated by training the machine learning model from a worker. The feedback data are obtained by the worker with its own training data and are associated with previous values of a set of parameters of the machine learning model at the worker. The server determines differences between the previous values and current values of the set of parameters at the server [Examiner interprets first set, a set of second and a set of third]. The current value may have been updated for once or more due to operation of other workers. Then, the server can update the current values of the set of parameters based on the feedback data and the differences between values of the set of parameters.”, abstract.
“…It is known that a set of training data may be distributed across multiple workers which optimize the model parameters with their respective training data and return the result to a central server. However, the key problem of distributed or asynchronous model training is mismatch between workers. For instance, if a worker returns its updated parameters, the model parameters at the server may have been updated for one or more times by other workers [Examiner interprets as first set, a set of second and a set of third]…”, see at least paragraphs 2 and 27.

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate WANG’s teaching into the teaching of Guttmann. One would have been motivated to provide functionality for training a machine learning model, determining a first set, a second set, and a third set of data and parameters, in order to update parameters in a feedback environment. As WANG states, training a model at a server is well known (see WANG, at least paragraph 2 and abstract).
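
For illustration only, the server behavior quoted from WANG's abstract (determining the differences between the previous values a worker trained on and the current values at the server, then updating the current values based on the feedback data and those differences) can be sketched as follows. The class name, the treatment of the feedback data as gradients, the learning rate, and in particular the form of the correction term lam * g * g * diff are assumptions made for this sketch; the quoted passages of WANG do not specify an update formula.

```python
class ParameterServer:
    """Toy asynchronous parameter server (illustrative assumptions only)."""

    def __init__(self, params, lr=0.1, lam=0.5):
        self.params = list(params)   # current values at the server
        self.lr = lr                 # assumed learning rate
        self.lam = lam               # assumed weight of the correction term
        self.snapshots = {}          # previous values handed to each worker

    def pull(self, worker_id):
        # Record the values this worker will train against.
        self.snapshots[worker_id] = list(self.params)
        return list(self.params)

    def push(self, worker_id, grads):
        # Feedback data (assumed to be gradients) was computed at the
        # worker's previous values, which may now be stale.
        previous = self.snapshots[worker_id]
        for i, g in enumerate(grads):
            diff = self.params[i] - previous[i]  # current minus previous
            # Assumed correction: adjust the stale gradient by the difference.
            self.params[i] -= self.lr * (g + self.lam * g * g * diff)
        return list(self.params)
```

When no other worker has updated the server between a worker's pull and push, the difference is zero and the sketch reduces to a plain gradient step.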


As to claim 1, Guttmann discloses a device for training a model, the device comprising at least one sensor configured to acquire a plurality of data instances (see at least Fig. 2B elements 250, 260, 265, 270 and 275 and associated disclosure); 
a communication interface configured to communicate with a parameter server (see at least Fig. 2B element 230 and associated disclosure); and


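a device processor configured to train the model using a threshold quantity of the data instances of the plurality of data instances
(see Fig. 2B and associated disclosure. see also “…said data represented as physical quantities, for example such as electronic quantities, and/or said data representing the physical objects….”, paragraphs 38 and 85.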
“…Step 750 may compare the updated information associated with the external devices obtained by Step 740 with the original information associated with the external devices obtained by Step 710 to determine if the magnitude of the update is above a selected threshold…”, paragraph 132 and Fig. 7);
 the device processor configured to over sample or under sample the plurality of data instances to equal the threshold quantity
 (“…a group of labeled examples and a group of unlabeled examples may be obtained,..”, abstract)

 the device processor further configured to transmit a parameter vector of the trained model to the parameter server and receive in response, an updated central parameter vector from the parameter server derived from the model
(see Fig 1A, 1B, 2A, 3, 7  and associated disclosure); the device processor further configured to retrain the model using the updated central parameter vector (see at least paragraph 129 and Fig. 12 and Figs. 7 and 8 and associated disclosure); wherein the at least one sensor acquires different data instances than other sensors of the other devices that are training respective models (Fig. 2B);

Although Guttmann teaches “…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms, deep learning algorithms for artificial neural networks, AdaBoost, linear regression, and so forth. For example, process 1200 may be used to select hyper-parameters for the machine learning algorithm and/or to cause a selected device to train the machine learning algorithm…” (paragraph 129 and Fig. 12) and
“[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set [Examiner interprets as first set of data instances and a set of first parameters], properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another…” (paragraph 198), Guttmann does not expressly teach the word “asynchronously”…

However, WANG discloses
(“…Various implementations relate to asynchronous training of a machine learning model. A server receives feedback data generated by training the machine learning model from a worker…”, abstract and Figs. 1-2). Therefore, it would have been obvious to one of ordinary skill in the art 


As to claim 2, Guttmann discloses
wherein selecting the first set of data instances comprises: over sampling or under sampling the plurality of data instances 
(Guttmann discloses “In some embodiments, algorithm 640 may comprise a machine learning algorithm configured to train on training examples, such as training examples included in datasets 610 and/or views 630, to estimate labels and/or tags and/or desired results, such as labels and/or tags and/or desired results included in annotations 620 and/or views 630. For example, algorithm 640 may comprise a kernel based algorithm, such as support vector machine and/or kernel principal component analysis, and the selection of a kernel may be according to a hyper-parameter. For example, algorithm 640 may comprise an artificial neural network, and the structure and/or other characteristics of the artificial neural network may be selected according to hyper-parameters. For example, algorithm 640 may comprise a clustering and/or a segmentation algorithm, and the number of desired clusters and/or segments may be selected according to a hyper-parameter [Examiner interprets as number of data instances]. For example, 640 may comprise a factorization algorithm, and the number of desired factors may be determined according to a hyper-parameter [Examiner interprets as number of data instances]…”, paragraph 98
“…number of samples in the set may be selected according to the available memory size. ..”, paragraph 152.
“…In other examples, an artificial neural network may comprise an output of a machine learning algorithm (and in some cases, deep learning algorithm) trained using training examples. In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size, ..”, paragraph 180),
so that when a number of data instances available to the worker device is larger than the threshold quantity, the worker device samples the threshold quantity of data instances
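(“[0104] In some examples, updating a view, for example by an algorithm processing data from datasets 610 and/or annotations 620 and/or views 630 as described above, may comprise adding new views to views 630, removing views from views 630, modifying some of the views of views 630, and so forth. For example, observing a dataset and/or an annotation with some distribution of elements may cause the algorithm to create a view containing a sample of the elements with a different distribution…”, paragraph 104.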
“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [Examiner interprets as number of data instances available … is larger than the threshold quantity], ..”, paragraph 180),
 and when the number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances 
(Guttmann discloses “…Additional training examples may be selected based on the intermediate results. In some cases, synthetic examples may be generated based on the intermediate results. The machine learning algorithms may be further trained using the selected additional training examples …”, paragraph 14.
“[0104] In some examples, updating a view, for example by an algorithm processing data from datasets 610 and/or annotations 620 and/or views 630 as described above, may comprise adding new views to views 630, removing views from views 630, modifying some of the views of views 630, and so forth. For example, observing a dataset and/or an annotation with some distribution of elements may cause the algorithm to create a view containing a sample of the elements with a different distribution…”, paragraph 104.
“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [Examiner interprets as number of data instances available … is smaller than the threshold quantity], ..”, paragraph 180)
and then resamples one or more of the data instances until the threshold quantity is reached
( “…based on the intermediate results, and provide the request for new training examples …. For example, the intermediate results may comprise an intermediate inference model, a measurement of the quality of the intermediate inference model may be obtained as described above, and … based on the range of values that the measurement of the quality is in…”, paragraph 216 and  Fig. 13.
“[0280] In some examples, the progress update may be related to an action involving minimizing and/or maximizing an objective function (for example, an objective function based on data from datasets 610 and/or annotations 620 and/or views 630). Further, a project schedule record and/or elements of a project schedule record that correspond to said objective function and/or the optimization method used and/or hyper-parameters of the optimization method used may be selected by Step 1820 and/or updated by Step 1830. For example, an element of a project schedule record may be selected of a plurality of alternative elements of the project schedule record corresponding to different objective functions and/or different optimization methods and/or different hyper-parameters based on the identity of the objective function and/or the optimization method used and/or hyper-parameters related to the action, and the selected element may be updated according to the type of the action, properties of the action, the result of the action, and so forth. For example, the progress update may comprise intermediate results and/or intermediate status of the optimization (such as objective value, iteration number,…”, paragraph 280).
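For illustration only, the sampling behavior recited in the limitation above (under-sample when more instances are available than the threshold quantity, otherwise use all instances and resample until the threshold quantity is reached) can be sketched as follows. This sketch is not taken from Guttmann, Wang, or the application: the function name select_instances and its parameters are hypothetical, and treating the "equal to the threshold" case as using every instance once is an assumption, since the claim language addresses only the "larger than" and "smaller than" cases.

```python
import random


def select_instances(instances, threshold, rng=None):
    """Return exactly `threshold` instances by under- or over-sampling.

    Hypothetical helper for illustration; not an implementation from any
    cited reference.
    """
    if rng is None:
        rng = random.Random()
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    if threshold > 0 and not instances:
        raise ValueError("cannot over-sample an empty collection")

    if len(instances) >= threshold:
        # Under-sample: draw `threshold` distinct instances without replacement.
        # (Assumption: the equal case simply uses every instance once.)
        return rng.sample(list(instances), threshold)

    # Over-sample: take all instances first, then resample with replacement
    # until the threshold quantity is reached.
    selected = list(instances)
    while len(selected) < threshold:
        selected.append(rng.choice(instances))
    return selected
```

Under these assumptions, a device holding 10 instances with a threshold quantity of 4 would train on 4 distinct instances, while a device holding 3 instances with a threshold quantity of 7 would train on all 3 plus 4 resampled duplicates.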

As to claims 3 and  4,  Guttmann discloses
wherein the device processor is further configured to receive in response to the transmission of the parameter vector to the parameter server, an updated threshold quantity from the parameter server, wherein the device processor is further configured to retrain using the updated threshold quantity of the data instances of the plurality of data instances
(“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, for example comprising applying information to the inference model, and the progress update may comprise indications of the status of the action….”, paragraph 270 and Fig. 13. See also elements 1320 and 1330).
 wherein the device processor is configured to over sample or under sample the plurality of data instances so that when a number of data instances available to the device processor is larger than the updated threshold quantity, the device processor samples the updated threshold quantity of data instances
(see Fig. 13 and associated disclosure.
Further, “For example, algorithm 640 may comprise an artificial neural network, and the structure and/or other characteristics of the artificial neural network may be selected according to hyper-parameters. For example, algorithm 640 may comprise a clustering and/or a segmentation algorithm, and the number of desired clusters and/or segments may be selected according to a hyper-parameter [Examiner interprets as number of data instances]. For example, algorithm 640 may comprise a factorization algorithm, and the number of desired factors may be determined according to a hyper-parameter [Examiner interprets as number of data instances]…”, paragraph 98.
“[0104] In some examples, updating a view, for example by an algorithm processing data from datasets 610 and/or annotations 620 and/or views 630 as described above, may comprise adding new views to views 630, removing views from views 630, modifying some of the views of views 630, and so forth. For example, observing a dataset and/or an annotation with some distribution of elements may cause the algorithm to create a view containing a sample of the elements with a different distribution…”, paragraph 104.
“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [Examiner interprets as number of data instances available … is larger than the updated threshold quantity], ..”, paragraph 180), and 
 when the number of data instances available to the device processor is smaller than the updated threshold quantity, the device processor samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the updated threshold quantity is reached.
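(Guttmann discloses “…Additional training examples may be selected based on the intermediate results. In some cases, synthetic examples may be generated based on the intermediate results. The machine learning algorithms may be further trained using the selected additional training examples …”, paragraph 14.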
“[0104] In some examples, updating a view, for example by an algorithm processing data from datasets 610 and/or annotations 620 and/or views 630 as described above, may comprise adding new views to views 630, removing views from views 630, modifying some of the views of views 630, and so forth. For example, observing a dataset and/or an annotation with some distribution of elements may cause the algorithm to create a view containing a sample of the elements with a different distribution…”, paragraph 104.
“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [Examiner interprets as number of data instances available … is smaller than the updated threshold quantity], ..”, paragraph 180).
As to claims 5 and 6, Guttmann discloses
wherein the updated threshold quantity is calculated as a function of a number of updates transmitted by the device to the parameter server compared to a predetermined number of updates from all devices

(“…In other examples, an artificial neural network may comprise an output of a machine learning algorithm (and in some cases, deep learning algorithm) trained using training examples. In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size, ..”, paragraph 180.
“0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, for example comprising applying information to the inference model, and the progress update may comprise indications of the status of the action….”, paragraph 270 and Fig. 13. See also elements 1320 and 1330).

(“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [a first parameter], ..”, paragraph 180),
As to claims 7, 9, 10 and 17, Guttmann discloses
wherein the plurality of data instances is image data, and the model is trained to identify a position of the device
(“[0005] Image sensors are now part of numerous devices, from security systems to mobile phones, and the availability of images and videos produced by those devices is increasing”, paragraph 5. 
“[0042] The term “image sensor” is recognized by those skilled in the art and refers to any device configured to capture images, a sequence of images, videos, and so forth. This includes sensors that convert optical input into images, where optical input can be visible light (like in a camera), radio waves, microwaves, terahertz waves, ultraviolet light, infrared light, x-rays, gamma rays, and/or any other light spectrum. This also includes both 2D and 3D sensors. Examples of image sensor technologies may include: …”, paragraph 42.
“[0061] In some embodiments, the one or more positioning sensors 275 may be configured to obtain positioning information of apparatus 200, to detect changes in the position of apparatus 200, and/or to measure the position of apparatus 200. In some examples, positioning sensors 275 may be implemented using one of the following technologies: Global Positioning System (GPS)…”, paragraph 61).
wherein the plurality of data instances is image data and the model is an image recognition model
(“…[0221] In some examples, at least some of the labeled examples of the group of labeled examples and/or at least some of the unlabeled examples of the group of unlabeled examples may comprise image data (for example, images captured using image sensors 260). In some cases, the inference model generated by Step 1420 may comprise a detector configured to detect items in images (such as faces, people, objects, text, and so forth), and the labels assigned to the image by Step 1430 may comprise an indicator whether an item was detected in the image, a list of items detected in the image, locations of the items detected in the image, and so forth. In some cases, the inference model generated by Step 1420 may comprise a recognition model,…”, paragraph 221).


(“…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms…”, paragraph 129).
wherein the at least one sensor is coupled with a vehicle
(“[0061] In some embodiments, the one or more positioning sensors 275 may be configured to obtain positioning information of apparatus 200, to detect changes in the position of apparatus 200, and/or to measure the position of apparatus 200. In some examples, positioning sensors 275 may be implemented using one of the following technologies: Global Positioning System (GPS), GLObal NAvigation Satellite System (GLONASS), Galileo global navigation system, BeiDou navigation system, other Global Navigation Satellite Systems (GNSS), Indian Regional Navigation Satellite System (IRNSS), Local Positioning Systems (LPS), Real-Time Location Systems (RTLS), Indoor Positioning System (IPS), Wi-Fi based positioning systems, cellular triangulation, and so forth. In some examples, information captured using positioning sensors 275 may be stored in memory units 210, may be processed by processing units 220, may be transmitted and/or received using communication modules 230, and so forth”, paragraphs 61 and 84).
As to claims 11, 12 and 13, Guttmann discloses
wherein the model comprises a generative adversarial network, wherein the device processor is configured to train the model using an adversarial training process 

wherein the plurality of data instances is labeled, and the model is trained using a supervised training process
(Guttmann discloses supervised training in which each input in the training data is correlated to a desired output, which is a supervised training process: “…In some examples, …. a function that takes as inputs an example and at least part of the information associated with the external devices, and outputs … for the input example. Such function may comprise an inference model, an artificial neural network, an algorithm, and so forth. ..”, see at least paragraph 125 and Fig. 7).
wherein the updated central parameter is transmitted to the device prior to the updated central parameter being altered again
(see at least Fig. 7 elements 740 and 760 and associated disclosure).
As to claims 16 and 20, Guttmann discloses

(see at least Fig. 7 element 710 and associated disclosure).
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over US PG. Pub. No. 20180336481 (Guttmann) in view of US PG. Pub. No. 20190197404 (WANG) and US PG. Pub. No. 20210064616 (HU).

As to claim 8, Guttmann discloses projects that provide an indication or recommendation based on certain conditions or predictions (see at least Fig. 18 elements 1810, 1840 and 1850 and associated disclosure).
Guttmann does not expressly disclose, but HU discloses
wherein the plurality of data instances is search text data, and the model is trained to recommend a point of interest based on the search text data
(“…determine one or more addresses corresponding to the geographic coordinates of the historical service requester; determine a plurality of candidate points of interest (POIs) around the historical service requester based on the one or more addresses; and generate a plurality of feature matrices, each of the plurality of feature matrices associated with each of the plurality of candidate POIs, based on area information related to the geographic coordinates of a historical service requester, the feature matrix indicating one or more spatial features of each of the plurality of candidate POIs…”, paragraph 15.
“[0027] According to another aspect of the present disclosure, a non-transitory computer readable medium is provided. The non-transitory computer readable medium may … recommendation model based on the geographic coordinates of the service requestor to generate a plurality of POIs associated with the geographic coordinates. The at least one processor may select at least one POI from the plurality of POIs as at least one target POI. The at least one processor may transmit the at least one target POI to the user device of the service requester, wherein the POI recommendation model is pre-trained at least based on one or more spatial features of a plurality of sample POIs related to a plurality of historical service requests”, paragraph 27).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate HU’s teaching with the teaching of Guttmann. One would have been motivated to recommend a POI based on historical services, such as search results, in order to support a POI recommendation model (HU abstract and paragraph 27).


Response to Arguments
Applicant’s arguments of 2/18/2022 have been very carefully considered but are not persuasive.
Applicant argues (remarks 7-11)
Claim 1, for example, recites "a device processor configured to train the model using
a threshold quantity of the data instances of the plurality of data instances; the device
processor configured to over sample or under sample the plurality of data instances to equal
the threshold quantity." Neither Guttmann or Wang teach these limitations as neither
Guttmann or Wang teach a system or methods that dictate over or under-sampling data in
view of a threshold as recited in Claim 1.
In response, the Examiner notes that the Office did not previously have before it the claim language that Applicant is currently arguing. Action on these newly amended limitations is contained herein.

….Guttmann does not disclose distributed learning but rather a centralized model. The Applicants thus respectfully disagree with the Office Action's assertion that Guttmann teaches a parameter server, the parameter vector, and the distributed learning process as claimed. Instead, Guttmann teaches a centralized system where the model is trained even though the data is collected by different devices….

In response, the Examiner asserts that Guttmann is a strong reference. Guttmann, in Fig. 12, teaches that individual devices perform the training of each model: “…selecting a device (Step 1250); and causing the … device to perform the training task (Step 1260)…”, paragraph 196; see at least Fig. 12, element 1260.

Wang, as discussed below, may disclose a distributed learning system and as such, the
Applicants are not presently arguing that the combination lacks such disclosure. However,
the fact that Guttmann does not disclose or suggest distributed learning is indicative of its
failure to disclose over or under-sampling data as recited in Claim 1. As described above,

parameters of which are aggregated by the parameter server….Guttmann does not have this problem as it teaches centralized learning and thus has no need to over or under sample data at the workers as claimed.

In response, the Examiner again asserts that Guttmann is a strong reference. The combination of Guttmann and Wang establishes a solid prima facie case of obviousness. This combination teaches all the limitations in the claims.
The Examiner respectfully notes that Applicant has not provided persuasive rebuttal evidence to overcome the prima facie case. Further, the elements of the instant application were old and well known at the time of the invention. The combination set forth in the rejection produces predictable results. The claims are broad, and the search conducted shows a lack of novelty in the claimed invention; therefore, there is loss of right to a patent. The prior art reads on the broad claims.

The Office Action cites Guttmann for teaching this limitation. See Office Action pg.
16: "(see Fig. 2B and associated disclosure, see also ... said represented as physical
quantities, for example such as electronic quantities, and/or said t' 1 representing the physical
objects .... ", paragraphs 38 and 85. Step 750 may compare the updated information
associated with the external devices obtained by Step 740 with the original information
associated with the external devices obtained by Step 710 to determine if the magnitude of
the update is above a selected threshold", paragraph 132 and Fig 7), the device processor
configured to over sample or under sample the plurality of data instances to equal the
threshold quantity( ... a group of labeled and a group of unlabeled may be obtained, .. ",
abstract)." The Office Action further cites Guttmann for similar limitations of dependent
claim 15 which previously recited similar limitations. See Office Action pp. 21-22. The
Applicant respectfully disagrees with these assertions. The citations provided by the Office Action are generally related to a process 700 for selective use of examples by the centralized…


B. Independent Claim 14
Claim 14 recites similar limitations as claim 1 and is allowable for similar reasons.
For example, claim 14 recites "selecting, by the worker device, a first set of data instances
from the plurality of data instances as a function of a threshold quantity received from a
parameter server" and "wherein selecting comprises over sampling or under sampling the
plurality of data instances so that when a number of data instances available to the worker
device is larger than the threshold quantity, the worker device samples the threshold quantity
of data instances and when the number of data instances available to the worker device is
smaller than the threshold quantity, the worker device samples all of the data instances of ….
Independent Claim 18
Claim 18 recites similar limitations as those of independent claims 1 and 12 and is
allowable for similar reasons. Claim 20 depends from allowable claim 18 and is allowable
for at least this reason. The Applicants respectfully requests the rejection be withdrawn
In response, the Examiner asserts that the claims are not allowable because the elements of the instant claims were old and well known at the time of the invention. The combination set forth in the rejection produces results that are predictable.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
“Deep Learning for IoT Big Data and Streaming Analytics: A Survey”. IEEE. 2018. This paper teaches that devices collect and/or generate various sensory data over time for a wide range of fields and applications. Based on the nature of the …

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARIA VICTORIA VANDERHORST whose telephone number is (571)270-3604.  The examiner can normally be reached during business hours, Monday through Friday, from 8:30 AM to 4:30 PM.
Abdi Kambiz can be reached on 571-272-6702.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/MARIA V VANDERHORST/
Primary Examiner, Art Unit 3688
3/16/2022