Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

This communication is in response to the Request for Continued Examination filed on 6/21/2022 for application No. 16/386,700. Claims 1-14, 16-18 and 20 are currently pending and have been examined. Claims 1-14, 16-18 and 20 have been rejected.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 6/21/2022  has been entered. 

Examiner’s Note

Regarding 101 compliance, Claims 1-14, 16-18 and 20 are compliant with 35 U.S.C. 101 in accordance with the most recent guidance, the “2019 Revised Patent Subject Matter Eligibility Guidance” (2019 PEG), published in the Federal Register, Vol. 84, No. 4, Monday, January 7, 2019. Examiner’s analysis is presented in the Office action dated 3/21/2022.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-7, 9-14, 16-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US PG. Pub. No. 20180336481 (Guttmann) in view of US PG. Pub. No. 20200051193 (MIAO) and further in view of US PG. Pub. No. 20190197404 (WANG).

As to claims 14 and 18, Guttmann discloses a method for training a model using a plurality of distributed worker devices, the method comprising: 
a)  identifying, by a worker device (see for example Figs 1A and 1B and associated disclosure),
a plurality of data instances  (see at least element 710 in Fig. 7;  “[0056] In some embodiments, the one or more communication modules 230 may be configured to receive and transmit information. For example, control signals may be transmitted and/or received through communication modules 230. In another example, information received though communication modules 230 may be stored in memory units 210. In an additional example, information retrieved from memory units 210 may be transmitted using communication modules 230. In another example, input data may be transmitted and/or received using communication modules 230. Examples of such input data may include: input data inputted by a user using user input devices; information captured using one or more sensors; and so forth. Examples of such sensors may include: audio sensors 250; image sensors 260; motion sensors 270; positioning sensors 275; chemical sensors; temperature sensors; barometers; pressure sensors; proximity sensors; electrical impedance sensors; electrical voltage sensors; electrical current sensors; and so forth”, paragraph 56); 
b) selecting, by the worker device, a first set of data instances from the plurality of data instances as a function of a threshold quantity received from a parameter server
(“[0038] Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing”, “calculating”, “computing”, “determining”, “generating”, “setting”, “configuring”, “selecting”, “defining”, “applying”, “obtaining”, “monitoring”, “providing”, “identifying”, “segmenting”, “classifying”, “analyzing”, “associating”, “extracting”, “storing”, “receiving”, “transmitting”, or the like, include action and/or processes of a computer that manipulate and/or transform data into other data, said data represented as physical quantities, for example such as electronic quantities, and/or said data representing the physical objects….”, paragraph 38.
“[0085] In some embodiments, dataset 610 may comprise data and information arranged in data-points. For example, a data-point may correspond to an individual, to an object, to a geographical location, to a geographical region, to a species, and so forth. For example, dataset 610 may comprise a table, and each row or slice may represent a data-point. For example, dataset 610 may comprise several tables, and each data-point may correspond to entries in one or more tables. For example, a data-point may comprise a text document, a portion of a text document, a corpus of text documents, and so forth….”, paragraph 85. 
“…Step 750 may compare the updated information associated with the external devices obtained by Step 740 with the original information associated with the external devices obtained by Step 710 to determine if the magnitude of the update is above a selected threshold…”, paragraph 132 and Fig. 7.
“….the number of samples in the set may be selected according to the available memory size [under the broadest reasonable interpretation, this is a threshold quantity]. In some examples, training examples may be sampled (for example, according to the available processing resources information, to available memory size, etc. [under the broadest reasonable interpretation, this is a threshold quantity].), …”, paragraph 152);

wherein selecting comprises over sampling the plurality of data instances when a number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the threshold quantity is reached
(Guttmann teaches a system that obtains sample data, “[0128] Additionally or alternatively to Step 720, process 700 may generate synthetic examples using the information associated with external devices (for example, the information obtained by Step 710). … In another example, using the information associated with external devices, some examples may be selected as described above, and additional synthetic examples may be generated, for example using the Synthetic Minority Over-sampling Technique (SMOTE) [Examiner interprets as when a number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the threshold quantity is reached].”, paragraph 128.
“[0142] In some examples, the action may comprise creating an inference model and/or updating an inference model by applying at least part of the changed data to a machine learning algorithm, for example using process 1200, using Step 1330 with the changed data as the additional training examples, and so forth. In some examples, the action may comprise updating datasets 610 and/or annotations 620 and/or views 630, for example using the Synthetic Minority Over-sampling Technique (SMOTE) to create new data-points in a dataset [Examiner interprets as when a number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the threshold quantity is reached], using process 1400 to create new additional labels in an annotation, and so forth….”, paragraph 142.
“…In some examples, the training examples may be selected from a plurality of …training examples (for example from datasets 610 and/or annotations 620 and/or views 630) ….. For example, the training examples may be selected according to their size and according to rules chosen …. in response to the available processing resources information …. Some examples of such rules may include the selection of training examples with size that is below a selected threshold above a selected threshold, and so forth….”, paragraph 157.
See also “[0206] FIG. 13 illustrates an example of a process 1300 for enriching datasets while learning. In this example, process 1300 may comprise: obtaining intermediate results of training machine learning algorithms (Step 1310); obtaining additional training examples … (Step 1320); and training the machine learning algorithms using the obtained additional training example …”, paragraph 206).
Guttmann does not expressly disclose
or under sampling the plurality of data instances when a number of data instances available to the worker device is larger than the threshold quantity, the worker device samples the threshold quantity of data instances;
and then resamples one or more of the data instances until the threshold quantity is reached

However, MIAO discloses a “…sample balancing technique may include under-sampling the plurality of … samples”, paragraph 7.
“[0121] When the sample balancing sub-unit 414-5 determine that the sample ratio exceeds the ratio threshold, the sample balancing sub-unit 414-5 may determine that the training data includes an imbalanced sample composition, then the sample balancing sub-unit 414-5 may balance the sample composition based on the training data using a sample balancing technique in 730…. in some embodiments, the sample balancing technique may include re-sampling the training data, for example, over-sampling minority samples and/or under-sampling majority samples.”, paragraph 121 and “[0122] In some embodiments, the sample balancing sub-unit 414-5 may under-sample the … samples based on ….. For example, when the count of the …samples is larger than a predetermined number…”, paragraph 122.

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate MIAO’s teaching with the teaching of Guttmann. One would have been motivated to provide functionality for under-sampling when the number of available instances is larger than the threshold, in order to provide a “sample balancing technique” for the plurality of samples (see at least claim 4 of MIAO).
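For illustration only, the over sampling and under sampling limitation recited above can be sketched as follows. This is a minimal sketch of the claim language as written and is not taken from Guttmann or MIAO; the function name `select_instances` and its parameters are hypothetical, and uniform random resampling is an assumption (the claim does not specify how instances are resampled).

```python
import random


def select_instances(instances, threshold, rng=None):
    """Return exactly `threshold` instances drawn from the list `instances`.

    Hypothetical helper: uniform random sampling is an assumption.
    """
    if rng is None:
        rng = random.Random()
    if not instances:
        raise ValueError("no data instances available")
    if len(instances) > threshold:
        # Under sampling: more instances than the threshold quantity,
        # so draw `threshold` distinct instances.
        return rng.sample(instances, threshold)
    # Over sampling: take every available instance once, then resample
    # (with replacement) until the threshold quantity is reached.
    selected = list(instances)
    while len(selected) < threshold:
        selected.append(rng.choice(instances))
    return selected
```

When the number of available instances equals the threshold quantity, the sketch simply returns all of them.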
c) training, by the worker device, the model using the first set of data instances and a set of first parameters
(“…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms, deep learning algorithms for artificial neural networks, AdaBoost, linear regression, and so forth. For example, process 1200 may be used to select hyper-parameters for the machine learning algorithm and/or to cause a selected device to train the machine learning algorithm…”, paragraph 129 and Fig. 12.
“…selecting a device (Step 1250); and causing the … device to perform the training task (Step 1260)…”, paragraph 196, see at least Fig. 12 element 1260.
“[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set [Examiner interprets as first set of data instances and a set of first parameters], properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another…”, paragraph 198); 

d) transmitting, by the worker device, 
(“[0081] In some embodiments, the one or more external communication modules 450 may be configured to receive and/or to transmit information. For example, control signals may be sent and/or received through external communication modules 450….”, paragraph 81. See also Fig. 3 at least element 230 and Fig. 4B elements 440 and 450),
a set of second parameters of the trained model to the parameter server
(“[0098] In some embodiments, algorithm 640 may comprise one or more decision rules. For example, a decision rule may compare a computed value to a threshold, and in some cases the threshold may be set based on a parameter and/or a hyper-parameter. In some embodiments, algorithm 640 may be preprogrammed manually. For example, a manually preprogrammed algorithm may implement a heuristic algorithm that has zero or more parameters and/or hyper-parameters. In some embodiments, algorithm 640 may comprise a machine learning algorithm configured to train on training examples,…”, paragraph 98.
“…In some examples, the plurality of examples and/or the corresponding assigned weights may be used as… a validation set  [Examiner interprets as second parameters]…”, paragraph 129.
 “…[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set, properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another….”, paragraph 198.
“…Similarly, some examples of properties of the validation set may include samples of the validation examples of the validation set, the entire validation set, the number of validation examples in the validation set, information about the size of the validation examples, information about the structure of the validation examples, information about the distribution of the validation examples, and so forth…”, paragraph 199); 

e) receiving, by the worker device, a set of third parameters from the parameter server and an updated threshold quantity, wherein the set of third parameters is calculated at least partially as a function of the set of second parameters
(“…Further, Step 1030 may be repeated for different groups of data items associated with different groups of devices, for example comparing the results of applying a first group of data items associated with a first group of devices to a first inference model with the results of applying the first group of data items to a second inference model, comparing the results of applying a second group of data items associated with a second group of devices to a first inference model with the results of applying the second group of data items to a second inference model, and so forth. In some examples, comparing the results may comprise comparing loss function values associated with the results, comparing values of a function that summarizes the results, comparing the distributions of the results, comparing the distributions of errors, comparing the distributions of the results where the distributions are with respect to an input space, comparing the distributions of errors where the distributions are with respect to an input space, and so forth….”, paragraph 168 and Fig. 10.
“…In yet another example, using process 1700 it may be determined that process 1200 has insufficient quota  [Examiner interprets as threshold quantity] to use some devices, and as a result different devices may be selected. In another example, a cost function may be used to assign cost for each one of the plurality of devices according to their corresponding estimated processing resources requirements, and the device corresponding to the lowest cost (possibly out of the devices that satisfy the constraints as described above) may be selected. An example of such a cost function may include c1*t+c2*s, where c1 and c2 are positive constants which may represent cost per processing time and cost per memory size respectively, t may represent the estimated processing time, and s may represent the estimated memory size. In some examples, the estimated processing resources requirements may comprise an estimated range of processing resources requirements together with a distribution that assigns probabilities to the estimations. In such cases, the constraints may specify a required certainty that some other conditions hold [Examiner interprets as threshold quantity]. Further, the cost function may comprise a function that sums values over the different estimations according to the probabilities [Examiner interprets as updated threshold quantity]…”, paragraph 203);
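The cost function quoted from paragraph 203 of Guttmann, c1*t+c2*s, can be illustrated with a short sketch. The `Device` class, the `select_device` function, and the `max_memory` constraint are hypothetical names introduced here for illustration; only the form of the cost function and the selection of the lowest-cost device among those satisfying the constraints come from the quoted passage.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    est_time: float    # t: estimated processing time
    est_memory: float  # s: estimated memory size


def cost(device, c1, c2):
    """Cost c1*t + c2*s, with c1 and c2 positive constants."""
    return c1 * device.est_time + c2 * device.est_memory


def select_device(devices, c1, c2, max_memory=None):
    """Pick the lowest-cost device among those satisfying the constraint.

    `max_memory` is a hypothetical stand-in for the constraints
    mentioned in the quoted passage.
    """
    eligible = [
        d for d in devices
        if max_memory is None or d.est_memory <= max_memory
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda d: cost(d, c1, c2))
```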

f) selecting, by the worker device, a second set of data instances from the plurality of data instances as a function of the updated threshold quantity received from a parameter server
(“[0208] In some examples, the intermediate results may comprise values of parameters of the machine learning algorithm. In some examples, the intermediate results may comprise values measured using at least part of the training examples and/or using at least part of the validation examples and/or using at least part of the test examples, such as a value of a loss function, a value of a cost function, a value of an objective function, precision, recall, accuracy, specificity, F1 score, confusion matrices, number and/or ratio of true positives, number and/or ratio of false positives, number and/or ratio of false negative, number and/or ratio of true negatives, and so forth. For example, the machine learning algorithm may minimize an objective function and/or maximize an objective function, and the intermediate results may comprise an intermediate value of the objective function in the minimization and/or maximization process …”, paragraph 208.
“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, ….In some examples, the progress update may be related to an action comprising minimizing and/or maximizing an objective function (for example, an objective function based on data from datasets 610 and/or annotations 620 and/or views 630), and the progress update may comprise indications of the status of the minimization and/or maximization…”, paragraph 270); and
g) training, by the worker device, the model using the second set of data instances and the set of third parameters
(“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, for example comprising applying information to the inference model, and the progress update may comprise indications of the status of the action. For example, the information to be applied to the inference model may comprise a plurality of data-points, and the status may comprise the number and/or ratio of data-points already applied to the inference model, the number and/or ratio of data-points waiting to be applied to the inference model, the outputs (and/or statistics about the outputs) of the inference model for the data-points already applied, and so forth. In some examples, the progress update may be related to an action comprising minimizing and/or maximizing an objective function (for example, an objective function based on data from datasets 610 and/or annotations 620 and/or views 630), and the progress update may comprise indications of the status of the minimization and/or maximization. For example, the progress update may comprise intermediate results and/or intermediate status of minimization and/or maximization, such as objective value, iteration number, gradient at the intermediate result, last step size, rate of convergence, and so forth”, paragraph 270 and Fig. 13).
Although Guttmann teaches “…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms, deep learning algorithms for artificial neural networks, AdaBoost, linear regression, and so forth. For example, process 1200 may be used to select hyper-parameters for the machine learning algorithm and/or to cause a selected device to train the machine learning algorithm…”, paragraph 129 and Fig. 12, and
“[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set [Examiner interprets as first set of data instances and a set of first parameters], properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another…”, paragraph 198, Guttmann does not expressly teach the terms “first set”, “a set of second” and “a set of third”.

However, WANG discloses
training of a machine learning model,  “…Various implementations relate to asynchronous training of a machine learning model. A server receives feedback data generated by training the machine learning model from a worker. The feedback data are obtained by the worker with its own training data and are associated with previous values of a set of parameters of the machine learning model at the worker. The server determines differences between the previous values and current values of the set of parameters at the server [Examiner interprets first set, a set of second and a set of third]. The current value may have been updated for once or more due to operation of other workers. Then, the server can update the current values of the set of parameters based on the feedback data and the differences between values of the set of parameters.”, abstract.
“[0002] It is known that a set of training data may be distributed across multiple workers which optimize the model parameters with their respective training data and return the result to a central server. However, the key problem of distributed or asynchronous model training is mismatch between workers. For instance, if a worker returns its updated parameters, the model parameters at the server may have been updated for one or more times by other workers [Examiner interprets as first set, a set of second and a set of third]…”, see at least paragraphs 2 and 27.

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate WANG’s teaching with the teaching of Guttmann. One would have been motivated to provide functionality for training a machine learning model, determining a first set, a set of second and a set of third of data and parameters, in order to update parameters in a feedback environment. As WANG states, such training of a model at a server is well known (see WANG, at least paragraph 2 and abstract).
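For illustration only, the exchange recited in steps b) through g) of claim 14 can be sketched as a single worker round against a toy parameter server. This sketch is not drawn from Guttmann, MIAO or WANG. Every name in it (`ParameterServer`, `select`, `train`, `worker_round`) is hypothetical, and the element-wise averaging rule, the fixed threshold increment, and the one-dimensional linear model are assumptions chosen only to keep the example small.

```python
import random


class ParameterServer:
    """Toy parameter server; averaging and threshold policy are assumptions."""

    def __init__(self, params, threshold, step=1):
        self.params = list(params)
        self.threshold = threshold
        self.step = step

    def update(self, worker_params):
        # Third parameters are calculated at least partially as a function
        # of the worker's second parameters (here: element-wise average).
        self.params = [(c + w) / 2 for c, w in zip(self.params, worker_params)]
        self.threshold += self.step  # hypothetical updated threshold quantity
        return list(self.params), self.threshold


def select(instances, threshold, rng):
    """Under sample or over sample `instances` to exactly `threshold` items."""
    if len(instances) >= threshold:
        return rng.sample(instances, threshold)
    extra = [rng.choice(instances) for _ in range(threshold - len(instances))]
    return list(instances) + extra


def train(params, batch, lr=0.01):
    """One gradient step for y = w*x + b under mean squared error."""
    w, b = params
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return [w - lr * grad_w, b - lr * grad_b]


def worker_round(server, instances, first_params, rng):
    first_set = select(instances, server.threshold, rng)   # step b)
    second_params = train(first_params, first_set)         # step c)
    third_params, updated = server.update(second_params)   # steps d) and e)
    second_set = select(instances, updated, rng)           # step f)
    return train(third_params, second_set)                 # step g)
```

A real deployment would have many workers contacting the server independently; the sketch shows one worker only.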

As to claim 1, it comprises the same limitations as claim 14 above and is therefore rejected in a similar manner, and further comprises
a device for training a model, the device comprising at least one sensor (paragraph 56) configured to acquire a plurality of data instances (see at least Fig. 2B elements 250, 260, 265, 270 and 275 and associated disclosure); 
a communication interface configured to communicate with a parameter server (see at least Fig. 2B element 230 and associated disclosure); and

a device processor configured to train the model using a threshold quantity of the data instances of the plurality of data instances  
(see Fig. 2B and associated disclosure. see also “…said data represented as physical quantities, for example such as electronic quantities, and/or said data representing the physical objects….”, paragraphs 38 and 85.
“…Step 750 may compare the updated information associated with the external devices obtained by Step 740 with the original information associated with the external devices obtained by Step 710 to determine if the magnitude of the update is above a selected threshold…”, paragraph 132 and Fig. 7);

the device processor further configured to transmit a parameter vector of the trained model to the parameter server and receive in response, an updated central parameter vector from the parameter server derived from the model
(see Figs. 1A, 1B, 2A, 3 and 7 and associated disclosure); the device processor further configured to retrain the model using the updated central parameter vector (see at least paragraph 129 and Fig. 12 and Figs. 7 and 8 and associated disclosure); wherein the at least one sensor acquires different data instances than other sensors of the other devices that are training respective models (Fig. 2B);
wherein at least one transmission between the device and the parameter server occurs with respect to the other devices that are training respective models (Figs. 1A and 1B, Fig. 2A communication module 230 and Fig. 3 communication module 230);
Although Guttmann teaches “…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms, deep learning algorithms for artificial neural networks, AdaBoost, linear regression, and so forth. For example, process 1200 may be used to select hyper-parameters for the machine learning algorithm and/or to cause a selected device to train the machine learning algorithm…”, paragraph 129 and Fig. 12, and
“[0198] Some examples of properties of a machine learning training task may include a type of a machine learning algorithm, hyper-parameters of the machine learning algorithm, properties of the training set [Examiner interprets as first set of data instances and a set of first parameters], properties of the validation set, properties of the test set, and so forth. The hyper-parameters of the machine learning algorithm may differ from one machine learning algorithm to another…”, paragraph 198, Guttmann does not expressly teach the word “asynchronously”.

However, WANG discloses
“…Various implementations relate to asynchronous training of a machine learning model. A server receives feedback data generated by training the machine learning model from a worker…”, abstract and Figs. 1-2.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate WANG’s teaching with the teaching of Guttmann. One would have been motivated to provide functionality for training a machine learning model, determining a first set, a set of second and a set of third of data and parameters, in order to update parameters in an asynchronous feedback environment. As WANG states, such training of a model at a server is well known (see WANG, at least paragraph 2 and abstract).

As to claim 2, Guttmann discloses 
when the number of data instances available to the device processor is smaller than the threshold quantity, the device processor samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the threshold quantity is reached
(Guttmann teaches a system that obtains sample data, “[0128] Additionally or alternatively to Step 720, process 700 may generate synthetic examples using the information associated with external devices (for example, the information obtained by Step 710). … In another example, using the information associated with external devices, some examples may be selected as described above, and additional synthetic examples may be generated, for example using the Synthetic Minority Over-sampling Technique (SMOTE) [Examiner interprets as when a number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the threshold quantity is reached].”, paragraph 128.
“[0142] In some examples, the action may comprise creating an inference model and/or updating an inference model by applying at least part of the changed data to a machine learning algorithm, for example using process 1200, using Step 1330 with the changed data as the additional training examples, and so forth. In some examples, the action may comprise updating datasets 610 and/or annotations 620 and/or views 630, for example using the Synthetic Minority Over-sampling Technique (SMOTE) to create new data-points in a dataset [Examiner interprets as when a number of data instances available to the worker device is smaller than the threshold quantity, the worker device samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the threshold quantity is reached], using process 1400 to create new additional labels in an annotation, and so forth….”, paragraph 142.
“…In some examples, the training examples may be selected from a plurality of …training examples (for example from datasets 610 and/or annotations 620 and/or views 630) ….. For example, the training examples may be selected according to their size and according to rules chosen …. in response to the available processing resources information …. Some examples of such rules may include the selection of training examples with size that is below a selected threshold above a selected threshold, and so forth….”, paragraph 157.
See also “[0206] FIG. 13 illustrates an example of a process 1300 for enriching datasets while learning. In this example, process 1300 may comprise: obtaining intermediate results of training machine learning algorithms (Step 1310); obtaining additional training examples … (Step 1320); and training the machine learning algorithms using the obtained additional training example …”, paragraph 206).
Guttmann does not expressly disclose
wherein the device processor is configured to … under sample the plurality of data instances so that when a number of data instances available to the device processor is larger than the threshold quantity, the device processor samples the threshold quantity of data instances
and then resamples one or more of the data instances until the threshold quantity is reached

However, MIAO discloses a “…sample balancing technique may include under-sampling the plurality of … samples”, paragraph 7.
“[0121] When the sample balancing sub-unit 414-5 determine that the sample ratio exceeds the ratio threshold, the sample balancing sub-unit 414-5 may determine that the training data includes an imbalanced sample composition, then the sample balancing sub-unit 414-5 may balance the sample composition based on the training data using a sample balancing technique in 730…. in some embodiments, the sample balancing technique may include re-sampling the training data, for example, over-sampling minority samples and/or under-sampling majority samples.”, paragraph 121 and “[0122] In some embodiments, the sample balancing sub-unit 414-5 may under-sample the … samples based on ….. For example, when the count of the …samples is larger than a predetermined number…”, paragraph 122.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate MIAO’s teaching with the teaching of Guttmann. One would have been motivated to provide under-sampling when the number of instances available is larger than the threshold quantity, in order to provide a “sample balancing technique” for the plurality of samples (see at least claim 4 of MIAO).
As to claims 3 and 4, Guttmann discloses
wherein the device processor is further configured to receive in response to the transmission of the parameter vector to the parameter server, an updated threshold quantity from the parameter server, wherein the device processor is further configured to retrain using the updated threshold quantity of the data instances of the plurality of data instances
(“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, for example comprising applying information to the inference model, and the progress update may comprise indications of the status of the action….”, paragraph 270 and Fig. 13. See also elements 1320 and 1330).
 wherein the device processor is configured to over sample or under sample the plurality of data instances so that when a number of data instances 
(see Fig. 13 and associated disclosure).
Further, “For example, algorithm 640 may comprise an artificial neural network, and the structure and/or other characteristics of the artificial neural network may be selected according to hyper-parameters. For example, algorithm 640 may comprise a clustering and/or a segmentation algorithm, and the number of desired clusters and/or segments may be selected according to a hyper-parameter [Examiner interprets as number of data instances]. For example, algorithm 640 may comprise a factorization algorithm, and the number of desired factors may be determined according to a hyper-parameter [Examiner interprets as number of data instances]…”, paragraph 98.
available to the device processor is larger than the updated threshold quantity, the device processor samples the updated threshold quantity of data instances 
(“[0104] In some examples, updating a view, for example by an algorithm processing data from datasets 610 and/or annotations 620 and/or views 630 as described above, may comprise adding new views to views 630, removing views from views 630, modifying some of the views of views 630, and so forth. For example, observing a dataset and/or an annotation with some distribution of elements may cause the algorithm to create a view containing a sample of the elements with a different distribution…”, paragraph 104.
“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [Examiner interprets as number of data instances available … is larger than the updated threshold quantity], ..”, paragraph 180), and 
 when the number of data instances available to the device processor is smaller than the updated threshold quantity, the device processor samples all of the data instances of the plurality of data instances and then resamples one or more of the data instances until the updated threshold quantity is reached.
(Guttmann discloses “…Additional training examples may be selected based on the intermediate results. In some cases, synthetic examples may be generated based on the intermediate results. The machine learning algorithms may be further trained using the selected additional training examples …”, paragraph 14.
“[0104] In some examples, updating a view, for example by an algorithm processing data from datasets 610 and/or annotations 620 and/or views 630 as described above, may comprise adding new views to views 630, removing views from views 630, modifying some of the views of views 630, and so forth. For example, observing a dataset and/or an annotation with some distribution of elements may cause the algorithm to create a view containing a sample of the elements with a different distribution…”, paragraph 104.
“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [Examiner interprets as number of data instances available … is smaller than the updated threshold quantity], ..”, paragraph 180).
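For illustration only, the over sampling and under sampling limitation recited above can be sketched as follows. This is a minimal Python sketch under stated assumptions and is not code from Guttmann, MIAO, or the application: the function name `sample_to_threshold`, its parameters, and the use of uniform random selection are hypothetical choices made for the example.

```python
import random


def sample_to_threshold(instances, threshold, seed=None):
    """Return exactly `threshold` data instances (hypothetical helper).

    Assumption: selection is uniform at random; the claim language does not
    state how instances are chosen.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    rng = random.Random(seed)
    instances = list(instances)

    # Under sampling: more instances are available than the threshold
    # quantity, so only the threshold quantity is sampled (no repeats).
    if len(instances) > threshold:
        return rng.sample(instances, threshold)

    if not instances and threshold > 0:
        raise ValueError("cannot over sample an empty collection")

    # Over sampling: take all available instances, then resample
    # (with repetition) until the threshold quantity is reached.
    extra = [rng.choice(instances) for _ in range(threshold - len(instances))]
    return instances + extra
```

In this sketch, a device holding 3 instances with a threshold quantity of 7 would return all 3 instances plus 4 resampled ones, while a device holding 10 instances with a threshold quantity of 4 would return 4 distinct instances.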
As to claims 5 and 6, Guttmann discloses
wherein the updated threshold quantity is calculated as a function of a number of updates transmitted by the device to the parameter server compared to a predetermined number of updates from all devices
(“…number of samples in the set may be selected according to the available memory size. ..”, paragraph 152.
“…In other examples, an artificial neural network may comprise an output of a machine learning algorithm (and in some cases, deep learning algorithm) trained using training examples. In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size, ..”, paragraph 180.
“[0270] In some examples, the progress update may be related to an action comprising training of a machine learning algorithm (for example with selected hyper-parameters), and the progress update may comprise indications of the status of the training. For example, the progress update may comprise intermediate results and/or intermediate status of the training task, for example as obtained by Step 1310. In some examples, the progress update may be related to an action comprising usage of an inference model, for example comprising applying information to the inference model, and the progress update may comprise indications of the status of the action….”, paragraph 270 and Fig. 13. See also elements 1320 and 1330).
wherein the updated threshold quantity is calculated as a function of a first parameter and the threshold quantity
(“..In such case, some of the parameters of the artificial neural network may be set manually and are called hyper-parameters, while the other parameters are set by the machine learning algorithm according to the training examples. In some examples, parameters and/or hyper-parameters of the artificial neural network may be obtained by Step 1110. In some examples, the machine learning algorithm used to train the artificial neural network may also have some hyper-parameters, such as …batch size [a first parameter], ..”, paragraph 180).
As to claims 7, 9, 10 and 17, Guttmann discloses
wherein the plurality of data instances is image data, and the model is trained to identify a position of the device
(“[0005] Image sensors are now part of numerous devices, from security systems to mobile phones, and the availability of images and videos produced by those devices is increasing”, paragraph 5. 
“[0042] The term “image sensor” is recognized by those skilled in the art and refers to any device configured to capture images, a sequence of images, videos, and so forth. This includes sensors that convert optical input into images, where optical input can be visible light (like in a camera), radio waves, microwaves, terahertz waves, ultraviolet light, infrared light, x-rays, gamma rays, and/or any other light spectrum. This also includes both 2D and 3D sensors. Examples of image sensor technologies may include: CCD, CMOS, NMOS, and so forth. 3D sensors may be implemented using different technologies, including: stereo camera, active stereo camera, time of flight camera, structured light camera, radar, range image camera, and so forth”, paragraph 42.
“[0061] In some embodiments, the one or more positioning sensors 275 may be configured to obtain positioning information of apparatus 200, to detect changes in the position of apparatus 200, and/or to measure the position of apparatus 200. In some examples, positioning sensors 275 may be implemented using one of the following technologies: Global Positioning System (GPS)…”, paragraph 61).
wherein the plurality of data instances is image data and the model is an image recognition model
(“…[0221] In some examples, at least some of the labeled examples of the group of labeled examples and/or at least some of the unlabeled examples of the group of unlabeled examples may comprise image data (for example, images captured using image sensors 260). In some cases, the inference model generated by Step 1420 may comprise a detector configured to detect items in images (such as faces, people, objects, text, and so forth), and the labels assigned to the image by Step 1430 may comprise an indicator whether an item was detected in the image, a list of items detected in the image, locations of the items detected in the image, and so forth. In some cases, the inference model generated by Step 1420 may comprise a recognition model,…”, paragraph 221).
wherein training the model includes a gradient descent-based process 
(“…Some examples of machine learning algorithms that may be used may include support vector machine, gradient descent based algorithms…”, paragraph 129).
wherein the at least one sensor is coupled with a vehicle
(“[0061] In some embodiments, the one or more positioning sensors 275 may be configured to obtain positioning information of apparatus 200, to detect changes in the position of apparatus 200, and/or to measure the position of apparatus 200. In some examples, positioning sensors 275 may be implemented using one of the following technologies: Global Positioning System (GPS), GLObal NAvigation Satellite System (GLONASS), Galileo global navigation system, BeiDou navigation system, other Global Navigation Satellite Systems (GNSS), Indian Regional Navigation Satellite System (IRNSS), Local Positioning Systems (LPS), Real-Time Location Systems (RTLS), Indoor Positioning System (IPS), Wi-Fi based positioning systems, cellular triangulation, and so forth. In some examples, information captured using positioning sensors 275 may be stored in memory units 210, may be processed by processing units 220, may be transmitted and/or received using communication modules 230, and so forth”, paragraphs 61 and 84).
As to claims 11, 12 and 13, Guttmann discloses
wherein the model comprises a generative adversarial network, wherein the device processor is configured to train the model using an adversarial training process 
(“[0012] In some embodiments, descriptors of artificial neural networks may be generated and/or used. For example, an artificial neural network may be obtained, the artificial neural network may be obtained, descriptors of the segments may be calculated, and a descriptor of the artificial neural network may be compiled. In some examples, a match score for a pair of artificial neural networks may be calculated (for example using the descriptors compiled for the two artificial neural networks), and actions may be selected based on the matching score”, paragraph 12. See also Fig. 11 and associated disclosure).
wherein the plurality of data instances is labeled, and the model is trained using a supervised training process
(Guttmann discloses training in which each input in the training data is correlated to a desired output; this is a supervised training process. “…In some examples, …. a function that takes as inputs an example and at least part of the information associated with the external devices, and outputs … for the input example. Such function may comprise an inference model, an artificial neural network, an algorithm, and so forth. ..”, see at least paragraph 125 and Fig. 7).
wherein the updated central parameter is transmitted to the device prior to the updated central parameter being altered again
(see at least Fig. 7 elements 740 and 760 and associated disclosure).
As to claims 16 and 20, Guttmann discloses
wherein the plurality of data instances is accessible only on the worker device
(see at least Fig. 7 element 710 and associated disclosure).
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over US PG. Pub. No. 20180336481 (Guttmann) in view of US PG. Pub. No. 20200051193 (MIAO), in view of US PG. Pub. No. 20190197404 (WANG) and US PG. Pub. No. 20210064616 (HU).

As to claim 8, Guttmann discloses projects that provide an indication or recommendation based on certain conditions or predictions (see at least Fig. 18 elements 1810, 1840 and 1850 and associated disclosure).
Guttmann does not expressly disclose, but HU discloses,
wherein the plurality of data instances is search text data, and the model is trained to recommend a point of interest based on the search text data
(“…determine one or more addresses corresponding to the geographic coordinates of the historical service requester; determine a plurality of candidate points of interest (POIs) around the historical service requester based on the one or more addresses; and generate a plurality of feature matrices, each of the plurality of feature matrices associated with each of the plurality of candidate POIs, based on area information related to the geographic coordinates of a historical service requester, the feature matrix indicating one or more spatial features of each of the plurality of candidate POIs…”, paragraph 15.
“[0027] According to another aspect of the present disclosure, a non-transitory computer readable medium is provided. The non-transitory computer readable medium may include at least one set of instructions for recommending POIs to a user using intelligent data mining. When at least one processor of an electronic terminal executes the at least one set of instructions, the at least one processor may be directed to perform one or more of the following operations. The at least one processor may obtain geographic coordinates of a service requestor from a user device. The at least one processor may execute a pre-trained POI recommendation model based on the geographic coordinates of the service requestor to generate a plurality of POIs associated with the geographic coordinates. The at least one processor may select at least one POI from the plurality of POIs as at least one target POI. The at least one processor may transmit the at least one target POI to the user device of the service requester, wherein the POI recommendation model is pre-trained at least based on one or more spatial features of a plurality of sample POIs related to a plurality of historical service requests”, paragraph 27).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate HU’s teaching with the teaching of Guttmann. One would have been motivated to recommend a POI based on historical services, such as search results, in order to support a POI recommendation model (HU, abstract and paragraph 27).
Response to Arguments
Applicant’s arguments of 6/21/2022 have been very carefully considered but are not persuasive.
Applicant argues (remarks 8-13)
II. CLAIM REJECTIONS - 35 U.S.C. § 103
Claims 1-7, 9-14, 16-18, and 20 were rejected under 35 U.S.C. § 103 as being
unpatentable Guttmann in view of Wang. Applicant respectfully disagrees with the
aforementioned rejection, although Applicant has amended the claims to clarify and further
distinguish the claims from the cited art. Claim 1 is directed towards a device for…
“…The claimed subject-matter provides a system and method for optimizing cooperation
between devices which are communally solving a problem, where each device acquires a
local set of training data and without sharing data sets across the devices. Each device
possesses its own local and possibly temporally limited data that prevents each device from
learning a model that is sufficiently general. To preserve privacy, or because of bandwidth
limitations, the devices do not share their data with any central or peer entities. The devices
update each other by communicating the parameters extracted from the local data.
(Applicant's specification, paragraph [0018]).
The technical challenge with this type of setup is that the devices that collect the data
may have a diverse range of computational power or large variance in the number of data
points for each device. If a communication to the server happens asynchronously (e.g.,
without imposing an order or fixed request and response cycle on the communication
loops), some of the devices may communicate with the server rapidly and dominate the
aggregation of parameters extensively. If the devices send their updated parameters as soon
as the devices process one or more sets of the local data…”
In response, the Examiner asserts that, again, all the arguments have been very carefully considered but are not persuasive. There is nothing novel at all in the claims at the time of the invention.
Next, regarding the arguments noted above, it seems that Applicant wants the Examiner to read limitations into the broad claims. MPEP § 2111 provides that claims must be given their broadest reasonable interpretation.  Further, it is generally considered improper to read limitations contained in the specification into the claims.  See In re Prater, 415 F.2d 1393, 162 USPQ 541 (CCPA 1969) and In re Winkhaus, 527 F.2d 637, 188 USPQ 129 (CCPA 1975), which discuss the premise that one cannot rely on the specification to impart limitations to the claim that are not recited in the claim.
The Examiner advises that, to move this case forward in prosecution, the claim language needs to be narrower and to recite novel features or functionality. The claim recitations need to point out a fine line of distinction over the prior art. This case has an effective filing date of 2019-4-17, and the field of search is crowded.

None of the cited paragraphs of Guttmann discusses the feature of "over sampling the
plurality of data instances when the number of data instances available to the device is
smaller than the threshold quantity or under sampling the plurality of data instances when a
number of data instances available to the device is larger than the threshold quantity," as
recited in claim 1.
Furthermore, Applicant submits that the rejection alleged by the Office Action is
conclusory and unsupported….
Wang does not overcome the shortcomings of Guttmann. Wang is directed towards
asynchronous training of a machine learning model. A server receives feedback data
generated by training the machine learning model from a worker. …
In response, the Examiner asserts that the claims are not allowable because the elements of the instant claims were old and well known at the time of the invention. The combination set forth in the rejection produces results that are predictable. Finally, the arguments regarding the previous rejections are moot in light of the above new grounds of rejection.
Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
“Deep Learning for IoT Big Data and Streaming Analytics: A Survey”. IEEE. 2018. This paper teaches that devices collect and/or generate various sensory data over time for a wide range of fields and applications. Based on the nature of the application, these devices will result in big or fast/real-time data streams. Applying analytics over such data streams to discover new information, predict future insights, and make control decisions is a crucial process that makes IoT a worthy paradigm for businesses and a quality-of-life improving technology. This paper provides a thorough overview on using a class of advanced machine learning techniques, namely deep learning (DL), to facilitate the analytics and learning in the IoT domain.

“Parameter Communication Consistency Model for Large-Scale Security Monitoring Based on Mobile Computing”. IEEE. 2013. This article teaches that with the application of mobile computing in the security field, security monitoring big data has also begun to emerge, providing favorable support for smart city construction and city-scale and investment expansion. Mobile computing takes full advantage of the computing power and communication capabilities of various sensing devices and uses these devices to form a computing cluster. When using such clusters for training of distributed machine learning models, the load imbalance and network transmission delay result in low efficiency of model training. Therefore, this paper proposes a distributed machine learning parameter communication consistency model based on the parameter server idea, which is called the limited synchronous parallel model. The model is based on the fault-tolerant characteristics of the machine learning algorithm, and it dynamically limits the size of the synchronization barrier of the parameter server, reduces the synchronization communication overhead, and ensures the accuracy of the model training; thus, the model realizes finite asynchronous calculation between the worker nodes and gives full play to the overall performance of the cluster. The implementation of cluster dynamic load balancing experiments shows that the model can fully utilize the cluster performance during the training of distributed machine learning models to ensure the accuracy of the model and improve the training speed.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARIA VICTORIA VANDERHORST whose telephone number is (571)270-3604.  The examiner can normally be reached during business hours, Monday through Friday, from 8:30 AM to 4:30 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdi Kambiz, can be reached at 571-272-6702.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARIA V VANDERHORST/
Primary Examiner, Art Unit 3688
8/13/2022