Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 5-6, 8-10, 13-16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bhowmick et al (U.S. Patent Application Publication 2020/0104705 A1) in view of Szeto et al (U.S. Patent Application Publication 2018/0018590 A1).

	Regarding claim 1, Bhowmick discloses a processor-implemented method, comprising: 
training a local model (FIG. 1; paragraph [0033], client device 110a includes local machine learning module 136a) using a local data set collected by a terminal device to generate a trained local model (Paragraph [0034], local machine learning module 136a can be trained on text data stored on client device 110a, while machine learning module 137a can be trained on image data stored on client device 111a; paragraph [0051], FIG. 4B is a flow diagram of a method 410 to generate a privatized proposed label on a client device; paragraph [0053], operation 412 to generate a training set based on the selected set of client data. Generating the training set can include associating client data with labels associated with that client data (as operation 411); paragraph [0054], operation 413 to train a machine learning model on the mobile electronic device using the training set); 
receiving, from a server (Paragraph [0030], the server 130 can also store a set of unlabeled data 131), an independent identically distributed (i.i.d.) global data set (Paragraph [0035], the server can provide a set of unlabeled data (e.g., a set of unlabeled data 121, a set of unlabeled data 122, a set of unlabeled data 123) to each client device within each device group. The sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning modules 136a-136n, 137a-137n, 138a-138n on each client device. In one embodiment, the set of unlabeled data transmitted to devices in a device group includes the same unit or units of unlabeled data, with each device group receiving a different unit of unlabeled data; paragraph [0054], operation 414 to receive a set of unlabeled data from a server), the i.i.d. global data set (Paragraph [0055], the unlabeled set of data is of the same type as the selected set of user data; paragraph [0052], operation 411 to select a set of client data on a mobile electronic device. A variety of different types of client data can be used to generate the training data set. For example, images on the device can be used to train an image classifier or text data can be used to train a word or character prediction model. In one embodiment, word sequences typed by a user can be used to train a predictive text model, which can be used to suggest words within a keyboard application. The specific type of data that is selected can be determined or limited based on privacy settings configured for the mobile electronic device. Thus, the unlabeled set of data is sampled for each class in a plurality of predefined classes corresponding to the privacy settings configured for the mobile electronic device); 
implementing the trained local model by inputting the i.i.d. global data set (Paragraph [0035], the sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning module 136a on each client device; paragraph [0054], the machine learning models on the mobile electronic devices will become individualized to each device … The mobile electronic device can perform operation 415 to generate a proposed label for one or more elements in the set of unlabeled data) and transmitting final inference results of the implemented trained local model to the server (Paragraph [0056], operation 416, in which the mobile electronic device can transmit a privatized version of one or more proposed labels to the server. In one embodiment, operation 416 includes to transmit one or more tuples to the server, where each tuple includes a privatized proposed label and at least an identifier for an element in the set of unlabeled data);
receiving, from the server, a global model (Paragraph [0040], FIG. 3A is a block diagram of a system 300 for generating privatizing proposed labels for server provided unlabeled data, according to an embodiment. The system 300 includes a client device, which can be any of client devices 110a-110n, 111a-111n, 112a-112n or client devices 210; paragraph [0041], the server 130 can include a receive module 351 and a frequency estimation module 341 to determine label frequency estimations 331 … The labeling and training module 330 can use the determined labels to train an existing server-side machine learning module 135 into an improved server-side machine learning module 346. In one embodiment, the client device 310 and the server 130 can engage in an iterative process to enhance the accuracy of a machine learning model implemented by the machine learning module. In one embodiment the improved machine learning module 346 can be deployed to the client device 310 via a deployment module 352 if the machine learning module 361 on the client device 310 is compatible with the improved machine learning module 346. Alternatively, a version of the machine learning models used by the client device 310 can be enhanced or updated on the server 130 and deployed to the client device 310 via the deployment module 352).
It is noted that Bhowmick does not use the term “global” to describe the unlabeled set of data or the model. However, Bhowmick describes that the server connects with a set of client devices; distributes the unlabeled set of data to each client device, where the sets of unlabeled data can each include one or more units of unlabeled data for which the client devices can generate proposed labels based on the individualized machine learning modules on each client device (Paragraph [0035]); improves the machine learning module based on the proposed labels from the client devices; and deploys the improved machine learning module to the client devices (Paragraph [0041]). Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to understand that “the sets of unlabeled data” taught by Bhowmick are an independent identically distributed global data set associated with the individualized machine learning modules on each client device, and that the “machine learning module” implemented as a model on the server taught by Bhowmick is a global model for each client device.
In addition, the Examiner cites the prior art reference Szeto, which discloses (Abstract, researchers can request that relevant private data servers train implementations of machine learning algorithms on their local private data without requiring de-identification of the private data or without exposing the private data to unauthorized computing systems. The private data servers also generate synthetic or proxy data according to the data distributions of the actual data. The servers then use the proxy data to train proxy models. When the proxy models are sufficiently similar to the trained actual models, the proxy data, proxy model parameters, or other learned knowledge can be transmitted to one or more non-private computing devices; paragraph [0042], FIG. 1 shows an example distributed machine learning system 100; the system includes non-private computing device 130 and one or more entities 120A through 120N --- each of private data server 120; paragraph [0098], FIG. 5 presents a computer-implemented method 500 of distributed, online machine learning. Method 500 
relates to building an aggregated trained global model from many private data sets. The trained global model can then be sent back to each entity for use in prediction efforts; paragraphs [0099]-[0100], the modeling engine creating the trained actual model according to the model instructions and as a function of at least some of the local private data by training the implementation of the machine learning algorithm on the local private data; paragraphs [0102]-[0104], generating a set of proxy data according to one or more of the private data distributions … the modeling engine creating a trained proxy model from the proxy data by training the same type or implementation of machine learning algorithm on the proxy data … the modeling engine calculates a model similarity score as a function of the proxy model parameters and actual model parameters … if the similarity score fails to satisfy similarity criteria (e.g., falls below a threshold, etc.), then the modeling engine can repeat operations 540 through 560) receiving, from the server, a global model updated (Paragraph [0107], the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine) based on the final inference results of the inference (Paragraph [0105], under the condition that the similarity score satisfies similarity criteria, the modeling engine can proceed to operation 570. 
Operation 570 includes transmitting the set of proxy data, possibly along with other information, over the network to at least one non-private computing device; paragraph [0106], the global modeling engine (FIG. 1; 136) trains a global model on the aggregated sets of proxy data). 
Bhowmick and Szeto are analogous art because both pertain to systems/methods for training and updating a machine learning model on a server based on inference results received from the local computing device. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for improving the accuracy of a machine learning model based on received privatized crowdsourced labels from multiple client devices taught by Bhowmick to incorporate the teachings of Szeto, applying the machine learning system taught by Szeto to train and update a global model on the server based on the learned knowledge from many client devices and then transmit the updated global model back to the client device, in order to conduct the client device proposed label generation studies in support of client device decision making workflows. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Szeto to obtain the invention as specified in the claim.
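
For illustration only, the client-side sequence mapped above for claim 1 (train a local model on local data, receive unlabeled data from the server, run inference, and return one label per element) can be sketched as follows. This sketch is not taken from Bhowmick or Szeto: the nearest-centroid classifier, the function names `train_local_model` and `infer_hard_labels`, and the sample values are assumptions chosen for brevity, and the privatization applied in operation 416 of Bhowmick is omitted.

```python
from collections import defaultdict


def train_local_model(local_data):
    """Train a toy nearest-centroid classifier on (feature, label) pairs.

    Hypothetical stand-in for the on-device training step (operation 413).
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in local_data:
        sums[y] += x
        counts[y] += 1
    # One centroid (mean feature value) per class label.
    return {label: sums[label] / counts[label] for label in sums}


def infer_hard_labels(model, unlabeled):
    """Return (element_id, label) tuples, one hard label per unlabeled element."""
    results = []
    for element_id, x in unlabeled:
        # Pick the class whose centroid is closest to the input.
        label = min(model, key=lambda c: abs(model[c] - x))
        results.append((element_id, label))
    return results


# Local data set collected on the device (assumed values).
local_data = [(0.1, "cat"), (0.3, "cat"), (0.9, "dog"), (1.1, "dog")]
# Unlabeled data set received from the server: (element_id, feature).
server_unlabeled = [(0, 0.2), (1, 1.0)]

local_model = train_local_model(local_data)
to_server = infer_hard_labels(local_model, server_unlabeled)
print(to_server)
```

Only the (identifier, label) tuples leave the device in this sketch; the local data set and the local model parameters stay on the client.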

	Regarding claim 2, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 1), and Bhowmick further discloses wherein the trained local model (FIG. 1; paragraph [0034], local machine 
learning module 136a can be trained on text data stored on client device 110a, while machine learning module 137a can be trained on image data stored on client device 111a; paragraph [0051], FIG. 4B is a flow diagram of a method 410 to generate a privatized proposed label on a client device) comprises a neural network trained (Paragraphs [0100]-[0101], FIG. 10 illustrates compute architecture 1000 on a client device that can be used to enable on-device supervised training and inferencing using machine learning algorithm …the various frameworks and hardware resources of the compute architecture 1000 can be used for inferencing operations via a machine learning model, as well as training operations for a machine learning model. For example, a client device can use the compute architecture 1000 to perform supervised learning via a machine learning model as described herein, such as but not limited to a CNN, RNN, or LSTM model. The client device can then use the trained machine learning model to infer proposed labels for a unit of unlabeled data provided by a server) to predict a class of the plurality of predefined classes corresponding to input data (Paragraph [0035], the sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning module 136a on each client device; paragraph [0054], the machine learning models on the mobile electronic devices will become individualized to each device … The mobile electronic device can perform operation 415 to generate a proposed label for one or more elements in the set of unlabeled data), and 
the final inference results of the inference correspond to a hard label that indicates a class predicted for the i.i.d. global data set (Paragraph [0064], as shown in FIG. 5A, in one embodiment a proposed label encoding 500 is created on a client device in which a proposed label value 502 is encoded into a proposed label vector 503. The proposed label vector 503 is a one-hot encoding in which a bit is set that corresponds with a value associated with a proposed label generated by a client device. In the illustrated proposed label encoding 500, the universe of labels 501 is the set of possible labels that can be proposed for an unlabeled unit of data provided to a client device by the server).  
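
As a minimal sketch of the hard-label encoding described in the cited paragraph [0064], the following shows a predicted class being converted into a one-hot proposed label vector over a universe of labels. The label names, the score values, and the helper name `one_hot` are assumptions for illustration and do not appear in the reference.

```python
def one_hot(label, universe):
    """Encode a label as a one-hot vector over the universe of labels."""
    vec = [0] * len(universe)
    vec[universe.index(label)] = 1
    return vec


# Universe of possible labels (assumed for illustration).
universe_of_labels = ["cat", "dog", "bird"]

# Per-class scores produced by a local model for one unlabeled element (assumed).
scores = {"cat": 0.2, "dog": 0.7, "bird": 0.1}

# The hard label is the single highest-scoring class, not the score vector.
hard_label = max(scores, key=scores.get)
proposed_label_vector = one_hot(hard_label, universe_of_labels)
print(hard_label, proposed_label_vector)
```

The vector carries only the identity of the predicted class; the underlying scores are discarded, which is what distinguishes a hard label from a soft label.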

	Regarding claim 5, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 2), and Bhowmick further discloses wherein,
receipt of the updated global model is performed (Paragraph [0040], FIG. 3A is a block diagram of a system 300 for generating privatizing proposed labels for server provided unlabeled data, according to an embodiment. The system 300 
includes a client device, which can be any of client devices 110a-110n, 111a-111n, 112a-112n or client devices 210; paragraph [0041], the server 130 can include a receive module 351 and a frequency estimation module 341 to determine label frequency estimations 331 … The labeling and training module 330 can use the determined labels to train an existing server-side machine
learning module 135 into an improved server-side machine learning module 346.  In one embodiment, the client device 310 and the server 130 can engage in an iterative process to enhance the accuracy of a machine learning model implemented by the machine learning module. In one embodiment the improved machine learning module 346 can be deployed to the client device 310 via a deployment module 352 if the machine learning module 361 on the client device 310 is compatible with the improved machine learning module 346. 
Alternatively, a version of the machine learning models used by the client device 310 can be enhanced or updated on the server 130 and deployed to the client device 310 via the deployment module 352) even when the terminal device transmits only the hard label to the server (FIG. 4B; paragraph [0056], operation 416, in which the mobile electronic device can transmit a privatized version of one or more proposed labels to the server. In one embodiment, operation 416 includes to transmit one or more tuples to the server, where each tuple includes a privatized proposed label and at least an identifier for an element in the set of unlabeled data) and even when the global model and the trained local model have different structures (FIG. 1; paragraph [0033], the local machine learning modules can include different types of learning models than the learning model used by the server. In one embodiment, the  local machine learning modules 136a-136n, 137a-137n, 138a-138n on each client device can include LSTM networks, while the machine learning module 135 on the server 130 may be a CNN).

	Regarding claim 6, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 1), and Bhowmick further discloses wherein the global model is updated in the server based on other final inference results received from other terminal devices in addition to the final inference results received from the terminal device (Paragraph [0047], FIG. 4A is a flow diagram of a method 400 to improve the accuracy of a machine learning model via crowdsourced labeling of unlabeled data, according to an embodiment. The method 400 can be implemented in a server device, such as server device 130; paragraph [0049], operation 402, in which the server receives a set of proposed labels from the set of multiple mobile electronic devices; paragraph [0050], operation 403, in which the server processes the set of proposed labels to determine an estimate of a most frequent proposed label for each element in the unlabeled set of data … perform operation 404 to add each element of unlabeled data and a corresponding most frequently proposed label for the element to a training data set ... operation 405, in which the server trains the machine learning model using the training data set to generate an improved machine learning model).
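
The server-side aggregation mapped above (operations 402 through 404) can be illustrated with a plain majority vote over the labels proposed by several devices. This is a simplification under stated assumptions: Bhowmick estimates label frequencies from privatized labels, whereas the sketch below counts unprivatized labels directly, and the function name `most_frequent_labels` and all sample data are hypothetical.

```python
from collections import Counter, defaultdict


def most_frequent_labels(proposals):
    """Map each element id to the label proposed most often across devices."""
    votes = defaultdict(Counter)
    for element_id, label in proposals:
        votes[element_id][label] += 1
    return {eid: counter.most_common(1)[0][0] for eid, counter in votes.items()}


# Unlabeled elements held by the server: element_id -> feature (assumed values).
unlabeled = {0: 0.2, 1: 1.0}

# (element_id, proposed_label) tuples received from three devices (assumed).
proposals = [
    (0, "cat"), (1, "dog"),   # device A
    (0, "cat"), (1, "dog"),   # device B
    (0, "dog"), (1, "cat"),   # device C
]

winners = most_frequent_labels(proposals)

# Pair each element with its winning label to form a server-side training set.
training_set = [(unlabeled[eid], label) for eid, label in sorted(winners.items())]
print(training_set)
```

The resulting training set would then be used to train the server-side model, corresponding to operation 405 in the cited passage.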

Regarding claim 8, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 1), and Bhowmick further discloses wherein the local data set comprises a data set sampled in a non-i.i.d. manner (FIG. 3B; paragraph [0042], the machine learning module 361 on the client device 310 can be trained via a training module 370 based on client data 332 within data storage 329 on the client device 310. The client data 332 can include various types of client data, such as text message data, image data, application activity data, device activity data, and/or a combination of application activity and device activity data).

Regarding claim 9, Bhowmick discloses a processor-implemented method, comprising: 
transmitting, to a plurality of terminal devices (FIG. 1; Paragraph [0029], a set of client devices 110a-110n, 111a-111n, 112a-112n), an independent identically distributed (i.i.d.) global data set (Paragraph [0030], the server 130 can also store a set of unlabeled data 131; paragraph [0035], the server can provide a set of unlabeled data (e.g., a set of unlabeled data 121, a set of unlabeled data 122, a set of unlabeled data 123) to each client device within each device group. The sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning modules 136a-136n, 137a-137n, 138a-138n on each client device. In one embodiment, the set of unlabeled data transmitted to devices in a device group includes the same unit or units of unlabeled data, with each device group receiving a different unit of unlabeled data; paragraph [0054], operation 414 to receive a set of unlabeled data from a server) for each class in a plurality of predefined classes (Paragraph [0055], the unlabeled set of data is of the same type as the selected set of user data; paragraph [0052], operation 411 to select a set of client data on a mobile electronic device. A variety of different types of client data can be used to generate the training data set. For example, images on the device can be used to train an image classifier or text data can be used to train a word or character prediction model. In one embodiment, word sequences typed by a user can be used to train a predictive text model, which can be used to suggest words within a keyboard application. The specific type of data that is selected can be determined or limited based on privacy settings configured for the mobile electronic device. Thus, the unlabeled set of data is sampled for each class in a plurality of predefined classes corresponding to the privacy settings configured for the mobile electronic device); 
receiving, from each of the plurality of terminal devices (Paragraph [0036], FIG. 2 illustrates a system 200 for receiving privatized crowdsourced labels from multiple client devices, according to an embodiment … The client devices 210, using the techniques described above, can each generate privatized proposed labels 212a-212c (privatized proposed label 212a from client device 210a, privatized proposed label 212b from client device 210b, privatized proposed label 212c from client device 210c) which each can be transmitted to the server 130 via the network 120), final inference results of inference obtained by inputting the i.i.d. global data set (Paragraph [0035], the sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning module 136a on each client device; paragraph [0054], the machine learning models on the mobile electronic devices will become individualized to each device … The mobile electronic device can perform operation 415 to generate a proposed label for one or more elements in the set of unlabeled data) in each of the terminal devices based on a corresponding local data set (Paragraph [0034], local machine learning module 136a can be trained on text data stored on client device 110a, while machine learning module 137a can be trained on image data stored on client device 111a; paragraph [0051], FIG. 4B is a flow diagram of a method 410 to generate a privatized proposed label on a client device; paragraph [0053], operation 412 to generate a training set based on the selected set of client data. 
Generating the training set can include associating client data with labels associated with that client data (as operation 411); paragraph [0054],  operation 413 to train a machine learning model on the mobile electronic device using the training set; paragraph [0056], operation 416, in which the mobile electronic device can transmit a privatized version of one or more proposed labels to the server. In one embodiment, operation 416 includes to transmit one or more tuples to the server, where each tuple includes a privatized proposed label and at least an identifier for an element in the set of unlabeled data); and 
updating a global model (Paragraph [0030], the server 130 stores a machine learning module 135) based on the received final inference results of corresponding inferences from each of the plurality of terminal devices (Paragraph [0047], FIG. 4A is a flow diagram of a method 400 to improve the accuracy of a machine learning model via crowdsourced labeling of unlabeled data, according to an embodiment. The method 400 can be implemented in a server device, such as server device 130; paragraph [0049], operation 402, in which the server receives a set of proposed labels from the set of multiple mobile electronic devices; paragraph [0050], method 400 can then perform operation 404 to add each element of unlabeled data and a corresponding most frequently proposed label for the element to a training data set. The method 400 additionally includes operation 405, in which the server trains the machine learning model using the training data set to generate an improved machine learning model).
It is noted that Bhowmick does not use the term “global” to describe the unlabeled set of data or the model. However, Bhowmick describes that the server connects with a set of client devices; distributes the unlabeled set of data to each client device, where the sets of unlabeled data can each include one or more units of unlabeled data for which the client devices can generate proposed labels based on the individualized machine learning modules on each client device (Paragraph [0035]); improves the machine learning module based on the proposed labels from the client devices; and deploys the improved machine learning module to the client devices (Paragraph [0041]). Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to understand that “the sets of unlabeled data” taught by Bhowmick are an independent identically distributed global data set associated with the individualized machine learning modules on each client device, and that the “machine learning module” implemented as a model on the server taught by Bhowmick is a global model for each client device.
However, Bhowmick does not specifically disclose updating a global model stored in a server.
In a similar field of endeavor, Szeto discloses (Abstract, researchers can request that relevant private data servers train implementations of machine learning algorithms on their local private data without requiring de-identification of the private data or without exposing the private data to unauthorized computing systems. The private data servers also generate synthetic or proxy data according to the data distributions of the actual data. The servers then use the proxy data to train proxy models. When the proxy models are sufficiently similar to the trained actual models, the proxy data, proxy model parameters, or other learned knowledge can be transmitted to one or more non-private computing devices; paragraph [0042], FIG. 1 shows an example distributed machine learning system 100; the system includes non-private computing device 130 and one or more entities 120A through 120N --- each of private data server 120; paragraph [0098], FIG. 5 presents a computer-implemented method 500 of distributed, online machine learning. Method 500 relates to building an aggregated trained global model from many private data sets. 
The trained global model can then be sent back to each entity for use in prediction efforts; paragraphs [0099]-[0100], the modeling engine creating the trained actual model according to the model instructions and as a function of at least some of the local private data by training the implementation of the machine learning algorithm on the local private data; paragraphs [0102]-[0104], generating a set of proxy data according to one or more of the private data distributions … the modeling engine creating a trained proxy model from the proxy data by training the same type or implementation of machine learning algorithm on the proxy data … the modeling engine calculates a model similarity score as a function of the proxy model parameters and actual model parameters … if the similarity score fails to satisfy similarity criteria (e.g., falls below a threshold, etc.), then the modeling engine can repeat operations 540 through 560) updating global model stored in a server (Paragraph [0105], under the condition that the similarity score satisfies similarity criteria, the modeling engine can proceed to operation 570. Operation 570 includes transmitting the set of proxy data, possibly along with other information, over the network to at least one non-private computing device; paragraphs [0106], the global modeling engine (FIG. 1; 136) train a global model on the aggregated sets of proxy data; paragraph [0107], the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine).
Bhowmick and Szeto are analogous art because both pertain to systems/methods for training and updating a machine learning model on a server based on inference results received from the local computing device. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for improving the accuracy of a machine learning model based on received privatized crowdsourced labels from multiple client devices taught by Bhowmick to incorporate the teachings of Szeto, applying the machine learning system taught by Szeto to train and update a global model on the server based on the learned knowledge from many client devices and then transmit the updated global model back to the client device, in order to improve the accuracy of a global model implemented in the server. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Szeto to obtain the invention as specified in the claim.

	Regarding claim 10, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 9), and Bhowmick further discloses wherein the final inference results of inference correspond to a hard label that indicates a class predicted for the i.i.d. global data set by the trained local model (Paragraph [0064], as shown in FIG. 5A, in one embodiment a proposed label encoding 500 is created on a client device in which a proposed label value 502 is encoded into a proposed label vector 503. The proposed label vector 503 is a one-hot encoding in which a bit is set that corresponds with a value associated with a proposed label generated by a client device. In the illustrated proposed label encoding 500, the universe of labels 501 is the set of possible labels that can be proposed for an unlabeled unit of data provided to a client device by the server).

	Regarding claim 13, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 9), and Bhowmick discloses further comprising transmitting the updated global model to at least one of the terminal devices or another terminal device other than the plurality of terminal devices (Paragraph [0040], FIG. 3A is a block diagram of a system 300 for generating privatizing proposed labels for server provided unlabeled data, according to an embodiment. The system 300 includes a client device, which can be any of client devices 110a-110n, 111a-111n, 112a-112n or client devices 210; paragraph [0041], the server 130 can include a receive module 351 and a frequency estimation module 341 to determine label frequency estimations 331 … The labeling and training module 330 can use the determined labels to train an existing server-side machine learning module 135 into an improved server-side machine learning  module 346. In one embodiment, the client device 310 and the server 130 can engage in an iterative process to enhance the accuracy of a machine learning model implemented by the machine learning module. In one embodiment the improved machine learning module 346 can be deployed to the client device 310 via a deployment module 352 if the machine learning module 361 on the client device 310 is compatible with the improved machine learning module 346. Alternatively, a version of the machine learning models used by the client device 310 can be enhanced or updated on the server 130 and deployed to the client device 310 via the deployment module 352).

	Regarding claim 14, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 13).
However, Bhowmick does not specifically disclose wherein the updated global model is used as a pre-trained local model for the other terminal device.
In a similar field of endeavor, Szeto discloses (Paragraph [0098], FIG. 5 presents a computer-implemented method 500 of distributed, online machine learning. Method 500 relates to building an aggregated trained global model from many private data sets. The trained global model can then be sent back to each entity for use in prediction efforts) wherein the updated global model is used as a pre-trained local model for the other terminal device (Paragraph [0107], the global modeling engine also transmits the trained global model back to one or more of the private data servers. Thus, the updated global model is a pre-trained model and is used as a local model for the other terminal device).
Bhowmick and Szeto are analogous art because both pertain to a system/method for training and updating the machine learning model on the server based on the received inference results for the local computer device. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for improving the accuracy of a machine learning model based on received privatized crowdsourced labels from multiple client devices taught by Bhowmick to incorporate the teachings of Szeto, applying the machine learning system taught by Szeto to train and update a global model on the server based on the learned knowledge from many client devices and then transmit the updated global model back to the client device in order to conduct the client device proposed label generation studies in support of client device decision making workflows. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Szeto to obtain the invention as specified in the claim.

	Regarding claim 15, Bhowmick discloses a terminal device (FIG. 1; paragraph [0029], client device 110a), comprising: 
at least one memory (Paragraph [0100], FIG. 10 illustrates compute architecture 1000 on a client device that can be used to enable on-device supervised training and inferencing using machine learning algorithms; paragraphs [0102]-[0103], FIG. 11 shows a device architecture 1100 for a mobile or embedded device. The device includes memory 1150);
a local model (Paragraph [0033], client device 110a includes a local machine learning module 136a) trained using a local data set collected by the terminal device (Paragraph [0034], local machine learning module 136a can be trained on text data stored on client device 110a, while machine learning module 137a can be trained on image data stored on client device 111a; paragraph [0051], FIG. 4B is a flow diagram of a method 410 to generate a privatized proposed label on a client device; paragraph [0053], operation 412 to generate a training set based on the selected set of client data. Generating the training set can include associating client data with labels associated with that client data (as operation 411); paragraph [0054], operation 413 to train a machine learning model on the mobile electronic device using the training set); and
at least one processor (Paragraph [0100], compute architecture 1000 includes a client labeling framework 1002 that can be configured to leverage a processing system 1020 on a client device … The processing system includes an application processor 1022, a neural network processor 1023, and a graphics processor 1024, each of which can be used to accelerate operations of the core machine learning framework 1010 and the various higher-level frameworks that operate via primitives provided via the core machine learning framework; paragraph [0102], a processing system 1104 including one or more data processors) configured to receive an independent identically distributed (i.i.d.) global data set (Paragraph [0035], the server can provide a set of unlabeled data (e.g., a set of unlabeled data 121, a set of unlabeled data 122, a set of unlabeled data 123) to each client device within each device group. The sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning modules 136a-136n, 137a-137n, 138a-138n on each client device. In one embodiment, the set of unlabeled data transmitted to devices in a device group includes the same unit or units of unlabeled data, with each device group receiving a different unit of unlabeled data; paragraph [0054], operation 414 to receive a set of unlabeled data from a server), the i.i.d. global data set (Paragraph [0055], the unlabeled set of data is of the same type as the selected set of user data; paragraph [0052], operation 411 to select a set of client data on a mobile electronic device. A variety of different types of client data can be used to generate the training data set. For example, images on the device can be used to train an image classifier or text data can be used to train a word or character prediction model. In one embodiment, word sequences typed by a user can be used to train a predictive text model, which can be used to suggest words within a keyboard application. 
The specific type of data that is selected can be determined or limited based on privacy settings configured for the mobile electronic device. Thus, the unlabeled set of data sampled for each class in a plurality of predefined classes corresponding to the privacy settings configured for the mobile electronic device) from a server (Paragraph [0030], the server 130 can also store a set of unlabeled data 131), 
implementing the trained local model by inputting the i.i.d. global data set (Paragraph [0035], the sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning module 136a on each client device; paragraph [0054], the machine learning models on the mobile electronic devices will become individualized to each device … The mobile electronic device can perform operation 415 to generate a proposed label for one or more elements in the set of unlabeled data) and transmit final inference results of the implemented trained local model to the server (Paragraph [0056], operation 416, in which the mobile electronic device can transmit a privatized version of one or more proposed labels to the server. In one embodiment, operation 416 includes to transmit one or more tuples to the server, where each tuple includes a privatized proposed label and at least an identifier for an element in the set of unlabeled data); and 
receive a global model (Paragraph [0040], FIG. 3A is a block diagram of a system 300 for generating privatizing proposed labels for server provided unlabeled data, according to an embodiment. The system 300 includes a client device, which can be any of client devices 110a-110n, 111a-111n, 112a-112n or client devices 210; paragraph [0041], the server 130 can include a receive module 351 and a frequency estimation module 341 to determine label frequency estimations 331 … The labeling and training module 330 can use the determined labels to train an existing server-side machine learning module 135 into an improved server-side machine learning module 346. In one embodiment, the client device 310 and the server 130 can engage in an iterative process to enhance the accuracy of a machine learning model implemented by the machine learning module. In one embodiment the improved machine learning module 346 can be deployed to the client device 310 via a deployment module 352 if the machine learning module 361 on the client device 310 is compatible with the improved machine learning module 346. Alternatively, a version of the machine learning models used by the client device 310 can be enhanced or updated on the server 130 and deployed to the client device 310 via the deployment module 352).
It is noted that the invention of Bhowmick does not specifically disclose at least one memory configured to store a trained local model. However, Bhowmick discloses that the client device includes a memory and that the client device contains a corresponding local machine learning module. The local machine learning module can be trained on the data stored on the client device and then the trained local machine learning module will be used to generate privatized proposed labels based on the received set of unlabeled data. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method (as shown in FIG. 4B) taught by Bhowmick to store the local machine learning module trained using the local data.
Also, it is noted that the invention of Bhowmick does not use the term “global” to describe the unlabeled set of data and a model. However, the invention of Bhowmick describes that the server connects with a set of client devices; distributes the unlabeled set of data to each client device, where the sets of unlabeled data can each include one or more units of unlabeled data for which the client devices can generate proposed labels based on the individualized machine learning modules on each client device (Paragraph [0035]); improves the machine learning module based on the proposed labels from client devices; and deploys the improved machine learning module to the client devices (Paragraph [0041]). Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to understand that “the sets of unlabeled data” taught by Bhowmick are an independent identically distributed global data set associated with the individualized machine learning modules on each client device and that the “machine learning module” implemented as a model on the server taught by Bhowmick is a global model for each client device.
In addition, Examiner cites the prior art reference Szeto, which discloses (Abstract, researchers can request that relevant private data servers train implementations of machine learning algorithms on their local private data without requiring de-identification of the private data or without exposing the private data to unauthorized computing systems. The private data servers also generate synthetic or proxy data according to the data distributions of the actual data. The servers then use the proxy data to train proxy models. When the proxy models are sufficiently similar to the trained actual models, the proxy data, proxy model parameters, or other learned knowledge can be transmitted to one or more non-private computing devices; paragraph [0042], FIG. 1 shows an example distributed machine learning system 100; the system includes non-private computing device 130 and one or more entities 120A through 120N --- each of private data server 120; paragraph [0098], FIG. 5 presents a computer-implemented method 500 of distributed, online machine learning. Method 500 relates to building an aggregated trained global model from many private data sets. 
The trained global model can then be sent back to each entity for use in prediction efforts; paragraphs [0099]-[0100], the modeling engine creating the trained actual model according to the model instructions and as a function of at least some of the local private data by training the implementation of the machine learning algorithm on the local private data; paragraphs [0102]-[0104], generating a set of proxy data according to one or more of the private data distributions … the modeling engine creating a trained proxy model from the proxy data by training the same type or implementation of machine learning algorithm on the proxy data … the modeling engine calculates a model similarity score as a function of the proxy model parameters and actual model parameters … if the similarity score fails to satisfy similarity criteria (e.g., falls below a threshold, etc.), then the modeling engine can repeat operations 540 through 560) receive a global model updated (Paragraph [0107], the global modeling engine also transmits the trained global model back to one or more of the private data servers. The private data servers can then leverage the global trained model to conduct local prediction studies in support of local clinical decision making workflows. In addition, the private data servers can also use the global model as a foundation for continued online learning. Thus, the global model becomes a basis for continued machine learning as new private data becomes available. As new data becomes available, method 500 can be repeated to improve the global modeling engine) based on the final inference results of the inference from the server (Paragraph [0105], under the condition that the similarity score satisfies similarity criteria, the modeling engine can proceed to operation 570. 
Operation 570 includes transmitting the set of proxy data, possibly along with other information, over the network to at least one non-private computing device; paragraph [0106], the global modeling engine (FIG. 1; 136) trains a global model on the aggregated sets of proxy data).
Bhowmick and Szeto are analogous art because both pertain to a system/method for training and updating the machine learning model on the server based on the received inference results for the local computer device. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for improving the accuracy of a machine learning model based on received privatized crowdsourced labels from multiple client devices taught by Bhowmick to incorporate the teachings of Szeto, applying the machine learning system taught by Szeto to train and update a global model on the server based on the learned knowledge from many client devices and then transmit the updated global model back to the client device in order to conduct the client device proposed label generation studies in support of client device decision making workflows. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Szeto to obtain the invention as specified in the claim.

	Regarding claim 16, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 15), and Bhowmick further discloses wherein the trained local model (FIG. 1; paragraph [0034], local machine learning module 136a can be trained on text data stored on client device 110a, while machine learning module 137a can be trained on image data stored on client device 111a; paragraph [0051], FIG. 4B is a flow diagram of a method 410 to generate a privatized proposed label on a client device) comprises a neural network trained (Paragraphs [0100]-[0101], FIG. 10 illustrates compute architecture 1000 on a client device that can be used to enable on-device supervised training and inferencing using machine learning algorithm … the various frameworks and hardware resources of the compute architecture 1000 can be used for inferencing operations via a machine learning model, as well as training operations for a machine learning model. For example, a client device can use the compute architecture 1000 to perform supervised learning via a machine learning model as described herein, such as but not limited to a CNN, RNN, or LSTM model. The client device can then use the trained machine learning model to infer proposed labels for a unit of unlabeled data provided by a server) to predict a class of the plurality of predefined classes corresponding to input data (Paragraph [0035], the sets of unlabeled data can each include one or more units of unlabeled data 131[i] for which the client devices can generate proposed labels based on the individualized machine learning module 136a on each client device; paragraph [0054], the machine learning models on the mobile electronic devices will become individualized to each device … The mobile electronic device can perform operation 415 to generate a proposed label for one or more elements in the set of unlabeled data), and the final inference results of the inference correspond to a hard label that indicates a class predicted for the i.i.d. global data set (Paragraph [0064], as shown in FIG. 5A, in one embodiment a proposed label encoding 500 is created on a client device in which a proposed label value 502 is encoded into a proposed label vector 503. The proposed label vector 503 is a one-hot encoding in which a bit is set that corresponds with a value associated with a proposed label generated by a client device. In the illustrated proposed label encoding 500, the universe of labels 501 is the set of possible labels that can be proposed for an unlabeled unit of data provided to a client device by the server).

	Regarding claim 19, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 16), and Bhowmick further discloses wherein receipt of the global model is performed (Paragraph [0040], FIG. 3A is a block diagram of a system 300 for generating privatizing proposed labels for server provided unlabeled data, according to an embodiment. The system 300 includes a client device, which can be any of client devices 110a-110n, 111a-111n, 112a-112n or client devices 210; paragraph [0041], the server 130 can include a receive module 351 and a frequency estimation module 341 to determine label frequency estimations 331 … The labeling and training module 330 can use the determined labels to train an existing server-side machine learning module 135 into an improved server-side machine learning module 346. In one embodiment, the client device 310 and the server 130 can engage in an iterative process to enhance the accuracy of a machine learning model implemented by the machine learning module. In one embodiment the improved machine learning module 346 can be deployed to the client device 310 via a deployment module 352 if the machine learning module 361 on the client device 310 is compatible with the improved machine learning module 346. Alternatively, a version of the machine learning models used by the client device 310 can be enhanced or updated on the server 130 and deployed to the client device 310 via the deployment module 352) even when the terminal device transmits only the hard label to the server (FIG. 4B; paragraph [0056], operation 416, in which the mobile electronic device can transmit a privatized version of one or more proposed labels to the server. In one embodiment, operation 416 includes to transmit one or more tuples to the server, where each tuple includes a privatized proposed label and at least an identifier for an element in the set of unlabeled data) and even when the global model and the trained local model have different structures (FIG. 1; paragraph [0033], the local machine learning modules can include different types of learning models than the learning model used by the server. In one embodiment, the local machine learning modules 136a-136n, 137a-137n, 138a-138n on each client device can include LSTM networks, while the machine learning module 135 on the server 130 may be a CNN).

	Regarding claim 20, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 15), and Bhowmick further discloses wherein the global model is updated in the server based on other final inference results of inference received from other terminal devices in addition to the final inference results of the inference received from the terminal device (Paragraph [0047], FIG. 4A is a flow diagram of a method 400 to improve the accuracy of a machine learning model via crowdsourced labeling of unlabeled data, according to an embodiment. The method 400 can be implemented in a server device, such as server device 130; paragraph [0049], operation 402, in which the server receives a set of proposed labels from the set of multiple mobile electronic devices; paragraph [0050], operation 403, in which the server processes the set of proposed labels to determine an estimate of a most frequent proposed label for each element in the unlabeled set of data … perform operation 404 to add each element of unlabeled data and a corresponding most frequently proposed label for the element to a training data set ... operation 405, in which the server trains the machine learning model using the training data set to generate an improved machine learning model).
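
For illustration only, the server-side selection of a most frequent proposed label per element (operations 402-404 as cited above) can be sketched as follows. This is a simplified sketch under stated assumptions: it uses plain vote counting rather than the privatized frequency estimation described in the reference, and the function name `most_frequent_labels` and the example data are hypothetical.

```python
from collections import Counter


def most_frequent_labels(proposals):
    """Map each element identifier to its most frequently proposed label.

    `proposals` is an iterable of (element_id, proposed_label) tuples, as
    might be received from multiple client devices. Plain counting is an
    assumption made for this sketch; no privatization is modeled.
    """
    votes = {}
    for element_id, label in proposals:
        votes.setdefault(element_id, Counter())[label] += 1
    return {eid: counts.most_common(1)[0][0] for eid, counts in votes.items()}


# Hypothetical proposed labels from several devices.
proposals = [
    ("img1", "cat"),
    ("img1", "cat"),
    ("img1", "dog"),
    ("img2", "bird"),
]

# Each element paired with its most frequent label forms a training example.
training_set = most_frequent_labels(proposals)
```

The resulting mapping corresponds to the training data set to which each element and its most frequently proposed label are added before the server-side model is retrained.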

Claims 3 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Bhowmick et al (U.S. Patent Application Publication 2020/0104705 A1) in view of Szeto et al (U.S. Patent Application Publication 2018/0018590 A1) in view of Zylberberg et al (U.S. Patent Application Publication 2022/0001181 A1).

	Regarding claim 3, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 2). 
	However, Bhowmick does not specifically disclose wherein the hard label has a smaller data size than a soft label that comprises information about probability values indicating respective probabilities for the i.i.d. global data set being classified into each of the plurality of predefined classes.
	 In the similar field of endeavor, Zylberberg discloses (Abstract, various embodiments of the present technology generally relate to closed loop deep brain stimulation based on inferred sleep stage from physiological data using machine learning classifiers …; paragraph [0072], FIG. 9A is an example of a representative spectrogram of a local field potential (LFP) recording acquired over the course of one full night's sleep from a deep brain stimulation (DBS) electrode implanted into the subthalamic nucleus (STN). A PSG informed hypnogram assessed by a sleep expert is aligned with the LFP recordings (AWM, awake with movement; AWOM, awake without movement; REM, rapid eye movement; N1-3, non-rapid eye movement stages 1-3)) wherein the hard label (Paragraph [0073], FIG. 9B is a schematic of a representation of the feedforward classifier 900 used to predict sleep stage from 30-s labelled LFP epochs; an output layer 930 (predicted sleep stage) indicates sleep stage “1:REM”) has a smaller data size than a soft label that comprises information about probability values indicating respective probabilities (Paragraphs [0073]-[0074], a hidden layer 92; a feedforward artificial neural network (ANN) was trained with a single hidden layer (FIG. 9B) to prospectively identify whether a given 30-s epoch of STN-LFP recording took place during one of three possible states: REM, NREM or Awake … The ANN output was a probability that the measured epoch occurs during one of the three possible states) for the i.i.d. global data set being classified into each of the plurality of predefined classes (Paragraph [0073], an input layer 910 (LFP frequency power bands); paragraph [0072], a PSG informed hypnogram assessed by a sleep expert is aligned with the LFP recordings (AWM, awake with movement; AWOM, awake without movement; REM, rapid eye movement; N1-3, non-rapid eye movement stages 1-3)).  
Bhowmick and Zylberberg are analogous art because both pertain to a system/method for training the machine learning model to predict the class of the input data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Zylberberg, applying the method for predicting the sleep stage taught by Zylberberg to have a machine learning model trained for identifying the probability value of one of the possible states from the training data and then determining the observed sleep state by using accuracy of probabilities. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Zylberberg to obtain the invention as specified in the claim.
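
For illustration only, the size relationship between a hard label and a soft label can be sketched as follows. The byte-level encoding (one unsigned byte for a class index, one 32-bit float per class probability), the function names, and the probability values are assumptions made for this sketch; neither reference specifies a wire format. The three states are those named in the cited passage of Zylberberg.

```python
import struct

# The three states named in the cited passage.
CLASSES = ["REM", "NREM", "Awake"]


def encode_hard_label(class_index):
    """Encode a single predicted class index as one unsigned byte (assumed format)."""
    return struct.pack("<B", class_index)


def encode_soft_label(probabilities):
    """Encode one 32-bit float per class probability (assumed format)."""
    return struct.pack(f"<{len(probabilities)}f", *probabilities)


# Hypothetical classifier output: one probability per class.
probs = [0.7, 0.2, 0.1]

# The hard label is the index of the most probable class.
hard_index = max(range(len(probs)), key=probs.__getitem__)

hard = encode_hard_label(hard_index)
soft = encode_soft_label(probs)
```

Under these assumed encodings, the hard label occupies fewer bytes than the soft label, and the gap grows with the number of predefined classes because the soft label carries one value per class.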

	Regarding claim 17, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 16). 
	However, Bhowmick does not specifically disclose wherein the hard label has a smaller data size than a soft label that comprises information about probability values indicating respective probabilities for the i.i.d. global data set being classified into each of the plurality of predefined classes.
 In the similar field of endeavor, Zylberberg discloses (Abstract, various embodiments of the present technology generally relate to closed loop deep brain stimulation based on inferred sleep stage from physiological data using machine learning classifiers …; paragraph [0072], FIG. 9A is an example of a representative spectrogram of a local field potential (LFP) recording acquired over the course of one full night's sleep from a deep brain stimulation (DBS) electrode implanted into the subthalamic nucleus (STN). A PSG informed hypnogram assessed by a sleep expert is aligned with the LFP recordings (AWM, awake with movement; AWOM, awake without movement; REM, rapid eye movement; N1-3, non-rapid eye movement stages 1-3)) wherein the hard label (Paragraph [0073], FIG. 9B is a schematic of a representation of the feedforward classifier 900 used to predict sleep stage from 30-s labelled LFP epochs; an output layer 930 (predicted sleep stage) indicates sleep stage “1:REM”) has a smaller data size than a soft label that comprises information about probability values (Paragraphs [0073]-[0074], a hidden layer 92; a feedforward artificial neural network (ANN) was trained with a single hidden layer (FIG. 9B) to prospectively identify whether a given 30-s epoch of STN-LFP recording took place during one of three possible states: REM, NREM or Awake … The ANN output was a probability that the measured epoch occurs during one of the three possible states) indicating respective probabilities for the i.i.d. global data set being classified into each of the plurality of predefined classes (Paragraph [0073], an input layer 910 (LFP frequency power bands); paragraph [0072], a PSG informed hypnogram assessed by a sleep expert is aligned with the LFP recordings (AWM, awake with movement; AWOM, awake without movement; REM, rapid eye movement; N1-3, non-rapid eye movement stages 1-3)).  
Bhowmick and Zylberberg are analogous art because both pertain to a system/method for training the machine learning model to predict the class of the input data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Zylberberg, applying the method for predicting the sleep stage taught by Zylberberg to have a machine learning model trained for identifying the probability value of one of the possible states from the training data and then determining the observed sleep state by using accuracy of probabilities. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Zylberberg to obtain the invention as specified in the claim.

Claims 4, 11-12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Bhowmick et al (U.S. Patent Application Publication 2020/0104705 A1) in view of Szeto et al (U.S. Patent Application Publication 2018/0018590 A1) in view of Atwood et al (U.S. Patent Application Publication 2021/0035059 A1).

	Regarding claim 4, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 2). 
	However, Bhowmick does not specifically disclose wherein the global model is updated using a loss function having at least one of a ground truth (GT) label and the hard label as a variable.
	In the similar field of endeavor, Atwood discloses (Paragraph [0029], FIG. 1 shows an example system 100 for supply chain management; paragraphs [0066]-[0067], FIG. 2 shows the supply chain management computing system 102 can store or include one or more machine-learned models 110 (e.g., any of the models discussed herein) … the supply chain management computing system 102 can receive the one or more machine-learned models 110 from the machine learning computing system 130 over network 180 and can store the one or more machine-learned models 110 in the memory 114; paragraph [0127], FIG. 6 shows an example processing workflow for training a machine-learned model 110) wherein the global model is updated using a loss function (Paragraph [0132], the model 110 can be trained based on the loss function 608. As an example, one example training technique is backwards propagation of errors (“backpropagation”). For example, the loss function 608 can be backpropagated through the model 110 to update one or more parameters of the model 110 (e.g., based on a gradient of the loss function 608)) having at least one of a ground truth (GT) label and the hard label as a variable (Paragraph [0131], a loss function 608 can evaluate a difference between the model prediction 606 and the ground truth label 604).
	Bhowmick and Atwood are analogous art because both pertain to a system/method for training and updating the machine learning model to predict the class of the input data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Atwood, applying the method for training a machine-learned model taught by Atwood to provide a loss value by using a loss function having the ground truth label as a variable in order to improve the accuracy of a machine learning model. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Atwood to obtain the invention as specified in the claim.
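
For illustration only, a loss function that takes a label as a variable and evaluates the difference between a model prediction and that label can be sketched as follows. Cross-entropy is chosen here as an assumption; the cited passages of Atwood describe a loss function generally and do not name a specific one. The function name and the probability values are hypothetical.

```python
import math


def cross_entropy(predicted_probs, label_index):
    """Loss for one example: negative log of the probability assigned to the label.

    `label_index` stands in for a ground truth label or a hard label; the
    choice of cross-entropy is an assumption made for this sketch.
    """
    return -math.log(predicted_probs[label_index])


# Hypothetical model predictions over three classes; the label is class 1.
loss_good = cross_entropy([0.1, 0.8, 0.1], 1)  # prediction agrees with the label
loss_bad = cross_entropy([0.6, 0.3, 0.1], 1)   # prediction disagrees with the label
```

A prediction that places more probability on the labeled class yields a smaller loss, so updating model parameters to reduce this value reduces the difference between the prediction and the label.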

	Regarding claim 11, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 10). 
	However, Bhowmick does not specifically disclose wherein the updating of the global model is performed using a loss function having at least one of a ground truth (GT) label and the hard label as a variable.
	In the similar field of endeavor, Atwood discloses (Paragraph [0029], FIG. 1 shows an example system 100 for supply chain management; paragraphs [0066]-[0067], FIG. 2 shows the supply chain management computing system 102 can store or include one or more machine-learned models 110 (e.g., any of the models discussed herein) … the supply chain management computing system 102 can receive the one or more machine-learned models 110 from the machine learning computing system 130 over network 180 and can store the one or more machine-learned models 110 in the memory 114; paragraph [0127], FIG. 6 shows an example processing workflow for training a machine-learned model 110) wherein the updating of the global model is performed using a loss function (Paragraph [0132], the model 110 can be trained based on the loss function 608. As an example, one example training technique is backwards propagation of errors (“backpropagation”). For example, the loss function 608 can be backpropagated through the model 110 to update one or more parameters of the model 110 (e.g., based on a gradient of the loss function 608)) having at least one of a ground truth (GT) label and the hard label as a variable (Paragraph [0131], a loss function 608 can evaluate a difference between the model prediction 606 and the ground truth label 604).
	Bhowmick and Atwood are analogous art because both pertain to a system/method for training and updating the machine learning model to predict the class of the input data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Atwood, applying the method for training a machine-learned model taught by Atwood to provide a loss value by using a loss function having the ground truth label as a variable in order to improve the accuracy of a machine learning model. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Atwood to obtain the invention as specified in the claim.

	Regarding claim 12, the combination of Bhowmick in view of Szeto in view of Atwood discloses everything claimed as applied above (see claim 11). 
	However, Bhowmick does not specifically disclose wherein the updating of the global model is performed to reduce a difference between at least one of the GT label and the hard label and a calculation result obtained by inputting the i.i.d. global data set to the global model.
	In the similar field of endeavor, Atwood discloses wherein the updating of the global model is performed to reduce a difference (FIG. 6; paragraph [0131], a loss function 608 can evaluate a difference between the model prediction 606 and the ground truth label 604) between at least one of the GT label (Paragraph [0128], (604) a ground truth label associated with such set of data, where the ground truth label provides a “correct” prediction for the set of data) and the hard label and a calculation result obtained by inputting the i.i.d. global data set to the global model (Paragraph [0128], The training data 162 can include, for example, historical data that indicates the historical outcomes of various previous shipments (e.g., which may take the form of shipping logs). In some implementations, the training data 162 can include a plurality of training example pairs, where each training example pair provides: (602) a set of data (e.g., incorrect and/or incomplete data); paragraph [0130], based on the set of data 602, the machine-learned model 110 can produce a model prediction 606. As examples, the model prediction 606 can include a prediction of the ground truth label 604).
	Bhowmick and Atwood are analogous art because both pertain to systems/methods for training and updating a machine learning model to predict the class of input data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Atwood, applying the method for training a machine-learned model taught by Atwood to provide a loss value by using a loss function having the ground truth label as a variable, in order to improve the accuracy of the machine learning model. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Atwood to obtain the invention as specified in claim 12.

	Regarding claim 18, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 16). 
	However, Bhowmick does not specifically disclose wherein the global model is updated using a loss function having at least one of a ground truth (GT) label and the hard label as a variable.
	In the similar field of endeavor, Atwood discloses (Paragraph [0029], FIG. 1 shows an example system 100 for supply chain management; paragraphs [0066]-[0067], FIG. 2 shows the supply chain management computing system 102 can store or include one or more machine-learned models 110 (e.g., any of the models discussed herein) … the supply chain management computing system 102 can receive the one or more machine-learned models 110 from the machine learning computing system 130 over network 180 and can store the one or more machine-learned models 110 in the memory 114; paragraph [0127], FIG. 6 shows an example processing workflow for training a machine-learned model 110) wherein the global model is updated using a loss function (Paragraph [0132], the model 110 can be trained based on the loss function 608. As an example, one example training technique is backwards propagation of errors (“backpropagation”). For example, the loss function 608 can be backpropagated through the model 110 to update one or more parameters of the model 110 (e.g., based on a gradient of the loss function 608)) having at least one of a ground truth (GT) label and the hard label as a variable (Paragraph [0131], a loss function 608 can evaluate a difference between the model prediction 606 and the ground truth label 604).
	Bhowmick and Atwood are analogous art because both pertain to systems/methods for training and updating a machine learning model to predict the class of input data. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Atwood, applying the method for training a machine-learned model taught by Atwood to provide a loss value by using a loss function having the ground truth label as a variable, in order to improve the accuracy of the machine learning model. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Atwood to obtain the invention as specified in claim 18.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Bhowmick et al (U.S. Patent Application Publication 2020/0104705 A1) in view of Szeto et al (U.S. Patent Application Publication 2018/0018590 A1) in view of Bhattacharjee et al (U.S. Patent Application Publication 2020/0334567 A1).

	Regarding claim 7, the combination of Bhowmick in view of Szeto discloses everything claimed as applied above (see claim 1), and Bhowmick discloses a set of client data and the size of the individual units of unlabeled data.
	However, Bhowmick does not specifically disclose wherein the i.i.d. global data set has a smaller data size than the local data set.
	In the similar field of endeavor, Bhattacharjee discloses (Abstract, techniques for distributing the training of machine learning models across a plurality of computing devices are presented. An example method includes receiving, from a computing device in a distributed computing environment, a request for a set of outstanding jobs for training part of a machine learning model …; paragraph [0013], FIG. 1 shows an example networked computing environment in which training of machine learning models is distributed across a plurality of client devices, according to an embodiment of the present disclosure. As illustrated, computing environment 100 includes a plurality of client devices 120, a peer registry service 130, a model training manager 140, a job repository 150, and a storage service 160, connected via network 110) wherein the i.i.d. global data set has a smaller data size (Paragraphs [0016]-[0017], to participate in training a machine learning model, training application 122 executing on a client device 120 can request, from model training manager 140, a list of outstanding jobs that client device 120 can participate in. Each of the outstanding jobs may be associated with a particular machine learning model to be trained and a particular training data set associated with the particular machine learning model … model training manager 140 can adjust a performance metric of client device 120 to reflect degraded or lesser expected performance from client device 120 and assign smaller training data sets to client device 120 for processing) than the local data set (Paragraph [0015], training application 122 may further include a benchmarking application that may be executed when training application 122 is installed on client device 120 … The benchmarking application may include, for example, executable code used to determine the amount of time needed for client device 120 to process a data set of a fixed size).
	Bhowmick and Bhattacharjee are analogous art because both pertain to systems/methods for training a machine learning model. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system/method for generating a privatized proposed label on a client device taught by Bhowmick to incorporate the teachings of Bhattacharjee, applying the method for training a machine-learned model taught by Bhattacharjee to transmit, to the client device, global data having a smaller data size than the fixed size of the client device. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Bhowmick according to the relied-upon teachings of Bhattacharjee to obtain the invention as specified in claim 7.

	
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Xilin Guo whose telephone number is (571)272-5786. The examiner can normally be reached Monday - Friday 9:00 AM-5:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached on 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XILIN GUO/Primary Examiner, Art Unit 2616