Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation “the different sources” in the preamble. There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the predetermined requirements.” There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the calculation.” There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the raw data.” There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the uploaded data.” There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the border.” There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the local area.” There is insufficient antecedent basis for this limitation in the claim.
Claim 1 recites the limitation “the global model.” There is insufficient antecedent basis for this limitation in the claim.
Claim 2 recites “the same format.” There is insufficient antecedent basis for this limitation in the claim.
Claim 4 recites “and the encrypted data would be transmitted to the secure computing server based on the requirements.” One skilled in the art could not determine the scope of this claim because the verb tense “would be transmitted” is ambiguous. That is, it is unclear whether this element requires the data to be transmitted. As a result, it is unclear how much patentable weight to give this element, if any. This renders the claim vague and indefinite.
Claims 2-21 are also rejected because they inherit the deficiencies of claim 1. 

Claim Objections
Claim 2 is objected to for the following informalities: There is a period after the word “analysis” which appears to be typographical in nature. Appropriate correction is required. 
Claim 11 is objected to for the following informalities: There is a period after the word “parameters” which appears to be typographical in nature. Appropriate correction is required. 
Claim 11 is objected to for the following informalities: There is a period after the word “met” which appears to be typographical in nature. Appropriate correction is required.
Claim 16 is objected to for the following informalities: There is a period after the word “source” which appears to be typographical in nature. Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8, 10-11, 13-17, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Yang, Applied Federated Learning: Improving Google Keyboard Query Suggestions Dec. 2018 in view of McMahan, Google AI Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2018. 
With respect to claim 1, Yang teaches “1. (Original) A whole-lifecycle encrypted big data analysis method for the data from the different sources, which is characterized by the following steps: processing the data from multiple data sources to obtain the data for each data source involved in the calculation which is required by the analysis” on p. 2 (emphasis added): 
FL is an approach to distributed computation in which the data is kept at the network edges and never collected centrally [1]. Instead, minimal, focused model updates are transmitted, optionally employing additional privacy-preserving technologies such as secure multiparty computation [11] and differential privacy [10, 12, 13].

Yang further suggests encryption on p. 2, column 2, first full paragraph:

In addition to these advantages, FL can guarantee an even higher standard of privacy by making use of two additional techniques. With secure aggregation [11], clients’ updates are securely summed into a single aggregate update without revealing any client’s individual component even to the server. This is accomplished by cryptographically simulating a trusted third party. Differential privacy techniques can be used in which each client adds a carefully calibrated amount of noise to their update to mask their contribution to the learned model [10]. However, since neither of these techniques were employed in the present work, we will not describe them in further detail here

“wherein the raw data of each data source would never cross the border of the local area”  on p. 2 (“FL is an approach to distributed computation in which the data is kept at the network edges and never collected centrally”); p. 2 (“Using FL allows us to train machine learning models without collecting sensitive raw input from users”); 
“. . .”; 
“according to the predetermined requirements” on p. 2
We collect training data for this model by observing user interactions with the app: when surfacing a query suggestion to a user, a tuple (features; label) is stored in an on-device training cache, a SQLite based database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation of models provided by our servers.
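For illustration only, the on-device training cache described in the passage quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Yang's implementation: the table name training_cache, the column names, the JSON encoding of the features, and the seven-day retention window are all hypothetical, since the reference does not disclose them.

```python
import json
import sqlite3
import time

# Hypothetical retention window; the reference gives no concrete value.
TTL_SECONDS = 7 * 24 * 3600


def open_cache(path=":memory:"):
    """Open the cache and create the (hypothetical) table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS training_cache ("
        "created_at REAL NOT NULL, features TEXT NOT NULL, label TEXT NOT NULL)"
    )
    return conn


def store_example(conn, features, label, now=None):
    """Store one (features, label) tuple; label is 'clicked' or 'ignored'."""
    if label not in ("clicked", "ignored"):
        raise ValueError("label must be 'clicked' or 'ignored'")
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO training_cache VALUES (?, ?, ?)",
        (now, json.dumps(features), label),
    )


def expire_old(conn, now=None, ttl=TTL_SECONDS):
    """Apply the time-to-live policy: delete rows older than ttl seconds."""
    now = time.time() if now is None else now
    conn.execute("DELETE FROM training_cache WHERE created_at < ?", (now - ttl,))


def load_examples(conn):
    """Return the retained (features, label) tuples, oldest first."""
    rows = conn.execute(
        "SELECT features, label FROM training_cache ORDER BY created_at"
    )
    return [(json.loads(f), lab) for f, lab in rows]
```

In this sketch the raw tuples never leave the local database; only the rows that survive expire_old would be available to an on-device training pass.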

“performs federated model training of corresponding analysis method on the data involving in the calculation of the multiple data sources in a trusted execution environment” on p. 1 
The introduction of Federated Learning (FL) [1, 2, 3] enables a new paradigm of machine learning where both the training data and most of the computation involved in model training are decentralized. In contrast to traditional server-side training where user data is aggregated on centralized servers for training, FL instead trains models on end user devices while aggregating only ephemeral parameter updates on a centralized server. This is particularly advantageous for environments where privacy is paramount.
p. 2
FL is an approach to distributed computation in which the data is kept at the network edges and never collected centrally [1]. Instead, minimal, focused model updates are transmitted, optionally employing additional privacy-preserving technologies such as secure multiparty computation [11] and differential privacy [10, 12, 13]. Compared to traditional approaches in which data is collected and stored in a central location, FL offers increased privacy.

p. 2 (“This data is then used for on-device training and evaluation of models provided by our servers”); p. 3 
The server runtime waits until a predefined number of clients for this population have connected, then provides each with a training task that contains: • a model consisting of a TensorFlow graph and checkpoint [15] • metadata about how to execute the model (input + output node names, types and shapes; operational metrics to report to the server such as loss, statistics of the data processed). Execution can refer, but is not limited to, training or evaluation passes. • selection criteria used to query the training cache (e.g. filter data by date)



 “and obtains model training results after multiple iterations” on p. 5 
During federated training, we apply the following server constraints to our FL tasks: Goal Client Count The target number of clients for a round of federated training, here 100. Minimum Client Count The minimum number of clients required to run a round. Here 80, i.e. although we ideally want 100 training clients, we will run a round even if we only have 80 clients. Training Period How frequently we would like to run rounds of training. Here 5 min. Report Window The maximum time to wait for clients to report back with model updates, here 2 minutes. Minimum Reporting Fraction The fraction of clients, relative to the actual number of clients gathered for a round, which have to report back to commit a round by the end of the Report Window. Here 0.8.
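For illustration only, the server constraints quoted above can be expressed as a small configuration with two checks. This is a sketch, not Yang's server code: the class name RoundConfig and the three function names are hypothetical, and only the numeric values (100, 80, 5 min, 2 minutes, 0.8) come from the quoted passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoundConfig:
    """Values taken from the quoted passage; field names are hypothetical."""
    goal_client_count: int = 100
    minimum_client_count: int = 80
    training_period_s: int = 5 * 60
    report_window_s: int = 2 * 60
    minimum_reporting_fraction: float = 0.8


def can_start_round(cfg, connected_clients):
    """A round may run once at least the minimum client count has connected."""
    return connected_clients >= cfg.minimum_client_count


def clients_for_round(cfg, connected_clients):
    """Gather up to the goal client count for the round."""
    return min(connected_clients, cfg.goal_client_count)


def can_commit_round(cfg, gathered_clients, reported_clients):
    """Commit only if enough gathered clients reported within the window."""
    if gathered_clients <= 0:
        return False
    fraction = reported_clients / gathered_clients
    return fraction >= cfg.minimum_reporting_fraction
```

Under these assumptions, a round gathered with 80 clients commits when at least 64 of them report back before the Report Window closes.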
	
p. 7, Table 2 (emphasis added):

Fig. 6. Retained impressions at thresholds τ0 < τ1 < τ2, uniformly spaced, over time.
Model               Live ∆CTR    Live Retained Clicks
Model Iteration 1   +14.52%      77.11%
Model Iteration 2   +25.56%      63.39%
Model Iteration 3   +51.49%      82.01%

p. 7 (emphasis added): 
The results detailed here were only the first in a sequence of models trained, evaluated, and launched with FL. Successive iterations differed in that they were trained longer on more users’ data, had better tuned hyperparameters, and incorporated additional features. The results of these successive iterations are shown in Table 2, but we do not describe in depth here all the changes made between these iterations.

“update the global model according to the model training result”  on p. 2 (“Then the clients upload their model update – that is, the difference between the final parameters after training and the original parameters – and the server averages the contributions before accumulating them into the global model”); 
“and validate the global model” on p. 6 (emphasis added): 
To validate the model after training, we interpreted the coefficients of our logistic regression model via direct examination of the weights in order to gain insight into what the model had learned. 

Yang further teaches uploaded data on p. 2 (“instead, minimal, focused model updates are transmitted, optionally employing additional privacy-preserving technologies such as secure multiparty computation [11] and differential privacy [10, 12, 13].”)
It appears Yang fails to explicitly teach “and encryption is required for all the uploaded data.” 
However, McMahan teaches “and encryption is required for all the uploaded data” on p. 2  
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication. 

McMahan and Yang are analogous art because they are from the same field of endeavor. It would have been obvious to one skilled in the art before the effective filing date of the invention to modify the uploaded data in Yang to include “and encryption is required for all the uploaded data” as taught by McMahan. The motivation would have been to maintain data privacy and data integrity.
With respect to claim 2, Yang teaches “2. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 1, wherein the steps of processing the data from multiple data sources to obtain the data for each data source involved in the calculation, which is required by the analysis comprises: process the data from the different data sources to obtain the data with a unified format” on p. 4 
McMahan teaches “encrypt the data with the same format according to analysis requirements and transmit them to a secure computing server” on p. 2 
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. 

McMahan teaches “process the transmitted data in a verified trusted execution environment to form a database of all the processed data, which is a global database: all of the data from the different data sources are involved in the calculation” on p. 2 
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. 

Yang also teaches “process the transmitted data in a verified trusted execution environment to form a database of all the processed data, which is a global database: all of the data from the different data sources are involved in the calculation” on p. 2 section 2.1 second paragraph. 
Yang teaches “the data corresponding to each data source in the global database is transmitted back to the corresponding data source, that is, the data of each corresponding data source involving in the calculation, to form a local feature database of the data source” on p. 3 (“Upon convergence, a trained checkpoint is used to create and deploy a model to clients for inference”);
“and in the trusted execution environment, each data source forms the corresponding data required for analysis to participate in the calculation according to the local feature database” on p. 3 
The client executes the task using a custom task interpreter based on TensorFlow Mobile [16], a stripped down Android build of the TensorFlow runtime. In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress
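For illustration only, the aggregation step in the quoted passage can be sketched using the definition of a model update that Yang gives on p. 2 (the difference between the final parameters after training and the original parameters, which the server averages and accumulates into the global model). This is a simplified sketch, not Yang's code: it uses plain Python lists for the parameters, the function names are hypothetical, and it takes an unweighted average, whereas a production implementation of Federated Averaging may weight each client by its number of training examples.

```python
def client_update(global_weights, local_weights):
    """Model update: final local parameters minus the original parameters."""
    return [local - orig for local, orig in zip(local_weights, global_weights)]


def federated_average(global_weights, updates):
    """Average the client updates and accumulate them into the global model."""
    if not updates:
        # No client reported, so the global model is unchanged.
        return list(global_weights)
    n = len(updates)
    mean_update = [sum(column) / n for column in zip(*updates)]
    return [w + d for w, d in zip(global_weights, mean_update)]


# Usage: two hypothetical clients train locally from the same global model.
global_model = [0.0, 1.0]
updates = [
    client_update(global_model, [1.0, 1.0]),
    client_update(global_model, [3.0, 3.0]),
]
new_global_model = federated_average(global_model, updates)
```

Only the updates, and never the local training data, are passed to federated_average in this sketch.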
The motivation to combine is the same as for claim 1 above.
 
With respect to claim 3, Yang teaches “3. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 2, wherein the steps of processing the data from the different data sources to obtain the data with a unified format comprises: according to data analysis requirements, a corresponding data model is selected to perform unified processing on the data from the different data sources in each data source locally to generate globally available data structures, model parameters, mapping files, and preprocessed files” on p. 3
The client executes the task using a custom task interpreter based on TensorFlow Mobile [16], a stripped down Android build of the TensorFlow runtime. In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress
	. . .
Upon convergence, a trained checkpoint is used to create and deploy a model to clients for inference

(Examiner finds the aggregated results are globally available data structures; Examiner finds the deployed model (triggering or baseline model) includes parameters—see p. 1 (“FL instead trains models on end user devices while aggregating only ephemeral parameter updates on a centralized server”)).
With respect to claim 8, Yang teaches “8. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 2, in the step of transmitting back the data corresponding to each data source in the global database to the corresponding data source to form a local feature database, the transmitted data includes the indexing data or the feature data determined according to specific requirements” on p. 2
In this section we provide a brief technical description of the client and server side runtime that enables FL in Gboard by walking through the process of performing training, evaluation and inference of the query suggestion triggering model.
As described earlier, our use case is to train a model that predicts whether query suggestions are useful, in order to filter out less relevant queries. We collect training data for this model by observing user interactions with the app: when surfacing a query suggestion to a user, a tuple (features; label) is stored in an on-device training cache, a SQLite based database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation of models provided by our servers

on p. 3 
The server runtime waits until a predefined number of clients for this population have connected, then provides each with a training task that contains:
• a model consisting of a TensorFlow graph and checkpoint [15]
• metadata about how to execute the model (input + output node names, types and shapes; operational metrics to report to the server such as loss, statistics of the data processed). Execution can refer, but is not limited to, training or evaluation passes.
• selection criteria used to query the training cache (e.g. filter data by date)
The client executes the task using a custom task interpreter based on TensorFlow Mobile [16], a stripped down Android build of the TensorFlow runtime. In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress.

“and the data corresponding to each data source in the global database is transmitted back to the corresponding data source” on p. 3 (“Upon convergence, a trained checkpoint is used to create and deploy a model to clients for inference”). 
With respect to claim 10, Yang teaches “10. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 1, where in the step of performing corresponding analytical method and federated learning of the data from the different data sources involved in the calculation in a trusted execution environment based on the predetermined requirements, the described analytical methods include logistic regression, decision trees, support vector machines, various neural network algorithms and statistical analysis methods” on p. 3 (emphasis added): 
The server runtime waits until a predefined number of clients for this population have connected, then provides each with a training task that contains: • a model consisting of a TensorFlow graph and checkpoint [15] • metadata about how to execute the model (input + output node names, types and shapes; operational metrics to report to the server such as loss, statistics of the data processed). Execution can refer, but is not limited to, training or evaluation passes. • selection criteria used to query the training cache (e.g. filter data by date)

With respect to claim 11, Yang teaches “11. (Currently Amended) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 1, the steps of federated model training of corresponding analysis methods on the data involving in the calculation of the multiple data sources in a trusted execution environment, and obtains model training results after multiple iterations include according to different data analysis requirements, perform federated modeling on the data of each data source involved in the calculation” on p. 5 
During federated training, we apply the following server constraints to our FL tasks: Goal Client Count The target number of clients for a round of federated training, here 100. Minimum Client Count The minimum number of clients required to run a round. Here 80, i.e. although we ideally want 100 training clients, we will run a round even if we only have 80 clients. Training Period How frequently we would like to run rounds of training. Here 5 min. Report Window The maximum time to wait for clients to report back with model updates, here 2 minutes. Minimum Reporting Fraction The fraction of clients, relative to the actual number of clients gathered for a round, which have to report back to commit a round by the end of the Report Window. Here 0.8

“then perform federated model training to calculate data characteristics and intermediate parameters” 
on p. 5 
During federated training, we apply the following server constraints to our FL tasks: Goal Client Count The target number of clients for a round of federated training, here 100. Minimum Client Count The minimum number of clients required to run a round. Here 80, i.e. although we ideally want 100 training clients, we will run a round even if we only have 80 clients. Training Period How frequently we would like to run rounds of training. Here 5 min. Report Window The maximum time to wait for clients to report back with model updates, here 2 minutes. Minimum Reporting Fraction The fraction of clients, relative to the actual number of clients gathered for a round, which have to report back to commit a round by the end of the Report Window. Here 0.8

(intermediate parameters are the parameters up to 100 or 80 clients for example, and/or maximum wait time up to 2 minutes)
“On the data source side, the calculation can be performed in a trusted execution environment” (this element has no patentable weight; even if it did, Examiner finds the server and associated architecture taught in Fig. 1 of Yang teaches “on the data source side, the calculation can be performed in a trusted execution environment”).
McMahan teaches “The calculated data characteristics and intermediate parameters are encrypted and uploaded to the secure computing server” on p. 2 
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model.

Yang teaches “the secure computing server generates global parameters based on data characteristics and intermediate parameters, and returns them to each data source in encrypted form” on p. 3 (“Upon convergence, a trained checkpoint is used to create and deploy a model to clients for inference”).
Yang teaches “repeat the above two steps until the stop condition is met. The model obtained after the iteration stops is the global model” on p. 5 
During federated training, we apply the following server constraints to our FL tasks: Goal Client Count The target number of clients for a round of federated training, here 100. Minimum Client Count The minimum number of clients required to run a round. Here 80, i.e. although we ideally want 100 training clients, we will run a round even if we only have 80 clients. Training Period How frequently we would like to run rounds of training. Here 5 min. Report Window The maximum time to wait for clients to report back with model updates, here 2 minutes. Minimum Reporting Fraction The fraction of clients, relative to the actual number of clients gathered for a round, which have to report back to commit a round by the end of the Report Window. Here 0.8

(Examiner finds client count, minimum client count, etc. are all stop conditions). The motivation to combine is the same as for claim 1 above.
With respect to claim 13, Yang teaches “13. (Currently Amended) According to the whole-lifecycle encrypted big data analysis methods for the data from the different sources described in claim 1  where it comprises a log recording step for recording information about the data used” on p. 2 
In this section we provide a brief technical description of the client and server side runtime that enables FL in Gboard by walking through the process of performing training, evaluation and inference of the query suggestion triggering model.
As described earlier, our use case is to train a model that predicts whether query suggestions are useful, in order to filter out less relevant queries. We collect training data for this model by observing user interactions with the app: when surfacing a query suggestion to a user, a tuple (features; label) is stored in an on-device training cache, a SQLite based database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation of models provided by our servers
(Examiner finds the data in the cache is the result of a log recording step). 
With respect to claim 14, Yang teaches “14. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 13, wherein the information of the data includes data statistics and/or data content; the log recording method includes files, databases, queue and/or blockchain” on p. 2 (see citations for claim 13 above; tuple is a database record or file). 
With respect to claim 15, Yang teaches “15. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 13, wherein the data is optimized according to different analysis algorithm requirements and data characteristics, and the data is optimized in stages or in parallel”  on p. 1
This is just one application of FL – one where developers have never had access to the training data. Other works have additionally explored federated multi-task learning [5], parallel stochastic optimization [6], and threat actors in the context of FL [7].

on p. 3 
In this section, we describe the model setup used to improve Gboard’s query suggestion system. The system works in two stages – a traditionally server-side trained baseline model which generates query candidates, and a triggering model trained via FL (Figure 2). The goal is to improve query click-through rate (CTR) by taking suggestions from the baseline model and removing low quality suggestions through the triggering model

on p. 6 
Manual inspection of the weights also uncovered an unusual pattern that revealed a way to improve future model iterations. One binned real-valued feature had zero weight for most of its range, indicating that the expected range of the feature was too large. We improved future iterations of the model by restricting the feature to the correct range so the binned values (which did not change in number) gave more precision within the range. This is just one example approach to the broader domain of debugging without training example access.

(Restriction of a feature to the correct range is an optimization of data). 
“including: removing data entry with missing values, missing values imputation, and/or binning features” on p. 6 
Manual inspection of the weights also uncovered an unusual pattern that revealed a way to improve future model iterations. One binned real-valued feature had zero weight for most of its range, indicating that the expected range of the feature was too large. We improved future iterations of the model by restricting the feature to the correct range so the binned values (which did not change in number) gave more precision within the range. This is just one example approach to the broader domain of debugging without training example access.
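For illustration only, the effect described in the quoted passage (restricting a binned feature to a narrower range so that the same number of bins gives more precision) can be sketched as follows. This is a sketch under stated assumptions, not Yang's feature pipeline: the equal-width binning scheme, the function name bin_index, and the example ranges are hypothetical.

```python
def bin_index(value, lo, hi, n_bins):
    """Map a real value to one of n_bins equal-width bins over [lo, hi].

    Values outside the range are clamped to the first or last bin.
    """
    if n_bins <= 0 or hi <= lo:
        raise ValueError("need n_bins > 0 and hi > lo")
    clamped = min(max(value, lo), hi)
    idx = int((clamped - lo) / (hi - lo) * n_bins)
    # A value equal to hi would otherwise land one past the last bin.
    return min(idx, n_bins - 1)


# With an overly wide expected range, two distinct values share a bin.
wide = (bin_index(2.5, 0.0, 100.0, 10), bin_index(7.5, 0.0, 100.0, 10))

# With the range restricted and the bin count unchanged, they separate.
narrow = (bin_index(2.5, 0.0, 10.0, 10), bin_index(7.5, 0.0, 10.0, 10))
```

In the wide configuration most bins are never populated, which is consistent with the zero weights the quoted passage reports over most of the feature's range.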

With respect to claim 16, Yang teaches “16. (Currently Amended) A whole-lifecycle encrypted big data analysis system for the data from the different sources, is used to execute the method described in claim 1 ; the system includes a data source cluster” on p. 3
The server runtime waits until a predefined number of clients for this population have connected, then provides each with a training task that contains:
• a model consisting of a TensorFlow graph and checkpoint [15]
• metadata about how to execute the model (input + output node names, types and shapes; operational metrics to report to the server such as loss, statistics of the data processed). Execution can refer, but is not limited to, training or evaluation passes.
• selection criteria used to query the training cache (e.g. filter data by date)

(Examiner finds a “predefined number of clients” suggests a cluster of clients);
“and a secure computing server” on Fig. 1 (server is secure—see p. 2 “in addition to these advantages, FL can guarantee an even higher standard of privacy by making use of two additional techniques. With secure aggregation [11], clients’ updates are securely summed into a single aggregate update without revealing any client’s individual component even to the server”); 
McMahan explicitly teaches “encryption” on p. 2
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. 

“the data source cluster includes multiple data sources and modules for data joint modeling & encryption”
on p. 3 
The client executes the task using a custom task interpreter based on TensorFlow Mobile [16], a stripped down Android build of the TensorFlow runtime. In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress.

(Each client sends updates to the model—this is data joint modeling); 
Yang further suggests encryption on p. 2 column 2 first full paragraph 

In addition to these advantages, FL can guarantee an even higher standard of privacy by making use of two additional techniques. With secure aggregation [11], clients’ updates are securely summed into a single aggregate update without revealing any client’s individual component even to the server. This is accomplished by cryptographically simulating a trusted third party. Differential privacy techniques can be used in which each client adds a carefully calibrated amount of noise to their update to mask their contribution to the learned model [10]. However, since neither of these techniques were employed in the present work, we will not describe them in further detail here

McMahan explicitly teaches “encryption” on p. 2:
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. 

Yang teaches “the multiple data sources are used to provide data with a unified data format” on p. 2
predicts whether query suggestions are useful, in order to
filter out less relevant queries. We collect training data for
this model by observing user interactions with the app: when
surfacing a query suggestion to a user, a tuple (features; label)
is stored in an on-device training cache, a SQLite based
database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation
of models provided by our servers.
A key requirement for our on-device

(Examiner finds the tuple is a unified data format); 
Yang teaches  “the data joint modeling & encryption module is used to perform encryption and federated modeling of data provided by multiple data sources, and” on p. 3 
The client executes the task using a custom task interpreter
based on TensorFlow Mobile [16], a stripped down Android
build of the TensorFlow runtime. In the case of training,
a task-defined number of epochs (stochastic gradient descent
passes over the training data) are performed, and the resulting
updates to the model and operational metrics are anonymously
uploaded to the server. There – again using TensorFlow –
these ephemeral updates are aggregated using the Federated
Averaging algorithm to produce a new model, and the
aggregate metrics allow for monitoring training progress.

(Each client sends updates to the model—this is data joint modeling); 
and p. 2 
In addition to these advantages, FL can guarantee an
even higher standard of privacy by making use of two additional
techniques. With secure aggregation [11], clients’
updates are securely summed into a single aggregate update
without revealing any client’s individual component even to
the server. This is accomplished by cryptographically simulating
a trusted third party. Differential privacy techniques
can be used in which each client adds a carefully calibrated
amount of noise to their update to mask their contribution
to the learned model [10]. 

See also McMahan  p. 2 (“Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model”); 
“locally perform federated model training computation on the data source” on p. 3 
 The client executes the task using a custom task interpreter
based on TensorFlow Mobile [16], a stripped down Android
build of the TensorFlow runtime. In the case of training,
a task-defined number of epochs (stochastic gradient descent
passes over the training data) are performed, and the resulting
updates to the model and operational metrics are anonymously
uploaded to the server.

McMahan teaches “The raw data of each data source would never cross the border of local area” on p. 6

Applying Federated Learning requires machine learning practitioners to adopt new tools and a new way of thinking: model development, training, and evaluation with no direct access to or labeling of raw data, with communication cost as a limiting factor. We believe the user benefits of Federated Learning make tackling the technical challenges worthwhile, and are publishing our work with hopes of a widespread conversation within the machine learning community. 

McMahan teaches “and encryption is required for all the uploaded data” on p. 2 
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.

McMahan teaches “and the secure computing server is used to analyze and process the data”  on p. 2 (“Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model”); 
see also Yang p. 3 (“In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress”) 
The motivation to combine Yang and McMahan is the same as claim 1 above. 
With respect to claim 17, Yang teaches “17. (Original) According to the whole-lifecycle encrypted big data analysis system for the data from the different sources described in claim 16, wherein the secure computing server includes a data source data processing sub-module, a model training sub-module, and a model validation sub-module; the data source data processing sub-module processes data from multiple data sources to obtain the data involved in the calculation of each data source required for analysis” 
p. 3 (“In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress”) 
(Examiner gives no patentable weight to the descriptors of the modules; modules are merely instructions provided by computer software); 
Yang teaches “according to the predetermined requirements, the model training sub-module performs federated model training of corresponding analysis methods on the data involving in the calculation of the multiple data sources in the trusted execution environment” 
on p. 3 (“In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress”) 
“and obtains model training results after multiple iterations” on p. 6 
Manual inspection of the weights also uncovered an unusual pattern that revealed a way to improve future model iterations. One binned real-valued feature had zero weight for most of its range, indicating that the expected range of the feature was too large. We improved future iterations of the model by restricting the feature to the correct range so the binned values (which did not change in number) gave more precision within the range. This is just one example approach to the broader domain of debugging without training example access.

p. 5 
During federated training, we apply the following server constraints to our FL tasks: Goal Client Count The target number of clients for a round of federated training, here 100. Minimum Client Count The minimum number of clients required to run a round. Here 80, i.e. although we ideally want 100 training clients, we will run a round even if we only have 80 clients. Training Period How frequently we would like to run rounds of training. Here 5 min. Report Window The maximum time to wait for clients to report back with model updates, here 2 minutes. Minimum Reporting Fraction The fraction of clients, relative to the actual number of clients gathered for a round, which have to report back to commit a round by the end of the Report Window. Here 0.8.
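
For illustration only, the server constraints quoted above can be expressed as a simple check on whether a round may start and whether it may be committed. This is a minimal sketch: the names RoundConfig, can_start_round, and can_commit_round are hypothetical, the default values are the ones stated in the quoted passage, and the Training Period and Report Window timing constraints are deliberately omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoundConfig:
    """Hypothetical container for the constraints named in the quoted passage."""
    goal_client_count: int = 100          # target number of clients per round
    min_client_count: int = 80            # minimum clients required to run a round
    min_reporting_fraction: float = 0.8   # share of gathered clients that must report


def can_start_round(cfg: RoundConfig, gathered: int) -> bool:
    """A round runs only if at least the minimum client count was gathered."""
    return gathered >= cfg.min_client_count


def can_commit_round(cfg: RoundConfig, gathered: int, reported: int) -> bool:
    """A round commits only if enough of the gathered clients reported back."""
    if not can_start_round(cfg, gathered):
        return False
    return reported >= cfg.min_reporting_fraction * gathered
```

Note that the reporting fraction is measured against the number of clients actually gathered for the round, as the passage states, not against the goal client count.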
	
	p. 7, Table 2 (emphasis added):

Fig. 6. Retained impressions at thresholds τ0 < τ1 < τ2, uniformly spaced, over time.

Model               Live ∆CTR    Live Retained Clicks
Model Iteration 1   +14.52%      77.11%
Model Iteration 2   +25.56%      63.39%
Model Iteration 3   +51.49%      82.01%
p. 7 (emphasis added): 
The results detailed here were only the first in a sequence of models trained, evaluated, and launched with FL. Successive iterations differed in that they were trained longer on more users’ data, had better tuned hyperparameters, and incorporated additional features. The results of these successive iterations are shown in Table 2, but we do not describe in depth here all the changes made between these iterations.

“then it updates the global model based on the model training results” 
on p. 3 (“In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress”) 
“and the model validation sub-module validates the global model” 
on p. 6 (emphasis added): 
To validate the model after training, we interpreted the coefficients of our logistic regression model via direct examination of the weights in order to gain insight into what the model had learned. 

The motivation to combine Yang and McMahan is the same as claim 1 above. 
With respect to claim 19, Yang teaches “19. (Currently Amended) According to any one of the whole-lifecycle encrypted big data analysis methods for the data from the different sources described in claim 16 18, wherein the federated modeling, encryption module and the secure computing server comprises a log sub-module respectively, and the log sub-module used to record the information of data used” 
 on p. 2 
 In this section we provide a brief technical description of the
client and server side runtime that enables FL in Gboard by
walking through the process of performing training, evaluation and inference of the query suggestion triggering model.
As described earlier, our use case is to train a model that
predicts whether query suggestions are useful, in order to
filter out less relevant queries. We collect training data for
this model by observing user interactions with the app: when
surfacing a query suggestion to a user, a tuple (features; label) is stored in an on-device training cache, a SQLite based
database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation of models provided by our servers
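
For illustration only, an on-device training cache of (features; label) tuples with a time-to-live retention policy can be sketched with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table name examples, its column layout, the JSON encoding of features, and the helper names open_cache, store_example, and expire_old are all hypothetical and are not taken from Yang.

```python
import json
import sqlite3
import time
from typing import Optional


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical training cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS examples ("
        "created_at REAL NOT NULL, features TEXT NOT NULL, label TEXT NOT NULL)"
    )
    return conn


def store_example(conn: sqlite3.Connection, features: dict, label: str,
                  now: Optional[float] = None) -> None:
    """Record one (features; label) tuple with its creation time."""
    if label not in ("clicked", "ignored"):
        raise ValueError("label must be 'clicked' or 'ignored'")
    ts = time.time() if now is None else now
    conn.execute("INSERT INTO examples VALUES (?, ?, ?)",
                 (ts, json.dumps(features), label))
    conn.commit()


def expire_old(conn: sqlite3.Connection, ttl_seconds: float,
               now: Optional[float] = None) -> int:
    """Delete tuples older than the time-to-live; return how many were removed."""
    ts = time.time() if now is None else now
    cur = conn.execute("DELETE FROM examples WHERE created_at < ?",
                       (ts - ttl_seconds,))
    conn.commit()
    return cur.rowcount
```

Each stored row in this sketch records what was observed and when, and rows are removed once they exceed the retention period.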

(Examiner finds the data in the cache is the result of a log recording step and that the cache functions as a log). 
With respect to claim 20, Yang teaches “20. (Currently Amended) According to any one of the whole-lifecycle encrypted big data analysis methods for the data from the different sources described in claim 19, wherein the information of the data includes data statistics and/or data content” on p. 2 
 In this section we provide a brief technical description of the
client and server side runtime that enables FL in Gboard by
walking through the process of performing training, evaluation and inference of the query suggestion triggering model.
As described earlier, our use case is to train a model that
predicts whether query suggestions are useful, in order to
filter out less relevant queries. We collect training data for
this model by observing user interactions with the app: when
surfacing a query suggestion to a user, a tuple (features; label) is stored in an on-device training cache, a SQLite based
database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation of models provided by our servers

“the log recording method includes files, databases, Queue and/or blockchain” 
on p. 2 
 In this section we provide a brief technical description of the
client and server side runtime that enables FL in Gboard by
walking through the process of performing training, evaluation and inference of the query suggestion triggering model.
As described earlier, our use case is to train a model that
predicts whether query suggestions are useful, in order to
filter out less relevant queries. We collect training data for
this model by observing user interactions with the app: when
surfacing a query suggestion to a user, a tuple (features; label) is stored in an on-device training cache, a SQLite based
database with a time-to-live based data retention policy.
• features is a collection of query and context related information
• label is the associated user action from {clicked, ignored}.
This data is then used for on-device training and evaluation of models provided by our servers

(tuple is a file and/or part of a database). 
With respect to claim 21, Yang teaches “21. (Currently Amended) According to any one of the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 16, it comprises an optimization sub-module, which optimizes the data according to different analysis algorithm requirements and data characteristics, in stages or in parallel” 
on p. 3 
In this section, we describe the model setup used to improve
Gboard’s query suggestion system. The system works in two
stages – a traditionally server-side trained baseline model
which generates query candidates, and a triggering model
trained via FL (Figure 2). The goal is to improve query click-through rate (CTR) by taking suggestions from the baseline model and removing low quality suggestions through the triggering model
on p. 6 
Manual inspection of the weights also uncovered an unusual pattern that revealed a way to improve future model iterations. One binned real-valued feature had zero weight for most of its range, indicating that the expected range of the feature was too large. We improved future iterations of the model by restricting the feature to the correct range so the binned values (which did not change in number) gave more precision within the range. This is just one example approach to the broader domain of debugging without training example access.

“including: removing data entry with missing values, missing values imputation, and/or binning features.”
on p. 6 
We determined that FL had produced a reasonable model (reasonable enough to warrant pushing to devices for live inference experiments) considering that: • the weights corresponding to query categories had intuitive values • the weights corresponding to binned real-valued features tended to have smooth, monotone progression


Manual inspection of the weights also uncovered an unusual pattern that revealed a way to improve future model iterations. One binned real-valued feature had zero weight for most of its range, indicating that the expected range of the feature was too large. We improved future iterations of the model by restricting the feature to the correct range so the binned values (which did not change in number) gave more precision within the range. This is just one example approach to the broader domain of debugging without training example access.
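
For illustration only, the effect described in the passage above (restricting a binned real-valued feature to a narrower range while keeping the number of bins unchanged) can be sketched as follows. This is a minimal sketch: the function name bin_index, equal-width bins, and clamping of out-of-range values are assumptions, since the passage does not describe how the bins were computed.

```python
def bin_index(value: float, low: float, high: float, num_bins: int) -> int:
    """Map a real value to one of num_bins equal-width bins over [low, high].

    Values outside the range are clamped to the nearest edge (an assumption).
    """
    if high <= low or num_bins < 1:
        raise ValueError("need high > low and at least one bin")
    clamped = min(max(value, low), high)
    idx = int((clamped - low) / (high - low) * num_bins)
    # The upper edge itself belongs to the last bin.
    return min(idx, num_bins - 1)
```

With ten bins over the range 0 to 100, the value 5.0 falls in the first bin together with everything else below 10; with the same ten bins over the range 0 to 10, it falls in a bin of width 1, which is the added precision the passage refers to.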

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Yang, Applied Federated Learning: Improving Google Keyboard Query Suggestions, Dec. 2018, in view of McMahan, Google AI Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2018, as applied to claims 1-2 above, and further in view of Hardy, Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption, 2017. 
With respect to claim 4, Yang teaches “4. (Currently Amended) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 2 wherein the steps of encrypting the data with the same data format based on analysis requirements and transmitting them to the secure computing server includes.”  See above claim 2. 
Yang teaches “and the encrypted data would be transmitted to the secure computing server based on the requirements” on p. 2
In addition to these advantages, FL can guarantee an
even higher standard of privacy by making use of two additional
techniques. With secure aggregation [11], clients’
updates are securely summed into a single aggregate update
without revealing any client’s individual component even to
the server. This is accomplished by cryptographically simulating
a trusted third party. Differential privacy techniques
can be used in which each client adds a carefully calibrated
amount of noise to their update to mask their contribution
to the learned model [10]. However, since neither of these
techniques were employed in the present work, we will not
describe them in further detail here.

See also McMahan on p. 2 
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model.

It appears Yang et al. fails to explicitly teach 
“perform security detection between each of the data sources and the secure computing server; generate corresponding security reports and security keys, and then encrypting the modeled data in each data source according to the corresponding security keys”
However, Hardy, Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption, 2017, teaches 
“perform security detection between each of the data sources and the secure computing server” on p. 5 
Security model — We assume that the participants are honest-but-curious: (i) they follow the protocol without tampering with it in any way; (ii) they do not collude with one another; but (iii) they will nevertheless try to infer as much as possible from the information received from the other participants. The honest-but-curious assumption is reasonable in our context since A and B have an incentive to compute an accurate model. The third party, C, holds the private key used for decryption; however, the only information C receives from A and B are encrypted model updates, which we do not consider private in our setup. We assume that A and B’s data is secret, but that the schema (the number of features and the type of each) of each data provider is available to all parties. We assume that the agents communicate on pre-established secure channels. We work under additional privacy constraints:

(In order to generate a key, there must be some type of security detection.)
“generate corresponding security reports and security keys, and then encrypting the modeled data in each data source according to the corresponding security keys” on p. 5 
Security model — We assume that the participants are honest-but-curious: (i) they follow the protocol without tampering with it in any way; (ii) they do not collude with one another; but (iii) they will nevertheless try to infer as much as possible from the information received from the other participants. The honest-but-curious assumption is reasonable in our context since A and B have an incentive to compute an accurate model. The third party, C, holds the private key used for decryption; however the only information C receives from A and B are encrypted model updates, which we do not consider private in our setup. We assume that A and B’s data is secret, but that the schema (the number of features and the type of each) of each data provider is available to all parties.

(Examiner finds “security reports” are merely information that is used to create the corresponding keys). 
Hardy and Yang et al. are analogous art because they are from the same field of endeavor as the claimed invention.  It would have been obvious to one skilled in the art before the effective filing date of the invention to modify “the steps of encrypting the data with the same data format based on analysis requirements and transmitting them to the secure computing server” to include “perform security detection between each of the data sources and the secure computing server; generate corresponding security reports and security keys, and then encrypting the modeled data in each data source according to the corresponding security keys.” The motivation would have been to compute an accurate model while preserving user privacy. See Hardy id. 
Claims 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Yang, Applied Federated Learning: Improving Google Keyboard Query Suggestions, Dec. 2018, in view of McMahan, Google AI Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2018, as applied to claims 1-2 above, and further in view of Kaur, Data deduplication techniques for efficient cloud storage management: a systematic review, 2018. 
With respect to claim 5, Yang et al. teaches “5. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 2, wherein the steps of processing the transmitted data in a verified trusted execution environment to form a database of total data”. See above. 
Yang teaches “and the data is organized as needed to form a database of the total data” on p. 3
The client executes the task using a custom task interpreter based on TensorFlow Mobile [16], a stripped down Android build of the TensorFlow runtime. In the case of training, a task-defined number of epochs (stochastic gradient descent passes over the training data) are performed, and the resulting updates to the model and operational metrics are anonymously uploaded to the server. There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress.

It appears Yang et al. fails to explicitly teach “the data is processed for collision checking.” 
However, Kaur, Data deduplication techniques for efficient cloud storage management: a systematic review, 2018 teaches “data is processed for collision checking1” in the title and abstract. 
Kaur and Yang et al. are analogous art because they are from the same field of endeavor as applicant’s claimed invention. It would have been obvious to one skilled in the art before the effective filing date of the invention to modify the “processing the transmitted data in a verified trusted execution environment to form a database of total data” in Yang et al. to include “the data is processed for collision checking” as taught by Kaur. The motivation would have been to “eliminate[] redundant data, improve[] storage utilization and reduce[] storage cost.” Kaur abstract. 
With respect to claim 6, Kaur teaches “6. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 5, wherein the collision check processing adopts at least one of the following: binary search tree, sequential search, binary search, chunking algorithm, red-black tree, balanced search tree, hash table, trie, suffix tree , Bloom Filter, brute-force, Rabin-Karp algorithm, KMP algorithm, Boyer-Moore algorithm, Sunday algorithm, Horspool algorithm, and perform block processing of data in trusted execution environment” in the abstract, p. 2040 (“This issue of file-level deduplication leads to the introduction of block-level deduplication techniques”) (block processing of data); p. 2039 (“Huffman coding and arithmetic coding are the types of entropy encoding used to represent frequently occurring pattern with fewer bits. Huffman coding developed by David A. Huffman uses frequency-sorted binary tree to generate the optimal prefix code”) p. 2040 (“whereas data deduplication techniques can be applied at file level or sub-file level. It compresses data by using fixed- or variable size chunks. The hash values of these chunks are generated using cryptographic hash functions, and duplicates are detected by matching hash values”). 
The motivation to combine is the same as claim 5 above. 
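
For illustration only, the chunk-level duplicate detection described in the Kaur passage quoted for claim 6 (fixed-size chunks, cryptographic hash values, duplicates detected by matching hash values) can be sketched as follows. This is a minimal sketch: the function names deduplicate_chunks and rebuild, the choice of SHA-256, and the default chunk size are assumptions and are not taken from Kaur.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate_chunks(data: bytes,
                       chunk_size: int = 4) -> Tuple[Dict[str, bytes], List[str]]:
    """Split data into fixed-size chunks and keep one copy of each distinct chunk.

    Returns the chunk store (digest -> chunk) and the ordered list of digests
    needed to rebuild the original data.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    store: Dict[str, bytes] = {}
    recipe: List[str] = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        # A matching digest marks the chunk as a duplicate, so it is not stored again.
        if digest not in store:
            store[digest] = chunk
        recipe.append(digest)
    return store, recipe


def rebuild(store: Dict[str, bytes], recipe: List[str]) -> bytes:
    """Reassemble the original data from the chunk store and the digest list."""
    return b"".join(store[d] for d in recipe)
```

The dictionary lookup in this sketch plays the role of a hash table: a repeated chunk is detected by its digest and stored only once.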
With respect to claim 7, Yang teaches “7. (Currently Amended) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 5, wherein the database of total data includes union, intersection, difference or a combination of these for different data sources” on p. 3 (“There – again using TensorFlow – these ephemeral updates are aggregated using the Federated Averaging algorithm to produce a new model, and the aggregate metrics allow for monitoring training progress”); (by definition, “aggregate” suggests at least a union function). 
The motivation to combine is the same as claim 5 above. 


Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Yang, Applied Federated Learning: Improving Google Keyboard Query Suggestions, Dec. 2018, in view of McMahan, Google AI Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2018, as applied to claims 1-2 above, and further in view of Moataz, Resizable Tree-Based Oblivious RAM, 2015. 
With respect to claim 9, Yang teaches “9. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 2, wherein after processing the data of the different data sources, the data of the data source is divided into multiple data sets to reduce the data amount for each data set” on p. 3 (“Upon convergence, a trained checkpoint is used to create and deploy a model to clients for inference”); (trained checkpoint represents divided data for each data source). 
It appears Yang fails to explicitly teach “and establish an encrypted Oblivious RAM Tree for each data set or; after processing the data of multiple data sources, an encrypted Oblivious RAM Tree is established for the data of the data source” 
However, Moataz, Resizable Tree-Based Oblivious RAM, teaches “establish an encrypted Oblivious RAM Tree for each data set or; after processing the data of multiple data sources, an encrypted Oblivious RAM Tree is established for the data of the data source” on p. 149 
An Oblivious RAM is a cryptographic data structure storing blocks of data in such a way that a client’s pattern of accesses to those blocks is hidden from the party which holds them. ORAMs offer block reads and writes. That is, they provide Read(a) and Write(d, a) operations, where a is the address of a block, and d notes some data. Let N be the total number of blocks the ORAM can store. Each ORAM block is uniquely addressable by a ∈ {0, 1}^(log N), and the size of each block is bits. Data in the ORAM [17] is stored as a binary tree with N leaves. Each node in the tree represents a smaller ORAM bucket [7] which holds k (encrypted) blocks. When clear from the context, we will use the terms node and bucket interchangeably. Each leaf in the tree is uniquely identified by a tag t ∈ {0, 1}^(log N). With P(t), we denote the path which starts at the root of the tree and ends at the leaf node tagged t
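
For illustration only, the path P(t) from the root of the binary tree to the leaf tagged t can be sketched as follows. This is a minimal sketch under stated assumptions: the function name path_to_leaf is hypothetical, nodes are numbered in heap order with the root as node 1 (a convention chosen here, not taken from Moataz), N is assumed to be a power of two, and the encryption and eviction of the blocks held in each bucket are not shown.

```python
from typing import List


def path_to_leaf(tag: int, num_leaves: int) -> List[int]:
    """Return the node indices on the path from the root to the leaf tagged `tag`.

    Nodes use heap numbering: root is 1, the children of node v are 2v and
    2v + 1, and the leaf with tag t is node num_leaves + t.
    """
    if num_leaves < 1 or num_leaves & (num_leaves - 1):
        raise ValueError("num_leaves must be a power of two")
    if not 0 <= tag < num_leaves:
        raise ValueError("tag must identify one of the leaves")
    node = num_leaves + tag
    path = []
    while node >= 1:
        path.append(node)
        node //= 2  # move to the parent bucket
    path.reverse()
    return path
```

For a tree with N leaves, the returned path always contains log2(N) + 1 buckets regardless of the tag, which is what lets every access touch the same number of buckets.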

	Moataz and Yang et al. are analogous art because they are from the same field of endeavor as applicant’s claimed invention. It would have been obvious to one skilled in the art before the effective filing date of the invention to modify the processing of data in Yang et al. to include “establish an encrypted Oblivious RAM Tree for each data set or; after processing the data of multiple data sources, an encrypted Oblivious RAM Tree is established for the data of the data source” as taught by Moataz. The motivation would have been to protect client information.  See Moataz p. 1 
Oblivious RAM has been a perennial research topic since it was first introduced by Goldreich [8]. ORAM allows for an access pattern to an adversarially controlled RAM to be effectively obfuscated. Conceptually, a client’s data is store in an encrypted and shuffled form in the ORAM, such that accessing pieces of data will not produce any recognizable pattern to an adversary which observes these accesses.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Yang, Applied Federated Learning: Improving Google Keyboard Query Suggestions, Dec. 2018, in view of McMahan, Google AI Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2018, as applied to claim 1 above, and further in view of Nilsson, A Performance Evaluation of Federated Learning Algorithm, 2018. 
With respect to claim 12, Yang teaches verifying.  See above. It appears Yang and McMahan fail to explicitly teach “12. (Original) According to the whole-lifecycle encrypted big data analysis method for the data from the different sources described in claim 1, wherein the method of verifying the global model includes k-fold cross-validation or leave-one-out cross-validation.” 
However, Nilsson, A Performance Evaluation of Federated Learning Algorithm, 2018 teaches “wherein the method of verifying the global model includes k-fold cross-validation or leave-one-out cross-validation” on p. 4 (“We also create i.i.d. and non-i.i.d. data partitionings for 5-fold cross-validation”) and p. 5 (“The input to the Bayesian t-test is a list of differences in accuracy of two classifiers from i iterations of k-fold cross-validation. For us, this motivates a distributed partitioning of the MNIST data into folds, as described in Sec. 4.2”). 
Nilsson and Yang et al are analogous art because they are from the same field of endeavor as the claimed invention. It would have been obvious to one skilled in the art before the effective filing date of the invention to modify the verifying in Yang et al. to include “wherein the method of verifying the global model includes k-fold cross-validation or leave-one-out cross-validation” as taught by Nilsson. The motivation would have been to estimate the skill of the model on new data. 
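
For illustration only, the partitioning that underlies k-fold cross-validation (and leave-one-out cross-validation, which is the special case where k equals the number of samples) can be sketched as follows. This is a minimal sketch: the function name k_fold_indices and the round-robin assignment of samples to folds are assumptions and do not reproduce the i.i.d. or non-i.i.d. partitionings described by Nilsson.

```python
from typing import List, Tuple


def k_fold_indices(num_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) splits over range(num_samples).

    Every sample appears in exactly one test fold. Setting k == num_samples
    gives leave-one-out cross-validation.
    """
    if not 1 < k <= num_samples:
        raise ValueError("k must satisfy 1 < k <= num_samples")
    # Round-robin assignment of sample indices to folds (an assumption).
    folds = [list(range(i, num_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = sorted(idx for j, fold in enumerate(folds) if j != i
                           for idx in fold)
        splits.append((train_idx, test_idx))
    return splits
```

A model would be trained on each train_indices set and evaluated on the matching test_indices set, and the k evaluation results would then be combined.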

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Yang, Applied Federated Learning: Improving Google Keyboard Query Suggestions, Dec. 2018, in view of McMahan, Google AI Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data, 2018, as applied to claim 17 above, and further in view of Sattler, Robust and Communication-Efficient Federated Learning from Non-IID Data, Mar 2019.
With respect to claim 18, McMahan teaches “18. (Original) According to the whole-lifecycle encrypted big data analysis system for the data from the different sources described in claim 17, wherein the data processing sub-module of the data source includes an encryption processing unit, a collision checking unit, a data organization unit, and a data return unit the encryption processing unit performs encryption processing on data with a unified data format according to analysis requirements” on p. 2 
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.
See also Yang p. 2:
In addition to these advantages, FL can guarantee an even higher standard of privacy by making use of two additional techniques. With secure aggregation [11], clients’ updates are securely summed into a single aggregate update without revealing any client’s individual component even to the server. This is accomplished by cryptographically simulating a trusted third party. Differential privacy techniques can be used in which each client adds a carefully calibrated amount of noise to their update to mask their contribution to the learned model [10]. However, since neither of these techniques were employed in the present work, we will not describe them in further detail here.
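The update-and-average step described in the McMahan passage quoted above can be sketched as follows. This is an illustrative simplification only: the function name federated_average is hypothetical, model weights are assumed to be flat lists of floats, and weighting each update by its number of training examples is an assumption (the quoted passage says only that updates are "averaged"). Encrypted transport and secure aggregation are not modeled.

```python
def federated_average(global_weights, client_updates):
    """Apply the example-weighted mean of client updates to the shared model.

    client_updates: list of (delta, n_examples) pairs, where delta is the
    change a client computed locally. Raw training data never appears here.
    Hypothetical helper for illustration only.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("at least one client must contribute examples")
    averaged = [0.0] * len(global_weights)
    for delta, n in client_updates:
        if len(delta) != len(global_weights):
            raise ValueError("update length does not match the model")
        for i, d in enumerate(delta):
            averaged[i] += d * n / total
    # Only the averaged update is applied; individual updates are discarded.
    return [w + a for w, a in zip(global_weights, averaged)]
```

For example, two clients with equal example counts and updates [1.0, 0.0] and [3.0, 2.0] move a model [1.0, 2.0] to [3.0, 3.0].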

It appears that Yang et al. fail to explicitly teach “the collision checking unit performs collision checking on the transmitted data in a verified trusted execution environment to remove redundant data” and “the data organization unit reorganizes the data after the collision check to form a database of total data, that is, the data of all data sources involved in the calculation.”
However, Sattler teaches “the collision checking unit performs collision checking on the transmitted data in a verified trusted execution environment to remove redundant data” on p. 7:
In the two previous Sections V-A and V-B we have established that sparsified communication can be seamlessly integrated into Federated Learning. We will now look at ways to further improve the efficiency of our method, by eliminating the remaining sources of redundancy in the communication.

Sattler further teaches “the data organization unit reorganizes the data after the collision check to form a database of total data, that is, the data of all data sources involved in the calculation” on p. 7:
As we can see, additional ternarization does only have a very minor effect on the convergence speed and sometimes does even increase the final accuracy of the trained model. It seems evident that a combination of sparsity and quantization makes more efficient use of the communication budged than pure sparsification. We therefore make use of ternarization in the weight update compression of both the clients and the server

(Examiner finds weight update compression is a reorganization). 
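The combination of sparsification and ternarization referred to in the Sattler passage quoted above can be sketched roughly as follows. This is a simplified illustration, not Sattler's implementation: the function name sparse_ternarize and the keep parameter are hypothetical, and the update is assumed to be a flat list of floats.

```python
def sparse_ternarize(update, keep):
    """Keep the `keep` largest-magnitude entries and quantize them to +/-mu.

    mu is the mean magnitude of the kept entries, so every output value is
    one of {-mu, 0, +mu}. Hypothetical helper for illustration only.
    """
    if not 1 <= keep <= len(update):
        raise ValueError("keep must satisfy 1 <= keep <= len(update)")
    # Indices of the entries with the largest absolute value.
    top = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:keep]
    mu = sum(abs(update[i]) for i in top) / keep
    compressed = [0.0] * len(update)
    for i in top:
        if update[i] > 0:
            compressed[i] = mu
        elif update[i] < 0:
            compressed[i] = -mu
    return compressed
```

The compressed update carries only positions, signs, and the single magnitude mu, which is the sense in which the weight update is rearranged into a more compact form.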
Sattler and Yang et al. are analogous art because they are from the same field of endeavor as the claimed invention. It would have been obvious to one skilled in the art before the effective filing date of the invention to modify the system in Yang et al. to include “the collision checking unit performs collision checking on the transmitted data in a verified trusted execution environment to remove redundant data” and “the data organization unit reorganizes the data after the collision check to form a database of total data, that is, the data of all data sources involved in the calculation” as taught by Sattler. The motivation would have been to increase efficiency by eliminating data that does not need to be analyzed, thereby making the analysis faster.
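Under the Examiner's reading of “collision checking” as checking for repeated data (see footnote 1), the redundancy-removal step can be sketched as follows. This is a minimal illustration only: the function name deduplicate is hypothetical, records are assumed to be strings, SHA-256 fingerprints are an assumed choice, and the verified trusted execution environment recited in the claim is not modeled.

```python
import hashlib


def deduplicate(records):
    """Return records with repeats removed, preserving first-seen order.

    Each record is fingerprinted with SHA-256; a record whose fingerprint
    has already been seen is treated as redundant and dropped.
    Hypothetical helper for illustration only.
    """
    seen = set()
    unique = []
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique
```

Comparing fingerprints rather than raw records means the check itself does not need to retain the underlying values once they have been hashed.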
Yang teaches “and the data return unit returns the data corresponding to each data source of the global database to the corresponding data source, that is, the data involved in the calculation of the corresponding data source, to form a local feature database of the data source” on p. 3 (“Upon convergence, a trained checkpoint is used to create and deploy a model to clients for inference”); p. 6 (“We determined that FL had produced a reasonable model (reasonable enough to warrant pushing to devices for live inference experiments). . .”). 
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALBERT M PHILLIPS, III whose telephone number is (571)270-3256. The examiner can normally be reached 10a-6:30pm EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D. Reyes, can be reached on (571)270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALBERT M PHILLIPS, III/Primary Examiner, Art Unit 2159                                                                                                                                                                                                        

1 When “collision checking” is read in light of the specification, Examiner finds this term is more properly characterized as checking for repeated data (i.e., deduplication). See Applicant’s Spec at p. 9 lines 32-35.