DETAILED ACTION
This action is in response to the Applicant Response filed 23 March 2022 for application 16/242,045 filed 08 January 2019.
Claims 1, 10, 16 are currently amended.
Claim 8 is cancelled.
Claims 1-7, 9-20 are pending.
Claims 1-7, 9-20 are rejected.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 23 March 2022 has been entered.
 
Response to Arguments
Applicant's arguments regarding the 35 U.S.C. 103 rejections of claims 1-7, 9-20 have been fully considered but are moot because the arguments do not apply to any of the references being used in the current rejections.

Claim Objections
Claims 1-7, 9-20 are objected to because of the following informalities:
Claim 1, line 15, “all classes data” should read “all classes of data”.
Claim 1, line 21, “and” should be removed from the end of the line.
Claim 1, line 24, “and” should be added to the end of the line.
Claim 10, line 16, “all classes data” should read “all classes of data”.
Claim 16, line 17, “all classes data” should read “all classes of data”.
Claim 16, line 25, “and” should be added to the end of the line.
Claims 2-7, 9, 11-15, and 17-20 are objected to due to their dependence, either directly or indirectly, on claims 1, 10, and 16.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 3-6, 9-10, 12-16, 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto et al. (US 2018/0018590 A1 - Distributed Machine Learning Systems, Apparatus, and Methods, hereinafter referred to as "Szeto") in view of Hardy et al. (MD-GAN: Multi-Discriminator Generative Adversarial Networks for Distributed Datasets, hereinafter referred to as “Hardy”) and further in view of Zhao et al. (Federated Learning with Non-IID Data, hereinafter referred to as “Zhao”).

Regarding claim 1 (Currently amended), Szeto teaches a computer-implemented method for distributed learning (Szeto, ¶0042 – teaches a computer-based distributed learning system), the method comprising: 
determining, by a server, data statistics at each system node of a plurality of system nodes (Szeto, ¶0045 – teaches the global server analyzing proxy information (e.g., proxy data, proxy data distributions, proxy model parameters, or other proxy related data combined with seeds) to determine whether the proxy related information from one private node has the same shape and/or overall properties as the proxy related data from another private node; see also, Szeto, ¶¶0063-0064 – teaches determining metadata from each private server based on the local dataset), wherein each system node respectively comprises an artificial intelligence model (Szeto, ¶0045 – teaches each private data server [system node] has a machine learning model engine and model) ...; 
determining, by the server, a set of control and coordination instructions for training each artificial intelligence model at each system node of the plurality of system nodes (Szeto, ¶0045 – teaches that the non-private global computing device [server] sends programmatic model instructions on how to create the desired local model to each private data server [system node]; see also Szeto, ¶0049 – teaches different instructions to different nodes; Szeto, ¶0068 – teaches instructions can include instructions that define the conditions under which training occurs; Szeto, ¶0018) ...; 
directing, by the server, an exchange of data between the plurality of system nodes based on the data statistics of each system node of the plurality of system nodes (Szeto, ¶¶0080-0081 – teaches, according to model instructions [from global server], transmitting proxy data to a different private data server [node]; see also Szeto, ¶¶0045, 0063-0064) ...; and 
directing, by the server, each of the system nodes to train an artificial intelligence model stored by a respective system node using the local mini-batch created by the respective system node (Szeto, ¶0045 – teaches, using instructions from the non-private computing device [server], that the locally trained models [from system nodes] are trained using local data and transmitted to the non-private computing device [server], where the local models are aggregated into a trained global model; see also Szeto, ¶0068 – teaches various instructions for training local models which can include number of iterations [mini-batches]); 
fusing, by the server, trained artificial intelligence models from the plurality of system nodes into a fused artificial intelligence model (Szeto, ¶0045 – teaches that the locally trained models are transmitted to the non-private computing device [server], where the local models are aggregated into a trained global model), wherein the trained artificial intelligence models are trained using the set of control and coordination instructions (Szeto, ¶0045 – teaches that the non-private global computing device [server] sends programmatic model instructions on how to create the desired local model to each private data server [system node]).
While Szeto teaches generating a global model by combining distributed local models trained on local data, Szeto does not address class imbalance or the use of GANs. Specifically, Szeto does not explicitly teach wherein the data statistics indicate classes of data stored by each system node, and wherein the server is configured to detect a class imbalance based on the data statistics indicating one or more missing classes of data at one or more given system nodes and to determine which missing class of data needs to be exchanged in order for each system node to store each class of data necessary for resolve the class imbalance; the control and coordination instructions including data exchange instructions to facilitate an exchange of data to provide the given system nodes with the missing class of data, and mini-batch instructions instructing each system node to create a local mini-batch that contains a representative sample of all classes data to each system node among the plurality of system nodes; transmitting, by the server, instructions to each system node of the plurality of system nodes to create a generative adversarial network for selected classes of data; ... wherein the exchanged data includes data created by each generative adversarial network.
Hardy teaches
determining, by a server (Hardy, section III – teaches parameter server; Hardy, section IV – teaches central server), data statistics at each system node of a plurality of system nodes (Hardy, section III – teaches each worker node having a local dataset with a given size and probability distribution [data statistics] and each worker node sending training results [data statistics] to parameter server for synchronization; Hardy, section IV – teaches sending error feedback [data statistics] from the workers to the server), wherein each system node respectively comprises an artificial intelligence model (Hardy, section III – teaches each worker node has a generator and discriminator; Hardy, section IV – teaches each worker has a discriminator) ...; 
determining, by the server (Hardy, section III – teaches parameter server; Hardy, section IV – teaches central server), a set of control and coordination instructions for training each artificial intelligence model at each system node of the plurality of system nodes (Hardy, section III – teaches setting up local GANs across various worker nodes trained using local data; Hardy, section IV – teaches setting up local discriminators across various worker nodes trained using local data; see also Hardy, Fig. 1) ...; 
transmitting, by the server, instructions to each system node of the plurality of system nodes to create a generative adversarial network for selected classes of data (Hardy, section III – teaches setting up GANs for distributed computations across various worker nodes for selected classes; Hardy, section IV – teaches setting up discriminators for distributed computations across various worker nodes for selected classes; see also Hardy, Fig. 1; Hardy, section V – teaches experiments and datasets used to train/test models); 
directing, by the server, an exchange of data between the plurality of system nodes based on the data statistics of each system node of the plurality of system nodes, wherein the exchanged data includes data created by each generative adversarial network (Hardy, section IV – teaches various communications between workers and servers, including worker to worker peer-to-peer communications to swap data between worker nodes where each worker node contains a GAN discriminator; see also Hardy, Fig. 1; Hardy, section III – teaches data exchanging based on data statistics of the nodes); and 
directing, by the server, each of the system nodes to train an artificial intelligence model stored by a respective system node using the local mini-batch created by the respective system node (Hardy, section III – teaches performing numerous local iterations of locally stored data [mini-batches] for each round to train the local model; Hardy, section IV – teaches using batches sent by the server and local batches of real data extracted locally from the local data set [mini-batches] to perform several learning iterations on its discriminator); 
fusing, by the server, trained artificial intelligence models from the plurality of system nodes into a fused artificial intelligence model (Hardy, section III – teaches aggregating the parameters from each worker node on the parameter server to create a global model; Hardy, section IV – teaches central server receiving iterative updates from each worker node in order to update the global generator model; see also Hardy, Fig. 1), wherein the trained artificial intelligence models are trained using the set of control and coordination instructions (Hardy, section III – teaches the parameter server using the workers to train the local models, each on the worker’s given data share, and synchronizing the results at each iteration to generate the global model; Hardy, section IV – teaches the central server sends data batches to the workers, which perform local computations and send results to central server for global model updates, and directs peer-to-peer communication between workers).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Szeto with the teachings of Hardy in order to train distributed GANs and to reduce computation time/convergence time of GANs trained on datasets spread across multiple worker nodes in the field of using a server to coordinate the training of a global model using a plurality of system nodes which train a local model using local data (Hardy, Abstract – “A recent technical breakthrough in the domain of machine learning is the discovery and the multiple applications of Generative Adversarial Networks (GANs). Those generative models are computationally demanding, as a GAN is composed of two deep neural networks, and because it trains on large datasets. A GAN is generally trained on a single server. In this paper, we address the problem of distributing GANs so that they are able to train over datasets that are spread on multiple workers. MD-GAN is exposed as the first solution for this problem: we propose a novel learning procedure for GANs so that they fit this distributed setup. We then compare the performance of MD-GAN to an adapted version of Federated Learning to GANs, using the MNIST and CIFAR10 datasets. MD-GAN exhibits a reduction by a factor of two of the learning complexity on each worker node, while providing better performances than federated learning on both datasets.”).
While Szeto in view of Hardy teaches generating a global model by combining distributed local GAN models trained on local data, Szeto in view of Hardy does not address class imbalance. Specifically, Szeto in view of Hardy does not explicitly teach wherein the data statistics indicate classes of data stored by each system node, and wherein the server is configured to detect a class imbalance based on the data statistics indicating one or more missing classes of data at one or more given system nodes and to determine which missing class of data needs to be exchanged in order for each system node to store each class of data necessary for resolve the class imbalance; the control and coordination instructions including data exchange instructions to facilitate an exchange of data to provide the given system nodes with the missing class of data, and mini-batch instructions instructing each system node to create a local mini-batch that contains a representative sample of all classes data to each system node among the plurality of system nodes.
Zhao teaches
determining, by a server, data statistics at each system node of a plurality of system nodes (Zhao, section 3 – teaches client data distribution, population data distribution, and EMD, distance between client distribution and population distribution), wherein each system node respectively comprises an artificial intelligence model (Zhao, section 2 – teaches federated learning where each client has a machine learning model), wherein the data statistics indicate classes of data stored by each system node (Zhao, section 3 – teaches client data distribution [classes stored at node], population data distribution [all classes stored by all nodes], and EMD, distance between client distribution and population distribution), and wherein the server is configured to detect a class imbalance based on the data statistics indicating one or more missing classes of data at one or more given system nodes (Zhao, section 3 – teaches calculating EMD between client distribution and population distribution, which is the root cause of weight divergence and an indication of missing data classes at the given node) and to determine which missing class of data needs to be exchanged in order for each system node to store each class of data necessary for resolve the class imbalance (Zhao, section 3 – teaches knowing each client’s data distribution [individual node classes] and the population distribution [all combined classes]; Zhao, section 4 – teaches generating, on the cloud [server], a small subset having uniform distribution of all classes which is distributed to each of the nodes [Because the server has each client distribution and the population distribution, the server can determine which nodes are missing which classes. Further, creating a small uniform subset resolves imbalance issues.]); 
determining, by the server, a set of control and coordination instructions for training each artificial intelligence model at each system node of the plurality of system nodes (Zhao, section 2 – teaches performing the FedAvg algorithm which is a federated learning algorithm that trains local models on local data then combines results in cloud [server] to create global model), the control and coordination instructions including data exchange instructions to facilitate an exchange of data to provide the given system nodes with the missing class of data (Zhao, section 4 - teaches generating, on the cloud [server], a small subset having uniform distribution of all classes which is distributed to each of the nodes), and mini-batch instructions instructing each system node to create a local mini-batch that contains a representative sample of all classes data to each system node among the plurality of system nodes (Zhao, section 4 - teaches generating, on the cloud [server], a small subset [mini-batch] having uniform distribution of all classes which is distributed to each of the nodes made from a small representative sample from each individual node [local mini-batch]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Szeto in view of Hardy with the teachings of Zhao in order to globally share a representation of an entire dataset which is unbalanced across individual nodes in order to improve accuracy of a global model in the field of using a server to coordinate the training of a global model using a plurality of system nodes which train a local model using local data (Zhao, Abstract – “Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction, while keeping the training data local. This decentralized approach to train models provides privacy, security, regulatory and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to ~55% for neural networks trained for highly skewed non-IID data, where each client device trains only on a single class of data. We further show that this accuracy reduction can be explained by the weight divergence, which can be quantified by the earth mover’s distance (EMD) between the distribution over classes on each device and the population distribution. As a solution, we propose a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices. Experiments show that accuracy can be increased by ~30% for the CIFAR-10 dataset with only 5% globally shared data.”).
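For illustration only, the class-imbalance determination discussed above with respect to Zhao can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Zhao, Szeto, Hardy, or the application: the function names, the dictionary layout of per-node class counts, and the computation of EMD as the sum of absolute differences between the node and population class probabilities are all assumptions made for this example.

```python
from collections import Counter


def observed_classes(node_counts):
    """Return every class stored by at least one node (hypothetical helper)."""
    return sorted({c for counts in node_counts.values()
                   for c, n in counts.items() if n > 0})


def missing_classes(node_counts):
    """Map each node to the classes it lacks relative to all observed classes."""
    classes = observed_classes(node_counts)
    return {node: [c for c in classes if counts.get(c, 0) == 0]
            for node, counts in node_counts.items()}


def class_distribution(counts, classes):
    """Normalize raw class counts into probabilities over `classes`."""
    total = sum(counts.get(c, 0) for c in classes)
    return {c: (counts.get(c, 0) / total if total else 0.0) for c in classes}


def emd_to_population(node_counts):
    """Distance between each node's class distribution and the population's.

    Assumption: the distance is the sum of absolute probability differences.
    """
    classes = observed_classes(node_counts)
    population = Counter()
    for counts in node_counts.values():
        population.update(counts)  # pool the counts of all nodes
    pop = class_distribution(population, classes)
    result = {}
    for node, counts in node_counts.items():
        local = class_distribution(counts, classes)
        result[node] = sum(abs(local[c] - pop[c]) for c in classes)
    return result


if __name__ == "__main__":
    # Hypothetical statistics: each node stores only one class.
    stats = {"node_a": {"cat": 10}, "node_b": {"dog": 10}}
    print(missing_classes(stats))
    print(emd_to_population(stats))
```

In this sketch a server holding only the per-node class counts can list which classes each node lacks, which corresponds to the bracketed reasoning above that a server having each client distribution and the population distribution can determine which nodes are missing which classes.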

Regarding claim 3 (Previously Presented), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above. Szeto further teaches wherein the set of control and coordination instructions includes a direction for an exchange of training data between system nodes of the plurality of system nodes (Szeto, ¶¶0080-0081 – teaches, according to model instructions [from global server], transmitting proxy data to a different private data server [node]; see also Szeto, ¶¶0045, 0063-0064 [Transferring data from one node to a different node based on model instructions provides direction (first to second) of the data exchange]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.

Regarding claim 4 (Previously Presented), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above. Szeto further teaches wherein each artificial intelligence model at each system node of the plurality of system nodes includes a neural network, a decision tree, a rule set, a support vector machine, a Gaussian mixture model, or a clustering model (Szeto, ¶0069 – teaches the machine learning model can be at least one of a neural network algorithm, a decision tree algorithm [decision tree/rule based], a clustering algorithm, a support vector machine, expectation maximization [GMM]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.

Regarding claim 5 (Original), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above. Szeto further teaches wherein an artificial intelligence model at a first system node of the plurality of system nodes is different than an artificial intelligence model at a second system node of the plurality of system nodes (Szeto, ¶0049 – teaches that a first private server [first node] may receive a first set of instructions to create a first model and a second private server [second node] may receive a second set of instructions to create a second [different] model).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.

Regarding claim 6 (Previously Presented), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above.
Szeto further teaches 
directing all system nodes of the plurality of system nodes to respectively use a same artificial intelligence model (Szeto, ¶0049 – teaches that the instructions to the private servers [nodes] are the same to create the same model); 
averaging artificial intelligence parameters, received from the plurality of system nodes, derived by training with the mini-batches and localized data to create the fused artificial intelligence model (Szeto, ¶0045 – teaches that the locally trained models [from system nodes] are trained using local data and transmitted to the non-private computing device [server], where the local models are aggregated into a trained global model; see also Szeto, ¶0068 – teaches various instructions for training local models which can include number of iterations [mini-batches]); and 
directing each system node of the plurality of system nodes to use a same set of data statistics for training (Szeto, ¶0049 – teaches all private servers [nodes] receiving the same instructions; Szeto, ¶0062 – teaches the instructions including data selection criteria; see also Szeto, ¶¶0063-0064 [Providing the same data selection criteria to all nodes is interpreted as using the same data statistics for each node]).
Hardy further teaches averaging artificial intelligence parameters, received from the plurality of system nodes, derived by training with the mini-batches and localized data to create the fused artificial intelligence model (Hardy, section III – teaches performing, on each worker node, numerous local iterations of locally stored data [mini-batches] for each round to train the local model and the parameter server averaging the results from each worker node to generate a global model; Hardy, section IV – teaches using, by each worker node, batches sent by the server and local batches of real data extracted locally from the local data set [mini-batches] to perform several learning iterations on its discriminator, where the results are sent back to the central server for averaging to generate a global generator).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.
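For illustration only, the averaging of parameters received from the system nodes, as discussed above for claim 6, can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Szeto, Hardy, Zhao, or the application: the function name, the representation of each node's parameters as a flat list of floats, and the optional weighting by local dataset size are assumptions made for this example.

```python
def average_parameters(node_params, node_sizes=None):
    """Fuse per-node parameter vectors into one vector by (weighted) averaging.

    node_params: list of equal-length lists of floats, one list per node.
    node_sizes:  optional list of non-negative weights, for example the
                 local dataset sizes; defaults to equal weights (assumption).
    """
    if not node_params:
        raise ValueError("at least one node is required")
    length = len(node_params[0])
    if any(len(p) != length for p in node_params):
        raise ValueError("all parameter vectors must have the same length")
    if node_sizes is None:
        node_sizes = [1] * len(node_params)
    if len(node_sizes) != len(node_params):
        raise ValueError("one weight per node is required")
    total = sum(node_sizes)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean of each parameter position across all nodes.
    return [
        sum(w * p[i] for w, p in zip(node_sizes, node_params)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    # Two hypothetical nodes, each reporting two trained parameters.
    print(average_parameters([[1.0, 2.0], [3.0, 4.0]]))
    print(average_parameters([[1.0, 2.0], [3.0, 4.0]], node_sizes=[1, 3]))
```

The equal-weight default corresponds to a plain average; weighting by dataset size is shown only as one possible variant and is not attributed to any of the cited references.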

Regarding claim 9 (Original), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above. Szeto further teaches wherein exchanging data comprises: 
generating a class of data through a clustering algorithm at a first system node (Szeto, ¶0085 – teaches generating proxy [synthetic] data using correlations among various attributes [creating a class] using clustering algorithms); and 
transferring the generated data to a second system node (Szeto, ¶0051 – teaches transferring proxy [synthetic] data from one private server [node] to a different private server [node]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.

Regarding claim 10 (Currently Amended), it is the system embodiment of claim 1 with similar limitations to claim 1 and is rejected using the same reasoning found in claim 1. Szeto further teaches the following additional limitations:
a processor communicatively coupled to a memory (Szeto, ¶0027 – teaches the computing devices have processors coupled to memory)...
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.

Regarding claim 12 (Previously Presented), the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 3.

Regarding claim 13 (Previously Presented), the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 6.

Regarding claim 14 (Previously Presented), the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 4.

Regarding claim 15 (Original), the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 5.

Regarding claim 16 (Currently Amended), it is the computer program product embodiment of claim 1 with similar limitations to claim 1 and is rejected using the same reasoning found in claim 1. Szeto further teaches the following additional limitations:
a computer program product for distributed learning, the computer product comprising a computer readable storage medium having program instructions embodied therewith, the instructions executable by a processor (Szeto, ¶0027 – teaches processor coupled to memory where processor executes instructions stored on memory)...
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Szeto, Hardy and Zhao for the same reasons as disclosed in claim 1 above.

Regarding claim 18 (Previously Presented), the rejection of claim 16 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 3.

Regarding claim 19 (Previously Presented), the rejection of claim 16 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 4.

Regarding claim 20 (Original), the rejection of claim 16 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy and further in view of Zhao for the reasons set forth in the rejection of claim 5.

Claims 2, 7, 11, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Szeto in view of Hardy, further in view of Zhao and further in view of Chen, Eric (US 2018/0314981 A1 – Data Sovereignty Compliant Machine Learning, hereinafter referred to as “Chen”).

Regarding claim 2 (Original), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above. However, Szeto in view of Hardy and further in view of Zhao does not explicitly teach wherein the set of control and coordination instructions includes a permutation of order that the artificial intelligence models are trained.
Chen teaches wherein the set of control and coordination instructions includes a permutation of order that the artificial intelligence models are trained (Chen, ¶¶0041-0044 – teaches using a coordination server to direct training of a machine learning model by country A [first node], then country B [next node], then country C [last node]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Szeto in view of Hardy and further in view of Zhao with the teachings of Chen in order to develop a global model while preserving data security of the data stored at each system node in the field of using a server to coordinate the training of a global model using a plurality of system nodes which train a local model using local data (Chen, ¶0016 – “Aspects of the subject disclosure describe solutions for implementing incremental machine learning techniques between sovereign regions for which data export is restricted. As discussed in further detail below, data sovereignty restrictions can restrict the export of certain types of data, such as different types of user data or personal information, that are useful for initializing and training various machine learning models. Using incremental machine learning methods, a given machine learning model can be trained and updated using only data from users residing in the same (sovereign) region. Once trained, the machine learning model can be exported for use in a different sovereign region, without the need to violate export controls by transferring any actual training data. Consequently, the machine learning model can be used in additional sovereign regions, and subsequently updated/trained with data that may also be export restricted, without violating export controls for user data of any sovereign region.”).

Regarding claim 7 (Original), Szeto in view of Hardy and further in view of Zhao teaches all of the limitations of the method of claim 1 as noted above. However, Szeto in view of Hardy and further in view of Zhao does not explicitly teach creating an ensemble of artificial intelligence models to create the fused artificial intelligence model.
Chen teaches creating an ensemble of artificial intelligence models to create the fused artificial intelligence model (Chen, ¶¶0041-0044 – teaches using a coordination server to direct training of a machine learning model by country A [first node], then country B [next node], then country C [last node]; [Training the model using sequential models creates an ensemble]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Szeto in view of Hardy and further in view of Zhao with the teachings of Chen in order to develop a global model while preserving data security of the data stored at each system node in the field of using a server to coordinate the training of a global model using a plurality of system nodes which train a local model using local data (Chen, ¶0016 – “Aspects of the subject disclosure describe solutions for implementing incremental machine learning techniques between sovereign regions for which data export is restricted. As discussed in further detail below, data sovereignty restrictions can restrict the export of certain types of data, such as different types of user data or personal information, that are useful for initializing and training various machine learning models. Using incremental machine learning methods, a given machine learning model can be trained and updated using only data from users residing in the same (sovereign) region. Once trained, the machine learning model can be exported for use in a different sovereign region, without the need to violate export controls by transferring any actual training data. Consequently, the machine learning model can be used in additional sovereign regions, and subsequently updated/trained with data that may also be export restricted, without violating export controls for user data of any sovereign region.”).

Regarding claim 11 (Original), the rejection of claim 10 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy, further in view of Zhao and further in view of Chen for the reasons set forth in the rejection of claim 2.

Regarding claim 17 (Original), the rejection of claim 16 is incorporated herein. Further, the limitations in this claim are taught by Szeto in view of Hardy, further in view of Zhao and further in view of Chen for the reasons set forth in the rejection of claim 2.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Jeong et al. (Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data) teaches federated learning with federated augmentation, where training data for missing labels is generated locally on a given node using a GAN.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARSHALL WERNER whose telephone number is (469) 295-9143. The examiner can normally be reached Monday – Thursday, 7:30 AM – 4:30 PM ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/MARSHALL L WERNER/               Examiner, Art Unit 2125