DETAILED ACTION
This Final Office Action is responsive to Applicant’s Amendment filed on 07/29/2021 in which claims 1-6, 12-13, and 15-16 are amended.
Claims 1-20 are currently pending and under examination, of which claims 1, 6, and 16 are independent claims. No claims are currently in condition for allowance.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments 07/14/2021 [P.19 of 21] with regard to claim interpretation have been considered and are not persuasive. Applicant alleges that claim 1 invokes 35 U.S.C. 112(f) by using functional language such as “step for asynchronously training”. However, examiner respectfully disagrees. Particularly, the limitation fails prong three of the analysis for functional language. Prong three requires that (C) the term ‘means’ or ‘step’ or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. As set forth by MPEP 2181(I)(A): In general terms, the ‘underlying function’ of a method claim element corresponds to what the element ultimately accomplishes in relationship to what the other elements of the claim and the claim as a whole accomplish. ‘Acts,’ on the other hand, correspond to how the function is accomplished.

	In this case, the recited function is “for asynchronously training the global machine learning model”. Acts modifying the function are “by iteratively sending the global parameters to the plurality of client devices, receiving modified parameter indicators, and updating the global parameters”.
	The specification discloses the training function as including acts of sending parameters to client devices, receiving from them modified parameter indicators and updating the global parameters. While the specification may go into more detail, the limitation recites the acts for performing the recited function of asynchronous training, see [0055-56], [0059]. Additionally [0166] describes step for using 
Applicant’s arguments 07/14/2021 [P.13 of 21] with regard to anticipatory reference Guo have been fully considered and are not persuasive. Applicant alleges that Guo does not qualify as prior art based on the date of publication. However, examiner respectfully disagrees. Inasmuch as the provided Google Scholar evidence explicitly states “Publication date: 2018/5/30”, such evidence provides a clear basis for the reference being qualifying art prior to the EFD of 2018/6/19. Applicant points to Ex parte Zhang as the standard for viewing the issue. However, Zhang is generally directed to re-examination over translations of foreign-language web documents, where it was noted that no evidence corroborated the date and no additional information regarding the dates was provided. By contrast, Guo is evidenced by eight (8) citations and two (2) additional references disclosing the qualifying date. As such, Zhang is less applicable as a standard than the guidance of MPEP 2128.01(IV): “Publicly displayed references can constitute a ‘printed publication’ even if the references are not disseminated by copies or indexed in a library or database… case law directs us to consider the nature of the conference or meeting; whether there are any restrictions on public disclosure of the information; expectations of confidentiality”. Confidentiality is generally not expected when a conference proceeding takes place in the Grand Ballroom 2 of Marriott Beijing as opposed to a closed corporate event. Regardless, support for the qualifying date is found a total of ten (10) times, with an explicit recitation “Publication date: 2018/05/30” providing clear evidence. Accordingly, the reference Guo is held to be qualifying art.

    [Image: media_image1.png]

Applicant’s arguments 07/14/2021 [P.13 of 21] with regard to claim 12 have been fully considered. Particularly, the limitations specifically addressed represent amendments to the claim in its present status. Updated search and consideration is given to present claim status and new art is cited in meeting the claims as presently amended.
The rejection of claims 1-20 over Double Patenting is hereby withdrawn in view of amendments to the present and copending application 16/040,057.
The rejections of claims 1-20 under 35 U.S.C. 112(b) are hereby withdrawn, as necessitated by applicant’s amendments made to the claims with further clarifying remarks 07/29/2021 [P.13 of 21].

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.



Claims 1, 3, 5-6 and 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by: 
Miao et al., US PG Pub No 20170091651A1, hereinafter Miao.
With respect to claim 1, Miao teaches: 
A computer-implemented method for asynchronously training machine-learning models across client devices utilizing client data while preserving client data privacy {Miao [0030] “asynchronous distributed machine learning performed by the system of Fig 1” comprises [0034-36] “asynchronous ADMM”, which is a technique for model training; [0022] further details training between server and clients, including phones or other devices, and [0026] describes the models as personalized (replete throughout)}, comprising: 
generating global parameters for a global machine learning model stored at a server device; {Miao [0034-38] “generate global versions of statistical model… In asynchronous ADMM, server 102 may use the following update rules” rules being equations detailing parameters comprised by the model, e.g. [0039-42] “updated variables… optimization parameter ρ… regularization parameter γ”. See Figs 1-2, 5}
identifying a plurality of client devices comprising local machine learning models that represent local versions of the global machine learning model; and {Miao further details [0025-26] “local versions of statistical model 108 are produced on the clients to personalize statistical model 108 to users of the clients. More specifically, the clients may obtain a global version of statistical model 108 from server” wherein [0068] “the server may merge a number of local versions of the statistical model”, [0092]. Further, identifying may be described with respect to personalization, which employs identifiers for version and update, see [0036], Figs 1-2}
performing a step for asynchronously training the global machine learning model by iteratively sending the global parameters to the plurality of client devices, receiving modified parameter indicators, and updating the global parameters. {Miao [0025] “statistical model 108 may be iteratively trained through bidirectional transmission of data from server 102 to the clients” such that [0036] “During asynchronous ADMM, server 102 may track merging of the updates 114-116 into statistical model 108 using version identifiers” wherein identifiers correspond to indicators in comparing Miao Fig 1 to the instant specification Fig 3B. Further, claimed send/receive is bidirectional/alternating communication for model (with corresponding parameter) updates [0025], Fig 5-506 “Transmit update containing difference between local version and first global version”. See [0034-42], Fig 2}
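For illustration only, the three recited acts (sending global parameters, receiving modified parameter indicators, updating the global parameters) can be sketched as a toy loop. This is a minimal sketch under stated assumptions, not Miao's ADMM update rules and not the applicant's implementation: the function names, the one-parameter mean model, the learning rate, and the modeling of asynchrony as partial participation per round are all hypothetical.

```python
# Illustrative sketch only; all names and the toy model are hypothetical.
# Clients are simulated in-process, and their training data is never passed to the server loop.

def client_update(global_params, local_data, lr=0.1):
    """Train locally and return an indicator: (locally modified parameter - global parameter)."""
    local = list(global_params)
    for x in local_data:
        # One gradient step on the squared error (w - x)^2 / 2 for each local sample.
        local[0] -= lr * (local[0] - x)
    return [l - g for l, g in zip(local, global_params)]


def train(global_params, clients, rounds=50):
    """Iteratively send parameters, receive indicators from a subset, and update."""
    for r in range(rounds):
        # Assumption: only some clients respond in a given round.
        responding = [data for i, data in enumerate(clients) if (i + r) % 2 == 0]
        if not responding:
            continue
        # "Send" the global parameters and "receive" the differentials.
        deltas = [client_update(global_params, data) for data in responding]
        # Update the global parameters with the plain average of the differentials.
        avg = [sum(d[k] for d in deltas) / len(deltas) for k in range(len(global_params))]
        global_params = [g + a for g, a in zip(global_params, avg)]
    return global_params


if __name__ == "__main__":
    print(train([0.0], [[1.0, 1.0], [3.0, 3.0]]))
```

In this sketch the server only ever sees the returned differentials, which is the sense in which the indicators stand in for the client data.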

With respect to claim 3, Miao teaches the computer-implemented method of claim 1, wherein:
	generating the global parameters for the global machine learning model comprises generating weights for a global regression model stored at the server device; and identifying the plurality of client devices comprises identifying client devices comprising local regression models associated with the global regression model. {Miao [0021] “statistical model 108 may be a regression model” where [0059] “weight may be provided as additional training” so as to [0070] “use a set of weights associated with the first subset of clients to merge the first set of updates into second global version, so that some updates contribute more to the second global version than other updates”. Further, Miao is replete with model versioning identifiers over a subset of clients, Figs 1, 4}

With respect to claim 5, Miao teaches the computer-implemented method of claim 1, wherein
	the modified parameter indicators comprise parameter update differentials that each represent a difference between a locally modified parameter generated by a client device and a global parameter generated by the server device. {Miao Fig 5:506 “update containing difference between local version and first global version” version being of model per [0040]. [0076] “update containing a difference… update based on the equations discussed above with respect to asynchronous ADMM” as in [0034-42]}

With respect to claim 6, the rejection of claim 1 is incorporated. Miao teaches:  
	A system for asynchronously training machine learning models across client devices while preserving client data privacy {Miao [0020] “method, apparatus, and system for performing asynchronous distributed machine learning. As shown in Fig 1, a system for performing the asynchronous distributed machine learning” comprises [0034-36] “asynchronous ADMM”, which is a technique for model training; [0022] further details training between server and clients, including phones or other devices, and [0026] describes the models as personalized (replete throughout)} comprising: 
at least one processor; and {Miao [0086] “Fig. 7 shows a computer system… includes a processor 702, memory 704” etc.}
at least one non-transitory computer memory comprising a global machine learning model and instructions that, when executed by at least one processor {Miao [0018] “computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods” comprising global machine learning model per Abstract, Fig 5}, cause the system to: 
send global parameters for the global machine learning model to a plurality of client devices, the plurality of client devices comprising local machine learning models that represent local versions of the global machine learning model; {Miao Fig 2 illustrates bidirectional transmission of updates for global/local model over a plurality of local clients where clients execute on a variety of devices per [0046]. Model updates comprise parameters noted at [0038-42] and throughout the update equations e.g., [0034-38]. Versioning of local/global is replete as is noted per Title, Fig 1, [0025-26]}
receive modified parameter indicators from a subset of client devices of the plurality of client devices, wherein the subset of client devices utilize the local machine learning models, the global parameters, and client training data on the subset of client devices to generate the modified parameter indicators; {Miao Fig 2 again noting bidirectional communication. Subsets are detailed [0022] “clients may generate updates 114-116 to statistical model 108 in a distributed fashion on different subsets of training data” e.g., [0088-89] “obtain, from a first subset of the client computer systems, a first set of updates to the first global version. The server may then merge the first set of updates into a second global version of the statistical model. Finally, the server may transmit the second global version to the client computer systems asynchronously… transmit an update containing a difference between the local version and the first global version”}
in response to receiving the modified parameter indicators from the subset of client devices, generate adjusted global parameters for the global machine learning model based on the modified parameter indicators; and {Miao [0083, 85] “At the end of the first session, the client transmits an update containing a difference between the first personalized version and the first global version to the server (operation 610). The update may then be used by the server to produce a second global version of the statistical model… feedback at the client may thus be used to adapt the statistical model to the user’s behavior and generate subsequent global versions of the statistical model” model comprising its parameters and version identifiers over model-training sessions, [0034-38, 77]. Figs 1-2}
send the adjusted global parameters for the global machine learning model to the plurality of client devices for implementation in the local machine learning models at the plurality of client devices. {Miao Fig 4-414/416 “merge second set of updates into third global version of statistical model… Transmit third global version to set of clients” i.e., [0036-37] “iteratively generate new global versions of statistical model… Once the global version is generated, the global version may be broadcast to the clients” such that [0026] “the local version of the statistical model 108 on the client may be adapted…aggregated training” as intuitively illustrated per Fig 2}

Claim 15 is rejected for the same rationale as claim 3.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Miao in view of 
Huo et al., US PG Pub No 20170147920A1, hereinafter Huo.
With respect to claim 2, Miao teaches the computer-implemented method of claim 1, wherein:
generating the global parameters for the global machine learning model comprises generating global-neural-network parameters for layers of a global neural network stored at the server device; and identifying the plurality of client devices comprises identifying client devices comprising local neural networks associated with the global neural network. {Miao discloses that the statistical model may comprise “neural network”, see [0021]. Such model class carries its respective parameters per [0038-42]. Further, Miao is replete with model versioning identifiers over a subset of clients, Figs 1, 4}
	However, Miao does not disclose “layers of a global neural network”. 
Huo teaches layers of neural network, [0016] “ADMM algorithm 112 to train a classifier in the form of deep neural networks… include an input layer 116(1) and an output layer 116(N), as well as multiple hidden layers… nine layers” for image classification, illustrated Fig 1:114. See also [0028].
Huo, with title including ADMM, is directed to machine learning model optimization over global and local models thus being analogous. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to detail the neural network of Miao as layered according to Huo for the motivation of performing image classification as multiple hidden layers are employed to label images (Huo [0019], [0031]). 

Claims 4, 7, 9-10, 16-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Miao in view of 
Kim et al., “Federated Tensor Factorization for Computational Phenotyping”, hereinafter Kim.
With respect to claim 4, Miao teaches the computer-implemented method of claim 1, wherein
	the modified parameter indicators are generated by a subset of client devices of the plurality of client devices and {Miao [0088-89] “obtain, from a first subset of the client computer systems, a first set of updates… update containing a difference between the local version and the first global version”}
Miao further discloses transmitting only a “portion” of versions, and merging subsequent updates “without” preceding updates [0072-73].
However, Miao does not disclose that the modified parameter indicators “do not disclose client training data”. 
Kim teaches: 
the modified parameter indicators do not disclose client training data from the subset of client devices. {Kim discloses Federated learning based on ADMM for privacy, [Abstract]. Element-wise computation is employed in decomposing patient/hospital data for privacy over distributed learning framework, [P.891 Sect4.5 ¶1], [P.889 ¶1] Eq (2). Fig 1 illustrates bidirectional communication between server and client (hospital) which sends only select parameters such as Ak(n),Hk(n). Local and global parameters/variables are modeled by optimization detailed per [P.890-91 Sect4.3] noting “Update the local factor matrices… Update the global factor matrices… adjust the gap between local and global factor matrices”}
Kim is directed to machine learning model optimization over local and global models thus being analogous. Both Kim and Miao describe modified ADMM techniques. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to combine the federated ADMM of Kim with the asynchronous ADMM of Miao for the motivation of preserving privacy in a distributed learning environment (Kim [P.889 ¶3-5]).

Claim 7 is rejected for the same rationale as claim 4.

With respect to claim 9, Miao teaches the system of claim 6. Kim teaches further comprising instructions that, when executed by the at least one processor, cause the system to, in response to receiving the modified parameter indicators from the subset of client devices, generate the adjusted global parameters for the global machine learning model by: 
identifying a client device, from among the plurality of client devices, that has not sent a set of modified parameter indicators to the system in a threshold number of training iterations; sending a request for modified parameter indicators to the client device; and after receiving a requested set of modified parameter indicators from the client device, generating the adjusted global parameters for the global machine learning model. {Kim [P.890 Sect4.3] details “iteratively updated” among local/client and global server which is repeated until convergence before “maximum iteration” wherein maximum is threshold for iteration. [P.891] Algorithm 1 describes the communication of model parameters optimized over the Federated ADMM}. One having ordinary skill in the art would have considered it obvious to utilize the max iteration of Kim to threshold the iterations of Miao [0036] in order to aggregate local updates without sharing intermediate results (Kim [P.890 Sect4.3]) and/or “iterative updating rules for optimization” (Kim [P.889 Sect.4]).

With respect to claim 10, the combination of Miao and Kim teaches the system of claim 9, further comprising instructions that, when executed by the at least one processor, cause the system to identify the client device from among the plurality of client devices by: 
identifying a parameter-update-iteration indicator for the client device indicating a number of training iterations since the client device has sent a set of modified parameter indicators; and determining that the parameter-update-iteration indicator for the client device satisfies the threshold number of training iterations. {Kim [P.891 Sect4.4] “iteration t” wherein t is iteration indicator and satisfying threshold is with respect to bounds or convergence criteria, e.g., [P.893 ¶1] “run the models until it converges before maximum iteration 100” as illustrated Figs 4c-4d}
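For illustration only, the claimed parameter-update-iteration indicator of claims 9-10 can be read as a per-client count of training iterations since that client last sent indicators. The sketch below is a hypothetical reading of the claim language; the function name, the dictionary bookkeeping, and the `>=` comparison are assumptions and are not drawn from Kim, Miao, or the specification.

```python
# Illustrative sketch only; names and bookkeeping are hypothetical.

def stale_clients(last_update_round, current_round, threshold):
    """Return ids of clients whose iterations-since-last-update meets the threshold.

    last_update_round maps a client id to the round in which it last sent
    modified parameter indicators.
    """
    stale = []
    for client_id, last in sorted(last_update_round.items()):
        iterations_since = current_round - last  # the per-client indicator
        if iterations_since >= threshold:
            stale.append(client_id)  # candidate for a request for indicators
    return stale


if __name__ == "__main__":
    print(stale_clients({"a": 9, "b": 4, "c": 7}, current_round=10, threshold=3))
```

A server following this reading would send a request to each returned client before generating the adjusted global parameters.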

With respect to claim 16, Miao teaches: 
A non-transitory computer readable storage medium comprising instructions that, when executed by at least one processor {Miao [0017-19] “computer-readable storage medium… executes the code… processor that executes a particular software”, [0086-87] describing computer system Fig 7}, cause a client device to: 
receive, at the client device from a server device, global parameters corresponding to a global machine learning model at the server device, wherein a local machine learning model at the client device represents a local version of the global machine learning model; {Miao Fig 2 illustrating client and server distributed updating for local/global model as [0025-26] “local versions of statistical model… global version of statistical model” [0068, 92]. Client devices [0022]. Parameters corresponding to the model are detailed at least [0038-42]}
utilize the local machine learning model, the global parameters, and client training data at the client device to generate locally modified parameters; {Miao [0075] “The client may produce the local version from the first global version by providing the user feedback as training data to the first global version and using the regularization parameter  to adjust the amount by which the local version is affected by the training data” wherein the versions are of the model with its parameters i.e., [0025] “local versions of statistical model 108 are produced on the clients to personalize statistical model”, [0038-42], Fig 2}
receive adjusted global parameters corresponding to the global machine learning model from the server device to implement in the local machine learning model at the client device. {Miao Fig 4-414/416 “merge second set of updates into third global version of statistical model… Transmit third global version to set of clients” i.e., [0036-37] “iteratively generate new global versions of statistical model… Once the global version is generated, the global version may be broadcast to the clients” such that [0026] “the local version of the statistical model 108 on the client may be adapted…aggregated training” as intuitively illustrated per Fig 2}
	Miao further discloses transmitting only a “portion” of versions, and merging subsequent updates “without” preceding updates [0072-73].
	However, Miao does not fairly disclose “without providing the client training data to the server”, as is particularly understood in light of the privacy context of the application.
	Kim teaches:
provide modified parameter indicators corresponding to the locally modified parameters to the server device, without providing the client training data to the server device, for the server device to utilize the modified parameter indicators in adjusting the global parameters; and {Kim discloses Federated learning based on ADMM for privacy, [Abstract]. Element-wise computation is employed in decomposing patient/hospital data for privacy over distributed learning framework, [P.891 Sect4.5 ¶1], [P.889 ¶1] Eq (2). Fig 1 illustrates bidirectional communication between server and client (hospital) which sends only select parameters such as Ak(n),Hk(n). Local and global parameters/variables are modeled by optimization detailed per [P.890-91 Sect4.3] noting “Update the local factor matrices… Update the global factor matrices… adjust the gap between local and global factor matrices”}
	Kim is directed to machine learning model optimization over local and global models thus being analogous. Both Kim and Miao describe modified ADMM techniques. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to combine the Federated ADMM of Kim with the Asynchronous ADMM of Miao for the motivation of preserving privacy in a distributed learning environment (Kim [P.889 ¶3-5]).

With respect to claim 17, the combination of Miao and Kim teaches the non-transitory computer readable storage medium of claim 16, further comprising instructions that, when executed by the at least one processor, cause the client device to provide the modified parameter indicators to the server device by: 
determining a first difference between a first locally modified parameter and a first global parameter and a second difference between a second locally modified parameter and a second global parameter; and generating a first parameter update differential representing the first difference and a second parameter update differential representing the second difference. {Miao Fig 2 illustrates versioning of global/local progressing to second, third, etc.; [0073] “merge the second set of updates into a third version of the statistical model” or [0053] “fourth global version”, [0036] “iteratively generate new global versions of statistical model 108 using the following, where K is the version number (e.g., version identifier) of the latest global version of statistical model 108 and Δj is an update”}

With respect to claim 20, the combination of Miao and Kim teaches the non-transitory computer readable storage medium of claim 16, further comprising instructions that, when executed by the at least one processor, cause the client device to, in response to receiving the adjusted global parameters corresponding to the global machine learning model from the server device: 
identify additional client training data at the client device; and provide additional modified parameter indicators to the server device based on the additional client training data at the client device. {Miao [0059] “additional training data… corresponding user feedback 314 before the value is inputted as training data to personalized version” personalized version is the client/local Fig 2}

Claims 8 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Miao and Kim in view of  
Zhao et al., “Federated Learning with Non-IID Data”, hereinafter Zhao.
With respect to claim 8, the combination of Miao and Kim teaches the system of claim 6, further comprising instructions that, when executed by the at least one processor, cause the system to, in response to receiving the modified parameter indicators from the subset of client devices, generate the adjusted global parameters for the global machine learning model by: 
determining that the subset of client devices includes a threshold number of client devices from among the plurality of client devices that have generated the modified parameter indicators; and in response to determining that the subset of client devices includes the threshold number of client devices, generating the adjusted global parameters for the global machine learning model. {Kim discloses superscript n ≥ 2 corresponding to the threshold being operand ≥, see [P.889 Eq.3] and/or [P.890 Sect4.3.2] detailing local/client update. This superscript subsequently is applied in determining “gap between local and global” per [P.891 Eq.13] wherein the gap is adjustment} 
However, further evidence is offered to substantiate that such superscript element may correspond to number of clients. Zhao teaches [P.2 Sect2.1] “K is the number of clients included at each synchronization of FedAvg” which is denoted as superscript [P.4-5 Sect3.1] beginning ¶ “In federated learning…” noting Eq.2. Accordingly, the superscript of Kim reasonably describes a number of clients as is evidenced by Zhao. This provides the benefit of evaluating a distribution of clients (Zhao [Sect3.1]).
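For illustration only, the limitation of claim 8 (generating adjusted global parameters once a threshold number of client devices has reported) can be sketched as a simple gate on aggregation. This is a hypothetical reading of the claim language; the function name, the dictionary of received differentials, and the unweighted average are assumptions and do not reproduce the equations of Kim or Zhao.

```python
# Illustrative sketch only; names and the unweighted average are hypothetical.

def maybe_aggregate(global_params, received, threshold):
    """Aggregate only if at least `threshold` clients have sent differentials.

    received maps a client id to that client's parameter update differential.
    Returns (parameters, updated_flag).
    """
    if len(received) < threshold:
        return global_params, False  # too few clients: keep the current parameters
    n = len(received)
    avg = [sum(d[k] for d in received.values()) / n for k in range(len(global_params))]
    return [g + a for g, a in zip(global_params, avg)], True


if __name__ == "__main__":
    print(maybe_aggregate([0.0], {"a": [1.0]}, threshold=2))
    print(maybe_aggregate([0.0], {"a": [1.0], "b": [3.0]}, threshold=2))
```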

With respect to claim 13, the combination of Miao and Kim teaches the system of claim 6, further comprising instructions that, when executed by the at least one processor, cause the system to: 
	receive the modified parameter indicators from the subset of client devices by receiving parameter update differentials that each represent a difference between a locally modified parameter generated by a client device of the subset of client devices and a global parameter generated by a server device; and {Miao Fig 2 illustrates, for example, global version 2 receiving update from the subset of clients A and D. Updates are difference between local and global. [0069, 72] “updates to the first global version from a first subset of clients. Each update may include a difference between a local version of the statistical model on the corresponding client and the first global version… updates to one or both global versions from another subset of clients”}
generate the adjusted global parameters for the global machine learning model by: determining weighted averages for the parameter update differentials; and generating the adjusted global parameters based on the weighted averages for the parameter update differentials. {Miao [0049] “averaging the contribution of each update” e.g., [0070] “generate the second global version as an average of the first set of updates” where “weights associated with the first subset of clients to merge the first set of updates into the second global version, so that some updates contribute more to the second global version than other updates” suggests weighted average}
	However, Miao does not expressly disclose “weighted average”.
	Zhao teaches weighted average FedAvg for weight divergence of federated learning per Eq.1 at [P.3 Sect.3]. See also [P.5-6 Sect3.2.1], Figs 5-7. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to use the federated average of Zhao for the average of Miao because it “indicates that the accuracy of FedAvg may be affected by the exact data distribution… quantifies the difference of weights from two different training processes with the same weight initialization” (Zhao [P.3 Sect.3]).

With respect to claim 14, the combination of Miao, Kim and Zhao teaches the system of claim 13, further comprising instructions that, when executed by the at least one processor, cause the system to: 
	receive, from a client device of the subset of client devices, a number of training samples from a client-training dataset corresponding to the client device; and determine a weight for a parameter update differential based on the number of training samples; and determine the weighted averages for the parameter update differentials based on the weight for the parameter update differential. {Zhao discloses a distribution of clients with corresponding samples for model training through federated learning, see [P.5 Prop3.1]. Weighted averages are determined per [P.3] Eq.1} The motivation for combination is the same as that of claim 13, applied mutatis mutandis.
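For illustration only, the weighted averaging recited in claims 13-14 can be sketched with each differential weighted by its client's share of the total training samples. This is a generic sample-count weighting written for this illustration; it is an assumption that it resembles the FedAvg weighting cited from Zhao, and the function and parameter names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical and the weighting is a generic
# sample-count weighting, not a transcription of any cited equation.

def weighted_average_update(global_params, differentials, sample_counts):
    """Adjust global parameters by the sample-weighted average of client differentials."""
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("sample_counts must sum to a positive number")
    weights = [n / total for n in sample_counts]  # one weight per client
    avg = [
        sum(w * d[k] for w, d in zip(weights, differentials))
        for k in range(len(global_params))
    ]
    return [g + a for g, a in zip(global_params, avg)]


if __name__ == "__main__":
    # Two clients with 30 and 10 samples: weights 0.75 and 0.25.
    print(weighted_average_update([1.0, 2.0], [[1.0, 0.0], [-1.0, 4.0]], [30, 10]))
```

Only the sample count and the differential leave each client in this sketch; the training samples themselves are not inputs to the function.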

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Miao and Kim in view of
Afshar et al., “COPA: Constrained PARAFAC2 for Sparse & Large Datasets”, hereinafter Afshar.
With respect to claim 11, the combination of Miao and Kim teaches the system of claim 6, further comprising instructions that, when executed by the at least one processor, cause the system to, in response to receiving the modified parameter indicators from the subset of client devices, generate the adjusted global parameters for the global machine learning model by: 
sending a request for modified parameter indicators to a client device, from among the plurality of client devices, that has not sent a set of modified parameter indicators to the system in a threshold number of training iterations; and in response to determining that the client device has not responded to the request for modified parameter indicators within a threshold time, removing the client device from a group of client devices that the system uses for adjusting the global parameters. {Afshar discloses hard and soft thresholding to zero out values or enforce sparsity in an element-wise thresholding process per [P.4-5 Sect3.3.2]. Fig 6 illustrates iteration time for different target ranks, with the effect noted at [P.4 Sect3.3.1] “adaptively handle time-varying gaps”}
	Afshar is directed to ADMM and is thus analogous art. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to utilize the technique of Afshar in combination with Kim and Miao, the motivation being that Afshar employs “element-wise thresholding operators, which can readily scale to large data” (Afshar [P.5 ¶1]) and/or that this “enables our approach to achieve significant speedups (up to 36x) over baselines supporting only a specific constraint each, while achieving the same level of accuracy” (Afshar [P.2 ¶1]).
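For illustration only, the two thresholds recited in the claim 11 limitation (a threshold number of training iterations and a threshold response time) can be sketched as follows. This is a minimal Python sketch of the claim language as quoted above, not of Afshar's thresholding operators or of any cited reference: the function `prune_stale_clients`, its parameters, and the `request_update` callback (assumed to return True when the client responds within the timeout) are all hypothetical.

```python
from typing import Callable, Dict, Set


def prune_stale_clients(
    active: Set[str],
    last_update_iter: Dict[str, int],
    current_iter: int,
    max_missed_iters: int,
    request_update: Callable[[str, float], bool],
    timeout_s: float,
) -> Set[str]:
    """Return the clients that remain in the group used for global updates.

    A client that has not sent modified parameter indicators for at least
    max_missed_iters iterations is sent a request; it is removed only if
    that request goes unanswered within timeout_s seconds.
    """
    remaining = set(active)
    for client_id in sorted(active):
        # Assumption: a client with no recorded update counts from iteration 0.
        missed = current_iter - last_update_iter.get(client_id, 0)
        if missed >= max_missed_iters:
            responded = request_update(client_id, timeout_s)
            if not responded:
                remaining.discard(client_id)
    return remaining
```

In this sketch a client that is merely slow, but answers the request in time, stays in the group; only an unanswered request leads to removal.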

Claims 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Miao in view of Zhang and Wang, “Privacy-Preserving Decentralized Optimization Based on ADMM”, arXiv:1707.04338v1, hereinafter Zhang. Examiner has added page numbers for convenience.
With respect to claim 12, Miao teaches the system of claim 6, further comprising instructions that, when executed by the at least one processor, cause the system to: 
	Miao Fig 2, for which [0070] recites “use a set of weights associated with the first subset of clients to merge the first set of updates into the second global version, so that some updates contribute more to the second global version than other updates”; see also [0059] “the weight may be provided as additional training” or [0026] “aggregated training” on samples [0038].
However, Miao does not expressly disclose “weights for the modified parameter indicators”.
	Zhang discloses:
determine weights for the modified parameter indicators based on a number of training samples corresponding to particular client devices from the subset of client devices; and generate adjusted global parameters for the global machine learning model based on the modified parameter indicators and the determined weights for the modified parameter indicators. {Zhang teaches ADMM with privacy-preserving functionality whereby the weights for the modified parameter indicators correspond to “weight differences” [P.7-8 PgBrk] and/or “time-varying weights in each iteration” [P.3 Last¶]. Notation for said limitation is employed at [P.5 Alg.II] Line 5 or [P.3 Alg.I] Lines 2-3. Additional functionality detailing homomorphic encryption of the weight difference provides for decentralized privacy}
	Zhang is directed to ADMM and is thus analogous art. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to utilize the weight difference of Zhang in combination with Miao’s weights and versioning, the motivation being that “our approach can enable privacy preservation without sacrificing accuracy” (Zhang [P.2 ¶3]).
	
With respect to claim 18, the combination of Miao and Kim teaches the non-transitory computer readable storage medium of claim 17, further comprising instructions that, when executed by the at least one processor, cause the client device to: 
generate a number of training samples from a client-training dataset; and {Miao [0038] “each client may generate updates 114-116 to statistical model 108 using the following… Add x, y to samples for client i” such that [0022-26] “the client may provide the feedback as training data for statistical model” as a [0077] “model-training session”. Further, [0059] “each piece of user feedback 314 is assigned a weight, the weight may be provided as additional training”; see similarly [0070], [0081]}
However, the combination of Miao and Kim does not disclose “weighting for the parameter update differentials”. Zhang teaches:
provide the number of training samples to the server device as a basis for weighting the parameter update differentials when adjusting the global parameters. {Zhang Fig 1 illustrates decentralized network communication for ADMM model parameterization. Agents employ public and private keys in a homomorphic encryption scheme and utilize “weighted differences” [P.7-8 PgBrk] and “time-varying weights” [P.3 Last¶], as detailed by Algorithms I and II for ADMM parameterization}
	The motivation for combination is the same as that of claim 12, applied mutatis mutandis.

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Miao in view of Wei et al., “An Inner-Loop Free Solution to Inverse Problems using Deep Neural Networks”, hereinafter Wei.
With respect to claim 19, the combination of Miao and Kim teaches the non-transitory computer readable storage medium of claim 16, further comprising instructions that, when executed by the at least one processor, cause the client device to utilize the local machine learning model, the global parameters, and the client training data at the client device to generate the locally modified parameters by: 
applying the global parameters in the local machine learning model to a set of client training data from the client training data at the client device to generate a predicted feature; and based on a comparison of the predicted feature and a ground-truth feature from the client training data that corresponds to the predicted feature, modifying the global parameters to generate the locally modified parameters. {Wei discloses ADMM as inference with feature-matching loss [P.5 Sect3.3] where x is identified as ground-truth per [P.6 Sect4.1] for synthetic simulation, adversarial training}
	Wei is directed to ADMM and is thus analogous art. A person having ordinary skill in the art would have considered it obvious prior to the effective filing date to combine the technique of Wei with Miao and Kim for the motivation of providing implicit regularization on ground-truth (Wei [Abstract], [Sect.1 Last¶]).









Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Chase P Hinckley whose telephone number is (571)272-7935.  The examiner can normally be reached on M-F 9:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda M. Huang can be reached on 571-270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/CHASE P. HINCKLEY/Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124