DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This communication is in response to Applicant’s “Response” filed 16 February 2021 [hereinafter Response], in which:
Claims 1, 32, and 38 have been amended.
Claims 1-38 are pending.
	Claims 1-38 are rejected.
Information Disclosure Statement
3.	An information disclosure statement was submitted on 16 February 2021. The submission complies with the provisions of 37 CFR 1.97. Accordingly, the Examiner considered the information disclosure statement.
Claim Rejections - 35 U.S.C. § 103
4.	The following is a quotation of 35 U.S.C. § 103, which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
5.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. § 103 are summarized as follows:
1. 	Determining the scope and contents of the prior art.
2. 	Ascertaining the differences between the prior art and the claims at issue.
3. 	Resolving the level of ordinary skill in the pertinent art.
4. 	Considering objective evidence present in the application indicating obviousness or nonobviousness.
6.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.
7.	Claims 1-8, 13, and 18-38 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20090037351 to Kristal et al. [hereinafter Kristal], in view of US Published Application 20130085773 to Yao et al. [hereinafter Yao] and Park et al., “Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data,” IEEE Int’l Conf on Healthcare Informatics (2013) [hereinafter Park].
Regarding claim 1, Kristal teaches [a] distributed machine learning system (Kristal ¶ 0006 teaches a system and method for training a machine learning network) comprising:
a plurality of private data servers, wherein each private data server of the plurality of private data servers has access to local private data and has at least one modeling engine (Kristal ¶ 0007 teaches a host device performs the initializing, the outputting of the at least one question, and the adjusting/creating of the training input. The system also includes a plurality of client computing devices (plurality of private data servers) receiving the at least one question and transmitting choices from users thereof in response to the questions (each private data server having access to local private data); Kristal ¶ 0063 teaches each trial may utilize its own ML network or a selected portion of a single ML network or a set of networks (that is, each private data server . . . has at least one modeling engine); Kristal ¶ 0036 teaches training the neural network may be for any prediction or classification purpose, including, but not limited to, stock analysis, drug development, terrorist activity, event wagering, medical diagnosis, detection of credit card fraud, classification of DNA sequences (each private data server having access to local private data), . . . ) . . . , 
wherein, for each private data server of the plurality of private data servers, the local private data includes restricted features to which the at least one non-private computing device does not have authorization to access (Kristal ¶ 0041 teaches the user is identified by . . . receiving identification data associated with a user profile. . . . After generating the user profile, the user may be provided with . . . identification data (e.g., username, password (that is, the local private data includes restricted features)) so that for subsequent uses of the network 300, the user is identified (that is, for each private data server . . . , the local private data includes restricted features to which the at least one non-private computing device does not have authorization to access); Examiner points out that by use of a username and password, there are restricted features that in turn are only accessible via the username and the password as taught by Kristal), and
wherein each private data server of the plurality of private data servers, upon execution by at least one processor software instructions stored in a non-transitory computer readable memory causes (Kristal claim 51 teaches a computer-readable medium (non-transitory computer readable memory) storing a set of instructions for execution by a processor . . . .) its at least one modeling engine to:
receive model instructions (Kristal ¶ 0029 teaches [e]xamples of learning algorithms, rules, trees and decision strategies (receive model instructions) include, but are not limited to: Bayesnet, ComplimentNaiveBayes, NaiveBayesMultinomial, [etc.]) to create a trained actual model from at least some of the local private data and according to an implementation of a machine learning algorithm (Kristal ¶ 0002 teaches use of ML algorithms (to create a trained actual model) requires only the existence of sufficient and relevant set of prior experiential data (i.e., accurate training exemplars that include examples of input and associated output (from at least some of the local private data)), and does not require the user to have any knowledge of the rules that govern the system’s behavior (and according to an implementation of a machine learning algorithm));
create the trained actual model according to the model instructions and as a function of the at least some of the local private data (Kristal ¶ 0002 teaches use of [machine learning] algorithms requires only the existence of sufficient and relevant set of prior experiential data (i.e., accurate training exemplars that include examples of input and associated output), and does not require the user to have any knowledge of the rules that govern the system’s behavior (create the trained actual model using a machine learning algorithm)) by training the implementation of the machine learning algorithm on the local private data (Kristal ¶ 0050 teaches where the number of training cycles required when no synthetic data is used (create the trained actual model according to the model instructions and as a function of the at least some of the local private data) may be between 700 and 5000 . . . ), the trained actual model comprising trained actual model parameters (Kristal ¶ 0002 teaches that prior to use, the [neural] networks must be trained with known input and outcome data (the trained actual model comprising trained actual model parameters) to provide predictions with an acceptable level of accuracy);
* * *
generate a set of proxy data according to the plurality of private data . . . distributions (Kristal ¶ 0027 teaches the training of an ML network (e.g., an artificial neural network) by using human input (e.g., synthetic data) (generate a set of proxy data) which reduces the network’s learning time and improves problem solving, decision making, classification and prediction accuracy); 
create a trained proxy model from the set of proxy data by training another implementation of the machine learning algorithm on the set of proxy data, wherein the trained proxy model comprises proxy model parameters (Kristal, Abstract teaches the training of an ML network (e.g., an artificial neural network) by using human input (e.g., synthetic data) (create a trained proxy model from the set of proxy data by training the type of machine learning model on the set of proxy data . . . comprises proxy model parameters); Kristal, Abstract teaches synthetic training sets (set of proxy data) derived from expert or non-expert human guesstimates can replace or augment training data sets comprised of actual training exemplars);
calculate a model similarity score as a function of the proxy model parameters and the trained actual model parameters (Kristal ¶ 0044 teaches one may train the algorithm, obtain predictions, and then rerun network training after distributing supporting information or information about user guesstimates, etc. Thus, it may be possible to use the present invention to obtain multiple related data sets for differential comparison (calculate a model similarity score as a function of the proxy model parameters and the trained actual model parameters); see also Kristal FIG. 9, which shows simulation results based on the method of FIG. 2, in which machine learning algorithm comparisons are made in view of synthetic data (proxy model parameters) in contrast to “without synthetic data” (trained actual model parameters)); and
transmit the set of proxy data . . . (Kristal ¶ 0033 teaches input data received from, for example, the users 15-30 and/or from further nodes connected to the node, is used to generate the output data (set of proxy data). The output data may be fed to a subsequent node (transmit the set of proxy data) . . .).
Though Kristal teaches the feature of generating a set of proxy data as set out above, Kristal does not explicitly teach -
* * *
. . . the plurality of private data servers are communicatively coupled, via a network, to at least one non-private computing device, . . . causes its at least one modeling engine to:
* * *
generate a plurality of private data . . . distributions from the local private data where the plurality of private data . . . distributions represents the local private data in aggregate used to create the trained actual model and does not include individual elements of the local private data; 
* * *
. . . and transmit the set of proxy data, over the network, to the at least one non-private computing device as a function of the model similarity score.
But Yao teaches -
. . . the plurality of private data servers are communicatively coupled, via a network, to at least one non-private computing device (Yao ¶ 0049 teaches [t]he term “server” is meant to refer to a stand-alone computer and/or a networked server (communicatively coupled, via a network) that is dedicated to carry out a subject operation. Examples of stand-alone and networked servers that may be used with the present invention include without limitation, . . . a local on-site server (plurality of private data servers), and an off-site cloud server (at least one non-private computing device)) . . . causes its at least one modeling engine to:
* * *
generate a plurality of private data . . . distributions from the local private data where the private data . . . distributions represent the local private data in aggregate used to create the trained actual model and does not include individual elements of the local private data (Yao Fig. 2 teaches, with Examiner annotations provided in text boxes:

[Yao Fig. 2, with Examiner annotations; image not reproduced]

Yao ¶ 0008 teaches providing more than one healthcare centers with a Model Deconstruction Transfer (MDT) platform comprised of a variable library (VL), wherein each healthcare center enters at least one data set (generate a plurality of private data distributions from the local private data) relevant to the health outcome of interest into the MDT platform and selects variables from the VL that are relevant to the health outcome of interest; and (b) generating at least one prediction model (PM0) for each healthcare center from the MDT platform (where the private data distributions represent the local private data in aggregate used to create the trained actual model); Yao ¶ 0079 teaches that [t]he dataset may include [public health information (PHI)] and personal identifiers, if permitted by the policies of the [Participating Healthcare Center (PHC)] and any applicable HIPPA [sic] regulations. Alternatively, this dataset may be a de-identified or limited data set, which contains dates, but not other personal identifiers. It is to be understood that the dataset from any particular PHC is not shared with Company A, any of the other PHCs, or any ultimate users; Examiner construes “individual elements” as “actual private or restricted features of the local, private data” (see PGPUB1 ¶ 0016); Accordingly, Yao ¶ 0007 teaches to generate and validate healthcare prediction models based upon healthcare data obtained from multiple sources, without the need to de-identify or transfer clinical data beyond the physical and network boundaries of healthcare facilities (that is, does not include individual elements of the local private data));
* * *
transmit the set of proxy data, over the network, to the at least one non-private computing device as a function of the model similarity score (Yao ¶ 0022 teaches the at least two [prediction model] PM1 are generated, ranked, and selected according to model performance parameters (as a function of the model similarity score) selected from group consisting of posterior log likelihoods, predictive power based upon posterior probability of an event, [etc.]; see also Yao ¶ 0052 that further teaches that “Model Deconstruction and Transfer” and “MDT” are meant to refer to a platform that allows the [participating healthcare center] to provide access to a third party, to use a PHC’s healthcare data to generate a prediction model that is devoid of any personal identifiers on-site, and transfer that prediction model to a third-party facility, without the transfer of personal identifiers or any raw data, for deconstruction).
Kristal and Yao are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of Kristal pertaining to neural network training with synthetic data and without synthetic data from multiple sources with the generation, ranking, and selection of model performance parameters and cloud servers of Yao.
The motivation for doing so is to harness the massive amounts of healthcare data that exist among multiple sources and to turn that data into prediction models that are rich in diversity and prognostic value and are capable of being applied to patient populations that may or may not have accumulated years of healthcare data (Yao ¶ 0057).
Though Kristal and Yao teach training models based on data with restricted and/or personal data removed (such as proxy data, or representative data distributions), the combination of Kristal and Yao does not explicitly teach that the private data distributions are “private data statistical distributions.”
But Park teaches “private data statistical distributions” (Park, left column of p. 494, “II. Related Work”, fourth paragraph, teaches [Markov Chain Monte Carlo (MCMC)] methods generally refer to a class of algorithms that draw samples from non-trivial probability (that is, statistical) distributions (that is, private data statistical distributions) through Markov chain simulations. Some special cases are . . . Gibbs sampler . . . . Among many choices, this paper uses Gibbs sampler for two reasons. First, Gibbs sampler only requires conditional distributions and no other tuning parameters. . . . Second, unlike [Metropolis-Hastings] algorithm, Gibbs sampler does not reject samples, thus Gibbs sampler is usually more efficient than MH algorithm when drawing high-dimensional samples).
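For illustration only, the following is a minimal sketch of a plain Gibbs sampler of the kind the cited passage of Park describes in general terms: it needs only conditional distributions and keeps every draw rather than rejecting any. It is not Park's perturbed variant and is not taken from any cited reference. The function name `gibbs_sample`, the two binary variables, and the conditional probabilities are hypothetical values chosen for the example.

```python
import random


def gibbs_sample(p_x1_given_y, p_y1_given_x, n_samples, burn_in=100, seed=0):
    """Draw (x, y) pairs over two binary variables using only the
    conditional distributions P(x=1 | y) and P(y=1 | x)."""
    rng = random.Random(seed)
    x, y = 0, 0  # arbitrary starting state
    samples = []
    for step in range(burn_in + n_samples):
        # Resample x given the current y, then y given the new x.
        x = 1 if rng.random() < p_x1_given_y[y] else 0
        y = 1 if rng.random() < p_y1_given_x[x] else 0
        if step >= burn_in:
            # Every post-burn-in draw is kept; no sample is rejected.
            samples.append((x, y))
    return samples


# Hypothetical conditionals, indexed by the value of the conditioning variable.
P_X1_GIVEN_Y = {0: 0.2, 1: 0.7}
P_Y1_GIVEN_X = {0: 0.3, 1: 0.8}

synthetic = gibbs_sample(P_X1_GIVEN_Y, P_Y1_GIVEN_X, n_samples=1000)
```

The sampler never reads an individual source record; it consumes only the conditional probabilities, which is the property the cited passage emphasizes.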
Kristal, Yao, and Park are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to modify the combination of Kristal and Yao pertaining to neural network training with synthetic data and without synthetic data from multiple sources that is based on generation, ranking, and selection of model performance parameters with the synthetic health data of Park.
The motivation for doing so is to synthesize artificial records while preserving the statistical characteristics of the original data to the extent possible in order to deliver results that are substantially identical to those obtained from the original dataset (Park, Abstract).
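For illustration only, the sequence of operations recited in claim 1 (train an actual model, summarize the private data as aggregate distributions, sample proxy data, train a proxy model, score the similarity, and transmit conditionally) can be sketched as follows. This is a toy under stated assumptions, not an implementation from the application or any cited reference: the one-feature threshold classifier, the per-class Gaussian summary, the accuracy-gap similarity score, and every function name and default value are hypothetical.

```python
import random
import statistics


def train_threshold_model(rows):
    """Toy one-feature classifier: threshold at the midpoint of the class means.
    Assumes class 1 has the larger mean."""
    mean0 = statistics.mean([x for x, label in rows if label == 0])
    mean1 = statistics.mean([x for x, label in rows if label == 1])
    return {"threshold": (mean0 + mean1) / 2.0}


def accuracy(model, rows):
    hits = sum((x > model["threshold"]) == bool(label) for x, label in rows)
    return hits / len(rows)


def summarize(rows):
    """Aggregate statistics only: per-class count, mean, and standard deviation."""
    summary = {}
    for label in (0, 1):
        values = [x for x, lab in rows if lab == label]
        summary[label] = (len(values), statistics.mean(values), statistics.stdev(values))
    return summary


def sample_proxy(summary, rng):
    """Draw synthetic rows from the per-class Gaussian summaries."""
    return [(rng.gauss(mean, std), label)
            for label, (count, mean, std) in summary.items()
            for _ in range(count)]


def share_proxy_if_similar(private_rows, min_similarity=0.9, seed=0):
    rng = random.Random(seed)
    actual = train_threshold_model(private_rows)
    proxy_rows = sample_proxy(summarize(private_rows), rng)
    proxy = train_threshold_model(proxy_rows)
    # Hypothetical similarity score: 1 minus the accuracy gap on the private data.
    gap = abs(accuracy(actual, private_rows) - accuracy(proxy, private_rows))
    similarity = 1.0 - gap
    # Release the proxy rows only when the score meets the threshold.
    released = proxy_rows if similarity >= min_similarity else None
    return released, similarity


# Hypothetical private data: two well-separated classes of 200 rows each.
_data_rng = random.Random(42)
private_rows = ([(_data_rng.gauss(0.0, 1.0), 0) for _ in range(200)]
                + [(_data_rng.gauss(3.0, 1.0), 1) for _ in range(200)])
released_rows, similarity = share_proxy_if_similar(private_rows)
```

In this sketch only the sampled rows, never the private rows or the per-row values behind the summary, would leave the private data server.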
Regarding claim 2, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Yao further teaches wherein the local private data comprises local private healthcare data (Yao ¶ 0043 teaches “healthcare data” is meant to refer to medical data that is created in the process of inpatient or outpatient clinical services that are provided to the patient, such as for example, patient history, physical exams, lab tests and results, medical procedures and results, medications and patient response to the medications (healthcare data); Yao ¶ 0056 further teaches the costs of data-sharing (local) associated with legal- and institution-imposed protection of healthcare data (private) that is linked to personal identifiers).
Regarding claim 3, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 2, as described above.
Yao further teaches wherein the local private healthcare data includes patient-specific data (Yao ¶ 0056 further teaches the costs of data-sharing (local) associated with legal- and institution-imposed protection of healthcare data (private) that is linked to personal identifiers (patient-specific data)).
Regarding claim 4, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Yao further teaches wherein the local private data includes at least one of the following types of data: genomic data, whole genome sequence data, whole exosome sequence data, proteomic data, proteomic pathway data, k-mer data, neoepitope data, RNA data, allergy information, encounter data, treatment data, outcome data, appointment data, order data, billing code data, diagnosis code data, results data, treatment response data, tumor response data, demographic data, medication data, vital sign data, payor data, drug study data, drug response data, longitudinal study data, biometric data, financial data, proprietary data, electronic medical record data, research data, human capital data, performance data, analysis results data, and event data (Yao ¶ 0043 teaches “healthcare data” is meant to include data that is generated for billing and other administrative tasks that are required for the patient to receive clinical services with his/her healthcare providers and navigate his/her healthcare needs, such as for example, pharmaceutical prescriptions, lab orders, billing records (billing code data), and referrals).
Regarding claim 5, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Kristal further teaches wherein the network comprises at least one of the following types of networks: a wireless network, a packet switched network, the Internet, an intranet, a virtual private network, a cellular network, an ad hoc network, and a peer-to-peer network (Kristal ¶ 0030 teaches a communications network 35, e.g., a wired/wireless LAN/WAN, and intranet (an intranet), the Internet, etc.).
Regarding claim 6, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Yao further teaches wherein the at least one non-private computing device is a different one of the plurality of private data servers lacking authorization to the local private data on which the trained actual model was created (Yao ¶ 0059 teaches a non-participating healthcare center (PHC) regional healthcare network wants a prediction model for IUI treatment successes (at least one non-private computing device is a different one of the plurality of private data servers lacking authorization to the local private data on which the trained model was created), but has very little date [sic] of its own).
Regarding claim 7, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Yao further teaches wherein the at least one non-private computing device includes a global model server (Yao ¶ 0049 teaches [t]he term “server” is meant to refer to a stand-alone computer and/or a networked server (communicatively coupled, via a network) that is dedicated to carry out a subject operation. Examples of stand-alone and networked servers that may be used with the present invention include without limitation, . . . a local on-site server (plurality of private data servers), and an off-site cloud server (at least one non-private computing device); Examiner points out that a cloud server is a global model server).
Regarding claim 8, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 7, as described above.
Kristal teaches wherein the global model server is configured to aggregate sets of proxy data from at least two of the plurality of private data servers (Kristal ¶ 0030 teaches the input (proxy data) of the users 15-30 (at least two of private data servers) may be collected and transmitted to the host device 10 by a single user (e.g., an administrator) (global model server is configured to aggregate sets of proxy data)) and is configured to train a global model on the sets of proxy data (Kristal ¶ 0034 teaches the neural network is initialized and/or trains itself (configured to train a global model) as a function of a novel type of synthetic input data received from the users 15-30 (sets of proxy data)).
Regarding claim 13, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Park teaches wherein, for each statistical distribution of the plurality of private data statistical distributions, the statistical distribution is generated such that the . . . distribution comprises at least one of the following types of statistical distributions: a Gaussian distribution, a Poisson distribution, a Bernoulli distribution, a Rademacher distribution, a discrete distribution, a binomial distribution, a zeta distribution, a Gamma distribution, a beta distribution, and a histogram distribution (Park, left column of p. 497, “VI. Empirical Studies-A. Marginal Distribution”, first paragraph teaches Figures 2 and 3 show marginal histograms of drug and ICD-9 procedure codes over different levels of privacy metrics log l and ϵ (that is, each statistical distribution . . . is generated such that the statistical distribution comprises at least one of the following types of statistical distributions: . . . a histogram distribution). Histograms of the original data are shown in the top left cells (that is, plurality of private data statistical distributions)).
Regarding claim 18, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Kristal further teaches wherein the trained actual model is based on an implementation of at least one of the following types of machine learning algorithms: a classification algorithm, a neural network algorithm, a regression algorithm, a decision tree algorithm, a clustering algorithm, a genetic algorithm, a supervised learning algorithm, a semi-supervised learning algorithm, an unsupervised learning algorithm, and a deep learning algorithm (Kristal ¶ 0003 teaches [t]raining the network by supervised learning (supervised learning algorithm) thus involves sequentially generating outcome data from a known set of input data (where inputs and outputs are correctly matched)).
Regarding claim 19, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Yao further teaches wherein the trained actual model is based on an implementation of at least one of the following machine learning algorithms: a support vector machine, a nearest neighbor algorithm, a random forest, a ridge regression, a Lasso algorithm, a k-means clustering algorithm, a spectral clustering algorithm, a mean shift clustering algorithm, a non-negative matrix factorization algorithm, an elastic net algorithm, a Bayesian classifier algorithm, a RANSAC algorithm, and an orthogonal matching pursuit algorithm (Yao ¶ 0021 teaches [t]he machine learning technique may be selected from the group consisting of classification tree methods, LASSO (least absolute shrinkage and selection operator) (Lasso algorithm), Bayesian network modeling (Bayesian classifier algorithm), and combinations of any of the foregoing).
Regarding claim 20, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Kristal further teaches wherein, for a first private data server of the plurality of private data servers, the model instructions include instructions to create the trained actual model from a base-line model created external to the private data server (Kristal ¶ 0034 teaches [i]n a conventional neural network, each node and connection in the neural network (trained actual model) is assigned a predetermined bias and an initial weight (create the trained actual model from a base-line model created external to the private data server)).
Regarding claim 21, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 20, as described above.
Kristal further teaches wherein the baseline model comprises a global trained actual model (Kristal ¶ 0034 teaches [i]n a conventional neural network (global trained actual model), each node and connection in the neural network (trained actual model) is assigned a predetermined bias and an initial weight).
Regarding claim 22, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 21, as described above.	
Kristal further teaches wherein the global trained actual model is trained, at least in part, on sets of proxy data from at least two of the plurality of private data servers other than the first private data server (Kristal ¶ 0030 teaches the input (proxy data) of the users 15-30 (at least two of private data servers) may be collected and transmitted to the host device 10 by a single user (e.g., an administrator)).
Regarding claim 23, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1.
Kristal teaches wherein the similarity score is determined based on a cross validation of the trained proxy model (Kristal FIG. 9, which shows simulation results (similarity score) based on the method of FIG. 2, in which machine learning algorithm comparisons are made in view of synthetic data (proxy model parameters) in contrast to “without synthetic data” (trained actual model parameters) (that is, similarity score is determined based on a cross validation of the proxy model)).
Regarding claim 24, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 23, as described above. 
Kristal further teaches wherein the cross validation includes an internal cross validation on a portion of the set of proxy data (Kristal ¶ 0045 teaches comparison (internal cross validation) of guesstimates, votes, etc. between voters (on a portion of the proxy data) may be used to determine if there are outliers).
Regarding claim 25, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 23, as described above.
Kristal further teaches wherein the cross validation includes an internal cross validation of the local private data (Kristal ¶ 0044 teaches it may be possible to use the present invention to obtain multiple related data sets (of the local private data) for differential comparison (cross validation)).
Regarding claim 26, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 23, as described above.
Kristal further teaches wherein the cross validation includes an external cross validation by a different one of the plurality of private data servers on its local private data (Kristal FIG. 9 teaches that simulation results based on the method of FIG. 2 where machine learning algorithm comparisons are made in view of synthetic data (proxy model parameters) in contrast to “without synthetic data” (trained actual model parameters), accordingly, conducting an external cross validation by a different one of the plurality of private servers on its local private data because the synthetic data is external to the instances “without synthetic data”, or rather actual private data of a server).
Regarding claim 27, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Kristal teaches wherein the model similarity score comprises a difference between an accuracy measure of the trained proxy model and an accuracy measure of the trained actual model (Kristal ¶ 0058 & FIG. 9 teaches that synthetic data (proxy model parameters) are beneficial primarily when unassisted algorithms (without synthetic data, which is trained actual model parameters) make somewhat accurate predictions, here < 40% (difference between an accuracy measure of the trained actual model parameters and the proxy model parameters)).
Regarding claim 28, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Kristal teaches wherein the model similarity score comprises a metric distance calculated from the trained actual model parameters and the proxy model parameters (Kristal ¶ 0077 & FIG. 13 teaches results produced by the DECORATE algorithm . . . showing that input of human guesstimates (proxy model parameters) markedly (metric distance calculated) reduces errors [to unassisted ML] (trained actual model parameters)).
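For illustration only, one way a "metric distance calculated from the trained actual model parameters and the proxy model parameters" might be computed is a Euclidean distance between two parameter vectors. The choice of Euclidean distance, the function name `parameter_distance`, and the example values are assumptions; neither the claim language nor the cited references specify them.

```python
import math


def parameter_distance(actual_params, proxy_params):
    """Euclidean distance between two equal-length parameter vectors."""
    if len(actual_params) != len(proxy_params):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual_params, proxy_params)))


# Hypothetical parameter vectors for an actual model and a proxy model.
actual_params = [1.0, 2.0, 3.0]
proxy_params = [1.0, 5.0, 7.0]
distance = parameter_distance(actual_params, proxy_params)  # sqrt(0 + 9 + 16) = 5.0
```

A distance of zero would mean the two models have identical parameters; larger values indicate that the proxy model has drifted further from the actual model.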
Regarding claim 29, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
Yao teaches wherein the set of proxy data is transmitted when the function of the model similarity score satisfies at least one transmission criterion (Yao ¶ 0061 teaches use of subsets of the predictive variables and thresholds to optimally define patient populations that are enriched for a certain trait, prognosis or outcome; predictive power; discrimination (similarity score satisfies at least one transmission criterion); and reclassification).
Regarding claim 30, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 29, as described above.
Yao teaches wherein the at least one transmission criterion include at least one of the following conditions relating to the model similarity score: a threshold condition, a multi-valued condition, a change in value condition, a trend condition, a human command condition, an external request condition, and a time condition (Yao ¶ 0062 teaches [d]iscrimination refers to how well a model can differentiate patients with higher versus lower probabilities of outcomes, or those with significantly different prognoses. The ability to discriminate can be measured by receiver operator characteristics analysis, where the area-under-the-curve (AUC) indicates the degree of discrimination and AUC=0.5 indicates the model has no ability to discriminate (threshold condition)).
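For illustration only, the area-under-the-curve measure quoted from Yao ¶ 0062, together with a simple threshold check against the no-discrimination value of 0.5, can be sketched as follows. The pairwise-ranking computation of AUC is a standard equivalent of the ROC area; the function names, the example scores, and the use of 0.5 as the default threshold are assumptions made for the example and are not drawn from Yao.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of positive/negative
    pairs in which the positive is scored higher (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def meets_threshold(value, threshold=0.5):
    """Threshold condition: true only when the value exceeds the threshold."""
    return value > threshold


# Hypothetical model scores and true outcomes.
labels = [1, 1, 0, 0]
perfect = auc([0.9, 0.8, 0.3, 0.1], labels)      # every positive outranks every negative
uninformative = auc([0.5, 0.5, 0.5, 0.5], labels)  # all ties

print(perfect, meets_threshold(perfect))              # 1.0 True
print(uninformative, meets_threshold(uninformative))  # 0.5 False
```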
Regarding claim 31, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above. 
Yao further teaches wherein the modeling engine is further configured to update the trained actual model on new local private data (Yao ¶ 0069 teaches a method for incentivizing PHCs to provide updated data, at regular time intervals (to update the trained actual model on new local private data)).
Regarding claim 32, Kristal teaches [a] computing-device implemented method of distributed machine learning, the method comprising:
receiving, by a private data server, model instructions (Kristal ¶ 0029 teaches [e]xamples of learning algorithms, rules, trees and decision strategies (receive model instructions) include, but are not limited to: Bayesnet, ComplimentNaiveBayes, NaiveBayesMultinomial, [etc.]) to create a trained actual model from at least some of local private data local to the private data server and according to an implementation of a machine learning algorithm (Kristal ¶ 0002 teaches use of ML algorithms (to create a trained actual model) requires only the existence of sufficient and relevant set of prior experiential data (i.e., accurate training exemplars that include examples of input and associated output (from at least some of the local private data)), and does not require the user to have any knowledge of the rules that govern the system’s behavior (and according to an implementation of a machine learning algorithm)), wherein the local private data includes restricted features (Kristal ¶ 0041 teaches the user is identified by . . . receiving identification data associated with a user profile. . . . After generating the user profile, the user may be provided with . . . identification data (e.g., username, password (that is, the local private data includes restricted features))));
creating, by a machine learning engine, the trained actual model according to the model instructions and as a function of the at least some of the local private data (Kristal ¶ 0002 teaches use of ML algorithms requires only the existence of sufficient and relevant set of prior experiential data (i.e., accurate training exemplars that include examples of input and associated output), and does not require the user to have any knowledge of the rules that govern the system’s behavior) by training the implementation of the machine learning algorithm on the local private data (Kristal ¶ 0050 teaches where the number of training cycles required when no synthetic data is used (create the trained actual model according to the model instructions and as a function of the at least some of the local private data) may be between 700 and 5000 . . . ), wherein the trained actual model comprises trained actual model parameters (Kristal ¶ 0002 teaches that prior to use, the [neural] networks must be trained with known input and outcome data (the trained actual model comprising trained actual model parameters) to provide predictions with an acceptable level of accuracy);
* * *
identifying, by the machine learning engine, salient private data features from the plurality of private data . . . distributions, wherein the salient private data features allow for replication of the plurality of private data distributions (Kristal, Abstract, teaches synthetic training sets derived from expert or non-expert human guesstimates can replace or augment training data sets (for replication of the plurality of proxy data distributions) comprised of actual training exemplars (identifying . . . salient private data features from the private data distribution) that are too limited in size, scope, or quality to otherwise generate accurate predictions),
wherein the non-private computing device is not authorized to access the restricted features of the local private data (Kristal ¶ 0041 teaches the user is identified by . . . receiving identification data associated with a user profile. . . . After generating the user profile, the user may be provided with . . . identification data (e.g., username, password (that is, the local private data includes restricted features)) so that for subsequent uses of the network 300, the user is identified (that is, for each private data server . . . , the local private data includes restricted features to which the at least one non-private computing device does not have authorization to access); Examiner points out that by use of a username and password, there are restricted features that in turn are only accessible via the username and the password as taught by Kristal), . . . .
However, Kristal fails to explicitly teach -
* * *
generating, by the machine learning engine, a plurality of private data . . . distributions from the local private data, wherein the plurality of private data . . . distributions represents the local private data in aggregate used to create the trained actual model and does not include individual elements of the local private data;
* * *
and transmitting, by the machine learning engine, the salient private data features over a network to a non-private computing device.
But Yao teaches -
* * *
(Yao Fig. 2 teaches, with Examiner annotations provided in text boxes:

[Image: Yao Fig. 2 with Examiner annotations (media_image1.png)]

Yao ¶ 0008 teaches providing more than one healthcare centers with a Model Deconstruction Transfer (MDT) platform comprised of a variable library (VL), wherein each healthcare center enters at least one data set (generate a plurality of private data distributions from the local private data) relevant to the health outcome of interest into the MDT platform and selects variables from the VL that are relevant to the health outcome of interest; and (b) generating at least one prediction model (PM0) for each healthcare center from the MDT platform (where the private data distributions represent the local private data in aggregate used to create the trained actual model); Yao ¶ 0079 teaches that [t]he dataset may include [protected health information (PHI)] and personal identifiers, if permitted by the policies of the [Participating Healthcare Center (PHC)] and any applicable HIPPA [sic] regulations. Alternatively, this dataset may be a de-identified or limited data set, which contains dates, but not other personal identifiers. It is to be understood that the dataset from any particular PHC is not shared with Company A, any of the other PHCs, or any ultimate users; Examiner construes “individual elements” as “actual private or restricted features of the local, private data” (see PGPUB2 ¶ 0016); Accordingly, Yao ¶ 0007 teaches to generate and validate healthcare prediction models based upon healthcare data obtained from multiple sources, without the need to de-identify or transfer clinical data beyond the physical and network boundaries of healthcare facilities (that is, does not include individual elements of the local private data));
* * *
and transmitting, by the machine learning engine, the salient private data features over a network to a non-private computing device (Yao ¶ 0010 teaches selecting variables from each data set that are relevant to the health outcome of interest (transmitting . . . the salient private data features); Yao ¶ 0049 teaches [e]xamples of stand-alone and networked servers that may be used with the present invention include without limitation, . . . an off-site cloud server (being a non-private computing device)).
Kristal and Yao are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of Kristal pertaining to neural network training with synthetic data and without synthetic data from multiple sources with the generation, ranking, and selection of model performance parameters and cloud servers of Yao.
The motivation for doing so is to harness the massive amounts of healthcare data that exist among multiple sources and to turn that data into prediction models that are rich in diversity and prognostic value and are capable of being applied to patient populations that may or may not have accumulated years of healthcare data. (Yao ¶ 0057).
However, the combination of Kristal and Yao does not explicitly teach that the private data distributions are “private data statistical distributions,” and also, the combination does not explicitly teach -
* * *
. . . and wherein the salient private data features exclude the restricted features.
But Park teaches “private data statistical distributions” (Park, left column of p. 494, “II. Related Work”, fourth paragraph, teaches [Markov Chain Monte Carlo (MCMC)] methods generally refer to a class of algorithms that draw samples from non-trivial probability (that is, statistical) distributions (that is, private data statistical distributions) through Markov chain simulations. Some special cases are . . . Gibbs sampler . . . . Among many choices, this paper uses Gibbs sampler for two reasons. First, Gibbs sampler only requires conditional distributions and no other tuning parameters. . . . Second, unlike [Metropolis-Hastings] algorithm, Gibbs sampler does not reject samples, thus Gibbs sampler is usually more efficient than MH algorithm when drawing high-dimensional samples).
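For illustration only, the two properties of a Gibbs sampler noted in the quoted passage (sampling requires only conditional distributions, and no drawn sample is rejected) can be seen in the following minimal sketch of a generic Gibbs sampler for a standard bivariate normal distribution. This is not the perturbed Gibbs sampler (PeGS) of Park; the function name gibbs_bivariate_normal, the target distribution, and the parameter values are assumptions chosen for illustration.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Generic Gibbs sampler for a standard bivariate normal, correlation rho.

    Each coordinate is drawn from its conditional distribution given the
    other: x | y ~ N(rho * y, 1 - rho**2), and symmetrically for y | x.
    Every draw is accepted; there is no rejection step.
    """
    rng = random.Random(seed)
    conditional_sd = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, conditional_sd)  # draw x given current y
        y = rng.gauss(rho * x, conditional_sd)  # draw y given the new x
        if step >= burn_in:
            samples.append((x, y))
    return samples


# Hypothetical settings, for illustration only.
samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
```

The early draws are discarded as burn-in so that the retained samples depend less on the arbitrary starting point.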
Park also teaches -
* * *
. . . and wherein the salient private data features exclude the restricted features (Park, left column of p. 497, “VI. Empirical Studies-A. Marginal Distribution”, first paragraph, teaches [a] data file contains seven variables (that is, salient private data features): ID, Gender, Age, DRG (drug code), ICD-9 (procedure code), Length (the length of stay), and Amount (payment), and has 15K rows (that is, salient private data features) . . . . PeGS is the first privacy safe data synthesizer, which adheres to the rigorous privacy metrics (that is, privacy safe being salient private data features exclude restricted features)).
Kristal, Yao, and Park are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to modify the combination of Kristal and Yao pertaining to neural network training with synthetic data and without synthetic data from multiple sources that is based on generation, ranking, and selection of model performance parameters with the synthetic health data of Park.
The motivation for doing so is to synthesize artificial records while preserving the statistical characteristics of the original data to the extent possible in order to deliver results that are substantially identical to those obtained from the original dataset. (Park, Abstract).
Regarding claim 33, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 32, as described above.
	Kristal teaches wherein the salient private data features includes a set of proxy data (Kristal, Abstract teaches synthetic training sets (set of proxy data) derived from expert or non-expert human guesstimates can replace or augment training data sets comprised of actual training exemplars (salient private data features)).
Regarding claim 34, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 32, as described above.
	Kristal teaches further comprising generating a set of proxy data according to at least one of the following: the plurality of private data . . . distributions and the salient private data features (Kristal, Abstract teaches synthetic training sets (set of proxy data) derived from expert or non-expert human guesstimates can replace or augment training data sets comprised of actual training exemplars (salient private data features)).
Regarding claim 35, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 34, as described above.
	Kristal teaches further comprising creating a trained proxy model from the set of proxy data by training another implementation of the machine learning algorithm on the set of proxy data, wherein the trained proxy model comprises proxy model parameters (Kristal ¶ 0027 teaches the training of an ML network (e.g., an artificial neural network) by using human input (e.g., synthetic data) (create a trained proxy model from the set of proxy data by training the type of machine learning model on the set of proxy data); Kristal, Abstract teaches synthetic training sets (set of proxy data) derived from expert or non-expert human guesstimates can replace or augment training data sets comprised of actual training exemplars).
Regarding claim 36, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 35, as described above.
	Kristal teaches further comprising calculating a model similarity score of the trained proxy model as a function of the proxy model parameters and the trained actual model parameters (Kristal ¶ 0044 teaches one may train the algorithm, obtain predictions, and then rerun network training after distributing supporting information or information about user guesstimates, etc. Thus, it may be possible to use the present invention to obtain multiple related data sets for differential comparison (calculate a model similarity score as a function of the proxy model parameters and the trained actual model parameters); see also Kristal FIG. 9, which shows simulation results based on the method of FIG. 2, in which machine learning algorithm comparisons are made in view of synthetic data (proxy model parameters) in contrast to “without synthetic data” (trained actual model parameters)).
Regarding claim 37, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 36, as described above.
	Yao teaches further comprising aggregating the set of proxy data into an aggregated global model based on the model similarity score (Yao ¶ 0008 teaches providing more than one healthcare centers with a Model Deconstruction Transfer (MDT) platform comprised of a variable library (VL), wherein each healthcare center enters at least one data set (generate a plurality of private data distributions from the local private data) relevant to the health outcome of interest into the MDT platform and selects variables from the VL that are relevant to the health outcome of interest; and (b) generating at least one prediction model (PM0) for each healthcare center from the MDT platform (where the private data distributions represent the local private data in aggregate used to create the trained actual model)).
Regarding claim 38, Kristal teaches a computer-implemented method of generating proxy data (Kristal ¶ 0007; Kristal ¶ 0027 teaches the training of an ML network (e.g., an artificial neural network) by using human input (e.g., synthetic data) (computer-implemented method of generating proxy data)), the method comprising:
creating, at a private data server, from local private data accessible to the private data server, a trained actual model using a machine learning algorithm (Kristal ¶ 0002 teaches use of ML algorithms requires only the existence of sufficient and relevant set of prior experiential data (i.e., accurate training exemplars that include examples of input and associated output), and does not require the user to have any knowledge of the rules that govern the system’s behavior (create, from the private data, a trained actual model using a machine learning algorithm)), wherein:
the local private data includes restricted features that are accessible to the private data server and are inaccessible to at least one system (Kristal ¶ 0041 teaches the user is identified by . . . receiving identification data associated with a user profile. . . . After generating the user profile, the user may be provided with . . . identification data (e.g., username, password (that is, the local private data includes restricted features)) so that for subsequent uses of the network 300, the user is identified (that is, for each private data server . . . , the local private data includes restricted features . . . are inaccessible to at least one system); Examiner points out that by use of a username and password, there are restricted features that are accessible to the private data server via the username and the password as taught by Kristal) . . . 
* * *
generating a set of proxy data based on the plurality of private data distributions (Kristal ¶ 0027 teaches the training of an ML network (e.g., an artificial neural network) by using human input (e.g., synthetic data) (generate a set of proxy data) which reduces the network’s learning time and improves problem solving, decision making, classification and prediction accuracy), . . . ;
creating, from the set of proxy data, a trained proxy model using the machine learning algorithm (Kristal, Abstract teaches the training of an ML network (e.g., an artificial neural network) by using human input (e.g., synthetic data) (create a trained proxy model from the set of proxy data by training the type of machine learning model on the set of proxy data); Kristal, Abstract teaches synthetic training sets (set of proxy data) derived from expert or non-expert human guesstimates can replace or augment training data sets comprised of actual training exemplars); and
* * *
	However, Kristal fails to explicitly teach -
* * *
. . . wherein:
. . . and the restricted features include protected health information;
generating, at the private data server, a plurality of private data . . . distributions from at least some of the local private data, wherein the plurality of private data . . . distributions represents the local private data in aggregate and does not include individual elements of the local private data;
* * *
and distributing the trained proxy model to the at least one system that lacks access to the restricted features.
	But Yao teaches -
* * *
. . . wherein:
. . . and the restricted features include protected health information (Yao ¶ 0079 teaches dataset may include PHI and personal identifiers (that is, protected health information), if permitted by the policies of the PHC and any applicable HIPPA regulations. Alternatively, this dataset may be a de-identified or limited data set, which contains dates, but not other personal identifiers. It is to be understood that the dataset from any particular PHC is not shared (that is, restricted features) with Company A, any of the other PHCs, or any ultimate users (that is, the restricted features include protected health information));
generating, at the private data server, a plurality of private data . . . distributions from at least some of the local private data, wherein the plurality of private data . . . distributions represent the local private data in aggregate (Yao ¶ 0008 teaches providing more than one healthcare centers with a Model Deconstruction Transfer (MDT) platform comprised of a variable library (VL), wherein each healthcare center enters at least one data set (generate a plurality of private data . . . distributions from the local private data) relevant to the health outcome of interest into the MDT platform and selects variables from the VL that are relevant to the health outcome of interest; and (b) generating at least one prediction model (PM0) for each healthcare center from the MDT platform (where the private data . . . distributions represent the local private data in aggregate used to create the trained actual model)) and does not include individual elements of the local private data (Yao Fig. 2 teaches, with Examiner annotations provided in text boxes:

[Image: Yao Fig. 2 with Examiner annotations (media_image1.png)]

Yao ¶ 0008 teaches providing more than one healthcare centers with a Model Deconstruction Transfer (MDT) platform comprised of a variable library (VL), wherein each healthcare center enters at least one data set (generate a plurality of private data distributions from the local private data) relevant to the health outcome of interest into the MDT platform and selects variables from the VL that are relevant to the health outcome of interest; and (b) generating at least one prediction model (PM0) for each healthcare center from the MDT platform (where the private data distributions represent the local private data in aggregate used to create the trained actual model); Yao ¶ 0079 teaches that [t]he dataset may include [protected health information (PHI)] and personal identifiers, if permitted by the policies of the [Participating Healthcare Center (PHC)] and any applicable HIPPA [sic] regulations. Alternatively, this dataset may be a de-identified or limited data set, which contains dates, but not other personal identifiers. It is to be understood that the dataset from any particular PHC is not shared with Company A, any of the other PHCs, or any ultimate users; Examiner construes “individual elements” as “actual private or restricted features of the local, private data” (see PGPUB3 ¶ 0016); Accordingly, Yao ¶ 0007 teaches to generate and validate healthcare prediction models based upon healthcare data obtained from multiple sources, without the need to de-identify or transfer clinical data beyond the physical and network boundaries of healthcare facilities (that is, does not include individual elements of the local private data));
* * *
and distributing the trained . . . model to the at least one system that lacks access to the restricted features (Yao ¶ 0057 teaches the [Participating Healthcare Center] may further utilize the [Model Deconstruction and Transfer] platform to extract statistical features and/or components from the prediction model and send (that is, distributing the trained . . . model) those features and/or components to a third party who can in turn reassemble . . . into a Model Component Library (MCL) (that is, and distributing the trained proxy model to the at least one system that lacks access to the restricted features)).
Kristal and Yao are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of Kristal pertaining to neural network training with synthetic data and without synthetic data from multiple sources with the generation, ranking, and selection of model performance parameters and cloud servers of Yao.
The motivation for doing so is to harness the massive amounts of healthcare data that exist among multiple sources and to turn that data into prediction models that are rich in diversity and prognostic value and are capable of being applied to patient populations that may or may not have accumulated years of healthcare data. (Yao ¶ 0057).
However, the combination of Kristal and Yao does not explicitly teach that the private data distributions are “private data statistical distributions,” and also, the combination does not explicitly teach -
. . . wherein the restricted features are absent from the set of proxy data;
* * *
But Park teaches “private data statistical distributions” (Park, left column of p. 494, “II. Related Work”, fourth paragraph, teaches [Markov Chain Monte Carlo (MCMC)] methods generally refer to a class of algorithms that draw samples from non-trivial probability (that is, statistical) distributions (that is, private data statistical distributions) through Markov chain simulations. Some special cases are . . . Gibbs sampler . . . . Among many choices, this paper uses Gibbs sampler for two reasons. First, Gibbs sampler only requires conditional distributions and no other tuning parameters. . . . Second, unlike [Metropolis-Hastings] algorithm, Gibbs sampler does not reject samples, thus Gibbs sampler is usually more efficient than MH algorithm when drawing high-dimensional samples).
Park also teaches -
* * *
. . . wherein the restricted features are absent from the set of proxy data (Park, left column of p. 497, “VI. Empirical Studies-B. Effect on Predictive Models”, first paragraph, teaches [p]rivacy preserving synthetic data can be publicly disclosed to answer various types of data mining research questions. If the statistical properties of the original data are well preserved in the PeGS-generated data, then the data mining results from the synthetic data (that is, the set of proxy data) should be significantly identical to those from the original data (that is, the restricted features are absent from the set of proxy data));
* * *
Kristal, Yao, and Park are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to modify the combination of Kristal and Yao pertaining to neural network training with synthetic data and without synthetic data from multiple sources that is based on generation, ranking, and selection of model performance parameters with the synthetic health data of Park.
The motivation for doing so is to synthesize artificial records while preserving the statistical characteristics of the original data to the extent possible in order to deliver results that are substantially identical to those obtained from the original dataset. (Park, Abstract).
8.	Claims 9 and 10 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20090037351 to Kristal et al. [hereinafter Kristal], in view of US Published Application 20130085773 to Yao et al. [hereinafter Yao] and Park et al., “Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data,” IEEE Int’l Conf on Healthcare Informatics (2013) [hereinafter Park], and further in view of US Published Application 20150154646 to Mishra et al. [hereinafter Mishra].
Regarding claim 9, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
However, the combination of Kristal, Yao, and Park does not explicitly teach -
wherein each private data server is communicatively coupled with a local storage system that stores the local private data.
But Mishra teaches -
	wherein each private data server is communicatively coupled with a local storage system that stores the local private data (Mishra ¶ 0041 teaches a client process (e.g., in the form of an application) may run privately on the user's local data by accessing the user's local database).
Kristal, Yao, Park, and Mishra are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Mishra teaches a system for securely storing, retrieving, sharing, and selling private personal data in relation to providing health care services. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to modify the combination of Kristal, Yao, and Park pertaining to neural network training with synthetic data and without synthetic data, including private data statistical distribution from multiple sources that is based on generation, ranking, and selection of model performance parameters with the secure storage of private data of Mishra.
The motivation for doing so is to provide for better utilization of private personal data for improved healthcare by experts. (Mishra ¶ 0054).
Regarding claim 10, the combination of Kristal, Yao, Park, and Mishra teaches all of the limitations of claim 9, as described above.
However, the combination of Kristal, Yao, and Park fails to explicitly teach wherein the local storage system includes at least one of the following: a RAID system, a file server, a network accessible storage device, a storage area network device, a local computer readable memory, a hard disk drive, an optical storage device, a tape drive, a tape library, and a solid state disk.
But Mishra teaches wherein the local storage system includes at least one of the following: a RAID system, a file server, a network accessible storage device, a storage area network device, a local computer readable memory, a hard disk drive, an optical storage device, a tape drive, a tape library, and a solid state disk. (Mishra ¶ 0081 teaches [p]rocessor 701 may be coupled to storage 703 (the local storage system), which may include a hard-disk drive (at least one of the following: . . . a hard disk drive) or other large capacity storage device).
Kristal, Yao, Park, and Mishra are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Mishra teaches a system for securely storing, retrieving, sharing, and selling private personal data in relation to providing health care services. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to modify the combination of Kristal, Yao, and Park pertaining to neural network training with synthetic data and without synthetic data, including private data statistical distribution from multiple sources that is based on generation, ranking, and selection of model performance parameters with the secure storage of private data of Mishra.
The motivation for doing so is to provide for better utilization of private personal data for improved healthcare by experts. (Mishra ¶ 0054).
9.	Claim 11 is rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20090037351 to Kristal et al. [hereinafter Kristal], in view of US Published Application 20130085773 to Yao et al. [hereinafter Yao] and Park et al., “Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data,” IEEE Int’l Conf on Healthcare Informatics (2013) [hereinafter Park], and further in view of US Published Application 20150154646 to Mishra et al. [hereinafter Mishra] and US Published Application 20150120329 to Rangadass et al. [hereinafter Rangadass].
Regarding claim 11, the combination of Kristal, Yao, Park, and Mishra teaches all of the limitations of claim 9, as described above. 
However, the combination of Kristal, Yao, Park, and Mishra fails to explicitly teach wherein the local storage system includes at least one of the following: a local database, a BAM server, a SAM server, a GAR server, BAMBAM server, and a clinical operating system server.
But Rangadass teaches wherein the local storage system includes at least one of the following: a local database, a BAM server, a SAM server, a GAR server, BAMBAM server, and a clinical operating system server (Rangadass ¶ 0062 teaches that [t]he architecture of the clinical operating system described herein may be implemented on a web server (clinical operating system server), wherein the web service may be located internally (local storage system) to an organization utilizing the clinical operating system . . .).
Kristal, Yao, Park, Mishra, and Rangadass are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data via statistical distributions. Mishra teaches a system for better utilization of private personal data for improved healthcare. Rangadass teaches a healthcare architecture for patient information. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of Kristal pertaining to neural network training with synthetic data and without synthetic data from multiple sources and the generation, ranking, and selection of model performance parameters of Yao, the statistical distribution of Park, and the private personal data for improved healthcare of Mishra with the healthcare access architecture of Rangadass.
The motivation for doing so is the major opportunity for healthcare application integration because health informatics is substantially standards-based. (Rangadass ¶ 0020).
10.	Claim 12 is rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20090037351 to Kristal et al. [hereinafter Kristal], in view of US Published Application 20130085773 to Yao et al. [hereinafter Yao] and Park et al., “Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data,” IEEE Int’l Conf on Healthcare Informatics (2013) [hereinafter Park], and further in view of US Published Application 20160260023 to Miserendino, Jr., et al. [hereinafter Miserendino].
Regarding claim 12, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
However, the combination of Kristal, Yao, and Park fails to explicitly teach wherein the model instructions comprise at least one of the following: a local command, a remote command, an executable file, a protocol command, and a selected command.
But Miserendino teaches wherein the model instructions comprise at least one of the following: a local command, a remote command, an executable file, a protocol command, and a selected command (Miserendino ¶ 0042 teaches system 100 and method 200 may generate the one or more models (model instructions). Training cluster 106 may include software code (instructions) necessary to generate machine learning models per particular machine learning algorithms and techniques. A model may include digital objects (listed in training list) that are (or are more likely than not) of a certain type (e.g., PDF, Windows executable, Linux executable, Microsoft Office files (executable file)). A machine-learning model may be generated on a single computer or multiple computers simultaneously).
Kristal, Yao, Park, and Miserendino are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data via statistical distributions. Miserendino teaches a system for managing digital object libraries used for training and testing of machine learning models. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of the combination of Kristal, Yao, and Park pertaining to neural network training on synthetic data, including statistical distributions, from multiple sources with the digital object libraries of Miserendino.
The motivation for doing so is a mechanism for efficiently managing, developing, and evaluating supervised or semi-supervised machine learning-based classification models based on massive datasets using a database and computer processing environment. (Miserendino ¶ 0015).
11.	Claims 14-16 are rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20090037351 to Kristal et al. [hereinafter Kristal], in view of US Published Application 20130085773 to Yao et al. [hereinafter Yao] and Park et al., “Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data,” IEEE Int’l Conf on Healthcare Informatics (2013) [hereinafter Park], and further in view of Poh et al., “Challenges in Designing an Online Healthcare Platform for Personalised Patient Analytics,” pp. 1-6 (IEEE 2014) [hereinafter Poh].
Regarding claim 14, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
However, the combination of Kristal, Yao, and Park fails to explicitly teach wherein the plurality of private data distributions are based on eigenvalues derived from the trained actual model parameters and the local private data.
But Poh teaches wherein the plurality of private data distributions are based on eigenvalues derived from the trained actual model parameters and the local private data (Poh left column at page 5, Section V.B, second full paragraph, teaches model adaptation [where a] background model is first trained on data samples aggregated from all patients. This model is also called a Universal background model or a world model. This model is then adapted with the training data of a particular patient in order to obtain a patient specific model. There are several ways to realize the adaptation, namely, maximum a posteriori adaptation, maximum likelihood linear regression, and adaptation via eigenvectors (the set of proxy data includes combinations of eigenvectors derived from trained actual model parameters and the local private data); Examiner notes that the scalar value associated with an eigenvector is an eigenvalue, and accordingly, eigenvalues are derived in the model adaptation of Poh).
Kristal, Yao, Park, and Poh are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Poh teaches healthcare modeling to render a model as patient-specific. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of the combination of Kristal, Yao, and Park pertaining to neural network training based on private and synthetic data with the model adaptation of Poh.
The motivation for doing so is to help clinicians make better use of their time to process data through more adequate data processing and analytical tools. (Poh, Abstract).
Regarding claim 15, the combination of Kristal, Yao, and Park teaches all of the limitations of claim 1, as described above.
However, the combination of Kristal, Yao, and Park fails to explicitly teach wherein the set of proxy data includes combinations of eigenvectors derived from the trained actual model parameters and the local private data.
But Poh teaches wherein the set of proxy data includes combinations of eigenvectors derived from the trained actual model parameters and the local private data (Poh left column at page 5, Section V.B, second full paragraph, teaches model adaptation [where a] background model is first trained on data samples aggregated from all patients. This model is also called a Universal background model or a world model. This model is then adapted with the training data of a particular patient in order to obtain a patient specific model. There are several ways to realize the adaptation, namely, maximum a posteriori adaptation, maximum likelihood linear regression, and adaptation via eigenvectors (the set of proxy data includes combinations of eigenvectors derived from trained actual model parameters and the local private data)).
Kristal, Yao, Park, and Poh are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Poh teaches healthcare modeling to render a model as patient-specific. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of the combination of Kristal, Yao, and Park pertaining to neural network training based on private and synthetic data with the model adaptation of Poh.
The motivation for doing so is to help clinicians make better use of their time to process data through more adequate data processing and analytical tools. (Poh, Abstract).
Regarding claim 16, the combination of Kristal, Yao, Park, and Poh teaches all of the limitations of claim 15, as described above.
Poh teaches wherein the set of proxy data comprises linear combinations of the eigenvectors. (Poh left column at page 5, Section V.B, second full paragraph, teaches model adaptation [where a] background model is first trained on data samples aggregated from all patients. This model is also called a Universal background model or a world model. This model is then adapted with the training data of a particular patient in order to obtain a patient specific model. There are several ways to realize the adaptation, namely, maximum a posteriori adaptation, maximum likelihood linear regression (proxy data comprises linear combinations of the eigenvectors), and adaptation via eigenvectors).
12.	Claim 17 is rejected under 35 U.S.C. § 103 as being unpatentable over US Published Application 20090037351 to Kristal et al. [hereinafter Kristal], in view of US Published Application 20130085773 to Yao et al. [hereinafter Yao] and Park et al., “Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data,” IEEE Int’l Conf on Healthcare Informatics (2013) [hereinafter Park], and further in view of Poh et al., “Challenges in Designing an Online Healthcare Platform for Personalised Patient Analytics,” pp. 1-6 (IEEE 2014) [hereinafter Poh] and Mitra et al., “Eigen-Profiles of Spatio-Temporal Fragments for Adaptive Region-Based Tracking,” pp. 1497-1500 (IEEE 2012) [hereinafter Mitra].
Regarding claim 17, the combination of Kristal, Yao, Park, and Poh teaches all of the limitations of claim 15, as described above.
	However, the combination of Kristal, Yao, Park, and Poh fails to explicitly teach wherein the eigenvectors include at least one of the following: an eigenpatient, an eigenprofile, an eigendrug, an eigenhealth record, an eigengenome, an eigenproteome, an eigenRNA profile, and an eigenpathway.
	But Mitra teaches wherein the eigenvectors include at least one of the following: an eigenpatient, an eigenprofile, an eigendrug, an eigenhealth record, an eigengenome, an eigenproteome, an eigenRNA profile, and an eigenpathway (Mitra, right column at page 1497, Section I, first full paragraph, teaches a [general] novel space-time descriptor which we call an Eigenprofile (eigenprofile). Estimation of [eigen profile] is equivalent to joint diagonalization of these covariance matrices and they form a matrix of orthonormal vectors).
	Kristal, Yao, Park, Poh, and Mitra are from the same or similar field of endeavor. Kristal teaches neural network training with and without synthetic data from multiple sources. Yao teaches prediction models generated from multiple sources. Park teaches the generation of large-scale privacy-safe synthetic health data. Poh teaches healthcare modeling to render a model as patient-specific. Mitra teaches a novel space-time descriptor called an Eigenprofile. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention to implement the teachings of Kristal, Yao, Park, and Poh pertaining to neural network training based on private data and synthetic data and modeling with the space-time Eigenprofile of Mitra.
	The motivation for doing so is to incrementally build models for a target using an Eigenprofile. (Mitra right column at page 1497, Section I, first paragraph).
Response to Arguments
13.	Applicant's arguments have been fully considered, and Examiner responds below.
14.	Applicant argues that Yao does not teach the feature of Applicant’s claims, using claim 38 as an example thereof, which is reproduced below:

[Image: media_image2.png, reproduction of claim 38]

Unlike (B) of Applicant’s claimed features, “which shows statistical distributions that (i) are generated from the private data [that] includes restricted protected health information features yet (ii) do not include individual elements of the private data, Yao teaches deconstructing a machine learning model to extract statistical features, as at (2) [of Yao]. See Yao [0008] and [0057].” (Response at p. 13). Applicant also submits that Yao does not teach limitations (c) - (e) of the Figure above.
Examiner respectfully disagrees because the combination of Kristal and Yao teaches the features of Applicant’s claims as set out in detail above. Applicant appears to argue that the claims recite features that Yao does not teach.
However, Yao teaches the feature of, inter alia, “local private data accessible to a private server” (item “A” in the image above), as set out in detail in the rejections above (Yao teaches the feature of local private data of a participating healthcare center (PHC-A) data, as set out in Fig. 1 of Yao).

[Image: media_image3.png]

Moreover, Yao ¶ 0079 teaches that [t]he dataset may include [protected health information (PHI)] and personal identifiers, if permitted by the policies of the [Participating Healthcare Center (PHC)] and any applicable HIPPA [sic] regulations, which generates a prediction model PM0-A. Yao teaches the further features of “generating a plurality of private data . . . distributions” via a deconstruction and Model Component Library that in turn is used to produce a second prediction model PM1 (see Yao, Fig. 2).
Moreover, the rejections clearly set forth which claim limitations are taught by each of the prior art references, and the reason why it would be obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to combine their teachings, and Applicant has not explained why the cited prior art references cannot be combined in the manner set forth in the rejection.
With respect to advancing the prosecution of the application, Examiner suggests that the features set out in the above figure be positively recited in the claims; also, Applicant may consider further clarification of aspects of the claims as supported by the Specification, such as that of the similarity score 490 (see PGPUB ¶ 0097 & Fig. 4). Upon receipt of a written response, Examiner will consider the cited prior art and conduct a further search in view thereof.
Conclusion
15.	THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
16.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
(Securosis, "Understanding and Selecting Data Masking Solutions: Creating Secure and Useful Data," (10 August 2012)) teaches data masking platforms in which a proxy data substitute retains part of the value of the original.
17.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN L. SMITH whose telephone number is (571) 272-5964. Normally, the examiner is available on Monday-Thursday 0730-1730. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KAKALI CHAKI, can be reached at 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/K.L.S./
Examiner, Art Unit 2122

/KAKALI CHAKI/
Supervisory Patent Examiner, Art Unit 2122

        1 “PGPUB” refers to Applicant’s U.S. Published Application 20180018590, entitled “Distributed Machine Learning Systems, Apparatus, and Methods,” to Szeto et al.