DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-20 are pending.
This communication is in response to the communication filed 2/19/2019.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for pre-AIA applications, the applicant) regards as the invention.
Claims 1, 14-16, and 20 recite the limitations “the output”, “the data”, “the deterministic model”, and “the user”. There is insufficient antecedent basis for these limitations in the claims. The dependent claims are rejected based on their dependency on indefinite claims. For purposes of examination, the deterministic model is interpreted as the discriminative model.
Claims 4-5 recite the limitation “the outputs”. There is insufficient antecedent basis for this limitation in the claims.

Claim Rejections - 35 USC § 103
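In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.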
This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-9, 11-12, 14-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma et al. US2018/0233233 in view of Karlov et al. US2009/0024332.
As per claim 1, Sharma teaches
a method of training a discriminative model to approximate the output of a probabilistic graphical model, comprising: (Sharma par. 24 teaches that a database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient; a discriminative model is interpreted as a neural network)
receiving by the discriminative model samples from said probabilistic graphical model; and (Sharma fig. 3 and associated paragraphs, par. 24 teaches inputting variables to the trained machine learning model to train a machine-learning based classifier to estimate a disease trajectory)
training the discriminative model using said samples, (Sharma fig. 3 and associated paragraphs, par. 24 teaches receiving input data to train a machine-learning based classifier to estimate a disease trajectory)
wherein some of the data of the samples has been masked (Sharma par. 20 teaches that a dynamical system identification algorithm uses statistical methods to estimate unknown parameters and hidden states of the dynamical system from the available measured data).
Sharma does not specifically teach the following limitations, which are met by Karlov: wherein some of the data of the samples has been masked to allow the deterministic model to produce data which is robust to the user failing to input at least one symptom (Karlov par. 55, 183 teaches a variety of data from a subject who presents with such symptoms or is healthy, and a hidden list of diagnoses that may be used for hypotheses).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to allow the model to produce data which is robust to the user failing to input at least one symptom as taught by Karlov with the motivation to help accurately model disease combinations and handle multiple diagnoses as non-exclusive hypotheses (Karlov par. 183).
As per claim 2, Sharma and Karlov teach all the limitations of claim 1 and further teach wherein the masking is based on a uniform distribution (Karlov par. 56 teaches a uniform probability used over a set of data). 
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to use a uniform distribution as taught by Karlov with the motivation to help accurately model disease combinations and handle multiple diagnoses as non-exclusive hypotheses (Karlov par. 183).
As per claim 3, Sharma and Karlov teach all the limitations of claim 1 and further teach wherein the discriminative model is a neural network (Sharma par. 20 teaches using neural networks and deep learning methods for system identification).
As per claim 4, Sharma and Karlov teach all the limitations of claim 3 and further teach wherein the neural network is a neural network comprising a plurality of sub-networks that can approximate the outputs of the probabilistic graphical model (Sharma par. 22, 24 teaches selecting sub-groups for estimating disease trajectories based on the state variables of a patient; here the sub-networks of the neural network are interpreted as a neural network with multiple layers of processing, where a first layer of processing is based on processing a subgroup).
As per claim 5, Sharma and Karlov teach all the limitations of claim 3 and further teach wherein the neural network is a single neural network that can approximate the outputs of the probabilistic graphical model (Sharma par. 20 teaches a dynamical system identification algorithm, which is stated to be singular or may use a plurality of statistical methods).
As per claim 8, Sharma teaches
A method of deriving a vectorised representation of evidence, the method comprising: training a discriminative model to approximate the output of a probabilistic graphical model, the probabilistic graphical model by
inputting into the discriminative model samples from said probabilistic graphical model; and (Sharma fig. 3 and associated paragraphs, par. 24 teaches inputting variables to the trained machine learning model)
training the discriminative model using said samples, (Sharma fig. 3 and associated paragraphs, par. 24 teaches receiving input data to train a machine-learning based classifier to estimate a disease trajectory)
the discriminative model being a neural net comprising a plurality of layers, (Sharma par. 20 teaches using neural networks and deep learning methods for system identification and a dynamical system identification algorithm, which is stated to be singular or may use a plurality of statistical methods)
Sharma par. 16, 24 teaches various data as patient state vectors and using neural networks, but does not specifically teach the following limitations met by Karlov, wherein the input layer to the neural net comprises the evidence, the vectorised representation being extracted from an inner layer of the neural net (Karlov fig. 9 and associated paragraphs, par. 165 teaches splitting data points with respect to distance layers, where splitting parameters and measurement noise in the test values helps to make patterns more statistically stable, here extraction from inner layers as explained in the specification par. 138 is interpreted to include data from nodes that have a smaller distance from one another so that the data is more similar than independent). 
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to use vectorised representations extracted from an inner layer as taught by Karlov with the motivation that sampling increases the number of data points used for computing the discrete neighbor counting patterns (Karlov par. 165). Here, clinical test data can be analyzed and mined for improved disease diagnosis (Karlov par. 8).
As per claim 9, Sharma and Karlov teach all the limitations of claim 8 and further teach wherein the vectorised representation is extracted from the activations of an inner layer (Karlov fig. 9 and associated paragraphs, par. 165 teaches splitting data points with respect to distance layers).
As per claim 11, Sharma and Karlov teach all the limitations of claim 8 and further teach wherein the vectorised representation is used for classification (Sharma par. 23-24 teaches using a trajectory clustering algorithm to group the database of diverse trajectories into several clusters where disease trajectories may be used to train a machine-learning based classifier, interpreted as classification).
As per claim 12, Sharma and Karlov teach all the limitations of claim 8 and further teach wherein the vectorised representation is used for clustering (Sharma par. 23 teaches using a trajectory clustering algorithm to group the database of diverse trajectories into several clusters, the whole state vector or some predefined subset of state variables in the state vector may be used to find the most similar cluster).
As per claim 14, Sharma teaches 
A non-transitory carrier medium comprising computer readable code configured to cause a computer to perform a method of training a discriminative model to approximate the output of a probabilistic graphical model, comprising: (Sharma par. 24, 47 teaches systems and methods using computers, processors, memory, and storage devices, where a database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient, a discriminative model is interpreted as a neural network)
receiving by the discriminative model samples from said probabilistic graphical model; and
training the discriminative model using said samples, (Sharma fig. 3 and associated paragraphs, par. 24 teaches receiving input data to train a machine-learning based classifier to estimate a disease trajectory)
wherein some of the data of the samples has been masked (Sharma par. 20 teaches a dynamical system identification algorithm uses statistical methods to estimate unknown parameters and hidden states of the dynamical system from the available measured data).
Sharma does not specifically teach the following limitations, which are met by Karlov: wherein some of the data of the samples has been masked to allow the deterministic model to produce data which is robust to the user failing to input at least one symptom (Karlov par. 55, 183 teaches a variety of data from a subject who presents with such symptoms or is healthy, and a hidden list of diagnoses that may be used for hypotheses).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to allow the model to produce data which is robust to the user failing to input at least one symptom as taught by Karlov with the motivation to help accurately model disease combinations and handle multiple diagnoses as non-exclusive hypotheses (Karlov par. 183).
As per claim 15, Sharma teaches 
A non-transitory carrier medium comprising computer readable code configured to cause a computer to perform a method of deriving a vectorised representation of evidence, the method comprising: (Sharma par. 24, 47 teaches systems and methods using computers, processors, memory, and storage devices, where a database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient, a discriminative model is interpreted as a neural network)
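training a discriminative model to approximate the output of a probabilistic graphical model, the probabilistic graphical model by: (Sharma par. 24 teaches that a database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient; a discriminative model is interpreted as a neural network)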
inputting into the discriminative model samples from said probabilistic graphical model; and (Sharma fig. 3 and associated paragraphs, par. 24 teaches inputting variables to the trained machine learning model)
training the discriminative model using said samples, (Sharma fig. 3 and associated paragraphs, par. 24 teaches receiving input data to train a machine-learning based classifier to estimate a disease trajectory)
the discriminative model being a neural net comprising a plurality of layers, (Sharma par. 20 teaches using neural networks and deep learning methods for system identification and a dynamical system identification algorithm, which is stated to be singular or may use a plurality of statistical methods).
Sharma par. 16, 24 teaches various data as patient state vectors and using neural networks, but does not specifically teach the following limitations met by Karlov, wherein the input layer to the neural net comprises the evidence, the vectorised representation being extracted from an inner layer of the neural net (Karlov fig. 9 and associated paragraphs, par. 165 teaches splitting data points with respect to distance layers, where splitting parameters and measurement noise in the test values helps to make patterns more statistically stable, here extraction from inner layers as explained in the specification par. 138 is interpreted to include data from nodes that have a smaller distance from one another so that the data is more similar than independent). 
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to use vectorised representations extracted from an inner layer as taught by Karlov with the motivation that sampling increases the number of data points used for computing the discrete neighbor counting patterns (Karlov par. 165). Here, clinical test data can be analyzed and mined for improved disease diagnosis (Karlov par. 8).
As per claim 16, Sharma teaches
A system for training a discriminative model to approximate the output of a probabilistic graphical model, comprising a processor, said processor comprising a probabilistic graphical model and a discriminative model, wherein the processor is adapted to: (Sharma par. 24, 47 teaches systems and methods using computers, processors, memory, and storage devices, where a database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient, a discriminative model is interpreted as a neural network)
receive by the discriminative model samples from said probabilistic graphical model; and (Sharma fig. 3 and associated paragraphs, par. 24 teaches inputting variables to the trained machine learning model to train a machine-learning based classifier to estimate a disease trajectory)
train the discriminative model using said samples, (Sharma fig. 3 and associated paragraphs, par. 24 teaches receiving input data to train a machine-learning based classifier to estimate a disease trajectory)
wherein some of the data of the samples has been masked (Sharma par. 20 teaches a dynamical system identification algorithm uses statistical methods to estimate unknown parameters and hidden states of the dynamical system from the available measured data) 
Sharma does not specifically teach the following limitations, which are met by Karlov: wherein some of the data of the samples has been masked to allow the deterministic model to produce data which is robust to the user failing to input at least one symptom (Karlov par. 55, 183 teaches a variety of data from a subject who presents with such symptoms or is healthy, and a hidden list of diagnoses that may be used for hypotheses).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to allow the model to produce data which is robust to the user failing to input at least one symptom as taught by Karlov with the motivation to help accurately model disease combinations and handle multiple diagnoses as non-exclusive hypotheses (Karlov par. 183).
As per claim 17, Sharma and Karlov teach all the limitations of claim 16 and further teach wherein the processor comprises a graphical processing unit (Sharma par. 44 teaches a graphical processing unit as a display device).
As per claim 18, Sharma and Karlov teach all the limitations of claim 16 and further teach wherein the masking is based on a uniform distribution (Karlov par. 56 teaches a uniform probability used over a set of data).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to use a uniform distribution as taught by Karlov with the motivation to help accurately model disease combinations and handle multiple diagnoses as non-exclusive hypotheses (Karlov par. 183).
As per claim 20, Sharma teaches
A system for deriving a vectorised representation of evidence, the system comprising a processor adapted to: (Sharma par. 24, 47 teaches systems and methods using computers, processors, memory, and storage devices, where a database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient, a discriminative model is interpreted as a neural network)
train a discriminative model to approximate the output of a probabilistic graphical model, the probabilistic graphical model by: (Sharma par. 24 teaches database of disease trajectories may be used to train a machine-learning based classifier to estimate a disease trajectory based on the state variables of a patient, a discriminative model is interpreted as a neural network)
inputting into the discriminative model samples from said probabilistic graphical model; and
training the discriminative model using said samples, (Sharma fig. 3 and associated paragraphs, par. 24 teaches receiving input data to train a machine-learning based classifier to estimate a disease trajectory)
the discriminative model being a neural net comprising a plurality of layers, (Sharma par. 20 teaches using neural networks and deep learning methods for system identification and a dynamical system identification algorithm, which is stated to be singular or may use a plurality of statistical methods).
Sharma par. 16, 24 teaches various data as patient state vectors and using neural networks, but does not specifically teach the following limitations met by Karlov, wherein the input layer to the neural net comprises the evidence, the vectorised representation being extracted from an inner layer of the neural net (Karlov fig. 9 and associated paragraphs, par. 165 teaches splitting data points with respect to distance layers, where splitting parameters and measurement noise in the test values helps to make patterns more statistically stable, here extraction from inner layers as explained in the specification par. 138 is interpreted to include data from nodes that have a smaller distance from one another so that the data is more similar than independent). 
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma to use vectorised representations extracted from an inner layer as taught by Karlov with the motivation that sampling increases the number of data points used for computing the discrete neighbor counting patterns (Karlov par. 165). Here, clinical test data can be analyzed and mined for improved disease diagnosis (Karlov par. 8).




Claims 6-7, 10, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma et al. US2018/0233233 in view of Karlov et al. US2009/0024332 in further view of Glass et al. US2017/0061324.
As per claim 6, Sharma and Karlov teach all the limitations of claim 1, but do not teach the following limitations taught or suggested by Glass, wherein the probabilistic graphical model is a noisy-OR model (Glass par. 31 teaches a noisy-OR model based on symptoms and findings indicating disease).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma and Karlov to use a noisy-OR model as taught by Glass with the motivation to produce accurate and calibrated confidences for candidate diagnoses using an inference graph constructed from the sub-question evidence (Glass par. 20).
As per claim 7, Sharma and Karlov teach all the limitations of claim 1, but do not teach the following limitations taught or suggested by Glass wherein the probabilistic graphical model expresses probabilistic relationships between variables comprising diagnosis, symptoms and risk factors for medical diagnosis (Glass par. 23, 31, 46 teaches combining probabilistic causations, a causal model including clinical factors and diagnosis, here symptoms and lab values may be used to determine relationships between diagnosis and health risks). 
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma and Karlov to express probabilistic relationships between variables as taught by Glass with the motivation to produce accurate and calibrated confidences for candidate diagnoses using an inference graph constructed from the sub-question evidence (Glass par. 20).
As per claim 10, Sharma and Karlov teach all the limitations of claim 8, but do not teach the following limitations taught or suggested by Glass wherein the probabilistic graphical model expresses probabilistic relationships between variables comprising diagnosis, symptoms and risk factors for medical diagnosis, wherein evidence is the presence or absence of symptoms and risk factors (Glass fig. 1 and associated paragraphs, par. 23, 31, 46 teaches an evidence graph of disease and symptoms, and related risk, combining probabilistic causations, a causal model including clinical factors and diagnosis; here symptoms and lab values may be used to determine relationships between diagnosis and health risks, and evidence may be either present symptoms or risk).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma and Karlov to express probabilistic relationships between variables as taught by Glass with the motivation to produce accurate and calibrated confidences for candidate diagnoses using an inference graph constructed from the sub-question evidence (Glass par. 20).
As per claim 13, Sharma and Karlov teach all the limitations of claim 8, but do not teach the following limitations taught or suggested by Glass wherein the vectorised representation is used for interpretation of node relationships within the probabilistic graphical model (Glass par. 4, 68 teaches a graphical model that is useful for formalizing relationships, and an inference graph in which the probability of any node is conditionally independent of the probability for all other nodes, given the probability of its parents).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma and Karlov to interpret node relationships as taught by Glass with the motivation to produce accurate and calibrated confidences for candidate diagnoses using an inference graph constructed from the sub-question evidence (Glass par. 20).
As per claim 19, Sharma and Karlov teach all the limitations of claim 16, but do not teach the following limitations taught or suggested by Glass wherein the probabilistic graphical model expresses probabilistic relationships between variables comprising diagnosis, symptoms and risk factors for medical diagnosis (Glass par. 23, 31, 46 teaches combining probabilistic causations, a causal model including clinical factors and diagnosis, here symptoms and lab values may be used to determine relationships between diagnosis and health risks).
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to modify the systems and methods as taught by Sharma and Karlov to express probabilistic relationships between variables as taught by Glass with the motivation to produce accurate and calibrated confidences for candidate diagnoses using an inference graph constructed from the sub-question evidence (Glass par. 20).

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1, 3-7, 10, 12-14, 16-17, and 19 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent Application 16/277,975, for which a notice of allowance has been issued. Although the claims at issue are not identical, they are not patentably distinct from each other because all of the limitations of instant claim 1 are recited in reference claims 1 and 7.
The instant claims include specific limitations that are or were recited at various points in prosecution in the reference application: neural network (reference application claim 1), sub-networks (claim 10), single neural network (claim 11), noisy-OR model (claim 12), expressing probabilistic relationships (claim 1), vectorised representation used for classification, clustering, and interpretation of nodes (claim 1 on 1/4/2021), graphical processing unit (claim 1), and uniform distribution (claim 1).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAY M PATEL whose telephone number is (571)272-6793.  The examiner can normally be reached on Monday-Thursday 7AM-5PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Victoria Augustine, can be reached on (313) 446-4858. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/JAY M. PATEL/Examiner, Art Unit 3686