DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 1/13/2021 have been fully considered but they are not persuasive. Applicant’s first argument is as follows:
“Thus as amended, the present invention takes a collection of data sources (e.g., public depositories, surveys, clinical data, etc.) from a plurality of members of a population and extracts a set a features for the at least one member of the population to be used in the prediction of end stage renal disease for the at least one member. (see Figure 1) None of the cited references teach such an extraction process from a pool of data of multiple members of the population. See, e.g., filed application, paras. [0012]- [0016] ("[b]ecause of the diversity of sources from which input data 302 may be comprised, a data feature extraction process may be implemented to identify data variables from the various sources.")…
The passages cited by the Examiner in Anderberg do not teach or describe the extraction process of identifying data variables from multiple data sources having data relating to a plurality of members to extract features of one member to be used in the predictive modeling as recited in the independent claims. In other words, the system in Anderberg, at most, discusses collection of certain member data by the clinician, but does not teach or suggest the extraction process of the present invention, much less extraction of features from data from a pluralty of data sources having data relating to a plurality of members as recited.”
The 10/13/2020 Non-Final Rejection recites that a plurality of members of a member population can be found in [0101]-[0128] of Anderberg.  [0102] teaches “approximately 250 adults undergoing radiographic/angiographic procedures…are enrolled.  To be enrolled in the study, each patient must meet all of the following inclusion criteria and none of the following exclusion criteria [wherein the inclusion criteria includes] males and females 18 years of age or older, undergoing a radiographic/angiographic procedure…”  Determining whether or not the inclusion 
Applicant’s second argument is as follows:
“Based on the discussions above, Applicant also respectfully submits that none of the cited references teach or suggest the use of recited features of the following dependent claims for example: 
i. claims 4, 12, 18: use of holdout data to determine the selection of features which result in a model with the greatest accuracy (none of the references cited discuss selection of features which result in a model with the greatest accuracy much less using holdout data to determine the selection of features as claimed);
ii. claim 6: use of extracted features as listed in claim 6; 
iii. claims 7, 11, 17: identifying various predictor categories; and analyzing each of the predictor categories to determine its predictor value by using claims data obtained from members of a health insurance provider to which the predictive model was applied.
[Ser. No. 15/057,091; Response to Office Action of 10/13/2020]
Applicant respectfully submits that the cited references also do not teach or suggest the other limitations of independent claim 8 (and claim 15), of the present invention. For example, none of the cited references teach or suggest the limitations of: 
1. applying a plurality of models to the first set of features which identify relationships between characteristics of the data and progression of chronic kidney disease to the requirement of dialysis in the described patient(s);
2. comparing the relationships identified by the plurality of models to data representing actual patient outcomes; and
3. selecting one of the plurality of the applied models with the relationship that most accurately reflects the actual patient outcome.”

The limitation of “determine the selection of features which result in a model with the greatest accuracy” appears to be indefinite and is discussed in further detail in the rejection under 35 U.S.C. 112(b) below.
The remainder of Applicant's cited arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant’s third argument is as follows:
 “The Examiner cites to the Anderberg reference for these limitations, stating that Woods "teaches a combination of multiple classifiers via dynamic classifier selection such that only the output of the most likely to be correct single classifier is considered in the final classification decision (para. 1). Office Action dated 10/13/2020, paras. 50-54, 81-86. These portions of Woods merely teach the use of multiple classifiers - not distinct models as claimed. See filed application, para. [0019] (discussing neural network, logistic regression, decision tree modeling). These portions of Woods do not teach or suggest the application of a plurality of models as recited in the claims at-issue.”
None of the claims require “distinct models.”  A plurality of predictive models does not require the use of different algorithms (e.g., neural network, logistic regression and decision tree).  Multiple neural networks trained on different data sets are a plurality of predictive models.
Applicant’s third argument continues as follows:
“For example, Woods teaches the method of combining classifiers using estimates of each individual classifier's accuracy, but does not teach or suggest anything about using a plurality of models to predict a risk score for end stage renal disease as claimed, nor is there any support for a motivation to combine the cited references as asserted.”
The Woods reference was introduced to teach using multiple classifiers instead of a single classifier, as taught by Anderberg, to increase classification accuracy (pg. 8 of 10/13/2020 Non-Final Rejection).  Furthermore, Examiner stated that Woods 
Applicant’s fourth argument is as follows:
“Using an applicant's disclosure as a blueprint to reconstruct the claimed invention from isolated pieces of the prior art contravenes the statutory mandate of §103 which requires judging obviousness at the point in time when the invention was made. See Grain Processing Corp. v. American Maize-Prods. Co., 840 F.2d 902, 907 (Fed. Cir. 1988).”
In response to applicant's argument that the examiner's conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning.  But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper.  See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 4, 12 and 18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the enablement requirement.  The claims contain subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.
The limitation “the first set of features to which the predictive model is applied is selected by verifying the features using holdout data to determine the selection of features which result in a model with the greatest accuracy” does not have any support in the Specification or Drawings.  Examiner could not find any use of the terms “holdout,” “accuracy” or mention of verifying features outside of the claims.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4, 12 and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The limitation “the first set of features to which the predictive model is applied is selected by verifying the features using holdout data to determine the selection of features which result in a model with the greatest accuracy” is considered indefinite because “a model with the greatest accuracy” does not define a frame of reference (i.e., how can you prove it’s the greatest and as compared to the accuracy of what other models?).  Examiner could not find support in the Specification to cure this deficiency.  
Different models will perform better or worse than others on input data based on the training data used to train the respective models and the type of classification task.  For example, a classifier trained on a dataset with predominantly a first group defined by a first demographic and first type of medical history will likely perform well with other members that also meet the criteria of the first group but would likely be less accurate on members of groups having other demographic populations and medical histories.  Therefore, “greatest accuracy” via a “selection of features” must be referenced to a variable to be optimized or a clear objective.  For the purposes of examination, Examiner will interpret “result in a model with the greatest accuracy” as “result in a model with high accuracy.”
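For illustration only, and not as a characterization of the Specification or of any cited reference, the following minimal sketch shows why “greatest accuracy” needs a frame of reference. The result depends on which candidate feature subsets are compared, which holdout data is used, and how ties are broken. The nearest-centroid classifier, the toy data, and every function name here are hypothetical assumptions chosen only to keep the example self-contained.

```python
from itertools import combinations

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit(train, features):
    """Nearest-centroid model restricted to the given feature indices (hypothetical)."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append([x[i] for i in features])
    return {y: centroid(rows) for y, rows in by_label.items()}

def predict(model, x, features):
    """Return the label whose centroid is closest to x on the selected features."""
    point = [x[i] for i in features]
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(point, c))
    return min(model, key=lambda y: sqdist(model[y]))

def holdout_accuracy(train, holdout, features):
    """Fraction of holdout samples classified correctly."""
    model = fit(train, features)
    hits = sum(predict(model, x, features) == y for x, y in holdout)
    return hits / len(holdout)

def best_feature_subset(train, holdout, n_features):
    """Score every non-empty feature subset; ties go to the earliest (smallest) subset."""
    candidates = [c for k in range(1, n_features + 1)
                  for c in combinations(range(n_features), k)]
    scores = {c: holdout_accuracy(train, holdout, c) for c in candidates}
    best = max(candidates, key=lambda c: scores[c])
    return best, scores

# Toy data (assumed): feature 0 separates the classes, feature 1 is mostly noise.
train = [((1.0, 5.0), 0), ((2.0, 1.0), 0), ((8.0, 4.0), 1), ((9.0, 1.0), 1)]
holdout = [((1.5, 1.0), 0), ((2.5, 5.0), 0), ((7.5, 1.5), 1), ((8.5, 4.5), 1)]

best, scores = best_feature_subset(train, holdout, n_features=2)
print(best, scores)
```

In this toy case two subsets reach the same holdout accuracy, so which one is “greatest” is settled only by the stated tie-breaking rule and the stated set of candidates.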
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-7 are rejected under 35 U.S.C. 103 as being unpatentable over Anderberg et al (US 2017/0363642) in view of Beniwal (NPL: “Classification and feature selection techniques in data mining”).
For claim 1, Anderberg teaches a method for predicting the onset of end stage renal disease in a population ([0037]-[0041], [0114]-[0118]) suffering from chronic kidney disease (Abstract) comprising the steps of: 
receiving health related patient data from a plurality of sources (body fluid samples, [0042]; demographic information, medical history and clinical variables, [0043]) containing health related patient data for a plurality of members of a member population (e.g., [0101]-[0128]); 
performing an extraction process upon the received data of the plurality of members of the member population to identify variables (assay method, [0042]) from the received data (kidney injury markers for the evaluation of renal injury, [0015]) and to extract features (demographic information, medical history and clinical variables, [0043]) that describe at least one member of the population (inclusion and exclusion criteria, [0102]-[0127]); 
applying a predictive model (neural networks, [0038] and [0083]) to the extracted features that identify the relationships between characteristics of the data and the transition from chronic kidney disease to end stage renal disease for at least one member to generate a risk score for that member (concentrations of kidney injury markers can be converted to corresponding probability scores, [0040] and [0083]); and 
alerting a predetermined percentage of members according to risk scores ([0100]).
Anderberg fails to distinctly disclose:
processing the extracted features to generate a first set of features including a member's demographic profile, clinical profile, and behavior profile; 
However, Beniwal teaches that preprocessing data is needed before applying any kind of data mining algorithm (¶1, §2), including data cleaning and attribute selection.
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to pre-process Anderberg’s extracted features in order to remove noisy data, irrelevant attributes and missing data (¶1, §2).
The combination of Anderberg and Beniwal as defined above teaches:
processing the extracted features to generate a first set of features including a member's demographic profile, clinical profile, and behavior profile (after pre-processing features discussed in [0043] and [0102]-[0104]), wherein the member’s demographic profile includes the member’s age or gender ([0043] and [0103]) and wherein the member’s clinical profile includes previous health conditions or surgeries ([0043] and [0104]). 
For claim 2, Anderberg as modified by Beniwal teaches all of the limitations of claim 1 as cited above and Beniwal further teaches:
the step of processing the extracted features is performed using a summarization process (“2. Data Preprocessing” teaches discretization of continuous attributes into a few discrete values), a standardization process (“2. Data Preprocessing” teaches data integration to remove inconsistencies), and a filtration process (“2. Data Preprocessing” teaches attribute selection for filtering irrelevant attributes).
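For illustration only, the three kinds of preprocessing named above (discretization, removal of inconsistencies, and attribute filtering) can be sketched as follows. This is a simplified example under assumed inputs, not Beniwal's procedure: the record fields, cut points, alias table, and function names are all hypothetical, and dropping constant attributes is used as one simple stand-in for attribute selection.

```python
def discretize(value, cut_points):
    """Summarization: map a continuous value to a bin index."""
    return sum(value >= c for c in cut_points)

def standardize_gender(raw):
    """Standardization: collapse inconsistent codes into one vocabulary."""
    aliases = {"m": "M", "male": "M", "f": "F", "female": "F"}
    return aliases.get(raw.strip().lower(), "unknown")

def drop_constant_attributes(records):
    """Filtration: remove attributes that take a single value across all records."""
    keep = [k for k in records[0] if len({r[k] for r in records}) > 1]
    return [{k: r[k] for k in keep} for r in records]

def preprocess(records):
    """Apply the three steps in sequence to a list of raw records."""
    cleaned = [
        {
            "age_band": discretize(r["age"], (40, 65)),  # assumed cut points
            "gender": standardize_gender(r["gender"]),
            "site": r["site"],
        }
        for r in records
    ]
    return drop_constant_attributes(cleaned)

# Hypothetical raw records with inconsistent gender codes and a constant "site" field.
records = [
    {"age": 34, "gender": "male", "site": "A"},
    {"age": 52, "gender": "F", "site": "A"},
    {"age": 71, "gender": " f ", "site": "A"},
]

print(preprocess(records))
```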
For claim 3, Anderberg as modified by Beniwal teaches all of the limitations of claim 1 as cited above and Anderberg further teaches:
the predictive model applied is selected from a list comprising a neural network, logistic regression, or a decision tree (neural network, [0038]).
For claim 4, Anderberg as modified by Beniwal teaches all of the limitations of claim 1 as cited above and Beniwal further teaches:
the first set of features to which the predictive model is applied is selected by verifying the features using holdout data (via ROC, [0039] of Anderberg) 
Anderberg as modified by Beniwal as defined above fails to teach:
determine the selection of features which result in a model with the greatest accuracy
However, Beniwal teaches in §3 that “one of the major goals of a classification algorithm is to maximize the predictive accuracy.”
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to select features which result in a model with the greatest accuracy in order to maximize predictive accuracy.
For claim 5, Anderberg as modified by Beniwal teaches all of the limitations of claim 1 as cited above and Anderberg further teaches:
the received data comprises at least one of: membership data, participation in programs to improve the health of a participant, data representing demographics of the group of individuals (see rejection of claim 1 above), data comprising medical lab test results for the group of individuals (medical history, see claim 1 above), insurance claims by members of the group of individuals for medical care, insurance claims by members of the group for pharmacy services, and consumer data regarding the members.
For claim 6, Anderberg as modified by Beniwal teaches all of the limitations of claim 1 as cited above and Anderberg further teaches:
the first set of features comprise at least one of: a member's demographic profile, a member's clinical profile, a member's behavior profile, a member's medication profile, and a member's dialysis specific features (demographic information, medical history and clinical variables, [0043]).
For claim 7, Anderberg as modified by Beniwal teaches all of the limitations of claim 1 as cited above and Anderberg further teaches:
identifying various predictor categories (plurality of classifications, [0083]); and 
analyzing each of the predictor categories to determine its predictor value (probability values, [0083]).
Anderberg alone or in combination fails to distinctly disclose:
claims data obtained from members of a health insurance provider to which the predictive model was applied.
However, Anderberg teaches in [0043] that “the foregoing method steps should not be interpreted to mean that the kidney injury marker assay result(s) is/are used in isolation in the methods described herein. Rather, additional variables or other clinical indicia may be included in the methods described herein.”
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use claims data obtained from members of a health insurance provider as inputs to Anderberg’s neural network since Anderberg teaches the inclusion of additional variables or other clinical indicia.  Furthermore, all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods with no change in their respective functions, and the combination would have yielded predictable results to one of ordinary skill in the art at the time of invention.
Claims 8-20 are rejected under 35 U.S.C. 103 as being unpatentable over Anderberg in view of Beniwal and Woods et al (NPL: “Combination of Multiple Classifiers Using Local Accuracy Estimates”).
For claim 8, Anderberg teaches a method (Abstract and [0011]) comprising the steps of: 
receiving historical health related data from a plurality of sources (body fluid samples, [0042]; demographic information, medical history and clinical variables, [0043]) containing health related patient data for a plurality of members of a member population ([0101]-[0128]); 
performing an extraction process upon the received data of the plurality of members of the member population to identify variables (assay method, [0042]) from the received data (kidney injury markers for the evaluation of renal injury, [0015]) and to extract features (demographic information, medical history and clinical variables, [0043])  that describe at least one patient (inclusion and exclusion criteria, [0101]-[0128]); 
applying a model (neural network, [0038] and [0083])  to the first set of features which identify relationships between characteristics of the data and progression of chronic kidney disease to the requirement of dialysis in the described patient(s) ([0038]-[0040], [0083]); 
comparing the relationships identified by the plurality of models to data representing actual patient outcomes ([0127]); 
determine the selection of features which result in a model with the greatest accuracy (via ROC, [0039]); and 
alerting a predetermined percentage of members according to a predicted risk score ([0100]).
Anderberg fails to distinctly disclose:
applying a plurality of models
selecting one of the plurality of the applied models with the relationship that most accurately reflects the actual patient outcome; and
processing the extracted features to generate a first set of features including a member's demographic profile, clinical profile, and behavior profile, wherein the member’s demographic profile includes the member’s age or gender and wherein the member’s clinical profile includes previous health conditions or surgeries; and
grouping members after the extraction process based on characteristic homogeneity and data availability.
However, Beniwal teaches that preprocessing data is needed before applying any kind of data mining algorithm (¶1, §2), including data cleaning and attribute selection.
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to pre-process Anderberg’s extracted features in order to remove noisy data, irrelevant attributes and missing data (¶1, §2).
The combination of Anderberg and Beniwal as defined above teaches:
processing the extracted features to generate a first set of features including a member's demographic profile, clinical profile, and behavior profile (after pre-processing features discussed in [0043] and [0102]-[0104]), wherein the member’s demographic profile includes the member’s age or gender ([0043] and [0103]) and wherein the member’s clinical profile includes previous health conditions or surgeries ([0043] and [0104]); and
grouping members after the extraction process based on characteristic homogeneity and data availability (via inclusion and exclusion criteria, [0102]-[0104]).
Anderberg as modified by Beniwal fails to teach:
applying a plurality of models
selecting one of the plurality of the applied models with the relationship that most accurately reflects the actual patient outcome;
However, Woods teaches a combination of multiple classifiers via dynamic classifier selection such that only the output of the most likely to be correct single classifier is considered in the final classification decision (¶1, §1).
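Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to implement Anderberg’s neural network using a combination of multiple neural network classifiers via dynamic classifier selection as taught by Woods since the particular known technique was recognized as part of the ordinary capabilities of one skilled in the art.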
For claim 9, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 8 as cited above and Beniwal further teaches:
the step of processing the extracted features is performed using a summarization process (“2. Data Preprocessing” teaches discretization of continuous attributes into a few discrete values), a standardization process (“2. Data Preprocessing” teaches data integration to remove inconsistencies), and a filtration process (“2. Data Preprocessing” teaches attribute selection for filtering irrelevant attributes).
For claim 10, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 8 as cited above and Anderberg further teaches:
application of the model produces a list of patients arranged progressively from a low risk to a high risk of progressing from chronic kidney disease to the requirement of dialysis (implementing neural networks as discussed in [0038] and [0083] will output a classification result and a corresponding confidence level for each subject which can be arranged as an ordered list).
For claim 11, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 8 as cited above and Anderberg further teaches:
identifying various predictor categories  (plurality of classifications, [0083]); and analyzing each of the predictor categories to determine its predictor value (probability values, [0083]).
Anderberg alone or in combination fails to distinctly disclose:
claims data obtained from members of a health insurance provider to which the predictive model was applied.
However, Anderberg teaches in [0043] that “the foregoing method steps should not be interpreted to mean that the kidney injury marker assay result(s) is/are used in isolation in the methods described herein. Rather, additional variables or other clinical indicia may be included in the methods described herein.”
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use claims data obtained from members of a health insurance provider as inputs to Anderberg’s neural network since Anderberg teaches the inclusion of additional variables or other clinical indicia.  Furthermore, all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods with no change in their respective functions, and the combination would have yielded predictable results to one of ordinary skill in the art at the time of invention.
For claim 12, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 8 as cited above and Woods further teaches:
the model applied is selected by verifying each of the plurality of models using holdout data to determine the accuracy of each model (¶2, §3) and the model with the greatest accuracy is selected (¶1, §1). 
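For illustration only, the following sketch shows one simplified way to select among multiple classifiers by estimating each classifier's accuracy on held-out samples near the sample being classified, so that only the output of the classifier most likely to be correct is used. It is an assumption-laden sketch in the spirit of dynamic classifier selection, not a reproduction of the Woods procedure: the one-dimensional feature, the two threshold classifiers, the holdout samples, and all names are hypothetical.

```python
def nearest(holdout, x, k):
    """Return the k holdout samples whose feature value is closest to x."""
    return sorted(holdout, key=lambda item: abs(item[0] - x))[:k]

def local_accuracy(classifier, neighbors):
    """Fraction of the neighboring holdout samples the classifier labels correctly."""
    hits = sum(classifier(value) == label for value, label in neighbors)
    return hits / len(neighbors)

def select_and_classify(classifiers, holdout, x, k=2):
    """Pick the classifier with the highest local holdout accuracy and use its output.

    Ties go to the classifier listed first in the dictionary.
    """
    neighbors = nearest(holdout, x, k)
    best_name = max(
        classifiers,
        key=lambda name: local_accuracy(classifiers[name], neighbors),
    )
    return best_name, classifiers[best_name](x)

# Two hypothetical classifiers that differ only in their decision threshold.
classifiers = {
    "low_threshold": lambda value: int(value > 3.0),
    "high_threshold": lambda value: int(value > 7.0),
}

# Hypothetical holdout samples as (feature value, actual outcome) pairs.
holdout = [(1.0, 0), (2.0, 0), (4.0, 1), (6.0, 0), (6.5, 0), (8.0, 1), (9.0, 1)]

print(select_and_classify(classifiers, holdout, 3.6))
print(select_and_classify(classifiers, holdout, 6.2))
```

In this toy data a different classifier is selected for each of the two query samples, because each classifier is more accurate in a different region of the holdout data.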
For claim 13, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 8 as cited above and Anderberg further teaches:
the received data comprises at least one of: 
health surveys received from a group of individuals, data representing demographics of the group of individuals (see rejection of claim 8 above), data comprising summarized medical lab test results for the group of individuals, insurance claims by members of the group of individuals for medical care, insurance claims by members of the group for pharmacy services, and consumer data regarding the members.
For claim 14, Anderberg as modified by Beniwal teaches all of the limitations of claim 8 as cited above and Anderberg further teaches the first set of features comprise at least one of: 
a patient's demographic profile, a patient's clinical profile, a patient's behavior profile, a patient's medication profile, and a member's dialysis specific features (demographic information, medical history and clinical variables, [0043]).
For claim 15, Anderberg teaches a method for predicting the onset of end stage renal disease in a population ([0037]-[0041], [0114]-[0118]) suffering from chronic kidney disease (Abstract) comprising the steps of:
receiving health related patient data from a plurality of sources (body fluid samples, [0042]; demographic information, medical history and clinical variables, [0043]) containing health related patient data for a plurality of members of a member population ([0101]-[0128]); 
performing an extraction process upon the received data of the plurality of members of the member population to identify variables (assay method, [0042]) from the received data (kidney injury markers for the evaluation of renal injury, [0015]) and to extract features (demographic information, medical history and clinical variables, [0043]) that describe at least one member of the population (inclusion and exclusion criteria, [0102]-[0127]);
receiving historical health related data from a plurality of sources (medical history and clinical variables, [0043]); and 
alerting a predetermined percentage of members according to risk scores ([0100]).
Anderberg fails to distinctly disclose:
determining the most accurate model for predicting the likelihood that a patient with chronic kidney disease will require dialysis by performing the substeps of: 
performing an extraction process upon the received historical data to extract features that describe at least one patient including a member's demographic profile, clinical profile, and behavior profile; wherein the member’s demographic profile includes the member’s age or gender and wherein the member’s clinical profile includes previous health conditions or surgeries;
grouping members after the extraction process based on characteristic homogeneity and data availability;
applying a plurality of models to the first set of features which identify relationships between characteristics of the data and progression of chronic kidney disease to the requirement of dialysis in the described patient(s); 
comparing the relationships identified by the plurality of models to data representing actual patient outcomes from the historical data; and 
selecting one of the plurality of the applied models with the relationship that most accurately reflects the actual patient outcome; 
applying the selected predictive model to the data that identifies the relationships between characteristics of the data and the transition from chronic kidney disease to end stage renal disease for at least one member to generate a risk score for that member;
However, Beniwal teaches that preprocessing data is needed before applying any kind of data mining algorithm (¶1, §2), including data cleaning and attribute selection.
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to pre-process Anderberg’s extracted features in order to remove noisy data, irrelevant attributes and missing data (¶1, §2) before applying the extracted features to Anderberg’s neural network ([0038] and [0083]).
The combination of Anderberg and Beniwal as defined above teaches:
processing the extracted features to generate a first set of features including a member's demographic profile, clinical profile, and behavior profile (after pre-processing features discussed in [0043] and [0102]-[0104]), wherein the member’s demographic profile includes the member’s age or gender ([0043] and [0103]) and wherein the member’s clinical profile includes previous health conditions or surgeries ([0043] and [0104]); and
grouping members after the extraction process based on characteristic homogeneity and data availability (via inclusion and exclusion criteria, [0102]-[0104]).
Anderberg as modified by Beniwal fails to teach:
applying a plurality of models to the first set of features which identify relationships between characteristics of the data and progression of chronic kidney disease to the requirement of dialysis in the described patient(s); 
comparing the relationships identified by the plurality of models to data representing actual patient outcomes from the historical data; and 
selecting one of the plurality of the applied models with the relationship that most accurately reflects the actual patient outcome; 
applying the selected predictive model to the data that identifies the relationships between characteristics of the data and the transition from chronic kidney disease to end stage renal disease for at least one member to generate a risk score for that member;
However, Woods teaches a combination of multiple classifiers via dynamic classifier selection such that only the output of the most likely to be correct single classifier is considered in the final classification decision (¶1, §1).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to implement Anderberg’s neural network using a combination of multiple neural network classifiers via dynamic classifier selection as taught by Woods since the particular known technique was recognized as part of the ordinary capabilities of one skilled in the art.
Anderberg as modified by Beniwal and Woods teaches:
applying a plurality of models (multiple classifiers, Woods, ¶1, §1) to the first set of features which identify relationships between characteristics of the data and progression of chronic kidney disease to the requirement of dialysis in the described patient(s);
comparing the relationships identified by the plurality of models to data representing actual patient outcomes from the historical data ([0127] of Anderberg, §4.3 of Woods); and
selecting one of the plurality of the applied models with the relationship that most accurately reflects the actual patient outcome (¶1, §1, Woods); 
applying the selected predictive model to the data that identifies the relationships between characteristics of the data and the transition from chronic kidney disease to end stage renal disease for at least one member to generate a risk score for that member (concentrations of kidney injury markers can be converted to corresponding probability scores, [0040] and [0083], Anderberg);
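For illustration only, the apply, compare, and select steps mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Anderberg, Beniwal, Woods, or the application: the two rule-based models, the feature layout (age, marker level), and the historical records are hypothetical, and selection here uses overall accuracy against recorded outcomes, whereas Woods is cited for dynamic classifier selection.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]            # hypothetical layout: (age, marker_level)
Model = Callable[[Features], int]     # 1 = progression predicted, 0 = not predicted
Record = Tuple[Features, int]         # (features, actual recorded outcome)


def accuracy(model: Model, records: Sequence[Record]) -> float:
    """Fraction of historical records whose outcome the model predicts correctly."""
    if not records:
        raise ValueError("at least one historical record is required")
    correct = sum(1 for features, outcome in records if model(features) == outcome)
    return correct / len(records)


def select_most_accurate(models: Dict[str, Model],
                         records: Sequence[Record]) -> Tuple[str, float]:
    """Apply every model, compare to actual outcomes, return the best one."""
    scores = {name: accuracy(model, records) for name, model in models.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores[best]


# Hypothetical toy models; the thresholds are invented for the example.
def marker_rule(x: Features) -> int:
    return 1 if x[1] < 30.0 else 0


def age_rule(x: Features) -> int:
    return 1 if x[0] >= 65.0 else 0


# Invented historical data: ((age, marker_level), actual outcome).
HISTORICAL = [
    ((70.0, 25.0), 1),
    ((45.0, 28.0), 1),
    ((80.0, 75.0), 0),
    ((50.0, 90.0), 0),
]

MODELS = {"marker_rule": marker_rule, "age_rule": age_rule}
best_name, best_accuracy = select_most_accurate(MODELS, HISTORICAL)
```

With these invented records, `marker_rule` matches every recorded outcome and is selected; the selected model would then be applied to a given member's features to produce that member's prediction.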
For claim 16, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 15 as cited above and Beniwal further teaches:
the step of processing the extracted features is performed using a summarization process (“2. Data Preprocessing” teaches discretization of continuous attributes into a few discrete values), a standardization process (“2. Data Preprocessing” teaches data integration to remove inconsistencies), and a filtration process (“2. Data Preprocessing” teaches attribute selection for filtering irrelevant attributes).
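For illustration only, the three preprocessing operations mapped above (discretization, removal of inconsistencies, and attribute filtering) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Beniwal or the application: the age bins, the gender alias table, the record fields, and the use of a constant-value check as a stand-in for attribute selection are all hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]

# Hypothetical alias table used to remove inconsistent spellings.
GENDER_ALIASES = {"m": "male", "male": "male", "f": "female", "female": "female"}


def discretize_age(age: float) -> str:
    """Summarization: map a continuous age onto a few discrete bins (invented cut points)."""
    if age < 40:
        return "young"
    if age < 65:
        return "middle"
    return "senior"


def standardize_gender(value: str) -> str:
    """Standardization: collapse inconsistent labels onto one vocabulary."""
    return GENDER_ALIASES[value.strip().lower()]


def drop_constant_attributes(records: List[Record]) -> List[Record]:
    """Filtration: drop attributes that take a single value across all records."""
    if not records:
        return []
    keep = [key for key in records[0]
            if len({record[key] for record in records}) > 1]
    return [{key: record[key] for key in keep} for record in records]


def preprocess(records: List[Record]) -> List[Record]:
    """Run the three steps in order on a list of raw records."""
    cleaned = [
        {**record,
         "age": discretize_age(record["age"]),
         "gender": standardize_gender(record["gender"])}
        for record in records
    ]
    return drop_constant_attributes(cleaned)


# Invented raw records from two hypothetical sources with inconsistent labels.
RAW = [
    {"age": 72, "gender": "M", "site": "A"},
    {"age": 35, "gender": "female", "site": "A"},
]
PROCESSED = preprocess(RAW)
```

In this toy input the `site` attribute is identical for every record, so it is filtered out, while `age` and `gender` are retained in discretized and standardized form.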
For claim 17, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 15 as cited above and Anderberg further teaches:
identifying various predictor categories (plurality of classifications, [0083]); and analyzing each of the predictor categories to determine its predictor value (probability values, [0083]).
Anderberg alone or in combination fails to distinctly disclose:
claims data obtained from members of a health insurance provider to which the predictive model was applied.
However, Anderberg teaches in [0043] that “the foregoing method steps should not be interpreted to mean that the kidney injury marker assay result(s) is/are used in isolation in the methods described herein. Rather, additional variables or other clinical indicia may be included in the methods described herein.”
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to use claims data obtained from members of a health insurance provider as inputs to Anderberg’s neural network, since Anderberg teaches the inclusion of additional variables or other clinical indicia. Furthermore, all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods with no change in their respective functions, and the combination would have yielded predictable results to one of ordinary skill in the art at the time of invention.
For claim 18, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 15 as cited above and Woods further teaches:
the first set of features to which the predictive model is applied is selected by verifying the features using holdout data (§4.3) to determine the selection of features which result in a model with the greatest accuracy (¶1, §1).
For claim 19, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 15 as cited above and Anderberg further teaches:
the received data comprises at least one of: membership data, participation in programs to improve the health of a participant, data representing demographics of the group of individuals (see rejection of claim 15 above), data comprising medical lab test results for the group of individuals (medical history, see claim 15 above), insurance claims by members of the group of individuals for medical care, insurance claims by members of the group for pharmacy services, and consumer data regarding the members.
For claim 20, Anderberg as modified by Beniwal and Woods teaches all of the limitations of claim 15 as cited above and Anderberg further teaches:
the first set of features comprise at least one of: a member's demographic profile, a member's clinical profile, a member's behavior profile, a member's medication profile, and a member's dialysis specific features (demographic information, medical history and clinical variables, [0043]).
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL CALRISSIAN PUENTES whose telephone number is (571)270-5070.  The examiner can normally be reached on M-F 9-6:30 (flex).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.