DETAILED ACTION
This is the first office action regarding application number 15/914,656, filed March 7, 2018.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Abstract
The abstract of the disclosure is objected to because of the following informality: a typographical error in line 5, "it's" should be "its".  Correction is required.  See MPEP § 608.01(b).

Drawings
The drawings are objected to because of the following informalities:
Figure 1, element 102: “Record File” text label is missing. Appropriate correction is required.
Figure 1, element 103: “Target Tags” text label is missing. Appropriate correction is required.
Figure 2, element 209: text label “Scoring” should be corrected as “Scoring Process”. Appropriate correction is required.
Figure 2, element 210: text label “Model Evaluation” should be corrected as “Model Evaluation Process”. Appropriate correction is required.
Figure 3, element 309: text label “Insight-Enhanced Training Set” should be corrected as “Insights-Enhanced Training Set”. Appropriate correction is required.
Figure 3, element 310: text label “Insight-Enhanced Holdout Set” should be corrected as “Insights-Enhanced Holdout Set”. Appropriate correction is required.
Figure 8, element 801: “Training or Holdout Set Records File” text label is missing. Appropriate correction is required.
Figure 8, element 803: “Insights-Enhanced Training or Holdout Sets” text label is missing. Appropriate correction is required.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities:
Paragraph [0004]: “SAS System”; SAS is a registered trademark ®. Appropriate correction is required.
Paragraph [0007]: “Facebook Social Medial Website” is a trademark ™; “Medial” should be corrected to “Media”. Appropriate correction is required.
Paragraph [0009]: A period punctuation mark (.) is missing at the end of the last sentence. Appropriate correction is required.
Paragraph [0019], line 3: Remove the transitional term “Although” to form a complete sentence (e.g., “Modeling-ready data takes many forms, including, …”). Appropriate correction is required.
Paragraph [0022], element 102, lines 4 and 12: element 102 is referenced with text label “Record File 102” (line 4) and “Historical Predictor File 102” (line 12). Figure 1 shows element 102 as unlabeled. Line 12 should be corrected as “Record File 102”. Appropriate correction is required.
Paragraph [0023], element 205, lines 8 and 9: element 205 is referenced with text label “Model Training Set 205” (line 8) and text label “Training Set 205” (line 9). Figure 2 shows element 205 using text label “Model Training Set”. Line 9 should be corrected as “Model Training Set 205”. Appropriate correction is required. 
Paragraph [0024], line 6: “Training Set 3.04” should be corrected as “Training Set 304”. Appropriate correction is required.
Paragraph [0025], line 14: “Insighted Training Set 309” should be corrected as “Insights-Enhanced Training Set 309”. Appropriate correction is required.
Paragraph [0025], line 14: “Insighted Holdout Set 310” should be corrected as “Insights-Enhanced Holdout Set 310”. Appropriate correction is required.
Paragraph [0026], line 1: Remove the extra “Process” word from the text label “Raw Insights Dictionary Process 
Paragraph [0027], line 3: text label “Raw Insights Dictionary 408” should be corrected as “Raw Insights Dictionary 403”. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
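(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 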
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that use the word “means” or “step” but are nonetheless not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph because the claim limitation(s) recite(s) sufficient structure, materials, or acts to entirely perform the recited function.  Such claim limitation(s) is/are: 
Claim 1: “raw insights dictionary creation means for creating a raw insights dictionary using data from the first partition”
Claim 1: “lookup means for applying insights from an insights dictionary to records in the second and third partitions”
Claim 2: “dictionary aggregation means for applying one or more aggregation rules to create an aggregated insights dictionary from the raw insights dictionary”
Claim 6: “statistical analysis means for determining the statistical significance of a label-value pair in a data record”
Because this/these claim limitation(s) is/are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are not being interpreted to cover only the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof.
If applicant intends to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to remove the structure, materials, or acts that performs the claimed function; or (2) present a sufficient showing that the claim limitation(s) does/do not recite sufficient structure, materials, or acts to perform the claimed function.
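For orientation only, the arrangement recited in the limitations quoted above can be pictured with the following minimal sketch. It is not taken from the specification: the function names `build_raw_insights` and `apply_insights`, the record layout, the `target` field, and the choice of count and mean as stored statistics are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def build_raw_insights(records, target_key):
    """Hypothetical 'raw insights dictionary': for every (label, value)
    pair seen in the first partition, store simple target statistics."""
    grouped = defaultdict(list)
    for rec in records:
        for label, value in rec.items():
            if label != target_key:
                grouped[(label, value)].append(rec[target_key])
    return {
        pair: {"count": len(targets), "mean_target": mean(targets)}
        for pair, targets in grouped.items()
    }


def apply_insights(records, dictionary, target_key):
    """Hypothetical 'lookup': append the stored statistic for each
    (label, value) pair to a copy of the record (None if unseen)."""
    enhanced = []
    for rec in records:
        out = dict(rec)
        for label, value in rec.items():
            if label == target_key:
                continue
            entry = dictionary.get((label, value))
            out[f"{label}_insight"] = entry["mean_target"] if entry else None
        enhanced.append(out)
    return enhanced


# Toy data (assumed): first partition builds the dictionary,
# second partition is the training set that receives the insights.
first_partition = [
    {"state": "NY", "target": 1},
    {"state": "NY", "target": 0},
    {"state": "CA", "target": 1},
]
second_partition = [
    {"state": "NY", "target": 1},
    {"state": "TX", "target": 0},
]

raw_dictionary = build_raw_insights(first_partition, "target")
enhanced_training = apply_insights(second_partition, raw_dictionary, "target")
```

In this sketch the same `apply_insights` call would be repeated on the third partition to produce the enhanced holdout records.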

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4, 5, 6, 11, 13, and 19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Regarding Claims 4 and 11, 
The term "relevant insighted data" is a relative term which renders the claim indefinite.  The term "relevant insighted data" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The specification fails to disclose the boundaries or terms of degree of what is considered “relevant insighted data”, as it fails to establish any measurement of how to determine whether data is relevant or not with respect to the claimed invention. The specification hints that the use of the term “relevant” is associated with determining the statistical significance of an entry, but the term “statistical significance” is also indefinite, as the specification also fails to establish a measurement to calculate “statistically insignificant” or “not statistically significant” values with respect to aspects of the claimed invention, as discussed below.
Regarding Claim 5, 
Claim 5 recites the limitation "the third set" in line 5.  There is insufficient antecedent basis for this limitation in the claim, since there is no concept of “a third set” in earlier claims 1, 2, and 4. For the purposes of examination, this claim limitation will be interpreted as “the third partition”.
Regarding Claim 6, 
The term "statistical significance" is a relative term which renders the claim indefinite. The term "statistical significance" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Paragraph [0027] in the specification discloses “Once the aggregation rule is established, a test is performed at step 606 to determine if the Entry is statistically significant. An entry is said to be statistically significant if the descriptive statistics metrics, for example average value, of the Target as stored by the entry is predictive, to within specified tolerances, of the same metrics if measured from an arbitrarily large population of records with the same Label-Value pair as the entry. The calculations necessary to determine if an entry is statistically significant are familiar to a person skilled in the arts of statistics.” This description is vague, as it fails to describe any “specified tolerances” of the descriptive statistics metrics that need to be used for the comparison test in measuring the boundaries of an entry for “statistical significance”. Furthermore, the specification fails to disclose the boundaries or terms of degree for quantifying “statistical significance”, as it fails to establish any specific set of rules or measurement for determining statistical significance with respect to the claimed invention, and instead defers the analysis of statistical significance to a person skilled in the arts of statistics. 
While a person skilled in the art of statistics may know how to perform statistical significance calculations on data, there are potentially multiple ways to interpret and hence calculate the statistical significance of a label-value pair, such that a person skilled in the art of statistics would not know the exact specific statistical calculation being referenced or even required to correctly perform the functions of the claimed invention. 
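By way of illustration, the following minimal sketch shows how two plausible readings of “statistically significant” can disagree on the same entry. Neither criterion is disclosed in the specification; the function names, the minimum record count of 30, the tolerance of 0.1, and the 95% normal approximation are assumptions chosen only to make the point concrete.

```python
import math
from statistics import stdev


def significant_by_count(targets, min_count=30):
    """Reading 1 (assumed): an entry is significant once it is backed
    by at least `min_count` records."""
    return len(targets) >= min_count


def significant_by_margin(targets, tolerance=0.1, z=1.96):
    """Reading 2 (assumed): an entry is significant when the approximate
    95% confidence half-width of its mean target is within `tolerance`."""
    n = len(targets)
    if n < 2:
        return False
    half_width = z * stdev(targets) / math.sqrt(n)
    return half_width <= tolerance


# Hypothetical entry: 40 records with a binary target, mean 0.5.
targets = [0, 1] * 20

count_result = significant_by_count(targets)
margin_result = significant_by_margin(targets)
loose_margin_result = significant_by_margin(targets, tolerance=0.2)
```

Under these assumptions the entry passes the count-based reading but fails the margin-based reading at a tolerance of 0.1, and passes again at a tolerance of 0.2, so the outcome depends entirely on choices the specification leaves open.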
Regarding Claims 13 and 19,
The term "statistically insignificant" is a relative term which renders the claim indefinite. The term "statistically insignificant" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Paragraph [0027] in the specification discloses “Once the aggregation rule is established, a test is performed at step 606 to determine if the Entry is statistically significant. An entry is said to be statistically significant if the descriptive statistics metrics, for example average value, of the Target as stored by the entry is predictive, to within specified tolerances, of the same metrics if measured from an arbitrarily large population of records with the same Label-Value pair as the entry. The calculations necessary to determine if an entry is statistically significant are familiar to a person skilled in the arts of statistics.” This description is vague, as it fails to describe any “specified tolerances” of the descriptive statistics metrics that need to be used for the comparison test in measuring the boundaries of a “statistically significant” entry. Furthermore, the specification fails to disclose the boundaries of what is considered “statistically insignificant”, as it fails to establish any specific set of rules or measurement for determining statistical significance with respect to the claimed invention, and instead defers the analysis of statistical significance to a person skilled in the arts of statistics. 
While a person skilled in the art of statistics may know how to perform statistical significance calculations on data, there are potentially multiple ways to interpret and hence calculate the statistical significance of a label-value pair, such that a person skilled in the art of statistics would not know the exact specific statistical calculation being referenced or even required to correctly perform the functions in the claimed invention.
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

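Claims 4, 6, 11, and 19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the applicant, at the time the application was filed, had possession of the claimed invention.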
Regarding Claim 4,
The claim limitation “querying the aggregated insights dictionary for relevant insighted data based on the fields and values in the record” fails to comply with the written description requirement, as the specification fails to describe the concept of relevant insighted data, and hence fails to describe how to query for relevant insighted data. The specification hints that “relevant” is associated with statistical significance, but the term “relevant insighted data” is not explicitly used when discussing statistical significance. Figure 6, element 606 and paragraph [0027] describe performing a test at step 606 “to determine if the entry is statistically significant” in the context of establishing an aggregation rule, and the specification further describes an entry as statistically significant “if the descriptive statistics metrics, for example average value, of the Target as stored by the entry is predictive, to within specified tolerances, of the same metrics if measured from an arbitrarily large population of records with the same Label-Value pair as the entry. The calculations necessary to determine if an entry is statistically significant are familiar to a person skilled in the arts of statistics.” This description is vague, as it fails to describe any “specified tolerances” of the descriptive statistics metrics that need to be used for the comparison test in measuring the boundaries of whether a “statistically significant” entry containing a field and value is considered “relevant” enough to be classified as “relevant insighted data”. Furthermore, the specification defers the analysis of statistical significance to a person skilled in the arts of statistics. 
While a person skilled in the art of statistics may know how to perform calculations that represent statistical significance, there are potentially multiple ways to calculate the statistical significance of a label-value pair, and the specification fails to provide any direction as to the types of statistical significance calculations required by the claimed invention.
Regarding Claim 6,
The claim limitation “statistical analysis means for determining the statistical significance of a label-value pair in a data record” fails to comply with the written description requirement, as the specification fails to disclose the type of statistical calculations necessary to determine statistical significance of a label-value pair in a data record for the claimed invention. Similar to what is described above in Claim 4, the specification also fails to disclose the degree of statistical significance that determines whether to continue aggregation or to retrieve another label-value pair entry, and hence fails to establish adequate written support for performing the claimed function.
Regarding Claim 11,
The claim limitation “for each record in the modeling set, querying the insights dictionary for relevant insighted data based on the fields and values in the record” fails to comply with the written description requirement, as the specification fails to describe the concept of relevant insighted data, and hence fails to describe how to query for relevant insighted data. This claim limitation is similar in scope as Claim 4, and hence is rejected based on the same rationale.
Regarding Claim 19,
The claim limitation “wherein aggregating the label-value pair comprises assigning the pair to a designated entry for statistically insignificant values” fails to comply with the written description requirement, as the specification fails to describe a designated entry for statistically insignificant values. Figure 6, element 606 and paragraph [0027] describe performing a test at step 606 “to determine if the entry is statistically significant” in the context of establishing an aggregation rule, and the specification further describes an entry as statistically significant “if the descriptive statistics metrics, for example average value, of the Target as stored by the entry is predictive, to within specified tolerances, of the same metrics if measured from an arbitrarily large population of records with the same Label-Value pair as the entry. The calculations necessary to determine if an entry is statistically significant are familiar to a person skilled in the arts of statistics.” This description is not only vague in terms of describing “statistically insignificant values”, it also fails to disclose any method or step to determine a designated entry for statistically insignificant values.
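For context, one hypothetical reading of “assigning the pair to a designated entry for statistically insignificant values” is sketched below. Nothing in this sketch comes from the specification: the `__OTHER__` key, the `aggregate` function, the dictionary layout, and the minimum-count threshold standing in for a significance test are all assumptions, which is precisely the detail the disclosure does not supply.

```python
from collections import defaultdict

# Assumed name for the designated entry; not used in the specification.
OTHER = "__OTHER__"


def aggregate(raw_dictionary, min_count=5):
    """Fold every (label, value) entry backed by fewer than `min_count`
    records into a single per-label designated entry.

    raw_dictionary maps (label, value) -> {"count": int, "sum_target": float}.
    The count threshold is a stand-in for whatever significance test applies.
    """
    aggregated = defaultdict(lambda: {"count": 0, "sum_target": 0.0})
    for (label, value), entry in raw_dictionary.items():
        key = (label, value) if entry["count"] >= min_count else (label, OTHER)
        aggregated[key]["count"] += entry["count"]
        aggregated[key]["sum_target"] += entry["sum_target"]
    return dict(aggregated)


# Hypothetical raw entries for a single label.
raw = {
    ("state", "NY"): {"count": 12, "sum_target": 6.0},
    ("state", "WY"): {"count": 2, "sum_target": 1.0},
    ("state", "VT"): {"count": 1, "sum_target": 1.0},
}

agg = aggregate(raw)
```

Here the two sparsely populated values are merged into one designated entry while the well-populated value keeps its own entry; a different threshold or test would produce a different aggregated dictionary.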

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined at step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim integrates the 
Claims 1-4 and 6-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more than the abstract idea itself, and hence is not patent-eligible subject matter. 
Regarding Claim 1, 
Step 1: The claim recites a system for processing data records in a computer database for predictive modeling, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: This claim recites the following abstract idea:
lookup means for applying insights from an insights dictionary to records in the second and third partitions (Under its broadest reasonable interpretation, this claim element recites a judicial exception, as observations, evaluations, judgments, and opinions are mental processes that are implementable in the human mind. See MPEP 2106.04(a)(2)(III).)
Step 2A Prong 2: This claim further recites:
a database comprising data records, the data records comprising labeled data fields populated with values (This claim element places an additional limitation on the type of database and type of data records, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
a first partition of the database for generating an insights dictionary (This claim element places an additional limitation on the type of database and type of data records, as well as generally linking the system to a technological environment. See MPEP 2106.05(h). This claim element is also directed to generating data (an insights dictionary), which is a pre-solution/insignificant extra-solution activity for use in a claimed process. See MPEP 2106.05(g).)
a second partition of the database for training a predictive model (This claim element places an additional limitation on the type of database and type of data records, as well as generally linking the system to a technological environment. See MPEP 2106.05(h). This claim element is also directed to generating data (for training a predictive model), which is an insignificant extra-solution activity for use in a claimed process. See MPEP 2106.05(g). This additional element does not add a meaningful limitation to the claim, and hence does not integrate the judicial exception into a practical application.)
a third partition of the database for evaluating the predictive model (This claim element places an additional limitation on the type of database and type of data records, as well as generally linking the system to a technological environment. See MPEP 2106.05(h). This claim element is also directed to generating data (for evaluating the predictive model), which is an insignificant extra-solution activity for use in a claimed process. See MPEP 2106.05(g). This additional element does not add a meaningful limitation to the claim, and hence does not integrate the judicial exception into a practical application.)
raw insights dictionary creation means for creating a raw insights dictionary using data from the first partition (This claim element is considered a form of applying mere instructions on a generic computer to implement a judicial exception. See MPEP 2106.05(f). This additional element does not add a meaningful limitation to the claim, and hence does not integrate the judicial exception into a practical application.)
Step 2B: This claim further recites:
a database comprising data records, the data records comprising labeled data fields populated with values (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
a first partition of the database for generating an insights dictionary (This claim element is directed to storing and retrieving information in memory, which is a well-known, understood, routine, and conventional activity, and hence does not add significantly more than the judicial exception, alone or in combination with other elements in the claim. See MPEP 2106.05(d)(II), list 1, example iv.)
a second partition of the database for training a predictive model (This claim element is directed to storing and retrieving information in memory, which is a well-known, understood, routine, and conventional activity, and hence does not add significantly more than the judicial exception, alone or in combination with other elements in the claim. See MPEP 2106.05(d)(II), list 1, example iv.)
a third partition of the database for evaluating the predictive model (This claim element is directed to storing and retrieving information in memory, which is a well-known, understood, routine, and conventional activity, and hence does not add significantly more than the judicial exception, alone or in combination with other elements in the claim. See MPEP 2106.05(d)(II), list 1, example iv.)
raw insights dictionary creation means for creating a raw insights dictionary using data from the first partition (As analyzed in Step 2A Prong 2, applying mere instructions on a generic computer to implement a judicial exception does not further integrate the judicial exception into a practical application. See MPEP 2106.05(f). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 2, 
Step 1: The claim recites the system of claim 1, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 2 is a dependent claim of Claim 1, and hence inherits the same abstract idea mentioned above. This claim further recites the following abstract idea:
wherein the lookup means is for applying insights from the aggregated insights dictionary to records in the second and third partitions (Under its broadest reasonable interpretation, this claim element recites a judicial exception, as observations, evaluations, judgments, and opinions are mental processes that are implementable in the human mind. See MPEP 2106.04(a)(2)(III).)
Step 2A Prong 2: This claim further recites:
dictionary aggregation means for applying one or more aggregation rules to create an aggregated insights dictionary from the raw insights dictionary (This claim element is considered a form of applying mere instructions on a generic computer to implement a judicial exception. See MPEP 2106.05(f). This additional element does not add a meaningful limitation to the claim, and hence does not integrate the judicial exception into a practical application.)
Step 2B: This claim further recites:
dictionary aggregation means for applying one or more aggregation rules to create an aggregated insights dictionary from the raw insights dictionary (As analyzed in Step 2A Prong 2, applying mere instructions on a generic computer to implement a judicial exception does not further integrate the judicial exception into a practical application. See MPEP 2106.05(f). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 3, 
Step 1: The claim recites the system of claim 2, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 3 is a dependent claim of Claim 2, and hence inherits the same abstract ideas mentioned above. 
Step 2A Prong 2: This claim further recites:
wherein the aggregation rule corresponds to a data-type of fields in the records (This claim element places an additional limitation on the type of aggregation rule, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
Step 2B: This claim further recites:
wherein the aggregation rule corresponds to a data-type of fields in the records (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 4, 
Step 1: The claim recites the system of claim 2, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 4 is a dependent claim of Claim 2, and hence inherits the same abstract ideas mentioned above. 
Step 2A Prong 2: This claim further recites:
computer executable instructions embedded on a fixed tangible medium, which upon execution, cause a computer to perform the steps of (This claim element places an additional limitation on the type of computer executable instructions, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
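for each record in the second partition, querying the aggregated insights dictionary for relevant insighted data based on the fields and values in the record (This claim element is considered a form of applying mere instructions on a generic computer to implement a judicial exception. See MPEP 2106.05(f). This additional element does not add a meaningful limitation to the claim, and hence does not integrate the judicial exception into a practical application.)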
and appending the relevant insighted data to the record in the second partition (This claim element is considered a form of applying mere instructions on a generic computer to implement a judicial exception. See MPEP 2106.05(f). This additional element does not add a meaningful limitation to the claim, and hence does not integrate the judicial exception into a practical application.)
Step 2B: This claim further recites:
computer executable instructions embedded on a fixed tangible medium, which upon execution, cause a computer to perform the steps of (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
for each record in the second partition, querying the aggregated insights dictionary for relevant insighted data based on the fields and values in the record (As analyzed in Step 2A Prong 2, applying mere instructions on a generic computer to implement a judicial exception does not further integrate the judicial exception into a practical application. See MPEP 2106.05(f). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
and appending the relevant insighted data to the record in the second partition (As analyzed in Step 2A Prong 2, applying mere instructions on a generic computer to implement a judicial exception does not further integrate the judicial exception into a practical application. See MPEP 2106.05(f). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 6, 
Step 1: The claim recites the system of claim 2, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 6 is a dependent claim of Claim 2, and hence inherits the same abstract ideas mentioned above. This claim further recites the following abstract idea:
wherein the dictionary aggregation means further comprises statistical analysis means for determining the statistical significance of a label-value pair in a data record (Under its broadest reasonable interpretation, this claim element recites a judicial exception, as observations, evaluations, judgments, and opinions are mental processes that are implementable in the human mind. See MPEP 2106.04(a)(2)(III).)
Step 2A Prong 2: This claim does not recite any additional elements to be further analyzed at this step.
Step 2B: This claim does not recite any additional elements to be further analyzed at this step.
Regarding Claim 7, 
Step 1: The claim recites the system of claim 3, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 7 is a dependent claim of Claim 3, and hence inherits the same abstract ideas mentioned above. 
Step 2A Prong 2: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a string of natural language text (This claim element places an additional limitation on the type of aggregation rule, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
Step 2B: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a string of natural language text (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 8, 
Step 1: The claim recites the system of claim 3, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 8 is a dependent claim of Claim 3, and hence inherits the same abstract ideas mentioned above. 
Step 2A Prong 2: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a continuous quantitative value (This claim element places an additional limitation on the type of aggregation rule, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
Step 2B: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a continuous quantitative value (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 9, 
Step 1: The claim recites the system of claim 3, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 9 is a dependent claim of Claim 3, and hence inherits the same abstract ideas mentioned above. 
Step 2A Prong 2: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a date (This claim element places an additional limitation on the type of aggregation rule, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
Step 2B: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a date (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)
Regarding Claim 10, 
Step 1: The claim recites the system of claim 3, therefore it falls into one of the four statutory categories (i.e., process, machine, article of manufacture, or composition of matter).
Step 2A Prong 1: Claim 10 is a dependent claim of Claim 3, and hence inherits the same abstract ideas mentioned above. 
Step 2A Prong 2: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a category with a large number of possible values (This claim element places an additional limitation on the type of aggregation rule, as well as generally linking the system to a technological environment. Type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h).)
Step 2B: This claim further recites:
wherein the aggregation rule corresponds to a data-type that is a category with a large number of possible values (As analyzed in Step 2A Prong 2, type definitions and a general association to a technological environment do not further integrate the judicial exception into a practical application. See MPEP 2106.05(h). Hence this claim element does not add significantly more than the judicial exception, alone or in combination with other elements in the claim.)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-14 and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dillon et al., U.S. PGPUB 2003/0088562, published 5/8/2003 [henceforth referred to as Dillon] in view of Berk, Richard A., Statistical Inference After Model Selection, Journal of Quantitative .
Regarding Claim 1,
Dillon teaches A system for processing data records in a computer database for predictive modeling, comprising: 
a database comprising data records, the data records comprising labeled data fields populated with values; ([Figure 1, element 105, paragraph [0045]: proprietary source data (“database”) that contains structured data with data fields and data elements (“values”) (“A proprietary or source data file 105 containing structured data, is provided to the system 100 by the user.”).] [paragraph [0044]: structured data stored in the form of transactional records (“data records”) containing labeled data fields and values (“In general, structured data refers to data stored in a standard format (such as a fixed-length transaction record). Structured data sets are designed to be read and understood by a computer. Thus, data fields within a record are defined by a pre-defined format and data elements stored within data fields typically can only take on particular values or ranges.”).] [paragraph [0081]: examples of labeled data fields and their corresponding values from a transactional record (“Binary, categorical data fields (e.g., marital status[M,S], gender[M/F], etc.)…”).])
a … database for generating an insights dictionary; ([Figure 13, element 130; paragraph [0046]: enhanced proprietary data including structured records from the proprietary source data, appended with unstructured descriptive data, in the form of natural language text (“a … database for generating an insights dictionary”) (“The descriptive data derived by the analyzer 125 is then added to the original proprietary data 105 to create an enhanced proprietary data file 130. This enhanced data file 130 may be used in a predictive modeling module 135 and/or a data mining module 140 to extract useful business information.”).])
a second partition of the database for training a predictive model; ([Figure 13, elements 1210, 1235: enhanced proprietary data is partitioned by a data partitioning step into a training set (“a second partition of the database for training a predictive model”).] [Figure 12, elements 1205, 1210, 1235; paragraph [0077]: training set after database construction and data partitioning (“Referring back to FIG. 12, the enhanced data set generated by the database construction module 1205 is randomly partitioned into three data sets by a data-partitioning module 1210. The three data sets, a training data set 1235, a test data set 1245, and a validation dataset 1255 are used in different stages of the model building process 1200.”).])
a third partition of the database for evaluating the predictive model; ([Figure 13, elements 1210, 1255: enhanced proprietary data is partitioned by a data partitioning step into a validation set (“a third partition of the database for evaluating the predictive model”).] [Figure 12, elements 1205, 1210, 1255; paragraph [0077]: validation set after database construction and data partitioning (“Referring back to FIG. 12, the enhanced data set generated by the database construction module 1205 is randomly partitioned into three data sets by a data-partitioning module 1210. The three data sets, a training data set 1235, a test data set 1245, and a validation dataset 1255 are used in different stages of the model building process 1200.”).])
raw insights dictionary creation means for creating a raw insights dictionary using data [from the database]; and ([Figure 12, element 1205; Figure 13, elements 1205 (containing elements 1315, 1325, 1327, 1330, 1335) and 1215; paragraph [0071]: database construction module using the enhanced proprietary data (“using data [from the database]”) and applying database construction steps, creating summary statistics and reference data tables (“a raw insights dictionary”) (“Database construction involves several subprocesses that will be detailed below with reference to FIG. 13, including data cleaning, data augmentation, consolidation with other databases, interpolation (if necessary) between mismatched datasets, and data record sampling. The resulting dataset is referred to as a consolidated (or enhanced) dataset. During database construction, summary statistics and reference tables 1215 are generated for later use in a variable creation module 1220.”).] [Figure 13, elements 1327, 1330, and 1335: paragraphs [0074]-[0076]: data interpolation, dataset sampling, and preliminary data analysis steps within the database construction module generating the summary statistics and reference data tables (“raw insights dictionary creation means for creating a raw insights dictionary”) (“… a data interpolation module 1327 calculates or estimates intermediate values for the less frequently sampled database. … A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks). 
At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/ maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
lookup means for applying insights from an insights dictionary to records in the second and third partitions. ([Figure 12, elements 1215, 1220, 1235, and 1255; paragraph [0078]: variable creation step receiving a training set and a validation set (“records in the second and third partitions”) and the summary statistics and reference data tables (“[raw] insights dictionary”) (“The three data sets 1235, 1245 and 1255 are passed on to a variable creation module 1220. … The variable creation module 1220 transforms these raw data fields into a mathematical representation of the data, so that mathematical modeling and optimization methods can be applied to generate predictions about account or transaction behavior. The complete mathematical representation is referred to as a 'pattern', while individual elements within a pattern are referred to as 'variables'.”).] [Figure 12, elements 1220, 1225, 1230, 1235, and 1255; paragraph [0079]; paragraph [0084]: variable creation step generating transformed data (“lookup means for applying insights from an insights dictionary”), with the variable selection and generate exemplar patterns steps applying this information to the training set and validation set (“The variable creation process 1220 uses several techniques to transform raw data…”); (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression. At 1230 pattern exemplars are constructed, using the selected variable selected in 1225.”).])
However, Dillon does not teach
a first partition of the database … ; 
… using data from the first partition;
Berk teaches
a first partition of the database … ; ([p.21-p.23: partitioning a population of interest (“a database”) first into a test/observation sample (“a first partition of the database …”) and a training/modeling sample, using the test/observation sample to apply statistical inference, and using the training/modeling sample to “arrive at a preferred model” (e.g., training and evaluation of a model) (“Post-model-selection sampling distributions can be highly non-normal, very complex, and with unknown finite sample properties even when the model responsible for the data happens to be selected. There can be substantial bias in the regression estimates, and conventional tests and confidence intervals are undertaken at some peril. … If the post-model-selection sampling distributions may be problematic, probably the most effective solution is to have two random samples from the population of interest: a training sample and a test sample. The training sample is used to arrive at a preferred model. The test sample is used to estimate the parameters of the chosen model and to apply statistical inference. For the test sample, the model is known in advance. The requisite structure for proper statistical inference is in place, and problems resulting from post-model-selection statistical inference are prevented. The dual-sample approach is easy to implement once there are two samples.”).])
… using data from the first partition; ([p.21-p.23: partitioning a population of interest (“a database”) first into a test/observation sample (“the first partition”) and a training/modeling sample, using the test/observation sample to apply statistical inference (“… using data from the first partition”), and using the training/modeling sample to “arrive at a preferred model” (e.g., training and evaluation of a model) (“Post-model-selection sampling distributions can be highly non-normal, very complex, and with unknown finite sample properties even when the model responsible for the data happens to be selected. There can be substantial bias in the regression estimates, and conventional tests and confidence intervals are undertaken at some peril. … If the post-model-selection sampling distributions may be problematic, probably the most effective solution is to have two random samples from the population of interest: a training sample and a test sample. The training sample is used to arrive at a preferred model. The test sample is used to estimate the parameters of the chosen model and to apply statistical inference. For the test sample, the model is known in advance. The requisite structure for proper statistical inference is in place, and problems resulting from post-model-selection statistical inference are prevented. The dual-sample approach is easy to implement once there are two samples.”).])
Dillon and Berk are analogous art, as both teach using data to produce statistical inferences and to train predictive models.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the enhanced proprietary data database of Dillon and perform the database partitioning of Berk as a way to produce partitioned test/observation and modeling samples of the enhanced proprietary database. The motivation to combine is taught in Berk, as a way to avoid incorporating bias from the model and to prevent questionable statistical inferences when using data that has already been used for training and evaluating the model, improving the reliability of the model ([p.21-p.23: “Post-model-selection sampling distributions can be highly non-normal, very complex, and with unknown finite sample properties even when the model responsible for the data happens to be selected. There can be substantial bias in the regression estimates, and conventional tests and confidence intervals are undertaken at some peril. … If the post-model-selection sampling distributions may be problematic, probably the most effective solution is to have two random samples from the population of interest: a training sample and a test sample. The training sample is used to arrive at a preferred model. The test sample is used to estimate the parameters of the chosen model and to apply statistical inference. For the test sample, the model is known in advance. The requisite structure for proper statistical inference is in place, and problems resulting from post-model-selection statistical inference are prevented. The dual-sample approach is easy to implement once there are two samples.”]).
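For illustration only, the combined arrangement discussed above (a first partition reserved for deriving insights, a second for training, and a third for evaluation) can be sketched as a random three-way split. The function name, the split fractions, and the seed are hypothetical and appear in neither Dillon nor Berk; this is a minimal sketch under those assumptions, not either reference's implementation.

```python
import random

def partition_records(records, fractions=(0.2, 0.6, 0.2), seed=0):
    """Randomly split records into three disjoint partitions.

    The fractions and seed are illustrative assumptions only.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(records)          # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    first_end = int(n * fractions[0])
    second_end = first_end + int(n * fractions[1])
    # The third partition takes whatever remains after the first two.
    return (shuffled[:first_end],
            shuffled[first_end:second_end],
            shuffled[second_end:])

# Hypothetical records with labeled fields and values.
records = [{"id": i, "marital_status": "M" if i % 2 else "S"} for i in range(10)]
insights_part, training_part, evaluation_part = partition_records(records)
```

Because the three slices are taken from a single shuffled copy, no record can appear in more than one partition, which is the property the Berk passage relies on to keep inference data separate from model-selection data.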
Regarding Claim 2,
Dillon in view of Berk teaches The system of claim 1 further comprising 
dictionary aggregation means for applying one or more aggregation rules to create an aggregated insights dictionary from the raw insights dictionary, and ([Figure 12, elements 1215, 1220, 1235, 1255; paragraphs [0079]-[0083] and Table 1: variable creation step applying transformation rules (“one or more aggregation rules”) using data from the training and validation sets, and the summary statistics and reference data tables (“raw insights dictionary”), with the transformed data representing the “aggregated insights dictionary” (“dictionary aggregation means”) (“The variable creation process 1220 uses several techniques to transform raw data. Examples of data transformation techniques and specific variables are listed below for illustrative purposes with additional examples of variables given in Table 1. … 1. Numerical Transformations: Raw data already in numerical form (such as transaction amounts) may be used directly as variables themselves. Numerical or categorical data referencing time (such as dates) are simply transformed into a standard unit of time (e.g., 'days since Jan. 1, 1999‘ or 'seconds since account open date'). … 2. Categorical data transformations: Binary, categorical data fields ( e.g., marital status[M,S], gender[M/F], etc.) can simply be transformed binary, logical data values ([0,1] or [ -1,1 ]). Higher-dimensional, categorical variables can be transformed in multiple ways. For example, ZIP code fields can be used to index probability or affinity tables 1215 (generated by the database construction module 1205 shown in FIG. 13). … 3. Variables or functions of many variables or data fields: Individual data fields and/or variables can be combined with other data fields to exploit higher order interactions. … 4. Temporal data: Raw account data consist of sequences of transactions over time. Several classes of variables are constructed to exploit this sequence. 
Variables designed to capture the temporal nature include summary variables (i.e., moving averages), rate estimation variables (i.e., using Kalman filtering techniques), periodicity or recurrent event detection (i.e., variables designed to detect the most frequently called telephone numbers, regular grocery stores, and periodic payments). Signal processing techniques can be used to develop custom temporal filters. Hidden Markov models can also be used to update behavioral state transitions with each transaction.”).])
wherein the lookup means is for applying insights from the aggregated insights dictionary to records in the second and third partitions. ([Figure 12, elements 1215, 1220, 1235, and 1255; paragraph [0078]: variable creation step receiving a training set and a validation set (“records in the second and third partitions”) and the summary statistics and reference data tables (“[raw] insights dictionary”) (“The three data sets 1235, 1245 and 1255 are passed on to a variable creation module 1220. … The variable creation module 1220 transforms these raw data fields into a mathematical representation of the data, so that mathematical modeling and optimization methods can be applied to generate predictions about account or transaction behavior. The complete mathematical representation is referred to as a 'pattern', while individual elements within a pattern are referred to as 'variables'.”).] [Figure 12, elements 1220, 1225, 1230, 1235, and 1255; paragraph [0079]; paragraph [0084]: variable creation step generating transformed data (“aggregated insights dictionary”), with the variable selection and generate exemplar patterns steps applying this information to the training set and validation set (“lookup means is for applying insights from the aggregated insights dictionary to records in the second and third partitions”) (“The variable creation process 1220 uses several techniques to transform raw data…”); (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression. At 1230 pattern exemplars are constructed, using the selected variable selected in 1225.”).])
Regarding Claim 3, 
Dillon in view of Berk teaches The system of claim 2 
wherein the aggregation rule corresponds to a data-type of fields in the records. ([paragraph [0081] and Table 1: an aggregation rule involving fields in a transactional record, with corresponding variables (“label-value pairs”) with labels (marital status [M,S], gender [M/F]) (“corresponding to a data-type of fields in the records”) (“2. Categorical data transformations: Binary, categorical data fields (e.g., marital status[M,S], gender[M/F], etc.) can simply be transformed binary, logical data values ([0,1] or [ -1,1 ]). Higher-dimensional, categorical variables can be transformed in multiple ways. For example, ZIP code fields can be used to index probability or affinity tables 1215 (generated by the database construction module 1205 shown in FIG. 13).”).])
Regarding Claim 4,
Dillon in view of Berk teaches The system of claim 2, further comprising 
computer executable instructions embedded on a fixed tangible medium, which upon execution, cause a computer to perform the steps of: ([paragraph [0007]: a computerized system performing described functions, including partitioning data, creating a raw insights dictionary, using aggregation rules to produce an aggregated insights dictionary, training and evaluating a machine-learning model (“The present invention also includes a computerized system for augmenting data from a source database with data from a reference database to generate an augmented database that can be used for predictive modeling, including a source database including structured data, a reference database having reference data, a locator component configured to use the structured data to locate reference data in the reference database suitable for association with the source database, an analyzer component configured to process the reference data into a set of descriptors and associating the descriptors to the source data to form an augmented database, a predictive modeling component configured to classify behavior with the augmented database, and a data mining component configured to conduct searches of data in the augmented database.”).])
for each record in the second partition, querying the aggregated insights dictionary for relevant insighted data based on the fields and values in the record; and ([Figure 12, elements 1220, 1225, 1235, and 1255; paragraph [0079]; paragraph [0084]: variable creation step generating transformed data (“aggregated insights dictionary”), with the variable selection step applying and selecting this information (“querying … for relevant insighted data”) based on the fields and values in the records in the training set and validation set (“The variable creation process 1220 uses several techniques to transform raw data…”); (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression. At 1230 pattern exemplars are constructed, using the selected variable selected in 1225.”).])
appending the relevant insighted data to the record in the second partition. ([Figure 12, elements 1230, 1235, 1240, and 1255; Figure 15, elements 1505, 1520; paragraphs [0084]-[0085]: generating pattern exemplars step constructing the pattern exemplars consisting of selected variables (“relevant insighted data”) and appending them to records in the training set for training the predictive model (“appending the relevant insighted data to the record in the second partition”) (“At 1230 pattern exemplars are constructed, using the selected variable selected in 1225. … The pattern exemplars constructed in 1230 are used to train (or construct) a pattern recognition model 1240.”).])
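For illustration only, the querying and appending steps recited in Claim 4 can be sketched as a dictionary lookup keyed on label-value pairs. The aggregation rule used here (the mean of a numeric target per label-value pair), the field names, and both function names are hypothetical assumptions chosen for the example; they are not taken from the application or from Dillon.

```python
from collections import defaultdict

def build_insights_dictionary(records, target_field):
    """Map each (label, value) pair to the mean target observed with it."""
    totals = defaultdict(lambda: [0.0, 0])   # (label, value) -> [sum, count]
    for record in records:
        for label, value in record.items():
            if label == target_field:
                continue
            entry = totals[(label, value)]
            entry[0] += record[target_field]
            entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

def append_insights(record, insights, target_field):
    """Return a copy of record with one added field per matched pair."""
    enriched = dict(record)
    for label, value in record.items():
        if label != target_field and (label, value) in insights:
            enriched[f"{label}_insight"] = insights[(label, value)]
    return enriched

# Hypothetical first-partition records used only to derive insights.
first_partition = [
    {"marital_status": "M", "responded": 1},
    {"marital_status": "M", "responded": 0},
    {"marital_status": "S", "responded": 1},
]
insights = build_insights_dictionary(first_partition, "responded")

# A hypothetical second-partition (training) record is queried and enriched.
training_record = {"marital_status": "M", "responded": 0}
enriched = append_insights(training_record, insights, "responded")
```

In this sketch the dictionary is built only from the first partition, so the value appended to a training record never depends on that record's own target.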
Regarding Claim 5,
Dillon in view of Berk teaches The system of claim 4, further comprising 
computer executable instructions embedded on a fixed tangible medium, which upon execution, cause a computer to perform the steps of: ([paragraph [0007]: a computerized system performing described functions, including partitioning data, creating a raw insights dictionary, using aggregation rules to produce an aggregated insights dictionary, training and evaluating a machine-learning model (“The present invention also includes a computerized system for augmenting data from a source database with data from a reference database to generate an augmented database that can be used for predictive modeling, including a source database including structured data, a reference database having reference data, a locator component configured to use the structured data to locate reference data in the reference database suitable for association with the source database, an analyzer component configured to process the reference data into a set of descriptors and associating the descriptors to the source data to form an augmented database, a predictive modeling component configured to classify behavior with the augmented database, and a data mining component configured to conduct searches of data in the augmented database.”).])
training a data model from the second partition using the appended insighted data in the records; ([Figure 12, elements 1230, 1240; Figure 15, elements 1505, 1520; paragraph [0085]: generating pattern exemplars step constructing the pattern exemplars consisting of selected variables (“relevant insighted data”) and appending them to records in the training set for training the predictive model (“training a data model from the second partition using the appended insighted data in the records”) (“The pattern exemplars constructed in 1230 are used to train (or construct) a pattern recognition model 1240.”).])
applying the data model to a record in the third set to generate predicted scores for one or more tagged fields in the record; and ([Figure 12, element 1250: model evaluation performed using appended transactional records containing exemplar patterns in the validation set from the third partition (“a record in the third set”).] [Figure 15, elements 1515, 1520, 1525, 1530, 1540, 1545; paragraphs [0090]-[0095]: a trained model using a transactional record from the validation set (“a record from the third set”) to generate prediction/scores (“applying the data model … to generate predicted scores”) using the exemplar patterns consisting of variables (“one or more tagged fields in the record”) (“A schematic representation of the model training process is illustrated in FIG. 15. Model training is the process wherein the set of model parameters is optimized to minimize prediction or classification error. As discussed above, the training exemplars are constructed at 1230 in the variable selection process at 1225 of FIG. 12. An example is a single data record, consisting of: 1. a transaction or account identifier (key) 1515, 2. a series of raw or calculated numerical quantities, or variables 1520 (collectively referred to as a "pattern"), 3. one or more binary or continuously-valued tags 1525, representing the historical outcome associated with the transaction, and 4. a dataset partition label (Validation tag 1530), which indicates how the exemplar is to be used in the training process. A model 1540 is a mathematical function or mapping, which takes a numerical pattern as input and returns one or more values indicating a score or prediction 1545 of this pattern.”).])
comparing the predicted scores to the actual values of the tagged fields. ([Figure 15, elements 1505, 1525, 1555: a comparator comparing predicted scores to known outcomes or classification tags (“actual values of the tagged fields”) in the appended transactional record in the validation set (“The accuracy of a model's predictions is measured using a comparator 1555 to compare the model 1540 to known outcomes or classifications tags 1525.”).])
Regarding Claim 6,
Dillon in view of Berk teaches The system of claim 2 
wherein the dictionary aggregation means further comprises 
statistical analysis means for determining the statistical significance of a label-value pair in a data record. ([paragraph [0075] and Table 1: a variable (“label-value pair”) with label (Transaction statistics for responders versus non-responders (segment by outcome tags)) based on a selected aggregation rule from the dataset sampling module involving detecting and calculating statistics for rare events (“statistical significance of a label-value pair”) (“A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks).”).] [Figure 12, element 1225; paragraph [0084]: variable selection step evaluating the transformed data (“aggregated insights dictionary”) based on their predictive value (“statistical significance”) according to the label-value pairs in the records in the training and validation sets (“statistical analysis means for determining the statistical significance of a label-value pair in a data record”) (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression.”).])
Regarding Claim 7,
Dillon in view of Berk teaches The system of claim 3 
wherein the aggregation rule corresponds to a data-type that is a string of natural language text. ([Figure 1, element 115; paragraph [0008]: reference database containing content in unstructured format (“string of natural language text”) (“The source database contains investment transactions and the reference database contains public information regarding companies, mutual funds and/or other investment interests. The source database contains insurance transactions, and wherein the reference database contains information regarding insurance products, claims and/or insurance evaluations. The source database contains product inventories, and wherein the reference database contains information describing products … The reference database contains data in an unstructured format … The processing of reference data in the reference database is accomplished by reducing natural language text to a set of weighted keywords.”).] [Figure 1, element 125; paragraph [0065]: an analyzer applying a series of steps (“aggregation rule”) involving content that is a natural language text string (“corresponds to a data-type that is a string of natural language text”) (“The analyzer 125 creates merchant descriptors, for example lists of weighted words … The text content is then parsed to extract unique words and the corresponding word counts by an extract words module 715 … The extracted words are matched against a lexicographic database 725 to map words to their linguistic roots or more common synonyms in a linguistic reduction module 720. The lexicographic database 725 maps words to their natural language root (e.g., running becomes ran). The linguistic reduction module 720 is further explained with reference to FIG. 10. The word counts are updated and each word is assigned a word weight based on the total number of keywords. The analyzer outputs a list of weighted keywords 730.”).])
Regarding Claim 8,
Dillon in view of Berk teaches The system of claim 3 
wherein the aggregation rule corresponds to a data-type that is a continuous quantitative value. ([paragraph [0080] and Table 1: an aggregation rule involving transaction amounts (“a continuous quantitative value”), with corresponding variables (“label-value pair”) with labels (average spending/month, max transaction amounts, estimated net worth) (“corresponds to a data-type that is a continuous quantitative value”) (“1. Numerical Transformations: Raw data already in numerical form (such as transaction amounts) may be used directly as variables themselves.”).])
Regarding Claim 9,
Dillon in view of Berk teaches The system of claim 3 
wherein the aggregation rule corresponds to a data-type that is a date. ([paragraph [0080] and Table 1: an aggregation rule involving a date, with a corresponding variable (“label-value pair”) with label (customer since date) (“corresponds to a data-type that is a date”) (“1. Numerical Transformations: Raw data already in numerical form (such as transaction amounts) may be used directly as variables themselves. Numerical or categorical data referencing time (such as dates) are simply transformed into a standard unit of time (e.g., 'days since Jan. 1, 1999' or 'seconds since account open date').”).])
Regarding Claim 10,
Dillon in view of Berk teaches The system of claim 3 
wherein the aggregation rule corresponds to a data-type that is a category with a large number of possible values. ([Paragraph [0082] and Table 1: an aggregation rule for a category data-type that has many possible values/strings (“gas”, “oil”, “convenience”) representing transaction amounts, with a corresponding variable (“label-value pair”) with label (ratio of spending on “preferences/luxuries” -vs- “necessities” (e.g., “percent of total spent on groceries, fuel, & utilities”)) (“corresponds to a data-type that is a category with a large number of possible values”) (“3. Variables or functions of many variables or data fields: Individual data fields and/or variables can be combined with other data fields to exploit higher order interactions. For example, transactions at a gas station SIC code or co-occurring with keywords 'gas', 'oil' and 'convenience' would normally suggest a gasoline purchase, with larger transaction amounts implying the customer owns a larger car or truck; however, high transaction amounts may also imply auto repair service. When appropriate, specific variables are designed to capture such non-linear relationships between several data fields.”).])
Regarding Claim 11,
Dillon teaches A method for processing data records in a computer database for predictive modeling, 
the data records comprising labeled data fields populated with values, ([Figure 1, element 105, paragraph [0045]: proprietary source data (“database”) that contains structured data with data fields and data elements (“values”) (“A proprietary or source data file 105 containing structured data, is provided to the system 100 by the user.”).] [paragraph [0044]: structured data stored in the form of transactional records (“data records”) containing labeled data fields and values (“In general, structured data refers to data stored in a standard format (such as a fixed-length transaction record). Structured data sets are designed to be read and understood by a computer. Thus, data fields within a record are defined by a pre-defined format and data elements stored within data fields typically can only take on particular values or ranges.”).] [paragraph [0081]: examples of labeled data fields and their corresponding values from a transactional record (“Binary, categorical data fields (e.g., marital status[M,S], gender[M/F], etc.)…”).]) 
comprising: 
generating an insights dictionary using data in the … set of records … ; ([Figure 13, element 130; paragraph [0046]: enhanced proprietary data including structured records from the proprietary source data, appended with unstructured descriptive data, in the form of natural language text (“The descriptive data derived by the analyzer 125 is then added to the original proprietary data 105 to create an enhanced proprietary data file 130. This enhanced data file 130 may be used in a predictive modeling module 135 and/or a data mining module 140 to extract useful business information.”).] [Figure 12, element 1205; Figure 13, elements 1215 and 1205 (including elements 1315, 1325, 1327, 1330, 1335); paragraph [0071]: database construction module using the enhanced proprietary data (“using data in the … set of records”) and applying database construction steps, creating summary statistics and reference data tables (“a [raw] insights dictionary”) (“Database construction involves several subprocesses that will be detailed below with reference to FIG. 13, including data cleaning, data augmentation, consolidation with other databases, interpolation (if necessary) between mismatched datasets, and data record sampling. The resulting dataset is referred to as a consolidated (or enhanced) dataset. During database construction, summary statistics and reference tables 1215 are generated for later use in a variable creation module 1220.”).] [Figure 13, elements 1327, 1330, and 1335; paragraphs [0074]-[0076]: data interpolation, dataset sampling, and preliminary data analysis steps within the database construction module generating the summary statistics and reference data tables (“generating a [raw] insights dictionary”) (“… a data interpolation module 1327 calculates or estimates intermediate values for the less frequently sampled database. … A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks). At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/ maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
for each record in the modeling set, querying the insights dictionary for relevant insighted data based on the fields and values in the record; and ([Figure 12, elements 1235, 1255; paragraph [0077]: training and validation sets after database construction and data partitioning form a modeling set for the predictive model (“The three data sets, a training data set 1235, a test data set 1245, and a validation dataset 1255 are used in different stages of the model building process 1200. Partitioning data into training and validation sets is a desirable precaution for any type of statistical modeling. The validation set (sometimes referred to as the “hold out” sample is used evaluate model predictions on unknown observations (data that the model has never “seen”).”).] [Figure 12, elements 1220, 1225, 1235, and 1255; paragraph [0079]; paragraph [0084]: variable creation step generating transformed data (“aggregated insights dictionary”), with the variable selection step applying and selecting this information (“querying … for relevant insighted data”) based on the fields and values in the records in the training set and validation set (“The variable creation process 1220 uses several techniques to transform raw data…”); (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression.”).])
appending the relevant insighted data to the record in the modeling set. ([Figure 12, elements 1230, 1235, 1240, and 1255; Figure 15, elements 1505, 1520; paragraphs [0084]-[0085]: generating pattern exemplars step constructing the pattern exemplars consisting of selected variables (“relevant insighted data”) and appending them to records in the training set for training the predictive model (“appending the relevant insighted data to the record in the modeling set”) (“At 1230 pattern exemplars are constructed, using the selected variable selected in 1225. … The pattern exemplars constructed in 1230 are used to train (or construct) a pattern recognition model 1240.”).])
However, Dillon does not teach
partitioning the data records into an insights set and a modeling set; 
... using data in the insights set … ;
Berk teaches
partitioning the data records into an insights set and a modeling set; ([p.21-p.23: partitioning a population of interest (“a database”) first into a test/observation sample (“an insights set”) and a training/modeling sample (“a modeling set”), using the test/observation sample to apply statistical inference, and using the training/modeling sample to “arrive at a preferred model” (e.g., training and evaluation of a model) (“Post-model-selection sampling distributions can be highly non-normal, very complex, and with unknown finite sample properties even when the model responsible for the data happens to be selected. There can be substantial bias in the regression estimates, and conventional tests and confidence intervals are undertaken at some peril. … If the post-model-selection sampling distributions may be problematic, probably the most effective solution is to have two random samples from the population of interest: a training sample and a test sample. The training sample is used to arrive at a preferred model. The test sample is used to estimate the parameters of the chosen model and to apply statistical inference. For the test sample, the model is known in advance. The requisite structure for proper statistical inference is in place, and problems resulting from post-model-selection statistical inference are prevented. The dual-sample approach is easy to implement once there are two samples.”).])
... using data in the insights set … ; ([p.21-p.23: partitioning a population of interest (“a database”) first into a test/observation sample (“an insights set”) and a training/modeling sample (“a modeling set”), using the test/observation sample to apply statistical inference (“… using data in the insights set …”), and using the training/modeling sample to “arrive at a preferred model” (e.g., training and evaluation of a model) (“Post-model-selection sampling distributions can be highly non-normal, very complex, and with unknown finite sample properties even when the model responsible for the data happens to be selected. There can be substantial bias in the regression estimates, and conventional tests and confidence intervals are undertaken at some peril. … If the post-model-selection sampling distributions may be problematic, probably the most effective solution is to have two random samples from the population of interest: a training sample and a test sample. The training sample is used to arrive at a preferred model. The test sample is used to estimate the parameters of the chosen model and to apply statistical inference. For the test sample, the model is known in advance. The requisite structure for proper statistical inference is in place, and problems resulting from post-model-selection statistical inference are prevented. The dual-sample approach is easy to implement once there are two samples.”).])
Dillon and Berk are analogous art as both teach using data to produce statistical inferences and to train predictive models.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the enhanced proprietary data database of Dillon and perform the database partitioning of Berk as a way to produce partitioned test/observation and modeling samples of the enhanced proprietary database. The motivation to combine is taught in Berk, as a way to avoid incorporating bias from the model and to prevent questionable statistical inferences when using data that has already been used for training and evaluating the model, improving the reliability of the model ([p.21-p.23: “Post-model-selection sampling distributions can be highly non-normal, very complex, and with unknown finite sample properties even when the model responsible for the data happens to be selected. There can be substantial bias in the regression estimates, and conventional tests and confidence intervals are undertaken at some peril. … If the post-model-selection sampling distributions may be problematic, probably the most effective solution is to have two random samples from the population of interest: a training sample and a test sample. The training sample is used to arrive at a preferred model. The test sample is used to estimate the parameters of the chosen model and to apply statistical inference. For the test sample, the model is known in advance. The requisite structure for proper statistical inference is in place, and problems resulting from post-model-selection statistical inference are prevented. The dual-sample approach is easy to implement once there are two samples.”]).
Regarding Claim 12,
Dillon in view of Berk teaches The method of claim 11 further comprising: 
partitioning the modeling set into a training set of records and a holdout set of records; ([Figure 12, elements 1235, 1255; paragraph [0077]: training and validation sets after database construction and data partitioning form a modeling set for the predictive model (“The three data sets, a training data set 1235, a test data set 1245, and a validation dataset 1255 are used in different stages of the model building process 1200. Partitioning data into training and validation sets is a desirable precaution for any type of statistical modeling. The validation set (sometimes referred to as the “hold out” sample is used evaluate model predictions on unknown observations (data that the model has never “seen”).”), with the validation set also referred to as a holdout set.])
training a data model from the modeling set using the appended insighted data in the records; ([Figure 12, elements 1235, 1255; paragraph [0077]: training and validation sets after database construction and data partitioning form a modeling set for the predictive model (“The three data sets, a training data set 1235, a test data set 1245, and a validation dataset 1255 are used in different stages of the model building process 1200. Partitioning data into training and validation sets is a desirable precaution for any type of statistical modeling. The validation set (sometimes referred to as the “hold out” sample is used evaluate model predictions on unknown observations (data that the model has never “seen”).”).] [Figure 12, elements 1230, 1240; Figure 15, elements 1505, 1520; paragraph [0085]: training data from the modeling set, appended with the pattern exemplars consisting of selected variables (“appended insighted data”), is used to train the predictive model (“The pattern exemplars constructed in 1230 are used to train (or construct) a pattern recognition model 1240.”).])
applying the data model to a record in the holdout set to generate predicted scores for one or more tagged fields in the record; and ([Figure 12, element 1250: model evaluation performed using appended transactional records containing exemplar patterns in the validation set from the third partition (“the holdout set”).] [Figure 15, elements 1515, 1520, 1525, 1530, 1540, 1545; paragraphs [0090]-[0095]: a trained model using a transactional record from the validation set (“the holdout set”) to generate prediction/scores (“applying the data model … to generate predicted scores”) using the exemplar patterns consisting of variables (“one or more tagged fields in the record”) (“A schematic representation of the model training process is illustrated in FIG. 15. Model training is the process wherein the set of model parameters is optimized to minimize prediction or classification error. As discussed above, the training exemplars are constructed at 1230 in the variable selection process at 1225 of FIG. 12. An example is a single data record, consisting of: 1. a transaction or account identifier (key) 1515, 2. a series of raw or calculated numerical quantities, or variables 1520 (collectively referred to as a "pattern"), 3. one or more binary or continuously-valued tags 1525, representing the historical outcome associated with the transaction, and 4. a dataset partition label (Validation tag 1530), which indicates how the exemplar is to be used in the training process. A model 1540 is a mathematical function or mapping, which takes a numerical pattern as input and returns one or more values indicating a score or prediction 1545 of this pattern.”).])
comparing the predicted scores to the actual values of the tagged fields. ([Figure 15, elements 1505, 1525, 1555: a comparator comparing predicted scores to known outcomes or classification tags (“actual values of the tagged fields”) in the appended transactional record in the validation set (“the holdout set”) (“The accuracy of a model's predictions is measured using a comparator 1555 to compare the model 1540 to known outcomes or classifications tags 1525.”).])
Regarding Claim 13,
Dillon in view of Berk teaches The method of claim 11, wherein generating the insights dictionary comprises: 
generating a raw insights dictionary, including one entry for each unique label-value pair in the records in the insight set; and ([Figure 13, elements 1327, 1330, and 1335; paragraph [0076]: data interpolation, dataset sampling, and preliminary data analysis steps within the database construction module generating the summary statistics and reference data tables (“generating a raw insights dictionary”) by examining each transactional record; performing preliminary data analysis requires generating statistics on all data fields in the records (“each unique label-value pair in the records”) (“… a data interpolation module 1327 calculates or estimates intermediate values for the less frequently sampled database. … A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks). At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/ maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
generating an aggregated insights dictionary, including at least one entry for an aggregation of statistically insignificant unique label-value pairs. ([Figure 12, elements 1220; paragraph [0078]: variable creation step applying transformation rules (“aggregation rules”) on the fields and values in the records in the training and validation sets, using information from the summary statistics and tables (“raw insights dictionary”) to transform data (“aggregated insights dictionary”) for use in the variable selection and generate exemplar patterns steps (“The variable creation module 1220 transforms these raw data fields into a mathematical representation of the data, so that mathematical modeling and optimization methods can be applied to generate predictions about account or transaction behavior. The complete mathematical representation is referred to as a 'pattern', while individual elements within a pattern are referred to as 'variables'.”).] [paragraph [0075] and Table 1: selected aggregation rule involving detecting and calculating statistics for rare events, with a corresponding variable (“unique label-value pair”) with label (Transaction statistics for responders versus non-responders (segment by outcome tags)), with the usage of the “non-responders” outcome representing “at least one entry for an aggregation of statistically insignificant unique label-value pairs” (“A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks).”).])
Regarding Claim 14,
Dillon in view of Berk teaches The method of claim 13, wherein generating the aggregated insights dictionary comprises: 
selecting an aggregation rule from a set of pre-defined aggregation rules, the pre-defined aggregation rules corresponding to the data-types of fields in the records; ([paragraphs [0079]-[0083] and Table 1: variable creation step applying transformation rules (“selecting an aggregation rule from a set of pre-defined aggregation rules”) using data from the training and validation sets, and the summary statistics and reference data tables, with the pre-defined aggregation rule producing corresponding variables (“label-value pair”) with fields in a transactional record (such as marital status [M,S], gender [M/F]) (“corresponding to data-types of fields in the records”) (“The variable creation process 1220 uses several techniques to transform raw data. Examples of data transformation techniques and specific variables are listed below for illustrative purposes with additional examples of variables given in Table 1. … 1. Numerical Transformations: Raw data already in numerical form (such as transaction amounts) may be used directly as variables themselves. Numerical or categorical data referencing time (such as dates) are simply transformed into a standard unit of time (e.g., 'days since Jan. 1, 1999' or 'seconds since account open date'). … 2. Categorical data transformations: Binary, categorical data fields ( e.g., marital status[M,S], gender[M/F], etc.) can simply be transformed binary, logical data values ([0,1] or [ -1,1 ]). Higher-dimensional, categorical variables can be transformed in multiple ways. For example, ZIP code fields can be used to index probability or affinity tables 1215 (generated by the database construction module 1205 shown in FIG. 13). … 3. Variables or functions of many variables or data fields: Individual data fields and/or variables can be combined with other data fields to exploit higher order interactions. … 4. Temporal data: Raw account data consist of sequences of transactions over time. Several classes of variables are constructed to exploit this sequence. Variables designed to capture the temporal nature include summary variables (i.e., moving averages), rate estimation variables (i.e., using Kalman filtering techniques), periodicity or recurrent event detection (i.e., variables designed to detect the most frequently called telephone numbers, regular grocery stores, and periodic payments). Signal processing techniques can be used to develop custom temporal filters. Hidden Markov models can also be used to update behavioral state transitions with each transaction.”).])
determining that a first record in the insights set includes a label-value pair that is not statistically significant with respect to other records in the insights set; ([paragraph [0075] and Table 1: selected aggregation rule involving detecting and calculating statistics for rare events, with a corresponding variable (“label-value pair”) with label (Transaction statistics for responders versus non-responders (segment by outcome tags)), with the “non-responders” outcome representing a determination that “a label-value pair that is not statistically significant with respect to other records in the insights set” (“A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks).”).])
determining that the selected aggregation rule is applicable to the first record; and ([Figure 12, elements 1220, 1225, and 1230; paragraph [0079]; paragraph [0084]: variable selection and generate exemplar patterns steps applying transformed data from the variable creation step, producing an “aggregated insights dictionary” containing the aggregated insight from paragraph [0075] that is “not statistically significant”; the generate exemplar patterns step determines whether the selected aggregated insight is applicable to a data record (“The variable creation process 1220 uses several techniques to transform raw data.”); (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression. At 1230 pattern exemplars are constructed, using the selected variable selected in 1225.”).])
aggregating the label-value pair from the first record with information in the aggregated insights dictionary according to the aggregation rule. ([Figure 12, elements 1220, 1225, and 1230; paragraph [0079]; paragraph [0084]: variable selection and generate exemplar patterns steps applying transformed data from the variable creation step, producing an “aggregated insights dictionary” containing the aggregated insight from paragraph [0075] that is “not statistically significant”; after determining that the selected aggregated insight (according to the aggregation rule) is applicable to a data record, the generate exemplar patterns step appends the aggregated insight to the data record (“aggregating the label-value pair from the first record with information in the aggregated insights dictionary”) (“The variable creation process 1220 uses several techniques to transform raw data.”); (“At 1225 variables are evaluated for their predictive value alone and in combination with other variables, and the most effective set of variables are selected for inclusion into a model. This can be done using a variety of standard variable sensitivity techniques, including analyzing the predictive performance of the variable in isolation, ranking the variables by linear regression weight magnitude and using step-wise regression. At 1230 pattern exemplars are constructed, using the selected variable selected in 1225.”).])
Regarding Claim 16,
Dillon in view of Berk teaches The method of claim 14,
wherein the selected aggregation rule corresponds to a natural language text data-type, and ([Figure 1, element 115; paragraph [0008]: reference database containing content in unstructured format (“string of natural language text”) (“The source database contains investment transactions and the reference database contains public information regarding companies, mutual funds and/or other investment interests. The source database contains insurance transactions, and wherein the reference database contains information regarding insurance products, claims and/or insurance evaluations. The source database contains product inventories, and wherein the reference database contains information describing products …The reference database contains data in an unstructured format …The processing of reference data in the reference database is accomplished by reducing natural language text to a set of weighted keywords.”).] [Figure 1, element 125; paragraph [0065]: an analyzer applying a series of steps (“selected aggregation rule”) involving content that is a natural language text string (“corresponds to a data-type that is a string of natural language text”) (“The analyzer 125 creates merchant descriptors, for example lists of weighted words … The text content is then parsed to extract unique words and the corresponding word counts by an extract words module 715 … The extracted words are matched against a lexicographic database 725 to map words to their linguistic roots or more common synonyms in a linguistic reduction module 720. The lexicographic database 725 maps words to their natural language root (e.g., running becomes ran). The linguistic reduction module 720 is further explained with reference to FIG. 10. The word counts are updated and each word is assigned a word weight based on the total number of keywords. The analyzer outputs a list of weighted keywords 730.”).])
wherein aggregating the label-value pair comprises stemming the value. ([Figure 1, elements 115, 120, 125; paragraph [0046]: an analyzer appending content of words (“values” of a “label-value pair”) that is a natural language text string (“label” of a “label-value pair”) to  (“The content is passed from the cache 120 to the analyzer 125, which reduces the data obtained from the reference data sources 115 to descriptive data.”).] [Figure 1, element 125; paragraph [0065]: an analyzer applying a series of steps, of which one step consists of mapping and reducing words (“values”) in the content to their linguistic roots (“aggregating the label-value pair comprises stemming the value”) (“The analyzer 125 creates merchant descriptors, for example lists of weighted words … The text content is then parsed to extract unique words and the corresponding word counts by an extract words module 715 … The extracted words are matched against a lexicographic database 725 to map words to their linguistic roots or more common synonyms in a linguistic reduction module 720. The lexicographic database 725 maps words to their natural language root (e.g., running becomes ran). The linguistic reduction module 720 is further explained with reference to FIG. 10. The word counts are updated and each word is assigned a word weight based on the total number of keywords. The analyzer outputs a list of weighted keywords 730.”).])
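For illustration only, the word-extraction, linguistic-reduction, and weighting steps quoted above from Dillon paragraph [0065] can be sketched as follows. This sketch is not taken from the reference: the LEXICON table is a hypothetical stand-in for lexicographic database 725, the function name weighted_keywords is invented for this example, and computing each weight as a word count divided by the total keyword count is an assumption about what “a word weight based on the total number of keywords” means.

```python
import re
from collections import Counter

# Hypothetical stand-in for a lexicographic database: word -> root form.
LEXICON = {"running": "run", "runs": "run", "stores": "store"}


def weighted_keywords(text, lexicon=LEXICON):
    """Reduce free text to a mapping of root word -> weight.

    Assumption: weight = (count of the root word) / (total keyword count).
    """
    # Extract lowercase alphabetic words from the text.
    words = re.findall(r"[a-z]+", text.lower())
    # Map each word to its root when the lexicon knows it, else keep it.
    roots = [lexicon.get(word, word) for word in words]
    counts = Counter(roots)
    total = sum(counts.values())
    # An empty text yields an empty mapping, so no division by zero occurs.
    return {word: count / total for word, count in counts.items()}
```

Under these assumptions, the input "Running runs store stores" reduces to two root words, "run" and "store", each carrying half of the total weight.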
Regarding Claim 17,
Dillon in view of Berk teaches The method of claim 14,
wherein the selected aggregation rule corresponds to a continuous quantitative data-type, and ([Paragraph [0082] and Table 1: selected aggregation rule for a data-type that has many possible values/strings (e.g., “gas”, “oil”, “convenience”) representing transaction amounts (“corresponds to a continuous quantitative data-type”), with a corresponding variable (“label-value pair”) with label (ratio of spending on “preferences/luxuries” -vs- “necessities” (e.g., “percent of total spent on groceries, fuel, & utilities”)) that represents a grouping of the transaction amounts (“3. Variables or functions of many variables or data fields: Individual data fields and/or variables can be combined with other data fields to exploit higher order interactions. For example, transactions at a gas station SIC code or co-occurring with keywords 'gas', 'oil' and 'convenience' would normally suggest a gasoline purchase, with larger transaction amounts implying the customer owns a larger car or truck; however, high transaction amounts may also imply auto repair service. When appropriate, specific variables are designed to capture such non-linear relationships between several data fields.”).])
wherein aggregating the label-value pair comprises grouping the pair with label-value pairs containing values of the same sign. ([Paragraph [0082] and Table 1: selected aggregation rule for a data-type that has many possible strings (e.g., “gas”, “oil”, “convenience”) representing transaction amounts, with a corresponding variable (“label-value pair”) with label (ratio of spending on “preferences/luxuries” -vs- “necessities” (e.g., “percent of total spent on groceries, fuel, & utilities”)) that represents a grouping of the transaction amounts (“aggregating the label-value pair comprises grouping the pair with label-value pairs containing values of the same sign”) (“3. Variables or functions of many variables or data fields: Individual data fields and/or variables can be combined with other data fields to exploit higher order interactions. For example, transactions at a gas station SIC code or co-occurring with keywords 'gas', 'oil' and 'convenience' would normally suggest a gasoline purchase, with larger transaction amounts implying the customer owns a larger car or truck; however, high transaction amounts may also imply auto repair service. When appropriate, specific variables are designed to capture such non-linear relationships between several data fields.”).])
Regarding Claim 18,
Dillon in view of Berk teaches The method of claim 14,
wherein the selected aggregation rule corresponds to a hierarchical coding data-type, and ([Paragraph [0081] and Table 1: selected aggregation rule involving a zip code (“corresponds to a hierarchical coding data-type”), with a corresponding variable (“label-value pair”) with label (Probability tables based on geography (2-3 digit zip codes)) that represents reducing the zip code from 5 digits to 2-3 digits (“2. Categorical data transformations: Binary, categorical data fields ( e.g., marital status[M,S], gender[M/F], etc.) can simply be transformed binary, logical data values ([0,1] or [ -1,1 ]). Higher-dimensional, categorical variables can be transformed in multiple ways. For example, ZIP code fields can be used to index probability or affinity tables 1215 (generated by the database construction module 1205 shown in FIG. 13).”).])
wherein aggregating the label-value pair comprises single digit truncation. ([Paragraph [0081] and Table 1: selected aggregation rule involving a zip code (“corresponds to a hierarchical coding data-type”), with a corresponding variable (“label-value pair”) with label (Probability tables based on geography (2-3 digit zip codes)) that represents reducing the zip code from 5 digits to 2-3 digits (“aggregating the label-value pair comprises single digit truncation”) (“2. Categorical data transformations: Binary, categorical data fields ( e.g., marital status[M,S], gender[M/F], etc.) can simply be transformed binary, logical data values ([0,1] or [ -1,1 ]). Higher-dimensional, categorical variables can be transformed in multiple ways. For example, ZIP code fields can be used to index probability or affinity tables 1215 (generated by the database construction module 1205 shown in FIG. 13).”).])
Regarding Claim 19,
Dillon in view of Berk teaches The method of claim 14,
wherein the selected aggregation rule corresponds to a categorical data-type, and ([paragraph [0075] and Table 1: selected aggregation rule from the dataset sampling module involving detecting and calculating statistics for rare events (“corresponds to a categorical data-type”), with a corresponding variable (“label-value pair”) with label (Transaction statistics for responders versus non-responders (segment by outcome tags)) (“A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks).”).])
wherein aggregating the label-value pair comprises assigning the pair to a designated entry for statistically insignificant values. ([paragraph [0075] and Table 1: selected aggregation rule involving detecting and calculating statistics for rare events (“corresponds to a categorical data-type”), with a corresponding variable (“label-value pair”) with label (Transaction statistics for responders versus non-responders (segment by outcome tags)), with the usage of the “non-responders” outcome representing “assigning the pair to a designated entry for statistically insignificant values” (“A dataset sampling module 1330 is sometimes required for modeling studies designed to detect rare events. In the case of response-based marketing modeling, the typical response rate is low (under 3%) making such events rare from a statistical modeling viewpoint. In such cases, the rare events are left unsampled, but common events (the non-responders) are sampled down until an 'effective' ratio of cases are created. This ratio is highly dependent on the modeling methodology used ( e.g. decision trees, versus neural networks).”).])
Regarding Claim 20,
Dillon in view of Berk teaches The method of claim 14,
wherein the selected aggregation rule corresponds to a date data-type. ([paragraph [0080] and Table 1: selected aggregation rule involving a date, with corresponding variable (“label-value pair”) with label containing a date (e.g., customer since date) (“corresponds to a date data-type”) (“1. Numerical Transformations: Raw data already in numerical form (such as transaction amounts) may be used directly as variables themselves. Numerical or categorical data referencing time (such as dates) are simply transformed into a standard unit of time (e.g., 'days since Jan. 1, 1999' or 'seconds since account open date').”).])
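For illustration only, the date transformation quoted above from Dillon paragraph [0080] (“days since Jan. 1, 1999”) can be sketched as follows. The sketch is not taken from the reference; the names REFERENCE_DATE and days_since are hypothetical, and the choice of whole days as the standard unit is an assumption drawn from the quoted example.

```python
from datetime import date

# Reference point taken from the quoted example; any fixed date would work.
REFERENCE_DATE = date(1999, 1, 1)


def days_since(d, reference=REFERENCE_DATE):
    """Express a calendar date as a whole number of days after `reference`.

    Dates earlier than the reference produce a negative number.
    """
    return (d - reference).days
```

Under these assumptions, a raw date field becomes a plain integer that can be used directly as a numerical variable.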

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Dillon et al., U.S. PGPUB 2003/0088562, published 5/8/2003 [henceforth referred as Dillon] in view of Berk, Richard A., Statistical Inference After Model Selection, Journal of Quantitative Criminology, June 2010, DOI: 10.1007/s10940-009-9077-7, pp.1-31 [henceforth referred as Berk], as applied to Claims 11 and 13; and in further view of Dirac et al., U.S. PGPUB 2015/0379430, published 12/31/2015 [henceforth referred as Dirac].
Regarding Claim 15,
Dillon in view of Berk teaches The method of claim 13, 
wherein generating the raw insights dictionary comprises: 
reading a data record from the insights set; ([Figure 13, element 1335; paragraph [0076]: preliminary analysis step within the database construction module (“generating the raw insights dictionary”) that performs correlation analysis on all data fields on the enhanced proprietary database; performing correlation analysis requires reading a data record (“At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
identifying the value of a target data field in the record; ([Figure 13, element 1335; paragraph [0076]: preliminary analysis step within the database construction module (“generating the raw insights dictionary”) that performs correlation analysis on all data fields on the enhanced proprietary database; performing correlation analysis requires identifying the value of a target data field in the record (“At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
identifying a predictor field for the target data field; ([Figure 13, element 1335; paragraph [0076]: preliminary analysis step within the database construction module (“generating the raw insights dictionary”) that performs correlation analysis on all data fields on the enhanced proprietary database; performing correlation analysis requires identifying a predictor field for the target data field (“At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
identifying the value of the predictor field in the record; and ([Figure 13, element 1335; paragraph [0076]: preliminary analysis step within the database construction module (“generating the raw insights dictionary”) that performs correlation analysis on all data fields on the enhanced proprietary database; performing correlation analysis requires identifying the value of the predictor field in the record (“At 1335, preliminary analysis may be conducted on the resulting database. Preliminary analysis involves generating statistics (i.e., mean, standard deviations, minimum/maximum, missing value counts, etc.) on some or all data fields. Correlation analysis between fields (especially between any field value and the target values) is conducted to estimate the relative value of various data fields in predicting these targets. The results of these analyses are stored in tables 1215 for later use in variable creation, transformations, and evaluations.”).])
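For illustration only, the preliminary analysis quoted above from Dillon paragraph [0076] (per-field statistics and correlation between a field and the target values) can be sketched as follows. The sketch is not taken from the reference: the function names field_summary and pearson are hypothetical, missing values are assumed to be represented as None, and Pearson correlation is an assumed choice because the reference does not name a specific correlation measure.

```python
import math
import statistics


def field_summary(values):
    """Summary statistics for one data field; None marks a missing value.

    Assumes at least one non-missing value is present.
    """
    present = [v for v in values if v is not None]
    return {
        "mean": statistics.mean(present),
        "stdev": statistics.pstdev(present),  # population standard deviation
        "min": min(present),
        "max": max(present),
        "missing": len(values) - len(present),
    }


def pearson(xs, ys):
    """Pearson correlation between a predictor field and the target values.

    Assumes equal-length inputs with non-zero variance in both.
    """
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Computing either quantity requires reading each record and looking up both the predictor value and the target value in that record.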
However, Dillon in view of Berk does not teach
if an entry already exists in the dictionary for the identified predictor field-value combination, incrementing a counter for the entry. (Note that under its broadest reasonable interpretation, this claim limitation within a method recites a contingent limitation that is not required to be performed because the condition precedent (“if an entry already exists in the dictionary”) is not a positive contingent limitation on the claimed invention, as the claimed invention can still function and be practiced without the condition occurring. See MPEP 2111.04(II). Although it is not required to analyze contingent limitations that are not required to be performed in a claimed invention, for purposes of compact prosecution, this claim limitation will still be analyzed for obviousness.)
Dirac teaches
if an entry already exists in the dictionary for the identified predictor field-value combination, incrementing a counter for the entry. ([paragraph [0348]: a space-efficient alternate representation for identifying duplicates is a Bloom filter (“In the depicted embodiment, at least one space-efficient alternate representation 7030 of the training data set which may be used for duplicate detection, such as a Bloom filter, may be constructed.”).] [Figure 71a, 71b, element 7104; paragraphs [0353]-[0354]: usage of a Bloom filter with hashing to keep track of duplicate observation records (“if an entry already exists in the dictionary”) for a machine learning service, setting a bit within the filter when a duplicate is detected (“incrementing a counter for the entry”) (“FIGS. 71a and 71b collectively illustrate an example of a use of a Bloom filter for probabilistic detection of duplicate observation records at a machine learning service, according to at least some embodiments. A Bloom filter 7104 comprising 16 bits (Bit0 through Bit15) is shown being constructed from a training data set comprising ORs 7110A and 7110B in the depicted scenario. To construct the Bloom filter, a given OR 7110 may be provided as input to each of a set of hash functions H0, H1 and H2 in the depicted embodiment. The output of each hash function may then be mapped, e.g., using a modulo function, to one of the 16 bits of the filter 7104, and that bit may be set to 1. … the presence of a 1 at a given location within the Bloom filter may result from hash values generated for different ORs (or even from hash values generated for the same OR using different hash functions). As such, the presence of 1s at any given set of bit locations of the filter may not uniquely or necessarily imply the existence of a corresponding OR in the data set use to construct the filter. The size of the Bloom filter 7104 may be much smaller than the data set used to build the filter-for example, a filter of 512 bits may be used as an alternate representation of several megabytes of data. … As indicated in FIG. 71b, the same hash functions may be applied to the test data set ORs 7150 ( e.g., 7150A and 7150B) to detect possible duplicates with respect to the training data set. If a particular test data set OR 7150 maps to a set of bits that contains at least one zero, the duplicate detector may determine with certainty that the OR is not a duplicate.”).])
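For illustration only, the 16-bit, three-hash Bloom filter described in the quoted Dirac paragraphs can be sketched as follows. The sketch is not taken from the reference: the class name BloomFilter and its methods are hypothetical, and deriving the three hash functions from salted SHA-256 digests is an assumption, since the reference does not specify how the hash functions are built.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array plus several hash functions."""

    def __init__(self, num_bits=16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, record):
        # Assumption: one salted SHA-256 digest per hash function, mapped
        # to a bit index with a modulo, as the quoted passage describes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{record}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, record):
        """Set the bit selected by each hash function to 1."""
        for pos in self._positions(record):
            self.bits[pos] = 1

    def might_contain(self, record):
        """False means definitely not added; True means possibly added."""
        return all(self.bits[pos] for pos in self._positions(record))
```

As in the quoted passage, a record that maps to at least one zero bit is reported with certainty as not a duplicate, while a record that maps only to one bits is reported as a possible duplicate.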
Dillon in view of Berk and Dirac are analogous art as both teach methods for training and evaluating machine learned models.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the contingent limitation of Dillon in view of Berk and replace it with a Bloom filter of Dirac as a way to identify duplicate entries in a data set. The motivation to combine is taught in Dirac, as a way to detect and remove entries that are duplicates and as a space-efficient representation and optimal mechanism to maintain this information when dealing with large numbers of records in a data set ([paragraph [0348]: a space-efficient alternate representation for identifying duplicates is a Bloom filter (“In the depicted embodiment, at least one space-efficient alternate representation 7030 of the training data set which may be used for duplicate detection, such as a Bloom filter, may be constructed.”).] [paragraph [0004]: “Constraints on raw input data set size, cleansing or normalizing large numbers of potentially incomplete or error-containing records, and/or on the ability to extract representative subsets of the raw data also represent barriers that are not easy to overcome for many potential beneficiaries of machine learning techniques. For many machine learning problems, transformations may have to be applied on various input data variables before the data can be used effectively to train models. In some traditional machine learning environments, the mechanisms available to apply such transformations may be less than optimal---e.g., similar transformations may sometimes have to be applied one by one to many different variables of a data set, potentially requiring a lot of tedious and error-prone work.”).])

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Shao et al., U.S. Patent 7,191,150, Enhancing Delinquent Debt Collection Using Statistical Models of Debt Historical Information and Account Events, published 3/13/2007, filed 6/30/2000.
Yan et al., U.S. PGPUB 2011/0173116, System and Method of Detecting and Assessing Multiple Types of Risks Related to Mortgage Lending, published 7/14/2011, filed 1/13/2010.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332.  The examiner can normally be reached on Monday-Friday 8:00am - 4:30pm PT.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at 571-272-3768.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/WILLIAM WAI YIN KWAN/
Examiner, Art Unit 2121



/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121