DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Application filed on 9/11/2020.  Claims 1-20 are pending in the case.  Claims 1, 9, and 16 are independent claims.

Claim Rejections - 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. § 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 3, 11, and 18 are rejected under 35 U.S.C. § 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.  Each claim recites “repeatable actions to be performed on said model training dataset” and “wherein said model is completed by performing said one or more repeatable actions on said model training dataset,” but also, contradictorily, recites “performing, automatically, said one or more repeatable actions on said holdout dataset.”  It is therefore unclear on which dataset the repeatable actions are performed.  For the purpose of prior art analysis, Examiner assumes the repeatable actions are performed on the model training dataset.

Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 3-9, 11-16, and 18-20 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Achin et al. (US 2017/0243140 A1, hereinafter Achin).

As to independent claim 1, Achin discloses a method for automating model validation, said method comprising:
receiving an original dataset via a processor (“At step 404 of method 400, exploration engine 110 loads the data (e.g., by reading the specified file or accessing the specified information systems),” paragraph 0146 lines 1-3);
segmenting, automatically, said original dataset into a plurality of data groups (“In automatic mode, the exploration engine 110 partitions the dataset (step 418) using a default sampling algorithm,” paragraph 0157 lines 3-5), wherein said plurality of data groups include a model training dataset and a holdout dataset (“predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into a training set and a ‘holdout’ test set,” paragraph 0160 lines 2-4);
generating a model with said model training dataset (“The training set may then be used to train and evaluate the predictive models,” paragraph 0160 lines 6-7); and
validating said model with said holdout dataset (“the holdout test set may be reserved strictly for testing the predictive models,” paragraph 0160 lines 7-8).
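
For illustration only, the partition, generate, and validate sequence mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of how Achin or the present application implements any step: the function names, the 20% holdout fraction, the fixed seed, and the toy mean-value model are all hypothetical.

```python
import random


def split_holdout(rows, holdout_fraction=0.2, seed=0):
    """Partition rows into (training, holdout) by a uniform random shuffle.

    The fraction and seed are illustrative assumptions, not values from any reference.
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def fit_mean_model(training):
    """Toy stand-in for model generation: always predict the mean training target."""
    mean_target = sum(y for _, y in training) / len(training)
    return lambda x: mean_target


def mean_absolute_error(model, rows):
    """Score a model on rows that were not used to fit it."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


rows = [(x, 2.0 * x) for x in range(10)]
training, holdout = split_holdout(rows)
model = fit_mean_model(training)                      # generated from training rows only
holdout_error = mean_absolute_error(model, holdout)   # validated on holdout rows only
```

The holdout rows never reach `fit_mean_model`, which corresponds to the holdout test set being "reserved strictly for testing."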

As to dependent claim 3, Achin further discloses a method comprising:
retrieving, by said processor, said model training dataset from said original dataset (“predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into a training set and a ‘holdout’ test set,” paragraph 0160 lines 2-4), wherein said holdout dataset is unavailable for use in said model training dataset (“the holdout test set may be reserved strictly for testing the predictive models,” paragraph 0160 lines 7-8);
defining, by a user, one or more repeatable actions to be performed on said model training dataset via the processor (“To facilitate cross-validation, predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into K ‘folds’. Cross-validation comprises fitting a predictive model to the partitioned dataset K times, such that during each fitting, a different fold serves as the test set and the remaining folds serve as the training set,” paragraph 0159 lines 1-6); and
performing, automatically, said one or more repeatable actions on said holdout dataset (“To facilitate cross-validation, predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into K ‘folds’. Cross-validation comprises fitting a predictive model to the partitioned dataset K times, such that during each fitting, a different fold serves as the test set and the remaining folds serve as the training set,” paragraph 0159 lines 1-6);
wherein said generating said model is completed by performing said one or more repeatable actions on said model training dataset (“The training set may then be used to train and evaluate the predictive models,” paragraph 0160 lines 6-7).
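
For illustration only, the K-fold cross-validation described in the cited passage (fitting K times, with a different fold serving as the test set each time) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the interleaved fold assignment are hypothetical, and under the interpretation adopted in the § 112(b) rejection above the rows passed in would be the model training dataset only.

```python
def k_fold_splits(n_rows, k):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    # Interleaved assignment of row indices to folds (an illustrative choice).
    folds = [list(range(start, n_rows, k)) for start in range(k)]
    for held_out, test_indices in enumerate(folds):
        train_indices = sorted(
            index
            for position, fold in enumerate(folds)
            if position != held_out
            for index in fold
        )
        yield train_indices, test_indices


def cross_validate(rows, k, fit, score):
    """Fit K times; each fold serves exactly once as the test set."""
    scores = []
    for train_indices, test_indices in k_fold_splits(len(rows), k):
        model = fit([rows[i] for i in train_indices])
        scores.append(score(model, [rows[i] for i in test_indices]))
    return scores
```

Here `fit` and `score` are caller-supplied callables, so the same repeatable procedure can be rerun on any partition without change.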

As to dependent claim 4, Achin further discloses a method comprising:
selecting said original dataset from a catalog (“The user can chose from previously loaded datasets or create a new dataset, either from a file or instructions for retrieving data from other information systems,” paragraph 0145 lines 3-6); and
aggregating said original dataset in a project (“At step 404 of method 400, exploration engine 110 loads the data (e.g., by reading the specified file or accessing the specified information systems),” paragraph 0146 lines 1-3).

As to dependent claim 5, Achin further discloses a method wherein said selecting said original dataset includes receiving user input by a user, wherein said user input includes said user selecting said original dataset (“The user can chose from previously loaded datasets or create a new dataset, either from a file or instructions for retrieving data from other information systems,” paragraph 0145 lines 3-6).

As to dependent claim 6, Achin further discloses a method comprising:
comparing validation results of said model to pre-selected metrics (“suitability scores,” paragraph 0031 line 3), wherein said pre-selected metrics establish a model validation threshold (“selecting one or more predictive modeling procedures having suitability scores that exceed a threshold suitability score,” paragraph 0031 lines 5-7); and
rejecting said model if said model fails to meet said model validation threshold (“selecting one or more predictive modeling procedures having suitability scores that exceed a threshold suitability score,” paragraph 0031 lines 5-7).
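
For illustration only, comparing validation results to a threshold and rejecting a model that falls short can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, every metric is assumed to be "higher is better," and a metric with no reported result is treated as a failure.

```python
def failed_metrics(validation_results, thresholds):
    """Return the metrics whose validation score is missing or below its threshold."""
    return {
        name: (validation_results.get(name), minimum)
        for name, minimum in thresholds.items()
        if validation_results.get(name) is None
        or validation_results[name] < minimum
    }


def accept_model(validation_results, thresholds):
    """Accept a model only if no pre-selected metric fails its threshold."""
    return not failed_metrics(validation_results, thresholds)
```

Returning the failing metrics, rather than a bare boolean, lets a caller report why a model was rejected.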

As to dependent claim 7, Achin further discloses a method wherein said pre-selected metrics are selected from a list consisting of fairness, bias, quality (“comparing and ranking performance by other quality measures,” paragraph 0221 lines 7-8), and drift.

As to dependent claim 8, Achin further discloses a method wherein segmenting, automatically, said original dataset into a plurality of data groups includes randomly, in a uniform fashion, segmenting said original dataset (“In automatic mode, the exploration engine 110 partitions the dataset (step 418) using a default sampling algorithm,” paragraph 0157 lines 3-5).

As to independent claim 9, Achin discloses a system that automatically validates models, said system comprising:
a memory (“computer readable medium,” paragraph 0301 line 2); and
a processor (“processor,” paragraph 0298 line 5) in communication with said memory, said processor being configured to perform operations comprising:
receiving an original dataset (“At step 404 of method 400, exploration engine 110 loads the data (e.g., by reading the specified file or accessing the specified information systems),” paragraph 0146 lines 1-3);
segmenting, automatically, said original dataset into a plurality of data groups (“In automatic mode, the exploration engine 110 partitions the dataset (step 418) using a default sampling algorithm,” paragraph 0157 lines 3-5), wherein said plurality of data groups include a model training dataset and a holdout dataset (“predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into a training set and a ‘holdout’ test set,” paragraph 0160 lines 2-4);
generating a model with said model training dataset (“The training set may then be used to train and evaluate the predictive models,” paragraph 0160 lines 6-7); and
validating said model with said holdout dataset (“the holdout test set may be reserved strictly for testing the predictive models,” paragraph 0160 lines 7-8).

As to dependent claim 11, Achin further discloses a system wherein the operations further comprise:
retrieving, by said processor, said model training dataset from said original dataset (“predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into a training set and a ‘holdout’ test set,” paragraph 0160 lines 2-4), wherein said holdout dataset is unavailable for use in said model training dataset (“the holdout test set may be reserved strictly for testing the predictive models,” paragraph 0160 lines 7-8);
defining, by a user, one or more repeatable actions to be performed on said model training dataset via the processor (“To facilitate cross-validation, predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into K ‘folds’. Cross-validation comprises fitting a predictive model to the partitioned dataset K times, such that during each fitting, a different fold serves as the test set and the remaining folds serve as the training set,” paragraph 0159 lines 1-6); and
performing, automatically, said one or more repeatable actions on said holdout dataset (“To facilitate cross-validation, predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into K ‘folds’. Cross-validation comprises fitting a predictive model to the partitioned dataset K times, such that during each fitting, a different fold serves as the test set and the remaining folds serve as the training set,” paragraph 0159 lines 1-6);
wherein said generating said model is completed by performing said one or more repeatable actions on said model training dataset (“The training set may then be used to train and evaluate the predictive models,” paragraph 0160 lines 6-7).

As to dependent claim 12, Achin further discloses a system wherein the operations further comprise:
selecting said original dataset from a catalog (“The user can chose from previously loaded datasets or create a new dataset, either from a file or instructions for retrieving data from other information systems,” paragraph 0145 lines 3-6); and
aggregating said original dataset in a project (“At step 404 of method 400, exploration engine 110 loads the data (e.g., by reading the specified file or accessing the specified information systems),” paragraph 0146 lines 1-3).

As to dependent claim 13, Achin further discloses a system wherein said selecting said original dataset includes receiving user input by a user, wherein said user input includes said user selecting said original dataset (“The user can chose from previously loaded datasets or create a new dataset, either from a file or instructions for retrieving data from other information systems,” paragraph 0145 lines 3-6).

As to dependent claim 14, Achin further discloses a system wherein the operations further comprise:
comparing validation results of said model to pre-selected metrics (“suitability scores,” paragraph 0031 line 3), wherein said pre-selected metrics establish a model validation threshold (“selecting one or more predictive modeling procedures having suitability scores that exceed a threshold suitability score,” paragraph 0031 lines 5-7); and
rejecting said model if said model fails to meet said model validation threshold (“selecting one or more predictive modeling procedures having suitability scores that exceed a threshold suitability score,” paragraph 0031 lines 5-7).

As to dependent claim 15, Achin further discloses a system wherein segmenting, automatically, said original dataset into a plurality of data groups includes randomly, in a uniform fashion, segmenting said original dataset (“In automatic mode, the exploration engine 110 partitions the dataset (step 418) using a default sampling algorithm,” paragraph 0157 lines 3-5).

As to independent claim 16, Achin discloses a computer program product for automatic model validation, said computer program product comprising a computer readable storage medium (“computer readable medium,” paragraph 0301 line 2) having program instructions embodied therewith, said program instructions executable by a processor to cause said processor perform a function, said function comprising:
receiving an original dataset (“At step 404 of method 400, exploration engine 110 loads the data (e.g., by reading the specified file or accessing the specified information systems),” paragraph 0146 lines 1-3);
segmenting, automatically, said original dataset into a plurality of data groups (“In automatic mode, the exploration engine 110 partitions the dataset (step 418) using a default sampling algorithm,” paragraph 0157 lines 3-5), wherein said plurality of data groups include a model training dataset and a holdout dataset (“predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into a training set and a ‘holdout’ test set,” paragraph 0160 lines 2-4);
generating a model with said model training dataset (“The training set may then be used to train and evaluate the predictive models,” paragraph 0160 lines 6-7); and
validating said model with said holdout dataset (“the holdout test set may be reserved strictly for testing the predictive models,” paragraph 0160 lines 7-8).

As to dependent claim 18, Achin further discloses a computer program product wherein said function further comprises:
retrieving, by said processor, said model training dataset from said original dataset (“predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into a training set and a ‘holdout’ test set,” paragraph 0160 lines 2-4), wherein said holdout dataset is unavailable for use in said model training dataset (“the holdout test set may be reserved strictly for testing the predictive models,” paragraph 0160 lines 7-8);
defining, by a user, one or more repeatable actions to be performed on said model training dataset via the processor (“To facilitate cross-validation, predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into K ‘folds’. Cross-validation comprises fitting a predictive model to the partitioned dataset K times, such that during each fitting, a different fold serves as the test set and the remaining folds serve as the training set,” paragraph 0159 lines 1-6); and
performing, automatically, said one or more repeatable actions on said holdout dataset (“To facilitate cross-validation, predictive modeling system 100 may partition the dataset (or suggest a partitioning of the dataset) into K ‘folds’. Cross-validation comprises fitting a predictive model to the partitioned dataset K times, such that during each fitting, a different fold serves as the test set and the remaining folds serve as the training set,” paragraph 0159 lines 1-6);
wherein said generating said model is completed by performing said one or more repeatable actions on said model training dataset (“The training set may then be used to train and evaluate the predictive models,” paragraph 0160 lines 6-7).

As to dependent claim 19, Achin further discloses a computer program product wherein said function further comprises:
comparing validation results of said model to pre-selected metrics (“suitability scores,” paragraph 0031 line 3), wherein said pre-selected metrics establish a model validation threshold (“selecting one or more predictive modeling procedures having suitability scores that exceed a threshold suitability score,” paragraph 0031 lines 5-7), and wherein said pre-selected metrics are selected from a list consisting of fairness, bias, quality (“comparing and ranking performance by other quality measures,” paragraph 0221 lines 7-8), and drift; and
rejecting said model if said model fails to meet said model validation threshold (“selecting one or more predictive modeling procedures having suitability scores that exceed a threshold suitability score,” paragraph 0031 lines 5-7).

As to dependent claim 20, Achin further discloses a computer program product wherein segmenting, automatically, said original dataset into a plurality of data groups includes randomly, in a uniform fashion, segmenting said original dataset (“In automatic mode, the exploration engine 110 partitions the dataset (step 418) using a default sampling algorithm,” paragraph 0157 lines 3-5).

Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 10, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over Achin in view of Li et al. (US 5553218 A, hereinafter Li).

As to dependent claim 2, the rejection of claim 1 is incorporated.
Achin does not appear to expressly teach a method comprising:
said original dataset has at least one linked dataset; and
said holdout dataset maintains referential integrity across linked datasets.
Li teaches a method comprising:
a first dataset has at least one linked dataset; and
each dataset maintains referential integrity across linked datasets (“Referential integrity among the database tables is maintained through the use of primary and foreign keys,” abstract lines 9-11).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the dataset of Achin to comprise the linked dataset of Li.  One would have been motivated to make such a combination to break up the overall database into multiple tables.
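
For illustration only, maintaining referential integrity through primary and foreign keys when a dataset with a linked table is split can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of Li or Achin: the function names, the `id` and `parent_id` column names, and the dictionary-of-lists table layout are all hypothetical.

```python
def split_linked(parents, children, holdout_keys, pk="id", fk="parent_id"):
    """Split a parent table and route each child row to its parent's side."""
    holdout_keys = set(holdout_keys)
    holdout = {
        "parents": [p for p in parents if p[pk] in holdout_keys],
        "children": [c for c in children if c[fk] in holdout_keys],
    }
    training = {
        "parents": [p for p in parents if p[pk] not in holdout_keys],
        "children": [c for c in children if c[fk] not in holdout_keys],
    }
    return training, holdout


def has_referential_integrity(group, pk="id", fk="parent_id"):
    """True if every foreign key in the group matches a primary key in the group."""
    primary_keys = {p[pk] for p in group["parents"]}
    return all(c[fk] in primary_keys for c in group["children"])
```

Because child rows follow their parent by key, no row in either group references a parent that was placed in the other group.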

As to dependent claim 10, the rejection of claim 9 is incorporated.
Achin does not appear to expressly teach a system wherein:
said original dataset has at least one linked dataset; and
said holdout dataset maintains referential integrity across linked datasets.
Li teaches a system wherein:
a first dataset has at least one linked dataset; and
each dataset maintains referential integrity across linked datasets (“Referential integrity among the database tables is maintained through the use of primary and foreign keys,” abstract lines 9-11).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the dataset of Achin to comprise the linked dataset of Li.  One would have been motivated to make such a combination to break up the overall database into multiple tables.

As to dependent claim 17, the rejection of claim 16 is incorporated.
Achin does not appear to expressly teach a computer program product wherein:
said original dataset has at least one linked dataset; and
said holdout dataset maintains referential integrity across linked datasets.
Li teaches a computer program product wherein:
a first dataset has at least one linked dataset; and
each dataset maintains referential integrity across linked datasets (“Referential integrity among the database tables is maintained through the use of primary and foreign keys,” abstract lines 9-11).
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the dataset of Achin to comprise the linked dataset of Li.  One would have been motivated to make such a combination to break up the overall database into multiple tables.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure:
US 2021/0390455 A1 disclosing managing machine learning models
Applicants are required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way.  A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art.  In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
In the interests of compact prosecution, Applicants are invited to contact the examiner via electronic media pursuant to USPTO policy outlined in MPEP § 502.03.  All electronic communication must be authorized in writing.  Applicants may wish to file an Internet Communications Authorization Form PTO/SB/439.  Applicants may wish to request an interview using the Interview Practice website: http://www.uspto.gov/patent/laws-and-regulations/interview-practice.
Applicants are reminded Internet e-mail may not be used for communication for matters under 35 U.S.C. § 132 or which otherwise require a signature.  A reply to an Office action may NOT be communicated by Applicants to the USPTO via Internet e-mail.  If such a reply is submitted by Applicants via Internet e-mail, a paper copy will be placed in the appropriate patent application file with an indication that the reply is NOT ENTERED.  See MPEP § 502.03(II).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ryan Barrett whose telephone number is 571 270 3311.  The examiner can normally be reached 9:00am to 5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool.  To schedule an interview, Applicants are encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Adam Queler can be reached at 571 272 4140.  The fax phone number for the organization where this application or proceeding is assigned is 571 273 8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Ryan Barrett/
Primary Examiner, Art Unit 2145