DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action has been issued in response to Applicant’s Communication of application S/N 17/156,234 filed on January 22, 2021. Claims 1 to 20 are currently pending with the application.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “a data analyzer” and “model selector and evaluator” in claim 1, “a performance evaluator” in claim 3, and “a parameter selection interface” in claim 6.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

	Claim 1 recites on page 39 line 13 “transformed data set”. There is insufficient antecedent basis for this limitation in claim 1.
	Claim 10 recites on page 42 line 2 “transformed data set”. There is insufficient antecedent basis for this limitation in claim 10.
	Claim 10 recites on page 44 line 3 “transformed data set”. There is insufficient antecedent basis for this limitation in claim 10.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
With respect to claims 1, 10, and 18, the limitations directed towards tag at least a data set of the plurality of ingested data sets, detect redundant occurrence of the plurality of attributes in each of the data tables, the data sheets, and the data matrices of the encoded data set, and determining, that the predictive analysis yields a positive response for the transformed data set, validate the executed learning model, and conduct a predictive analysis on the transformed data set, recite a process that, under its broadest reasonable interpretation, covers performance of these limitations in the mind but for the recitation of generic computer components. That is, other than reciting a system for predictive analysis, the system comprising: a processor; a data lake coupled to the processor, the data lake in claim 1, a method for predictive analysis, the method comprising a processor in claim 10, non-transitory computer readable medium comprising machine executable instructions that are executable by a processor in claim 18, and ingest a plurality of data sets, where each of the ingested plurality of data sets comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list; a data analyzer coupled to the processor, the data analyzer, encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format, the detected redundant plurality of attributes are eliminated, execute a first set of instructions on the encoded data set to obtain a transformed data set, a model selector and evaluator coupled to the processor, the model selector and evaluator to: execute a machine, where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator, nothing in the claim precludes these steps from practically being performed in the mind and/or by a human with pen and paper.
For example, but for the limitations stating a system for predictive analysis, the system comprising: a processor; a data lake coupled to the processor, the data lake in claim 1, a method for predictive analysis, the method comprising a processor in claim 10, non-transitory computer readable medium comprising machine executable instructions that are executable by a processor in claim 18, and ingest a plurality of data sets, where each of the ingested plurality of data sets comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list; a data analyzer coupled to the processor, the data analyzer, encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format, the detected redundant plurality of attributes are eliminated, execute a first set of instructions on the encoded data set to obtain a transformed data set, a model selector and evaluator coupled to the processor, the model selector and evaluator to: execute a machine, where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator, the mention of tag at least a data set of the plurality of ingested data sets, detect redundant occurrence of the plurality of attributes in each of the data tables, the data sheets, and the data matrices of the encoded data set, and determining, that the predictive analysis yields a positive response for the transformed data set, validate the executed learning model, and conduct a predictive analysis on the transformed data set, in the context of this claim, encompasses a user mentally tagging or classifying data, determining if there are duplicates in the data, and making a prediction based on the data. 
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
The judicial exception is not integrated into a practical application by additional elements. In particular, a system for predictive analysis, the system comprising: a processor; a data lake coupled to the processor, the data lake in claim 1, a method for predictive analysis, the method comprising a processor in claim 10, non-transitory computer readable medium comprising machine executable instructions that are executable by a processor in claim 18, and ingest a plurality of data sets, where each of the ingested plurality of data sets comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list; a data analyzer coupled to the processor, the data analyzer, encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format, the detected redundant plurality of attributes are eliminated, execute a first set of instructions on the encoded data set to obtain a transformed data set, a model selector and evaluator coupled to the processor, the model selector and evaluator to: execute a machine, where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator. 
A system for predictive analysis, the system comprising: a processor; a data lake coupled to the processor, the data lake in claim 1, a method for predictive analysis, the method comprising a processor in claim 10, non-transitory computer readable medium comprising machine executable instructions that are executable by a processor in claim 18, a database, machine, machine-readable format, a data analyzer, a model selector and evaluator recited in claims 1, 10, and 18 are recited at high levels of generality (i.e., as a generic computer performing a generic computer function of data processing) such that they amount to no more than mere instructions to apply the exception. Ingest a plurality of data sets, where each of the ingested plurality of data sets comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list, encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format, and obtain a transformed data set are considered by the examiner to be mere data gathering such that they amount to no more than insignificant extra-solution activity. Execute a first set of instructions and execute a machine learning model and the above mentioned additional elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea.
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements, a system for predictive analysis, the system comprising: a processor; a data lake coupled to the processor, the data lake in claim 1, a method for predictive analysis, the method comprising a processor in claim 10, non-transitory computer readable medium comprising machine executable instructions that are executable by a processor in claim 18, a database, machine, machine-readable format, a data analyzer, a model selector and evaluator recited in claims 1, 10, and 18 are recited at high levels of generality to apply the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The additional elements, ingest a plurality of data sets, where each of the ingested plurality of data sets comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list, encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format, and obtain a transformed data set, are interpreted to be well understood, routine and conventional activity (Receiving or transmitting data over a network e.g., using the internet to gather data, Symantec (see MPEP 2106.05(d))). Execute a first set of instructions and execute a machine learning model merely confines the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
To further elaborate, the additional limitations of a system for predictive analysis, the system comprising: a processor; a data lake coupled to the processor, the data lake in claim 1, a method for predictive analysis, the method comprising a processor in claim 10, non-transitory computer readable medium comprising machine executable instructions that are executable by a processor in claim 18, and ingest a plurality of data sets, where each of the ingested plurality of data sets comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list; a data analyzer coupled to the processor, the data analyzer, encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format, the detected redundant plurality of attributes are eliminated, execute a first set of instructions on the encoded data set to obtain a transformed data set, a model selector and evaluator coupled to the processor, the model selector and evaluator to: execute a machine, where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Claim 1 is not patent eligible.
Claims 10 and 18 are similarly rejected because they are similar in scope.

With respect to claims 2, 11, and 19, the limitations are directed towards wherein upon determining that the predictive analysis yields a negative response for the transformed data set, invalidate the executed machine learning model. These elements further elaborate the abstract idea, and a human, in the mind and/or with pen and paper, can, upon determining that the predictive analysis yields a negative response for the transformed data set, invalidate the executed machine learning model. Therefore, claims 2, 11, and 19 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claims 3 and 12, the limitations are directed towards wherein the system further comprises a performance evaluator coupled to the processor, the performance evaluator to: evaluate performance of a machine learning model by at least one of a regularization and bias variance trade off based on factors including at least one of interpretability, simplicity, speed, and stability. The elements directed to evaluate performance of a machine learning model by at least one of a regularization and bias variance trade off based on factors including at least one of interpretability, simplicity, speed, and stability further elaborate the abstract idea, and a human, in the mind and/or with pen and paper, can evaluate performance of a machine learning model by at least one of a regularization and bias variance trade off based on factors including at least one of interpretability, simplicity, speed, and stability. The elements directed to performance evaluator coupled to the processor and machine are interpreted to merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claims 3 and 12 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claims 4 and 13, the limitations are directed towards wherein upon determining that the predictive analysis yields a negative response for the transformed data set, invalidate the executed machine learning model. The elements directed to determining that the predictive analysis yields a negative response for the transformed data set, invalidate the learning model further elaborate the abstract idea, and a human, in the mind and/or with pen and paper, can determine that the predictive analysis yields a negative response for the transformed data set and invalidate the learning model. The elements directed towards executed machine model merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claims 4 and 13 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claims 5, 14, and 20, the limitations are directed towards wherein a new machine learning model is tested with a sample dataset, trained with a training data set, and corresponding model results and performance evaluation metrics are validated by matching with a validation dataset, wherein the new model is maintained at the model selector and evaluator. These additional elements are interpreted to merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claims 5, 14, and 20 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claims 6 and 15, the limitations are directed towards wherein the system further comprises a parameter selection interface coupled to the processor to allow a user to select a parameter associated with the ingested data sets, the parameter including at least one of a threshold, an accuracy, a relevant variable, and a prediction error ranger. These additional elements are interpreted to merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claims 6 and 15 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claims 7 and 16, the limitations are directed towards wherein the first set of instructions are executed to facilitate scaling, dimension reduction of the encoded data set by reducing dimensionality of space associated with the redundant plurality of attributes, and conversion of the encoded data set from continuous form to discrete form. These additional elements are interpreted to merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claims 7 and 16 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.
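For reference only, the scaling and continuous-to-discrete conversion recited in claims 7 and 16 can be pictured with the following minimal sketch. It is an illustration under assumed names and sample values, not the applicant's first set of instructions; min-max scaling and equal-width binning are chosen here merely as common examples, and dimension reduction is omitted.

```python
# Illustrative sketch only; not taken from the application or the cited art.
# Function names, the sample values, and the choice of min-max scaling and
# equal-width binning are assumptions made for this example.

def min_max_scale(values):
    """Scale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(scaled_values, bins):
    """Convert values in [0, 1] from continuous form to integer bin indices."""
    # The top edge (1.0) is clamped into the last bin.
    return [min(int(v * bins), bins - 1) for v in scaled_values]


ages = [18, 30, 42, 66]            # hypothetical continuous attribute
scaled = min_max_scale(ages)       # continuous values in [0, 1]
binned = discretize(scaled, bins=4)
print(scaled, binned)
```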

With respect to claims 8 and 17, the limitations are directed towards wherein the model selector and evaluator facilitates in calculation of hyper parameters and tuning with the encoded dataset. The elements directed to calculation of hyper parameters further elaborate the abstract idea, and a human, in the mind and/or with pen and paper, can calculate hyper parameters. The additional elements directed to model selector and evaluator facilitating and tuning with the encoded dataset are interpreted to merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claims 8 and 17 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claim 9, the limitations are directed towards wherein the data lake comprises staging zones for incoming data, the staging zones comprising: transient data zone to hold ephemeral datasets; raw data zone to encrypt and secure the ingested datasets; and refined data zone to accommodate enriched and manipulated datasets. These additional elements are interpreted to merely confine the claim to a particular technological environment or field of use for data gathering in conjunction with the abstract idea. Therefore, claim 9 does not recite additional limitations which tie the abstract idea into a practical application and does not amount to significantly more than the identified judicial exception.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 2, 8, 10, 11, 17, 18, and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Publication No.: US 20190095801 A1) hereinafter Saillet, in view of Yates et al. (U.S. Publication No.: US 20190102693 A1) hereinafter Yates.
As to claim 1:
Saillet discloses:
A system for predictive analysis [Paragraph 0029 teaches FIG. 1 is a block diagram depicting a generalized computer system, in accordance with an embodiment of the present invention. Paragraph 0031 teaches computer 101 may include, for example, processor 105 and memory 110. Paragraph 0037 teaches software such as software 112 may include a machine learning application such as machine learning application 177, executable to implement a set of machine learning models. Paragraph 0037 teaches each machine learning model may respectively receive input data (e.g., an input dataset) relating to a set of features of a dataset, to determine for output one or more data processing actions performable on the database… determining the data processing actions may include predictively selecting suitable or final data processing actions performable on the dataset.], the system comprising: 
a processor [Paragraph 0032 teaches processor 105 may be a hardware device for executing software, such as software 112 stored in memory 110.]; a data lake coupled to the processor [Paragraph 0016 teaches “dataset” refers to a collection of data, such as in the form of a data table, database, a list, or the like. Paragraph 0036 teaches processor 105 executes software 112 stored in memory 110 during operation of computer 101 in performing methods as described herein. The methods as described herein may otherwise be stored on any type of computer readable storage medium, such as storage 120, for execution and use by a computer system such as computer 101, in accordance with embodiments of the present invention. Storage 120 may include, for example, a disk storage such as an the data processing actions may be determined by each of the machine learning models based on the respectively received input data. Paragraph 0037 teaches each machine learning model may respectively receive input data (e.g., an input dataset) relating to a set of features of a dataset, to determine for output one or more data processing actions performable on the database… determining the data processing actions may include predictively selecting suitable or final data processing actions performable on the dataset. Note: Receiving data sets into storage (data lake) associated with a computer and a processor reads on the claimed data lake coupled to the processor.], the data lake to: 
ingest a plurality of data sets [Paragraph 0016 teaches the collection of data may be formatted so as to enable extraction of sets of input features therefrom, for input to a set of machine learning models. Paragraph 0093 teaches signal Hub provides tools that can discover schema (e.g., data types and column names) from a flat file or a database table. Note: The collection of data sets as input reads on ingest a plurality of data sets.], where each of the ingested plurality of data sets comprises of at least one of data tables [Paragraph 0016 teaches “dataset” refers to a collection of data, such as in the form of a data table, database, a list, or the like. The collection of data may be formatted so as to enable extraction of sets. Note: A dataset (each data set) of the collection of sets (plurality of data sets) is in the form of a data table.], and wherein each of the data tables [Paragraph 0016 teaches “dataset” refers to a collection of data, such as in the form of a data table, database, a list, or the like. The collection of data may be formatted so as to enable extraction of sets.], has a plurality of attributes including at least one of a column [Paragraph 0016 teaches a particular collection of data corresponding to a dataset may be formatted in tabular form, where each column may represent a particular variable or attribute, and each row may represent a given member, record, or entry, with respect to the dataset.] (the claimed alternatives of data sheets, data matrices, a row, and a list are not separately addressed); 
a data analyzer coupled to the processor [Figure 3:313 and Paragraph 0030 teach the methods may be implemented, for example, in software such as in the form of an executable program, and may be executed by a special or general-purpose digital computer, such as a personal computer. Paragraph 0031 teaches computer 101 may include, for example, processor 105.], the data analyzer to: tag at least a data set of the plurality of ingested data sets [Paragraph 0050 teaches a list of terms and column classifications associated with a dataset, or with fields of the dataset, may form features required for input. Paragraph 0054 teaches dataset analyzer 313 may first apply data profiling algorithms, data classification, and data quality algorithms to dataset 311. Note: A data analyzer that applies classifications (tags) for columns in datasets reads on the claimed the data analyzer to: tag at least a data set of the plurality of ingested data sets.]; 
encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format [Paragraph 0030 teaches the methods may be implemented in software 112… such as in the form of an executable program. Paragraph 0034 teaches an executable program such as in the form of object code. Paragraph 0039 teaches transforming one or more columns of respective datasets to a standardized or compatible format, such as by way of lookup codes. Paragraph 0050 teaches a list of terms and column classifications associated with a dataset, or with fields of the dataset, may form features required for input. Paragraph 0054 teaches dataset analyzer 313 may first apply data profiling algorithms, data classification, and data quality algorithms to dataset 311. Paragraph 0066 teaches the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. Paragraph 0067 teaches specific examples of the computer readable storage medium include an encoded device… having instructions recorded thereon. Note: An encoded device (encoding) with software or computer readable (machine-readable) instructions (pre-defined format) that includes transformed column classifications (encoded and transformed tagged data set) reads on the claimed encode the tagged data set into a pre-defined format such that the encoded and transformed data set has a machine-readable format.]; 
detect redundant occurrence of the plurality of attributes in each of the data tables of the encoded data set, wherein the detected redundant plurality of attributes are eliminated [Paragraph 0039 teaches data processing actions applied to datasets may comprise, row deduplication, for identification and merge of redundant data such as that relating to multiple records representing the same entity. Note: Deduplicating data in rows associated with columns (detected redundant plurality of attributes are eliminated) as processing datasets (plurality of attributes in each of the data tables) reads on the claimed detect redundant occurrence of the plurality of attributes in each of the data tables of the encoded data set, wherein the detected redundant plurality of attributes are eliminated. Deduplicating data reasonably includes detecting and eliminating redundant or duplicate data.] (the claimed alternatives of the data sheets and the data matrices are not separately addressed); 
execute a first set of instructions on the encoded data set to obtain a transformed data set [Paragraph 0030 teaches the methods may be implemented in software 112… such as in the form of an executable program. Paragraph 0034 teaches an executable program such as in the form of object code. Paragraph 0039 teaches transforming one or more columns of respective datasets to a standardized or compatible format, such as by way of lookup codes. Paragraph 0050 teaches a list of terms and column classifications associated with a dataset, or with fields of the dataset, may form features required for input. Paragraph 0054 teaches dataset analyzer 313 may first apply data profiling algorithms, data classification, and data quality algorithms to dataset 311. Paragraph 0066 teaches the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. Paragraph 0067 teaches specific examples of the computer readable storage medium include an encoded device… having instructions recorded thereon. Note: Instructions (first set of instructions) causing a processor to carry out (execute) transforming one or more classification columns of respective datasets into a compatible format (transformed data set) stored on an encoded device (encoded) reads on the claimed execute a first set of instructions on the encoded data set to obtain a transformed data set.]; 
transformed data set [Paragraph 0039 teaches transforming one or more columns of respective datasets to a standardized or compatible format, such as by way of lookup codes.]

Saillet discloses most of the limitation as set forth in claim 1 but does not appear to expressly disclose a model selector and evaluator coupled to the processor, the model selector and evaluator to: execute a machine learning model to conduct a predictive analysis on the data set, where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator; and upon determining, that the predictive analysis yields a positive response for the data set, validate the executed machine learning model.
Yates discloses:
a model selector and evaluator coupled to the processor [Paragraph 0034 teaches the model generation module 160 may include various components including a parameter selection module 210, a model training module 220, and a model evaluation module 230. Paragraph 0053 teaches the model evaluation module 230 calculates an evaluation score for each trained machine learning model based on the performance of the machine learning model across the examples of the evaluation data 280… the machine learning model associated with the best evaluation score may be selected to be entered into production. Paragraph 0075 teaches a software module is implemented with a computer program product comprising a computer-readable medium containing computer program code, which can be executed by a computer processor for performing any or all of the steps, operations, or processes described. Note: The model generation module that includes a model evaluation module (a model selector and evaluator) associated with computer program code executed by a computer processor reads on the claimed a model selector and evaluator coupled to the processor.], the model selector and evaluator to: execute a machine learning model to conduct a predictive analysis on the data set [Paragraph 0034 teaches the model generation module 160 may include various components including a parameter selection module 210, a model training module 220, and a model evaluation module 230. Paragraph 0050 teaches training dataset 270 from the training data store 190. Paragraph 0052 teaches the evaluation data 280 represents a portion of the training data obtained from the training data store 190. Paragraph 0053 teaches the model evaluation module 230 applies the examples in the evaluation data 280 and determines the performance of the machine learning model.
More specifically, the model evaluation module 230 applies the features of a user of the online system 150 and the features of a content item as input to the trained machine learning model and compares the prediction to the ground truth data. Note: The model generation module that includes the model evaluation module (model selector and evaluator) applying evaluation data 280 to a machine learning model that outputs predictions (predictive analysis) based on the input data set reads on the claimed the model selector and evaluator to: execute a machine learning model to conduct a predictive analysis on the data set.], where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator [Paragraph 0034 teaches the model generation module 160 may include various components including a parameter selection module 210, a model training module 220, and a model evaluation module 230. Paragraph 0050 teaches training dataset 270 from the training data store 190. Paragraph 0052 teaches the evaluation data 280 represents a portion of the training data obtained from the training data store 190. Paragraph 0051 teaches training examples in the training data include 1) input features of a user of the online system 150, 2) input features of a content item, and 3) ground truth data indicating whether the user of the online system interacted (e.g., clicked/converted) on the content item. The model training module 220 iteratively trains a machine learning model using the training examples to minimize an error between a prediction and the ground truth data. The model training module 220 provides the trained machine learning models to the model evaluation module 230. Paragraph 0053 teaches the model evaluation module 230 applies the examples in the evaluation data 280 and determines the performance of the machine learning model.
More specifically, the model evaluation module 230 applies the features of a user of the online system 150 and the features of a content item as input to the trained machine learning model and compares the prediction to the ground truth data. Note: The model generation module that includes the model evaluation module (model selector and evaluator) iteratively evaluates trained models, such as a second iteration of trained models (predefined second set of instructions) based on trained models maintained at the model generation module reads on the claimed where the execution is done based on predefined second set of instructions stored in a database maintained at the model selector and evaluator.]; and
upon determining, that the predictive analysis yields a positive response for the data set, validate the executed machine learning model [Paragraph 0053 teaches the evaluation score represents an error between the predictions outputted by trained machine learning model and the ground truth data. In various embodiments, the evaluation score is one of a logarithmic loss error or a mean squared error. The machine learning model associated with the best evaluation score may be selected to be entered into production. Paragraph 0072 teaches the online system 150 generates 705 a prediction error between a predicted output determined by the trained machine learning model and an actual output. The online system 150 determines 710 an estimated performance score corresponding to the candidate parameter values used by the trained machine learning model. In various embodiments, the estimated performance score is outputted by the prediction model 340. The online system 150 determines 715 whether a difference between the estimated performance score and the prediction error is above a threshold value. If so, the online system 150 triggers 720 a corrective action for the trained machine learning model. In one embodiment, the online system 150 replaces the machine learning model currently in production with a different machine learning model that is performing as expected. Note: A machine learning model entered into production (validate the executed machine learning model) based on the machine learning model predictions resulting in a best evaluation score (upon determining, that the predictive analysis yields a positive response for the transformed data set), wherein the claimed validate is interpreted to be promoting machine learning models to production, reads on the claimed upon determining, that the predictive analysis yields a positive response for the transformed data set, validate the executed machine learning model.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, by incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a best evaluation score, as taught by Yates (see Paragraph 0053 and 0072), because both applications are directed to machine learning processing; incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a best evaluation score provides an accurate machine learning model based on historical information corresponding to past parameter searches and on training dataset properties (see Yates Paragraph 0004).

Claims 10 and 18 are similarly rejected because they are similar in scope.

As to claim 2:
Saillet discloses:
the transformed data set [Paragraph 0039 teaches transforming one or more columns of respective datasets to a standardized or compatible format, such as by way of lookup codes]

Saillet and Yates disclose all of the limitations as set forth in claim 1 and some of claim 2.
Yates also discloses:
The system as claimed in claim 1, wherein upon determining that the predictive analysis yields a negative response for the data set, invalidate the executed machine learning model [Paragraph 0053 teaches the evaluation score represents an error between the predictions outputted by trained machine learning model and the ground truth data. In various embodiments, the evaluation score is one of a logarithmic loss error or a mean squared error. The machine learning model associated with the best evaluation score may be selected to be entered into production. Paragraph 0072 teaches the online system 150 generates 705 a prediction error between a predicted output determined by the trained machine learning model and an actual output. The online system 150 determines 710 an estimated performance score corresponding to the candidate parameter values used by the trained machine learning model. In various embodiments, the estimated performance score is outputted by the prediction model 340. The online system 150 determines 715 whether a difference between the estimated performance score and the prediction error is above a threshold value. If so, the online system 150 triggers 720 a corrective action for the trained machine learning model. In one embodiment, the online system 150 replaces the machine learning model currently in production with a different machine learning model that is performing as expected. Note: Replacing (invalidate) the machine learning model currently in production (invalidate the executed machine learning model) based on the machine learning model's less-than-expected performance regarding predictions (negative response) reads on the claimed wherein upon determining that the predictive analysis yields a negative response for the data set, invalidate the executed machine learning model.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, by incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a model being replaced, as taught by Yates (see Paragraph 0053 and 0072), because both applications are directed to machine learning processing; incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a model being replaced provides an accurate machine learning model based on historical information corresponding to past parameter searches and on training dataset properties (see Yates Paragraph 0004).

Claims 11 and 19 are similarly rejected because they are similar in scope.

As to claim 8:
Saillet discloses:
The system as claimed in claim 1, wherein the model selector and evaluator facilitates in tuning with the encoded dataset [Paragraph 0030 teaches the methods may be implemented in software 112… such as in the form of an executable program. Paragraph 0034 teaches an executable program such as in the form of object code. Paragraph 0039 teaches data processing actions applied to datasets may comprise, row deduplication, for identification and merge of redundant data such as that relating to multiple records representing the same entity… transforming one or more columns of respective datasets to a standardized or compatible format, such as by way of lookup codes. Paragraph 0050 teaches a list of terms and column classifications associated with a dataset, or with fields of the dataset, may form features required for input. Paragraph 0053 teaches sub-diagram 300 may comprise elements required for training and preparing each machine learning model 301, 302, and 303, respectively, for use. Paragraph 0054 teaches dataset analyzer 313 may first apply data profiling algorithms, data classification, and data quality algorithms to dataset 311. Paragraph 0066 teaches the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. Paragraph 0067 teaches specific examples of the computer readable storage medium include an encoded device… having instructions recorded thereon. Paragraph 0073 teaches each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions. Note: Block diagram 300 (model selector and evaluator) training and preparing machine learning model associated with actions to data sets that include deduplication (tuning) data sets reads on the claimed model selector and evaluator facilitates in tuning with the encoded dataset.]

Saillet and Yates disclose all of the limitations as set forth in claim 1.
Yates also discloses:
the model selector and evaluator facilitates in calculation of hyper parameters [Paragraph 0027 and Figure 2 teach the model generation module 160 trains a machine learning model using candidate parameter values predicted by a prediction model… candidate parameters refer to any type of parameters used in training a machine learning model. For example, candidate parameters refer to parameters as well as hyperparameters, i.e., parameters that are not learned from the training process. Examples of hyperparameters include the number of training examples, learning rate, and learning rate decrease rate. Note: The model generation module that includes the model evaluation module determining hyperparameters based on counting the number of training examples, calculating a learning rate, and learning rate decrease reads on the claimed the model selector and evaluator facilitates in calculation of hyper parameters.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, by incorporating model generation module that includes the model evaluation module determining hyperparameters based on counting the number of training examples, calculating a learning rate, and learning rate decrease, as taught by Yates (see Paragraph 0027 and Figure 2), because both applications are directed to machine learning processing; incorporating model generation module that includes the model evaluation module determining hyperparameters based on counting the number of training examples, calculating a learning rate, and learning rate decrease provides an accurate machine learning model based on historical information corresponding to past parameter searches and on training dataset properties (see Yates Paragraph 0004).

Claim 17 is similarly rejected because it is similar in scope.

Claim(s) 3, 4, 12, and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Publication No.: US 20190095801 A1) hereinafter Saillet, in view of Yates et al. (U.S. Publication No.: US 20190102693 A1) hereinafter Yates, and further in view of Kesarwani et al. (U.S. Publication No.: US 20190347410 A1) hereinafter Kesarwani.
As to claim 3:
Saillet and Yates disclose all of the limitations as set forth in claim 1 and some of claim 2 but do not appear to expressly disclose the system as claimed in claim 1, wherein the system further comprises a performance evaluator coupled to the processor the performance evaluator to: evaluate performance of a machine learning model by at least one of a regularization based on factors including at least one of stability.
Kesarwani discloses:
The system as claimed in claim 1, wherein the system further comprises a performance evaluator coupled to the processor [Paragraph 0031 teaches to compute the resiliency score…, the system may use different components… one component may include the stability of the model. Paragraph 0040 teaches the risk evaluator 205 may compute the resiliency score of the machine learning model 203. Paragraph 0049 teaches the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. Note: A risk evaluator (performance evaluator) evaluating machine learning models based on performance utilizing (coupled to) a processor reads on the claimed a performance evaluator coupled to the processor.], the performance evaluator to: evaluate performance of a machine learning model by at least one of a regularization [Paragraph 0031 teaches to compute the resiliency score…, the system may use different components. Paragraph 0032 teaches …another component is the regularization component. Paragraph 0040 teaches the risk evaluator 205 may compute the resiliency score of the machine learning model 203.] based on factors including at least one of stability [Paragraph 0031 teaches to compute the resiliency score…, the system may use different components… one component may include the stability of the model.] bias variance trade off, interpretability, simplicity, speed,
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet and Yates, by incorporating a risk evaluator, as taught by Kesarwani (see Paragraphs 0031, 0032, 0040, and 0049), because both applications are directed to machine learning processing; incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a best evaluation score provides an accurate machine learning model based on historical information corresponding to past parameter searches and on training dataset properties (see Yates Paragraph 0004).

Claim 12 is similarly rejected because it is similar in scope.

As to claim 4:
Saillet, Yates, and Kesarwani disclose all of the limitations as set forth in claims 1 and 3.
Yates also discloses:
The system as claimed in claim 3, wherein the machine learning model is validated based on the evaluated performance [Paragraph 0053 teaches the evaluation score represents an error between the predictions outputted by trained machine learning model and the ground truth data. In various embodiments, the evaluation score is one of a logarithmic loss error or a mean squared error. The machine learning model associated with the best evaluation score may be selected to be entered into production. Paragraph 0072 teaches the online system 150 generates 705 a prediction error between a predicted output determined by the trained machine learning model and an actual output. The online system 150 determines 710 an estimated performance score corresponding to the candidate parameter values used by the trained machine learning model. In various embodiments, the estimated performance score is outputted by the prediction model 340. The online system 150 determines 715 whether a difference between the estimated performance score and the prediction error is above a threshold value. If so, the online system 150 triggers 720 a corrective action for the trained machine learning model. In one embodiment, the online system 150 replaces the machine learning model currently in production with a different machine learning model that is performing as expected. Note: A machine learning model entered into production (validate the executed machine learning model) based on the machine learning model predictions resulting in a best evaluation score (evaluated performance), wherein the claimed validate is interpreted to be promoting machine learning models to production, reads on the claimed the machine learning model is validated based on the evaluated performance.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, by incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a best evaluation score, as taught by Yates (see Paragraph 0053 and 0072), because both applications are directed to machine learning processing; incorporating a machine learning model entered into production based on the machine learning model predictions resulting in a best evaluation score provides an accurate machine learning model based on historical information corresponding to past parameter searches and on training dataset properties (see Yates Paragraph 0004).

Claim 13 is similarly rejected because it is similar in scope.

Claim(s) 5, 14, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Publication No.: US 20190095801 A1) hereinafter Saillet, in view of Yates et al. (U.S. Publication No.: US 20190102693 A1) hereinafter Yates, in view of Calmon et al. (U.S. Patent No.: US 11361197 B2) hereinafter Calmon, and further in view of Chen et al. (U.S. Publication No.: US 20150178811 A1) hereinafter Chen.
As to claim 5:
Yates also discloses:
wherein the new model is maintained at the model selector and evaluator [Paragraph 0030 teaches a machine learning model that has been trained using the candidate parameter values can be stored (e.g., in the training data store 190) or provided to the model application module 170 for execution. Paragraph 0034 teaches FIG. 2 shows the details of the model generation module along with the data flow for determining candidate parameter values by the model generation module. Note: The training data store included in the model generation module (model selector and evaluator) storing a trained machine learning model (new model) reads on the claimed the new model is maintained at the model selector and evaluator.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet and Yates, by incorporating the training data store included in the model generation module (model selector and evaluator) storing a trained machine learning model (new model), as taught by Yates (see Paragraph 0030 and 0034), because both applications are directed to machine learning processing; incorporating the training data store included in the model generation module (model selector and evaluator) storing a trained machine learning model (new model) provides an accurate machine learning model based on historical information corresponding to past parameter searches and on training dataset properties (see Yates Paragraph 0004).

Saillet and Yates disclose all of the limitations as set forth in claim 1 but do not appear to expressly disclose the system as claimed in claim 1, wherein a new machine learning model is tested with a sample dataset, trained with a training data set, and corresponding model results and performance evaluation metrics are validated by matching with a validation dataset, wherein the new model is maintained at the model selector and evaluator.
Calmon discloses:
The system as claimed in claim 1, wherein a new machine learning model is tested with a sample dataset [Column 6 Lines 16-21 teach the exemplary method employs a supervised machine learning approach using the histogram of likelihoods. In this approach, a training data set is used in which anomalous time-series samples are properly labeled as such (the ground truth). The objective is to build a trained model to find anomalies in new, unlabeled data. Column 7 Lines 38-41 teach the test phase serves the purpose of evaluating model performance on new data sets. The test data set thus also includes labeled time-series data so that the F1-score can be computed in order to evaluate. Column 7 Lines 43-46 teach each new time series in the test data set needs to be clustered into states, and a log-likelihood metric needs to be extracted for each of its samples. Note: Test data that includes samples (sample data set) used during a testing phase of a trained (new) model using machine learning techniques (machine learning model) reads on the claimed a new machine learning model is tested with a sample dataset.], trained with a training data set [Column 6 Lines 17-22 teach a training data set is used in which anomalous time-series samples are properly labeled as such (the ground truth). The objective is to build a trained model to find anomalies in new, unlabeled data. The model parameters are adjusted using the training time-series samples as input and training labels as output. Column 6 Lines 41-44 teach the exemplary method employs a training phase and a test phase. In the training phase, the model parameters are adjusted to substantially optimize the F1-score on the training data.  Note: A trained model that is trained in a training phase utilizing training data (training data set) reads on the claimed trained with a training data set.], and 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet and Yates, by incorporating the training of models using machine learning techniques associated with a training data set and sample data for testing, as taught by Calmon (see Column 6 Lines 16-21, Column 6 Lines 41-44, Column 6 Lines 17-22, Column 7 Lines 38-41, and Column 7 Lines 43-46), because all three applications are directed to machine learning processing; incorporating the training of models using machine learning techniques associated with a training data set and sample data for testing provides improved methods, apparatus and computer program products for anomaly detection in time-series data (see Calmon Column 8 Lines 38-40).

Saillet, Yates, and Calmon disclose all of the limitations as set forth in claim 1 and some of claim 5 but do not appear to expressly disclose corresponding model results and performance evaluation metrics are validated by matching with a validation dataset.
Chen discloses:
corresponding model results and performance evaluation metrics are validated by matching with a validation dataset [Paragraph 0044 teaches the performance of the model may be tested using the validation data set (345). The validation data set may be provided as input to the model, and the output may be compared to the actual results associated with the validation data set and/or to a baseline performance level to determine whether the model performance is adequate. Note: Comparing (matching) output (model results) with actual results associated with the validation data set and performance baseline (performance evaluation metrics) reads on the claimed corresponding model results and performance evaluation metrics are validated by matching with a validation dataset.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, Yates, and Calmon, by incorporating comparing (matching) output (model results) with actual results associated with the validation data set and performance baseline (performance evaluation metrics), as taught by Chen (see Paragraph 0044), because all four applications are directed to machine learning processing; incorporating comparing (matching) output (model results) with actual results associated with the validation data set and performance baseline (performance evaluation metrics) improves the opportunity explanations and/or recommendations (see Chen Paragraph 0094).

Claims 14 and 20 are similarly rejected because they are similar in scope.

Claim(s) 6 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Publication No.: US 20190095801 A1) hereinafter Saillet, in view of Yates et al. (U.S. Publication No.: US 20190102693 A1) hereinafter Yates, and further in view of Bui et al. (U.S. Publication No.: US 20210295191 A1) hereinafter Bui.
As to claim 6:
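Saillet and Yates disclose all of the limitations as set forth in claim 1 but do not appear to expressly disclose the system as claimed in claim 1, wherein the system further comprises a parameter selection interface coupled to the processor to allow a user to select a parameter associated with the ingested data sets the parameter including at least an accuracy.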
Bui discloses:
The system as claimed in claim 1, wherein the system further comprises a parameter selection interface coupled to the processor to allow a user to select a parameter associated with the ingested data sets the parameter including at least an accuracy [Paragraph 0045 teaches the client application 110 can present or display information to a user, including a user interface for selecting values for accuracy-training efficiency balance metrics, generating hyper-parameter sets, training machine learning models, and/or applying machine learning models. Additionally, the client application 110 can present information in the form of a trained machine learning model and/or resultant accuracy metrics and/or training efficiency metrics. Paragraph 0047 teaches the machine learning system 106 can communicate with the client device 108 to perform various functions associated with the client application 110 such as classifying digital content items. Paragraph 0092 teaches the machine learning system 106 and/or the hyper-parameter determination system 102 can access a repository of machine learning models and stored hyper-parameters (e.g., stored within the database 114) to generate hyper-parameter sets for training the machine learning models. Paragraph 0128 teaches the components of the hyper-parameter determination system 102 can include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device 1100). Note: User interface allowing a user to select values (parameters) for accuracy training (the parameter including an accuracy) associated with machine learning systems classifying digital content items (ingested data sets) and processors of a computing device reads on the claimed a parameter selection interface coupled to the processor to allow a user to select a parameter associated with the ingested data sets the parameter including an accuracy.] 
one of a threshold, a relevant variable, and a prediction error range
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet and Yates, by incorporating a user interface allowing a user to select values (parameters) for accuracy training (the parameter including an accuracy) associated with machine learning systems classifying digital content items (ingested data sets) and processors of a computing device, as taught by Bui (see Paragraphs 0045, 0047, 0092, and 0128), because all three applications are directed to machine learning processing; incorporating a user interface allowing a user to select values (parameters) for accuracy training (the parameter including an accuracy) associated with machine learning systems classifying digital content items (ingested data sets) and processors of a computing device provides a selection of hyper-parameters for machine learning models that improve model accuracy and training efficiency (see Bui Paragraph 0005).

Claim 15 is similarly rejected because it is similar in scope.

Claim(s) 7 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Publication No.: US 20190095801 A1) hereinafter Saillet, in view of Yates et al. (U.S. Publication No.: US 20190102693 A1) hereinafter Yates, in view of Bui et al. (U.S. Publication No.: US 20210295191 A1) hereinafter Bui, in view of Vasseur et al. (U.S. Publication No.: US 20160219070 A1) hereinafter Vasseur, and further in view of Durham et al. (U.S. Patent No.: US 10311361 B1) hereinafter Durham.
As to claim 7:
Saillet discloses:
encoded data set [Paragraph 0030 teaches the methods may be implemented in software 112… such as in the form of an executable program. Paragraph 0034 teaches an executable program such as in the form of object code. Paragraph 0039 teaches transforming one or more columns of respective datasets to a standardized or compatible format, such as by way of lookup codes. Paragraph 0050 teaches a list of terms and column classifications associated with a dataset, or with fields of the dataset, may form features required for input. Paragraph 0054 teaches dataset analyzer 313 may first apply data profiling algorithms, data classification, and data quality algorithms to dataset 311. Paragraph 0066 teaches the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. Paragraph 0067 teaches specific examples of the computer readable storage medium include an encoded device… having instructions recorded thereon. Note: An encoded device (encoding) with software or computer readable (machine-readable) instructions (pre-defined format) that includes transformed column classifications (encoded and transformed tagged data set) reads on the claimed encoded data set.]

Saillet and Yates disclose all of the limitations as set forth in claim 1 and some of claim 7 but do not appear to expressly disclose the system as claimed in claim 1, wherein the first set of instructions are executed to facilitate scaling, dimension reduction of the data set by reducing dimensionality of space associated with the redundant plurality of attributes, and conversion of the data set from continuous form to discrete form.
Bui discloses:
The system as claimed in claim 1, wherein the first set of instructions are executed to facilitate scaling [Paragraph 0128 teaches when executed by the one or more processors, the computer-executable instructions of the hyper-parameter determination system 102 can cause the computing device 1100 to perform the methods described herein. Paragraph 0149 teaches shared pool of configurable computing resources can be rapidly provisioned via virtualization and released with low management effort or service provider interaction, and then scaled accordingly. Note: Instructions (first set of instructions) executed for scaling accordingly (facilitate scaling) is interpreted to read on the claimed the first set of instructions are executed to facilitate scaling because the examiner interprets the claimed scaling to be resource scaling.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet and Yates, by incorporating instructions (first set of instructions) executed for scaling accordingly (facilitate scaling), as taught by Bui (see Paragraphs 0128 and 0149), because all three applications are directed to machine learning processing; incorporating instructions (first set of instructions) executed for scaling accordingly (facilitate scaling) provides a selection of hyper-parameters for machine learning models that improve model accuracy and training efficiency (see Bui Paragraph 0005).
 
Saillet, Yates, and Bui disclose all of the limitations as set forth in claim 1 and some of claim 7 but do not appear to expressly disclose dimension reduction of the encoded data set by reducing dimensionality of space associated with the redundant plurality of attributes, and conversion of the data set from continuous form to discrete form.
Vasseur discloses:
dimension reduction of the data set by reducing dimensionality of space associated with the redundant plurality of attributes [Paragraph 0054 teaches a feature vector refers to an n-dimensional set of observable and measurable properties. Paragraph 0057 teaches features are created for each quantitative metric and each “relevant” combination of the categorical columns. Paragraph 0058 teaches several techniques may be used to reduce the dimensionality of a dataset such as, e.g., principal component analysis (PCA). PCA is the orthogonal projection of the data onto a lower dimensional linear space, known as the principal subspace, such that the variance of the projected data is maximized. Paragraph 0059 teaches in the process of dimensionality reduction, redundant dimensions are “merged” into so-called principal components (or, in the context of PCA, aligned along those). Note: Dimensions associated with features and categorical columns (plurality of attributes) that undergo dimension reduction based on merging redundant dimensions (redundant plurality of attributes) of a dataset using PCA, resulting in a lower dimensional linear space (reducing dimensionality of space), reads on the claimed dimension reduction of the data set by reducing dimensionality of space associated with the redundant plurality of attributes.], and
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, Yates, and Bui, by incorporating dimensions associated with features and categorical columns (plurality of attributes) that undergo dimension reduction based on merging redundant dimensions (redundant plurality of attributes) of a dataset using PCA, resulting in a lower dimensional linear space (reducing dimensionality of space), as taught by Vasseur (see Paragraphs 0054, 0057, 0058, and 0059), because all four applications are directed to machine learning processing; incorporating dimensions associated with features and categorical columns (plurality of attributes) that undergo dimension reduction based on merging redundant dimensions (redundant plurality of attributes) of a dataset using PCA, resulting in a lower dimensional linear space (reducing dimensionality of space), achieves a much better application experience (see Vasseur Paragraph 0066).

Saillet, Yates, Bui, and Vasseur disclose all of the limitations as set forth in claim 1 and most of claim 7 but do not appear to expressly disclose conversion of the data set from continuous form to discrete form.
Durham discloses:
conversion of the data set from continuous form to discrete form [Column 4 Lines 3-7 teach Adsorption is suited for use with discrete variables or values and media feature extraction may result in continuous variables, the feature discretization module may discretize (e.g., convert or map) the continuous variables into discrete values for use with Adsorption. Note: Converting continuous variables or values to discrete values reads on the claimed conversion of the data set from continuous form to discrete form.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, Yates, Bui, and Vasseur, by incorporating converting continuous variables or values to discrete values, as taught by Durham (see Column 4 Lines 3-7), because all five applications are directed to machine learning processing; incorporating converting continuous variables or values to discrete values provides a selection of hyper-parameters for machine learning models that improve model accuracy and training efficiency (see Durham Column 13 lines 11-12).

Claim 16 is similarly rejected because it is similar in scope.

Claim(s) 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Saillet et al. (U.S. Publication No.: US 20190095801 A1) hereinafter Saillet, in view of Yates et al. (U.S. Publication No.: US 20190102693 A1) hereinafter Yates, in view of Pradhan et al. (U.S. Patent No.: US 10452490 B2) hereinafter Pradhan, and further in view of Periyathambi et al. (U.S. Publication No.: US 20220156175 A1) hereinafter Periyathambi.
As to claim 9:
Saillet and Yates disclose all of the limitations as set forth in claim 1 but do not appear to expressly disclose the system as claimed in claim 1, wherein the data lake comprises staging zones for incoming data.
Pradhan discloses:
The system as claimed in claim 1, wherein the data lake comprises staging zones for incoming data [Column 56 Lines 55-67 and Column 57 Lines 1-3 teach the method 700 may be used to facilitate transferring data from a particular storage domain to another storage domain. For example, the method 700 may be used to transfer data from an entity's locally managed storage to a third-party storage solution. In some cases, the methods 500 and 700 may be combined to facilitate the transfer of data from one storage domain to another storage domain. For example, the method 500 may be used to access data from a first storage environment, such as a local storage system, and move the data to a staging area, such as the secondary storage subsystem 118. This data may then be reorganized or processed in preparation for transfer to a second storage environment. The method 700 may then transfer the data from the staging area to the second storage environment, such as the distributed storage environment 302. Note: The combined local storage, secondary storage subsystem, and second storage environment that all serve as staging areas reads on the claimed the data lake comprises staging zones for incoming data. Data lake is interpreted to be storage for data received (incoming) at the staging areas.], the staging zones comprising: transient data zone to hold ephemeral datasets [Column 56 Lines 62-64 teach access data from a first storage environment, such as a local storage system, and move the data to a staging area, such as the secondary storage subsystem 118. Note: Data that is temporarily (ephemeral datasets) in local storage (transient data zone) is interpreted to read on the claimed transient data zone to hold ephemeral datasets because it will eventually move to another staging area.]; and
refined data zone to accommodate enriched and manipulated datasets [Column 56 Lines 62-64 teach access data from a first storage environment, such as a local storage system, and move the data to a staging area, such as the secondary storage subsystem 118. Column 56 Lines 65-67 and Column 57 Line 1 teach data may then be reorganized or processed in preparation for transfer to a second storage environment. The method 700 may then transfer the data from the staging area to the second storage environment. Note: Received data from the local storage (first staging area) to the secondary storage subsystem (refined data zone) where data in the secondary subsystem are reorganized (manipulated) and processed (enriched) reads on the claimed refined data zone to accommodate enriched and manipulated datasets because processed data is interpreted to be enriched data (see Spec Paragraph 0070).]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet and Yates, by incorporating the combined local storage, secondary storage subsystem, and second storage environment that all serve as staging areas, as taught by Pradhan (see Column 56 Lines 55-67 and Column 57 Lines 1-3), because the three applications are directed to machine learning processing; incorporating the combined local storage, secondary storage subsystem, and second storage environment that all serve as staging areas improves the speed with which the system performs information management operations and can also improve the capacity of the system to handle large numbers of such operations, while reducing the computational load on the production environment of client computing devices (see Pradhan Column 12 Lines 57-62).

Saillet, Yates, and Pradhan disclose all of the limitations as set forth in claim 1 and some of claim 9 but do not appear to expressly disclose raw data zone to encrypt and secure the ingested datasets.
Periyathambi discloses:
raw data zone to encrypt and secure the ingested datasets [Paragraph 0092 teaches in response to copying listings from the data store 303 to the staging data store 305, various embodiments encrypt, at the staging data store 305. Note: Data that is not yet encrypted (raw data) ingested at the staging data store (ingested data) (raw data zone) to be encrypted (encrypt and secure) reads on the claimed raw data zone to encrypt and secure the ingested datasets.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Saillet, Yates, and Pradhan, by incorporating data that is not yet encrypted (raw data) ingested at the staging data store (ingested data) (raw data zone) to be encrypted (encrypt and secure), as taught by Periyathambi (see Paragraph 0092), because the four applications are directed to machine learning processing; incorporating data that is not yet encrypted (raw data) ingested at the staging data store (ingested data) (raw data zone) to be encrypted (encrypt and secure) provides better computer security, and better computer resource utilization, among other things (see Periyathambi Paragraph 0001).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EARL ELIAS whose telephone number is (571)272-9762. The examiner can normally be reached Monday - Friday (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached on 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/EARL ELIAS/Examiner, Art Unit 2169
/USMAAN SAEED/Supervisory Patent Examiner, Art Unit 2169