DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The filing date of the present application is 09/15/2017.
This action is in response to amendments and/or arguments filed on 02/09/2021. In the current amendments, claims 1, 8 and 15 have been amended and claims 7, 14 and 20 have been cancelled. Claims 1-6, 8-13, 15-19 and 21-23 are pending and have been examined. 

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/09/2021 has been entered.
 



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 8-9, 15-16 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Pinto et al. (US Pat No. 8751273 B2) in view of Weston et al. (US Pat No. 7624074 B2) and further in view of Wagner et al. (“Stepwise selection of . 
Regarding claim 1 (Currently Amended)
Pinto teaches a computer-implemented method of reducing an amount of computational resources consumed by a machine-learning model, (abstract “Models are generated using a variety of tools and features of a model generation platform. For example, in connection with a project in which a user generates a predictive model based on historical data about a system being modeled, the user is provided through a graphical user interface a structured sequence of model generation activities to be followed, the sequence including dimension reduction, model generation, model process validation, and model re-generation”)
the method comprising: applying, by at least one data processor of a computing device, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
a machine-learning model to a first dataset to generate a first output, (Examiner notes that the machine-learning model ‘Dimensionality reduction’ is used in FIG. 10, and items 170 to 172 generate the first output after removing sparse variables; see col 11 lines 9-13 “The first filter 170 reduces the dimensionality of the modeling space by eliminating variables, xjk for which the density, D(x1n), is less than some fixed constant, C1.”)
the first dataset comprising a plurality of variables, (col 6 lines 17-20 “In the dataset exploration stage, aspects of the historical data may be examined in terms of the predictor variables. By predictor variable, we mean a potential covariate or independent variable of a model.”)
wherein the machine-learning model is trained prior to the applying; (FIG. 11 shows training model at item 194 before evaluating performance also see col 12 lines 47-53 “The next stage 194 is to fit or train the model to the sample subset of the historical data using the predictive variables generated by a set of variable transformation to maximum univariate predictive capability and a set of dimension reduction filters to retain only the most predictive subgroup of variables, including up to a level of interaction, for example, tertiary interactions.”)
iteratively (col 7 lines 9-14 “The user can invoke the modeling process repeatedly, for example, to review outcome response functions for individual predictor variables, or recursive partition trees for individual variables or for the set of predictor variables; to modify the predictor variable pool, to create and filter additional variables, or to modify the candidate model.”) removing, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”) variables from the first dataset; (col 11 lines 7-13 “The flowchart in FIG. 10 provides an example of such cascade of filtering operations on the set of predictor variables. The first filter 170 reduces the dimensionality of the modeling space by eliminating variables, X" for which the density, D(x'), is less than some fixed constant, C. (These are variables which have not or cannot be processed for missing data imputation.)”)
for each iteration, (FIG. 10 shows the steps of each iteration for items 170-186. See col 11 lines 20-22 “In the second filtering stage 172, a Subspace is iteratively generated by including only significant variables, e.g., x, whose probability of non-contribution,”)
(i) applying, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
the machine-learning model to a subset of data, wherein the subset of data has one or more variables removed from the first dataset to generate a second output, (Examiner notes that the machine-learning model ‘Dimensionality reduction’ is used in FIG. 10, and item 176 removes irrelevant variables [the retained variables correspond to the subset of data]; see col 11 lines 30-34 “In the third stage 174, the subspace, X is expanded by including all significant cross-products… where xjk and xpq, are in XQ, then applying a filter 176 to retain only significant variables, e.g., x1k, whose probability of non-contribution, [1-Pr(ylx4 k,)] is less than a fixed constant, C.”)
…
determining, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
a minimum set of variables comprising the subset of data with variables having impact below a predetermined threshold on an output of the machine-learning model based on the comparisons, (Examiner interprets the minimum set of variables in terms of prediction accuracy: the dimensionality reduction process yields a minimum set of variables because it removes irrelevant variables, which corresponds to maintaining prediction accuracy; see col 11 lines 45-50 “In the fifth stage 182, the augmented subspace, X## s, is further augmented with all the cross-products, xjk * zrs, where xjk, are from, X## and zrs from Xn-Xm, then applying a filter 184 to retain only significant variables, e.g., x5k, whose probability of non-contribution, [1-Pr(y| x5k)], is less than a fixed constant, C4 [corresponds to threshold].”)
…
and applying, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
subsequent to the determining, the machine-learning model to new data to generate an output for the new data, (Examiner notes that FIG. 10 shows the workflow diagram of a dimension reduction process in which item 184 removes irrelevant variables and item 186 updates the set of predictor variables; because the process of FIG. 10 removes the unnecessary variables, the machine-learning model is applied, subsequent to the determining, to generate the output for the new data; see col lines 61-66 “The resulting hyperplane represents an efficient classification mapping. The set of predictor variables is then adjusted to include the variables passing through the sequence of filters resulting in an updated collection of predictor variables 186 available for constructing predictive models.”)
wherein the new data includes the minimum set of variables. (Col lines 61-66 “The resulting hyperplane represents an efficient classification mapping. The set of predictor variables is then adjusted to include the variables passing through the sequence of filters resulting in an updated collection of predictor variables 186 available for constructing predictive models.”)
Pinto does not teach …and (ii) comparing the first and second outputs;
…
wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output, wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification. 
Weston teaches …and (ii) comparing the first and second outputs; (Col 11 lines 11-16 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316.”)
…
wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification. (Col 11 lines 11-16 and 35-40 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316. Optimal output 1316 may identify causal relationships between the mammography and genomic data points…. The methods, devices and systems described herein can be used with publicly available data to find relevant answers, such as genes determinative of a cancer diagnosis, or with specifically generated data.” The optimal output from the SVMs is used to identify causal relationships between the mammography and genomic data points, which can determine/predict a cancer diagnosis. Also see col 11 lines 59-66 “While the present discussion is directed to two-class classification problems, this is not to limit the scope of the invention. The two classes are identified with the symbols (+) and (-). A training set of a number of patterns {x1, x2, ... xk, ... xl} with known class labels {y1, y2, ... yk, ... yl}, yk ∈ {-1,+1}, is given. The training patterns are used to build a decision function (or discriminant function) D(X), that is a scalar function of an input pattern X.”)
Pinto and Weston are analogous art because they are both directed to dimensionality reduction.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto to incorporate the teaching of Weston to include method or system with feature selection for machine learning model.  
One of ordinary skill in the art would have been motivated to make this modification in order to identify relevant patterns from datasets in “a method and system for selection of features within the data sets which best enable identification of relevant patterns” as disclosed by Weston (col 1 lines 39-43).
Pinto in view of Weston does not teach wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output. 
Wagner teaches wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output. (Examiner notes that Wagner teaches removing variables that have no effect thus means output equal to the first output see pg. 61 right col first paragraph “the average difference in the efficiency scores between E1,i and E* is 0. In other words, these three input variables can be removed from the model without affecting a single efficiency score since they have no effect at all on the efficiency scores. In this example then, we can skip directly to a model with only 3 inputs (price, convenience, and room comfort) and the 2 outputs. This model yields the same efficiencies as in the starting model.”)
Pinto, Weston and Wagner are analogous art because they are all directed to feature selection.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston to incorporate the teaching of Wagner to include feature selection that is able to remove variables without affecting the accuracy of the score, which allows the system to output the same efficiencies as in the starting model, as disclosed by Wagner (pg. 61 right col).
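For illustration only, the claimed sequence as mapped above (apply a trained model to obtain a first output, iteratively remove variables, apply the model again to obtain a second output, compare the outputs, and retain the smallest set of variables whose removal leaves the output within a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the application, Pinto, Weston, or Wagner: the function name `minimum_variable_set`, the masking of a removed variable with a fill value, the max-absolute-difference comparison, and the toy model and data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]
Model = Callable[[Row], float]  # an already-trained model, treated as a black box


def minimum_variable_set(
    model: Model,
    rows: Sequence[Row],
    variables: Sequence[str],
    threshold: float = 0.0,
    fill: float = 0.0,
) -> List[str]:
    """Greedy backward elimination against a fixed, pre-trained model.

    Assumption: "removing" a variable is modeled by replacing its value
    with `fill`, because the trained model still expects every key.
    """
    # First output: the model applied to the full first dataset.
    first_output = [model(r) for r in rows]
    kept = list(variables)
    for var in variables:
        candidate = [v for v in kept if v != var]
        # Subset of data with `var` (and all previously dropped variables) masked.
        masked = [
            {k: (r[k] if k in candidate else fill) for k in variables}
            for r in rows
        ]
        # Second output, then the comparison of first and second outputs.
        second_output = [model(r) for r in masked]
        impact = max(abs(a - b) for a, b in zip(first_output, second_output))
        if impact <= threshold:
            kept = candidate  # removal leaves the output within the threshold
    return kept


# Hypothetical toy model: "noise" has zero weight, so it has no effect on the output.
def toy_model(row: Row) -> float:
    return 2.0 * row["price"] + 3.0 * row["comfort"] + 0.0 * row["noise"]


rows = [
    {"price": 1.0, "comfort": 2.0, "noise": 5.0},
    {"price": 3.0, "comfort": 0.5, "noise": -1.0},
]
kept = minimum_variable_set(toy_model, rows, ["price", "comfort", "noise"])
print(kept)
```

With a threshold of zero, only variables that leave every output unchanged are dropped, which parallels the limitation that applying the model to only the minimum set of variables generates an output equal to the first output.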

Regarding claim 2
Pinto in view of Weston with Wagner teaches claim 1. 
Weston further teaches wherein the output for the new data comprises a prediction, decision, or classification associated with the new data. (Col 11 lines 11-16 and 35-40 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316. Optimal output 1316 may identify causal relationships between the mammography and genomic data points…. The methods, devices and systems described herein can be used with publicly available data to find relevant answers, such as genes determinative of a cancer diagnosis, or with specifically generated data.” The optimal output from the SVMs is used to identify causal relationships between the mammography and genomic data points, which can determine/predict a cancer diagnosis. Also see col 11 lines 59-66 “While the present discussion is directed to two-class classification problems, this is not to limit the scope of the invention. The two classes are identified with the symbols (+) and (-). A training set of a number of patterns {x1, x2, ... xk, ... xl} with known class labels {y1, y2, ... yk, ... yl}, yk ∈ {-1,+1}, is given. The training patterns are used to build a decision function (or discriminant function) D(X), that is a scalar function of an input pattern X.”)
Pinto, Wagner and Weston are analogous art because they are all directed to dimensionality reduction.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Wagner to incorporate the teaching of Weston to include method or system with feature selection for machine learning model.  
One of ordinary skill in the art would have been motivated to make this modification in order to identify relevant patterns from datasets in “a method and system for selection of features within the data sets which best enable identification of relevant patterns” as disclosed by Weston (col 1 lines 39-43).

Regarding claim 8 (Currently Amended) 
Pinto teaches a computer-implemented method of reducing an amount of computational resources consumed by a machine-learning model, (abstract “Models are generated using a variety of tools and features of a model generation platform. For example, in connection with a project in which a user generates a predictive model based on historical data about a system being modeled, the user is provided through a graphical user interface a structured sequence of model generation activities to be followed, the sequence including dimension reduction, model generation, model process validation, and model re-generation”)
the method comprising: applying, by at least one data processor of a computing device, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
a machine-learning model to a first dataset to generate a first output, (Examiner notes that the machine-learning model ‘Dimensionality reduction’ is used in FIG. 10, and items 170 to 172 generate the first output after removing sparse variables; see col 11 lines 9-13 “The first filter 170 reduces the dimensionality of the modeling space by eliminating variables, xjk for which the density, D(x1n), is less than some fixed constant, C1.”)
the first dataset comprising a plurality of variables, (col 6 lines 17-20 “In the dataset exploration stage, aspects of the historical data may be examined in terms of the predictor variables. By predictor variable, we mean a potential covariate or independent variable of a model.”)
wherein the machine-learning model is trained prior to the applying; (FIG. 11 shows training model at item 194 before evaluating performance also see col 12 lines 47-53 “The next stage 194 is to fit or train the model to the sample subset of the historical data using the predictive variables generated by a set of variable transformation to maximum univariate predictive capability and a set of dimension reduction filters to retain only the most predictive subgroup of variables, including up to a level of interaction, for example, tertiary interactions.”)
iteratively (col 7 lines 9-14 “The user can invoke the modeling process repeatedly, for example, to review outcome response functions for individual predictor variables, or recursive partition trees for individual variables or for the set of predictor variables; to modify the predictor variable pool, to create and filter additional variables, or to modify the candidate model.”) removing, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”) variables from the first dataset; (col 11 lines 7-13 “The flowchart in FIG. 10 provides an example of such cascade of filtering operations on the set of predictor variables. The first filter 170 reduces the dimensionality of the modeling space by eliminating variables, X" for which the density, D(x'), is less than some fixed constant, C. (These are variables which have not or cannot be processed for missing data imputation.)”)
for each iteration, (FIG. 10 shows the steps of each iteration for items 170-186. See col 11 lines 20-22 “In the second filtering stage 172, a Subspace is iteratively generated by including only significant variables, e.g., x, whose probability of non-contribution,”)
(i) applying, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
the machine-learning model to a subset of data, wherein the subset of data has one or more variables removed from the first dataset to generate a second output, (Examiner notes that the machine-learning model ‘Dimensionality reduction’ is used in FIG. 10, and item 176 removes irrelevant variables [the retained variables correspond to the subset of data]; see col 11 lines 30-34 “In the third stage 174, the subspace, X is expanded by including all significant cross-products… where xjk and xpq, are in XQ, then applying a filter 176 to retain only significant variables, e.g., x1k, whose probability of non-contribution, [1-Pr(ylx4 k,)] is less than a fixed constant, C.”)
…
determining, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
a minimum set of variables comprising the subset of data with variables having impact below a predetermined threshold on an output of the machine-learning model based on the comparisons, (Examiner interprets the minimum set of variables in terms of prediction accuracy: the dimensionality reduction process yields a minimum set of variables because it removes irrelevant variables, which corresponds to maintaining prediction accuracy; see col 11 lines 45-50 “In the fifth stage 182, the augmented subspace, X## s, is further augmented with all the cross-products, xjk * zrs, where xjk, are from, X## and zrs from Xn-Xm, then applying a filter 184 to retain only significant variables, e.g., x5k, whose probability of non-contribution, [1-Pr(y| x5k)], is less than a fixed constant, C4 [corresponds to threshold].”)
wherein applying the machine learning model to the first dataset comprising only the minimum set of variables; (Examiner notes that after the process of dimensionality reduction is done, the system identifies the optimal variables, which correspond to only the minimum set of variables; see col 12 lines 53-55 “At successive stages of convergence of the model parameters, the performance may be evaluated 198 and tested for optimality 200. The evaluation is based on a set of criteria including cumulative lift over the region of interest. If the performance indicates the model is optimal then the model is persisted;”)
and applying, by at least one data processor, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
subsequent to the determining, the machine-learning model to new data to generate an output for the new data, (Examiner notes that FIG. 10 shows the workflow diagram of a dimension reduction process in which item 184 removes irrelevant variables and item 186 updates the set of predictor variables; because the process of FIG. 10 removes the unnecessary variables, the machine-learning model is applied, subsequent to the determining, to generate the output for the new data; see col lines 61-66 “The resulting hyperplane represents an efficient classification mapping. The set of predictor variables is then adjusted to include the variables passing through the sequence of filters resulting in an updated collection of predictor variables 186 available for constructing predictive models.”)
wherein the new data includes the minimum set of variables. (Col lines 61-66 “The resulting hyperplane represents an efficient classification mapping. The set of predictor variables is then adjusted to include the variables passing through the sequence of filters resulting in an updated collection of predictor variables 186 available for constructing predictive models.”)
Pinto does not teach …and (ii) comparing the first and second outputs;
wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output
…
wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification. 
Weston teaches …and (ii) comparing the first and second outputs; (Col 11 lines 11-16 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316.”)
wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification. (Col 11 lines 11-16 and 35-40 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316. Optimal output 1316 may identify causal relationships between the mammography and genomic data points…. The methods, devices and systems described herein can be used with publicly available data to find relevant answers, such as genes determinative of a cancer diagnosis, or with specifically generated data.” The optimal output from the SVMs is used to identify causal relationships between the mammography and genomic data points, which can determine/predict a cancer diagnosis. Also see col 11 lines 59-66 “While the present discussion is directed to two-class classification problems, this is not to limit the scope of the invention. The two classes are identified with the symbols (+) and (-). A training set of a number of patterns {x1, x2, ... xk, ... xl} with known class labels {y1, y2, ... yk, ... yl}, yk ∈ {-1,+1}, is given. The training patterns are used to build a decision function (or discriminant function) D(X), that is a scalar function of an input pattern X.”)
Pinto and Weston are analogous art because they are both directed to dimensionality reduction.  
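It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto to incorporate the teaching of Weston to include method or system with feature selection for machine learning model.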
One of ordinary skill in the art would have been motivated to make this modification in order to identify relevant patterns from datasets in “a method and system for selection of features within the data sets which best enable identification of relevant patterns” as disclosed by Weston (col 1 lines 39-43).
Pinto in view of Weston does not teach wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output.
Wagner teaches wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output. (Examiner notes that Wagner teaches removing variables that have no effect thus means output equal to the first output see pg. 61 right col first paragraph “the average difference in the efficiency scores between E1,i and E* is 0. In other words, these three input variables can be removed from the model without affecting a single efficiency score since they have no effect at all on the efficiency scores. In this example then, we can skip directly to a model with only 3 inputs (price, convenience, and room comfort) and the 2 outputs. This model yields the same efficiencies as in the starting model.”)
Pinto, Weston and Wagner are analogous art because they are all directed to feature selection.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston to incorporate the teaching of Wagner to include feature selection that is able to remove variables without affecting the accuracy of the score, which allows the system to output the same efficiencies as in the starting model, as disclosed by Wagner (pg. 61 right col).
Regarding claim 15 (Currently Amended) 
Pinto teaches a non-transitory computer-readable storage medium for reducing an amount of computational resources consumed by a machine-learning model, (abstract “Models are generated using a variety of tools and features of a model generation platform. For example, in connection with a project in which a user generates a predictive model based on historical data about a system being modeled, the user is provided through a graphical user interface a structured sequence of model generation activities to be followed, the sequence including dimension reduction, model generation, model process validation, and model re-generation”)
the computer-readable storage medium comprising computer executable instructions which, (col 5 lines 20-22 “As shown in FIG. 2, the model development platform may be implemented in software 30 running on a workstation 32 that includes a microprocessor 34, a random access memory 36 that stores instructions and data that are used by the microprocessor to run the program”)
when executed, cause a processing system to execute steps including: applying a machine-learning model to a first dataset… to generate a first output, (Examiner notes that the machine-learning model ‘Dimensionality reduction’ is used in FIG. 10, and items 170 to 172 generate the first output after removing sparse variables; see col 11 lines 9-13 “The first filter 170 reduces the dimensionality of the modeling space by eliminating variables, xjk for which the density, D(x1n), is less than some fixed constant, C1.”)
the first dataset comprising a plurality of variables, (col 6 lines 17-20 “In the dataset exploration stage, aspects of the historical data may be examined in terms of the predictor variables. By predictor variable, we mean a potential covariate or independent variable of a model.”)
wherein the machine-learning model is trained prior to the applying; (FIG. 11 shows training model at item 194 before evaluating performance also see col 12 lines 47-53 “The next stage 194 is to fit or train the model to the sample subset of the historical data using the predictive variables generated by a set of variable transformation to maximum univariate predictive capability and a set of dimension reduction filters to retain only the most predictive subgroup of variables, including up to a level of interaction, for example, tertiary interactions.”)
iteratively (col 7 lines 9-14 “The user can invoke the modeling process repeatedly, for example, to review outcome response functions for individual predictor variables, or recursive partition trees for individual variables or for the set of predictor variables; to modify the predictor variable pool, to create and filter additional variables, or to modify the candidate model.”) removing variables from the first dataset; (col 11 lines 7-13 “The flowchart in FIG. 10 provides an example of such cascade of filtering operations on the set of predictor variables. The first filter 170 reduces the dimensionality of the modeling space by eliminating variables, X" for which the density, D(x'), is less than some fixed constant, C. (These are variables which have not or cannot be processed for missing data imputation.)”)
for each iteration, (FIG. 10 shows the steps of each iteration for items 170-186. See col 11 lines 20-22 “In the second filtering stage 172, a Subspace is iteratively generated by including only significant variables, e.g., x, whose probability of non-contribution,”)
(i) applying the machine-learning model to a subset of data, wherein the subset of data has one or more variables removed from the first dataset to generate a second output, (Examiner notes that the machine-learning model ‘Dimensionality reduction’ is used in FIG. 10, and item 176 removes irrelevant variables [the retained variables correspond to the subset of data]; see col 11 lines 30-34 “In the third stage 174, the subspace, X is expanded by including all significant cross-products… where xjk and xpq, are in XQ, then applying a filter 176 to retain only significant variables, e.g., x1k, whose probability of non-contribution, [1-Pr(ylx4 k,)] is less than a fixed constant, C.”)
…
determining a minimum set of variables comprising the subset of data with variables having impact below a predetermined threshold on an output of the machine-learning model based on the comparisons, (Examiner interprets minimum set of variables as accuracy in prediction and the process of dimensionality reduction contains set of minimum variables because the process removes irrelevant variables thus corresponds to maintaining prediction accuracy see col 11 lines 45-50 “In the fifth stage 182, the augmented subspace, X## s, is further augmented with all the cross-products, xjk * zrs, where xjk, are from, X## and zrs from Xn-Xm, then applying a filter 184 to retain only significant variables, e.g., x5k, whose probability of non-contribution, [1-Pr(y| x5k), is less than a fixed constant, C4[corresponds to threshold].”)
…
and applying the machine-learning model to new data to generate an output for the new data, (Examiner notes that FIG. 10 shows the workflow diagram of a dimension reduction process in which item 184 removes irrelevant variables and item 186 updates the set of predictor variables. Because the process of FIG. 10 eliminates the unnecessary variables, the machine learning model is subsequently applied to generate output for the new data; see col lines 61-66 “The resulting hyperplane represents an efficient classification mapping. The set of predictor variables is then adjusted to include the variables passing through the sequence of filters resulting in an updated collection of predictor variables 186 available for constructing predictive models.”)
wherein the new data …includes the minimum set of variables (Col lines 61-66 “The resulting hyperplane represents an efficient classification mapping. The set of predictor variables is then adjusted to include the variables passing through the sequence of filters resulting in an updated collection of predictor variables 186 available for constructing predictive models.”)
Pinto does not teach …applying a machine-learning model to a first dataset comprising medical data; 
…
and (ii) comparing the first and second outputs; 
wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output, wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification; 
wherein the new data comprises medical data for a patient includes the minimum set of variables;
and the output for the new data predicts whether the patient has a medical condition.
Weston teaches …applying a machine-learning model to a first dataset comprising medical data; (col 13 lines 59-63 “FIG. 5 illustrates an exemplary hierarchical system of SVMs. As shown, one or more first-level SVMs 1302a and 1302b may be trained and tested to process a first type of input data 1304a, such as mammography data, pertaining to a sample of medical patients” also see col 7 lines 2-7 “For example, biological data obtained from clinical case information, such as diagnostic test data, family or genetic histories, prior or current medical treatments, and the clinical outcomes of such activities, can be utilized in the methods, systems and devices of the present invention.”)
…
(Col 11 lines 11-16“The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316.”)
wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification (Col 11 lines 11-16 and 35-40 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316. Optimal output 1316 may identify causal relationships between the mammography and genomic data points…. The methods, devices and systems described herein can be used with publicly available data to find relevant answers, such as genes determinative of a cancer diagnosis, or with specifically generated data.” The optimal output from the SVM is used to identify a causal relationship between the mammography and genomic data points, which can determine/predict a cancer diagnosis. Also see col 11 lines 59-66 “While the present discussion is directed to two-class classification problems, this is not to limit the scope of the invention. The two classes are identified with the symbols (+) and (-). A training set of a number of patterns {x1, x2, ... xk, ... xl} with known class labels {y1, y2, ... yk, ... yl}, yk ∈ {-1, +1}, is given. The training patterns are used to build a decision function (or discriminant function) D(X), that is a scalar function of an input pattern X.”)
comprises medical data for a patient; (col 13 lines 59-63 “FIG. 5 illustrates an exemplary hierarchical system of SVMs. As shown, one or more first-level SVMs 1302a and 1302b may be trained and tested to process a first type of input data 1304a, such as mammography data, pertaining to a sample of medical patients”)
and the output for the new data predicts whether the patient has a medical condition. (Col 11 lines 11-16 “The new data set may then be processed by one or more appropriately trained and tested second-level SVMs 1312a and 1312b. The resulting outputs 1314a and 1314b from second-level SVMs 1312a and 1312b may be compared to determine an optimal output 1316. Optimal output 1316 may identify causal relationships between the mammography and genomic data points…. The methods, devices and systems described herein can be used with publicly available data to find relevant answers, such as genes determinative of a cancer diagnosis, or with specifically generated data.” The optimal output from the SVM is used to identify a causal relationship between the mammography and genomic data points, which can determine/predict a cancer diagnosis for a medical patient; see col 11 lines 35-40 “The methods, devices and systems described herein can be used with publicly available data to find relevant answers, such as genes determinative of a cancer diagnosis, or with specifically generated data.”)
Pinto and Weston are analogous art because they are both directed to dimensionality reduction.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto to incorporate the teaching of Weston to include a method or system with feature selection for a machine learning model.  
One of ordinary skill in the art would have been motivated to make this modification in order to identify relevant patterns from datasets in “a method and system for selection of features within the data sets which best enable identification of relevant patterns” as disclosed by Weston (col 1 lines 39-43). 
Pinto in view of Weston does not teach wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output. 
Wagner teaches wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output. (Examiner notes that Wagner teaches removing variables that have no effect, which means the resulting output is equal to the first output; see pg. 61 right col first paragraph “the average difference in the efficiency scores between E1,i and E* is 0. In other words, these three input variables can be removed from the model without affecting a single efficiency score since they have no effect at all on the efficiency scores. In this example then, we can skip directly to a model with only 3 inputs (price, convenience, and room comfort) and the 2 outputs. This model yields the same efficiencies as in the starting model.”)
Pinto, Weston and Wagner are analogous art because they are all directed to feature selection.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston to incorporate the teaching of Wagner to include feature selection that is able to remove variables without affecting the accuracy of the score, which allows the system to output the same efficiencies as in the starting model as disclosed by Wagner (pg. 61 right col). 
Regarding claims 9 and 16
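For illustration only, the combined teaching relied upon above (iteratively removing variables, reapplying an already trained model, and retaining a removal only when the output is unchanged) can be sketched as follows. The linear classifier, its weights, the variable names, and the treatment of a removed variable as contributing zero are all hypothetical assumptions; none is taken from Pinto, Weston, Wagner, or the application.

```python
# Hypothetical, already-trained linear classifier (weights are assumed values).
WEIGHTS = {"age": 0.8, "marker_a": 1.5, "zip_code": 0.0, "visit_id": 0.0}
BIAS = -1.0

def predict(row):
    """Return a class label (1 or 0); a variable absent from the row contributes 0."""
    score = BIAS + sum(w * row.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1 if score > 0 else 0

def minimum_variable_set(rows, variables):
    """Try removing each variable once; keep the removal only if no output changes."""
    first_output = [predict(r) for r in rows]
    kept = list(variables)
    for name in variables:                       # one iteration per variable
        trial = [v for v in kept if v != name]
        subset = [{v: r[v] for v in trial} for r in rows]
        second_output = [predict(r) for r in subset]
        if second_output == first_output:        # compare first and second outputs
            kept = trial
    return kept

rows = [
    {"age": 0.2, "marker_a": 0.9, "zip_code": 7.0, "visit_id": 3.0},
    {"age": 1.5, "marker_a": 0.1, "zip_code": 2.0, "visit_id": 9.0},
    {"age": 0.2, "marker_a": 0.3, "zip_code": 4.0, "visit_id": 1.0},
]
minimum = minimum_variable_set(rows, list(WEIGHTS))
```

In this toy data the two zero-weight variables are removed and the model applied to only the remaining variables reproduces the first output, which is the sense in which Wagner's "same efficiencies as in the starting model" is read onto the claim language.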
	Referring to dependent claims 9 and 16, they are rejected on the same basis as dependent claim 2 since they are analogous claims. 

Regarding claim 21
Pinto in view of Weston with Wagner teaches claim 1. 
Wagner further teaches wherein the iteratively removing is repeated either (i) a number of times equal to a number of the plurality of variables (Examiner notes that Wagner teaches removing variables that have no effect, which means the resulting output is equal to the first output; see pg. 61 right col first paragraph “the average difference in the efficiency scores between E1,i and E* is 0. In other words, these three input variables can be removed from the model without affecting a single efficiency score since they have no effect at all on the efficiency scores. In this example then, we can skip directly to a model with only 3 inputs (price, convenience, and room comfort) and the 2 outputs. This model yields the same efficiencies as in the starting model.”) 

Pinto, Weston and Wagner are analogous art because they are all directed to feature selection.  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston to incorporate the teaching of Wagner to include feature selection that is able to remove variables without affecting the accuracy of the score, which allows the system to output the same efficiencies as in the starting model as disclosed by Wagner (pg. 61 right col). 

Claims 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Pinto et al. (US Pat No. 8751273 B2) in view of Weston et al. (US Pat No. 7624074 B2) in view of Wagner et al. (“Stepwise selection of variables in data envelopment analysis: Procedures and managerial perspectives”, hereinafter: Wagner) and further in view of Chen et al. (“Combining SVMs with Various Feature Selection Strategies”, hereinafter: Chen).
Regarding claim 3
	Pinto in view of Weston with Wagner teaches claim 1. 
Pinto in view of Weston with Wagner does not teach wherein the comparing of the first and second outputs comprises determining a difference between the first and second outputs and the subset of variables is determined based on the differences.
Chen teaches wherein the comparing of the first and second outputs comprises determining a difference between the first and second outputs, (pg. 4 section 3.3 “To obtain feature importance, first we split the training sets to two parts. By training the first and predicting the second we obtain an accuracy value. For the jth feature, we randomly permute its values in the second set and obtain another accuracy. The difference between the two numbers can indicate the importance of the jth feature.”)
and the subset of variables is determined based on the differences. (pg. 4 right second paragraph “Thus, before using RF to select features, we obtain a subset of features using F-score selection first.”)
Pinto, Weston, Wagner and Chen are analogous art because they are all directed to feature extraction. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston with Wagner to incorporate the teaching of Chen to include “F-score + RF + SVM” with various feature selection strategies that is able to calculate the difference between two accuracy values and rank the importance of the feature being selected for the purpose of obtaining accurate results as disclosed by Chen (pg. 4 section 3.3). 
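For illustration only, the accuracy-difference measure quoted from Chen (train on one part, predict the other, permute the jth feature, and take the difference between the two accuracies) can be sketched as follows. The nearest-centroid classifier is a stand-in assumption chosen to keep the sketch self-contained; Chen uses random forests, and the data and function names here are hypothetical.

```python
import random

def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    centroids = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean distance)."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(X_train, y_train, X_valid, y_valid, seed=0):
    """Importance of feature j = baseline accuracy minus accuracy with column j permuted."""
    rng = random.Random(seed)
    model = fit_centroids(X_train, y_train)
    baseline = accuracy(model, X_valid, y_valid)
    importances = []
    for j in range(len(X_valid[0])):
        column = [row[j] for row in X_valid]
        rng.shuffle(column)                      # randomly permute the jth feature
        permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_valid, column)]
        importances.append(baseline - accuracy(model, permuted, y_valid))
    return importances

# Hypothetical data: feature 0 separates the classes, feature 1 is constant.
X_train = [[0.0, 5.0], [0.2, 5.0], [1.0, 5.0], [0.8, 5.0]]
y_train = [0, 0, 1, 1]
X_valid = [[0.1, 5.0], [0.3, 5.0], [0.9, 5.0], [0.7, 5.0]]
y_valid = [0, 0, 1, 1]
importances = permutation_importance(X_train, y_train, X_valid, y_valid)
```

A constant feature yields a difference of exactly zero because permuting it changes nothing, whereas an informative feature yields a non-negative difference whose size depends on the particular permutation drawn.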

Regarding claims 10 and 17
	Referring to dependent claims 10 and 17, they are rejected on the same basis as dependent claim 3 since they are analogous claims. 

Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Pinto et al. (US Pat No. 8751273 B2) in view of Weston et al. (US Pat No. 7624074 B2) in view of Wagner et al. in view of Chen et al. (“Combining SVMs with Various Feature Selection Strategies”, hereinafter: Chen) and further in view of Yuan et al. (“A Two-phase Feature Selection Method using both Filter and Wrapper”, hereinafter: Yuan).
Regarding claim 4
Pinto in view of Weston with Wagner and Chen teaches claim 3. 
Pinto in view of Weston with Wagner and Chen does not teach comprising: determining whether the difference is less than the predetermined threshold, wherein a variable is determined to have impact below the predetermined threshold based on a determination that the difference is less than the predetermined threshold. 
Yuan teaches the method comprising: determining whether the difference is less than the predetermined threshold, wherein a variable is determined to have impact below the predetermined threshold based on a determination that the difference is less than the predetermined threshold. (Pg. 135 left col “With the difference in performance between two networks with different sets of input features, SBFCV also decides whether to continue or to stop removing more features. SBFCV stops if the performance of network drops below a given threshold by removal the least relevant feature.”)
Pinto, Weston, Wagner, Chen and Yuan are analogous art because they are all directed to feature extraction.  
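It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston with Wagner and Chen to incorporate the teaching of Yuan to include feature selection with cross validation that uses the difference in performance between two networks with different sets of input features to decide whether to continue removing features as disclosed by Yuan (pg. 135 left col). 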
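For illustration only, the threshold test quoted from Yuan (continue removing the least relevant feature while the difference in performance stays below a given threshold, and stop otherwise) can be sketched as follows. The additive performance function, its contribution values, and all names are hypothetical assumptions; Yuan's SBFCV evaluates neural networks with cross validation, which is not reproduced here.

```python
def backward_removal(features, performance, threshold):
    """Remove the least relevant feature while the performance drop stays below threshold."""
    kept = list(features)
    removed = []
    while len(kept) > 1:
        current = performance(kept)
        # Least relevant feature: the one whose removal costs the least performance.
        drops = {f: current - performance([k for k in kept if k != f]) for f in kept}
        candidate = min(drops, key=drops.get)
        if drops[candidate] >= threshold:        # difference not below threshold: stop
            break
        kept.remove(candidate)
        removed.append(candidate)
    return kept, removed

# Hypothetical performance model: each feature adds a fixed contribution.
CONTRIBUTION = {"f1": 0.30, "f2": 0.25, "f3": 0.01, "f4": 0.0}

def toy_performance(feature_subset):
    return 0.4 + sum(CONTRIBUTION[f] for f in feature_subset)

kept, removed = backward_removal(["f1", "f2", "f3", "f4"], toy_performance, threshold=0.05)
```

Under these assumed values the two low-impact features are removed in turn and the procedure stops once the smallest remaining drop reaches the threshold.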
Regarding claims 11 and 18
	Referring to dependent claims 11 and 18, they are rejected on the same basis as dependent claim 4 since they are analogous claims. 

Claims 5-6, 12-13 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Pinto et al. (US Pat No. 8751273 B2) in view of Weston et al. (US Pat No. 7624074 B2) in view of Wagner et al. and further in view of Martinez et al. (WO 2014/100738 A1).
Regarding claim 5
Pinto in view of Weston with Wagner teaches claim 1. 
Pinto further teaches the method further comprising: receiving training data determined to be usable for training the machine-learning model; (col 17 lines 20-26 “The development dataset is first partitioned 300 to obtain a training sample. Then the training sample dataset is reviewed 302 at the individual variable level 304 and transformed 306 or reviewed in terms of the set of predictor variables using partition binary trees and automatically transformed 308 into variables suitable for dimension reduction 310”)
Pinto in view of Weston with Wagner does not teach wherein the machine-learning model includes weighting factors associated with the plurality of variables, training the machine-learning model with the processing system using the training data to determine values for the weighting factors; and configuring the machine-learning model with the determined values of the weighting factors.  
	Martinez teaches wherein the machine-learning model includes weighting factors associated with the plurality of variables, (abstract “Each of the machine learning training instances includes a state-action pair and is weighted during the training based on its associated quality value using a weighting factor that weights different quality values differently such that the classifier learns more from a machine learning training instance with a higher quality value than from a machine learning training instance with a lower quality value”)
training the machine-learning model with the processing system using the training data to determine values for the weighting factors; (pg. 3 lines 16-21 “In this example, each of the machine learning training instances may be weighted during the training based on its associated quality value using a weighting factor that weights different quality values differently. Also, in this example, the method may further include determining the quality values that should be associated with the machine learning training instances and associating the determined quality values with the machine learning training instances.”)
and configuring the machine-learning model with the determined values of the weighting factors. (Pg. 4 lines 12-17 “In this example embodiment, each of the machine learning training instances may be weighted during the training based on its associated quality value using a weighting factor that weights different quality values differently according to the following formula: u(q) = (a + b q), where: q is the associated quality value, u(q) is the weighting factor, a is a first empirical parameter, and b is a second empirical parameter.”)
Pinto, Weston, Wagner and Martinez are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston with Wagner to incorporate the teaching of Martinez to include training the machine-learning model with the processing system using the training data to determine values for the weighting factors such that machine learning classifier learns more from machine learning training instances with a higher quality value as disclosed by Martinez (abstract). 
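For illustration only, the weighting formula quoted from Martinez, u(q) = (a + b q), can be sketched as follows. The parameter values for a and b, the action labels, and the weighted-vote aggregation are hypothetical assumptions used to show how a higher quality value produces a larger weight; they are not taken from Martinez.

```python
def weighting_factor(q, a=1.0, b=2.0):
    """u(q) = a + b*q per the quoted formula; the values of a and b are assumptions."""
    return a + b * q

def weighted_vote(instances):
    """Pick the action with the largest total weight among (action, quality) instances."""
    totals = {}
    for action, quality in instances:
        totals[action] = totals.get(action, 0.0) + weighting_factor(quality)
    return max(totals, key=totals.get), totals

# Hypothetical training instances: (action label, associated quality value).
instances = [("action_a", 0.9), ("action_b", 0.1), ("action_b", 0.2)]
best, totals = weighted_vote(instances)
```

Here a single high-quality instance outweighs two low-quality instances, consistent with the quoted statement that the classifier learns more from an instance with a higher quality value.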
Regarding claims 12 and 19
	Referring to dependent claims 12 and 19, they are rejected on the same basis as dependent claim 5 since they are analogous claims. 

Regarding claim 6
Pinto in view of Weston with Wagner and Martinez teaches claim 5. 
Martinez further teaches wherein the training comprises: processing the training data to determine numerical measures from the training data for the plurality of variables; (pg. 14 lines 18-24 “Then, for training data with m temporal sequences T = {L1, L2, ..., Lm}, training instances can be derived from each sequence added to the training set. Thus, the total number of training instances that can be added to the training set is N(L1) + N(L2) + ... + N(Lm), where N(Li) is the length, or number of state-action training instances, of Li (i = 1, 2, ..., m). After a training set is built from the temporal sequences T = {L1, L2, ..., Lm}, a classifier can be trained to learn a policy for decision making. The purpose of training is to enable a machine learning classifier to learn an optimal policy for making a decision (choosing action vector a) given an input feature vector[corresponds to numerical measures] (state vector s).”)
and conducting a numerical machine-learning analysis based on the numerical measures from the training data to determine the values of the weighting factors. (Pg. 4 lines 12-17 “In this example embodiment, each of the machine learning training instances may be weighted during the training based on its associated quality value using a weighting factor that weights different quality values differently according to the following formula: u(q) = (a + b q), where: q is the associated quality value, u(q) is the weighting factor, a is a first empirical parameter, and b is a second empirical parameter.”)
Pinto, Weston, Wagner and Martinez are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston with Wagner to incorporate the teaching of Martinez to include training the machine-learning model with the processing system using the training data to determine values for the weighting factors such that machine learning classifier learns more from machine learning training instances with a higher quality value as disclosed by Martinez (abstract). 
Regarding claim 13
	Referring to dependent claim 13, it is rejected on the same basis as dependent claim 6 since they are analogous claims. 

Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Pinto et al. (US Pat No. 8751273 B2) in view of Weston et al. (US Pat No. 7624074 B2) in view of Wagner et al. and further in view of Somol et al. (“Evaluating Stability and Comparing Output of Feature Selectors that Optimize Feature Subset Cardinality”, hereinafter: Somol). 
Regarding claim 22
Pinto in view of Weston with Wagner teaches claim 1. 
Pinto in view of Weston with Wagner does not teach wherein the comparing of the first and second outputs determines whether the first output equals the second output.  
Somol teaches wherein the comparing of the first and second outputs determines whether the first output equals the second output. (Abstract “We also introduce an alternative approach to feature selection evaluation in the form of measures that enable comparing the similarity of two feature selection processes. These measures enable comparing, e.g., the output of two feature selection methods or two runs of one method with different parameters. The information obtained using the considered stability and similarity measures is shown to be usable for assessing feature selection methods (or criteria) as such.” Also see pg. 1925 right col “Both IC and ICW take values from… with 0 indicating that no feature appears in more than one system and 1 indicating that the relative frequencies are equal for each feature in both systems, i.e., feature selector confidence regarding each feature is equal among the two compared systems.”)
Pinto, Weston, Wagner and Somol are analogous art because they are all directed to data automation. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston with Wagner to incorporate the teaching of Somol to include feature selection evaluation in the form of measures that enable comparing the similarity of two feature selection processes and provide accurate information quickly as disclosed by Somol (abstract). 
Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Pinto et al. (US Pat No. 8751273 B2) in view of Weston et al. (US Pat No. 7624074 B2) in view of Wagner et al. and further in view of Yu et al. (“Feature Selection Using Principal Feature Analysis”, hereinafter: Yu). 
Regarding claim 23
Pinto in view of Weston with Wagner teaches claim 1. 
Pinto in view of Weston with Wagner does not teach wherein the first dataset is a randomized subset of data randomly selected from a larger dataset.
Yu teaches wherein the first dataset is a randomized subset of data randomly selected from a larger dataset. (pg. 304 right col “The performance of the original feature set and the principal feature set are tested against 3 randomly selected (but the same number of) features. It clearly shows that principal features yield comparable results as that of original set and PCA, and significantly higher results than any of random picks.”)
Pinto, Weston, Wagner and Yu are analogous art because they are all directed to machine learning model. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Pinto in view of Weston with Wagner to incorporate the teaching of Yu to include random selection of variables in dimensionality reduction for choosing the principal features in face tracking and content-based image retrieval as disclosed by Yu (abstract). 
 

Response to Arguments
Applicant's arguments filed 02/09/2021 have been fully considered but they are not persuasive. 
Rejections Under 35 U.S.C. 103 
Applicant asserts that “it is respectfully submitted that the skilled artisan would not have resulted in the subject matter recited in the claims using the art of record given the differences described below. In one example, the combinations of (i) Pinto and Weston and (ii) Pinto, Weston, and Wagner fail to teach or described "wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification," as 
Examiner’s response:
The Examiner respectfully disagrees. Weston teaches SVM machine learning classifiers as evidenced by col 11 lines 11-16. The outputs 1314a and 1314b of the SVM classifiers are compared to determine an optimal output 1316 (col 11 lines 11-16). Because the optimal output 1316 “may identify causal relationships between the mammography and genomic data points”, it is clear that Weston teaches a prediction as to the causal relationship between mammography and genomic data points as evidenced by col 11 lines 11-16 and 35-40. 

Applicant asserts that “It is respectfully submitted that the Examiner's continued reliance on Weston is unreasonable. Weston is directed to the training of machine learning model. As noted in the last reply, while Weston does describe in Col. 11-16 the application of "new data" and comparison of outputs. Such comparisons are for the purpose of training a machine learning model, rather than determining "a minimum set of variables comprising the subset of data with variables having impact below a predetermined threshold on an output of the machine-learning model based on the comparisons," as recited by the independent claims. The machine learning model is then applied to new data having that minimum set of variables. Furthermore, the independent claims recite that "the machine-learning model is trained prior to the applying." Weston specifically describes the process of training a machine learning model, and as such the teachings of Weston may not be reasonably construed as describing techniques for determining a minimum set of variables after the machine learning model is trained. As such, the Examiner unreasonably relies upon a reference that focuses on training a machine learning model for allegedly teaching aspects of the use of a previously trained machine learning model… Wagner is directed to data envelopment analysis and the stepwise selection of variables. The Examiner relies upon Wagner to allegedly teach "an output equal to the first output." More specifically, the Examiner points to Wagner's statement that "the average difference in the efficiency scores between El, i and E* is 0." From this description, Wagner makes at least two things clear. One, that Wagner is … the art as qualitative data. In other words, Wagner deals with quantitative, rather than qualitative comparisons. 
As such, Wagner may not be reasonably construed as teaching or describing "wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output, wherein each of the output and the first output comprise at least one of a prediction, a decision, or a classification," as recited by the independent claims. For at least these reasons, the combination of Pinto and Weston fails to teach or describe the elements of independent claim 1. Similarly, the combination of Pinto, Weston, and Wagner fail to teach or describe the elements of independent claims 8 and 15. These claims are therefore nonobvious under 35 U.S.C. § 103. Any claims depending therefrom are also nonobvious by virtue of dependency and for their own features. Withdrawal of these rejections is respectfully requested.” (Remarks pg. 11-12)
Examiner’s response:
The Examiner respectfully disagrees. Weston teaches SVM machine learning classifiers as evidenced by col 11 lines 11-16. The outputs 1314a and 1314b of the SVM classifiers are compared to determine an optimal output 1316 (col 11 lines 11-16). Because the optimal output 1316 “may identify causal relationships between the mammography and genomic data points”, it is clear that Weston teaches a prediction as to the causal relationship between mammography and genomic data points as evidenced by col 11 lines 11-16 and 35-40. 
Regarding “As such, Wagner may not be reasonably construed as teaching or describing "wherein applying the machine learning model to the first dataset comprising only the minimum set of variables generates an output equal to the first output””, the Wagner reference discloses removing variables that have no impact or effect, which means generating an output equal to the first output. It is well understood in the art that if the only variables removed are those that have absolutely zero effect, then “only the minimum set of variables generates an output equal to the first output.” 

Applicant asserts “In another example, the combination of Pinto, Weston, and Wagner also fails to teach or describe "a first dataset comprising medical data" and "the output for the new data predicts whether the patient has a medical condition," as recited by amended independent claim 15. None of the references mention any use of medical data or predicting whether a patient has a medical condition based on that data. For at least these reasons, the combination of Pinto, Weston, and Wagner fails to teach or describe the elements of independent claim 15. These claims are therefore nonobvious under 35 U.S.C. § 103. Any claims depending therefrom are also nonobvious by virtue of dependency and for their own features. Withdrawal of these rejections is respectfully requested.” (Remarks pg. 12)

Examiner’s response:
The Examiner respectfully disagrees. First, Weston teaches the use of medical data as evidenced by col 7 lines 2-7 “For example, biological data obtained from clinical case information, such as diagnostic test data, family or genetic histories, prior or current medical treatments, and the clinical outcomes of such activities, can be utilized in the methods, systems and devices of the present invention.” In addition, the SVM machine learning classifier is used to generate output from medical data because the datasets are medical records; see col 13 lines 59-63. 
Weston further teaches that the optimal output from the SVM is used to identify a causal relationship between the mammography and genomic data points, which can determine/predict a cancer diagnosis for a medical patient; see col 11 lines 35-40.




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Raymer et al.
Liu et al. (“Early Diagnosis of Alzheimer’s Disease with Deep learning”) teaches deep learning architecture that uses stacked auto-encoders and a softmax output layer, to overcome the bottleneck and aid the diagnosis of AD and its prodromal stage, Mild Cognitive Impairment. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN C MANG whose telephone number is (571)270-7598.  The examiner can normally be reached on Mon - Fri 8:00-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo can be reached on 5712729767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 






/V.M./Examiner, Art Unit 2126
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126