Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-4, 6-7, 9-14, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Bryan Cutler (“Model Parallelism with Spark ML Tuning” – hereinafter referred to as Bryan) in view of Zhang et al. (“GStream: A General-Purpose Data Streaming Framework on GPU Clusters” – hereinafter referred to as Zhang) and further in view of Chacon et al. (“Automatic Problem-Specific Hyperparameter Optimization and Model Selection for Supervised Machine Learning: Technical Report” – hereinafter referred to as Chacon).

In regard to claim 1, Bryan discloses a method comprising:

identifying a model for a machine-learning program, the model comprising a plurality of hyperparameter value sets to be tested based on a dataset, (Bryan page 1 teaches “This process uses a parameter grid where a model is trained for each combination of parameters and evaluated according to a metric.” This teaches an identified model (a model) comprising a plurality of hyperparameter value sets (combinations of parameters).)

the dataset having performance data for a plurality of features identified for the machine-learning program; (Bryan page 3 teaches the dataset consisting of 100 features.)

breaking the dataset into a plurality of fragments for evaluating the model with a processing unit (Bryan page 3 teaches “Because the data has 2 partitions, Spark will be able to use 2 cores to run tasks from the stages in parallel…” This teaches that the data is fragmented and processed by processing units (cores).)

loading a plurality of cores with the model and a respective hyperparameter value set; (Bryan page 3 last paragraph – page 4 first paragraph teaches that 3 models are evaluated in parallel using 6 cores. This teaches loading cores with a model and respective hyperparameters.)

for each fragment from the plurality of fragments of the dataset: 
loading the fragment of the dataset to a core memory; and (Bryan page 3 teaches “Because the data has 2 partitions, Spark will be able to use 2 cores to run tasks from the stages in parallel…” This teaches that the data is fragmented and processed by processing units (cores).)

evaluating, in parallel by the plurality of cores, the fragment of the dataset based on the model and the respective hyperparameter value set associated with each core; (Bryan page 1 first paragraph teaches the use of a parameter grid wherein a model is trained for each combination of parameters. Page 4 paragraphs 1-2 then teach running 3 models in parallel on 6 cores.)

determining a best hyperparameter value set, from the plurality of hyperparameter value sets, for the machine-learning program; and (Bryan page 3 first paragraph teaches determining the best model of 16 models, wherein each model has different parameters from the 4x4 param grid. Determining the best model therefore also determines the best hyperparameter value set.)

storing and causing presentation of the best hyperparameter value set.  

	However, Bryan fails to disclose using GPUs, streaming data to GPUs, and storing and causing presentation of the best hyperparameter value set.

	Zhang discloses using GPUs and streaming data to GPUs. (Zhang abstract teaches the use of GPUs and page 1 right column last paragraph - page 2 second paragraph teaches streaming data to GPUs.)

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Bryan with those of Zhang to utilize GPUs and stream data to those GPUs, as both references deal with the use of cores and parallel processing. It would provide the benefit of faster processing and more efficient parallel processing over CPUs, as suggested in the first two paragraphs of Zhang.

	However, Bryan in view of Zhang fails to disclose storing and causing presentation of the best hyperparameter value set.

	Chacon discloses storing and causing presentation of the best hyperparameter value set. (Chacon page 24 section 3.5.4 teaches that the best models at each level of the hyperparameter hierarchy are used to create a ranked list of best models, and that the result of the model selection stage is reported. In creating a ranked list, the models are stored, and when the best model is chosen, the hyperparameters it used are the best and are displayed.)

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Bryan in view of Zhang with those of Chacon to store and present the best hyperparameter value set, as both Bryan and Chacon deal with model selection. It would provide the benefit of creating a ranked list that makes it easy for a user to understand which model is best.
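
For illustration only, the sequence of steps recited in claim 1 (a parameter grid, a dataset broken into fragments, parallel evaluation of each fragment under every hyperparameter value set, and selection of the best set) can be sketched as follows. This is a minimal sketch and not code from Bryan, Zhang, Chacon, or the application: the names make_fragments, fragment_error, and grid_search, the linear scoring rule, and the use of CPU threads in place of GPU cores are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import product


def make_fragments(rows, n_fragments):
    """Split rows into roughly equal, contiguous fragments (hypothetical helper)."""
    size = -(-len(rows) // n_fragments)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def fragment_error(params, fragment):
    """Assumed scoring rule: squared error of y = slope * x + bias on one fragment."""
    slope, bias = params
    return sum((y - (slope * x + bias)) ** 2 for x, y in fragment)


def grid_search(rows, grid, n_fragments=2, workers=4):
    """Evaluate every hyperparameter value set on every fragment; return the best set."""
    param_sets = list(product(*grid.values()))
    totals = {p: 0.0 for p in param_sets}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for fragment in make_fragments(rows, n_fragments):
            # One task per hyperparameter value set, all working on the same fragment.
            errors = pool.map(partial(fragment_error, fragment=fragment), param_sets)
            for p, err in zip(param_sets, errors):
                totals[p] += err
    best = min(totals, key=totals.get)  # lowest accumulated error wins
    return dict(zip(grid.keys(), best)), totals


rows = [(x, 2 * x + 1) for x in range(10)]   # toy dataset generated with slope 2, bias 1
grid = {"slope": [1, 2, 3], "bias": [0, 1]}  # 3 x 2 parameter grid
best, totals = grid_search(rows, grid)
print(best)  # the selected hyperparameter value set is presented
```

The sketch accumulates a per-set error across fragments so that no worker needs the whole dataset at once; any real scoring metric could be substituted for the assumed one.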

In regard to claim 2, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, wherein streaming the fragment further includes: transmitting a first fragment to the GPU memory; while the first fragment is being evaluated, transmitting a second fragment to the GPU memory; and after the first fragment has been evaluated, transmitting a third fragment to the GPU memory while the second fragment is being evaluated. (Zhang page 4 right column last paragraph of section B and page 5 paragraphs 1-2 teach continuous storage for the data in memory, wherein data is input into a buffer (memory) and processed, and this keeps happening. Thus, with continuous data, there would be a first, second, third, and so on fragment in memory until no data is available.)
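
For illustration only, the overlap recited in claim 2 (transmitting the next fragment while the current fragment is being evaluated) can be sketched as a simple prefetch loop. This is a minimal sketch under stated assumptions and is not the GStream API or the applicant's implementation: transmit and evaluate are hypothetical stand-ins for a host-to-device copy and a device-side evaluation, and a background thread stands in for the asynchronous transfer.

```python
from concurrent.futures import ThreadPoolExecutor


def transmit(fragment):
    """Hypothetical stand-in for copying a fragment into device memory."""
    return list(fragment)


def evaluate(buffer):
    """Hypothetical stand-in for evaluating one fragment on the device."""
    return sum(buffer)


def stream_and_evaluate(fragments):
    """Evaluate fragments in order while the next fragment is copied in the background."""
    results = []
    if not fragments:
        return results
    with ThreadPoolExecutor(max_workers=1) as copier:
        pending = copier.submit(transmit, fragments[0])  # first fragment in flight
        for nxt in fragments[1:]:
            current = pending.result()              # wait for the copy already in flight
            pending = copier.submit(transmit, nxt)  # start copying the next fragment
            results.append(evaluate(current))       # evaluate while that copy proceeds
        results.append(evaluate(pending.result()))  # last fragment has nothing to overlap
    return results


print(stream_and_evaluate([[1, 2], [3, 4], [5, 6]]))
```

At most two fragments are resident at any time in this sketch, one being evaluated and one being transmitted.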

In regard to claim 3, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, wherein breaking the dataset into the plurality of fragments further comprises: identifying a fragment size; and breaking the dataset into fragments with a size up to the fragment size. (Zhang page 5 teaches “It keeps popping data from its input port (lines 16 to 17). The returned size (int batch in line 16) always falls in the provided range of [getMinDegree(0) ... getMaxDegree(0)]. The GStream runtime system guarantees continuous storage for the data in memory.” Here the fragment size falls within the provided range.)
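
For illustration only, breaking a dataset into fragments "with a size up to the fragment size" can be sketched as below. This is a minimal sketch, not code from any cited reference; break_into_fragments is a hypothetical name, and a Python list stands in for the dataset.

```python
def break_into_fragments(dataset, fragment_size):
    """Split dataset into consecutive fragments of at most fragment_size items."""
    if fragment_size < 1:
        raise ValueError("fragment_size must be at least 1")
    return [dataset[i:i + fragment_size] for i in range(0, len(dataset), fragment_size)]


fragments = break_into_fragments(list(range(10)), 4)
print(fragments)  # the final fragment may be smaller than the fragment size
```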

In regard to claim 4, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, further comprising: generating the plurality of hyperparameter value sets based on one or more of user-specified hyperparameters, a uniform distribution of hyperparameters, a nonparametric distribution of hyperparameters, a prior distribution of hyperparameters, a distribution based on Bayesian rules and experimental results, and a distribution modeled by a Gaussian process. (Chacon page 10 section 3.1.1 to page 12 teaches “Hyperparameter Distributions” wherein hyperparameter values are based on uniform distributions and Gaussian distributions; page 27 last paragraph teaches a non-parametric distribution; and page 30 first paragraph teaches a prior distribution.)
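
For illustration only, two of the alternatives recited in claim 4 (user-specified hyperparameters and a uniform distribution of hyperparameters) can be sketched as below. This is a minimal sketch and is not taken from Chacon; generate_value_sets, its parameters, and the example ranges are hypothetical, and the other recited distributions are not shown.

```python
import random


def generate_value_sets(user_sets, uniform_ranges, n_samples, seed=0):
    """Combine user-specified value sets with sets drawn from uniform distributions."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    sampled = [
        {name: rng.uniform(low, high) for name, (low, high) in uniform_ranges.items()}
        for _ in range(n_samples)
    ]
    return list(user_sets) + sampled


value_sets = generate_value_sets(
    user_sets=[{"learning_rate": 0.1}],               # supplied directly by the user
    uniform_ranges={"learning_rate": (0.001, 0.5)},   # assumed sampling range
    n_samples=3,
)
print(len(value_sets))
```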

In regard to claim 6, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, wherein determining the best hyperparameter value set further comprises: testing the corresponding machine-learning program for each hyperparameter value set; and selecting the hyperparameter value set that is the best predictor. (Bryan page 3 first paragraph teaches determining the best model of 16 models, wherein each model has different parameters from the 4x4 param grid. Determining the best model therefore also determines the best hyperparameter value set.)


In regard to claim 7, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, wherein the GPU is in a computing device having a memory and a processor, wherein an arbiter executing on the processor coordinates the streaming of fragments and loading of models in the cores of the GPU. (Zhang abstract teaches the use of GPUs and page 1 right column last paragraph - page 2 second paragraph teaches streaming data to GPUs.)


In regard to claim 9, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, wherein loading the plurality of cores of the GPU further comprises: transferring a model program to the GPU memory; and invoking the model program with the corresponding hyperparameter value set at each of the cores of the GPU. (Bryan page 3 last paragraph – page 4 first paragraph teaches that 3 models are evaluated in parallel using 6 cores. This teaches loading cores with a model and respective hyperparameters.)

In regard to claim 10, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, further comprising: utilizing the machine program trained with the best hyperparameter parameter value set for making predictions associated with new input data. (Bryan page 3 first paragraph teaches determining the best model of 16 models, wherein each model has different parameters from the 4x4 param grid. Determining the best model therefore also determines the best hyperparameter value set. Chacon page 1 third paragraph also teaches applying a predictive model to new data for making predictions. This teaches using the best trained model to make predictions, as that was the purpose of finding the best model.)

In regard to claim 11, it is the system embodiment of claim 1 with similar limitations and is thus rejected using the reasoning found in claim 1.

In regard to claim 12, it is the system embodiment of claim 2 with similar limitations and is thus rejected using the reasoning found in claim 2.

In regard to claim 13, it is the system embodiment of claim 3 with similar limitations and is thus rejected using the reasoning found in claim 3.



In regard to claim 16, it is the non-transitory machine-readable storage medium embodiment of claim 1 with similar limitations and is thus rejected using the reasoning found in claim 1.

In regard to claim 17, it is the non-transitory machine-readable storage medium embodiment of claim 2 with similar limitations and is thus rejected using the reasoning found in claim 2.

In regard to claim 18, it is the non-transitory machine-readable storage medium embodiment of claim 3 with similar limitations and is thus rejected using the reasoning found in claim 3.

In regard to claim 19, it is the non-transitory machine-readable storage medium embodiment of claim 4 with similar limitations and is thus rejected using the reasoning found in claim 4.

Claims 5, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Bryan Cutler (“Model Parallelism with Spark ML Tuning” – hereinafter referred to as Bryan) in view of Zhang et al. (“GStream: A General-Purpose Data Streaming Framework on GPU Clusters” – hereinafter referred to as Zhang) in view of Chacon et al. (“Automatic Problem-Specific Hyperparameter Optimization and Model Selection for Supervised Machine Learning: Technical Report” – hereinafter referred to as Chacon) and further in view of Pranoy (“What are Hyperparameters? And How to Tune the Hyperparameters in a Deep Neural Network?”).


In regard to claim 5, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, but fails to disclose wherein each hyperparameter value set includes one or more of a number of hidden layers in the machine-learning program, a number of hidden nodes in each layer, a learning rate for one or more adaptation schemes, a regularization parameter, types of nonlinear activation functions, and use of dropout.
Pranoy discloses wherein each hyperparameter value set includes one or more of a number of hidden layers in the machine-learning program, a number of hidden nodes in each 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Bryan in view of Zhang in view of Chacon with those of Pranoy to include hyperparameters for the number of hidden layers, nodes, learning rate, regularization, and dropout, as Bryan and Chacon deal with model selection and hyperparameter optimization and Pranoy deals with hyperparameter tuning or optimization. It provides the benefit of creating a more efficient and robust system that tunes a wide variety of parameters.

In regard to claim 15, it is the system embodiment of claim 5 with similar limitations and is thus rejected using the reasoning found in claim 5.

In regard to claim 20, it is the non-transitory machine-readable storage medium embodiment of claim 5 with similar limitations and is thus rejected using the reasoning found in claim 5.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Bryan Cutler (“Model Parallelism with Spark ML Tuning” – hereinafter referred to as Bryan) in view of Zhang et al. (“GStream: A General-Purpose Data Streaming Framework on GPU Clusters” – hereinafter referred to as Zhang) in view of Chacon et al. (“Automatic Problem-Specific Hyperparameter Optimization and Model Selection for Supervised Machine Learning: Technical Report” – hereinafter referred to as Chacon) and further in view of Karatzoglou et al. (US 2015/0187024 A1 – hereinafter referred to as Alexandros).

In regard to claim 8, Bryan in view of Zhang in view of Chacon discloses the method as recited in claim 1, but fails to disclose wherein the dataset includes data corresponding to interactions of users performed in a context of a social network.
Alexandros discloses wherein the dataset includes data corresponding to interactions of users performed in a context of a social network. (Alexandros para. [0023] teaches a dataset of user-item interactions on an online social network.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Bryan in view of Zhang in view of Chacon with those of Alexandros to allow for user interaction data, as Bryan, Chacon, and Alexandros all perform model selection. Allowing user interactions as data provides the benefit of letting the system be used in a wide variety of applications to make predictions.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAULINHO E SMITH whose telephone number is (571)270-1358.  The examiner can normally be reached on Mon-Fri. 10AM-6PM CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar can be reached on (571) 272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/PAULINHO E SMITH/Primary Examiner, Art Unit 2125