Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant’s submission filed on January 21st, 2021 has been entered.
Amendments
This action is in response to amendments filed January 21st, 2021, in which Claims 1, 4, 6, 10, 11, and 16 have been amended.  Claim 5 has been cancelled.  The amendments have been entered, and Claims 1-4 and 6-20 are currently pending.
Claim Rejections - 35 USC § 112
Claims 1-4 and 6-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 10, and 16 recite the limitation “hardware sensors in the computer.”  There is insufficient antecedent basis for “the computer” in the claim, because no specific computer has previously been positively recited.  For the purpose of examination, “the computer” will be interpreted to mean a computer of the machine learning system.
	The dependent claims are rejected for inheriting the indefiniteness of the claims upon which they depend.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4 and 6-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hadjis et al., “Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs,” in view of Teodoro et al., “AARTS: Low Overhead Online Adaptive Auto-tuning” and further in view of Barker, US PG Pub 2015/0331963.
Regarding Claim 1, Hadjis teaches a computer-implemented method for tuning a machine learning model using approximate computing (Hadjis, pg. 1, Abstract, “Given a specification of a convolutional neural network, our goal is to minimize the time to train this model … tuning momentum is critical in asynchronous parallel configurations,” further, the system is for “stochastic gradient descent”, e.g. approximate computing, see pg. 3, 1st column, 4th paragraph) the computer-implemented method comprising:  a) communicatively coupling a tuning [module] to a machine learning system, whereby the tuning [module] receives data from a machine learning program being executed by the machine learning system (Hadjis, pg. 2, 2nd column, 2nd paragraph, “we grid-search the parameters for learning rate and momentum by measuring the statistical and hardware efficiency for minutes (less than 10% of the time to train a network)”) and the tuning [module] sends at least one tuning parameter for tuning the machine learning system to … the machine learning program … and hardware upon which the machine learning program is being executed (Hadjis, pg. 8, Algorithm 1, the optimizer/tuning module selects g (number of compute groups, configuring the hardware), μ (momentum), and η (learning rate, parameters for the machine learning program) and trains with those parameters (Algorithm 1, line 8)); b) receiving, by the tuning server, at least one performance objective of the machine learning system, the performance objective includes one or more of … speed (Hadjis, Abstract, “our goal is to minimize the time to train this model”) and gradient update momentum (Hadjis, pg. 8, Algorithm 1, “Input:  … momentum M” with pg. 7, 1st column, 2nd-to-last paragraph, “optimal total momentum”; the system adjusts to match actual momentum, from explicit momentum and asynchrony, to the desired performance objective of “optimal total momentum”); c) using, by the tuning [module], an n-dimensional approximate computing configuration space comprising the at least one tuning parameter (Hadjis, pg. 8, Algorithm 1, the combination of (g, μ, η) is a 3-dimensional configuration space); d) collecting, by the tuning [module], the performance data of the machine learning system performance with at least one of … ii) hardware sensors in the computer (Hadjis, pg. 13, 1st column, 6th paragraph, “Appendix D-D … describes how to measure necessary quantities from the system” & pg. 25, 2nd column, 5th paragraph discusses measuring compute times, e.g. speed; pg. 30, 1st column, last paragraph, “we built-in a wall-clock timer to ensure accurate timing” denotes hardware sensors to measure time/speed); e) comparing the performance data to the system performance objective; f) in response to the comparing being outside a threshold, dynamically updating the n-dimensional approximate computing configuration space by adjusting the at least one tuning parameter using an objective function to identify a smallest value or a largest value subject to the performance objective (Hadjis, pg. 7, 2nd column, 2nd-to-last paragraph, “The optimal settings of these parameters also might change during training, so our optimizer runs periodically in epochs (e.g. every hour)”; that is, every hour, the measured performance is re-compared to the optimally computed performance on the objective by the optimizer via Algorithm 1; if a better performance can be obtained, then the new parameter configuration is used, see pg. 7, last paragraph through pg. 8, first paragraph, “by selecting the configuration with the lowest final loss … set the highest amount of asynchrony such that this explicit momentum is non-zero”); g) training the machine learning system in accordance with apply the at least one tuning parameter therewith (Hadjis, pg. 8, Algorithm 1, Line 8).
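For illustration only, the comparison and update recited in limitations e) and f) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Hadjis or from the application: the function names `retune_if_needed` and `toy_loss`, the threshold test, and the candidate values are all hypothetical. It shows a measured value being compared to an objective and, only when the difference exceeds a threshold, a new (g, μ, η) configuration being selected as the one with the smallest objective-function value.

```python
from typing import Callable, Iterable, Tuple

# (g, mu, eta): compute groups, explicit momentum, learning rate.
Config = Tuple[int, float, float]


def retune_if_needed(
    current: Config,
    measured: float,
    objective: float,
    threshold: float,
    candidates: Iterable[Config],
    evaluate: Callable[[Config], float],
) -> Config:
    """Return the configuration to train with for the next period."""
    # Limitation e): compare collected performance data to the objective.
    if abs(measured - objective) <= threshold:
        return current  # within tolerance, keep the current parameters
    # Limitation f): outside the threshold, so pick the candidate with
    # the smallest objective-function value.
    return min(candidates, key=evaluate)


def toy_loss(cfg: Config) -> float:
    """Hypothetical stand-in for a short probe run that reports a loss."""
    g, mu, eta = cfg
    return (mu - 0.6) ** 2 + (eta - 0.01) ** 2 + 0.001 * g


# Hypothetical 3-dimensional configuration space.
space = [
    (g, mu, eta)
    for g in (1, 2, 4)
    for mu in (0.3, 0.6, 0.9)
    for eta in (0.1, 0.01)
]
```

Calling `retune_if_needed` with a measured value inside the threshold returns the current configuration unchanged; a value outside the threshold triggers the search over `space`.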
Hadjis teaches that the optimizer/tuning module operates on the same hardware as the machine learning system (Hadjis, pg. 9, 2nd column, 1st paragraph, “includes the 10% overhead of Omnivore’s optimizer during the run”) and thus does not teach a separate tuning server.  Teodoro, however, teaches using a tuning server to optimize a multi-core parallel processing system (Teodoro, pg. 3, Fig. 1 “Optimization server” and pg. 3, 2nd column, 1st paragraph, “tuning server”).  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to run the optimization on its own tuning server, as does Teodoro, in the invention of Hadjis.  The motivation to do so is to avoid the overhead caused by running the optimizer on the same hardware as the machine learning system, as Hadjis does.
The Hadjis/Teodoro combination does not teach, but Barker does teach, displaying on a graphical interface a performance report of the machine learning system … along with current value of the tuning parameters (Barker, [0041-0043] & Fig. 5, “the effects of different tuning parameters can be observed in substantially real time” denotes a performance report; “In the display of Fig. 5, the initial or default “solution path” value of the tuning parameter … is illustrated by the placement of the tuning parameter handle 524 in the window 504” denotes displaying a current value of the tuning parameters).  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to add a graphical interface such as that of Barker to the machine learning system of Hadjis/Teodoro.  The motivation to do so is that “as tuning parameter handle is moved left and right, the corresponding parameter estimates … will change.  As a result, the interactive process possible with the illustrated embodiments is far easier and faster than conventional techniques” for seeing the effects of the tuning parameters on the performance (Barker, [0041]).
The graphical interface of Barker, as integrated into Hadjis/Teodoro, does not display a communication bandwidth utilized.  However, Hadjis teaches that communication bandwidth is an important value to know when tuning their machine learning system (Hadjis, pg. 15, 2nd column, 3rd paragraph, “direct computation of the 3D convolution is usually memory bandwidth-bound” & pg. 16, 1st column, 4th paragraph, discusses a situation which is “more likely memory-bandwidth-bound than higher batch sizes (and this phenomenon is likely more severe when the GEMM kernel is executed with multiple threads) … 16 threads was slightly slower than 8 is that we hit the memory bandwidth bottleneck”).  Since Hadjis teaches that memory bandwidth is an important factor determining processing performance, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to display the bandwidth utilized on the GUI.  The motivation to do so is so that a user can see many important factors in real-time.
Regarding Claim 2, the Hadjis/Teodoro/Barker combination of Claim 1 teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated). Hadjis does not teach, but Teodoro teaches wherein the collecting and monitoring are performed in a background process (Teodoro, pg. 1, 2nd column, last paragraph, “the tuning should run in the background during application execution, not stopping the application when a new set of tunable parameters are required” where “tuning” includes collecting and monitoring, see Teodoro, pg. 3, 2nd column, 1st paragraph, “Tuning code sections … to collect performance measurements”).   It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to run the collecting and monitoring in the background.  The motivation to do so is so as not to have to stop the application when a new set of tunable parameters are required (Teodoro, pg. 1, 2nd column, last paragraph).
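For illustration only, collecting performance data in the background without stopping the application can be sketched with a worker thread. This is a minimal sketch, not code from Teodoro or Hadjis: the names `start_background_collector` and `wall_clock_metric`, and the sampling interval, are hypothetical, and a wall-clock reading is assumed as the collected metric.

```python
import threading
import time
from typing import Callable, List


def wall_clock_metric() -> float:
    """Hypothetical metric source: a monotonic wall-clock reading in seconds."""
    return time.perf_counter()


def start_background_collector(
    read_metric: Callable[[], float],
    samples: List[float],
    stop: threading.Event,
    interval_s: float = 0.01,
) -> threading.Thread:
    """Sample read_metric() periodically on a daemon thread until stop is set."""

    def loop() -> None:
        while not stop.is_set():
            samples.append(read_metric())
            # Wait for the interval, but wake immediately if stop is set.
            stop.wait(interval_s)

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

The application continues on the main thread while the collector runs; setting the `threading.Event` and joining the returned thread ends collection.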
Regarding Claim 3, the Hadjis/Teodoro/Barker combination teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated).  Hadjis has already been shown to teach wherein the at least one tuning parameter is selected from a group consisting of … update step size (Hadjis, the tuning parameters were identified as (g, μ, η) where η, learning rate, is an update step size parameter, see pg. 3, 2nd column, Eq. (4)).
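For illustration only, the roles of η as an update step size and μ as an explicit momentum can be seen in the textbook momentum form of stochastic gradient descent. This is a minimal sketch of that standard update; it is not asserted to reproduce Eq. (4) of Hadjis term for term, and the function name `momentum_step` is hypothetical.

```python
from typing import Tuple


def momentum_step(
    w: float, v: float, grad: float, eta: float, mu: float
) -> Tuple[float, float]:
    """One textbook momentum-SGD update on a scalar weight.

    v_new = mu * v - eta * grad   (mu: momentum, eta: step size)
    w_new = w + v_new
    """
    v_new = mu * v - eta * grad
    return w + v_new, v_new
```

With zero initial velocity, the first update moves the weight by exactly η times the gradient, which is why η is commonly described as the step size.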
Regarding Claim 4, the Hadjis/Teodoro/Barker combination teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated).  Hadjis has already been shown to teach wherein adjusting the at least one tuning parameter is selected from a group consisting of … changing a number of nodes for parallelization … [and] changing a momentum parameter (Hadjis, the adjustable tuning parameters were identified as (g, μ, η) where g is a number of compute groups/a number of nodes for parallelization and μ is an explicit momentum parameter).
Regarding Claim 6, the Hadjis/Teodoro/Barker combination of Claim 1 teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated).  The combination has already been shown to teach, via the included GUI of Barker, wherein the graphical user interface includes adjustable graphical elements representing real-time values of the tuning parameters (Barker, [0041], “a user can select the tuning parameter … in the user interface display and can drag the tuning parameter line in the user interface display, changing the tuning value correspondingly”).
Regarding Claim 7, the Hadjis/Teodoro/Barker combination of Claim 6 teaches the computer-implemented method of Claim 6 (and thus the rejection of Claim 6 is incorporated).  The combination has been shown to teach wherein a dynamic update of the n-dimensional approximate computing configuration space is overridden by engagement of the adjustable graphical elements (the tuning parameters are changed by engagement of the adjustable graphical elements of Barker, and can be changed multiple times, see Barker, [0031], “varying the tuning parameter over a range of values creates a sequence of candidate models”; the first update is a dynamic update and it is overridden by changing it again). 
Regarding Claim 8, the Hadjis/Teodoro/Barker combination of Claim 1 teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated).  Hadjis further teaches changing the machine learning system performance objective in response to system changes (Hadjis, pg. 29, 2nd column, Table III, the “Optimal Momentum”/part of the system performance objective identified in Claim 1, changes for different datasets, e.g. in response to system changes).
Regarding Claim 9, the Hadjis/Teodoro/Barker combination of Claim 1 teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated).  Hadjis further teaches wherein updating the n-dimensional approximate computing configuration space further comprises determining what tuning parameters to adjust using at least one of … iterative methods (Hadjis, pg. 2, 2nd column, 2nd paragraph, “we grid-search the parameters” where “grid-search” is an iterative method).
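For illustration only, the iterative character of a grid search can be sketched as a loop that visits every point of a parameter grid and keeps the best-scoring one. This is a minimal sketch, not the search procedure of Hadjis: the function name `grid_search`, the parameter names, and the scoring function are hypothetical.

```python
import itertools
from typing import Callable, Dict, Optional, Sequence, Tuple


def grid_search(
    grid: Dict[str, Sequence[float]],
    score: Callable[[Dict[str, float]], float],
) -> Tuple[Optional[Dict[str, float]], float]:
    """Iterate over every grid point and return the lowest-scoring one."""
    names = list(grid)
    best_cfg: Optional[Dict[str, float]] = None
    best_score = float("inf")
    # One iteration per combination of parameter values.
    for values in itertools.product(*(grid[n] for n in names)):
        cfg = dict(zip(names, values))
        s = score(cfg)
        if s < best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score
```

Each pass of the loop evaluates one candidate setting, so the number of iterations equals the product of the grid sizes.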
Claims 10, 11, and 13 recite a computer system comprising a processor device and a memory coupled to the processor device storing computer-executable instructions causing a computer to perform the methods of Claims 1, 6, and 9, respectively.  As Hadjis performs their methods on a computer (Hadjis, title, “Deep learning on CPUs”), the processor and memory are inherent.  Claims 10, 11, and 13 are thus rejected for reasons set forth in the rejections of Claims 1, 6, and 9, respectively.
Regarding Claim 12, the Hadjis/Teodoro/Barker combination of Claim 10 teaches the computer system of Claim 10 (and thus the rejection of Claim 10 is incorporated).  Hadjis further teaches wherein the machine learning model is a neural network (Hadjis, pg. 1, Abstract, “Given a specification of a convolutional neural network, our goal is to minimize the time to train this model”).
Regarding Claim 14, the Hadjis/Teodoro/Barker combination of Claim 13 teaches the computer system of Claim 13 (and thus the rejection of Claim 13 is incorporated).  Hadjis further teaches sending an instruction to modify a training algorithm to incorporate an adjusted training parameter (Hadjis, the adjustable tuning parameters were identified as (g, μ, η) where at least momentum and step size are training parameters required by the training algorithm, see Hadjis, pg. 8, Algorithm 1, line 8).
Regarding Claim 15, the Hadjis/Teodoro/Barker combination of Claim 14 teaches the computer system of Claim 14 (and thus the rejection of Claim 14 is incorporated).  Hadjis further teaches an instruction to incorporate multiple adjusted training parameters at one time (Hadjis, the adjustable tuning parameters were identified as (g, μ, η) where at least momentum and step size are training parameters required by the training algorithm, see Hadjis, pg. 8, Algorithm 1, line 8, where both momentum and step size are simultaneously optimized).
Claims 16, 17, 18, 19, and 20 recite the computer program product comprising a non-transitory computer readable storage medium readable by a processing device and storing program instructions for execution, said program instructions comprising the instructions stored in the memory of Claims 10, 11, 13, 14, and 12, respectively.  As Hadjis performs their methods on a computer (Hadjis, title, “Deep learning on CPUs”), the non-transitory computer readable storage medium is inherent.  Thus, Claims 16, 17, 18, 19, and 20 are rejected for reasons set forth in the rejections of Claims 10, 11, 13, 14, and 12, respectively.

Response to Arguments
Applicant’s arguments filed January 21st, 2021 have been fully considered, but are not fully persuasive.
Applicant’s amendments have overcome the 35 U.S.C. 112(a) and (b) rejections of the previous office action.  However, the amendments have required a new 35 U.S.C. 112(b) antecedent basis rejection for the independent claims.
Applicant’s amendments, specifically requiring training the machine learning model with the adjusted parameters, represent a practical application of any abstract idea steps.  The 35 U.S.C. 101 rejections of the previous office action have been withdrawn.
Applicant’s arguments with respect to the prior art rejections of the previous office action have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.  Specifically, new reference Hadjis teaches the recited performance objectives and tuning parameters.  New reference Teodoro teaches a separate tuning server.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN M SMITH whose telephone number is (469) 295-9104. The examiner can normally be reached Monday - Friday, 8:30am - 5pm Central.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/BRIAN M SMITH/Primary Examiner, Art Unit 2122