DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments with respect to rejection of claims 1-20 under 35 USC 101 have been fully considered but are not persuasive.

Applicant asserts that the claims are directed to an improvement to the technological field of optimizing a neural network, or directed to an improvement to the functioning of a computer.  Applicant supports this assertion by pointing to specification sections stating that the invention can use an approximation of a specific algorithm to speed up neural network training while still reliably reaching a global minimum, and can improve the functioning of a computer by increasing learning speed, reducing power consumption, and reducing chip area (Remarks p. 8-9).
Examiner respectfully disagrees.  What is novel in the claimed invention is the claimed mathematical concept, namely the gradient descent mathematical algorithm, wherein the gradient descent algorithm transforms a cumulative change function of a gradient for an error function into an inverse square root function in the Gradient Descent algorithm and operates an inverse square root approximate value by using a Newton-Raphson method for the inverse square root function.  Furthermore, the specification describes the gradient descent algorithm at [00102], [00171-00205], [00193-00197], and [00206-00209], which disclose use of this mathematical algorithm to determine a step size of a learning rate of a neural network.  As such, the novelty is in the math, which is merely “applied” to a particular technological environment in a manner that merely generally links the math to the neural network, as in claim 1: “trains the neural network model based on an output of the Gradient Descent algorithm to generate an updated neural network”.  Similarly, any asserted improvement in the functioning of a computer flows as a direct result of the Gradient Descent mathematical algorithm.  Better math may make a computer run faster or calculate more efficiently, but it is not a technological improvement in the functioning of the computer itself.  Finally, “[i]t is important to keep in mind that an improvement in the abstract idea itself … is not an improvement in technology.”  MPEP 2106.05(a)(II).  See also MPEP 2106.05(I): the “inventive concept cannot be furnished by the unpatentable law of nature (or natural phenomenon or abstract idea) itself”.

Applicant further asserts the claims recite a particular machine that is integral to the claim because the claim recites a learning apparatus to optimize a neural network comprising an input interface configured to obtain input data or training data; a memory configured to store the input data, the training data and a neural network model for deep learning, and a learning processor configured to apply a gradient descent algorithm to the neural network model, wherein the learning processor transforms a cumulative change function of a gradient for an error function into an inverse square root function in the gradient descent algorithm, operates an inverse square root approximate value by using a Newton-Raphson method for the inverse square root function, and trains the neural network model based on an output of the gradient descent algorithm to generate an updated neural network.  In support, Applicant asserts the claim cannot be carried out by mere generic components or merely generically link mathematical concepts because claim 1 is directed to a specific type of apparatus including a processor configured with a specific algorithm making it a special purpose device (Remarks p. 9-10).
Examiner respectfully disagrees.  The claim recites mathematical concepts: applying a Gradient Descent algorithm by transforming a cumulative change function of a gradient for an error function into an inverse square root function in the Gradient Descent algorithm and operating an inverse square root approximate value by using a Newton-Raphson method for the inverse square root function.  The assertion that the algorithm is specific is not relevant to the Alice framework analysis; specifically claimed math is still math under the Alice framework step 2A prong 1 analysis.  The remaining additional elements (a learning apparatus to optimize a neural network comprising an input interface configured to obtain input data or training data; a memory configured to store the input data, the training data, and a neural network model for deep learning) are merely generally linked to the math, and comprise well-understood, routine, and conventional activities (see rejection below).  As such, the claim comprises a specific mathematical algorithm performed by conventional computing components and merely generally linked to a particular technological environment.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.  

Regarding claim 1, under the Alice framework Step 1, the claim falls within the four statutory categories of patentable subject matter identified by 35 USC 101: a process, machine, manufacture or composition of matter.
Under the Alice framework Step 2A prong 1, the claim recites Mathematical Concepts.  Claim 1 recites applying a Gradient Descent algorithm by transforming a cumulative change function of a gradient for an error function into an inverse square root function in the Gradient Descent algorithm and operating an inverse square root approximate value by using a Newton-Raphson method for the inverse square root function.
A Gradient Descent algorithm is a mathematical algorithm that obtains a gradient of a cost function through the use of partial differential equations, using model parameters and updating the model parameters in the direction of the gradient. See specification [00102], [00171-00205], and figures 2-4.  Equations to apply the partial differential include an inverse square root and application of a Newton-Raphson mathematical method.  See specification [00193-00197].  Specific implementation of the method as claimed further includes mathematical concepts: division by 2, multiplication, inverse square root of integer and floating point numbers. See specification [00206-00209].  For these reasons, the claim recites Mathematical Concepts.
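For illustration only, the mathematical character of the recited steps can be sketched in a few lines of Python.  This is a minimal sketch under stated assumptions and is not taken from the claims or the specification: the function names, the exponent-based starting guess, the iteration count, and the AdaGrad-style form of the "cumulative change function" are all hypothetical choices made for the example.  The error prevention constant is placed inside the inverse square root, consistent with the claim 2 language.

```python
import math

def inv_sqrt_newton(x, iterations=6):
    """Approximate 1/sqrt(x) for x > 0 using Newton-Raphson iterations.

    The starting guess 2**(-(e // 2)), with e the binary exponent of x,
    is an assumption of this sketch, not something from the source.
    """
    _, e = math.frexp(x)            # x = m * 2**e with 0.5 <= m < 1
    y = math.ldexp(1.0, -(e // 2))  # rough guess near 1/sqrt(x)
    for _ in range(iterations):
        # Newton-Raphson update for f(y) = 1/y**2 - x
        y = y * (1.5 - 0.5 * x * y * y)
    return y

def adagrad_like_step(param, grad, accum, lr=0.1, eps=1e-8):
    """One gradient descent update scaled by an inverse square root.

    accum is the running sum of squared gradients (assumed form of the
    cumulative change function); eps sits inside the inverse square root.
    """
    accum = accum + grad * grad
    step = lr * inv_sqrt_newton(accum + eps)
    return param - step * grad, accum

# Usage: minimize (w - 3)**2, whose gradient is 2 * (w - 3).
w, acc = 0.0, 0.0
for _ in range(50):
    w, acc = adagrad_like_step(w, 2.0 * (w - 3.0), acc, lr=1.0)
```

Every operation in the sketch is an addition, subtraction, multiplication, or exponent manipulation on numbers.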
Under the Alice framework Step 2A prong 2 analysis, the claim recites the following additional elements: a learning apparatus to optimize a neural network, the learning apparatus comprising: an input interface configured to obtain input data or training data; a memory configured to store the input data, the training data, and a neural network model for deep learning; a learning processor configured to apply a Gradient Descent algorithm to the neural network model, and train the neural network model based on an output of the Gradient Descent algorithm to generate an updated neural network. The claim does no more than generally link the additional elements to the mathematical concepts, the application of a Gradient Descent algorithm in a manner that in effect merely recites “apply it” on a computer comprising a processor, an input interface and a memory.  Furthermore the claim merely applies the mathematical concept to a particular technological environment, a learning apparatus, and training a neural network model.  At most, these elements merely recite an insignificant extra-solution activity. For these reasons the claims are not integrated into a practical application.
Moreover, under the Alice Framework Step 2B analysis, the claims, considered individually and as an ordered combination, do not include additional elements that are sufficient to amount to significantly more than the abstract idea.  As discussed in the Step 2A prong 2 analysis, the claim either merely generally links an additional element to the math, by applying the math in a computer or in a particular technological environment, or recites insignificant extra-solution activity.  The innovative concept is in the mathematical concepts, i.e., the specific gradient descent algorithm applied.  Furthermore, these elements are well-understood, routine, and conventional activities commonly performed in computing systems.  See, e.g., D.A. Patterson et al., Computer Organization and Design: The Hardware/Software Interface, Elsevier, Ch 1, and Ch 3, 2007 (hereinafter “Patterson”).  See chapter 1, specifically figure 1.5, which depicts an input interface configured to obtain input data, a memory storing the input data, and a processor operating on the data.  See also chapter 3, which discloses various approaches for configuring a processor to perform arithmetic functions.  See also K. Gurney, An introduction to neural networks, Taylor & Francis Group, 1997, Ch 5, which discloses use of a gradient descent algorithm for training a neural network.  For these reasons claim 1, when considering the mathematical concepts as a whole in combination with these generally applied and/or well-understood, routine, and conventional activities, does not amount to significantly more than the abstract idea.

Claims 23 and 24 are rejected for at least the reasons cited with respect to claim 22.  Under the Step 2A prong 1 analysis, in addition to the mathematical concepts recited in claim 22, both claims 23 and 24 recite further mathematical steps to execute third (claim 23, 24) and fourth Jacobi variants on a smaller problem size, i.e., a subproblem of the first subproblem, and pass results back (claim 24).  Under the step 2A prong 2 analysis, the additional elements third kernel method, call by the second kernel method (claim 23, 24), and fourth kernel method are insignificant extra-solution activities consistent with the claim 22 analysis.  For this reason the claims are not integrated into a practical application.  Under the step 2B analysis these additional elements, when considered as a whole in combination with all limitations, comprise well-understood, routine, and conventional activities.  The innovative concept remains within the mathematical calculations, the first, second, third (claim 23, 24) and fourth Jacobi variant mathematical algorithms applied (claim 24), consistent with the claim 22 analysis.  For these reasons, the claims do not amount to significantly more than the abstract idea.

Regarding claim 2, claim 2 is rejected for at least the reasons cited with respect to claim 1.  Under the step 2A prong 1 analysis, claim 2 merely recites further mathematical concepts by transforming an error prevention constant value ϵ into the inverse square root function by shifting the error prevention constant value ϵ into an inverse square root, in the cumulative change function of the gradient. As to the shifting, shifting is mathematically equal to dividing by two. See specification [00216].  Claim 2 contains no further additional elements that would require further analysis under step 2A prong 2 or step 2B.
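As a simple arithmetic illustration of the shifting statement, a one-bit right shift of an integer gives the same result as integer division by two.  The sketch below is an assumption-labeled example limited to integer operands; the function name is hypothetical and it is not drawn from specification [00216].

```python
def halve_by_shift(n: int) -> int:
    """Return n shifted right by one bit (integer operands only)."""
    return n >> 1

# A one-bit right shift matches floor division by two for integers.
examples = [(n, halve_by_shift(n), n // 2) for n in (0, 1, 6, 7, 1024)]
```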

Regarding claim 3, claim 3 is rejected for at least the reasons cited with respect to claim 1.  Furthermore, claim 3 recites the following additional elements: the learning processor includes an inverse square root operator including a shifter, an integer subtractor, a floating-point subtractor, and a floating-point multiplier.  Under the step 2A prong 2 and step 2B analysis, these additional elements are merely generally linked to the processor in a manner that merely recites performing the math in some math hardware.

Regarding claims 4-6, claims 4-6 are rejected for at least the reasons cited with respect to claim 3.  Under the step 2A prong 1 analysis, claims 4-6 merely recite further mathematical calculations and relationships performed by the inverse square root operator.  Claims 4-6 contain no further additional elements that would require further analysis under step 2A prong 2 or step 2B.

Regarding claims 7-8, claims 7-8 are rejected for at least the reasons cited with respect to claim 3.  Under the step 2A prong 1 analysis, claim 7 recites a series of mathematical calculations that include transforming number formats from floating point to integer, shifting integer bits, and performing a series of subtracting, squaring, and multiplying operations.  Without further limitation, the element “operation” is interpreted to include mathematical calculations.  Furthermore, the element “receive an integer form of a single precision floating-point x” is not being considered as an additional element based on the further limitation “by transforming the single precision floating-point x into integer form”, which is a mathematical calculation.  Claim 8 merely further recites mathematical calculation by repeating the claim 1 Newton-Raphson method.  Claims 7-8 contain no further additional elements that would require further analysis under step 2A prong 2 or step 2B.
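The sequence of calculations described above (integer form of a single precision value, integer shift, integer subtraction, then floating-point subtraction and multiplication, optionally repeated) resembles the widely known "fast inverse square root" technique.  The following is a minimal sketch of that general technique only, not the claimed implementation: the constant 0x5F3759DF, the function names, and the use of Python's `struct` module to obtain the integer form are assumptions that do not appear in the source.

```python
import struct

MAGIC = 0x5F3759DF  # assumed constant from the well-known technique, not from the source

def float_to_int_bits(x: float) -> int:
    """Reinterpret a single precision float as a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def int_bits_to_float(i: int) -> float:
    """Reinterpret a 32-bit unsigned integer as a single precision float."""
    return struct.unpack("<f", struct.pack("<I", i))[0]

def fast_inv_sqrt(x: float, iterations: int = 1) -> float:
    """Approximate 1/sqrt(x) for positive, normal single precision x."""
    i = float_to_int_bits(x)       # integer form of x
    i = MAGIC - (i >> 1)           # integer shift, then integer subtraction
    y = int_bits_to_float(i)       # initial floating-point estimate
    for _ in range(iterations):    # each pass is one Newton-Raphson step
        y = y * (1.5 - 0.5 * x * y * y)
    return y

approx_once = fast_inv_sqrt(2.0)       # one Newton-Raphson step
approx_twice = fast_inv_sqrt(2.0, 2)   # the step repeated once more
```

Repeating the final step, as in the second call, tightens the approximation, which corresponds to the repetition described for claim 8.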

Regarding claim 9, under the Alice framework Step 1, the claim falls within the four statutory categories of patentable subject matter identified by 35 USC 101: a process, machine, manufacture or composition of matter.
Under the Alice framework Step 2A prong 1, the claim recites Mathematical Concepts.  Claim 9 recites a series of mathematical calculations that include transforming number formats from floating point to integer, shifting integer bits, and performing a series of subtracting, squaring, and multiplying operations.  Without further limitation, the element “operation” is interpreted to include mathematical calculations.  Furthermore, the element “receive an integer form of a single precision floating-point x” is not being considered as an additional element based on the further limitation “by transforming the single precision floating-point x into integer form”, which is a mathematical calculation.  For these reasons, the claim recites Mathematical Concepts.
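For reference, the subtracting, squaring, and multiplying operations noted above correspond to the standard Newton-Raphson iteration for an inverse square root.  The derivation below is a textbook worked example; the symbols f and y_n are notation assumed for this illustration and are not quoted from the claims or the specification.

```latex
% Solve f(y) = 0 for y = 1/\sqrt{x}, with x > 0 fixed.
f(y) = \frac{1}{y^{2}} - x, \qquad f'(y) = -\frac{2}{y^{3}}

% Newton-Raphson update:
y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}
        = y_n + \frac{y_n^{3}}{2}\left(\frac{1}{y_n^{2}} - x\right)
        = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^{2}\right)
```

Each update therefore uses one squaring, one subtraction, and multiplications only, with no division or square root evaluated directly.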
Under the Step 2A prong 2 analysis, claim 9 contains the following additional elements: a method of controlling a learning apparatus, the math steps being performed by a processor, and training, by the processor, the neural network based on y to generate an updated neural network.  The claim does no more than generally link the additional elements to the mathematical concepts, the application of a Gradient Descent algorithm in a manner that in effect merely recites “apply it” by a processor.  Furthermore the claim merely applies the mathematical concept to a particular technological environment to a method of controlling a learning apparatus and training the neural network.  At most, these elements merely recite an insignificant extra-solution activity. For these reasons the claims are not integrated into a practical application.
Moreover, under the Alice Framework Step 2B analysis, the claims, considered individually and as an ordered combination, do not include additional elements that are sufficient to amount to significantly more than the abstract idea.  As discussed in the Step 2A prong 2 analysis, the claim either merely generally links an additional element to the math, by applying the math in a processor in a particular technological environment, or recites insignificant extra-solution activity.  The innovative concept is in the mathematical concepts, i.e., the specific gradient descent algorithm applied.  Furthermore, these elements are well-understood, routine, and conventional activities commonly performed in computing systems.  See, e.g., Patterson Ch 1, and Ch 3.  See chapter 1, specifically figure 1.5, which depicts an input interface configured to obtain input data, a memory storing the input data, and a processor operating on the data.  See also chapter 3, which discloses various approaches for configuring a processor to perform arithmetic functions.  See Gurney, Ch 5, which discloses use of a gradient descent algorithm for training a neural network.  For these reasons claim 9, when considering the mathematical concepts as a whole in combination with these generally applied and/or well-understood, routine, and conventional activities, does not amount to significantly more than the abstract idea.


Claims 10-11 are rejected for at least the reasons cited with respect to claim 9.  Claims 10-11 merely further mathematically limit the limitations of claim 9.  Claims 10-11 contain no further additional elements that would require further analysis under step 2A prong 2 or step 2B.

Claim 12 is directed to a computer readable recording medium in which a computer program for implementing the method of claim 9 has been recorded.  All steps recited in the method of claim 9 are performed by the computer readable recording medium of claim 12.  The claim 9 analysis applies equally to claim 12.

Claims 13-20 are directed to a method that would be practiced by the apparatus of claims 1-8.  All steps recited in the method of claims 13-20 are performed by the apparatus of claims 1-8, with the exception that the steps performed by the learning apparatus of claims 1-8 are performed by a learning processor in the method of claims 13-20.  The claim 1-8 analysis applies equally to claims 13-20, with the analysis with respect to the learning apparatus applying equally to the learning processor.

Allowable Subject Matter
For the reasons stated in office action dated 06/13/22, claims 1-20 would be allowable if rewritten to overcome the rejections under 35 USC 101.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILY E LAROCQUE whose telephone number is (469)295-9289.  The examiner can normally be reached on 10:00am - 12:00pm, 2:00pm - 8:00pm ET M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Jyoti Mehta can be reached on 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/EMILY E LAROCQUE/Primary Examiner, Art Unit 2182