DETAILED ACTION
Status of Claims
This action is in response to the application filed on 10/29/2018. Claims 1 – 18 are pending and have been examined.

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in China on 4/29/2016. It is noted, however, that applicant has not filed a certified copy of the CN201610285062 application as required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 10/29/2018, 10/31/2019 and 4/15/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections
Claims 4 and 13 are objected to because of the following informalities: “… gradients o generate …”. This appears to be a typographical error and will be interpreted as “… gradients to generate …”. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“controller unit configured to” in claim 1. (spec. [0037] - [0043] reiterate the functions but do not provide a description of the structure.)
“master computation module configured to” in claims 1 – 4. (spec. [0059]: the master computation module may include several units. However, it is not clear whether these units are electronic circuits, pieces of software, or just algorithms.)
“slave computation modules configured to” in claims 1 and 5. (spec. [0052]: the slave computation module comprises several units. However, it is not clear whether these units are electronic circuits, pieces of software, or just algorithms.)
“interconnection unit configured to” in claims 6 – 7. (spec. [0049]: the interconnection unit may be structured as a binary tree that includes multiple levels, and each level may include one or more nodes. However, it is not clear whether the tree and nodes are electronic circuits or logical connections in software.)
“slave neuron caching unit configured to” in claim 8. (para. [0052] slave neuron caching unit … may refer to an on-chip caching unit integrated in the neural network acceleration processor)
“weight value caching unit configured to” in claim 9. (para. [0052] weight value caching unit … may refer to an on-chip caching unit integrated in the neural network acceleration processor)
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1 – 9 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claims 1 – 7 contain the following generic placeholders, which invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
controller unit
master computation module
slave computation modules
interconnection unit
However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. In particular, the corresponding description found in the Specification (see Section 6 of the Office Action) for each of the generic placeholders listed above does not provide a description of the structure that performs the corresponding functions. Therefore, each of claims 1 – 7 and their dependent claims 8 – 9 are rejected under 35 U.S.C. 112(a) for lack of written description. See MPEP 2181 (IV) ("the means- (or step-) plus- function claim must still be analyzed to determine whether there exists corresponding adequate support for such claim limitation under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph. In considering whether there is 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph support for the claim limitation, the examiner must consider whether the specification describes the claimed invention in sufficient detail to establish that the inventor or joint inventor(s) had possession of the claimed invention as of the application's filing date").

Claims 1 – 9 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1 – 7 contain the following generic placeholders, which invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
controller unit
master computation module
slave computation modules
interconnection unit
However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. In particular, the corresponding description found in the Specification (see Section 6 of the Office Action) for each of the generic placeholders listed above does not provide a description of the structure that performs the corresponding functions. Therefore, each of claims 1 – 7 and their dependent claims 8 – 9 are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph. For examination purposes, each function for which 35 U.S.C. 112(f) is invoked without sufficient description of structure in the specification is interpreted as being performed by a generic computation component.
Applicant may:
(a) Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph;
(b) Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(c) Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either:
 (a) Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(b) Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function.
For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim 8 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.  
Claim 8 recites: “slave computation module includes a slave neuron caching unit configured to …”. It is not clear whether the limitation “configured to” is connected to the slave computation module or to the slave neuron caching unit. One of ordinary skill in the art would not be able to determine which element the term “configured to” refers to and thus would not be reasonably apprised of the scope of the invention. For examination purposes, this limitation is interpreted as “slave computation module includes a slave neuron caching unit. The slave neuron caching unit is configured to …”.

Claim 9 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.  
Claim 9 recites: “slave computation module includes a weight value caching unit configured to”. It is not clear whether the limitation “configured to” is connected to the slave computation module or to the weight value caching unit. One of ordinary skill in the art would not be able to determine which element the term “configured to” refers to and thus would not be reasonably apprised of the scope of the invention. For examination purposes, this limitation is interpreted as “slave computation module includes a weight value caching unit. The weight value caching unit is configured to …”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 1 – 18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.

As to Claim 1, in the Subject Matter Eligibility Test Step 1, the claimed apparatus in Claim 1 as a whole falls within at least one statutory category.
In the Subject Matter Eligibility Test Step 2A Prong One, the claimed apparatus in Claim 1 recites an abstract idea in the following limitation:
multiply one of the one or more first data gradients with the input data to generate a default weight gradient vector.
The step of multiplying recites a mathematical calculation and thus falls under the mathematical concepts grouping of abstract ideas. The mere nominal recitation of first data gradients, input data, and a default weight gradient vector does not take the claim limitation out of the mathematical concepts grouping; thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
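For illustration only, the arithmetic character of the quoted limitation can be shown in a few lines. The following is a minimal sketch, assuming that one first data gradient is a scalar and that the input data is a vector; the function name and the example values are hypothetical and are not taken from the claims or the specification.

```python
def default_weight_gradient(data_gradient, input_data):
    """Multiply one data gradient (assumed scalar) with each element of the
    input data (assumed vector) to form a weight gradient vector."""
    return [data_gradient * x for x in input_data]


# Hypothetical values chosen only to show the arithmetic.
example = default_weight_gradient(0.5, [2.0, 4.0])
```

Under these assumptions the limitation reduces to an element-wise product, which is the kind of calculation a person could carry out with pen and paper.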
In the Subject Matter Eligibility Test Step 2A Prong Two, Claim 1 recites the following additional elements along with the abstract idea:
a controller unit configured to receive an instruction;
and one or more computation modules that include a master computation module and one or more slave computation modules, 
receive input data and one or more first data gradients in response to the instruction and transmit the input data and the one or more first data gradients to one or more slave computation modules
The recited additional element of a controller unit configured to receive an instruction is highly generic, amounting to no more than an idea of a solution and mere instructions to apply an exception. The element of computation modules that include a master and one or more slaves refers to the master/slave architecture, is highly generic, and merely links the use of the judicial exception to a particular technological environment or field of use. In addition, the master module receiving data and transmitting data to the slave modules adds insignificant extra-solution activity to the judicial exception. Thus, the additional elements in Claim 1 do not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception, which requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional element of a controller unit configured to receive an instruction is highly generic, amounting to no more than an idea of a solution and mere instructions to apply an exception. The element of computation modules that include a master and one or more slaves refers to the master/slave architecture, is highly generic, and merely links the use of the judicial exception to a particular technological environment or field of use. The data transfer between the master and slave modules does not add a meaningful limitation beyond appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See at least Hamalainen, TUTNC: a general purpose parallel computer for neural network computations, Microprocessors and Microsystems Vol 19, 1995, fig. 1, fig. 2, & page 451, col 2, para. 3, where during the execution, the host [master computation module] serves data transfers). Thus, the additional elements in Claim 1 do not contribute an inventive concept and Claim 1 is not eligible subject matter under 35 U.S.C. 101.

As to Claim 2, which depends on Claim 1: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 2 recites an additional abstract idea in the following limitation:
update one or more weight values based on the default weight gradient vector
The step of updating weight values based on a gradient recites either a mathematical relationship or a mental process.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 1. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 3, which depends on Claim 1: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 3 recites additional abstract ideas in the following limitations:
calculate a scaled weight gradient vector based on the default weight gradient vector and a predetermined threshold value; 
update one or more weight values based on the scaled weight gradient vector.
The step of calculating recites a mathematical calculation, and the step of “update weight based on scaled gradient” recites either a mathematical relationship or a mental process.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 1. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.
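For illustration only, the calculation recited in Claim 3 can be sketched as follows. This is a minimal sketch under stated assumptions: the claim does not say how the threshold is applied, so the sketch assumes norm-based scaling (the gradient is shrunk when its Euclidean norm exceeds the threshold) and a plain gradient-descent update with a learning rate. The function names, the learning rate parameter, and the example values are hypothetical.

```python
import math


def scale_weight_gradient(default_gradient, threshold):
    """Assumed rule: rescale the gradient so its Euclidean norm does not
    exceed the predetermined threshold; otherwise return it unchanged."""
    norm = math.sqrt(sum(g * g for g in default_gradient))
    if norm == 0.0 or norm <= threshold:
        return list(default_gradient)
    factor = threshold / norm
    return [g * factor for g in default_gradient]


def update_weights(weights, gradient, learning_rate):
    """Assumed rule: plain gradient-descent step on each weight value."""
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


# Hypothetical values: the norm of [3, 4] is 5, so a threshold of 2.5 halves it.
scaled = scale_weight_gradient([3.0, 4.0], 2.5)
updated = update_weights([1.0, 1.0], scaled, 0.5)
```

Other scaling rules are possible; whichever rule is used, the operations remain a comparison, a division, and element-wise multiplications and subtractions.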

As to Claim 4, which depends on Claim 1: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 4 recites an additional abstract idea in the following limitation:
apply a derivative of an activation function to the one or more first data gradients o generate one or more input gradients
The step of “apply derivative of activation function to gradient to generate gradient” recites a mathematical calculation and a mathematical relationship.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 1. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.
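For illustration only, applying a derivative of an activation function to data gradients can be sketched as an element-wise product. The sketch assumes a sigmoid activation and assumes that the pre-activation values are available alongside the gradients; neither assumption comes from the claim, and all names and values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def apply_activation_derivative(data_gradients, pre_activations):
    """Multiply each data gradient by the activation derivative evaluated at
    the matching pre-activation value (assumed pairing)."""
    return [g * sigmoid_derivative(z)
            for g, z in zip(data_gradients, pre_activations)]


# Hypothetical values: the sigmoid derivative at 0 is 0.25, so 4.0 maps to 1.0.
input_gradients = apply_activation_derivative([4.0], [0.0])
```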

As to Claim 5, which depends on Claim 4: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 5 recites an additional abstract idea in the following limitation:
multiply one of the one or more input gradients with one or more weight vectors in a weight matrix to generate one or more multiplication results
The step of “multiply input with weight to generate result” recites a mathematical calculation.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 4. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 6, which depends on Claim 5: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 6 recites an additional abstract idea in the following limitation:
combine the one or more multiplication results calculated respectively by the one or more slave computation modules into an output gradient vector
The step of “combine result into a vector” recites either a mathematical calculation or a mental step. The mere nominal recitation of “calculated respectively by the one or more slave computation modules” does not take the claim limitation out of the abstract idea.
In the Subject Matter Eligibility Test Step 2A Prong Two and Step 2B, Claim 6 further recites the following additional element:
interconnection unit
The additional element of an interconnection unit does not add a meaningful limitation beyond linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrates the abstract idea into a practical application under Step 2A Prong Two nor contributes an inventive concept under Step 2B. Claim 6 is rejected under the same rationale as Claim 5.
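For illustration only, the multiplication of Claim 5 and the combination of Claim 6 can be sketched together. The sketch assumes that each slave computation module multiplies one scalar input gradient with its own weight vector and that the combination is an element-wise sum of the partial results; the claims do not fix either detail, and all names and values are hypothetical.

```python
def slave_multiply(input_gradient, weight_vector):
    """One slave's partial result: scalar input gradient times weight vector."""
    return [input_gradient * w for w in weight_vector]


def combine_results(partial_results):
    """Assumed combination: element-wise sum of the slaves' partial results."""
    return [sum(column) for column in zip(*partial_results)]


# Hypothetical values for two slaves.
partials = [
    slave_multiply(1.0, [1.0, 2.0]),
    slave_multiply(2.0, [3.0, 4.0]),
]
output_gradient = combine_results(partials)
```

Under these assumptions the two steps together amount to a vector-matrix product computed in pieces.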

As to Claim 7, which depends on Claim 1, Claim 7 recites an additional element in the following limitation:
an interconnection unit configured to channel data between the master computation module and the one or more slave computation modules.
In the Subject Matter Eligibility Test Step 2A, the recited additional element does not add a meaningful limitation beyond adding insignificant extra-solution activity to the judicial exception. Thus, the additional element in Claim 7 does not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception, which requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional element does not add a meaningful limitation beyond appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (MPEP 2106.05(d)(II): The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner … i. Receiving or transmitting data over a network). Thus, the additional element in Claim 7 does not contribute an inventive concept and Claim 7 is not eligible subject matter under 35 U.S.C. 101, as pointed out in the Step 2A Prong Two analysis.

As to Claims 8 and 9, which depend on Claims 1 and 5, respectively, Claims 8 and 9 recite additional elements in the following limitations:
each of the one or more slave computation modules includes a slave neuron caching unit configured to store the one or more first data gradients with the input data
each of the one or more slave computation modules includes a weight value caching unit configured to store the weight matrix that includes the one or more weight vectors
In the Subject Matter Eligibility Test Step 2A, the recited additional elements of caching units storing data are highly generic and add insignificant extra-solution activity to the judicial exception. Thus, the additional elements in Claims 8 and 9 do not integrate the abstract idea into a practical application, and each claim as a whole is directed to the judicial exception, which requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional elements do not add a meaningful limitation beyond appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (MPEP 2106.05(d)(II): The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner … iv. Storing and retrieving information in memory). Thus, the additional elements in Claims 8 and 9 do not contribute an inventive concept and Claims 8 and 9 are not eligible subject matter under 35 U.S.C. 101, as pointed out in the Step 2A Prong Two analysis.

As to Claim 10, in the Subject Matter Eligibility Test Step 1, the claimed method in Claim 10 as a whole falls within at least one statutory category.
In the Subject Matter Eligibility Test Step 2A Prong One, the claimed method in Claim 10 recites an abstract idea in the following limitation:
multiplying … one of the one or more first data gradients with the input data to generate a default weight gradient vector.
The step of multiplying recites a mathematical calculation and thus falls under the mathematical concepts grouping of abstract ideas. The mere nominal recitation of first data gradients, input data, and a default weight gradient vector does not take the claim limitation out of the mathematical concepts grouping; thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
In the Subject Matter Eligibility Test Step 2A Prong Two, Claim 10 recites the following additional elements along with the abstract idea:
receiving, by a controller unit, an instruction;
receive, by a master computation module, input data and one or more first data gradients in response to the instruction;
 transmitting, by the master computation module, the input data and the one or more first data gradients to one or more slave computation modules
The recited additional elements of receiving an instruction by the controller unit, receiving data by the master computation module, and transmitting data by the master computation module to the slave computation modules add insignificant extra-solution activity to the judicial exception. Thus, the additional elements in Claim 10 do not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception, which requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional elements of receiving data or an instruction (under BRI, an instruction is one type of data) and transmitting data do not add a meaningful limitation beyond appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (MPEP 2106.05(d)(II): The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity … i. Receiving or transmitting data). Thus, the additional elements in Claim 10 do not contribute an inventive concept and Claim 10 is not eligible subject matter under 35 U.S.C. 101.

As to Claim 11, which depends on Claim 10: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 11 recites an additional abstract idea in the following limitation:
Update, by the master computation module, one or more weight values based on the default weight gradient vector
The step of updating weight values based on a gradient recites either a mathematical relationship or a mental process. The mere nominal recitation of the master computation module does not take the claim limitation out of the mathematical concepts or mental processes grouping.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 10. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 12, which depends on Claim 10: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 12 recites additional abstract ideas in the following limitations:
Calculating, by the master computation module, a scaled weight gradient vector based on the default weight gradient vector and a predetermined threshold value; 
Updating, by the master computation module, one or more weight values based on the scaled weight gradient vector.
The step of calculating recites a mathematical calculation, and the step of “update weight based on scaled gradient” recites either a mathematical relationship or a mental process. The mere nominal recitation of the master computation module does not take the claim limitation out of the mathematical concepts or mental processes grouping.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 10. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 13, which depends on Claim 10: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 13 recites an additional abstract idea in the following limitation:
applying, by the master computation module, a derivative of an activation function to the one or more first data gradients o generate one or more input gradients
The step of “applying derivative of activation function to gradient to generate gradient” recites a mathematical calculation and a mathematical relationship. The mere nominal recitation of the master computation module does not take the claim limitation out of the mathematical concepts grouping.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 10. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 14, which depends on Claim 13: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 14 recites an additional abstract idea in the following limitation:
multiplying, by the one or more slave computation module, one of the one or more input gradients with one or more weight vectors in a weight matrix to generate one or more multiplication results
The step of “multiplying input with weight to generate result” recites a mathematical calculation. The mere nominal recitation of the slave computation module does not take the claim limitation out of the mathematical concepts grouping.
The analysis under Step 2A Prong Two and Step 2B is the same as set forth for Claim 13. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 15, depending on Claim 14: in the Subject Matter Eligibility Test Step 2A Prong One, Claim 15 recites an additional abstract idea in the following limitation:
combining, by the interconnection unit, the one or more multiplication results, calculated respectively by the one or more slave computation module, into an output gradient vector
The step of “combining result into a vector” recites either a mathematical calculation or a mental step. The mere nominal recitation of an interconnection unit and a slave computation module does not take the claim limitation out of the mathematical concept grouping.
The Subject Matter Eligibility Test Step 2A Prong Two and Step 2B analyses are the same as for Claim 14. This claim does not recite any additional elements that integrate the abstract idea into a practical application or amount to significantly more. The claim is not eligible.

As to Claim 16, depending on Claim 10, Claim 16 recites an additional element in the following limitation:
channeling, by an interconnection unit, data between the master computation module and the one or more slave computation modules.
In the Subject Matter Eligibility Test Step 2A, the recited additional element does not add a meaningful limitation beyond adding insignificant extra-solution activity to the judicial exception. Thus, the additional element in Claim 16 does not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception, which requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional element does not add a meaningful limitation beyond appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (MPEP 2106.05(d)(II): the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner … i. Receiving or transmitting data over a network). Thus, the additional element in Claim 16 does not contribute an inventive concept, and Claim 16 is not eligible subject matter under 35 U.S.C. 101.

As to Claim 17 and Claim 18, depending on Claim 10 and Claim 14, Claim 17 and Claim 18 recite additional elements in the following limitations:
storing, by a slave neuron caching unit of each of the one or more slave computation modules, the one or more first data gradients with the input data
storing, by a weight value caching unit of each of the one or more slave computation modules, the weight matrix that includes the one or more weight vectors
In the Subject Matter Eligibility Test Step 2A, the recited additional elements of storing data are highly generic and add insignificant extra-solution activity to the judicial exception. Thus, the additional elements in Claim 17 and Claim 18 do not integrate the abstract idea into a practical application, and the claims as a whole are directed to the judicial exception, which requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional elements do not add a meaningful limitation beyond appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (MPEP 2106.05(d)(II): the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner … iv. Storing and retrieving information in memory). Thus, the additional elements in Claim 17 and Claim 18 do not contribute an inventive concept, and Claim 17 and Claim 18 are not eligible subject matter under 35 U.S.C. 101.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1 - 18 are rejected under 35 U.S.C. 103 as being unpatentable over Touretzky, Backpropagation Learning, Lecture 15-486/782: Artificial Neural Networks, Computer Science, Carnegie Mellon University, 2006, in view of Hamalainen, TUTNC: a general purpose parallel computer for neural network computations, Microprocessors and Microsystems Vol 19, 1995.

Regarding Claim 1, Touretzky discloses: 
backpropagation in a fully connected layer of a neural network (Touretzky, page. 12, showing backpropagation of error in training a fully connected neural network), comprising:
receive input data and one or more first data gradients (Touretzky, page. 9, where during training, the neural network layers generate/receive data $x_i$ [input data] and the gradient of data y, $\frac{\partial E}{\partial y}$ [first data gradient])
multiply one of the one or more first data gradients with the input data to generate a default weight gradient vector (Touretzky, page. 9, where gradient of weight [default weight gradient] $\frac{\partial E}{\partial w_i} = \frac{dE}{dy} \cdot \frac{\partial y}{\partial w_i} = (y - d) \cdot x_i$. $(y - d) = \frac{\partial E}{\partial y}$ is the gradient of the data y [first data gradient] and $x_i$ is the input data).
Touretzky does not explicitly disclose:
An apparatus for … neural network
a controller unit configured to receive an instruction;
and one or more computation modules that include a master computation module and one or more slave computation modules, 
wherein the master computation module configured to receive input data and one or more first data gradients in response to the instruction and transmit the input data and the one or more first data gradients to one or more slave computation modules
and wherein the one or more slave computation modules are respectively configured to multiply one of the one or more first data gradients with the input data to generate a default weight gradient vector
Hamalainen explicitly discloses:
An apparatus for … neural network (Hamalainen, page. 447, col. 1, ln. 3 – 4, where TUTNC [apparatus] is designed for parallel computation and is suitable for several artificial neural network algorithms)
a controller unit configured to receive an instruction (Hamalainen, page. 451, col. 1, para. 3, ln. 1 – 5 & fig. 6, where the root is an interface IFC [controller unit] to a host computer, which receives instructions from the host to control the CU and PU);
and one or more computation modules that include a master computation module and one or more slave computation modules (Hamalainen, fig. 1 & page. 448, col. 2, para. 2, ln. 4 – 7, where this architecture can be referred to as a master slave configuration; fig. 6, where in the TUTNC system, PUs are slave processing units [slave computation modules], and the IFC and host [master computation module] control the overall system), 
wherein the master computation module configured to receive input data and one or more first data gradients in response to the instruction and transmit the input data and the one or more first data gradients to one or more slave computation modules (Hamalainen, page. 451, col 2, para. 3, ln. 3 – 6, where the host [master computation module] serves data transfers [receive data and transmit data to each PU]; fig. 4, where x1 [data] and w11-w41 [one or more first data] are transmitted to PE1 [slave computation module] for a multiplication operation)
and wherein the one or more slave computation modules are respectively configured to multiply one of the one or more first data gradients with the input data to generate a default weight gradient vector (Hamalainen, fig. 4, where multiple PEs [slave computation modules] multiply x [input data] and w [one or more first data gradients]; page. 449, col. 1, para. 3, ln. 7 – 8, where the root collects all outputs [default weight gradient] and forms an output vector [default weight gradient vector])
Touretzky and Hamalainen both disclose methods of implementing neural networks and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Touretzky’s teaching of training a neural network using gradient descent backpropagation with Hamalainen’s teaching of a hardware/software implementation of master/slave parallel computation for neural networks to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to achieve a good cost/performance ratio (Hamalainen, page. 447, col. 1, para. 1, ln. 13 – 14).
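
For illustration only, the weight-gradient relationship relied upon from Touretzky, page. 9, can be sketched in Python as follows. The function name weight_gradient and the sample values are hypothetical and appear in neither reference; the sketch assumes a single output unit whose data gradient is (y - d).

```python
def weight_gradient(x, y, d):
    """Return the list [dE/dw_i] for one output unit, given inputs x."""
    data_gradient = y - d  # dE/dy, mapped above to the "first data gradient"
    # dE/dw_i = (y - d) * x_i: multiply the data gradient with each input
    return [data_gradient * x_i for x_i in x]


# Hypothetical sample values chosen only to make the arithmetic easy to check
grad = weight_gradient([1.0, 2.0, -1.0], y=0.5, d=1.0)
print(grad)
```

Each entry of the returned list is one product of the data gradient with one input value, which is the multiplication the rejection maps to the default weight gradient vector.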

Regarding Claim 2, depending on Claim 1. Touretzky in view of Hamalainen discloses the apparatus of Claim 1. Touretzky in view of Hamalainen further discloses:  
further configured to update one or more weight values based on the default weight gradient vector (Touretzky, page 9 & page. 13, where during the training phase, weights are updated by $\Delta w_i$, which is based on the weight gradients $\frac{\partial E}{\partial w_i}$ [default weight gradient vector]).
wherein the master computation module is further configured to update … (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where during execution, the host [master computation module] … controls the overall execution by sending real-time commands to the system)
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 1. 

Regarding Claim 3, depending on Claim 1. Touretzky in view of Hamalainen discloses the apparatus of Claim 1. Touretzky in view of Hamalainen further discloses:  
further configured to calculate a scaled weight gradient vector based on the default weight gradient vector and a predetermined threshold value; and update one or more weight values based on the scaled weight gradient vector (Touretzky, page. 9, where during training, weights are updated by $\Delta w_i = -\eta \frac{\partial E}{\partial w_i}$, i.e., the default weight gradient scaled by the learning rate $\eta$; page. 26, where the learning rate depends on cosine [threshold]).
wherein the master computation module is further configured to calculate … (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where during execution, the host [master computation module] … controls the overall execution by sending real-time commands to the system)
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 1. 
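
For illustration only, the update rule $\Delta w_i = -\eta \frac{\partial E}{\partial w_i}$ cited for this claim can be sketched in Python as follows. The function name scaled_update, the fixed learning rate, and the sample values are hypothetical; the sketch does not model how a learning rate or threshold would be chosen.

```python
def scaled_update(weights, weight_gradient, learning_rate):
    """Apply delta_w_i = -learning_rate * dE/dw_i to each weight."""
    # Scale the default weight gradient by the learning rate (eta)
    delta = [-learning_rate * g for g in weight_gradient]
    # Update each weight with its scaled gradient
    return [w + dw for w, dw in zip(weights, delta)]


# Hypothetical sample values chosen only to make the arithmetic easy to check
new_weights = scaled_update([0.5, 0.5, 0.5], [-0.5, -1.0, 0.5], learning_rate=0.5)
print(new_weights)
```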

Regarding Claim 4, depending on Claim 1. Touretzky in view of Hamalainen discloses the apparatus of Claim 1. Touretzky in view of Hamalainen further discloses:  
further configured to apply a derivative of an activation function to the one or more first data gradients o generate one or more input gradients (Touretzky, page. 12, where during the back propagation, the gradient of the middle layer back propagation input $\frac{\partial E}{\partial net_k}$ [input gradients] is calculated by the derivative of the activation function $g'(net_k)$ and the first data gradients $(y_k - d_k)$, i.e., $\frac{\partial E}{\partial net_k} = (y_k - d_k) \cdot g'(net_k)$).
wherein the master computation module is further configured to apply … (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where during execution, the host [master computation module] … controls the overall execution by sending real-time commands to the system)
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 1. 
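
For illustration only, the relationship $\frac{\partial E}{\partial net_k} = (y_k - d_k) \cdot g'(net_k)$ cited for this claim can be sketched in Python as follows. The choice of a logistic sigmoid for g is an assumption of this sketch, and the function names and sample values are hypothetical.

```python
import math


def sigmoid(net):
    """Logistic activation g(net); assumed here purely for illustration."""
    return 1.0 / (1.0 + math.exp(-net))


def input_gradient(y_k, d_k, net_k):
    """Return dE/dnet_k = (y_k - d_k) * g'(net_k) for one output unit."""
    g = sigmoid(net_k)
    g_prime = g * (1.0 - g)  # derivative of the logistic sigmoid
    return (y_k - d_k) * g_prime


# Hypothetical sample values: net_k = 0 gives g = 0.5 and g' = 0.25
net_k = 0.0
y_k = sigmoid(net_k)
grad_net_k = input_gradient(y_k, d_k=1.0, net_k=net_k)
print(grad_net_k)
```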


Regarding Claim 5, depending on Claim 4. Touretzky in view of Hamalainen discloses the apparatus of Claim 4. Touretzky in view of Hamalainen further discloses:  
multiply one of the one or more input gradients with one or more weight vectors in a weight matrix to generate one or more multiplication results (Touretzky, page. 12, where during back propagation, the gradient of the middle layer back propagation output $y_j$ is calculated as $\frac{\partial E}{\partial y_j} = \sum_k \left( \frac{\partial E}{\partial net_k} \cdot \frac{\partial net_k}{\partial y_i} \right)$; page. 10, $net_k = \sum_i w_{ik} \cdot y_i$; thus $\frac{\partial net_k}{\partial y_i} = w_{ik}$, $\frac{\partial E}{\partial y_j} = \sum_k \left( \frac{\partial E}{\partial net_k} \cdot w_{ik} \right)$; i.e., multiply, for each of the k nodes of the output layer, input gradient $\frac{\partial E}{\partial net_k}$ with weight $w_{ik}$)
wherein the one or more slave computation modules are respectively configured to multiply one of the one or more input gradients with one or more weight vectors in a weight matrix to generate one or more multiplication results (Hamalainen, fig. 4, where multiple PEs [slave computation modules] multiply x [input gradient] and w [weight] to generate multiplication results)
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 4. 

Regarding Claim 6, depending on Claim 5. Touretzky in view of Hamalainen discloses the apparatus of Claim 5. Touretzky in view of Hamalainen further discloses:  
combine the one or more multiplication results … into an output gradient vector (Touretzky, page. 12, where during back propagation, the gradient of the middle layer that is output and back-propagated to the lower layer is $\frac{\partial E}{\partial y_j}$ [output gradient vector], which is calculated as $\frac{\partial E}{\partial y_j} = \sum_k \left( \frac{\partial E}{\partial net_k} \cdot w_{ik} \right)$; i.e., combine all the multiplication results of k nodes).
further comprising an interconnection unit configured to combine the one or more multiplication results calculated respectively by the one or more slave computation modules (Hamalainen, page. 449, col. 2, para. 2, ln. 9 – 11 & fig. 4, where the communication network [interconnect unit] sums [combine] these multiplication results).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 5. 
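
For illustration only, the multiply-then-combine relationship cited from Touretzky, page. 12, for Claims 5 and 6 can be sketched in Python as follows. The function name output_gradient, the weights[j][k] indexing (middle-layer unit j to output unit k), and the sample values are assumptions of this sketch and appear in neither reference.

```python
def output_gradient(input_gradients, weights):
    """Return [dE/dy_j] = [sum over k of dE/dnet_k * w_jk].

    input_gradients[k] is dE/dnet_k for output unit k.
    weights[j][k] is the weight from middle-layer unit j to output unit k.
    """
    # Multiplication step: one list of products per output unit k
    products = [
        [g_k * row[k] for row in weights]
        for k, g_k in enumerate(input_gradients)
    ]
    # Combination step: sum the products over k for each middle-layer unit j
    return [sum(p[j] for p in products) for j in range(len(weights))]


# Hypothetical sample values: two output units, three middle-layer units
grads = output_gradient([1.0, -2.0], [[0.5, 0.25], [1.0, 0.0], [-1.0, 0.5]])
print(grads)
```

The two steps are kept separate in the sketch only to mirror the claim mapping above, in which the products are formed first and then combined into one output gradient vector.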

Regarding Claim 7, depending on Claim 1. Touretzky in view of Hamalainen discloses the apparatus of Claim 1. Touretzky in view of Hamalainen further discloses:  
further comprising an interconnection unit configured to channel data between the master computation module and the one or more slave computation modules (Hamalainen, fig. 10, showing a mechanism to create a communication path [channel] to a PU).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 1. 

Regarding Claim 8, depending on Claim 1. Touretzky in view of Hamalainen discloses the apparatus of Claim 1. Touretzky in view of Hamalainen further discloses:  
wherein each of the one or more slave computation modules includes a slave neuron caching unit configured to store the one or more first data gradients with the input data (Hamalainen, p. 451, col. 1, para. 4, ln. 5 – 6, & col. 2, para. 1, ln. 1, where the local memory [slave neuron caching unit] … is used for storage of run-time variables [one or more first data gradients; input data]).
Touretzky and Hamalainen disclose the apparatus of Claim 1. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further integrate Hamalainen’s teaching of local caching to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to store program code and run-time variables (Hamalainen, page. 451, col. 1, para. 4, ln. 5 – 6 and col. 2, para. 1, ln. 1).


Regarding Claim 9, depending on Claim 5. Touretzky in view of Hamalainen discloses the apparatus of Claim 5. Touretzky in view of Hamalainen further discloses:  
wherein each of the one or more slave computation modules includes a weight value caching unit configured to store the weight matrix that includes the one or more weight vectors (Hamalainen, p. 451, col. 1, para. 4, ln. 5 – 6, & col. 2, para. 1, ln. 1, where the local memory [weight value caching unit] … is used for storage of run-time variables [one or more weight vectors]).
Touretzky and Hamalainen disclose the apparatus of Claim 5. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further integrate Hamalainen’s teaching of local caching to achieve the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to store program code and run-time variables (Hamalainen, page. 451, col. 1, para. 4, ln. 5 – 6 and col. 2, para. 1, ln. 1).

Regarding Claim 10, Touretzky discloses: 
A method for backpropagation in a fully connected layer of a neural network (Touretzky, page. 12, where backpropagation of error is used in training a fully connected neural network), comprising:
receiving … input data and one or more first data gradients (Touretzky, page. 9, where during training, the neural network layers generate/receive data $x_i$ [input data] and the gradient of data y, $\frac{\partial E}{\partial y}$ [first data gradient]);
multiply … one of the one or more first data gradients with the input data to generate a default weight gradient vector (Touretzky, page. 9, where the gradient of weight [default weight gradient] is $\frac{\partial E}{\partial w_i} = \frac{dE}{dy} \cdot \frac{\partial y}{\partial w_i} = (y - d) \cdot x_i$; $(y - d) = \frac{\partial E}{\partial y}$ is the gradient of the data y [first data gradient] and $x_i$ is the input data).
Touretzky does not explicitly disclose:
receiving, by a controller unit, an instruction;
receiving, by a master computation module, input data and one or more first data gradients in response to the instruction
transmitting, by the master computation module, the input data and the one or more first data gradients to one or more slave computation modules
multiply, by the one or more slave computation modules, one of the one or more first data gradients with the input data to generate a default weight gradient vector
Hamalainen explicitly discloses:
receiving, by a controller unit, an instruction (Hamalainen, page. 451, col. 1, para. 3, ln. 1 – 5 & fig. 6, where the root is an interface IFC [controller unit] to a host computer, which receives instructions from the host to control the CU and PU);
receiving, by a master computation module, input data and one or more first data gradients in response to the instruction; transmitting, by the master computation module, the input data and the one or more first data gradients to one or more slave computation modules (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where, based on the program instruction, the host [master computation module] serves data transfers [receiving data and transmitting data to each PU]; fig. 4, where x1 [data] and w11-w41 [one or more first data] are transmitted to PE1 [slave computation module] for the multiplication operation);
multiply, by the one or more slave computation modules, one of the one or more first data gradients with the input data to generate a default weight gradient vector (Hamalainen, fig. 4, where multiple PEs [slave computation modules] multiply x [input data] and w [one or more first data gradients]; page. 449, col. 1, para. 3, ln. 7 – 8, where the root collects all outputs [default weight gradient] and forms an output vector [default weight gradient vector]).
Touretzky and Hamalainen both disclose methods of implementing neural networks and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Touretzky’s teaching of training a neural network using gradient-descent backpropagation with Hamalainen’s teaching of a hardware/software implementation of master/slave parallel computation for neural networks to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification to achieve a good cost/performance ratio (Hamalainen, page. 447, col. 1, para. 1, ln. 13 – 14).
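
For illustration only, the weight-gradient relation cited from Touretzky, page. 9, can be sketched in Python. The sketch assumes a single linear unit with y = Σ w_i·x_i and squared error E = ½(y − d)², which is one setting in which dE/dy = (y − d) holds; the function name and the numeric values are hypothetical and are not taken from either reference.

```python
def weight_gradient(x, w, d):
    """Return dE/dw_i = (y - d) * x_i for every input i.

    Assumption (illustrative): y = sum_i w_i * x_i and E = 0.5 * (y - d) ** 2.
    """
    y = sum(wi * xi for wi, xi in zip(w, x))  # forward pass of the linear unit
    dE_dy = y - d                             # gradient of the error w.r.t. y
    return [dE_dy * xi for xi in x]           # one product per input x_i


# Hypothetical values chosen only to make the arithmetic easy to follow.
grad = weight_gradient(x=[1.0, 2.0], w=[0.5, 0.25], d=0.0)
print(grad)
```

Each element of the returned list is the product of the same data gradient (y − d) with a different input x_i, which is the multiplication the cited passage describes.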

Regarding Claim 11, depending on Claim 10. Touretzky in view of Hamalainen discloses the method of Claim 10. Touretzky in view of Hamalainen further discloses:
Updating … one or more weight values based on the default weight gradient vector (Touretzky, page 9 & page. 13, where during the training phase, weights are updated by $\Delta w_i$, which is based on the weight gradients $\frac{\partial E}{\partial w_i}$ [default weight gradient vector]).
Updating, by the master computation module, (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where during execution, the host [master computation module] … controls the overall execution by sending real-time command to the system).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 10. 

Regarding Claim 12, depending on Claim 10. Touretzky in view of Hamalainen discloses the method of Claim 10. Touretzky in view of Hamalainen further discloses:
Calculating … a scaled weight gradient vector based on the default weight gradient vector and a predetermined threshold value; and update one or more weight values based on the scaled weight gradient vector (Hamalainen, page. 9, where during training, weights are updated by $\Delta w_i = -\eta \frac{\partial E}{\partial w_i}$, the default weight gradient scaled by learning rate η; page. 26, where the learning rate depends on cosine [threshold]).
Calculating, by the master computation module, (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where during execution, the host [master computation module] … controls the overall execution by sending real-time command to the system).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 10. 
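
For illustration only, the cited update rule Δw_i = −η·∂E/∂w_i can be sketched in Python. The sketch shows nothing beyond scaling each weight gradient by a learning rate and applying the result; the fixed value of eta, the function name, and the numeric values are hypothetical assumptions.

```python
def update_weights(w, grad, eta):
    """Apply delta_w_i = -eta * dE/dw_i to every weight.

    eta is a fixed learning rate here (an assumption for this sketch).
    """
    scaled = [eta * gi for gi in grad]           # gradient scaled by eta
    return [wi - si for wi, si in zip(w, scaled)]  # w_i + delta_w_i


# Hypothetical values chosen only to make the arithmetic easy to follow.
new_w = update_weights(w=[0.5, 0.25], grad=[1.0, 2.0], eta=0.25)
print(new_w)
```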

Regarding Claim 13, depending on Claim 10. Touretzky in view of Hamalainen discloses the method of Claim 10. Touretzky in view of Hamalainen further discloses:
applying … a derivative of an activation function to the one or more first data gradients o generate one or more input gradients (Touretzky, page. 12, where during the back propagation, the gradient of the middle layer back propagation input $\frac{\partial E}{\partial \mathrm{net}_k}$ [input gradients] is calculated by the derivative of the activation function $g'(\mathrm{net}_k)$ and the first data gradients $(y_k - d_k)$, i.e., $\frac{\partial E}{\partial \mathrm{net}_k} = (y_k - d_k) \cdot g'(\mathrm{net}_k)$).
applying, by the master computation module, (Hamalainen, page. 451, col. 2, para. 3, ln. 3 – 6, where during execution, the host [master computation module] … controls the overall execution by sending real-time command to the system).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 10.
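
For illustration only, the cited relation ∂E/∂net_k = (y_k − d_k)·g'(net_k) can be sketched in Python. The logistic sigmoid is assumed for the activation function g purely for this sketch, since the passage above does not name a specific activation; the function names and numeric values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here only as an example of g."""
    return 1.0 / (1.0 + math.exp(-z))


def input_gradients(net, d):
    """Return dE/dnet_k = (y_k - d_k) * g'(net_k) for every output unit k."""
    deltas = []
    for net_k, d_k in zip(net, d):
        y_k = sigmoid(net_k)          # y_k = g(net_k)
        g_prime = y_k * (1.0 - y_k)   # derivative of the sigmoid at net_k
        deltas.append((y_k - d_k) * g_prime)
    return deltas


# Hypothetical values: both units have net_k = 0, so y_k = 0.5 and g' = 0.25.
deltas = input_gradients(net=[0.0, 0.0], d=[1.0, 0.0])
print(deltas)
```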

Regarding Claim 14, depending on Claim 13. Touretzky in view of Hamalainen discloses the method of Claim 13. Touretzky in view of Hamalainen further discloses:
multiplying … one of the one or more input gradients with one or more weight vectors in a weight matrix to generate one or more multiplication results (Touretzky, page. 12, where during back propagation, the gradient of the middle layer back propagation output $y_j$ is calculated as $\frac{\partial E}{\partial y_j} = \sum_k \left( \frac{\partial E}{\partial \mathrm{net}_k} \cdot \frac{\partial \mathrm{net}_k}{\partial y_j} \right)$; page. 10, $\mathrm{net}_k = \sum_j w_{jk} \cdot y_j$; thus $\frac{\partial \mathrm{net}_k}{\partial y_j} = w_{jk}$, and $\frac{\partial E}{\partial y_j} = \sum_k \left( \frac{\partial E}{\partial \mathrm{net}_k} \cdot w_{jk} \right)$; i.e., multiply, for each of the k nodes of the output layer, input gradient $\frac{\partial E}{\partial \mathrm{net}_k}$ with weight $w_{jk}$)
multiplying, by the one or more slave computation modules, (Hamalainen, fig. 4, where multiple PEs [slave computation modules] multiply x [input gradient] and w [weight] to generate multiplication results)
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 13.
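
For illustration only, the per-node multiplication described above (each input gradient ∂E/∂net_k multiplied by the weight connecting a middle-layer unit to output node k) can be sketched in Python. The row-per-middle-unit layout of the weight matrix, the function name, and the numeric values are hypothetical assumptions.

```python
def multiplication_results(deltas, weights):
    """Return products[j][k] = weights[j][k] * deltas[k].

    Assumed layout: weights[j][k] connects middle-layer unit j to output
    node k, and deltas[k] is the input gradient dE/dnet_k of output node k.
    """
    return [
        [w_jk * delta_k for w_jk, delta_k in zip(row, deltas)]
        for row in weights
    ]


# Hypothetical values: two middle-layer units, two output nodes.
products = multiplication_results(
    deltas=[-0.125, 0.125],
    weights=[[1.0, 2.0], [4.0, 8.0]],
)
print(products)
```

The products are left unsummed here so that the multiplication step is visible separately from any later combining step.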

Regarding Claim 15, depending on Claim 14. Touretzky in view of Hamalainen discloses the method of Claim 14. Touretzky in view of Hamalainen further discloses:
combining … the one or more multiplication results … into an output gradient vector (Touretzky, page. 12, where during back propagation, the gradient of the middle layer that is output and back propagated to the lower layer is $\frac{\partial E}{\partial y_j}$ [output gradient vector], which is calculated as $\frac{\partial E}{\partial y_j} = \sum_k \left( \frac{\partial E}{\partial \mathrm{net}_k} \cdot w_{jk} \right)$; i.e., combine all the multiplication results of the k nodes).
combining, by an interconnection unit, the one or more multiplication results calculated respectively by the one or more slave computation modules (Hamalainen, page. 449, col. 2, para. 2, ln. 9 – 11 & fig. 4, where the communication network [interconnect unit] sums [combine] these multiplication results).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 14.
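
For illustration only, the combining step (summing the per-node products over k to obtain ∂E/∂y_j) can be sketched in Python. The sketch assumes one hypothetical worker per output node k, each producing a partial vector, followed by an element-wise sum; this partitioning, the function names, and the numeric values are assumptions and do not reproduce the hardware of either reference.

```python
def slave_partial(delta_k, column_k):
    """Hypothetical worker: scale one weight column by its input gradient.

    column_k[j] is the weight from middle-layer unit j to output node k.
    """
    return [delta_k * w_jk for w_jk in column_k]


def combine(partials):
    """Element-wise sum of the partial vectors, i.e. the sum over k."""
    return [sum(values) for values in zip(*partials)]


# Hypothetical values: two output nodes, two middle-layer units.
deltas = [-0.125, 0.125]
columns = [[1.0, 4.0], [2.0, 8.0]]  # columns[k][j]
partials = [slave_partial(dk, col) for dk, col in zip(deltas, columns)]
output_gradient = combine(partials)
print(output_gradient)
```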

Regarding Claim 16, depending on Claim 10. Touretzky in view of Hamalainen discloses the method of Claim 10. Touretzky in view of Hamalainen further discloses:
channeling, by an interconnection unit, data between the master computation module and the one or more slave computation modules (Hamalainen, fig. 10, where a mechanism creates a communication path [channel] to the PU).
The rationale to combine Touretzky’s teaching and Hamalainen’s teaching is the same as set forth in claim 10.

Regarding Claim 17, depending on Claim 10. Touretzky in view of Hamalainen discloses the method of Claim 10. Touretzky in view of Hamalainen further discloses:
Storing, by a slave neuron caching unit of each of the one or more slave computation modules, the one or more first data gradients with the input data (Hamalainen, p. 451, col. 1, para. 4, ln. 5 – 6, & col. 2, para. 1, ln. 1, where the local memory [slave neuron caching unit] … is used for storage of run-time variables [one or more first data gradients; input data]).
Touretzky and Hamalainen disclose the method of Claim 10. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further integrate Hamalainen’s teaching of local caching to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to store program code and run-time variables (Hamalainen, page. 451, col. 1, para. 4, ln. 5 – 6 and col. 2, para. 1, ln. 1).

Regarding Claim 18, depending on Claim 14. Touretzky in view of Hamalainen discloses the method of Claim 14. Touretzky in view of Hamalainen further discloses:
Storing, by a weight value caching unit of each of the one or more slave computation modules, the weight matrix that includes the one or more weight vectors (Hamalainen, p. 451, col. 1, para. 4, ln. 5 – 6, & col. 2, para. 1, ln. 1, where the local memory [weight value caching unit] … is used for storage of run-time variables [one or more weight vectors]).
Touretzky and Hamalainen disclose the method of Claim 14. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further integrate Hamalainen’s teaching of local caching to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to store program code and run-time variables (Hamalainen, page. 451, col. 1, para. 4, ln. 5 – 6 and col. 2, para. 1, ln. 1).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571) 272-9354. The examiner can normally be reached on Monday - Friday, 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/S.C./Examiner, Art Unit 2122

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122