DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the application and the preliminary amendment filed 4/23/2018. In the preliminary amendment, claims 6-10 were amended, claims 11-20 were added, and no claims were cancelled. Thus, claims 1-20 are pending and have been examined.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. The present application is a national stage application of PCT application number PCT/CN2016/086098 filed on 06/17/2016, and claims foreign priority to Chinese Patent Application No. CN201510792463.0 filed on 11/17/2015. 
The examiner acknowledges that a certified copy of Chinese Patent Application No. CN201510792463.0 has been retrieved, as required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 4/23/2018 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been considered by the examiner.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference characters not mentioned in the description: 
Reference characters 11, 12, 13, 14, 15 and 16 shown in Figure 2 are not found in the detailed description.
Reference characters 21, 22, 23, 24 and 25 shown in Figure 3 are not found in the detailed description.
Reference characters 5, 10 and 20 shown in Figure 5 are not found in the detailed description.
The drawings are also objected to because Figures 1 and 5 need text labels to define at least some of the empty boxes in the Figures. In particular, Figures 1 and 5 include unlabeled boxes/blocks (see, e.g., boxes 1-8 in FIG. 1 and boxes 5, 10 and 20 in FIG. 5). That is, none of the boxes in FIGs. 1 and 5 has any text label (only reference characters).
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for 

Specification
The disclosure is objected to because of the following informalities:
The specification is objected to as failing to provide proper antecedent basis for the claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Correction of the following is required: Amended claims 6-10 and new claims 16-20 do not appear to have support in the specification as originally filed on 4/23/2018. While the specification was amended on 4/23/2018 to mention an “operating means” and “acceleration means” (see, e.g., pages 4-6 and 8-10), there does not appear to be any discussion of the “performing operation means” recited in amended claims 6-10 or the “acceleration means by an acceleration chip for accelerating a deep neural network algorithm” recited in new claims 16-20 in applicant’s original specification. Although the amended specification includes some of the claim language recited in amended claims 6-10 and new claims 16-20, like the original specification, it also fails to mention, let alone discuss, any “performing operation means” as recited in amended claims 6-10.
all silent regarding any “performing operation means” as recited in amended claims 6-10. Further, the original specification and the specification of the priority PCT application both fail to mention any “acceleration means by an acceleration chip for accelerating a deep neural network algorithm” as recited in new claims 16-20. Further, because the priority application, Chinese Patent Application No. CN201510792463.0, is not in English and no certified English translation of the foreign application has been submitted to date, it is not clear that it provides proper antecedent basis for the subject matter recited in amended claims 6-10 or new claims 16-20. Appropriate correction is required.
In the “BACKGROUND” section on page 2, line 7 of the specification, reference is made to “The patent document 1 (publication No.: CN101527010A)”. This application should be identified by its corresponding U.S. Patent Application number, and its U.S. Patent Application Publication number, if it has been published. 
Also, the listing of the patent application reference in the specification is not a proper information disclosure statement. 37 CFR 1.98(b) requires a list of all patents, publications, or other information submitted for consideration by the Office, and MPEP § 609.04(a) states, "the list may not be incorporated into the specification but must be submitted in a separate paper." Therefore, unless the reference listed on page 2 of applicant’s specification was also submitted in an information disclosure statement, or has been cited by the examiner on form PTO-892, it has not been considered. Appropriate correction is required.

The specification includes numerous grammatical and typographical errors. For example, the numerous recitations of “As regards to” (see, e.g., page 3, lines 9 and 29, page 4, lines 4, 9 and 13, page 5, line 29, page 6, lines 6, 12 and 21, page 7, lines 21 and 26, page 8, lines 1 and 5, page 9, lines 22 and 28, page 10, lines 3 and 11, and page 16, lines 1, 7, 12 and 20) are grammatically incorrect and should read “[[As]] With regards to” or “Regarding”.
Also, for example, the recitation of “the present application further provides a operating means” on page 4, line 19 of the specification is grammatically incorrect and should read “the present application further provides [[a]] an operating means”. Appropriate correction is required.
Further, for example, the recitation of “A acceleration means” on page 8, line 9 should read “[[A]] An acceleration means”. Appropriate correction is required.
Additionally, for example, the recitation of “With respect to characteristics of the deep neural network algorithm, some optimization designs are further made on the intermediate value storage regions of the present disclosure, and support continuous writing and reading the memory address of certain intermediate values for several times through a counter in response to the instructions, which greatly promotes, such as, calculation of the pooling layer in the Convolutional Neural Network (CNN).” on page 13, lines 21-26 includes grammatical errors and appears to be missing one or more words. Appropriate correction is required.
2, lines 13-29 describing FIG. 2, which do not mention reference characters 11-16).
Reference characters 21, 22, 23, 24 and 25 shown in Figure 3 are not found in the detailed description (see, e.g., page 14, line 30-page 15, line 22 describing FIG. 3, which do not mention reference characters 21-25).
Reference characters 5, 10 and 20 shown in Figure 5 are not found in the detailed description (see, e.g., page 17, lines 12-14 describing FIG. 5, which do not mention reference characters 5, 10 and 20).
Appropriate correction is required.
Applicant is reminded of the proper language and format for an abstract of the disclosure.
The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words. It is important that the abstract not exceed 150 words in length since the space provided for the abstract on the computer tape used by the printer is limited. The form and legal phraseology often used in patent claims, such as "means" and "said," should be avoided. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, "The 

Claim Objections
Claims 1-20 are objected to because of the following informalities:
In claim 1, the word “wherein” appears to be missing at the beginning of lines 8 and 13, and the word “and” is missing at the end of line 12. Appropriate correction is required.
The preamble of each of claims 2-5 recites “an acceleration chip for accelerating a deep neural network algorithm” (see, e.g., lines 1-2 of claims 2-5). Applicant previously introduced “an acceleration chip for accelerating a deep neural network algorithm” in lines 1-2 of base claim 1. As such, it appears the recitations of “an acceleration chip for accelerating a deep neural network algorithm” in claims 2-5 should read “[[an]] the acceleration chip for accelerating [[a]] the deep neural network algorithm”. For examination purposes, the recitations of “an acceleration chip for accelerating a deep neural network algorithm” in each of claims 2-5 are being interpreted as the previously-introduced “acceleration chip for accelerating” the previously-introduced “deep neural network algorithm”. Appropriate correction is required.

The recitation of “A performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” in lines 1-2 of claim 6 is grammatically incorrect. As discussed below in the rejection of this claim under 112(b), this recitation is also unclear. If supported by the original specification, the examiner suggests that one way to at least partially address this objection would be to amend lines 1-2 of claim 6 to recite “A performing operation means [[by]] for using an operation apparatus [[for]] of an acceleration chip”. Appropriate correction is required.

In independent claim 11, the word “wherein” appears to be missing at the beginning of lines 10 and 15, and the word “and” is missing at the end of line 14. Appropriate correction is required.
In independent claim 16, the word “wherein” appears to be missing at the beginning of lines 28 and 33, and the word “and” is missing at the end of line 32. Appropriate correction is required.
The recitation of “A acceleration means by an acceleration chip for accelerating a deep neural network algorithm” in line 1 of claim 16 is grammatically incorrect. As discussed below in the rejection of this claim under 112(b), this recitation is also unclear. If supported by the original specification, the examiner suggests that one way to at least partially address this objection would be to amend line 1 of claim 16 to recite “[[A]] An acceleration means”. Appropriate correction is required.


Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f), is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f), is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.
Claim limitations in this application that use the word “means” (or “step”) that are being interpreted under 35 U.S.C. 112(f) include:
A performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm in claim 6; 
a step of vector addition processing operation, 
a step of vector function value operation, … ; and
a step of vector multiply-add operation in claims 6 and 16; and
A acceleration means by an acceleration chip for accelerating a deep neural network algorithm in claim 16.
Regarding claim 6 and the above-noted three-prong test, the recited “performing operation means” limitation uses the term “means”; “by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” is functional language; and there is no recitation in the claim of sufficient structure to perform the using and accelerating.
Regarding claims 6 and 16 and the above-noted three-prong test, the recited “step of” limitations use the term “step”; the “vector addition processing operation”, “vector function value operation”, and “vector multiply-add operation” are functional language; and there is no recitation in the claims of sufficient structure to perform the processing or operations.
Regarding claim 16 and the above-noted three-prong test, the recited “acceleration means” limitation uses the term “means”; “by an acceleration chip for accelerating a deep neural network algorithm” is functional language; and there is no recitation in the claim of sufficient structure to perform the accelerating.
This application also includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f), because the claim limitations use a generic placeholder that is coupled with functional language 
a vector addition processor module for performing addition or subtraction of a vector,
a vector function value arithmetic unit module for performing a vectorized operation … ; and
a vector multiplier-adder module for performing a multiply-add operation on vectors in each of claims 1, 6, 11 and 16.
Regarding claims 1, 6, 11 and 16 and the above-noted three-prong test, the recited “vector addition processor module” is a generic placeholder; “for performing addition or subtraction of a vector” is functional language; and there is no recitation in the claims of sufficient structure to perform the addition or subtraction. Also in claims 1, 6, 11 and 16, the recited “vector function value arithmetic unit module” is a generic placeholder; “for performing a vectorized operation” is functional language; and there is no recitation in the claims of sufficient structure to perform the operation. Additionally, in claims 1, 6, 11 and 16, the recited “vector multiplier-adder module” is a generic placeholder; “for performing a multiply-add operation on vectors” is functional language; and there is no recitation in the claims of sufficient structure to perform the operation.
Regarding the “performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” recited in claim 6, applicant’s specification is silent regarding any “performing operation means”. Further, aside from mentioning that “the present application further provides a [sic – an]  any “performing operation means by using [sic – that uses] an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm”. Aside from repeating some of the language of claim 6 on pages 4-6 and the above-noted “operating means” language on page 4 of the specification, there is no other description of the “performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” recited in claim 6. Further, aside from mentioning generic “general or special computer system environments or configurations, such as, personal computer, server computer, handheld or portable device” (specification, page 16, line 30-page 17, line 2), the specification is silent regarding any special-purpose computer hardware, let alone any special-purpose computer hardware for the “performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” as recited in claim 6. That is, the specification does not disclose any structure or specific hardware for the claimed performing operation means other than generally stating that “The present disclosure may be applied in many general or special computer system environments or configurations, such as, personal computer, server computer, handheld or portable device, flat type device, multiprocessor system, microprocessor-based system, set-top box, programmable consumer electronic device, network PC, minicomputer, mainframe computer, distributed computing environment including any 
Regarding the “acceleration means by an acceleration chip for accelerating a deep neural network algorithm” recited in claim 16, aside from mentioning that “the present application further provides a [sic – an] acceleration means by an acceleration chip for accelerating a deep neural network algorithm” and merely repeating some of the claim language in the amended specification (see, e.g., pages 8-9), the specification does not mention, let alone describe, any structure for the “acceleration means” recited in claim 16. Further, the specification does not identify corresponding algorithms or special-purpose computer hardware for any “acceleration means by an acceleration chip for accelerating a deep neural network algorithm”. Aside from merely repeating the language of claim 16 on pages 8-9, there is no other description of the “acceleration means” recited in claim 16. Further, aside from mentioning generic “general or special computer system environments or configurations, such as, personal computer, server computer, handheld or portable device” (specification, page 16, line 30-page 17, line 2), the specification is silent regarding any special-purpose computer hardware, much less any special-purpose computer hardware for the “acceleration means by an acceleration chip for accelerating a deep neural network algorithm” as recited in claim 16. That is, the specification does not disclose any structure or specific hardware for the claimed acceleration means other than generally stating that “The present disclosure may be applied in many general or special computer 
Regarding the step of vector addition processing operation, step of vector function value operation, and step of vector multiply-add operation limitations recited in claims 6 and 16, aside from merely repeating the claim language in the amended specification (see, e.g., pages 5-6, 9-10 and 15-16), the specification does not describe any structure for the “step of” limitations recited in claims 6 and 16. Further, besides repeating the claim language with reference to the high-level flow chart of FIG. 4 (see, e.g., page 15), the specification does not identify corresponding algorithms or special-purpose computer hardware for any “acceleration means by an acceleration chip for accelerating a deep neural network algorithm”. Aside from merely repeating the language of the step limitations of claims 6 and 16 on pages 5-6, 9-10 and 15-16, there is no other description of the “step of” limitations for vector addition processing operation, vector function value operation, and vector multiply-add operation recited in claims 6 and 16. That is, the specification does not identify corresponding algorithms or special-purpose computer hardware for performing the “vector addition processing operation”, the “vector function value operation”, and the “vector multiply-add operation”. Nor does the specification disclose any structure or specific hardware for the claimed step of vector addition processing operation, step of vector function value operation, and step of vector multiply-add operation other than generally stating that “The present disclosure may be applied in many general or special computer system environments or configurations, such as, personal computer, server computer, handheld or portable device, flat type device, multiprocessor system, microprocessor-based system, set-top box, programmable consumer electronic device, network PC, minicomputer, mainframe computer, distributed computing environment including any system or device thereof, and the like.” (see, page 16, line 30-page 17, line 5).
Accordingly, for these “step of” claim limitations, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm(s). For more information, see MPEP § 2181.
Regarding the vector addition processor module for performing addition or subtraction of a vector, vector function value arithmetic unit module for performing a vectorized operation and vector multiplier-adder module for performing a multiply-add operation on vectors limitations recited in claims 1, 6, 11 and 16, aside from merely repeating the claim language in the amended specification  any of the claimed modules. Aside from merely repeating the language of the module limitations of claims 1, 6, 11 and 16 on pages 5-6, 9-10 and 15-16 and the above-noted references to the “three functional modules” on page 11, there is no other description of the “vector addition processor module”, “vector function value arithmetic unit module” and “vector multiplier-adder module” recited in claims 1, 6, 11 and 16. That is, the specification does not identify corresponding algorithms or special-purpose computer hardware for “performing addition or subtraction of a vector”, “performing a vectorized operation” and “performing a multiply-add operation on vectors”.
Moreover, apart from generally stating “In the present disclosure, ‘module’, ‘apparatus’ and ‘system’ refer to the related physical objects applied to the computer, such as, hardware, combination of hardware and software, software or software in execution, etc.” and mentioning generic “general or special computer system environments or configurations, such as, personal specific hardware for the claimed vector addition processor module, vector function value arithmetic unit module and vector multiplier-adder module other than generally stating “The present disclosure may be described in general context of the computer executable instructions executed by the computer, such as, a program module” where any "module" may “refer to the related physical objects applied to the computer, such as, hardware, combination of hardware and software, software or software in execution, etc.” and “The present disclosure may be applied in many general or special computer system environments or configurations, such as, personal computer, server computer, handheld or portable device, flat type device, multiprocessor system, microprocessor-based system, set-top box, programmable consumer electronic device, network PC, minicomputer, mainframe computer, distributed computing environment including any system or device thereof, and the like.” (see, page 16, line 30-page 17, line 13). Accordingly, for these “module” claim limitations, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm(s). For more information, see MPEP § 2181.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) applicant may:  (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed 

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. 
Independent claims 1, 6, 11 and 16 contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention. 
In particular, and as previously noted, the claim limitations “a vector addition processor module for performing addition or subtraction of a vector”, “a vector function value arithmetic unit module for performing a vectorized operation” and “a vector multiplier-adder module for performing a multiply-add operation on vectors” in claims 1, 6, 11 and 16 invoke 35 U.S.C. 112(f). However, as noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing each of the above-identified claimed functions and to clearly link the structure, material, or acts to the function. In particular, for each of the claimed 
In particular, and as also previously noted, the claim limitations “a step of vector addition processing operation”, “a step of vector function value operation”, and “a step of vector multiply-add operation” in claims 6 and 16 invoke 35 U.S.C. 112(f). However, as noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing each of the above-identified claimed functions and to clearly link the structure, material, or acts to the function. In particular, for each of the claimed functions, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm. For more information, see MPEP § 2181. Accordingly, claims 6 and 16 are also rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement for this additional reason.
Further, in particular, and as previously noted above, the claim limitation “A performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” in claim 6 invokes 35 U.S.C. 112(f). However, as noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing the above-identified claimed function and to clearly link the structure, material, or acts to the function. In particular, for the claimed function, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm. 
Lastly, in particular, and as previously noted, the claim limitation an “acceleration means by an acceleration chip for accelerating a deep neural network algorithm” in claim 16 invokes 35 U.S.C. 112(f). However, as noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing the above-identified claimed function and to clearly link the structure, material, or acts to the function. In particular, for the claimed function, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm. For more information, see MPEP § 2181. Accordingly, claim 16 is additionally rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement for this reason.
Claims 2-5, 7-10, 12-15, and 17-20, which depend directly or indirectly from claims 1, 6, 11 and 16, respectively, are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement under the same rationale as claims 1, 6, 11 and 16.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
 recited in claims 1, 6, 11 and 16 are indefinite. Therefore, claims 1, 6, 11 and 16 are indefinite and are rejected under 35 U.S.C. 112(b).
As additionally discussed above, the claim limitations “a step of vector addition processing operation”, “a step of vector function value operation”, and “a step of vector multiply-add operation” in claims 6 and 16 invoke 35 U.S.C. 112(f). However, as further noted above, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the 
As previously noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing each of the claimed functions and to clearly link the structure, material, or acts to each function for the following claim limitation: “A performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” in claim 6. Therefore, because this claim limitation is indefinite, claim 6 is rejected under 35 U.S.C. 112(b) as being indefinite for at least this additional reason.
As further noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing each of the claimed functions and to clearly link the structure, material, or acts to each function for the following claim limitation: an “acceleration means by an acceleration chip for accelerating a deep neural network algorithm” in claim 16. Therefore, because this claim limitation is indefinite, claim 16 is rejected under 35 U.S.C. 112(b) as being indefinite for at least this additional reason.
Also, the recitation of “A performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm” in lines 1-2 of independent claim 6 is grammatically incorrect and unclear. In particular, it is unclear how the recited “performing operation means by using” [sic - uses?] the recited “an operation apparatus for an acceleration chip” to accelerate the recited “deep neural network algorithm”. Appropriate correction is required.

Also, claims 11 and 16 recite “the data bus” in lines 3-4. These recitations lack antecedent basis. Applicant did not previously introduce any “data bus” or any other “bus” in these claims. For examination purposes, the recitations of “the data bus” in these claims are being interpreted as any data bus. Appropriate correction is required.
Additionally, lines 19-21 of claim 6 recite “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm”. However, applicant previously introduced “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” in lines 4-5 of this claim. It is unclear whether the subsequently-recited “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” refers to the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm” or to another “vector addition processor module” that performs addition or subtraction of another, second vector, and/or another, second “vectorized operation of” another, second “pooling layer algorithm”. If supported by the original specification, the examiner suggests that one possible way to clarify the claim would be to amend the recitation of “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” to recite “[[a]] the vector addition processor module performs addition or subtraction of [[a]] the vector, and/or [[a]] the vectorized operation of [[a]] the pooling layer algorithm”. For examination purposes, “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” is being interpreted as the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm”. Appropriate correction is required.
Further, lines 23-25 of claim 6 recite “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction”. However, applicant previously introduced “a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm” in lines 6-7 of this claim. It is unclear whether the subsequently-recited “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction” refers to the previously-introduced “vector function value arithmetic unit module for performing” the previously-introduced “vectorized operation of” the previously-introduced “non-linear evaluation in the deep neural network algorithm” or to another “vector function value arithmetic unit module” that performs another “vectorized operation of” another “non-linear evaluation in the deep neural network algorithm”. If supported by the original specification, the examiner suggests that one possible way to clarify the claim would be to amend the recitation of “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction” to recite “[[a]] the vector function value arithmetic unit module performs [[a]] the vectorized operation of [[a]] the non-linear evaluation in the deep neural network algorithm according to an instruction”. For examination purposes, “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm” is being interpreted as the previously-introduced “vector function value arithmetic unit module for performing” the previously-introduced “vectorized operation of” the previously-introduced “non-linear evaluation in the deep neural network algorithm”. Appropriate correction is required.
Also, lines 26-27 of claim 6 recite “a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction”. However, applicant previously introduced “a vector multiplier-adder module for performing a multiply-add operation on vectors” in line 8 of this claim. It is unclear whether the subsequently-recited “a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction” refers to the previously-introduced “vector multiplier-adder module for performing” the previously-introduced “multiply-add operation on vectors” or to another “vector multiplier-adder module for performing” another “multiply-add operation on the vector”. If supported by the original specification, the examiner suggests that one possible way to clarify the claim would be to amend the recitation of “a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction” to recite “[[a]] the vector multiplier-adder module performs [[a]] the multiply-add operation on the vector according to an instruction”. For examination purposes, “a vector multiplier-adder module performs a multiply-add operation on the vector” is being interpreted as the previously-introduced “vector multiplier-adder module” that performs the previously-introduced “multiply-add operation” on the vector. Appropriate correction is required.
Claims 9, 10, 19 and 20 each recite “the flag which is returned” (see, e.g., the last two lines of each of these claims). These recitations lack antecedent basis. Applicant did not previously introduce any “flag” in these claims, their respective intervening claims, claims 8 and 18, or in their respective base claims, claims 6 and 16. For examination purposes, the recitations of “the flag which is returned” are being interpreted as “[[the]] a flag which is returned”. Appropriate correction is required.
Claims 9 and 19 both recite “the unwritten positions of a storage block” (see, e.g., lines 5-6 of claim 9 and line 4 of claim 19). These recitations lack antecedent basis. Applicant did not previously introduce any “unwritten positions” or any other “positions of a storage block” in these claims, their respective intervening claims, claims 8 and 18, or in their respective base claims, claims 6 and 16. For examination purposes, the recitations of “the unwritten positions of a storage block” are being interpreted as “[[the]] an unwritten positions of a storage block”. Appropriate correction is required.
Claims 10 and 20 both recite “the written positions of a storage block” (see, e.g., line 5 of claim 10 and line 4 of claim 20). These recitations lack antecedent basis. Applicant did not previously introduce any “written positions” or any other “positions of a storage block” in these claims, their respective intervening claims, claims 8 and 18, or in their respective base claims, claims 6 and 16. For examination purposes, the recitations of “the written positions of a storage block” are being interpreted as “[[the]] a written positions of a storage block”. Appropriate correction is required.
In claims 9 and 19, the recitations of “in the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation, if the unwritten positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this reading request, and the flag which is returned to indicate the success of data reading is invalid.” are rife with grammatical errors and are unclear. For example, due to the grammatical errors, conditional and passive language, it is unclear under what condition the recited “reading request” is refused, and it is also unclear which claim element or component returns the recited “flag”. That is, it is unclear which of the three recited steps (presumably all three, as drafted), cause the “reading request” to be refused, how the “flag” is returned and what “the flag” indicates (i.e., that the “success” is invalid – reading failed or was unsuccessful, the “data reading” failed, or that the “data” from the “reading” is invalid). For examination purposes, “if the unwritten positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this reading request, and the flag which is returned to indicate the success of data reading is invalid” is being interpreted as if any unwritten positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value 
In claims 10 and 20, the recitations of “in the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation, if the written positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this writing request, and the flag which is returned to indicate the success of data reading is invalid.” are rife with grammatical errors and are unclear. For example, due to the grammatical errors, conditional and passive language, it is unclear under what condition the recited “writing request” is refused, and it is also unclear which claim element or component returns the recited “flag”. That is, it is unclear which of the three recited steps (presumably all three, as drafted), cause the “writing request” to be refused, how the “flag” is returned and what “the flag” indicates (i.e., that the “success” is invalid – writing or reading failed or was unsuccessful, the “data reading” [sic – writing] failed, or that the “data” from the “reading” [sic – writing] is invalid). It also appears that the recitation of “the success of the data reading” should read “the success of the data writing 
The recitation of “A acceleration means by an acceleration chip for accelerating a deep neural network algorithm” in lines 1-2 of independent claim 16 is grammatically incorrect and unclear. In particular, it is unclear how the recited “acceleration means by” [sic - uses?] the recited “an acceleration chip” to accelerate the recited “deep neural network algorithm”. Appropriate correction is required.
Also, lines 20-22 of independent claim 16 recite “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm”. However, applicant previously introduced “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” in lines 5-6 of this claim. It is unclear whether the subsequently-recited “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” refers to the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm” or to another “vector addition processor module” that performs addition or subtraction of another, second vector, and/or another, second “vectorized operation of” another, second “pooling layer algorithm”. If supported by the original specification, the examiner suggests that one possible way to clarify the claim would be to amend the recitation of “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” to recite “[[a]] the vector addition processor module performs addition or subtraction of [[a]] the vector, and/or [[a]] the vectorized operation of [[a]] the pooling layer algorithm”. For examination purposes, “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” is being interpreted as the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm”. Appropriate correction is required.
Additionally, lines 23-25 of claim 16 recite “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction”. However, applicant previously introduced “a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm” in lines 7-8 of this claim. It is unclear whether the subsequently-recited “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction” refers to the previously-introduced “vector function value arithmetic unit module for performing” the previously-introduced “vectorized operation of” the previously-introduced “non-linear evaluation in the deep neural network algorithm” or to another “vector function value arithmetic unit module” that performs another “vectorized operation of” another “non-linear evaluation in the deep neural network algorithm”. If supported by the original specification, the examiner suggests that one possible way to clarify the claim would be to amend the recitation of “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction” to recite “[[a]] the vector function value arithmetic unit module performs [[a]] the vectorized operation of [[a]] the non-linear evaluation in the deep neural network algorithm according to an instruction”. For examination purposes, “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm” is being interpreted as the previously-introduced “vector function value arithmetic unit module for performing” the previously-introduced “vectorized operation of” the previously-introduced “non-linear evaluation in the deep neural network algorithm”. Appropriate correction is required.
Further, lines 26-27 of claim 16 recite “a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction”. However, applicant previously introduced “a vector multiplier-adder module for performing a multiply-add operation on vectors” in line 9 of this claim. It is unclear whether the subsequently-recited “a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction” refers to the previously-introduced “vector multiplier-adder module for performing” the previously-introduced “multiply-add operation on vectors” or to another “vector multiplier-adder module for performing” another “multiply-add operation on the vector”. If supported by the original specification, the examiner suggests that one possible way to clarify the claim would be to amend the recitation of “a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction” to recite “[[a]] the vector multiplier-adder module performs [[a]] the multiply-add operation on the vector according to an instruction”. For examination purposes, “a vector multiplier-adder module performs a multiply-add operation on the vector” is being interpreted as the previously-introduced “vector multiplier-adder module” that performs the previously-introduced “multiply-add operation” on the vector. Appropriate correction is required.
Claims 4 and 14 both recite “wherein the random access memory is configured to store the intermediate values produced itself from each of the neurons and the variation amount of the synaptic weight”, which is grammatically incorrect and unclear. In particular, it is unclear what “values produced itself” refers to. For examination purposes, “wherein the random access memory is configured to store the intermediate values produced itself from each of the neurons” is being interpreted as wherein the random access memory is configured to store the intermediate values from each of the neurons. Appropriate correction is required.

Also, claims 2-5, 7-10, 12-15, and 17-20, which depend directly or indirectly from claims 1, 6, 11 and 16, respectively, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claims 1, 6, 11 and 16.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-8 and 11-18 are rejected under 35 U.S.C. 103 as being unpatentable over non-patent literature Chen et al. ("Dadiannao: A machine-learning supercomputer." 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE,  A1, hereinafter “Burger”).
Although the Chen reference appears to be a grace-period disclosure by some of the inventors (Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang and Tianshi Chen), the 35 U.S.C. 102(b)(1)(A) exception does not apply because the reference was authored in part by Liqiang He, Jia Wang, Ling Li, Zhiwei Xu, Ninghui Sun and Olivier Temam, who are not named as inventors of the instant application. See MPEP § 2153.01(a): “If ... the application names fewer joint inventors than a publication (e.g., the application names as joint inventors A and B, and the publication names as authors A, B and C), it would not be readily apparent from the publication that it is by the inventor (i.e., the inventive entity) or a joint inventor and the publication would be treated as prior art under AIA 35 U.S.C. 102(a)(1).”
With respect to claim 1, Chen discloses the invention as claimed including an operation apparatus for an acceleration chip for accelerating a deep neural network (intended use language with no patentable weight – aside from the preamble, the acceleration chip or the accelerating is not recited elsewhere in the claim or its dependent claims) (see, e.g., FIG. 3 – showing “Block diagram of the DianNao accelerator” – an operation apparatus, Abstract and page 612, “The state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs)”, “the DianNao accelerator for the fast and low-energy execution of the inference of large CNNs and DNNs … architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU)” [i.e., , comprising:
a vector addition processor module for performing addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network (as indicated above, aside from repeating the claim language (see, pages 5-6, 9-10 and 15-16) and stating “In the present disclosure, ‘module’, ‘apparatus’ and ‘system’ refer to the related physical objects applied to the computer, such as, hardware, combination of hardware and software, software or software in execution, etc.” (see, page 17, lines 11-13), applicant’s specification does not specifically define “a vector addition processor module”. Therefore, “a vector addition processor module”, under its broadest reasonable interpretation (BRI), in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIGs. 7 and 8 – showing pipeline configurations including pooling layers in a deep neural network and mapping of neurons of a pooling layer, and pages 611-613, “A pooling layer computes the max or average over a number of neighbor points, e.g., out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a vectorized operation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing additions of synaptic and neural values/vector addition], “The general architecture is a ;
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network (as indicated above, aside from repeating the claim language (see, pages 5-6, 9-10 and 15-16) and stating “In the present disclosure, ‘module’, ‘apparatus’ and ‘system’ refer to the related physical objects applied to the computer, such as, hardware, combination of hardware and software, software or software in execution, etc.” (see, page 17, lines 11-13), applicant’s specification does not specifically define “a vector function value arithmetic unit module”. Therefore, “a vector function value arithmetic unit module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform an operation of a non-linear evaluation of a set of neural values/a vector) (see, e.g., page 613, “The general architecture is a set of nodes, one per chip … Each node contains … neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions” [i.e., a neural computational unit/module for performing non-linear transfer functions/a vectorized operation of a non-linear evaluation]) … ; and
a vector multiplier-adder module for performing a multiply-add operation on vectors (as indicated above, aside from repeating the claim language (see, pages 5-6, 9-10 and 15-16) and stating “In the present disclosure, ‘module’, ‘apparatus’ and ‘system’ refer to the related physical objects applied to the computer, such as, ;
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured to execute programmable instructions and interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., FIG. 3 – showing that the “NFU” units/nodes/modules of the “DianNao accelerator” receive and execute “Instructions” and pages 610, 612 and 615-616, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM, and the router fabric”, “The architecture contains buffers for caching input/output neurons and , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are all provided with an intermediate value storage region respectively for storing a vectorized intermediate value calculated according to the instructions (page 4, lines 5-8 and page 7, lines 27-30 of the specification state “the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured as a random access memory.” Therefore, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have eDRAM banks/storage regions] and pages 610 and 612, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU each include a respective eDRAM/an intermediate storage region for storing temporary/intermediate values in buffers]), and perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which ;
the means comprising the following steps:
a step of vector addition processing operation, in which a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network algorithm according to an instruction (as indicated above, “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” has been interpreted as the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm”. As also indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 3 - showing that the “NFU” units/nodes/modules of the “DianNao accelerator” receive and execute “Instructions”, FIGs. 7 and 8 – showing pipeline configurations including pooling layers in a deep neural network and steps for mapping of neurons of a pooling layer, Abstract, and pages 611-613 and 616, “state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs)” [i.e., a deep neural network algorithm], “A pooling layer computes the max or average over a number of neighbor points, e.g., out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a step of a vectorized operation/computation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing the step of addition of synaptic and neural values/vector addition], “The general architecture is a set of nodes, one per chip … Each node contains neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions implemented via linear interpolation), which we also call NFU” [i.e., neural computational units/modules include adders/adder trees for performing the step of vector addition], “The neural network configuration is implemented in the form of a sequence of node instructions, one sequence per node … These node instructions themselves drive the control of each tile” [i.e., according to an instruction]);
a step of vector function value operation, in which a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction (as indicated above, “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm” is being interpreted as the previously-introduced “vector function value arithmetic unit module for performing” the previously-introduced “vectorized operation of” the previously-introduced “non-linear evaluation in the deep neural network algorithm”. As also indicated above, “a vector ; and
a step of vector multiply-add operation, in which a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction (as indicated above, “a vector multiplier-adder module performs a multiply-add operation on the vector” has been interpreted as the previously-introduced “vector multiplier-adder module” that performs the previously-introduced “multiply-add operation” on the vector. As further indicated above, “a vector multiplier-adder module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform a multiplication-addition operation on set of neural values/a vector) (see, e.g., FIGs. 6 and 7 – depicting “The different (parallel) operators of an NFU: multipliers, adders” and “Multiply” and “Add” steps/operations on values of neural network layers [i.e., the NFU includes a unit/node/module to perform multiply-add operations on neural values/vectors] and page 616, “The spirit of a node or tile instruction is to perform the same layer computations (e.g., multiply-add-transf for ;
the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., pages 610, 612 and 615-616, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM, and the router fabric”, “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) … computations required to evaluate a neuron output”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU interact with each other for the steps of calculating neuron values and an output result of a neural network], “The neural network configuration is implemented in the form of a sequence of node instructions, one sequence per node … The spirit of a node or tile instruction is to perform the same layer , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vectorized intermediate values produced by the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation are stored in the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module (as indicated above, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have eDRAM banks/storage regions] and pages 610 and 612, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM”, “a classifier layer input can come from the , and the intermediate value storage regions may perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values) … the NFU must also write to the tile eDRAM” [i.e., the eDRAMs of the nodes perform read and write operations to read input values and write values to the central eDRAM/primary memory]).
Although Chen substantially discloses the claimed invention, Chen is not relied on for explicitly disclosing an operation apparatus … for accelerating a deep neural network algorithm, comprising:
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm; and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm.
In the same field, analogous art Burger teaches an operation apparatus … for accelerating a deep neural network algorithm (see, e.g., paragraph 307, “Service that may be implemented on an acceleration component … a deep neural network (DNN). … state-of-the-art DNN algorithms” [i.e., acceleration component operation apparatus for accelerating a DNN/deep neural network algorithm]), comprising:
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 58 – depicting NEURAL ENGINE 5802 with MULTIPLY-ACCUMULATOR COMPONENT 5812 and paragraphs 314 and 330, “the input activations are represented by a 4-tuple vector [x0 , x1 , x2 , x3]T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products”, “a neural engine 5802, which includes … a multiply-accumulate component 5812” [i.e., module to perform addition/summing of products from an input vector in the DNN algorithm]); and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform an operation of a non-linear evaluation of a set of neural values/a vector) (see, T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products, taking the input activation of each neuron and scaling it by a tunable weight parameter. The dot product is further transformed by a non-linear differentiable function such as hyperbolic tangent, sigmoid or other non-linear differentiable function”, “a neural engine 5802, which includes … a non-linear functions component 5814” [i.e., module for performing vectorized operation of a non-linear evaluation in the DNN/deep neural network algorithm]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “custom multi-chip machine-learning architecture” for implementing “machine-learning algorithms” such as “Convolutional and Deep Neural Networks (CNNs and DNNs)” of Chen (See, Chen, Abstract) to incorporate the teachings of Burger to provide a method “for processing on an acceleration component a deep neural network” and a high bandwidth “High BW Service that may be implemented on an acceleration component that includes high bandwidth, low power memory using die stacking techniques” for “a deep neural network (DNN)” and “state-of-the-art DNN algorithms” (See, e.g., Burger, Abstract and paragraph 307). Doing so would have allowed Chen to use Burger’s method that “includes configuring the acceleration component to perform forward propagation and backpropagation stages of the deep neural network” in order “to achieve state-of-the-art 

With regard to independent claim 6, Chen discloses the invention as claimed including a performing operation means by using an operation apparatus for an acceleration chip for accelerating a deep neural network algorithm (intended use language with no patentable weight – aside from the preamble, the acceleration chip or the accelerating is not recited elsewhere in the claim or its dependent claims) (see, e.g., FIG. 3 – showing “Block diagram of the DianNao accelerator” – an operation apparatus, Abstract and page 612, “The state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs)”, “the DianNao accelerator for the fast and low-energy execution of the inference of large CNNs and DNNs … architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU)” [i.e., the DianNao accelerator is an operation apparatus for accelerating a DNN/deep neural network]), the operation apparatus comprising:
a vector addition processor module for performing addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIGs. 7 and 8 – showing pipeline configurations including pooling layers in a deep neural network and mapping of neurons of a pooling layer, and pages 611-613, “A pooling layer computes the max or average over a number of neighbor points, e.g., out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a vectorized operation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing additions of synaptic and neural values/vector addition], “The general architecture is a set of nodes, one per chip … Each node contains neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions implemented via linear interpolation), which we also call NFU” [i.e., neural computational units/modules include adders/adder trees for performing vector addition]) … ;
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform an operation of a non-linear evaluation of a set of neural values/a vector) (see, e.g., page 613, “The general architecture is a set of nodes, one per chip … Each node contains … neural computational units (the classic pipeline of multipliers, adder trees and non-linear ; and
a vector multiplier-adder module for performing a multiply-add operation on vectors (as indicated above, “a vector multiplier-adder module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform a multiplication-addition operation on set of neural values/a vector) (see, e.g., FIGs. 6 and 7 – depicting “The different (parallel) operators of an NFU: multipliers, adders” and “Multiply” and “Add” operations on values of neural network layers [i.e., the NFU includes a unit/node/module to perform multiply-add operations on neural values/vectors] and page 616, “The spirit of a node or tile instruction is to perform the same layer computations (e.g., multiply-add-transf for classifier layers) on a set of contiguous input data (input neurons in the forward phase, output neurons, gradients or synapses in the backward phase).” [i.e., node/module to perform multiply-add operations on neural data values/vectors]);
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured to execute programmable instructions and interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., FIG. 3 – showing that the “NFU” units/nodes/modules of the “DianNao accelerator” receive and execute “Instructions” and pages 610, 612 and 615-616, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM, and the router , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are all provided with an intermediate value storage region respectively for storing a vectorized intermediate value calculated according to the instructions (page 4, lines 5-8 and page 7, lines 27-30 of the specification state “the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured as a random access memory.” Therefore, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have eDRAM banks/storage regions] and pages 610 and 612, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU each include a respective eDRAM/an intermediate storage region for storing temporary/intermediate values in buffers]), and perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write ;
the means comprising the following steps:
a step of vector addition processing operation, in which a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network algorithm according to an instruction (as indicated above, “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” has been interpreted as the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm”. As also indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 3 – showing that the “NFU” units/nodes/modules of the “DianNao accelerator” receive and execute “Instructions”, FIGs. 7 and 8 – showing pipeline configurations including pooling layers in a deep neural network and steps for mapping of neurons of a pooling layer, Abstract, and pages 611-613 and 616, “state-of-out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a step of a vectorized operation/computation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing the step of addition of synaptic and neural values/vector addition], “The general architecture is a set of nodes, one per chip … Each node contains neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions implemented via linear interpolation), which we also call NFU” [i.e., neural computational units/modules include adders/adder trees for performing the step of vector addition], “The neural network configuration is implemented in the form of a sequence of node instructions, one sequence per node … These node instructions themselves drive the control of each tile” [i.e., according to an instruction]);
a step of vector function value operation, in which a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction (as indicated above, “a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm” is being interpreted as the previously-introduced “vector function value arithmetic unit module for performing” ; and
a step of vector multiply-add operation, in which a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction (as indicated above, “a vector multiplier-adder module performs a multiply-add operation on the vector” has been interpreted as the previously-introduced “vector multiplier-adder module” that performs the previously-introduced “multiply-add operation” on the vector. As further indicated above, “a vector multiplier-adder module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform a multiplication-addition operation on set of neural values/a vector) (see, e.g., FIGs. 6 and 7 – depicting “The different (parallel) operators of an NFU: multipliers, adders” and “Multiply” and “Add” steps/operations on values of neural network layers [i.e., the NFU includes a unit/node/module to perform ;
the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., pages 610, 612 and 615-616, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM, and the router fabric”, “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) … computations required to evaluate a neuron output”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU interact with each other for the steps of calculating neuron values and an output result of a neural network], “The neural network , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vectorized intermediate values produced by the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation are stored in the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module (as indicated above, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have eDRAM banks/storage regions] and , and the intermediate value storage regions may perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values) … the NFU must also write to the tile eDRAM” [i.e., the eDRAMs of the nodes perform read and write operations to read input values and write values to the central eDRAM/primary memory]).
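For illustration only, and not as a characterization of Chen's hardware, the operations quoted above from Chen (the pooling expression and the multiply, adder-tree and non-linear transfer stages of the NFU) can be sketched in Python. The function names max_pool and neuron_output, the stride of 1, the half-open window bounds and the use of tanh as the default transfer function are assumptions made for this sketch; Chen describes the transfer function as implemented via linear interpolation, which is not modeled here.

```python
import math


def max_pool(feature_map, win_x, win_y):
    """Max pooling over one feature map with stride 1 (assumed).

    out[x][y] is the maximum of in[x + i][y + j] for 0 <= i < win_x
    and 0 <= j < win_y.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[x + i][y + j]
                for i in range(win_x)
                for j in range(win_y)
            )
            for y in range(cols - win_y + 1)
        ]
        for x in range(rows - win_x + 1)
    ]


def neuron_output(inputs, weights, transfer=math.tanh):
    """Three-stage evaluation of one neuron output."""
    # Stage 1: multiply each synaptic weight by its input neuron value.
    products = [w * x for w, x in zip(weights, inputs)]
    # Stage 2: add all products (an adder tree, modeled as a plain sum).
    total = sum(products)
    # Stage 3: apply the non-linear transfer function.
    return transfer(total)


pooled = max_pool([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2, 2)
value = neuron_output([1.0, 2.0], [0.5, 0.25])
```

The list named products in stage 1 and the scalar named total in stage 2 play the role of intermediate values in this sketch; how any particular hardware buffers such values is outside its scope.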
Although Chen substantially discloses the claimed invention, Chen is not relied on for explicitly disclosing an operation apparatus … for accelerating a deep neural network algorithm, comprising:
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm; and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm.
In the same field, analogous art Burger teaches an operation apparatus … for accelerating a deep neural network algorithm (see, e.g., paragraph 307, “Service that may be implemented on an acceleration component … a deep neural network (DNN). … state-of-the-art DNN algorithms” [i.e., acceleration component operation apparatus for accelerating a DNN/deep neural network algorithm]), comprising:
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 58 – depicting NEURAL ENGINE 5802 with MULTIPLY-ACCUMULATOR COMPONENT 5812 and paragraphs 314 and 330, “the input activations are represented by a 4-tuple vector [x0 , x1 , x2 , x3]T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products”, “a neural engine 5802, which includes … a multiply-accumulate component 5812” [i.e., module to perform addition/summing of products from an input vector in the DNN algorithm]); and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products, taking the input activation of each neuron and scaling it by a tunable weight parameter. The dot product is further transformed by a non-linear differentiable function such as hyperbolic tangent, sigmoid or other non-linear differentiable function”, “a neural engine 5802, which includes … a non-linear functions component 5814” [i.e., module for performing vectorized operation of a non-linear evaluation in the DNN/deep neural network algorithm]).
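For illustration only, the layer computation quoted above from Burger (a weighted sum of products over a 4-tuple input vector, further transformed by a non-linear differentiable function such as hyperbolic tangent or sigmoid) can be sketched in Python. The names multiply_accumulate, sigmoid and layer_forward and the example weights are hypothetical and are not taken from Burger; the sketch does not model the neural engine of FIG. 58.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, one of the non-linear functions named in the quote."""
    return 1.0 / (1.0 + math.exp(-v))


def multiply_accumulate(weights, activations):
    """Weighted sum of products for one neuron (a dot product)."""
    return sum(w * a for w, a in zip(weights, activations))


def layer_forward(activations, weight_rows, nonlinear=math.tanh):
    """Output activations of Layer i from the input vector of Layer i-1.

    Each row of weight_rows holds the tunable weights of one neuron.
    """
    return [
        nonlinear(multiply_accumulate(row, activations))
        for row in weight_rows
    ]


# A 4-tuple input vector [x0, x1, x2, x3] feeding two neurons (weights assumed).
x = [1.0, 2.0, 3.0, 4.0]
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
outputs = layer_forward(x, weights)
```

Keeping the accumulation and the non-linear step in separate functions mirrors the separation between a multiply-accumulate component and a non-linear functions component in the quoted passage, but the split here is only a design choice of the sketch.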
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “custom multi-chip machine-learning architecture” for implementing “machine-learning algorithms” such as “Convolutional and Deep Neural Networks (CNNs and DNNs)” of Chen (See, Chen, Abstract) to incorporate the teachings of Burger to provide a method “for processing on an acceleration component a deep neural network” and a high bandwidth “High BW Service that may be implemented on an acceleration component that includes high bandwidth, low power memory using die stacking techniques” for “a deep neural network (DNN)” and “state-of-the-art DNN algorithms” (See, e.g., Burger, Abstract and paragraph 307). Doing so would have allowed Chen to use Burger’s method that 

With regard to independent claim 11, Chen discloses the invention as claimed including an acceleration chip for accelerating a deep neural network algorithm (intended use language with no patentable weight – aside from the preamble, the acceleration chip or the accelerating is not recited elsewhere in the claim or its dependent claims) (see, e.g., FIG. 3 – showing “Block diagram of the DianNao accelerator” – an operation apparatus, Abstract and page 612, “The state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs)”, “the DianNao accelerator for the fast and low-energy execution of the inference of large CNNs and DNNs … architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU)” [i.e., the DianNao accelerator is an operation apparatus for accelerating a DNN/deep neural network]) …, the acceleration chip comprising:
a primary memory for performing read and write operations (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write ;
a vector addition processor module for performing addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIGs. 7 and 8 – showing pipeline configurations including pooling layers in a deep neural network and mapping of neurons of a pooling layer, and pages 611-613, “A pooling layer computes the max or average over a number of neighbor points, e.g., out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a vectorized operation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing additions of synaptic and neural values/vector addition], “The general architecture is a set of nodes, one per chip … Each node contains neural computational units (the classic pipeline of multipliers, adder ;
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform an operation of a non-linear evaluation of a set of neural values/a vector) (see, e.g., page 613, “The general architecture is a set of nodes, one per chip … Each node contains … neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions” [i.e., a neural computational unit/module for performing non-linear transfer functions/a vectorized operation of a non-linear evaluation]) … ; and
a vector multiplier-adder module for performing a multiply-add operation on vectors (as indicated above, “a vector multiplier-adder module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform a multiplication-addition operation on set of neural values/a vector) (see, e.g., FIGs. 6 and 7 – depicting “The different (parallel) operators of an NFU: multipliers, adders” and “Multiply” and “Add” operations on values of neural network layers [i.e., the NFU includes a unit/node/module to perform multiply-add operations on neural values/vectors] and page 616, “The spirit of a node or tile instruction is to perform the same layer computations (e.g., multiply-add-transf for classifier layers) on a set of contiguous input data (input neurons in the forward phase, output neurons, gradients or ;
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured to execute programmable instructions and interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., FIG. 3 – showing that the “NFU” units/nodes/modules of the “DianNao accelerator” receive and execute “Instructions” and pages 610, 612 and 615-616, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM, and the router fabric”, “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) … computations required to evaluate a neuron output”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU interact with each other to calculate neuron values and an output result of a neural network], “The neural network configuration is implemented in the form of a sequence of node instructions, one sequence per node … These node instructions themselves drive the control of each tile; the control circuit of each node generates tile instructions and sends them to each tile. The spirit of a node or tile instruction is to perform the same layer computations (e.g., multiply-add-transf for , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are all provided with an intermediate value storage region respectively for storing a vectorized intermediate value calculated according to the instructions (page 4, lines 5-8 and page 7, lines 27-30 of the specification state “the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured as a random access memory.” Therefore, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have eDRAM banks/storage regions] and pages 610 and 612, “an architecture, composed of interconnected nodes, , and perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values) … the NFU must also write to the tile eDRAM” [i.e., perform read and write operations to read input values and write values to the central eDRAM/a primary memory]).

Although Chen substantially discloses the claimed invention, Chen is not relied on for explicitly disclosing an acceleration chip for accelerating a deep neural network algorithm, comprising:
a primary memory for performing read and write operations simultaneously via the data bus; 
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm; and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm.
In the same field, analogous art Burger teaches an acceleration chip for accelerating a deep neural network algorithm (see, e.g., paragraphs 58 and 307, “components shown in the figures can be implemented in any manner by any physical and tangible mechanisms, for instance, by … hardware ( e.g., chip-implemented logic functionality” [i.e., a chip], “Service that may be implemented on an acceleration component … a deep neural network (DNN). … state-of-the-art DNN algorithms” [i.e., acceleration chip/component for accelerating a DNN/deep neural network algorithm]), comprising:
a primary memory for performing read and write operations simultaneously via the data bus (as indicated above, “the data bus” has been interpreted as any data bus) (see, paragraphs 220, 243 and 263, “One or more communication buses 3124 communicatively couple the above-described components together.” [i.e., a data bus], “the acceleration component can automatically generate duplicate versions of itself, which thereupon operate in parallel” [i.e., perform operations in parallel/simultaneously], “acceleration component 2502 is coupled to local memory 2522 (e.g., DDR3 or DDR4 DRAM devices, such as traditional DIMMS), via a multi-channel memory bus … and has a memory bandwidth of about 10 GB/sec” [i.e., a primary memory for performing read and write operations simultaneously via the data bus]);
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 58 – depicting NEURAL ENGINE 5802 with MULTIPLY-ACCUMULATOR COMPONENT 5812 and paragraphs 314 and 330, “the input activations are represented by a 4-tuple vector [x0 , x1 , x2 , x3]T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products”, “a neural engine 5802, which includes … a multiply-accumulate component 5812” [i.e., module to perform addition/summing of products from an input vector in the DNN algorithm]); and
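a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform an operation of a non-linear evaluation of a set of neural values/a vector) (see, e.g., FIG. 58 - depicting NON-LINEAR FUNCTIONS COMPONENT 5814 [i.e., vector function value unit module] and paragraphs 314 and 330, “the input activations are represented by a 4-tuple vector [x0 , x1 , x2 , x3]T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products, taking the input activation of each neuron and scaling it by a tunable weight parameter. The dot product is further transformed by a non-linear differentiable function such as hyperbolic tangent, sigmoid or other non-linear differentiable function”, “a neural engine 5802, which includes … a non-linear functions component 5814” [i.e., module for performing vectorized operation of a non-linear evaluation in the DNN/deep neural network algorithm]).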
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “custom multi-chip machine-learning architecture” for implementing “machine-learning algorithms” such as “Convolutional and Deep Neural Networks (CNNs and DNNs)” of Chen (See, Chen, Abstract) to incorporate the teachings of Burger to provide a method “for processing on an acceleration component a deep neural network” and a high bandwidth “High BW Service that may be implemented on an acceleration component that includes high bandwidth, low power memory using die stacking techniques” for “a deep neural network (DNN)” and “state-of-the-art DNN algorithms” (See, e.g., Burger, Abstract and paragraph 307). Doing so would have allowed Chen to use Burger’s method that “includes configuring the acceleration component to perform forward propagation and backpropagation stages of the deep neural network” in order “to achieve state-of-the-art accuracy on human recognition tasks such as image and speech recognition” while avoiding other hardware implementations of “DNN algorithms” that are “bottlenecked by the capabilities of commodity hardware”, as suggested by Burger (See, e.g., Burger, Abstract and paragraph 307). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

With regard to independent claim 16, Chen discloses the invention as claimed including an acceleration means by an acceleration chip for accelerating a deep neural network algorithm (intended use language with no patentable weight – aside from the preamble, the acceleration chip or the accelerating is not recited elsewhere in the claim or its dependent claims) (see, e.g., FIG. 3 – showing “Block diagram of the DianNao accelerator” – an operation apparatus, Abstract and page 612, “The state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs)”, “the DianNao accelerator for the fast and low-energy execution of the inference of large CNNs and DNNs … architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU)” [i.e., the DianNao accelerator is an operation apparatus for accelerating a DNN/deep neural network]) …, the acceleration chip comprising:
a primary memory for performing read and write operations (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values) … the NFU must also write to the tile eDRAM” [i.e., perform read and write operations to read input values and write values to the central eDRAM/a primary memory]) … ;
a vector addition processor module for performing addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIGs. 7 and 8 – showing pipeline configurations including pooling layers in a deep neural network and mapping of neurons of a pooling layer, and pages 611-613, “A pooling layer computes the max or average over a number of neighbor points, e.g., out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a vectorized operation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing additions of synaptic and neural values/vector addition], “The general architecture is a set of nodes, one per chip … Each node contains neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions implemented via linear interpolation), which we also call NFU” [i.e., neural computational units/modules include adders/adder trees for performing vector addition]) … ;
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light of the ; and
a vector multiplier-adder module for performing a multiply-add operation on vectors (as indicated above, “a vector multiplier-adder module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform a multiplication-addition operation on set of neural values/a vector) (see, e.g., FIGs. 6 and 7 – depicting “The different (parallel) operators of an NFU: multipliers, adders” and “Multiply” and “Add” operations on values of neural network layers [i.e., the NFU includes a unit/node/module to perform multiply-add operations on neural values/vectors] and page 616, “The spirit of a node or tile instruction is to perform the same layer computations (e.g., multiply-add-transf for classifier layers) on a set of contiguous input data (input neurons in the forward phase, output neurons, gradients or synapses in the backward phase).” [i.e., node/module to perform multiply-add operations on neural data values/vectors]);
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured to execute programmable instructions and interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are all provided with an intermediate value storage region respectively for storing a vectorized intermediate value calculated according to the instructions (page 4, lines 5-8 and page 7, lines 27-30 of the specification state “the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured as a random access memory.” Therefore, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have eDRAM banks/storage regions] and pages 610 and 612, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU each include a respective eDRAM/an intermediate storage region for storing temporary/intermediate values in buffers]), and perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values) … the NFU must also write to the tile eDRAM” [i.e., perform read and write operations to read input values and write values to the central eDRAM/a primary memory]);
the means comprising the following steps:
a step of vector addition processing operation, in which a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm in the deep neural network algorithm according to an instruction (as indicated above, “a vector addition processor module performs addition or subtraction of a vector, and/or a vectorized operation of a pooling layer algorithm” has been interpreted as the previously-introduced “vector addition processor module [that] performs addition or subtraction of” the previously introduced “vector”, “and/or” performs the previously-introduced “vectorized operation of” the previously-introduced “pooling layer algorithm”. As also indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 3 - showing that the “NFU” units/nodes/modules of the out(x, y)f = max 0≤kx≤Kx,0≤ky≤Ky in(x + kx, y + ky)f” [i.e., a step of a vectorized operation/computation of a pooling layer algorithm], “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) which is largely a pipelined version of the typical computations required to evaluate a neuron output: the multiplication of synaptic values by input neurons values in the first stage, additions of all these products in the second stage (adder trees)” [i.e., a unit/NFU for performing the step of addition of synaptic and neural values/vector addition], “The general architecture is a set of nodes, one per chip … Each node contains neural computational units (the classic pipeline of multipliers, adder trees and non-linear transfer functions implemented via linear interpolation), which we also call NFU” [i.e., neural computational units/modules include adders/adder trees for performing the step vector addition] “The neural network configuration is implemented in the form of a sequence of node instructions, one sequence per node … These node instructions themselves drive the control of each tile” [i.e., according to an instruction]);
a step of vector function value operation, in which a vector function value arithmetic unit module performs a vectorized operation of a non-linear evaluation in the deep neural network algorithm according to an instruction (as indicated ; and
a step of vector multiply-add operation, in which a vector multiplier-adder module performs a multiply-add operation on the vector according to an instruction (as indicated above, “a vector multiplier-adder module performs a multiply-add operation on the vector” has been interpreted as the previously-introduced “vector multiplier-adder module” that performs the previously-introduced “multiply-add operation” on the vector. As further indicated above, “a vector multiplier-adder module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform a multiplication-addition operation on set of ;
the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation interact with each other to calculate values of neurons and a network output result of a neural network (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., pages 610, 612 and 615-616, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM, and the router fabric”, “The architecture contains buffers for caching input/output neurons and synapses, and a Neural Functional Unit (NFU) … computations required to evaluate a neuron output”, “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even , and a variation amount of a synaptic weight representing the interaction strength of the neurons on an input layer to the neurons on an output layer (see, e.g., pages 609 and 613, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights represent strength of connections/interactions between neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer]);
the vectorized intermediate values produced by the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation are stored in the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module (as indicated above, “an intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 5 – depicting “a node , and the intermediate value storage regions may perform read and write operations on the primary memory (as indicated above, “the primary memory” has been interpreted as any memory) (see, e.g., pages 609 and 615, “the neurons and synapses (i.e., weights of connections between neurons) intermediate values have to be stored in main memory” [i.e., performing write operations on a main memory], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), … SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values) … the NFU must also write to the tile eDRAM” [i.e., the eDRAMs of the nodes perform read and write operations to read input values and write values to the central eDRAM/primary memory]).
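
For illustration only, the staged computation described in the passages of Chen cited above (multiplication of synaptic values by input neuron values, addition of the products, a non-linear transfer function, and max pooling) can be sketched as follows. The function names, the use of plain Python lists, the bias term, and the choice of a sigmoid as the non-linear function are assumptions made for this sketch; they are not taken from Chen, Burger, or the application.

```python
import math

# Hypothetical helper names; none of these come from the cited references.

def multiply_add(weights, inputs):
    """Multiply-add stage: one weighted sum of the inputs per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def vector_add(a, b):
    """Vector addition stage: element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def nonlinear(v):
    """Non-linear evaluation stage (sigmoid chosen here as an assumption)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def max_pool(v, window):
    """Pooling: maximum over consecutive, non-overlapping windows."""
    return [max(v[i:i + window]) for i in range(0, len(v), window)]

def classifier_layer(weights, bias, inputs):
    """Chain the three stages; 'partial' and 'summed' are intermediate values."""
    partial = multiply_add(weights, inputs)
    summed = vector_add(partial, bias)
    return nonlinear(summed)
```

In this sketch the intermediate values (`partial`, `summed`) exist only between stages, which loosely mirrors the role of the claimed intermediate value storage regions.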

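Although Chen substantially discloses the claimed invention, Chen is not relied on for explicitly disclosing an acceleration means by an accelerating chip for accelerating a deep neural network algorithm, comprising: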
a primary memory for performing read and write operations simultaneously via the data bus; 
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm; and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm.
In the same field, analogous art Burger teaches an acceleration means by an accelerating chip for accelerating a deep neural network algorithm (see, e.g., paragraphs 58 and 307, “components shown in the figures can be implemented in any manner by any physical and tangible mechanisms, for instance, by … hardware ( e.g., chip-implemented logic functionality” [i.e., a chip], “Service that may be implemented on an acceleration component … a deep neural network (DNN). … state-of-the-art DNN algorithms” [i.e., an acceleration component/means/apparatus for accelerating a DNN/deep neural network algorithm]), comprising:
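a primary memory for performing read and write operations simultaneously via the data bus (as indicated above, “the data bus” has been interpreted as any data bus) (see, paragraphs 220, 243 and 263, “One or more communication buses 3124 communicatively couple the above-described components together.” [i.e., a data bus], “the acceleration component can automatically generate duplicate versions of itself, which thereupon operate in parallel” [i.e., perform operations in parallel/simultaneously], “acceleration component 2502 is coupled to local memory 2522 (e.g., DDR3 or DDR4 DRAM devices, such as traditional DIMMS), via a multi-channel memory bus … and has a memory bandwidth of about 10 GB/sec” [i.e., a primary memory for performing read and write operations simultaneously via the data bus]);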
a vector addition processor module for performing addition or subtraction of a vector … in the deep neural network algorithm (as indicated above, “a vector addition processor module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform addition or subtraction on a vector/neural values, and/or an operation of a pooling layer values/vector) (see, e.g., FIG. 58 – depicting NEURAL ENGINE 5802 with MULTIPLY-ACCUMULATOR COMPONENT 5812 and paragraphs 314 and 330, “the input activations are represented by a 4-tuple vector [x0 , x1 , x2 , x3]T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products”, “a neural engine 5802, which includes … a multiply-accumulate component 5812” [i.e., module to perform addition/summing of products from an input vector in the DNN algorithm]); and
a vector function value arithmetic unit module for performing a vectorized operation of a non-linear evaluation in the deep neural network algorithm (as indicated above, “a vector function value arithmetic unit module”, under the BRI, in light of the specification, is any hardware or software component/object that is able to perform an operation of a non-linear evaluation of a set of neural values/a vector) (see, e.g., FIG. 58 - depicting NON-LINEAR FUNCTIONS COMPONENT 5814 [i.e., vector function value unit module] and paragraphs 314 and 330, “the input activations are represented by a 4-tuple vector [x0 , x1 , x2 , x3]T in Layer i-1. Every neuron in Layer i processes the input vector of Layer i-1 using an activation function and generates output activations of Layer i. Typically, the activation function is a weighted sum of products, taking the input activation of each neuron and scaling it by a tunable weight parameter. The dot product is further transformed by a non-linear differentiable function such as hyperbolic tangent, sigmoid or other non-linear differentiable function”, “a neural engine 5802, which includes … a non-linear functions component 5814” [i.e., module for performing vectorized operation of a non-linear evaluation in the DNN/deep neural network algorithm]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “custom multi-chip machine-learning architecture” for implementing “machine-learning algorithms” such as “Convolutional and Deep Neural Networks (CNNs and DNNs)” of Chen (See, Chen, Abstract) to incorporate the teachings of Burger to provide a method “for processing on an acceleration component a deep neural network” and a high bandwidth “High BW Service that may be implemented on an acceleration component that includes high bandwidth, low power memory using die stacking techniques” for “a deep neural network (DNN)” and “state-of-the-art DNN algorithms” (See, e.g., Burger, Abstract and paragraph 307). Doing so would have allowed Chen to use Burger’s method that “includes configuring the acceleration component to perform forward propagation and backpropagation stages of the deep neural network” in order “to achieve state-of-the-art accuracy on human recognition tasks such as image and speech recognition” while avoiding other hardware implementations of “DNN algorithms” that are “bottlenecked by the capabilities of commodity hardware”, as suggested by Burger (See, e.g., Burger, Abstract and paragraph 307). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Regarding claims 2, 7, 12 and 17, as discussed above, Chen in view of Burger teaches the apparatus of claim 1, the means of claim 6, the chip of claim 11, and the means of claim 16.
Although Chen substantially discloses the claimed invention, Chen is not relied on for explicitly disclosing wherein after the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module generate an output value, the intermediate values stored in the intermediate value storage regions are discarded.
In the same field, analogous art Burger teaches wherein after the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module generate an output value, the intermediate values stored in the intermediate value storage regions are discarded (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., paragraph 202, “application logic 2712 retrieves the data from input buffer 2710, processes it to generate an output result, and places the output result in an output buffer 2714. In operation (6), acceleration component 2704 copies the contents of output buffer 2714 into an output buffer in the host logic's memory. … 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “custom multi-chip machine-learning architecture” for implementing “machine-learning algorithms” such as “Convolutional and Deep Neural Networks (CNNs and DNNs)” of Chen (See, Chen, Abstract) to incorporate the teachings of Burger to provide a method “for processing on an acceleration component a deep neural network” and a high bandwidth “High BW Service that may be implemented on an acceleration component that includes high bandwidth, low power memory using die stacking techniques” for “a deep neural network (DNN)” and “state-of-the-art DNN algorithms” (See, e.g., Burger, Abstract and paragraph 307). Doing so would have allowed Chen to use Burger’s method that “includes configuring the acceleration component to perform forward propagation and backpropagation stages of the deep neural network” in order “to achieve state-of-the-art accuracy on human recognition tasks such as image and speech recognition” while avoiding other hardware implementations of “DNN algorithms” that are “bottlenecked by the capabilities of commodity hardware”, as suggested by Burger (See, e.g., Burger, Abstract and paragraph 307). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.


Chen further discloses wherein the intermediate value storage regions of the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module are configured as a random-access memory (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors) (see, e.g., FIG. 5 – depicting “a node (left) and tile architecture (right). A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have a random-access memory/RAM/eDRAM banks] and pages 610 and 612, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM” [i.e., a random-access memory/RAM], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU each include a respective - eDRAM/an intermediate storage region configured as RAM for storing temporary/intermediate values in buffers]).

Regarding claims 4 and 19, as discussed above, Chen in view of Burger teaches the apparatus of claim 3 and the chip of claim 13.
wherein the random access memory is configured to store the intermediate values produced itself from each of the neurons (as indicated above, “wherein the random access memory is configured to store the intermediate values produced itself from each of the neurons” has been interpreted as wherein the random access memory is configured to store the intermediate values from each of the neurons) (see, e.g., FIG. 5 – showing that “A node contains 16 tiles, two central eDRAM banks and … a tile has an NFU, four eDRAM banks and input/output interfaces to/from the central eDRAM banks” [i.e., nodes/modules have a random-access memory/RAM/eDRAM banks] and pages 610 and 612, “an architecture, composed of interconnected nodes, each containing computational logic, eDRAM” [i.e., a random-access memory/RAM], “a classifier layer input can come from the node central eDRAM (possibly after transfer from another node), or it can come from the two SRAM storages (16KB) which are used to buffer input and output neuron values, or even temporary values (such as neurons partial sums to enable reuse of input neurons values)” [i.e., nodes/modules of the NFU each include a respective - eDRAM/RAM for storing temporary/intermediate values, and input and output neuron values in buffers from each of the input and output neurons]) and the variation amount of the synaptic weight (see, e.g., pages 609, 613 and 615-616, “the neurons and synapses (i.e., weights of connections between neurons)” [i.e., synapse/synaptic weights for each of the neurons], “in a classifier layer, the No outputs are typically connected to all the Ni inputs, with one synaptic weight per connection” [i.e., variations in synaptic weights representing connection/interaction strengths of neurons Ni in an input layer and neurons No in an output layer], “the NFU must also write to the tile eDRAM after the 

Regarding claims 5, 8, 15 and 18, as discussed above, Chen in view of Burger teaches the apparatus of claim 1, the means of claim 6, the chip of claim 11, and the means of claim 16.
Although Chen substantially discloses the claimed invention, Chen is not relied on for explicitly disclosing wherein the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module access the intermediate value storage regions through an index.
In the same field, analogous art Burger teaches wherein the vector addition processor module, the vector function value arithmetic unit module and the vector multiplier-adder module access the intermediate value storage regions through an index (as indicated above, the claimed modules, under the BRI, are any hardware or software components/objects that are able to perform operations on sets of neural values/vectors. As further indicated above, an “intermediate value storage region”, under the BRI, in light of the specification, is any memory, such as RAM, usable to store values) (see, e.g., FIG. 10 – depicting data store 126 with service-to-address mapping [i.e., accessing a data store through an address mapping/index] and paragraphs 82, 86 and 218, “determination component 124 returns an address associated with the service, if that address is present in data store 126. The address may identify a particular acceleration component”, “SMC 128 may consult data store 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the disclosed “custom multi-chip machine-learning architecture” for implementing “machine-learning algorithms” such as “Convolutional and Deep Neural Networks (CNNs and DNNs)” of Chen (See, Chen, Abstract) to incorporate the teachings of Burger to provide a method “for processing on an acceleration component a deep neural network” and a high bandwidth “High BW Service that may be implemented on an acceleration component that includes high bandwidth, low power memory using die stacking techniques” for “a deep neural network (DNN)” and “state-of-the-art DNN algorithms” (See, e.g., Burger, Abstract and paragraph 307). Doing so would have allowed Chen to use Burger’s method that “includes configuring the acceleration component to perform forward propagation and backpropagation stages of the deep neural network” in order “to achieve state-of-the-art accuracy on human recognition tasks such as image and speech recognition” while avoiding other hardware implementations of “DNN algorithms” that are “bottlenecked by the capabilities of commodity hardware”, as suggested by Burger (See, e.g., Burger, Abstract and paragraph 307). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.

Allowable Subject Matter
Upon overcoming all of the objections and the rejections discussed above in items 8-19, claims 9-10 and 19-20 are objected to as being dependent upon a rejected base claim (i.e., claims 1 and 16), but would be allowable if amended to address the above-noted objections and rejections under 35 U.S.C. 112(a) and 112(b) and rewritten in independent form including all of the limitations of the base claim and any intervening claims (i.e., intervening claims 8 and 18).
For example, with regard to dependent claims 9 and 19, the prior art of record does not anticipate, nor does it render obvious in any reasonable combination to one of ordinary skill in the art at the time of Applicants' invention, the combination of recited limitations of claims 9 and 19, their respective base claims, independent claims 1 and 16, and their respective intervening claims, claims 8 and 18.
As discussed above, Chen in view of Burger teaches the apparatuses of claims 1 and 8, and the means of claims 16 and 18.
With regard to claims 9 and 19, the prior art of record does not anticipate or render obvious the limitations “wherein:
in the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation, if the unwritten positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this reading request, and the flag which is returned to indicate the success of data reading is invalid”.
As indicated in the section 112(b) rejections of claims 9 and 19 above, recitations of “the flag which is returned” have been interpreted as “[[the]] a flag which is returned” and recitations of “the unwritten positions of a storage block” have been interpreted as “[[the]] an unwritten positions of a storage block”. As further indicated in the section 112(b) rejections above, “if the unwritten positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this reading request, and the flag which is returned to indicate the success of data reading is invalid” has been interpreted as “if any unwritten positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this reading request, and a flag is returned or set indicating that the data reading failed or was unsuccessful”.
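For illustration only, the limitation of claims 9 and 19 as interpreted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the specification or from any cited reference: the class name IntermediateValueStorage, the per-position written flags, and the (flag, value) return convention are all hypothetical.

```python
from typing import List, Optional, Tuple


class IntermediateValueStorage:
    """Hypothetical model of one indexed storage block.

    Assumption: each position carries a 'written' marker so that a read
    of a never-written position can be detected and refused.
    """

    def __init__(self, size: int) -> None:
        self._values: List[float] = [0.0] * size
        self._written: List[bool] = [False] * size

    def write(self, index: int, value: float) -> None:
        # Store the value and mark the position as written.
        self._values[index] = value
        self._written[index] = True

    def read(self, index: int) -> Tuple[bool, Optional[float]]:
        # Refuse the read request if the position is unwritten; the
        # returned flag indicates that the data reading was unsuccessful.
        if not self._written[index]:
            return False, None
        return True, self._values[index]


block = IntermediateValueStorage(size=4)
block.write(0, 1.5)

ok_written, value_written = block.read(0)      # position 0 was written
ok_unwritten, value_unwritten = block.read(1)  # position 1 was never written
```

In this sketch the read of position 1 is refused and the returned flag is false, which corresponds to the interpreted language “a flag is returned or set indicating that the data reading failed or was unsuccessful”.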
With regard to claims 10 and 20, the prior art of record does not anticipate or render obvious the limitations “wherein:
in the step of vector addition processing operation, the step of vector function value operation and the step of vector multiply-add operation, if the written positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this writing request, and the flag which is returned to indicate the success of data reading is 
As indicated in the section 112(b) rejections of claims 10 and 20 above, recitations of “the flag which is returned” have been interpreted as “[[the]] a flag which is returned” and “the written positions of a storage block” are being interpreted as “[[the]] a written positions of a storage block”. As also indicated in the section 112(b) rejections above, “if the written positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this writing request, and the flag which is returned to indicate the success of data reading is invalid” has been interpreted as “if any written positions of a storage block designated by the index previously within the intermediate value storage regions are requested to be read, the intermediate value storage regions refuse this writing request, and a flag is returned or set indicating that the data writing failed or was unsuccessful”.

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure.
The examiner requests that, in response to this office action, support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to the page(s) and line number(s) in the specification and/or the drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO 





/R.K.B./Examiner, Art Unit 2125 

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125                                                                                                                                                                                                        

1 Although the specification as amended on 4/23/2018 does not include page numbers, unless stated otherwise, references herein to “the specification” are to the amended specification and include page numbers of the amended specification, with the first page being page 1.
2 Although the specification as amended on 4/23/2018 does not include page numbers, unless stated otherwise, references herein to “the specification” are to the amended specification and include page and line numbers, with the first page being page 1.