DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The previous objections and 35 USC 112 rejections on the claims are withdrawn based on the amendments submitted on those claims.
Response to Arguments
Applicant's arguments have been fully considered but they are not persuasive. With regard to the applicants’ first argument that the prior art does not teach that computations are for a particular layer of the neural network (Remarks, p. 10), the Examiner points to Figs. 4 and 14-15, and paragraphs 70, 87-88, 92, 168, 210 and 223 of Henry, for example, which collectively teach instructions for computations to be performed for a particular layer of the neural network, such as the convolution, pooling or input layer. In the secondary rejection provided, the applicants’ IDS reference of Chen also teaches this (see additional rejection below). With regard to the applicants’ second argument regarding the computing memory address limitations (Remarks, p. 11), the Examiner further points to paragraph 56 of Henry, which describes the generation or calculation of these memory addresses.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For

Claims 2-3, 9, 11-12, 18 and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over Claims 1-2, 10-11 and 15-16 of U.S. Patent No. 9,836,691 and Claims 1-2, 9-10 and 14-15 of U.S. Patent No. 9,959,498. Although the claims at issue are not identical, they are not patentably distinct from each other because it would have been obvious for one of ordinary skill in the art, given the patented claims, to derive the inventive concept and claims of the current application, and vice-versa.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 2-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Looking at the limitations of similar independent Claims 2, 11 and 20, we see the recitation of performing tensor computations and computing memory addresses. The limitation of generating an output for the particular neural network layer is also based on computations. These limitations, under their broadest reasonable interpretation, are directed to “Mathematical Concepts” that involve mathematical relationships, calculations and/or equations. If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships/calculations but for the recitation of generic computer components, then it
This judicial exception is not integrated into a practical application. The additional elements recited in the claims, such as a processor/processors and/or a hardware integrated circuit, are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are therefore directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are therefore not patent eligible.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 2-9, 11-18, 20-21 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Henry et al., US 2017/0103301 A1.

Regarding Claim 2, Henry teaches:
A computer-implemented method for accelerating tensor computations for a neural network having a plurality of neural network layers, the method comprising (Abstract; paragraphs 53, 186-187, 220: wherein accelerating matrix/tensor computations for a multilayer neural network is collectively described. Examiner’s note: Chang et al., US 2017/0103309 A1, and Koster et al., US 2017/0316307 A1, also teach accelerating tensor computations for a neural network, see their Abstract for example):
providing, by a controller, an instruction to a compute unit of a hardware integrated circuit configured to communicate with the controller, wherein the instruction, when executed by a processor of the compute unit, causes the compute unit to perform operations comprising (Abstract; paragraphs 51-52: execution units controlled by microinstructions):
determining, based on an opcode in the instruction provided to the compute unit, that the tensor computations are for a particular type of neural network layer (Figs. 4 and 14-15; and paragraphs 53, 70, 87-88, 92, 168, 210 and 223: which collectively teach determining computations for an instant layer of the neural network and instructions for computations to be performed for a particular layer of the neural network, such as the convolution, pooling or input layer);
computing values for a first set of memory address locations that are used to store, at the compute unit, inputs to the particular type of neural network layer; computing values for a second set of memory address locations that are used to store, at the compute unit, weights for the particular type of neural network layer; performing the tensor computations for the particular type of neural network layer using: i) the inputs obtained from the first set of memory address locations and ii) the weights obtained from the second set of memory address locations (Abstract; paragraphs 53, 56, 60, 87, 92: wherein the generation or calculation of these memory addresses is collectively described, and the inputs and weights from memory storage/addresses are used in the computations to generate the output or result);
and generating an output for the particular neural network layer based on the tensor computations performed at the compute unit (Abstract; paragraphs 53, 60: generating an output for a layer of the neural network).
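For illustration only, the sequence of operations recited in Claim 2 can be sketched in Python as below. This sketch is not taken from Henry, Chen, or the application. The opcode values, the `compute_addresses` and `execute` helpers, the dictionary-style instruction, and the flat-list memory are all hypothetical and chosen only to make the steps concrete.

```python
from enum import Enum


class Opcode(Enum):
    # Hypothetical opcode values; the claims do not enumerate any.
    CONV = 0
    FULLY_CONNECTED = 1


def compute_addresses(base, count, stride=1):
    """Compute a set of memory address locations from a base and a stride."""
    return [base + i * stride for i in range(count)]


def execute(instruction, memory):
    """Run one instruction on a toy compute unit whose memory is a flat list."""
    # Step 1: determine the layer type from the opcode in the instruction.
    opcode = Opcode(instruction["opcode"])

    # Step 2: compute the first (input) and second (weight) address sets.
    input_addrs = compute_addresses(instruction["input_base"], instruction["input_count"])
    weight_addrs = compute_addresses(instruction["weight_base"], instruction["weight_count"])

    # Step 3: obtain inputs and weights from those address locations.
    inputs = [memory[a] for a in input_addrs]
    weights = [memory[a] for a in weight_addrs]

    # Step 4: perform the computation for that layer type and return the output.
    if opcode is Opcode.FULLY_CONNECTED:
        if len(inputs) != len(weights):
            raise ValueError("fully connected layer needs one weight per input")
        return [sum(x * w for x, w in zip(inputs, weights))]
    if opcode is Opcode.CONV:
        k = len(weights)  # 1-D kernel slid over the inputs without padding
        return [
            sum(inputs[i + j] * weights[j] for j in range(k))
            for i in range(len(inputs) - k + 1)
        ]
    raise ValueError(f"unsupported opcode: {opcode}")


memory = [1, 2, 3, 4, 10, 20]  # inputs at addresses 0-3, weights at 4-5
fc_out = execute(
    {"opcode": 1, "input_base": 0, "input_count": 2, "weight_base": 4, "weight_count": 2},
    memory,
)
conv_out = execute(
    {"opcode": 0, "input_base": 0, "input_count": 4, "weight_base": 4, "weight_count": 2},
    memory,
)
```

In this sketch the same address-computation and fetch steps are shared by both layer types, and only the final computation depends on the opcode.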

Regarding Claim 3, Henry further teaches:
The method of claim 2, wherein the tensor computations performed at the compute unit are at least a portion of a computation of a first neural network layer that is the particular type of neural network layer (paragraph 87: wherein the neural network unit (NNU) execution unit performs the computation for each layer of the neural network).

Regarding Claim 4, Henry further teaches:
The method of claim 2, wherein the operations further comprise: receiving the inputs to the first neural network layer based on instructions executed at the compute unit; and storing the inputs at a first memory of the compute unit based on a first memory access operation that uses the first set of memory address locations (paragraphs 44, 60-61, 100: wherein the receiving/accessing and storing of the inputs from and to memory is collectively described).

Regarding Claim 5, Henry further teaches:
The method of claim 4, wherein the operations further comprise: receiving the weights for the first neural network layer based on the instructions executed at the compute unit; and storing the weights at a second memory of the compute unit based on a second memory access operation that uses the second set of memory address locations (Abstract; paragraphs 56-58, 73, 87, 99, 131: wherein the receiving/accessing and storing of the weights from and to memory is collectively described).

Regarding Claim 6, Henry further teaches:
The method of claim 5, wherein performing the tensor computations comprises: obtaining an input from a memory location of the first memory; obtaining a weight from a memory location of the second memory; and performing the tensor computations using the input of the first memory and the weight of the second memory (paragraphs 60, 87-92: wherein the computations using the weights and inputs of the neural network are collectively described. Examiner’s note: Chang et al., US 2017/0103309 A1, and Koster et al., US 2017/0316307 A1, also teach these tensor computations for a neural network, see for example paragraph 24 and paragraph 35 respectively).

Regarding Claim 7, Henry further teaches:
The method of claim 6, wherein performing the tensor computations comprises: generating an activation value using the input obtained from the first memory and the weight obtained from the second memory; providing the activation value to a data bus configured to exchange tensor data between respective compute units; and providing the activation value to a second compute unit using the data bus (Figs. 3, 7; paragraphs 60-61, 65, 71, 104: wherein the activation function unit (AFC) 

Regarding Claim 8, Henry further teaches:
The method of claim 2, wherein: the first set of memory address locations represent elements of an input tensor; and the second set of memory address locations represent elements of a weight tensor (Abstract; paragraphs 88-89, 130, 210: wherein the memory for the inputs and weights is discussed. Note that outputs of lower layer neurons are inputs of higher layer neurons).

Regarding Claim 9, Henry further teaches:
The method of claim 8, wherein the operations further comprise: determining a loop nest structure for generating a nested loop; generating a plurality of nested loops using the loop nest structure; and performing the tensor computations for the first neural network layer using the plurality of nested loops (paragraphs 186-188: wherein the nested loop structure for performing the neural network computations is described. Examiner’s note: The applicants’ provided NPL of Peemen also teaches this, see for example sections III-IV).
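For illustration only, the loop nest limitation of Claim 9 can be sketched in Python as below. The sketch is not drawn from Henry, Peemen, or the application. It assumes that a "loop nest structure" can be modeled as a tuple of loop bounds, and the names `run_loop_nest` and `layer_output` are hypothetical.

```python
from itertools import product


def run_loop_nest(bounds, body):
    """Generate nested loops from a tuple of bounds (outermost first)
    and call `body` once for every combination of loop indices."""
    for index in product(*(range(b) for b in bounds)):
        body(*index)


def layer_output(inputs, weights):
    """Compute one fully connected layer with a two-level loop nest.

    `weights[o][i]` is the weight from input i to output neuron o.
    """
    n_out, n_in = len(weights), len(inputs)
    out = [0] * n_out

    def body(o, i):
        # Innermost statement: accumulate one input-weight product.
        out[o] += weights[o][i] * inputs[i]

    run_loop_nest((n_out, n_in), body)
    return out


result = layer_output([1, 2, 3], [[1, 0, 0], [1, 1, 1]])
```

Here the loop nest structure `(n_out, n_in)` is determined first, the nested loops are generated from it, and the layer computation is the body executed inside those loops.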
Claim Rejections - 35 USC § 103
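In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.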
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 10, 19 are rejected under 35 U.S.C. 103 as being unpatentable over Henry et al., US 2017/0103301 A1, in view of Koster et al., US 2017/0316307 A1.

Regarding Claim 10, Henry teaches:
The method of claim 8, wherein performing the tensor computations comprises: using a tensor traversal unit to access memory address locations of the input tensor to obtain multiple inputs to the first neural network layer; using the tensor traversal unit to access memory address locations of the weight tensor to obtain multiple weights for the first neural network layer (paragraph 56: wherein as described, a sequencer is used to fetch data, that includes the weights and inputs, from the memory address locations for those data).
Although Henry teaches the multiplication/product of the inputs and weights (see paragraph 60 for example), it may not have explicitly taught the following:
and generating a dot product using the multiple inputs of the input tensor and the multiple weights of the weight tensor. (Emphasis added).
However, Koster, in a similar field of endeavor, shows this limitation (paragraph 41: wherein the dot product of tensors is discussed).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use the teachings of Koster with that of Henry for generating a dot product using the multiple inputs of the input tensor and the multiple weights of the weight tensor.
The ordinary artisan would have been motivated to modify Henry in the manner set forth above for the purposes of performing tensor computations pertaining to neural networks [Koster: Abstract; paragraph 41].
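For illustration only, the Claim 10 limitations of traversing the address locations of an input tensor and a weight tensor and generating a dot product can be sketched in Python as below. The sketch is not taken from Henry, Koster, or the application. The `traverse` and `dot_product` helpers, the stride-based addressing, and the flat-list memory are hypothetical assumptions.

```python
def traverse(base, shape, strides):
    """Yield the flat memory address of every tensor element,
    with the last dimension varying fastest."""

    def walk(dim, addr):
        if dim == len(shape):
            yield addr
            return
        for i in range(shape[dim]):
            yield from walk(dim + 1, addr + i * strides[dim])

    yield from walk(0, base)


def dot_product(memory, input_base, weight_base, shape, strides):
    """Fetch inputs and weights by address and sum their pairwise products."""
    input_addrs = traverse(input_base, shape, strides)
    weight_addrs = traverse(weight_base, shape, strides)
    return sum(memory[a] * memory[b] for a, b in zip(input_addrs, weight_addrs))


memory = [1, 2, 3, 4, 5, 6, 7, 8]  # 2x2 input at address 0, 2x2 weights at 4
addresses = list(traverse(0, (2, 2), (2, 1)))
value = dot_product(memory, 0, 4, (2, 2), (2, 1))
```

The traversal step only produces addresses; the multiple inputs and multiple weights are then read from those addresses and reduced to a single dot product value.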

Claims 11-19 are similar to Claims 2-10 respectively, and are rejected under the same rationale as stated above for those claims.
Claim 20 is similar to Claim 2 and is rejected under the same rationale as stated above for that claim.
Claim 21 is a combination of Claims 4 and 5 and is rejected under the same rationale as stated above for those claims.

Additionally, the independent claims are further rejected as shown below.
Claims 2, 11 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Henry et al., US 2017/0103301 A1, in view of Chen et al., “DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning”, March 2014, in the applicants’ IDS.

Regarding Claim 2, Henry teaches:
A computer-implemented method for accelerating tensor computations for a neural network having a plurality of neural network layers, the method comprising (Abstract; paragraphs 53, 186-187, 220: wherein accelerating matrix/tensor computations for a multilayer neural network is collectively described. Examiner’s note: Chang et al., US 2017/0103309 A1, and Koster et al., US 2017/0316307 A1, also teach accelerating tensor computations for a neural network, see their Abstract for example):
providing, by a controller, an instruction to a compute unit of a hardware integrated circuit configured to communicate with the controller, wherein the instruction, when executed by a processor of the compute unit, causes the compute unit to perform operations comprising (Abstract; paragraphs 51-52: execution units controlled by microinstructions):
computing values for a first set of memory address locations that are used to store, at the compute unit, inputs to the particular type of neural network layer; computing values for a second set of memory address locations that are used to store, at the compute unit, weights for the particular type of neural network layer; performing the tensor computations for the particular type of neural network layer using: i) the inputs obtained from the first set of memory address locations and ii) the weights obtained from the second set of memory address locations (Abstract; paragraphs 53, 56, 60, 87, 92: wherein the generation or calculation of these memory addresses is collectively described, and the inputs and weights from memory storage/addresses are used in the computations to generate the output or result);
and generating an output for the particular neural network layer based on the tensor computations performed at the compute unit (Abstract; paragraphs 53, 60: generating an output for a layer of the neural network).
Although Henry teaches the below, Chen also explicitly shows:
determining, based on an opcode in the instruction provided to the compute unit, that the tensor computations are for a particular type of neural network layer (p. 277, subsections 5.3.1-5.3.2: “A layer execution is broken down into a set of instructions. Roughly, one instruction corresponds to the loops ii; i; n for classifier and convolutional layers, see Figures 5 and 7, and to the loops ii; i in pooling layers (using the interleaving mechanism described in Section 5.2.3), see Figure 8. The instructions are stored in an SRAM associated with the Control Processor (CP), see Figure 11. The CP drives the execution of the DMAs of the three buffers and the NFU”. And, “So we have implemented three dedicated code generators for the three layers. In Table 4, we give an example of the code generated for a classifier/perceptron layer”. That is, the code is used to determine the computations for the particular layer of the neural network). 
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use the teachings of Chen with that of Henry for determining, based on an opcode in the instruction provided to the compute unit, that the tensor computations are for a particular type of neural network layer.


Claims 11 and 20 are similar to Claim 2 and are rejected under the same rationale as stated above for that claim.

Examiner's Note:
The Examiner cites particular pages, sections, columns, line numbers, and/or paragraphs in the references as applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner and the additional related prior art made of record that is considered pertinent to applicant's disclosure to further show the general state of the art. The Examiner's interpretations in parentheses are provided with the cited references to assist the applicants to better understand how the examiner interprets the prior art to read on the claims. Such comments are entirely consistent with the intent and spirit of compact prosecution.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See previously supplied PTO-892 for the relevant and pertinent prior art relating to this application where for example Koster et al., US 2017/0316307, teaches neural network tensor computations and Shoaib, US 2017/0132496 A1, teaches implementing convolutional neural networks efficiently in hardware.

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Applicant's submission of an information disclosure statement under 37 CFR 1.97(c) with the fee set forth in 37 CFR 1.17(p) on 11/13/2020 prompted the additional new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 609.04(b).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 


Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVE MISIR whose telephone number is (571)272-5243.  The examiner can normally be reached on M-R 8-5 pm, F some hours.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on 571-272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  






/DAVE MISIR/Primary Examiner, Art Unit 2122