DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on August 27, 2019; May 6 and July 31 of 2020; April 12, 2021; and May 20, 2022 were filed after the mailing date of the application on August 27, 2019.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “weight decompression unit” in claim 2.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim(s) 1, 11, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Albericio (see citation below), Judd (US 20170357891A1), Young (US009721203B1), and Narayanaswami (US009836691B1).
As per Claim 1, Albericio teaches a processor, comprising:  a first tile, a second tile (each DaDN chip comprises 16 tiles, p. 386, left column, last paragraph), a memory, and a bus, the bus being connected to:  the memory, the first tile, and the second tile (each DaDN chip comprises 16 tiles and a Neuron Memory, p. 386, left column, last paragraph; processing starts by reading from external memory the first layer’s filter weights, and the input image, p. 386, right column, 3rd paragraph), the first tile comprising:  a first weight register, a second weight register, an activations buffer (internally, each tile has a weight buffer (SB), an input activation buffer, p. 386, left column, last paragraph; a set of weight set registers are introduced in front of the SB, p. 390, left column, 2nd paragraph), a first multiplier, and a second multiplier (each IP computes an inner product with 16 parallel multipliers, p. 386, left column, last paragraph), the processor being configured to perform a first convolution of an array of activations with weights (convolutional layer performs inner products where both weights and activations are reused, p. 385, left column, 2nd paragraph), the performing of the first convolution comprising:  broadcasting a first subarray of the array of activations to:  the first tile, and the second tile (activations are broadcast to the tiles, p. 387, left column, 2nd paragraph); forming a first product, the first product being a product of a first subarray of the weights with the first subarray of the array of activations (computes the inner product of a brick of weights and a brick of activation each cycle, p. 386, left column, last paragraph); broadcasting a second subarray of the array of activations to:  the first tile, and the second tile (p. 387, left column, 2nd paragraph); forming a second product, the second product being a product of a second subarray of the weights with the second subarray of the array of activations (p. 386, left column, last paragraph); and adding the first product and the second product (PRA calculates the product of weight and activation a, that is, each cycle, the weight multiplied by f, the next constituent power of two of a, and the result is accumulated, p. 387, Section 5.1.2).
However, Albericio does not expressly teach storing the first tensor product in the memory.  However, Judd teaches storing the first product in the memory (every cycle, the engine can calculate the product of two 2-bit inputs, i (weight) and v (activation) and store it into the output register, [0117], the output register contains the inner-product of an activation and weight set, [0145]).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio to include storing the first product in the memory because Judd suggests that this is needed in order to accumulate the products [0117, 0145].
	However, Albericio and Judd do not expressly teach that the weights are a first kernel of weights.  However, Young teaches the processor being configured to perform a first convolution of an array of activations with a first kernel of weights, the performing of the first convolution comprising applying the first kernel of weights with the array of activations (convolutional neural network layer may have a 4x4 kernel of weights to be applied to the 8x8 array of activation values, col. 15, lines 20-22).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio and Judd so that the weights are a first kernel of weights as suggested by Young.  It is well-known in the art that a convolutional neural network performs convolution using a kernel of weights.
However, Albericio, Judd, and Young do not expressly teach that the products are tensor products.  However, Narayanaswami teaches forming a first tensor product, the first tensor product being a tensor product of weights with activations (computations can include multiplication of activation tensor with weight tensor on one or more computation cycles to produce outputs in the form of output tensor, col. 11, lines 50-53; col. 12, lines 4-19).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio, Judd, and Young so that the products are tensor products as suggested by Narayanaswami.  It is well-known in the art that tensors are the data structure used by machine learning systems, and it is the way that information is stored within machine learning systems.
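For illustration only, the sequence of operations mapped above for Claim 1 (broadcasting each activation subarray to both tiles, forming a product with the corresponding weight subarray, and adding the products) can be sketched in Python as follows. This is a minimal sketch and not code from Albericio, Judd, Young, or Narayanaswami. The names `Tile`, `multiply_accumulate`, and `run_convolution`, the list-based data layout, and the use of an inner product to stand in for the recited product are all assumptions made for the example.

```python
# Illustrative sketch only: names and data layout are hypothetical,
# not taken from any cited reference.
from typing import List


class Tile:
    """A tile holding its own weight subarrays and a running accumulator."""

    def __init__(self, weight_subarrays: List[List[int]]) -> None:
        self.weight_subarrays = weight_subarrays
        self.accumulator = 0

    def multiply_accumulate(self, step: int, activations: List[int]) -> None:
        # Form the product of this step's weight subarray with the
        # broadcast activation subarray, then add it to the running sum.
        weights = self.weight_subarrays[step]
        product = sum(w * a for w, a in zip(weights, activations))
        self.accumulator += product


def run_convolution(tiles: List[Tile],
                    activation_subarrays: List[List[int]]) -> List[int]:
    """Broadcast each activation subarray to every tile, one step at a time."""
    for step, activations in enumerate(activation_subarrays):
        for tile in tiles:  # the same subarray is sent to all tiles
            tile.multiply_accumulate(step, activations)
    return [tile.accumulator for tile in tiles]


first_tile = Tile([[1, 2], [3, 4]])
second_tile = Tile([[0, 1], [1, 0]])
results = run_convolution([first_tile, second_tile], [[5, 6], [7, 8]])
```

In this sketch each tile keeps its own weights while the activation subarrays are shared, so the first tile accumulates 17 + 53 and the second tile accumulates 6 + 7.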
11.	As per Claims 11 and 20, these claims are each similar in scope to Claim 1, and therefore are rejected under the same rationale.
12.	Claim(s) 2 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Albericio (see citation below), Judd (US 20170357891A1), Young (US009721203B1), and Narayanaswami (US009836691B1) in view of Martin (US 20190147327A1).
13.	As per Claim 2, Albericio, Judd, Young, and Narayanaswami are relied upon for the teachings as discussed above relative to Claim 1.
However, Albericio, Judd, Young, and Narayanaswami do not teach wherein the first tile further comprises a weight decompression unit configured to:  decompress a data word encoding a plurality of weights in compressed form, to extract a first weight and a second weight; feed the first weight to the first weight register; and feed the second weight to the second weight register.  However, Martin teaches wherein the first tile further comprises a weight decompression unit configured to:  decompress a data word encoding a plurality of weights in a compressed form, to extract a first weight and a second weight (decompress the weight data but not for sparsity, zero value weights are not restored to the correct position in a sequence of weights in a word, [0197]); feed the first weight to the first weight register; and feed the second weight to the second weight register (registers 306, [0205], sparsity map and unpacked weights may be combined so as to arrange the received weight values in their proper sequence at register 306, [0206]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio, Judd, Young, and Narayanaswami so that the first tile further comprises a weight decompression unit configured to:  decompress a data word encoding a plurality of weights in compressed form, to extract a first weight and a second weight; feed the first weight to the first weight register; and feed the second weight to the second weight register because Martin suggests allowing only part of the compressed weight data to be decompressed at a time, which reduces the amount of data that needs to be stored [0196].
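For illustration only, the decompression step discussed above for Claim 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not Martin's implementation: it assumes the compressed data word carries only the non-zero weights together with a one-bit-per-position sparsity map, and the names `decompress_weights`, `first_weight_register`, and `second_weight_register` are hypothetical.

```python
# Illustrative sketch only: the bitmap encoding and all names are assumptions.
from typing import List


def decompress_weights(packed_nonzero: List[int],
                       sparsity_map: List[int]) -> List[int]:
    """Restore zero weights to their positions using a sparsity map.

    sparsity_map[i] == 1 means position i holds the next packed non-zero
    weight; 0 means the weight at position i is zero.
    """
    if sum(sparsity_map) != len(packed_nonzero):
        raise ValueError("sparsity map does not match number of packed weights")
    packed = iter(packed_nonzero)
    return [next(packed) if bit else 0 for bit in sparsity_map]


# Hypothetical data word carrying three non-zero weights for four positions.
weights = decompress_weights([7, -2, 5], [1, 0, 1, 1])

# Feed the first two extracted weights to the two weight registers.
first_weight_register, second_weight_register = weights[0], weights[1]
```

Here the extracted weights are placed back in their proper sequence before being fed to the registers, which is the behavior the rejection attributes to the combination of the sparsity map and the unpacked weights.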
14.	As per Claim 12, Claim 12 is similar in scope to Claim 2, and therefore is rejected under the same rationale.
15.	Claim(s) 4 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Albericio (see citation below), Judd (US 20170357891A1), Young (US009721203B1), and Narayanaswami (US009836691B1) in view of Ross (US010521488B1).
16.	As per Claim 4, Albericio, Judd, Young, and Narayanaswami are relied upon for the teachings as discussed above relative to Claim 1.
However, Albericio, Judd, Young, and Narayanaswami do not teach wherein:  the activations buffer is configured to include:  a first queue connected to the first multiplier, and a second queue connected to the second multiplier, the first queue comprises a first register and a second register adjacent to the first register, the first register being an output register of the first queue, the first tile is further configured:  in a first state:  to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state:  to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.  However, Ross teaches the activations buffer being configured to include:  a first queue connected to the first multiplier, and a second queue connected to the second multiplier, the first queue comprising a first register and a second register adjacent to the first register, the first register being an output register of the first queue (each cell of the first plurality of cells including multiple activation registers, each activation register of the multiple activation registers configured to store a corresponding activation input, multiplexer circuitry communicatively coupled to the multiple activation registers and configured to select, from the multiple activation registers, one of the activation input as a selected activation input, and multiplication circuitry communicatively coupled to the weight register and to the multiplexer, in which the multiplication circuitry is configured to output a product of the weight input and the selected activation input, col. 2, lines 17-28; cell may include multiple activation registers (506a, 506b, 506c) that store activation inputs, col. 10, lines 40-41, Fig. 5), the first tile being configured:  in a first state:  to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state:  to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue (mux select value of 0 may result in selection of an activation input from register 506a, a mux select value of 1 may result in selection of an activation input from register 506b, and a mux select value of 2 may result in selection of an activation input from register 506c, multiplication circuitry 508 may be used to multiply the weight input from the weight register 502 with the selected activation input from the activation register 506, col. 11, lines 25-35).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio, Judd, Young, and Narayanaswami so that the activations buffer is configured to include:  a first queue connected to the first multiplier, and a second queue connected to the second multiplier, the first queue comprises a first register and a second register adjacent to the first register, the first register being an output register of the first queue, the first tile is further configured:  in a first state:  to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state:  to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue because Ross suggests that this is useful for partitioning (col. 9, line 63-col. 10, line 7), which is useful for independently and simultaneously performing a matrix computation on different matrix values (col. 1, line 66-col. 2, line 10).
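For illustration only, the multiplexer-based selection discussed above for Claim 4 can be sketched in Python as follows. This is a minimal sketch and not circuitry from Ross: the class name `Cell`, the method `multiply`, and the example values are hypothetical, and the list index simply stands in for the mux select value.

```python
# Illustrative sketch only: class, method, and values are hypothetical.
from typing import List


class Cell:
    """One weight register, several activation registers, and a mux."""

    def __init__(self, weight: int, activation_registers: List[int]) -> None:
        self.weight = weight
        self.activation_registers = activation_registers

    def multiply(self, mux_select: int) -> int:
        # The mux select value picks which activation register
        # feeds the multiplier.
        selected = self.activation_registers[mux_select]
        return self.weight * selected


cell = Cell(weight=3, activation_registers=[4, 5, 6])
first_state_product = cell.multiply(0)   # activation from the output register
second_state_product = cell.multiply(1)  # activation from the adjacent register
```

The same weight is multiplied by a different activation depending only on the select value, which corresponds to the first state and second state recited in the claim.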
17.	As per Claim 14, Claim 14 is similar in scope to Claim 4, and therefore is rejected under the same rationale.
18.	Claim(s) 5, 6, 15, and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Albericio (see citation below), Judd (US 20170357891A1), Young (US009721203B1), Narayanaswami (US009836691B1), and Ross (US010521488B1) in view of Yan (US 20180218518A1).
19.	As per Claim 5, Albericio, Judd, Young, Narayanaswami, and Ross are relied upon for the teachings as discussed above relative to Claim 4.
However, Albericio, Judd, Young, Narayanaswami, and Ross do not teach wherein, in the second state, the output register of the first queue contains zero.  However, Yan teaches preventing the input activation registers 262 and weight registers 260 from updating the input activation and weight values output to the input registers 275 when either the input activation or the weight equals zero [0042].  Fig. 2C shows that an input activation is output from an input activation register 262 to the input registers 275, and a weight is output from a weight register 260 to the input registers 275, and the input activation and the weight are output from the input registers 275 to the multiplier 280 where they are multiplied.  Thus, it would have been obvious to one of ordinary skill in the art that, in a state where the last input activation register 262 contains zero, the last input activation register 262 is prevented from updating the input activation output to the input registers 275, so that the zero is not output to the multiplier 280, and a second input activation register 262 that contains a non-zero value instead updates the input activation output to the input registers 275, so that the non-zero value is output to the multiplier 280 [0042] (Fig. 2C).  Thus, Yan teaches wherein, in the second state, the output register of the first queue contains zero.
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio, Judd, Young, Narayanaswami, and Ross so that, in the second state, the output register of the first queue contains zero because Yan suggests that this avoids performing multiplication operations when the product is zero which reduces energy consumption [0020].
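For illustration only, the zero-skipping behavior discussed above for Claim 5 can be sketched in Python as follows. This is a minimal sketch of the reasoning set forth in the rejection and not a circuit disclosed in Ross or Yan: the function names `select_state` and `multiply_with_skip`, the two-entry queue, and the rule of entering the second state whenever the output register holds zero are assumptions made for the example.

```python
# Illustrative sketch only: function names and the selection rule are assumptions.
from typing import List, Tuple


def select_state(queue: List[int]) -> int:
    """Return 0 (first state) if the output register is non-zero, else 1."""
    return 0 if queue[0] != 0 else 1


def multiply_with_skip(weight: int, queue: List[int]) -> Tuple[int, int]:
    """Multiply the weight by the activation chosen for the current state."""
    state = select_state(queue)
    # In the second state the zero in the output register is bypassed and
    # the adjacent register supplies the activation instead.
    return state, weight * queue[state]


state, product = multiply_with_skip(2, [0, 9])
normal_state, normal_product = multiply_with_skip(2, [4, 9])
```

With a zero in the output register the sketch takes the activation from the second register, so no multiplication by zero is performed.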
20.	As per Claim 6, Albericio, Judd, Young, Narayanaswami, and Ross do not teach further comprising:  a first adder, configured, in the first state:  to be connected to an output of the first multiplier, and an output of the second multiplier, and to add; a product received from the output of the first multiplier, and a product received from the output of the second multiplier.  However, Yan teaches further comprising:  a first adder, configured, in the first state:  to be connected to an output of the first multiplier, and an output of the second multiplier, and to add; a product received from the output of the first multiplier, and a product received from the output of the second multiplier (each of the PEs 250 generates a product by multiplying a weight value and an input activation, the products for each pipeline stage are summed by an adder 243 to produce a partial product, the partial products generated by the PEs 250 in the PE array 240 are summed by an adder 286 and the resulting partial sum is output to the accumulator 245, [0037], multiple PEs 250 are included within the PE array 240, each PE 250 includes a multiplier 280, [0038]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Albericio, Judd, Young, Narayanaswami, and Ross to include a first adder, configured, in the first state:  to be connected to an output of the first multiplier, and an output of the second multiplier, and to add; a product received from the output of the first multiplier, and a product received from the output of the second multiplier because Yan suggests that this is needed to output the sum to the accumulator [0037], which is needed to complete the convolution operation [0032], and it is well-known in the art that convolutional neural networks have become the most popular algorithmic approach for deep learning for many domains [0002].
21.	As per Claims 15-16, these claims are similar in scope to Claims 5-6 respectively, and therefore are rejected under the same rationale.
Allowable Subject Matter
22.	Claims 3, 7-10, 13, and 17-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  
23.	The prior art, taken singly or in combination, does not teach or suggest the combination of all the limitations of Claim 3 and base Claim 1, and in particular, does not teach wherein the first tile is further configured to perform a second convolution of an array of activations with a second kernel of weights, the performing of the second convolution comprising, in order:  forming a tensor product of a first portion of the second kernel with a first subarray of the array of activations, the first portion of the second kernel comprising a weight stored in the first weight register; forming a tensor product of a second portion of the second kernel with the first subarray of the array of activations, the second portion of the second kernel comprising a weight stored in the second weight register; and forming a tensor product of the first portion of the second kernel with a second subarray of the array of activations, the first portion of the second kernel comprising the weight stored in the first weight register.  Claim 13 is similar in scope to Claim 3, and therefore also contains allowable subject matter.
24.	The prior art, taken singly or in combination, does not teach or suggest the combination of all the limitations of Claim 7 and base Claim 1 and intervening Claims 4 and 6, and in particular, does not teach a second adder, configured, in the second state, to be connected to the output of the first multiplier.  Claims 8-10 depend from Claim 7, and therefore also contain allowable subject matter.  Claims 17-19 are similar in scope to Claims 7-9 respectively, and therefore also contain allowable subject matter.
Prior Art of Record
Jorge Albericio, Bit-Pragmatic Deep Neural Network Computing, October 2017, MICRO-50, p. 385-387, 390.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONI HSU whose telephone number is (571)272-7785. The examiner can normally be reached M-F 10am-6:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571)272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





JH
/JONI HSU/Primary Examiner, Art Unit 2611