Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending for examination. Claims 1, 11, and 17 are independent. 

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 04/02/2019, 08/03/2021, and 11/08/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

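The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.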
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: "an aligner coupled to the controller, the aligner, in response to receiving input data, aligns the input data into rows to generate a pooling array and, over a number of arithmetic cycles, shift the rows relative to each other to reformat the input data into reformatted data" and  
"a pooler coupled to the aligner, the pooler applies, in subsequent arithmetic cycles, a pooling operation to at least some of the reformatted data to obtain a pooling output that comprises a pooling value, wherein a subset of data from each row is combined to a set of data from which the pooling value is generated." in claim 1.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


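The claim limitations "an aligner" and "a pooler" in claim 1 invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function.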
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
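(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or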
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-3, 5, and 9-19 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ross et al. (US 20160342893, hereinafter "Ross").

Regarding Claim 1
Ross discloses: A pooling unit ([Para 0039-0040 and Fig 3] “vector computation unit 314 generates normalized values, pooled values, or both.”) comprising: 
a controller ([0097-0099] “the multiplexor control signal”); 
an aligner coupled to the controller, the aligner, in response to receiving input data, aligns the input data into rows to generate a pooling array and, over a number of arithmetic cycles, shift the rows relative to each other to reformat the input data into reformatted data ([Para 0042-0043] “value loaders 402 send activation inputs to rows of the array 406 … The value loader can also send the activation input to an adjacent value loader, and the activation input can be used at another left-most cell of the array 406. This allows activation inputs to be shifted for use in another particular cell of the array 406.” [Para 0097-0099] “The circuit, e.g., a host 202 of FIG. 2, can select the multiplexor control signal based on what remaining products need to be computed for a given convolution calculation.” Examiner interprets the matrix operations of the CNN as aligning the input data and generating a pooling array.); and 
a pooler coupled to the aligner, the pooler applies, in subsequent arithmetic cycles, a pooling operation to at least some of the reformatted data to obtain a pooling output that comprises a pooling value, wherein a subset of data from each row is combined to a set of data from which the pooling value is generated ([Para 0040] “The matrix computation unit 312 can process the weight inputs and the activation inputs and provide a vector of outputs to the vector computation unit 314… In some implementations, the vector computation unit 314 generates normalized values, pooled values, or both. The vector of processed outputs can be used as activation inputs to the matrix computation unit 312, e.g., for use in a subsequent layer in the neural network.” Examiner interprets the activation inputs as the subset of data from each row (see para 0042) is combined to a set of data from which the pooling value is generated (i.e. weights).).
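For illustration only, the shift-and-combine behavior recited in this limitation can be sketched in Python as follows. The function name `pool_aligned_rows`, its parameters, and the choice to shift all rows in lockstep are assumptions made for this sketch. It is a simplified software analogy of the claim language, not the circuit of Ross and not the structure described in applicant's specification.

```python
from collections import deque
from typing import Callable, List, Sequence


def pool_aligned_rows(
    rows: Sequence[Sequence[int]],
    window: int,
    stride: int,
    op: Callable[[List[int]], int] = max,
) -> List[int]:
    """Pool `window` aligned rows by repeatedly shifting them left by `stride`.

    Hypothetical helper: each loop iteration stands in for one arithmetic cycle.
    """
    # "Align" step: keep the first `window` rows side by side as shiftable lanes.
    lanes = [deque(row) for row in rows[:window]]
    outputs: List[int] = []
    while len(lanes[0]) >= window:
        # Combine a subset (the first `window` entries) of every row into one set.
        combined = [lane[i] for lane in lanes for i in range(window)]
        # Reduce the combined set to a single pooling value.
        outputs.append(op(combined))
        # "Shift" step: advance every lane by `stride` positions.
        for lane in lanes:
            for _ in range(stride):
                if lane:
                    lane.popleft()
    return outputs


if __name__ == "__main__":
    print(pool_aligned_rows([[1, 2, 3, 4], [5, 6, 7, 8]], window=2, stride=2))
```

In this sketch, a 2×2 window with a stride of 2 over two four-element rows yields one pooling value per shift.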

Regarding Claim 11
Ross discloses: A method for using a hardware-based pooling system ([Para 0039-0040 and Fig 3] “vector computation unit 314 generates normalized values, pooled values, or both.”), the method comprising: 
receiving from a convolution engine an array of data that represents an output channel of a convolution layer in a convolutional neural network (CNN) ([Para 0076] “FIG. 9 is a flow diagram of an example method for computing a layer output for a convolutional neural network layer... The process 900 can be performed for each convolutional layer of the neural network in order to compute an inference from a received input.” Examiner interprets layer output of a CNN as the received array of data. [Para 0042-0043] also describe the array computations of the CNN.);  Customer No. 14945319PATENTP0905-1NUS 
converting the array of data into a set of arrays that are aligned according to a pooling operation that applies data to at least two arrays of the set of arrays to generate a pooling result ([Para 0009] “pooling the accumulated output to generate the layer output. Sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array comprises: at a particular clock cycle,” [Para 0040] “The matrix computation unit 312 can process the weight inputs and the activation inputs and provide a vector of outputs to the vector computation unit 314… In some implementations, the vector computation unit 314 generates normalized values, pooled values, or both.” Examiner interprets the activation input and the weights as the two arrays that are used to generate pooled values.); and 
outputting the pooling result into a memory device ([Para 0033] “The output of the layer can be stored in the unified buffer for use as an input to a subsequent layer in the neural network or can be used to determine the inference.”[Para 0040, Para 0047, and Para 0085] also describe storing an output into memory.).

Regarding Claim 17
Ross discloses: A method for using a pooling unit architecture ([Para 0039-0040 and Fig 3] “vector computation unit 314 generates normalized values, pooled values, or both.”), the method comprising: 
receiving, at a hardware-based pooling engine, a set of data arrays that each have a predefined relationship with each other ([Para 0042 and Fig 4] “value loaders 402 send activation inputs to rows of the array 406 and a weight fetcher interface 408 sends weight inputs to columns of the array 406. ” Examiner interprets systolic array 406 as the received set of data arrays and the activation/weight inputs to the array as having a predefined relationship.); 
using the hardware-based pooling unit, applying, according to a stride value ([Para 0034] “the input to the neural network from which the inference is to be computed, corresponding input and output sizes of each layer, a stride value for the neural network computation, and a type of layer to be processed, e.g., a convolutional layer or a fully connected layer.”), a pooling operation to data in at least two arrays from the set of data arrays to obtain a pooling result ([Para 0074] and [Para 0040] “the vector computation unit 314 generates normalized values, pooled values, or both. The vector of processed outputs can be used as activation inputs to the matrix computation unit 312, e.g., for use in a subsequent layer in the neural network.”) without having to satisfy a requirement of writing a convolution result into memory ([Para 0033] “The output of the layer can be stored in the unified buffer for use as an input to a subsequent layer in the neural network or can be used to determine the inference.” Examiner interprets a buffer as temporary storage that does not require writing to memory.); and
outputting the pooling result as a row of data points that each represent a neuron in a layer of a convolutional neural network (CNN) ([Para 0033-0034, Para 0040] [0076] “The process 900 can be performed for each convolutional layer of the neural network in order to compute an inference from a received input.” Examiner interprets the computed inference as the output.).

Regarding Claim 2
Ross discloses: The pooling unit according to claim 1, wherein the input data has been generated by a matrix processor ([Para 0061-0063, Fig 4, and Fig 7] “The matrix structure 600 can be a set of activation inputs. Generally, the neural network processor can send the activation inputs, e.g., elements within matrix structure 600, and weight inputs, e.g., Kernels A-D 710, to rows and columns of the array, respectively… The neural network processor ‘flattens’ the matrix structure 600 before sending portions of the structure 600 to rows of the systolic array,” Examiner interprets the matrix structure 600 as input data that has been generated by the neural network processor (i.e. matrix processor). The neural network processor (i.e. matrix processor) converts the high dimensional matrix structure 600 into a two-dimensional structure and sends it to the matrix computation unit 406/706 (i.e. the aligner).).

Regarding Claim 3
Ross discloses: The pooling unit according to claim 2, wherein, to maintain a stream of the input data, the pooling output is generated at a same rate as a rate at which the matrix processor generates the input data ([Para 0009] “Generating the layer output from the accumulated output comprises normalizing and pooling the accumulated output to generate the layer output. Sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array comprises: at a particular clock cycle, storing a first vector input in the plurality of vector inputs in a first cell of the systolic array” Examiner reads the clock cycle as the same rate of the input data.).

Regarding Claim 5
Ross discloses: The pooling unit according to claim 1, further comprising a multiply-and-shift circuit coupled to the pooler, the multiply-and-shift circuit generates the pooling output based on the pooling operation ([Para 0052] “shifting the weight input or the activation input takes one or more clock cycles. The control signal can also determine whether the activation input or weight inputs are transferred to the multiplication circuitry 508, or can determine whether the multiplication circuitry 508 operates on the activation and weight inputs.”).

Regarding Claim 9
Ross discloses: The pooling unit according to claim 1, wherein the controller determines, without modifying the sequence of the pooling operation itself, a number and a location of data points involved in a pooling operation ([Para 0097-0099] “the multiplexor control signal can determine which element of which vector is sent to the value loader's corresponding row cell and thereby used in the current operation. The circuit, e.g., a host 202 of FIG. 2, can select the multiplexor control signal based on what remaining products need to be computed for a given convolution calculation.”).

Regarding Claim 10
Ross discloses: The pooling unit according to claim 1, wherein a shift from one row to another row corresponds to a shift of a pooling window that convolves across a matrix at a stride value, the shift being defined by the number of arithmetic cycles ([Para 0051-0053] “weights are pre-shifted into a weight path register 512… The weight register 502 can statically store the weight input such that as activation inputs are transferred to the cell, e.g., through the activation register 506, over multiple clock cycles, the weight input remains within the cell and is not transferred to an adjacent cell.”).

Regarding Claim 12
Ross discloses: The method according to claim 11, wherein the array of data is received at a hardware-based pooling unit ([Para 0042 and Fig 4] “value loaders 402 send activation inputs to rows of the array 406 and a weight fetcher interface 408 sends weight inputs to columns of the array 406. ” Examiner interprets systolic array 406 as the received set of data arrays and vector computation unit 314 as a hardware-based pooling unit.).

Regarding Claim 13
Ross discloses: The method according to claim 11, wherein arrays of data are received at intervals of a number of arithmetic cycles ([Para 0051-0053] “weights are pre-shifted into a weight path register 512… The weight register 502 can statically store the weight input such that as activation inputs are transferred to the cell, e.g., through the activation register 506, over multiple clock cycles, the weight input remains within the cell and is not transferred to an adjacent cell.”).

Regarding Claim 14
Ross discloses: The method according to claim 11, wherein pooling results are generated at each interval ([Para 0009] “Generating the layer output from the accumulated output comprises normalizing and pooling the accumulated output to generate the layer output. Sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array comprises: at a particular clock cycle, storing a first vector input in the plurality of vector inputs in a first cell of the systolic array;” Examiner reads a clock cycle as an interval.).

Regarding Claim 15
Ross discloses: The method according to claim 14, wherein pooling results are output at each interval ([Para 0009] “Generating the layer output from the accumulated output comprises normalizing and pooling the accumulated output to generate the layer output. Sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array comprises: at a particular clock cycle, storing a first vector input in the plurality of vector inputs in a first cell of the systolic array;” Examiner reads a clock cycle as an interval.).

Regarding Claim 16
Ross discloses: The method according to claim 11, wherein the array of data corresponds to a set of feature maps ([Para 0058-0060 and Fig 6] “That is, depth level 602 can correspond to a feature of nine ‘1’ activation inputs, e.g., red values, depth level 604 can correspond to a feature of nine ‘2’ activation inputs, e.g., green values, and depth level 606 can correspond to a feature of nine ‘3’ activation inputs, e.g., blue values.”).

Regarding Claim 18
Ross discloses: The method according to claim 17, wherein the set of data arrays are received from a convolution engine ([Para 0042 and Fig 4] “value loaders 402 send activation inputs to rows of the array 406 and a weight fetcher interface 408 sends weight inputs to columns of the array 406. ” Examiner interprets systolic array 406 as the received set of data arrays and vector computation unit 314 as a convolution engine.).

Regarding Claim 19
Ross discloses: The method according to claim 17, wherein obtaining the pooling result utilizes a one-to-one relationship between an output channel and an input channel ([Para 0039-0040 and 0074-0075] “The matrix computation unit 312 can process the weight inputs and the activation inputs and provide a vector of outputs to the vector computation unit 314…the vector computation unit 314 generates normalized values, pooled values, or both.”).


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4, 6-8, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ross et al. (US 20160342893, hereinafter "Ross") in view of Lau et al. (US 20180189238, hereinafter "Lau").

Regarding Claim 4
Ross discloses: The pooling unit according to claim 2, wherein the pooler performs one or more pooling calculations in parallel ([Para 0055] “In order to effectively perform convolution calculations using the systolic array, the neural network processor parallelizes matrix multiplications having large dimensional spaces, which are generally required for convolution calculations.”).
Ross does not explicitly disclose: wherein the number of pooling calculations equals a number of output channels in the matrix processor, such that, independent of a kernel size, the pooling output corresponds to a width of the matrix processor.
However, Lau discloses in the same field of endeavor: wherein the number of pooling calculations equals a number of output channels in the matrix processor, such that, independent of a kernel size, the pooling output corresponds to a width of the matrix processor ([Para 0103] “FIG. 6A illustrates a simplified example of forward pooling (e.g., performed by matrix processing engine 500 of FIG. 5). The illustrated example performs forward pooling on an input feature map 610 with dimensions H×W (e.g., height H and width W).”).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the method for rotating data for neural network computations taught by Ross with the method for max pooling taught by Lau ([Abstract, Lau]).

Regarding Claim 6
Ross in view of Lau discloses: The pooling unit according to claim 1, wherein the input data corresponds to a set of feature maps, and wherein the pooler uses the reformatted input data to reduce, by a predetermined factor, at least one of a height and a width of the set of feature maps ([Para 0103] Lau “FIG. 6A illustrates a simplified example of forward pooling (e.g., performed by matrix processing engine 500 of FIG. 5). The illustrated example performs forward pooling on an input feature map 610 with dimensions H×W (e.g., height H and width W).”).

Regarding Claim 7
Ross in view of Lau discloses: The pooling unit according to claim 1, wherein the rows have the same width as the input data, each row comprising sections of data that correspond to a set of neighborhood values in a matrix ([Para 0103] Lau “FIG. 6A illustrates a simplified example of forward pooling (e.g., performed by matrix processing engine 500 of FIG. 5). The illustrated example performs forward pooling on an input feature map 610 with dimensions H×W (e.g., height H and width W). Moreover, the illustrated example uses a 4×4 filter size with a stride of 4 in both the horizontal and vertical directions. In the illustrated example, the stride and filter size are equal for ease of illustration.” [Para 0112] “For backward pooling, the macro-column width may be fixed at a particular size, such as 32 elements. Moreover, there may also be a maximum supported filter size, such as 16×16 elements. Accordingly, in some embodiments, the size of the active feature map may be 16 row elements by 32 column elements, or 512 elements.”).
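As a hedged illustration of the forward pooling geometry quoted from Lau above (a 4×4 filter with a stride of 4 over an H×W feature map), the following Python sketch shows how each output value is drawn from one neighborhood of the input and how the height and width shrink by the stride. The function name `max_pool2d` and the 8×8 example input are assumptions chosen for this sketch and do not appear in Lau or Ross.

```python
from typing import List


def max_pool2d(fmap: List[List[int]], size: int, stride: int) -> List[List[int]]:
    """Max-pool a 2D feature map with a square `size` x `size` filter."""
    height, width = len(fmap), len(fmap[0])
    pooled: List[List[int]] = []
    # Slide the filter down and across the map in steps of `stride`.
    for top in range(0, height - size + 1, stride):
        out_row: List[int] = []
        for left in range(0, width - size + 1, stride):
            # Gather the neighborhood covered by the filter at this position.
            neighborhood = [
                fmap[top + i][left + j] for i in range(size) for j in range(size)
            ]
            out_row.append(max(neighborhood))
        pooled.append(out_row)
    return pooled


if __name__ == "__main__":
    # Hypothetical 8x8 feature map whose entries count up row by row.
    feature_map = [[r * 8 + c for c in range(8)] for r in range(8)]
    print(max_pool2d(feature_map, size=4, stride=4))
```

With the filter size equal to the stride, the neighborhoods do not overlap, so the 8×8 input in this sketch reduces to a 2×2 output.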

Regarding Claim 8
Ross in view of Lau discloses: The pooling unit according to claim 1, further comprising a state machine that shifts the pooling output into an output array ([Para 0133] Lau “Where appropriate, any of the foregoing may be used to build or describe appropriate discrete or integrated circuits, whether sequential, combinatorial, state machines, or otherwise.” [Para 0050] discloses that the matrix processing unit can perform shift operations.).

Regarding Claim 20
Ross in view of Lau discloses: The method according to claim 17, wherein pooling result comprises one of an average pooling result ([Para 0009] Ross “Generating the layer output from the accumulated output comprises normalizing and pooling the accumulated output to generate the layer output.”) and a max pooling result ([Para 0013] Lau “The matrix processing functionality described throughout this disclosure provides an efficient hardware-based approach for performing max pooling in an artificial neural network.”).
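To make the distinction between the two recited result types concrete, the following minimal Python sketch reduces the same window of values once by averaging and once by taking the maximum. The function name `pool_window` and the `mode` strings are assumptions for illustration only and are not drawn from Ross, Lau, or the application.

```python
from typing import Sequence


def pool_window(values: Sequence[float], mode: str) -> float:
    """Reduce one pooling window to a single value.

    `mode` is a hypothetical selector: "average" or "max".
    """
    if not values:
        raise ValueError("pooling window must not be empty")
    if mode == "average":
        # Average pooling: arithmetic mean of the window.
        return sum(values) / len(values)
    if mode == "max":
        # Max pooling: largest value in the window.
        return float(max(values))
    raise ValueError(f"unsupported pooling mode: {mode}")


if __name__ == "__main__":
    window = [1, 2, 3, 6]
    print(pool_window(window, "average"), pool_window(window, "max"))
```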

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Yang et al. (US 20160342888, hereinafter "Yang") also describes aligning arrays to perform pooling.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TEWODROS E MENGISTU whose telephone number is (571)270-7714. The examiner can normally be reached Mon-Fri 9:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ABDULLAH KAWSAR, can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197.





/TEWODROS E MENGISTU/Examiner, Art Unit 2127                                                                                                                                                                                                        
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127