DETAILED ACTION

Status of Application
Claims 1-20 are pending in the present application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/23/2019 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 2, 6, 7-10, 14, and 16-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Ross et al (hereinafter Ross), U.S. Publication No. 2016/0342893 A1, in view of Kowald et al (hereinafter Kowald), U.S. Publication No. 2019/0279353 A1.
	Referring to claims 1 and 16, taking claim 1 as exemplary, Ross discloses a processing unit comprising:
an array of processing elements [fig. 4, cells 404] comprising a sub-array of primary processing elements [fig. 4, cells 404 on the left edge (left column of cells)] and secondary processing elements bordering the sub-array [fig. 4, 2nd column of cells 404 to the right of the left edge, 2nd column bordering the sub-array]; and
a controller [fig. 3, sequencer 306] in communication with the processing elements in the array [fig. 3, sequencer 306 in communication with matrix computation unit 312, shown as 406 in fig. 4],
wherein, for computing a convolution, the controller pre-loads activation values from an activations matrix into the primary processing elements such that each primary processing element stores a corresponding activation value [paragraphs 4, 43, the value loaders 402 can receive the activation inputs from a unified buffer, e.g., the unified buffer 308 of FIG. 3. Each value loader can send a corresponding activation input to a distinct left-most cell of the array 406. The left-most cell can be a cell along a left-most column of the array 406. For example, value loader 412 can send an activation input to cell 414. The value loader can also send the activation input to an adjacent value loader, and the activation input can be used at another left-most cell of the array 406; The set of activation inputs can also be represented as a matrix structure; paragraphs 35, 39, sequencer 306, which converts the instructions into low level control signals that control the circuit to perform the neural network computations. In some implementations, the control signals regulate dataflow in the circuit, e.g., how the sets of weight inputs and the sets of activation inputs flow through the circuit. The sequencer 306 can send the control signals to a unified buffer 308… the unified buffer 308 can send the activation inputs to the matrix computation unit 312],
wherein, during an initial clock cycle, the controller selects a specific weight value from a weights kernel [paragraphs 4, 44, The weight fetcher interface 408 can select a weight input; Kernels can be represented as a matrix structure of weight inputs], the controller loads the specific weight value into all the primary processing elements [paragraphs 51-53, The cell can also shift the weight input to adjacent cells for processing. For example, the weight register 502 can send the weight input to another weight register in the bottom adjacent cell. Both the weight input and the activation input can therefore be reused by other cells in the array; The control register can store a control signal that determines whether the cell should shift either the weight input or the activation input to adjacent cells. In some implementations, shifting the weight input or the activation input takes one or more clock cycles; In some implementations, weights are pre-shifted into a weight path register 512. The weight path register 512 can receive the weight input, e.g., from a top adjacent cell, and transfer the weight input to the weight register 502 based on the control signal; hence the weight input can be shifted into all the primary processing elements in one initial clock cycle], and each primary processing element performs a multiply-accumulate operation using the corresponding activation value and the specific weight value [fig. 5 shows the contents of a cell; paragraph 59, Multiplication circuitry 508 can be used to multiply the weight input from the weight register 502 with the activation input from the activation register 506. The multiplication circuitry 508 can output the product to summation circuitry 510]; and
wherein, during each successive clock cycle, the controller follows a pattern to select a next weight value from the weights kernel [paragraph 64, When a matrix structure is sent to a cell, a first element of the matrix can be stored in the cell during one clock cycle. On the next clock cycle, a next element can be stored in the cell; fig. 11, rotating pattern of weights (see paragraphs 105, 49, 59); also see paragraph 35, where the sequencer controls the flow of weights through the circuit].
Ross does not explicitly disclose the controller following a spiral pattern.
However, Kowald discloses the controller following a spiral pattern [paragraphs 41, 165, applying a spiral sampling pattern based on a location of the leaves in the selected images; combining the spiral sampling pattern 816 and the gridded sampling pattern 814 by means of convolution].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Kowald in the invention of Ross, to implement the controller following a spiral pattern, in order to reduce assessment time [Kowald, paragraphs 9-10].
Referring to claims 2 and 10, taking claim 2 as exemplary, the modified Ross discloses the processing unit of claim 1, further comprising a memory accessible by the controller and storing the activations matrix and the weights kernel [Ross, fig. 3, memories 308, 310].
Referring to claims 6, 14, and 17, taking claim 6 as exemplary, the modified Ross discloses the processing unit of claim 1, wherein the spiral pattern begins with a first weight at a top left corner of the weights kernel and ends with a last weight at a center of the weights kernel [Kowald, paragraphs 41, 165, discloses the controller following a spiral pattern].
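For illustration only, the traversal order recited in claim 6 (first weight at the top left corner, last weight at the center) can be sketched as follows. This is a minimal sketch and is not taken from Ross, Kowald, or the application: the function name spiral_order, the clockwise direction, and the restriction to an odd kernel size M are assumptions made for the example.

```python
def spiral_order(m):
    """Return (row, col) positions of an m x m kernel in an inward spiral.

    Assumptions (hypothetical, for illustration): the spiral runs clockwise,
    starts at the top left corner (0, 0), and m is odd so that the last
    position visited is the center of the kernel.
    """
    top, bottom, left, right = 0, m - 1, 0, m - 1
    order = []
    while top <= bottom and left <= right:
        # Top row of the current ring, left to right.
        for c in range(left, right + 1):
            order.append((top, c))
        # Right column, top to bottom (corner already visited).
        for r in range(top + 1, bottom + 1):
            order.append((r, right))
        # Bottom row, right to left, if the ring has more than one row.
        if top < bottom:
            for c in range(right - 1, left - 1, -1):
                order.append((bottom, c))
        # Left column, bottom to top, if the ring has more than one column.
        if left < right:
            for r in range(bottom - 1, top, -1):
                order.append((r, left))
        # Move to the next inner ring.
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order


if __name__ == "__main__":
    for step, (row, col) in enumerate(spiral_order(3)):
        print(f"cycle {step}: weight at row {row}, col {col}")
```

Under these assumptions a 3x3 kernel is visited in nine steps, one weight per step, and each step moves to a horizontally or vertically adjacent kernel position.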
Referring to claim 7, the modified Ross discloses the processing unit of claim 1, wherein the sub-array comprises a NxN sub-array [Ross, fig. 7, besides the citation of Ross for claim 1, Ross can also use the sub-array in fig. 7 where the sub-array would be a 3x3 matrix starting from the top left] comprising a first number (N) of columns of the primary processing elements and the first number (N) of rows of the primary processing elements [Ross, fig. 7, 3x3 sub-array would mean three columns and three rows], and wherein the activations matrix comprises a NxN matrix [fig. 7, see 702 with a 3x3 input matrix] comprising the first number (N) of columns of the activations and the first number (N) of rows of the activations [fig. 7].
Referring to claim 8, the modified Ross discloses the processing unit of claim 1, wherein the weights kernel comprises a MxM weights kernel [Ross, fig. 7, see Kernels with 3x3x10 dimension] comprising a second number (M) of columns of weights, the second number (M) of rows of the weights [Ross, fig. 7, see Kernels with 3x3x10 dimension], wherein the computing of the convolution is completed in a third number (Y) of clock cycles, and wherein the third number (Y) is equal to the second number squared (M²).
	Referring to claim 9, Ross discloses a processing unit comprising:
an array of the processing elements [fig. 4, cells 404] comprising:
a sub-array of primary processing elements [fig. 4, cells 404 on the left edge (left column of cells)], wherein each primary processing element comprises a register [fig. 5, register 506] and a multiply-accumulate unit [fig. 5, elements 508, 510]; and 
secondary processing elements bordering the sub-array [fig. 4, 2nd column of cells 404 to the right of the left edge, 2nd column bordering the sub-array], wherein each secondary processing element comprises a buffer [fig. 5, element 504];
a controller [fig. 3, sequencer 306] in communication with the processing elements in the array [fig. 3, sequencer 306 in communication with matrix computation unit 312, shown as 406 in fig. 4],
wherein, for computing a convolution, the controller pre-loads activation values from an activations matrix into registers in the primary processing elements [paragraphs 4, 43, the value loaders 402 can receive the activation inputs from a unified buffer, e.g., the unified buffer 308 of FIG. 3. Each value loader can send a corresponding activation input to a distinct left-most cell of the array 406. The left-most cell can be a cell along a left-most column of the array 406. For example, value loader 412 can send an activation input to cell 414. The value loader can also send the activation input to an adjacent value loader, and the activation input can be used at another left-most cell of the array 406; The set of activation inputs can also be represented as a matrix structure; paragraphs 35, 39, sequencer 306, which converts the instructions into low level control signals that control the circuit to perform the neural network computations. In some implementations, the control signals regulate dataflow in the circuit, e.g., how the sets of weight inputs and the sets of activation inputs flow through the circuit. The sequencer 306 can send the control signals to a unified buffer 308… the unified buffer 308 can send the activation inputs to the matrix computation unit 312] such that each register of each primary processing element stores a corresponding activation value [paragraph 49, The cell can include an activation register 506 that stores an activation input. The activation register can receive the activation input from a left adjacent cell, i.e., an adjacent cell located to the left of the given cell, or from a unified buffer],
wherein, during an initial clock cycle, the controller selects a specific weight value from a weights kernel [paragraphs 4, 44, The weight fetcher interface 408 can select a weight input; Kernels can be represented as a matrix structure of weight inputs], the controller loads the specific weight value into all multiply-accumulate units in all the primary processing elements [paragraphs 51-53, The cell can also shift the weight input to adjacent cells for processing. For example, the weight register 502 can send the weight input to another weight register in the bottom adjacent cell. Both the weight input and the activation input can therefore be reused by other cells in the array; The control register can store a control signal that determines whether the cell should shift either the weight input or the activation input to adjacent cells. In some implementations, shifting the weight input or the activation input takes one or more clock cycles; In some implementations, weights are pre-shifted into a weight path register 512. The weight path register 512 can receive the weight input, e.g., from a top adjacent cell, and transfer the weight input to the weight register 502 based on the control signal; hence the weight input can be shifted into all the primary processing elements in one initial clock cycle], and each multiply-accumulate unit in each primary processing element performs a multiply-accumulate operation using the corresponding activation value and the specific weight value [fig. 5 shows the contents of a cell; paragraph 59, Multiplication circuitry 508 can be used to multiply the weight input from the weight register 502 with the activation input from the activation register 506. The multiplication circuitry 508 can output the product to summation circuitry 510], and
wherein, during each successive clock cycle, the controller follows a pattern when selecting a next weight value from the weights kernel [paragraph 105, fig. 11, rotated pattern] and loading the next weight value into the primary processing elements and the controller further follows the pattern when controlling accumulated partial product input selections within the processing elements across the array [paragraphs 105, 49, 59].
Ross does not explicitly disclose a spiral pattern.
However, Kowald discloses a spiral pattern [paragraphs 41, 165, applying a spiral sampling pattern based on a location of the leaves in the selected images; combining the spiral sampling pattern 816 and the gridded sampling pattern 814 by means of convolution].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize the teachings of Kowald in the invention of Ross, to implement a spiral pattern, in order to reduce assessment time [Kowald, paragraphs 9-10].



Allowable Subject Matter
Claims 3-5, 8, 11-13, 15, and 18-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein, during any given clock cycle, the multiplexor in a processing element receives accumulated partial product inputs from all adjacent processing elements in the array and all multiplexors in all of the processing elements receive a same specific control signal from the controller, and wherein the specific control signal causes the multiplexor to select one accumulated partial product input from one of the adjacent processing elements for processing such that the spiral pattern is followed, in combination with other recited limitations in claim 3.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the weights kernel comprises a MxM weights kernel comprising a second number (M) of columns of weights, the second number (M) of rows of the weights, wherein the computing of the convolution is completed in a third number (Y) of clock cycles, and wherein the third number (Y) is equal to the second number squared (M²), in combination with other recited limitations in claim 8.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein each of the primary processing elements and the secondary processing elements further comprises a multiplexor, wherein, during any given clock cycle, the multiplexor in a processing element receives accumulated partial product inputs from all adjacent processing elements in the array and all multiplexors in all of the processing elements receive a same specific control signal from the controller, and wherein the specific control signal causes the multiplexor to select one accumulated partial product input from one of the adjacent processing elements for processing such that the spiral pattern is followed, in combination with other recited limitations in claim 11.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the weights kernel comprises a MxM weights kernel comprising a second number (M) of columns of weights, the second number (M) of rows of the weights, wherein the computing of the convolution is completed in a third number (Y) of clock cycles, and wherein the third number (Y) is equal to the second number squared (M²), in combination with other recited limitations in claim 15.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest during each successive clock cycle: controlling accumulated partial product input selections within the processing elements across the array such that within each processing element one of multiple accumulated partial product inputs received from multiple adjacent processing elements, respectively, is selected according to the spiral pattern, wherein within each primary processing element the selected accumulated partial product input is used during the multiply-accumulate operation, wherein the multiply-accumulate operation comprises: determining a product of the corresponding activation value and the specific weight value; and determining a sum of the product and the selected accumulated partial product input, and wherein the sum is output to each adjacent processing element in the array as an accumulated partial product input available for selection during a next clock cycle and wherein within each secondary processing element the selected accumulated partial product input is buffered and the buffered accumulated partial product is output to each adjacent processing element as an accumulated partial product input available for selection during a next clock cycle, in combination with other recited limitations in claim 18.
The prior art of record taken alone or in combination fails to teach and/or fairly suggest wherein the weights kernel comprises a MxM weights kernel comprising a second number (M) of columns of weights, the second number (M) of rows of the weights, wherein the computing of the convolution is completed in a third number (Y) of clock cycles, and wherein the third number (Y) is equal to the second number squared (M²), in combination with other recited limitations in claim 20.
Claims 4-5, 12-13, and 19 are objected to by virtue of their dependency.
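For illustration only, the accumulated partial product selection recited for claims 3, 11, and 18, together with the M² cycle count recited for claims 8, 15, and 20, can be modelled in software as sketched below. This is one possible reading of the recited limitations under stated assumptions and is not the applicant's implementation: the names spiral_order and spiral_convolve are hypothetical, the spiral is assumed clockwise, M is assumed odd, the secondary processing elements are modelled as a border of width (M-1)/2 holding a zero activation so that they only buffer the selected input, inputs from outside the array are taken as zero, and the result is an unflipped (cross-correlation) convolution with zero padding.

```python
def spiral_order(m):
    """Kernel positions in a clockwise inward spiral from (0, 0); m assumed odd."""
    top, bottom, left, right = 0, m - 1, 0, m - 1
    order = []
    while top <= bottom and left <= right:
        order += [(top, c) for c in range(left, right + 1)]
        order += [(r, right) for r in range(top + 1, bottom + 1)]
        if top < bottom:
            order += [(bottom, c) for c in range(right - 1, left - 1, -1)]
        if left < right:
            order += [(r, left) for r in range(bottom - 1, top, -1)]
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order


def spiral_convolve(acts, kernel):
    """Model an N x N primary sub-array with a border of secondary elements."""
    n, m = len(acts), len(kernel)
    border = m // 2
    size = n + 2 * border
    # Pre-load: each primary element stores one activation; border elements
    # store none (modelled as 0, so they only buffer what they select).
    stored = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            stored[i + border][j + border] = acts[i][j]
    acc = [[0] * size for _ in range(size)]
    prev = None
    for kr, kc in spiral_order(m):          # one weight per clock cycle
        weight = kernel[kr][kc]             # same weight for every element
        # Same control signal for every element: which neighbour to select.
        dr, dc = (0, 0) if prev is None else (kr - prev[0], kc - prev[1])
        nxt = [[0] * size for _ in range(size)]
        for x in range(size):
            for y in range(size):
                sx, sy = x - dr, y - dc
                inside = 0 <= sx < size and 0 <= sy < size
                selected = acc[sx][sy] if prev is not None and inside else 0
                # Multiply-accumulate: product plus selected partial product.
                nxt[x][y] = stored[x][y] * weight + selected
        acc, prev = nxt, (kr, kc)
    # After M*M cycles each primary element holds the output centered on it.
    return [row[border:border + n] for row in acc[border:border + n]]


if __name__ == "__main__":
    acts = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(spiral_convolve(acts, ones))
```

In this model the loop over spiral_order(m) runs exactly M x M times, and the step between consecutive spiral positions serves as the single selection signal shared by all elements in a given cycle.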

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Agarwal et al (hereinafter Agarwal), U.S. Publication No. 2019/0102653 A1, discloses “In a preferred embodiment, the weights 1, 2, 4, . . . 128 are assigned to the eight elements of the 3×3 block of the weight function in a zigzag manner in the diagonal direction, starting from the top left element, as illustrated in FIG. 3. The use of the zigzag encoding shown in FIG. 3 is inspired by characteristics of normal handwriting where most of the characters are written from left to right and top to bottom. Other patterns can also be used, such as a zigzag pattern that is a mirror image of the one shown in FIG. 3 with respect to the upper-left to lower-right diagonal line or with respect to the upper-right to lower-left diagonal line, a zigzag in the reverse direction as that shown in FIG. 3, a spiral pattern, a column-by-column pattern, a row-by-row pattern, etc., depending upon the use-case” [paragraph 29].

Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARLEY J ABAD whose telephone number is (571) 270-3425. The examiner can normally be reached Mon-Thurs 8 AM - 7 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Idriss Alrobaye, can be reached at (571) 270-1023. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Farley Abad/           Primary Examiner, Art Unit 2181