DETAILED ACTION
This is in response to the application filed on 12/3/2019 in which claims 1 – 25 are presented for examination.
Status of Claims
Claims 1 – 25 are pending, of which claims 1 and 16 are in independent form.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 4/27/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3 – 5, 10, 16, 18, and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Martin, U.S. Patent Application 2019/0147325 (hereinafter referred to as Martin).

Referring to claim 1, Martin discloses “A convolutional neural network processor configured to compute an input data” (Fig. 2 and [0077] hardware to implement a CNN, [0078] to process input data), “comprising: an information decode unit configured to receive a program input and a plurality of weight parameter inputs” ([0081] memory interface 210 receives weights and commands) “and comprising: a decoding module receiving the program input and outputting an operational command according to the program input” (Fig. 2 command decoder 220); “a parallel processing module electrically connected to the decoding module, receiving the weight parameter inputs and comprising a plurality of parallel processing sub-modules, wherein the parallel processing sub-modules produce a plurality of weight parameter outputs according to the operational command and the weight parameter inputs” (Figs. 2 and 3 and [0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder); “and a convolutional neural network inference unit electrically connected to the information decode unit and comprising: a computing module electrically connected to the parallel processing module, wherein the computing module computes to produce an output data according to the input data and the weight parameter outputs” ([0231] command decoder could configure the width converter 250. Fig. 2 width converter 250, activation 255, normalization 265, pooling 275).

As per claim 3, Martin discloses “when the weight parameter inputs have a non-compressed form, the parallel processing sub-modules comprise: a plurality of parallel sub-memories configured to parallelly store the weight parameter inputs having the non-compressed form; and a plurality of parallel sub-processors, wherein each of the parallel sub-processors is electrically connected to the decoding module and one of the parallel sub-memories so that the parallel sub-processors parallelly receive the weight parameter inputs having the non-compressed form according to the operational command to produce the weight parameter outputs” ([0126] In some implementations, the weights may be provided to the neuron engine in a compressed format with the zeros removed. (Therefore, some weights are provided in an uncompressed format). Fig. 2 240a-n sub-memories, 245a-n sub-processors. [0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder).

As per claim 4, Martin discloses “when the weight parameter inputs have a compressed form, the parallel processing sub-modules comprise: a plurality of parallel sub-memories configured to parallelly store the weight parameter inputs having the compressed form; and a plurality of parallel sub-processors, wherein each of the parallel sub-processors is electrically connected to the decoding module and one of the parallel sub-memories so that the parallel sub-processors parallelly receive and decompress the weight parameter inputs having the compressed form according to the operational command to produce the weight parameter outputs” ([0126] In some implementations, the weights may be provided to the neuron engine in a compressed format with the zeros removed. Fig. 2 240a-n sub-memories, 245a-n sub-processors. [0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder).

As per claim 5, Martin discloses “wherein the weight parameter inputs comprise a plurality of first input weight parameters, the weight parameter outputs comprise a plurality of first output weight parameters, and the parallel processing sub-modules comprise: a plurality of parallel sub-memories configured to parallelly store the weight parameter inputs and comprising: a plurality of first parallel sub-memories provided, respectively, for parallelly receiving and storing one of the first input weight parameters; and a plurality of parallel sub-processors provided, respectively, for being electrically connected to the decoding module and one of the parallel sub-memories and comprising: a plurality of first parallel sub-processors provided, respectively, for being electrically connected to one of the first parallel sub-memories to receive one of the first input weight parameters according to the operational command for outputting one of the first output weight parameters” (Fig. 2 weight buffers 240a-n, neuron engines 245a-n connected via crossbar 242 to a weight buffer).

As per claim 10, Martin discloses “the weight parameter inputs further comprise at least one second input weight parameter, the weight parameter outputs further comprise at least one second output weight parameter, the parallel sub-memories further comprise at least one second parallel sub-memory configured to parallelly receive and store the at least one second input weight parameter, the parallel sub-processors further comprise at least one second parallel sub-processor electrically connected to the at least one second parallel sub-memory, and the at least one second parallel sub-processor receives the at least one second input weight parameter according to the operational command to output the at least one second output weight parameter” (Fig. 2 weight buffers 240a-n, neuron engines 245a-n connected via crossbar 242 to a weight buffer).

Referring to claim 16, claim 16 recites limitations corresponding to those of claim 1. Therefore, the rejection of claim 1 applies to claim 16.

Note, claim 18 recites the corresponding limitations of claim 3.  Therefore, the rejection of claim 3 applies to claim 18.

Note, claim 19 recites the corresponding limitations of claims 3 and 4.  Therefore, the rejections of claims 3 and 4 apply to claim 19.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Martin in view of Japanese Patent Application JP 3210319 B2 (hereinafter referred to as JP).

As per claim 2, Martin discloses “the decoding module”, “the program input”, “a command decoder”, and “wherein the command decoder decodes the program input to output the operational command” ([0081] memory interface 210 receives weights and commands, Fig. 2 command decoder 220, Figs. 2 and 3 and [0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder).
Martin does not appear to explicitly disclose “the decoding module comprises: a program memory configured to store the program input; and a command decoder electrically connected to the program memory, wherein the command decoder decodes the program input to output the operational command.”
However, JP discloses “the decoding module comprises: a program memory configured to store the program input; and a command decoder electrically connected to the program memory, wherein the command decoder decodes the program input to output the operational command” (last paragraph on page 4, neuron program holding / interpreting means.  Page 6 lines 27 – 29 program holding / interpreting means for decomposing a program).
Martin and JP are analogous art because they are from the same field of endeavor, which is machine learning circuits with parallel processing.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Martin and JP before him or her, to modify the teachings of Martin to include the teachings of JP so that a program memory is connected to the command decoder and stores program input.
The motivation for doing so would have been to overcome the problems of transmission time of input data overhead and ensuring no deviation in transmission and reception timing (as described by JP on page 4 at lines 8 – 11 and 24 – 28).
Therefore, it would have been obvious to combine JP with Martin to obtain the invention as specified in the instant claim.

Note, claim 17 recites the corresponding limitations of claim 2.  Therefore, the rejection of claim 2 applies to claim 17.

Claims 11, 20, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Martin in view of Dally et al., U.S. Patent Application 2018/0046900 (hereinafter referred to as Dally).

As per claim 11, Martin discloses “the first parallel sub-processors” (Fig. 2 weight buffers 240a-n, neuron engines 245a-n) “and computing for producing” “computing data according to the first output weight parameters and the input data” ([0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder. Fig. 2 width converter 250, activation 255, normalization 265, pooling 275).
Martin does not appear to explicitly disclose “the computing module comprises: a 3x3 computing sub-module electrically connected to the first parallel sub-processors and computing for producing a 3x3 post-processing computing data according to the first output weight parameters and the input data; and a 1x1 computing sub-module electrically connected to the at least one second parallel sub-processor and the 3x3 computing sub-module and computing for producing a 1x1 post-processing computing data according to the at least one second output weight parameter and the 3x3 post-processing computing data; wherein the output data is the 1x1 post-processing computing data.”
However, Dally discloses “the computing module comprises: a 3x3 computing sub-module electrically connected to the first parallel sub-processors and computing for producing a 3x3 post-processing computing data according to the first output weight parameters and the input data; and a 1x1 computing sub-module electrically connected to the at least one second parallel sub-processor and the 3x3 computing sub-module and computing for producing a 1x1 post-processing computing data according to the at least one second output weight parameter and the 3x3 post-processing computing data; wherein the output data is the 1x1 post-processing computing data” ([0051] convolution layers characterized by a set of filters that are usually 1x1 or 3x3).
Martin and Dally are analogous art because they are from the same field of endeavor, which is machine learning circuits with parallel processing.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Martin and Dally before him or her, to modify the teachings of Martin to include the teachings of Dally so that a 3x3 computing sub-module and a 1x1 computing sub-module compute data according to weights.
The motivation for doing so would have been to provide for the filter layers of a neural network (as described in Dally at [0005] and [0051]).
Therefore, it would have been obvious to combine Dally with Martin to obtain the invention as specified in the instant claim.

As per claim 20, Martin discloses “the weight parameter outputs comprise a plurality of first output weight parameters” (Fig. 2 weight buffers 240a-n, neuron engines 245a-n connected via crossbar 242 to a weight buffer. [0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder).
Martin does not appear to explicitly disclose “the computing module comprises a 3x3 computing sub-module, and the computing step comprises: performing a first computing sub-step to drive the 3x3 computing sub-module to receive the input data and the first output weight parameters for producing a 3x3 post-processing computing data.”
However, Dally discloses “the computing module comprises a 3x3 computing sub-module, and the computing step comprises: performing a first computing sub-step to drive the 3x3 computing sub-module to receive the input data and the first output weight parameters for producing a 3x3 post-processing computing data” ([0051] convolution layers characterized by a set of filters that are usually 1x1 or 3x3).
Martin and Dally are analogous art because they are from the same field of endeavor, which is machine learning circuits with parallel processing.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Martin and Dally before him or her, to modify the teachings of Martin to include the teachings of Dally so that a 3x3 computing sub-module and a 1x1 computing sub-module compute data according to weights.
The motivation for doing so would have been to provide for the filter layers of a neural network (as described in Dally at [0005] and [0051]).
Therefore, it would have been obvious to combine Dally with Martin to obtain the invention as specified in the instant claim.

As per claim 23, Martin discloses “the weight parameter outputs further comprise at least one second output weight parameter” (Fig. 2 weight buffers 240a-n, neuron engines 245a-n connected via crossbar 242 to a weight buffer. [0116] neuron engines 245a-n receive input data and weight data and operate on data indicated by configuration information from the command decoder).
Martin does not appear to explicitly disclose “the computing module comprises a 1x1 computing sub-module, and the computing step further comprises: performing a second computing sub-step to drive the 1x1 computing sub-module to receive the 3x3 post-processing computing data and the at least one second output weight parameter so as to produce a 1x1 post-processing computing data.”
However, Dally discloses “the computing module comprises a 1x1 computing sub-module, and the computing step further comprises: performing a second computing sub-step to drive the 1x1 computing sub-module to receive the 3x3 post-processing computing data and the at least one second output weight parameter so as to produce a 1x1 post-processing computing data” ([0051] convolution layers characterized by a set of filters that are usually 1x1 or 3x3).
Martin and Dally are analogous art because they are from the same field of endeavor, which is machine learning circuits with parallel processing.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Martin and Dally before him or her, to modify the teachings of Martin to include the teachings of Dally so that a 3x3 computing sub-module and a 1x1 computing sub-module compute data according to weights.
The motivation for doing so would have been to provide for the filter layers of a neural network (as described in Dally at [0005] and [0051]).
Therefore, it would have been obvious to combine Dally with Martin to obtain the invention as specified in the instant claim.

Allowable Subject Matter
Claims 6 – 9, 12 – 15, 21, 22, 24, and 25 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
WIPO Publication WO 2018/214913 A1 teaches a multi-layer neural network with weights.
Chinese Patent Application CN 113326916 A teaches a 3x3 weight matrix and a data input matrix of 3x3.
Chinese Patent Application CN 108268939 A teaches a neural network with parallel calculation including weights and biases.
Chinese Patent Application CN 107886166 A teaches a neural network with parallel calculation and a weight matrix.
Japanese Patent Application JP 6257848 B1 teaches a 3x3 bias matrix and a weight matrix.
Korean Patent Application KR 20200139829 A teaches a neural network with parallel calculation, an input neuron matrix, and a weight matrix.
Korean Patent Application KR 20200060302 A teaches a neural network with parallel calculation and a weight matrix.
U.S. Patent Application 20180096226 and Patent 10489680 teach neural networks with 3x3 input data and 3x3 kernel.
U.S. Patent Applications 20190147324, 20190147326, 20190147327 and Patent 11182668 are copending applications to Martin.
U.S. Patent Application 20200293867 and Patent 11270197 teach a neural network with processing elements including a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
U.S. Patent Application 20190236135 teaches parallel convolution in a neural network and providing the data to an inference layer.
U.S. Patent Application 20190244100 teaches convolution performed on data in registers, e.g., 3x3 convolution based on 3x3 kernel data (weighting).
U.S. Patent Application 20190318229 teaches a neural network with inference pipeline.
U.S. Patent Application 20200074202 teaches a neural network with hidden layers and weights.
U.S. Patent Application 20220067513 teaches a weight buffer for a processing element in a neural network.
‘A Programmable Parallel Accelerator for Learning and Classification’ by Srihari Cadambi et al., PACT ’10, September 2010 teaches an accelerator with hundreds of simple processing elements (PEs) laid out in a two-dimensional grid and in-memory processing.
‘On the Properties of Neural Machine Translation: Encoder–Decoder Approaches’ by Kyunghyun Cho et al., October 2014 teaches a neural network with four weight matrices.
‘A Parallel Computing Platform for Training Large Scale Neural Networks’ by Rong Gu et al., 2013 IEEE International Conference on Big Data teaches neurons in adjacent layers are fully connected with weights and biases.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEVEN G SNYDER, whose telephone number is (571) 270-1971. The examiner can normally be reached M-F 8:00am-4:30pm (flexible).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henry Tsai, can be reached at 571-272-4176. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/STEVEN G SNYDER/
Primary Examiner, Art Unit 2184