DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The applicant’s submission of the Information Disclosure Statement(s) (IDS), received 04/13/2018, 09/28/2018, 12/19/2018, 06/25/2019, 10/03/2019, 10/24/2019, 06/29/2020, and 09/07/2020, in compliance with 37 CFR 1.97 and 37 CFR 1.98 is acknowledged by the examiner. The examiner has considered the cited references in examination of the application and attached signed and dated copies to the Office action.	

	
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Yan et al., U.S. Patent Application Publication No. 2018/0218518 (hereinafter Yan).
Regarding claim 1, Yan teaches a neural network module [Deep Learning Accelerator (DLA) 200. FIG. 2A], comprising: 
a plurality of neurons [Processing Engines (PE) 250. Paragraph 38; FIGS. 2B, 2C];
a memory device [Memory 504. FIG. 5] storing a
first buffer including first data for processing by the plurality of neurons in the neural network module [Memory 504 stores input data for DLA 200 (Paragraph 53), which includes input activations, and, therefore, comprises a section for input activations (i.e. first buffer). Paragraph 27], and 
a second buffer storing second data for processing by the plurality of neurons in the neural network module [Memory 504 stores input data for DLA 200 (Paragraph 53), which includes weights, and, therefore, comprises a section for weights (i.e. second buffer). Paragraph 27], wherein the first data in the first buffer and the second data in the second buffer are organized into corresponding rows and columns [The input activations and weights are three-dimensional matrices, each having rows and columns. Paragraphs 46 and 48; FIGS. 4A and 4B]; and
wherein the neural network module is configured to 
determine whether the first data in a column of the first buffer comprises a predetermined value or range of values, or whether the second data in a corresponding column of the second buffer comprises the predetermined value or range of values [The compaction engine determines whether the input data, which includes weights and activations and is organized in matrices (i.e. is in a column), comprises zero values (i.e. predetermined value) or values close to zero (i.e. range of values) and sets a bitmask accordingly. Paragraphs 44, 34, 35], and
cause the plurality of neurons to skip processing of the first data and the second data if the first data or the second data comprises the predetermined value or range of values [The zero masks control the zero gating control unit 270 to cause the PEs (i.e. neurons) to skip processing of the weights and input activations by preventing loading of the input registers and preventing switching of the multipliers. Paragraphs 41-42].
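For illustration only, the zero-mask and gating behavior mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Yan's implementation: the function names `zero_mask` and `gated_dot` are hypothetical, and the hardware acts of preventing register loading and multiplier switching are modeled simply as skipping a multiplication.

```python
from typing import List, Tuple


def zero_mask(values: List[float]) -> List[bool]:
    """Return True at each position whose value is exactly zero."""
    return [v == 0 for v in values]


def gated_dot(activations: List[float], weights: List[float]) -> Tuple[float, int]:
    """Multiply-accumulate that skips any pair containing a zero operand.

    Returns the accumulated sum and the number of skipped multiplications.
    """
    if len(activations) != len(weights):
        raise ValueError("operand lists must have equal length")
    a_mask = zero_mask(activations)
    w_mask = zero_mask(weights)
    total = 0.0
    skipped = 0
    for a, w, a_zero, w_zero in zip(activations, weights, a_mask, w_mask):
        if a_zero or w_zero:
            # Hypothetical stand-in for gating the multiplier inputs.
            skipped += 1
            continue
        total += a * w
    return total, skipped
```

In this sketch a pair is skipped when either operand is zero, which mirrors the "first data or the second data" condition recited in the claim.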


Regarding claim 2, Yan teaches the neural network module of claim 1, wherein the predetermined value comprises zero, a range of values, or values above or below a threshold value [The predetermined values are zero and values within a threshold of zero (i.e. a range of values and values above or below a threshold). Paragraphs 34, 35, 44].
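For illustration only, a mask covering both zero and values within a threshold of zero can be sketched as follows. The function name `near_zero_mask`, the `threshold` parameter, and the choice of an inclusive comparison are assumptions made for this sketch and are not taken from Yan.

```python
from typing import List


def near_zero_mask(values: List[float], threshold: float = 0.0) -> List[bool]:
    """Mark values whose magnitude is at or below `threshold` as skippable.

    With the default threshold of 0.0, only exact zeros are marked.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return [abs(v) <= threshold for v in values]
```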

Regarding claim 3, Yan teaches the neural network module of claim 1, wherein the first data in the first buffer comprises input data to a neural network [The input activations (i.e. data in the first buffer) are input for a neural network. Paragraph 33].

Regarding claim 4, Yan teaches the neural network module of claim 1, wherein the second data in the second buffer comprise weights associated with a neural network [The weights (i.e. data in the second buffer) are input for a neural network. Paragraph 33].

Regarding claim 5, Yan teaches the neural network module of claim 1, wherein the neural network module further comprises a group partitioner and scheduler, and wherein the group partitioner and scheduler determines whether the first data in the column of the first buffer comprise the predetermined value or whether the second data in the column of the second buffer comprise the predetermined value [The DLA comprises a compaction engine 215 (i.e. group partitioner and scheduler). Paragraph 27. The compaction engine determines whether the input data, which includes weights and activations and is organized in matrices (i.e. is in a column), comprises zero values (i.e. predetermined value) or values close to zero (i.e. range of values) and sets a bitmask accordingly. Paragraphs 44, 34, 35].

Regarding claim 6, Yan teaches the neural network module of claim 1, wherein the plurality of neurons use ReLu (y=max(x,0)) as an activation function for a neural network [The PEs (neurons) use the post processor to perform ReLU as output (activation function). Paragraph 32].

Regarding claim 7, Yan teaches the neural network module of claim 1, wherein the plurality of neurons are configured to process the first data and the second data synchronously [The PEs (i.e. neurons) receive broadcast data and operate on respective data each cycle and, therefore, operate synchronously. Paragraph 36; FIG. 2B].

Regarding claim 8, Yan teaches a neural network module [Deep Learning Accelerator (DLA) 200. FIG. 2A], comprising: 
a plurality of neurons [Processing Engines (PE) 250. Paragraph 38; FIGS. 2B, 2C]; and
a memory device [Memory 504. FIG. 5] storing a first buffer storing first data for processing by the plurality of neurons in the neural network module [Memory 504 stores input data for DLA 200 (Paragraph 53), which includes input activations, and, therefore, comprises a section for input activations (i.e. first buffer). Paragraph 27], and wherein the neural network module is configured to 
determine whether data in the first buffer comprises a predetermined value or range of values [The compaction engine determines whether the input data, which includes weights and activations and is organized in matrices (i.e. is in a column), comprises zero values (i.e. predetermined value) or values close to zero (i.e. range of values) and sets a bitmask accordingly. Paragraphs 44, 34, 35], and
skip processing of the data in the first buffer if the data comprises the predetermined value or range of values [The zero masks control the zero gating control unit 270 to cause the PEs (i.e. neurons) to skip processing of the weights and input activations by preventing loading of the input registers and preventing switching of the multipliers. Paragraphs 41-42].

Regarding claim 9, Yan teaches the neural network module of claim 8, wherein the predetermined value comprises zero, a range of values, or values above or below a threshold value [The predetermined values are zero and values within a threshold of zero (i.e. a range of values and values above or below a threshold). Paragraphs 34, 35, 44].

Regarding claim 10, Yan teaches the neural network module of claim 8, wherein the first data in the first buffer comprises input data to a neural network [The input activations (i.e. data in the first buffer) are input for a neural network. Paragraph 33].

Regarding claim 11, Yan teaches the neural network module of claim 8, wherein the neural network module further comprises a group partitioner and scheduler, and wherein the group partitioner and scheduler determines whether the data located in the first buffer comprises the predetermined value [The DLA comprises a compaction engine 215 (i.e. group partitioner and scheduler). Paragraph 27. The compaction engine determines whether the input data, which includes weights and activations and is organized in matrices (i.e. is in a column), comprises zero values (i.e. predetermined value) or values close to zero (i.e. range of values) and sets a bitmask accordingly. Paragraphs 44, 34, 35].

Regarding claim 12, Yan teaches the neural network module of claim 8, wherein the plurality of neurons use ReLu (y=max(x,0)) as an activation function for a neural network [The PEs (neurons) use the post processor to perform ReLU as output (activation function). Paragraph 32].

Regarding claim 13, Yan teaches the neural network module of claim 8, wherein the plurality of neurons are configured to process the first data and second data in a second buffer asynchronously [The zero gating control unit prevents the loading/storing of input activations and weights (i.e. first data and second data) that are equal to zero, thereby causing the different neurons to operate asynchronously relative to each other when skipping operations. Paragraph 41].

Regarding claim 14, Yan teaches the neural network module of claim 8, wherein the plurality of neurons are configured to process the first data and second data in a second buffer synchronously [The PEs (i.e. neurons) receive broadcast data and operate on respective data each cycle and, therefore, operate synchronously. Paragraph 36; FIG. 2B].

Regarding claim 15, Yan teaches a neural network module [Deep Learning Accelerator (DLA) 200. FIG. 2A], comprising: 
a plurality of neurons [Processing Engines (PE) 250. Paragraph 38; FIGS. 2B, 2C];
a memory device [Memory 504. FIG. 5] storing
a first buffer storing first data for processing by the plurality of neurons in the neural network module [Memory 504 stores input data for DLA 200 (Paragraph 53), which includes input activations, and, therefore, comprises a section for input activations (i.e. first buffer). Paragraph 27], and
a second buffer storing second data for processing by the plurality of neurons in the neural network module [Memory 504 stores input data for DLA 200 (Paragraph 53), which includes weights, and, therefore, comprises a section for weights (i.e. second buffer). Paragraph 27], wherein the first data in the first buffer and the second data in the second buffer are organized into corresponding rows and columns [The input activations and weights are three-dimensional matrices, each having rows and columns. Paragraphs 46 and 48; FIGS. 4A and 4B]; and wherein the neural network module is configured to
determine whether data located at a row and column in the first buffer or the second buffer comprises a predetermined value or range of values [The compaction engine determines whether the input data, which includes weights and activations and is organized in matrices (i.e. is in a column), comprises zero values (i.e. predetermined value) or values close to zero (i.e. range of values) and sets a bitmask accordingly. Paragraphs 44, 34, 35],
cause a first neuron of the plurality of neurons to skip processing of the data located at the row and column if the data comprises the predetermined value or range of values [The zero masks control the zero gating control unit 270 to cause the PEs (i.e. neurons) to skip processing of the weights and input activations by preventing loading of the input registers and preventing switching of the multipliers. Paragraphs 41-42], and
cause the first neuron of the plurality of neurons to perform at least one operation on behalf of a second neuron of the plurality of neurons responsive to skipping processing of the data located at the row and column [When a particular PE (i.e. a first neuron) prevents processing of a weight or input activation (i.e. responsive to skipping processing), a zero is output as the product. Paragraph 41; FIG. 2C. This zero output/product is combined with the products of other PEs (Paragraph 37; FIG. 2B) and, therefore, the particular PE (i.e. first neuron) performs an operation (i.e. the outputting of a zero) on behalf of other PEs (i.e. second neuron)].
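For illustration only, the reading applied above, in which a gated PE outputs a zero product that is then combined with the products of the other PEs, can be sketched as follows. The names `pe_products` and `combine` are hypothetical, one list element stands in for one PE, and the combining step is modeled as a plain sum; none of these details are taken from Yan.

```python
from typing import List


def pe_products(activations: List[float], weights: List[float]) -> List[float]:
    """Return one product per PE; a gated PE outputs 0.0 without multiplying."""
    if len(activations) != len(weights):
        raise ValueError("operand lists must have equal length")
    return [
        0.0 if (a == 0 or w == 0) else a * w
        for a, w in zip(activations, weights)
    ]


def combine(products: List[float]) -> float:
    """Combine the per-PE products into a single accumulated value."""
    return sum(products)
```

Under these assumptions the combined value is the same whether or not the gated multiplications are performed, because each skipped product would have been zero.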

Regarding claim 16, Yan teaches the neural network module of claim 15, wherein the predetermined value comprises zero, a range of values, or values above or below a threshold value [The predetermined values are zero and values within a threshold of zero (i.e. a range of values and values above or below a threshold). Paragraphs 34, 35, 44].

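Regarding claim 17, Yan teaches the neural network module of claim 15, wherein the neural network module is further configured to combine results of the at least one operation performed by the first neuron on behalf of the second neuron with results of one or more operations performed by the second neuron [The zero output/product of the particular PE (i.e. first neuron) is combined with the products of the other PEs (i.e. second neuron). Paragraph 37; FIG. 2B].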

Regarding claim 18, Yan teaches the neural network module of claim 15, wherein the first data in the first buffer comprises input data to a neural network [The input activations (i.e. data in the first buffer) are input for a neural network. Paragraph 33] and wherein the second data in the second buffer comprise weights associated with the neural network [The weights (i.e. data in the second buffer) are input for a neural network. Paragraph 33].

Regarding claim 19, Yan teaches the neural network module of claim 15, wherein the plurality of neurons use ReLu (y=max(x,0)) as an activation function for a neural network [The PEs (neurons) use the post processor to perform ReLU as output (activation function). Paragraph 32].

Regarding claim 20, Yan teaches the neural network module of claim 15, wherein the plurality of neurons are configured to process the first data and the second data asynchronously [The zero gating control unit prevents the loading/storing of input activations and weights (i.e. first data and second data) that are equal to zero, thereby causing the different neurons to operate asynchronously relative to each other when skipping operations. Paragraph 41].




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Brothers et al., U.S. Patent Application Publication No. 2016/0358069, teaches a neural network engine wherein computations involving zero-valued inputs or weights are skipped.
Turakhia et al., U.S. Patent Application Publication No. 2018/0164866, teaches a device for processing neural networks comprising a plurality of computation units that disable the loading and processing of zero-valued weights or activations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN P GEIB whose telephone number is (571)272-8628.  The examiner can normally be reached on Monday - Friday 8:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached on (571)270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BENJAMIN P GEIB/
Primary Examiner, Art Unit 2123