DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 11/12/2021 has been entered.
 
Status of Claims
Claims 21, 27-31, 33, and 37-40 have been amended by Applicant. No claims have been currently cancelled or added. Claims 21-40 are currently pending. 

Information Disclosure Statement
The Information disclosure statement submitted by Applicant on 02/04/2022 has been considered.  

Response to Arguments
The rejection of claims 21, 22, 23, 24, 31, 32, 33, and 34 under 35 U.S.C. 103 has been withdrawn in view of Applicant’s amendment to independent claims 21 and 31. However, upon further consideration, a new ground of rejection has been made under 35 U.S.C. 103.
The rejection of claims 25, 26, 35, and 36 under 35 U.S.C. 103 has been withdrawn in view of Applicant’s amendment to independent claims 21 and 31. However, upon further consideration, a new ground of rejection has been made under 35 U.S.C. 103.
The rejection of claims 28, 29, 38, and 39 under 35 U.S.C. 103 has been withdrawn in view of Applicant’s amendment to independent claims 21 and 31. However, upon further consideration, a new ground of rejection has been made under 35 U.S.C. 103.
The rejection of claims 30 and 40 under 35 U.S.C. 103 has been withdrawn in view of Applicant’s amendment to independent claims 21 and 31. However, upon further consideration, a new ground of rejection has been made under 35 U.S.C. 103.

Applicant’s arguments with respect to claims 21 and 31 and the claims depending therefrom have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claim 34 is rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. Claim 34 recites the exact same limitations as claim 33. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. 
 
15.	Claims 21, 22, 23, 24, 31, 32, 33, and 34 are rejected under 35 U.S.C. 103 as being unpatentable over Dally et al. (US 20180046900 A1) in view of Quinones et al. (US 20080244223 A1), in further view of Khalvati (“Computational Redundancy in Image Processing”), and in further view of Diamantopoulos et al. (US 20200257986 A1).

Regarding claim 21, Dally teaches:
graphics processing unit having a plurality of execution units to perform CNN operations (Dally, Paragraph [0150]: GPU configured to implement CNN operations);

processing circuitry including pruning circuitry and window processing circuitry, the processing circuitry to receive a CNN model having an associated list of instructions including conditional branches and process the CNN model via a plurality of execution units of the graphics processing unit (Dally, Paragraph [0048]: PE [processing circuitry] includes a compaction engine [pruning circuitry]; Dally, Paragraph [0052]: pruning creates an optimized CNN model using compaction engine; Dally, Paragraph [0208]: system contains control logic as software and thus naturally contains instruction lists and conditional branches);

wherein the pruning circuitry is to eliminate, within a list of instructions, conditional branches descended from a first weight value (Dally, Paragraph [0052]: pruning involves setting weights close to zero to zero; thus the instructions regarding those weights are effectively ignored);

output circuitry to output the optimized CNN model and associated optimized list of instructions (Dally, Paragraph [0052]: the optimized network is outputted to be retrained).
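
For illustration only, the weight pruning attributed above to Dally, Paragraph [0052] (setting weights close to zero to zero so that work tied to those weights is effectively ignored) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference: the function names, the threshold value, and the sample weights are hypothetical.

```python
from typing import List


def prune_weights(weights: List[float], threshold: float) -> List[float]:
    """Set every weight whose magnitude is below `threshold` to zero.

    `threshold` is a hypothetical parameter chosen for this example.
    """
    return [0.0 if abs(w) < threshold else w for w in weights]


def sparse_dot(weights: List[float], activations: List[float]) -> float:
    """Dot product that skips the multiply-accumulate for zeroed weights."""
    total = 0.0
    for w, a in zip(weights, activations):
        if w == 0.0:
            continue  # work associated with a pruned weight is skipped
        total += w * a
    return total


pruned = prune_weights([0.8, -0.01, 0.003, -0.5], threshold=0.05)
result = sparse_dot(pruned, [1.0, 2.0, 3.0, 4.0])
```

In this sketch two of the four weights fall below the assumed threshold, so only two multiplications are performed.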

While the previously cited art does not teach the remaining limitations, Quinones teaches:
wherein the pruning circuitry is to eliminate, within a list of instructions, conditional branches descended from a first weight value (Quinones, Paragraph [0008]: infrequently taken paths regarding conditional branches are eliminated for pruning).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the teachings of a convolutional neural network and primitive operations of a sparse convolutional neural network accelerator, as taught by Dally, with the pruning of Quinones, so as to effectively compress a computational model (see Quinones, Paragraph [0002]).

While the previously cited art does not teach the remaining limitations, Khalvati does teach:
the window processing circuitry is to determine whether pixels in a second convolution window are different from a previously stored first convolution window and eliminate the second convolution window in response to determination that the second convolution window is the same as the first convolution window (Khalvati, pg. 3, teaches that window memoization uses a reuse table to store the results of previously performed computations. When a set of computations has to be performed for the first time, the computations are performed and the corresponding result is stored in the reuse table. When the same set of computations has to be performed again in the future, the previously calculated result is reused and the actual computations are skipped; [Note: window does not include additional information]; [Note: in the process of reusing a previously stored window, the current window is eliminated because the actual computation is skipped]).

Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify Dally with the window memoization method of Khalvati in order to increase efficiency of a CNN by avoiding redundant calculations (Khalvati, pg. 3).
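
For illustration only, the reuse-table behavior described above (compute and store a result the first time a window is seen; reuse the stored result and skip the computation when the same window recurs) can be sketched as follows. This is a minimal sketch, not Khalvati's implementation: the function name, the use of a dictionary keyed on the full window contents, and the sample operation are assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Window = Tuple[int, ...]


def memoized_windows(
    windows: Sequence[Window], op: Callable[[Window], int]
) -> Tuple[List[int], int]:
    """Apply `op` to each window, reusing stored results for repeated windows.

    Returns the per-window results and the number of windows whose
    computation was skipped (reuse-table hits).
    """
    reuse_table: Dict[Window, int] = {}
    results: List[int] = []
    hits = 0
    for w in windows:
        if w in reuse_table:
            hits += 1  # identical window seen before: computation is skipped
        else:
            reuse_table[w] = op(w)  # first occurrence: compute and store
        results.append(reuse_table[w])
    return results, hits


# The third window repeats the first, so its computation is skipped.
sample = [(1, 2, 3, 4), (5, 6, 7, 8), (1, 2, 3, 4)]
results, hits = memoized_windows(sample, sum)
```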

While the previously cited art does not teach the remaining limitations, Diamantopoulos does teach:
wherein the processing circuitry includes an instruction set architecture that provides support for quantization primitives to quantize weights of the CNN and the processing circuitry is to quantize multiple weights of the CNN in response to a single instruction (Diamantopoulos, Abstract and Paragraph [0005], teaches that a trained model of the neural network is processed, in which weights are defined in a floating-point format, to quantize each set of weights to a respective reduced precision format in dependence on effect of quantization on accuracy of the model; Paragraph [0028] teaches that computer readable program instructions for carrying out operations of the present invention may be instruction-set-architecture (ISA) instructions).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the combination of Dally and Khalvati with the instruction set architecture supporting quantized weights, as taught by Diamantopoulos, in order to provide high density storage of reduced precision ANN weights quantized on a per layer basis in appropriately partitioned BRAMs, allowing persistent storage of all weights in FPGA memory for the lifetime of the ANN implementation (Diamantopoulos, Paragraph [0047]).


Regarding claim 22, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos teaches all of the limitations of claim 21, and Khalvati further teaches wherein the window processing circuitry is to generate checksum signature for the first convolution window and eliminate the second convolution window upon determination that the checksum signature for the second convolution window matches the checksum signature for the first convolution window (pg. 79: “Afterward, the content of the reuse table at the location where the key points to is read and compared against symbol [checksum] of the incoming window. If symbol of the incoming window matches the content of the reuse table at that particular location, a hit occurs; the response is read from the reuse table and the mask operations set for the incoming window is skipped”).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify Dally to include the checksum process of Khalvati, as it is more convenient to consider windows of pixels as symbol (see Khalvati pg. 40).
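
For illustration only, the signature comparison discussed for claim 22 (matching a compact signature of the incoming window against a stored one and skipping the operation on a hit) can be sketched as follows. This is a minimal sketch, not the scheme of Khalvati: the choice of CRC-32 as the checksum, the function names, and the assumption that pixels are integers in the range 0 to 255 are all made for this example.

```python
import zlib
from typing import Callable, Dict, List, Sequence, Tuple


def window_signature(window: Sequence[int]) -> int:
    """CRC-32 of the window's pixels (each pixel assumed to be in 0..255)."""
    return zlib.crc32(bytes(window))


def process_with_signatures(
    windows: Sequence[Sequence[int]], op: Callable[[Sequence[int]], int]
) -> Tuple[List[int], int]:
    """Skip `op` for any window whose signature matches a stored signature.

    Note: two different windows can share a checksum, so a production
    design would need to decide how to handle such collisions.
    """
    table: Dict[int, int] = {}
    results: List[int] = []
    skipped = 0
    for w in windows:
        sig = window_signature(w)
        if sig in table:
            skipped += 1  # signature hit: stored response is reused
        else:
            table[sig] = op(w)
        results.append(table[sig])
    return results, skipped


sig_results, sig_skipped = process_with_signatures(
    [[10, 20, 30], [10, 20, 31], [10, 20, 30]], max
)
```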


Regarding claim 23, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos, teaches all of the limitations of claim 21, and Dally further teaches wherein the pruning circuitry is to further bypass traversal of conditional branches in the list of instructions associated with the CNN model for conditional branches that descend from a second weight value (Dally, Fig. 2A: model has many weights that can be subjected to identical pruning as the first weight).


Regarding claim 24, the combination of Dally in view of Quinones in further view of Khalvati and Diamantopoulos teaches all of the limitations of claim 23, and the combination further teaches wherein the pruning circuitry is further to eliminate conditional branches in the instructions associated with the CNN model having a weight value within a predetermined threshold of the first weight value (Dally, Paragraph [0052]: any weight within a threshold can be eliminated using the Quinones branch pruning process).

Claim 31 encompasses similar limitations to that of claim 21. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Claim 32 encompasses similar limitations to that of claim 22. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Claim 33 encompasses similar limitations to that of claim 23. Thus, the claim is similarly rejected under 35 U.S.C. 103.
______________________________________________________________________
16.	Claims 25, 26, 35, and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Dally in view of Quinones, Khalvati, and Diamantopoulos, and further in view of Anwar et al. (“Structured Pruning of Deep Convolutional Neural Networks”).
Regarding claim 25, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos, teaches all of the limitations of claim 23; however, the combination does not distinctly disclose wherein the pruning circuitry is to expand the CNN model into elementary operations upon receipt of the CNN model.

While the previously cited art does not teach the remaining limitations, Anwar does teach:
wherein the pruning circuitry is to expand the CNN model into elementary operations upon receipt of the CNN model (Sec. 4.5: feature maps and kernels can be unrolled into elementary matrix multiplication on the CPU).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify the optimization method of Dally with the elementary operation representation of Anwar in order to allow a CPU to conduct convolution operations (Anwar, Section 4.5).
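
For illustration only, the unrolling of feature maps and kernels into an elementary matrix multiplication, as characterized above with respect to Anwar, Sec. 4.5, can be sketched as follows. This is a minimal sketch, not Anwar's implementation: the function names are hypothetical, a single-channel input with stride 1 and no padding is assumed, and the kernel is applied without flipping, as is common in CNN practice.

```python
from typing import List

Matrix = List[List[int]]


def im2col(image: Matrix, k: int) -> Matrix:
    """Unroll every k-by-k window of `image` into one flat row."""
    h, w = len(image), len(image[0])
    rows: Matrix = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append(
                [image[i + di][j + dj] for di in range(k) for dj in range(k)]
            )
    return rows


def conv2d_as_matmul(image: Matrix, kernel: Matrix) -> List[int]:
    """Convolution expressed as (unrolled windows) x (flattened kernel)."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    return [
        sum(p * q for p, q in zip(patch, flat_kernel))
        for patch in im2col(image, k)
    ]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
conv_out = conv2d_as_matmul(image, kernel)
```

Each output element is a dot product of one unrolled window with the flattened kernel, which is the elementary multiply-accumulate form.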

Regarding claim 26, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos, and in further view of Anwar teaches all of the limitations of claim 25, and Quinones further teaches wherein the pruning logic compresses a representation of instructions in the optimized instruction list (Quinones, Paragraph [0008]: removal of instructions corresponding to branches).

Claim 35 encompasses similar limitations to that of claim 25. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Claim 36 encompasses similar limitations to that of claim 26. Thus, the claim is similarly rejected under 35 U.S.C. 103.

17.	Claims 27 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Dally in view of Quinones, Khalvati, and Diamantopoulos, and further in view of Son et al. (US 20180032867 A1).

Regarding claim 27, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos teaches all of the limitations of claim 21. However, the combination does not distinctly disclose wherein the processing circuitry includes an instruction set architecture that provides support for de-quantization primitives to de-quantize weights of the CNN in response to a single instruction.

While the previously cited art does not teach the remaining limitations, Son does teach:
wherein the processing circuitry includes an instruction set architecture that provides support for de-quantization primitives to de-quantize weights of the CNN in response to a single instruction (Son, Paragraph [0017]: dequantizing to real numbers).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify Dally with the quantization method of Son, as quantizing a CNN’s parameters reduces its size and thus memory requirements (Son, Paragraph [0068]).

Claim 37 encompasses similar limitations to that of claim 27. Thus, the claim is similarly rejected under 35 U.S.C. 103.

18.	Claims 28, 29, 38, and 39 are rejected under 35 U.S.C. 103 as being unpatentable over Dally in view of Quinones, Khalvati, Diamantopoulos, and Son, and further in view of Yang et al. (US 20190073582 A1).

Regarding claim 28, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos, in further view of Son teaches all of the limitations of claim 27. However, the combination does not distinctly disclose wherein the instruction set architecture provides an instruction to quantize weights into 8-bit integers.

While the previously cited art does not teach the remaining limitations, Yang teaches:
wherein the instruction set architecture provides an instruction to quantize weights into 8-bit integers (Yang, Paragraph [0138] teaches fixed point CNN algorithm implemented with 128 bit integer vector instructions provided by Intel SSSE3 and SSSE4 ISA, using quantized parameters with 8 bit width).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify Dally as modified by Quinones, Khalvati, Diamantopoulos, and Son, to further include the quantization method of Yang, so as to speed up performance without loss in accuracy (Yang, Paragraph [0138]).
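
For illustration only, quantizing floating-point weights to 8-bit integers and de-quantizing them back to real numbers, as discussed for claims 27 and 28, can be sketched as follows. This is a minimal sketch and is not the method of Son or Yang: symmetric per-tensor scaling to the range -127 to 127 is an assumption made for this example, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map weights to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate real-valued weights from the 8-bit integers."""
    return [q * scale for q in quantized]


weights = [2.54, -1.0, 0.0]
q_weights, q_scale = quantize_int8(weights)
restored = dequantize(q_weights, q_scale)
```

Under these assumptions the round-trip error of each weight is bounded by half of the scale, and a single call processes every weight in the list.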


Regarding claim 29, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos, in further view of Son teaches all of the limitations of claim 27. However, the combination does not distinctly disclose wherein the instruction set architecture provides an instruction to quantize weights into 4-bit integers.  	

While the previously cited art does not teach the remaining limitations, Yang teaches:
wherein the instruction set architecture provides an instruction to quantize weights into 4-bit integers (Yang, Paragraph [0120] teaches quantization precision of 4 bits, wherein this approach is easy to implement on CPUs and GPUs with SIMD integer instruction set architecture (ISA) as well as FPGAs and ASICs).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify Dally as modified by Quinones, Khalvati, Diamantopoulos, and Son, to further include the quantization method of Yang, so as to speed up performance without loss in accuracy (Yang, Paragraph [0138]).

Claim 38 encompasses similar limitations to that of claim 28. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Claim 39 encompasses similar limitations to that of claim 29. Thus, the claim is similarly rejected under 35 U.S.C. 103.

19.	Claims 30 and 40 are rejected under 35 U.S.C. 103 as being unpatentable over Dally in view of Quinones, Khalvati, Diamantopoulos, and Son, and further in view of Dhanani et al. (“Digital Video Processing for Engineers”).

Regarding claim 30, the combination of Dally in view of Quinones, in further view of Khalvati and Diamantopoulos, in further view of Son teaches all of the limitations of claim 27. However, the combination does not distinctly disclose wherein the instruction set architecture provides an instruction to generate a quantization table for non-uniform quantization.

While the previously cited art does not teach the remaining limitations, Dhanani does teach:
wherein the instruction set architecture provides an instruction to generate a quantization table for non-uniform quantization (Dhanani, Chapter 13.3 teaches quantization tables are often used for image compression; less quantization used for the upper and leftmost DCT coefficients).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to modify Dally as modified by Quinones, Khalvati, Diamantopoulos, and Son, to further include the quantization method of Dhanani. As the human eye is more sensitive to lower frequencies, less quantization and more bits is ideal for compressing images at the upper and leftmost DCT coefficients of the quantization table (Dhanani, Chapter 13.3).
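
For illustration only, a quantization table of the kind characterized above (finer quantization for the upper and leftmost DCT coefficients, coarser quantization elsewhere) can be sketched as follows. This is a minimal sketch and is not taken from Dhanani: the linear growth of step size, the parameter names `base` and `slope`, and the sample coefficients are assumptions made for this example.

```python
from typing import List

Table = List[List[int]]


def make_quant_table(n: int, base: int, slope: int) -> Table:
    """Build an n-by-n table whose step size grows away from the top-left.

    Entry (u, v) is base + slope * (u + v), so low-frequency positions
    (small u and v) receive the smallest step, i.e. the least quantization.
    """
    return [[base + slope * (u + v) for v in range(n)] for u in range(n)]


def quantize_block(coeffs: Table, table: Table) -> Table:
    """Divide each coefficient by its table entry and round to an integer."""
    return [
        [round(c / q) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, table)
    ]


table = make_quant_table(2, base=4, slope=4)
levels = quantize_block([[100, 24], [16, 5]], table)
```

Because the step size differs by position, the same table applies non-uniform precision across the block.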

Claim 40 encompasses similar limitations to that of claim 30. Thus, the claim is similarly rejected under 35 U.S.C. 103.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BEATRIZ RAMIREZ BRAVO whose telephone number is 571-272-2156. The examiner can normally be reached Mon. - Fri., 7:30 a.m. - 5:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/B.R.B./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123