DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The disclosure is objected to because of the following informalities: In paragraph [0003] of the specification, the reference to multiple figures 1A-1B uses the singular (FIG.) rather than the plural (i.e., FIGS.).
Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-8 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Wu (patent application publication No. 2019/0347256) in view of Turakhia (patent application publication No. 2018/0164866) (hereafter referred to as Turakhia ‘866).

Wu taught the invention substantially as claimed, including (as to claim 1) a method of increasing computer hardware efficiency of a matrix computation, comprising: receiving, at a computer processing device (e.g., see figs. 12, 13 and paragraphs 0012 and 0013), digital signals encoding one or more operations of the matrix computation (e.g., see figs. 4, 5, 6, 7), each operation including one or more operands (e.g., see figs. 9A, 9B and paragraphs 0048 and 0048). Wu did not expressly detail, responsive to determining, by a sparse data check device of the computer processing machine, that an operation of the matrix computation includes all dense operands, forwarding the operation to a dense computation device of the computer processing machine configured to perform the operation of the matrix computation based on the dense operands. Turakhia ‘866, however, taught this limitation (e.g., see figs. 6, 7 and paragraphs 0052, 0057-0060, 0067-0068). Turakhia ‘866 also taught, responsive to determining, by the sparse data check device (706), that the operation of the matrix computation includes one or more sparse operands, forwarding the operation to a sparse computation device configured to perform the operation of the matrix computation (e.g., see fig. 6, steps 608, 612, 614, 616) [note that the detection that one tag has a value of zero provides detection of a sparse operand, and the forwarding of the operation to the sparse computation device is provided by the disabling of loading of the weight, the disabling of the previously accumulated value, the sending of a control signal to multiplexer 510, and the selecting of the previously accumulated value as the new value by the multiplexer, where the data gating component provides the sparse computation device (e.g., see paragraphs 0064-0066 and 0075)].
It would have been obvious to one of ordinary skill in the art to combine the teachings of Wu and Turakhia ‘866. Both references were directed toward the problems of optimizing the execution of matrix arithmetic operations in a data processor. One of ordinary skill would have been motivated to incorporate the Turakhia ‘866 teachings of detecting whether a zero value was present in an operand and sending the operand to a processor dedicated to processing dense (or only non-zero valued) operands, in order to decrease the time to process the operand(s), since processing the zero-valued operands would have increased the time necessary to process the operands, and since providing a separate device including a multiplexer to control sending a previously accumulated value for the sparse operand operations would have increased throughput.
Due to the similarities between claim 1 and claims 15 and 17, claims 15 and 17 are rejected for the same reasons as claim 1 above.
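For illustration only, the routing recited in claim 1 can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the claimed hardware and not the circuitry of Wu or Turakhia ‘866. The names `Operation`, `is_sparse`, `dense_unit`, `sparse_unit`, and `dispatch` are hypothetical, the operation is assumed to be a multiply-and-accumulate, and an operand is assumed to be sparse when it is zero-valued.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Operation:
    """Hypothetical model of one multiply-and-accumulate step: acc + a * b."""
    operands: Tuple[float, float]
    acc: float = 0.0


def is_sparse(value: float) -> bool:
    # Assumption: a zero-valued operand is sparse, any other value is dense.
    return value == 0.0


def dense_unit(op: Operation) -> float:
    # Dense path: perform the full multiply-and-accumulate.
    a, b = op.operands
    return op.acc + a * b


def sparse_unit(op: Operation) -> float:
    # Sparse path: the product is known to be zero, so the previously
    # accumulated value is passed through without multiplying.
    return op.acc


def dispatch(op: Operation) -> Tuple[str, float]:
    """Route an operation to the sparse or dense path and return (path, result)."""
    if any(is_sparse(v) for v in op.operands):
        return "sparse", sparse_unit(op)
    return "dense", dense_unit(op)
```

Under these assumptions, an operation with operands (2.0, 3.0) and an accumulated value of 1.0 takes the dense path, while the same operation with a zero operand takes the sparse path and leaves the accumulated value unchanged.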
As to claim 2, Wu and Turakhia ‘866 taught the method of claim 1. Wu taught wherein the sparse data check device is configured to determine whether operands are sparse or dense based on determining a zero-valued operand to be sparse and a non-zero valued operand to be dense (e.g., see paragraph 0062) [Wu taught that threshold(s) on the zero or non-zero values in the record are used to determine whether the record or operand is sparse or dense; note that the ranges include, e.g., 10% or fewer non-zero values for sparse (which includes the case where there are zero non-zero values) and over 95% non-zero values for dense (which includes the case where the record or operand is 100% non-zero values)]. [Turakhia ‘866 also taught determining that one tag is zero for determining that the operand is zero (e.g., see paragraphs 0008-0009 and 0072).]

As to claims 3 and 16, Wu and Turakhia ‘866 taught the method of claim 1. Wu taught wherein the sparse data check device is configured to determine whether operands are sparse or dense based on determining an operand to be dense if a value of the operand exceeds a pre-defined threshold and sparse if the value of the operand does not exceed the pre-defined threshold (e.g., see paragraph 0062).

As to claim 4, Wu and Turakhia ‘866 taught the method of claim 3. Wu taught wherein the pre-defined threshold is a hardware hyperparameter of the sparse data check device (e.g., see paragraph 0062) [note that the classification of sparse or dense is done via a machine-learned classifier; it is therefore within the scope of a hardware hyperparameter, as it is a parameter learned by the hardware].
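For illustration only, the threshold test recited in claims 3 and 4 can be sketched as follows. This is a minimal sketch, not the disclosure of Wu. The function name `classify_operand` and the parameter `threshold` are hypothetical, and comparing the magnitude of the operand (rather than its signed value) is an assumption made here so that negative operands are handled symmetrically.

```python
def classify_operand(value: float, threshold: float = 0.0) -> str:
    """Classify an operand as "dense" or "sparse" against a pre-defined threshold.

    Assumption: the magnitude abs(value) is compared, and `threshold` stands in
    for the hyperparameter of the sparse data check device.
    """
    # Dense only if the value strictly exceeds the threshold; otherwise sparse.
    return "dense" if abs(value) > threshold else "sparse"
```

With the default threshold of 0.0, this reduces to the zero/non-zero test recited in claim 2. A larger threshold also treats small non-zero operands as sparse.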

As to claim 5, Wu and Turakhia ‘866 taught the method of claim 1. Wu taught wherein the matrix computation is a matrix multiplication (e.g., see paragraphs 0058 and 0071). Turakhia ‘866 also taught this limitation (e.g., see paragraphs 0009 and 0039).

As to claim 6, Wu and Turakhia ‘866 taught the method of claim 1. Turakhia ‘866 taught wherein the dense computation device is configured to perform a multiply-and-accumulate operation (e.g., see paragraphs 0009 and 0039).

As to claim 7, Wu and Turakhia ‘866 taught the method of claim 1. Turakhia ‘866 taught wherein the matrix computation is a neural network computation (e.g., see paragraph 0024). Wu also taught that the computing device may be implemented as multiple processors including a network (e.g., see paragraph 0097).

As to claim 8, Wu and Turakhia ‘866 taught the method of claim 1. Turakhia ‘866 taught wherein the sparse computation device is configured to automatically save a sparse result value to a location derived from the operation of the matrix computation (e.g., see paragraph 0066).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Wu and Turakhia ‘866 as applied to claim 1 above, and further in view of Dally (patent application publication No. 2018/0046900).

As to claim 9, Wu and Turakhia ‘866 taught the method of claim 1. Wu and Turakhia ‘866 did not expressly detail wherein the sparse computation device is configured to replace an executable instruction of the operation with a no-op instruction. Dally, however, taught this limitation (e.g., see paragraphs 0038-0011 and 0166 and fig. 8).
It would have been obvious to one of ordinary skill in the art to combine the teachings of Wu and Dally. Both references were directed toward the problems of producing a vector product in a data processor. One of ordinary skill would have been motivated to incorporate the Dally teachings of implementing operations on an inactive instruction using a no-operation instruction (Noop), at least to enable the system to be controlled in a properly timed manner to perform the operation(s) that were active along with the operations that were inactive, without using extra control hardware, which would reduce system cost and enable flexible programming and scheduling for inactive and active operation(s).
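For illustration only, the no-op replacement recited in claim 9 can be sketched as follows. This is a minimal sketch, not the mechanism of Dally. The tuple encoding of an instruction as `(opcode, operand, operand)`, the opcode strings `"MAC"` and `"NOP"`, and the function name `replace_sparse_with_noop` are all hypothetical.

```python
# Hypothetical encoding of a no-op instruction.
NOP = ("NOP",)


def replace_sparse_with_noop(program):
    """Return a copy of `program` in which every MAC instruction that has a
    zero-valued operand is replaced by a NOP.

    The program length is preserved, so each instruction keeps its slot.
    """
    rewritten = []
    for instr in program:
        opcode, *operands = instr
        if opcode == "MAC" and any(v == 0 for v in operands):
            rewritten.append(NOP)    # inactive operation: keep the slot, do nothing
        else:
            rewritten.append(instr)  # active operation: leave unchanged
    return rewritten
```

Keeping one slot per instruction reflects the timing rationale stated above: active and inactive operations occupy the same schedule positions, so no extra control logic is modeled.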
Claims 10, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Wu and Turakhia ‘866 as applied to claim 1 above, and further in view of Turakhia ‘056 (patent application publication No. 2018/0189056).

As to claims 10 and 19, Wu and Turakhia ‘866 taught the method of claim 1. Wu and Turakhia ‘866 did not expressly detail wherein forwarding the operation having all dense operands to the dense computation device includes enqueuing the operation having all dense operands into a lookahead dense instruction queue, wherein the dense computation device is configured to execute operations from the lookahead dense instruction queue in order. Turakhia ‘056, however, taught this limitation (e.g., see figs. 6, 8, steps 604, 608, 610, 612 and paragraphs 0058-0061).
It would have been obvious to one of ordinary skill in the art to combine the teachings of Wu and Turakhia ‘056. Both references were directed toward the problems of optimizing the execution of matrix arithmetic operations in a data processor. One of ordinary skill would have been motivated to incorporate the Turakhia ‘056 teachings of detecting whether a zero value was present in an operand, sending the operand to a queue, and preventing the sending of sparse operands to a queue, in order to decrease the time to process the operands, since processing the zero-valued operands together with non-zero operands would have increased the time necessary to process the operands, and since using the queue(s) provides flexible scheduling of dense operands, so that the control necessary to properly time the operations on dense operands could be reduced, saving system cost.


As to claim 11, Wu, Turakhia ‘866, and Turakhia ‘056 taught the method of claim 10. Turakhia ‘056 taught further comprising feeding the lookahead dense instruction queue in excess of a number of operations the dense computation device is configured to process in a subsequent cycle (e.g., see paragraphs 0051-0056) [note that in a cycle the system selected a pair of operations from the queue(s), and the queues have depth “K”].
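For illustration only, the lookahead dense instruction queue recited in claims 10 and 11 can be sketched as follows. This is a minimal sketch, not the queue structure of Turakhia ‘056. The class name `LookaheadDenseQueue`, the parameter `issue_width` (the number of operations the dense computation device is assumed to process per cycle), and the methods `feed` and `step` are hypothetical.

```python
from collections import deque


class LookaheadDenseQueue:
    """Hypothetical FIFO that holds dense operations ahead of the dense unit."""

    def __init__(self, issue_width: int):
        # Assumption: the dense unit consumes at most `issue_width` ops per cycle.
        self.issue_width = issue_width
        self._queue = deque()

    def feed(self, ops):
        # The queue may be fed more operations than one cycle can consume.
        self._queue.extend(ops)

    def step(self):
        """Issue up to `issue_width` operations for one cycle, in FIFO order."""
        count = min(self.issue_width, len(self._queue))
        return [self._queue.popleft() for _ in range(count)]

    def __len__(self):
        return len(self._queue)
```

Feeding five operations into a queue with an issue width of two leaves three operations queued after the first cycle, which corresponds to feeding the queue in excess of what the dense unit processes in the subsequent cycle while still executing in order.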
Allowable Subject Matter
Claims 12-14, 18, and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: The closest prior art includes Wu, Turakhia ‘866, and Turakhia ‘056. The features corresponding to the claims of the instant application are taught by Wu, Turakhia ‘866, and Turakhia ‘056 as detailed above. Additionally, Parashar, A. et al. (ACM paper entitled "SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks", cited on the IDS submitted by Applicant) taught a system directed to a compressed sparse convolution weight FIFO (sparse) for input to the multiplier array (e.g., see fig. 6).
However, Wu, Turakhia ‘866, Turakhia ‘056, and Parashar, among other things, did not disclose the features of claims 12, 13, 14, 18, and 20, respectively, as recited below:


As to claim 12, the method of claim 1, wherein forwarding the operation having one or more sparse operands to the sparse computation device includes enqueuing the operation having one or more sparse operands into a lookahead sparse instruction queue, was not disclosed by Wu, Turakhia ‘866, Turakhia ‘056, and Parashar.

As to claim 13, the method of claim 12, wherein the sparse computation device is configured to automatically store sparse result values from operations in the lookahead sparse instruction queue, in a program order of the operations in the lookahead sparse instruction queue, was not disclosed by Wu, Turakhia ‘866, Turakhia ‘056, and Parashar.

As to claim 14, the method of claim 12, further comprising feeding the lookahead sparse instruction queue in excess of a number of operations the sparse computation device is configured to process in a subsequent cycle, was not disclosed by Wu, Turakhia ‘866, Turakhia ‘056, and Parashar.

As to claim 18, the computer system of claim 17, wherein the sparse data check device is configured to detect a plurality of instructions operating on sparse data and enqueue the plurality of instructions operating on sparse data into a lookahead sparse instruction queue, was not disclosed by Wu, Turakhia ‘866, Turakhia ‘056, and Parashar.

As to claim 20, The computer system of claim 17, wherein the sparse computation device is configured to forward instruction identifiers to a sparse register identifier queue, and the sparse register identifier queue is configured to write corresponding results according to a program-order associated with the instruction identifiers was not disclosed by Wu and Turakhia ‘866 and Turakhia ‘056 and Parashar.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Golovashkin (patent application publication No. 2015/0378962) disclosed the use of computing resources while calculating a cross product or its approximation for logistic regression on big data sets (e.g., see abstract).
Gauria (patent No. 10,409,887) disclosed a generalized dot product for computer vision applications (e.g., see abstract).
Staudemenmaier (patent application publication No. 2018/0349290) disclosed sparse matrix accelerator (e.g., see abstract). 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC COLEMAN whose telephone number is (571)272-4163. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta can be reached on 0-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

ERIC . COLEMAN
Primary Examiner
Art Unit 2183



EC
/ERIC COLEMAN/ Primary Examiner, Art Unit 2183