DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-4, 10, 11, 18, and 21 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhu (IEEE paper entitled "BHNN a Memory-Efficient Accelerator for Compressing Deep Neural Networks with Blocked Hashing Techniques").

Zhu taught the invention as claimed including (as to claim 1) a neural-network inference engine (e.g., see section III on the left column of page 691) comprising: a processor (e.g., see fig. 4) configured to generate weights for a matrix of at least one hidden layer of a neural network based on a tabulation hash operation using entries from a table to select from a set of weights (e.g., see section B.2, Hash engine, on page 694, left column) [note that virtual weights are generated using entries from a table of hash table indices and a table including blocks of real weights (see fig. 3), with the real weights selected using the hash index in figs. 1 and 3], wherein the table (real weights W1, real weights W2) is associated with the at least one hidden layer and the set of weights comprises a smaller number of weights than a number of the weights for the matrix of at least one hidden layer (e.g., see fig. 1 and section III A. on page 691); and a memory located on-chip with the processor, the memory configured to store the table and the set of weights (e.g., see section B, Architecture-level parallelism, on page 693, left column) [initially all real weights are stored in on-chip SRAM].
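
For illustration only, the mapping discussed for claim 1 (a full matrix of weights populated from a smaller set of weights by hashing each matrix position) can be sketched as follows. This is a minimal sketch of generic tabulation hashing and is not taken from Zhu or from the application; the function names, table sizes, and the modulo step used to index the set of weights are all assumptions made for the example.

```python
import random

def make_tables(num_chunks, chunk_bits, out_bits, seed=0):
    """Build one lookup table of random values per address chunk (hypothetical sizes)."""
    rng = random.Random(seed)
    return [[rng.getrandbits(out_bits) for _ in range(1 << chunk_bits)]
            for _ in range(num_chunks)]

def tabulation_hash(key, tables, chunk_bits):
    """Split the key into fixed-width chunks, look each chunk up, XOR the results."""
    mask = (1 << chunk_bits) - 1
    h = 0
    for i, table in enumerate(tables):
        chunk = (key >> (i * chunk_bits)) & mask
        h ^= table[chunk]
    return h

def build_virtual_weights(rows, cols, real_weights, tables, chunk_bits):
    """Fill a rows x cols matrix by hashing each position into the smaller weight set."""
    n = len(real_weights)
    return [[real_weights[tabulation_hash(r * cols + c, tables, chunk_bits) % n]
             for c in range(cols)]
            for r in range(rows)]
```

In this sketch only the lookup tables and the small list `real_weights` need to be stored; the larger matrix is regenerated from them on demand, which is the storage relationship the rejection relies on.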

As to claim 2, Zhu taught the neural-network inference engine of claim 1, wherein the table is to group weights of the matrix into a pseudo-random set of connections and the connections in a same hash bucket share a same weight from the set of weights (e.g., see section IV B2, left column, page 694; and section III B on the left column of page 692).

As to claim 3, Zhu taught the neural-network inference engine of claim 1, wherein the weights for the matrix of at least one hidden layer comprise a set of one or more weights that are a single weight value from the set of weights (e.g., see figs. 2 and 3).

As to claim 4, Zhu taught the neural-network inference engine of claim 1, comprising: an XOR-tree to perform an XOR operation on at least two entries from the table (e.g., see section B2, Hash engine, on page 694 and fig. 4).
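
For illustration only, an XOR-tree over table entries of the kind recited in claim 4 can be sketched in software as a pairwise reduction. This is a minimal sketch under the assumption that the entries are plain integers; the function name `xor_tree` is hypothetical and the code does not reproduce the circuit of Zhu's fig. 4.

```python
def xor_tree(entries):
    """Reduce a sequence of integer table entries to one value by pairwise XOR levels."""
    if not entries:
        raise ValueError("at least one entry is required")
    level = list(entries)
    while len(level) > 1:
        # XOR neighbouring pairs; an unpaired last element passes to the next level.
        nxt = [level[i] ^ level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

Because XOR is associative and commutative, the tree produces the same value as XORing the entries one after another; the tree arrangement only shortens the dependency chain, which is why it is a natural fit for hardware.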
As to claim 10, Zhu taught the neural-network inference engine of claim 1, comprising one or more of: an inference engine accelerator, graphics processing unit, network interface, or a storage device (e.g., see section V. C., BHNN hardware accelerator, on pages 694-695).

As to claim 11, Zhu taught a method comprising: during an inference phase of a neural network: selecting a first weight in a matrix of weights from a set of weights stored in memory based on a tabulation hash of multiple entries from a table (e.g., see section B, Architecture-level parallelism, on page 693 and section B.2, Hash engine, on page 694, left column) [note that virtual weights are selected using entries from a table of hash table indices and a table including blocks of real weights (see fig. 3), with the real weights selected using the hash index in figs. 1 and 3]; and selecting a second weight in the matrix of weights from the set of weights stored in memory based on a second tabulation hash of multiple entries from the table (e.g., see figs. 1, 3, and 4) [note that Zhu taught sequentially inputting blocks of data from the BRAM layers to the register layer(s) for selection, and this is performed in an unrolled-loop manner, which would provide at least selecting a second weight similarly to the operation for a preceding layer]. Zhu also taught wherein: a number of weights in the matrix of weights is greater than the set of weights stored in a memory (e.g., see fig. 1 and section III A. on page 691) and the table is stored in a memory (e.g., see section B, Architecture-level parallelism, on page 693).
As to claim 18, Zhu taught a system to perform neural-network inferences, the system comprising: a memory; at least one core communicatively coupled to the memory, wherein the memory and the at least one core are mounted to the same board (e.g., see section B, Architecture-level parallelism, on page 693, left column) [initially all real weights are stored in on-chip SRAM]. Zhu also taught an accelerator device to: perform tabulation hashes of entries in a table (e.g., see section V C., BHNN hardware accelerator, on pages 694-695) to generate indices and form a matrix of weights using a set of weights and based on indices from the tabulation hashes of entries in the table (e.g., see section IV B, Architecture-level parallelism, on page 693) [Zhu taught “there are the required input arguments for hash function to generate the index for the virtual weights”], wherein a number of weights in the set of weights is less than a number of weights in the matrix of weights (e.g., see fig. 1 and section III A. on page 691).
As to claim 21, Zhu taught the system of claim 18, wherein the table is stored in the memory and the table comprises indexes randomly selected from a number of weights in the set of weights (e.g., see section III A, Hashed neural network, on page 691 and fig. 2b).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Zhu as applied to claim 1 above, and further in view of Spring (ACM paper entitled "Scalable and Sustainable Deep Learning via Randomized Hashing").

As to claim 5, Zhu taught the neural-network inference engine of claim 1, wherein the memory is configured to store at least one table for the hidden layer (e.g., see figs. 1, 3, and 4), but did not expressly detail that the memory is configured to store at least one table for a second hidden layer. Spring, however, taught this limitation (e.g., see fig. 2 and section 3.1 on page 448).
It would have been obvious to one of ordinary skill in the art to combine the teachings of Zhu and Spring. Both references were directed toward the problems of using hashing to use and access weights in a neural network. One of ordinary skill would have been motivated to incorporate the Spring teachings of using plural tables, one for each of plural hidden layers, at least to reduce the number of tables needed for processing by providing a localized table for each hidden layer (e.g., see Spring, page 448, left column).
Claims 6, 13, 15, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu as applied to claims 1, 11, and 18 above, and further in view of Hariprasath (IEEE paper entitled "FPGA Implementation of Multilayer Feed Forward Neural Network Architecture using VHDL").
As to claim 6, Zhu taught the neural-network inference engine of claim 1. Hariprasath taught wherein the processor is to perform at least one multiply-and-carry operation to compute a value based on a weight of the matrix and an input activation value (e.g., see fig. 3; section III A, weight storage and accumulation; and section III B, digital multiplication) [note that the operation of the carry ahead adder performs the carry operation].
It would have been obvious to one of ordinary skill in the art to combine the teachings of Zhu and Hariprasath. Both references were directed toward the problems of neural network processing of data using weights. One of ordinary skill would have been motivated to incorporate the Hariprasath teachings of multiplication and adder carry operations at least to efficiently process algorithms that entail summation of data, especially for use in deep learning operations of neural networks. This would increase throughput.



As to claim 13, Zhu taught the method of claim 11. Hariprasath taught wherein the tabulation hash comprises an XOR operation on two entries from the table (e.g., see fig. 3), and Zhu taught a second XOR operation on an output from the XOR operation and a third entry from the table (e.g., see fig. 4) [note that the parallel XORs and XOR tree provide this limitation].


As to claim 15, Zhu taught the method of claim 11. Hariprasath taught performing multiply and carry operations based on an input activation and the first and second weights and storing outputs from the multiply and carry operations into memory (e.g., see section III and section III C).



As to claim 19, Zhu taught the system of claim 18. Hariprasath taught wherein a tabulation hash comprises an XOR operation on two entries from the table and a second XOR operation on an output from the XOR operation and a third entry from the table (e.g., see fig. 3).

As to claim 20, Zhu taught the system of claim 18. Hariprasath taught wherein the acceleration device is to: perform multiply and carry operations based on an input activation and the matrix of weights and store outputs from the multiply and carry operations into the memory (e.g., see section III and section III C).

Allowable Subject Matter
Claims 7-9, 12, 14, 16, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: Claims 7-9, 12, 14, 16, and 17 respectively require, among other things:

Claim 7. The neural-network inference engine of claim 1, …, wherein the multiplexer comprises: at least one shift register to shift contents of the table based on a first portion of the address of the weight; a second multiplexer to output contents from at least one shift register based on a second portion of the address of the weight; a third multiplexer to output contents from at least one shift register based on a third portion of the address of the weight; an XOR logic to perform an XOR operation on outputs from the second and third multiplexers and generate an output; and a second XOR logic to perform an XOR operation on an output from the XOR logic and shifted contents of the table.

Claim 8. The neural-network inference engine of claim 1, wherein when the neural-network inference engine operates in a training mode, the processor is to: …; set two or more weights of the matrix based on the single weight value; determine a gradient of the two or more weights; and replace the single weight value with a sum of gradients of the two or more weights divided by a number of the two or more weights.

Claim 9. The neural-network inference engine of claim 1, wherein when the neural-network inference engine operates in an inference mode, the processor is to: …; set two or more weights of the matrix based on the single weight value; and perform a multiply-and-carry operation using the single weight value and an activation signal.

Claim 12. The method of claim 11, comprising: performing a tabulation hash on multiple entries from the table to generate an index to a weight in the set of weights stored in memory.

Claim 14. The method of claim 11, comprising: …; performing a third tabulation hash based on entries from the second table to generate an index to a weight in the set of weights stored in a memory; and setting a third weight based on the index from the third tabulation hash based on entries from the second table.

Claim 16. The method of claim 11, comprising: selecting entries from the table based on non-overlapping portions of an address of the first weight for use in the tabulation hash and selecting entries from the table based on non-overlapping portions of an address of the second weight for use in the second tabulation hash.

Claim 17. The method of claim 11, comprising: allocating a first core for selecting the first weight in the matrix of weights and selecting the second weight in the matrix of weights and performing multiply and carry calculations using the first and second weights and allocating a second core for selecting a third weight in a matrix of weights and selecting a fourth weight in the matrix of weights and performing multiply and carry calculations using the third and fourth weights.
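
For illustration only, the averaging step recited at the end of claim 8 above (a sum of the gradients of the matrix weights that share a single weight value, divided by the number of those weights) can be sketched as follows. Parts of claim 8 are elided in this action, so the sketch covers only the recited averaging; the function name and the representation of the sharing pattern as a matrix of indices are assumptions made for the example.

```python
from collections import defaultdict

def mean_gradient_per_shared_weight(index_of, grads):
    """For each shared-weight index, average the gradients of all matrix
    positions that map to it.

    index_of[r][c] is the (hypothetical) index of the shared weight used at
    matrix position (r, c); grads[r][c] is the gradient at that position.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for idx_row, grad_row in zip(index_of, grads):
        for idx, g in zip(idx_row, grad_row):
            sums[idx] += g
            counts[idx] += 1
    return {idx: sums[idx] / counts[idx] for idx in sums}
```

For example, with `index_of = [[0, 1], [1, 0]]` and `grads = [[1.0, 2.0], [4.0, 3.0]]`, positions (0, 0) and (1, 1) share weight 0 and positions (0, 1) and (1, 0) share weight 1, so the function returns `{0: 2.0, 1: 3.0}`.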

The closest prior art includes Zhu, Spring, and Hariprasath. The limitations of the claims from which claims 7-9, 12, 14, 16, and 17 depend are taught by the closest prior art as detailed above. However, the closest prior art does not disclose, among other things, the respective limitations included in dependent claims 7-9, 12, 14, 16, and 17 as shown above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Yang (patent application publication No. 2019/0050709) disclosed a system and method for neural networks with retrieving a hash table that corresponds to an index from a plurality of hash tables (e.g., see abstract).
Galron (patent application publication No. 2018/0204113) disclosed interaction analysis and prediction based on neural networking (e.g., see abstract).
Mody (patent application publication No. 2018/0197067) disclosed methods and apparatus for matrix processing in a convolutional neural network (e.g., see abstract).
Kim (patent application publication No. 2014/0129568) disclosed reduced complexity hashing (e.g., see abstract).
Vijayanarasimhan (patent No. 8,977,627) disclosed filter based object detection using hash functions (e.g., see abstract).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC COLEMAN whose telephone number is (571)272-4163. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta can be reached on 0-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

ERIC COLEMAN
Primary Examiner
Art Unit 2183



EC
/ERIC COLEMAN/           Primary Examiner, Art Unit 2183