DETAILED CORRESPONDENCE
Response to Amendment
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.  This Office action is in response to Amendments and Remarks filed on 15 October 2021 as a response to the Non-Final Office Action issued 14 September 2021.  Claims 1, 2, 4-8, 11, 12, and 14-18 are amended and have been carefully considered.  Claim 21 is new.  Claims 1-21 are pending and considered below.

Claim Rejections - 35 USC § 102
Applicant’s arguments and amendments, see Remarks/Amendments, filed 15 October 2021, with respect to the rejection of all pending claims 1-21 under 35 USC 102(a)(2) have been fully considered and are persuasive.  The rejection of all pending claims 1-21 under 35 USC 102(a)(2) has been withdrawn. 

Examiner's Amendment
An Examiner's amendment to the record appears below.  Should the changes and/or additions be unacceptable to Applicants, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.  Examiner amends the instant claims with respect to minor typographic issues.

Claim 1. (Currently Amended)	A crossbar-based inference engine configured to perform a machine learning (ML) operation on an input data stream, comprising: 
a plurality of on-chip memories (OCMs) coupled to a crossbar and each OCM is configured to load and maintain data from the input data stream for local access by components in the inference engine; 
maintain and output results of the ML operation performed by the components in the inference engine as an output data stream; 
a first plurality of processing units, wherein each processing unit of the first plurality of processing units is coupled to one OCM of the plurality of OCMs without going through the crossbar and configured to perform a dense and/or regular computation task of the ML operation on the data within the corresponding OCM; 
a second plurality of processing units coupled to the first plurality of processing units and the plurality of OCMs through the crossbar, wherein each processing unit of the second plurality of processing units is configured to perform a sparse and/or irregular computation task of the ML operation on at least one of the data within the OCMs or from the first plurality of processing units; and 
said crossbar configured to connect the second plurality of processing units to the plurality of OCMs to enable each processing unit of the second plurality of processing units to read data from and/or write data to the corresponding OCM.

	Claim 11. (Currently Amended)	A method to perform a machine learning (ML) operation on an input data stream via an inference engine, comprising: 
loading and maintaining data from the input data stream for local access by components in the inference engine in each on-chip memory (OCM) of a plurality of OCMs, wherein the plurality of OCMs is coupled to a crossbar in the inference engine; 
performing a dense and/or regular computation task of the ML operation on the data in an OCM of the plurality of OCMs via one processing unit of a first plurality of processing units that are coupled to the OCM without going through the crossbar; 
performing a sparse and/or irregular computation task of the ML operation on the data in the plurality of OCMs and/or from the first plurality of processing units via one processing unit of 
connecting the second plurality of processing units to the plurality of OCMs via the crossbar to enable each processing unit of the second plurality of processing units to read data from and/or write data to the plurality of OCMs; and 
maintaining and outputting results of the ML operation performed by a processing tile that comprises at least a processing unit from a first plurality of processing units, a processing unit from a second plurality of processing units, and an OCM, wherein the OCM is configured to output a data stream from the processing tile.

	Claim 12. (Currently Amended)	The method of claim 11, further comprising: streaming data between each OCM and its corresponding processing unit of the first plurality of processing units via an OCM streamer.

	Claim 17. (Currently Amended)	The method of claim 16, further comprising: performing one or more post matrix multiplication operations by each processing unit of the first plurality of processing units on the output from the matrix multiplication operation.

	Claim 21. (Currently Amended)	A crossbar-based inference engine configured to perform a machine learning (ML) operation on an input data stream, comprising: 
a plurality of on-chip memories (OCMs) coupled to a crossbar and each OCM is configured to load and maintain data from the input data stream for local access by components in the inference engine; 
maintain and output results of the ML operation performed by the components in the inference engine as an output data stream; 
a first plurality of processing units, wherein each processing unit of the first plurality of processing units is directly coupled to one OCM of the plurality of OCMs and configured to perform a dense and/or regular computation task of the ML operation on the data within the corresponding OCM; 
a second plurality of processing units coupled to the first plurality of processing units and the plurality of OCMs through the crossbar, wherein each processing unit of the second plurality of processing units is configured to perform a sparse and/or irregular computation task of the ML operation on at least one of the data within the OCMs or from the first plurality of processing units; and 
said crossbar configured to connect the second plurality of processing units to the plurality of OCMs to enable each processing unit of the second plurality of processing units to read data from and/or write data to the corresponding OCM.

Reasons for Allowance
Claims 1-21 are allowed. 
The following is the Examiner's statement of reasons for allowance: 
The closest art of record, Nurvitadhi et al. (20180315158), discloses an apparatus, method, and computer-readable medium comprising: 
A crossbar-based inference engine configured to perform a machine learning (ML) operation on an input data stream, comprising: 
a plurality of on-chip memories (OCMs) coupled to a crossbar and each OCM is configured to load and maintain data from the input data stream for local access by components in the inference engine; 
maintain and output results of the ML operation performed by the components in the inference engine as an output data stream; 
a second plurality of processing units coupled to the first plurality of processing units and the plurality of OCMs through the crossbar, wherein each processing unit of the second plurality of processing units is configured to perform a sparse and/or irregular computation task of the ML operation on at least one of the data within the OCMs or from the first plurality of processing units; and 
said crossbar configured to connect the second plurality of processing units to the plurality of OCMs to enable each processing unit of the second plurality of processing units to read data from and/or write data to the corresponding OCM.

However, Nurvitadhi does not teach at least: 
	a first plurality of processing units, wherein each processing unit of the first plurality of processing units is coupled to one OCM of the plurality of OCMs without going through the crossbar and configured to perform a dense and/or regular computation task of the ML operation on the data within the corresponding OCM; 

Moreover, the missing claimed elements from Nurvitadhi are not found in a reasonable number of references.  Even if they were, a person of ordinary skill in the art at the time of Applicant's invention would not have been motivated to add them to Nurvitadhi, namely a first plurality of processing units, wherein each processing unit of the first plurality of processing units is coupled to one OCM of the plurality of OCMs without going through the crossbar and configured to perform a dense and/or regular computation task of the ML operation on the data within the corresponding OCM. 

Any comments considered necessary by Applicants must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled "Comments on Statement of Reasons for Allowance." 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David Stoltenberg whose telephone number is (571) 270-3472. 
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Please see the attached References Cited form 892.
The examiner can normally be reached on Monday-Friday 8:30AM to 5:00PM EST.  If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Waseem Ashraf, can be reached on (571) 270-3948.  The fax phone number for the organization where this application or proceeding is assigned is (571)-273-8300, or the examiner’s direct fax phone number is 571 270 4472.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center at (866) 217-9197 (toll free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786-9199 (IN USA OR CANADA) or (571) 272-1000.

/DAVID J STOLTENBERG/
Primary Examiner, Art Unit 3682