DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Wang et al. (20210073169), hereinafter Wang.
Per claim 1, Wang discloses: analyze a machine learning workload and assign corresponding priority levels to identified data requests in the machine learning workload based on an associated data dependency delay performance impact (fig. 3a-b, ¶0046; the task scheduling module can partition, according to configuration options indicating task allocation, the computation graph of the neural network to be processed into a plurality of computation subtasks and distribute the computation subtasks to task queues of the corresponding hardware of the computation units; examiner notes that the workload is equated to the subtasks); and indicate the assigned corresponding priority levels when providing the data requests to a memory controller (¶0046; the controller can be configured by configuration files, configuration options or software programs. For example, the task scheduling module can partition, according to configuration options indicating task allocation, … For example, the task allocation based on type matching has the highest priority, the task allocation based on a specified flag has the second highest priority, and the task allocation based on load has the lowest priority); and the memory controller configured to: sort the received data requests into a plurality of different priority queues based on the indicated corresponding priority levels (¶0036, ¶0046; In this way, upon receiving the computation graph, the task scheduling module can distribute, according to the configuration options indicating task allocation, the computation subtasks to the corresponding task queues of the computation units. Additionally, the dependency among the computation subtasks may be configured, set, or adjusted by configuration files, software programs); and initiate the data requests from the different priority queues to memory in an order based on different qualities of service of the different priority queues (fig. 3a-b, ¶0046; the allocation of the computation subtasks to different computation units, the synchronization of the task queues, the data dependence and synchronization of the computation subtasks and the like can be flexibly adjusted by various different configuration options, so that the operation mode for each computation unit can be flexibly set for various application scenarios, and the hardware performance and computation efficiency of the heterogeneous processor itself can be fully utilized and exerted).
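For illustration only, the memory controller limitations recited in claim 1 (sorting received data requests into priority queues and initiating them in an order based on the queues' qualities of service) can be sketched as follows. This is a minimal sketch of one possible reading of the claim language, not an implementation taken from Wang or from the application. The names `DataRequest`, `MemoryControllerSketch`, `sort_request`, and `initiate_next` are hypothetical, and strict priority ordering is an assumed stand-in for "different qualities of service".

```python
from collections import deque
from dataclasses import dataclass

# Assumed level names; claim 5 recites at least high, medium, and low.
PRIORITY_ORDER = ("high", "medium", "low")


@dataclass
class DataRequest:
    """Hypothetical data request carrying the priority indicated by the processor."""
    address: int
    priority: str


class MemoryControllerSketch:
    """Hypothetical controller: one FIFO queue per indicated priority level."""

    def __init__(self):
        self.queues = {level: deque() for level in PRIORITY_ORDER}

    def sort_request(self, request):
        # Sort the received request into the queue matching its indicated level.
        self.queues[request.priority].append(request)

    def initiate_next(self):
        # Assumed QoS policy: always drain higher-priority queues first.
        for level in PRIORITY_ORDER:
            if self.queues[level]:
                return self.queues[level].popleft()
        return None  # nothing pending


controller = MemoryControllerSketch()
for address, priority in [(1, "low"), (2, "high"), (3, "medium"), (4, "high")]:
    controller.sort_request(DataRequest(address, priority))

initiated = [controller.initiate_next().address for _ in range(4)]
print(initiated)
```

Under the assumed strict-priority policy, requests arriving in mixed order are initiated grouped by level, with arrival order preserved within each level.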
Per claim 2, Wang discloses: wherein to analyze the machine learning workload, the processor is configured to generate a data dependency graph (¶0037; according to the dependency among nodes in a computation graph, a dependency among the computation subtasks to be allocated to the computation units, and define synchronization event points in the task queues according to the determined dependency).
Per claim 3, Wang discloses: wherein the data dependency graph is comprised of a plurality of nodes, wherein each node of the plurality of nodes corresponds to a data request (¶0037; according to the dependency among nodes in a computation graph, a dependency among the computation subtasks to be allocated to the computation units, and define synchronization event points in the task queues according to the determined dependency).
Per claim 4, Wang discloses: wherein the processor is configured to determine the associated data dependency delay performance impact for each node of the plurality of nodes (¶0046; the data dependence and synchronization of the computation subtasks and the like can be flexibly adjusted by various different configuration options, so that the operation mode for each computation unit can be flexibly set for various application scenarios, and the hardware performance and computation efficiency of the heterogeneous processor itself can be fully utilized and exerted).
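For illustration only, the limitations of claims 2-4 (a data dependency graph whose nodes correspond to data requests, with a delay performance impact determined per node) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Wang or of the application: the impact metric (the number of downstream nodes that would stall if a request were delayed), the cutoff values, the function names, and the example node names are all hypothetical.

```python
def downstream_nodes(graph, node):
    """Return every node that transitively depends on `node`.

    `graph` maps each data request to the requests that consume its result.
    """
    seen = set()
    stack = list(graph.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen


def assign_priorities(graph, high_cutoff=2, medium_cutoff=1):
    """Assign a priority level per node from its assumed delay impact.

    Assumed metric: the more downstream requests a node blocks, the higher
    its priority. The cutoffs are illustrative values only.
    """
    priorities = {}
    for node in graph:
        impact = len(downstream_nodes(graph, node))
        if impact >= high_cutoff:
            priorities[node] = "high"
        elif impact >= medium_cutoff:
            priorities[node] = "medium"
        else:
            priorities[node] = "low"
    return priorities


# Hypothetical workload: weights feed a matmul, which feeds an activation.
example_graph = {
    "load_weights": ["matmul"],
    "matmul": ["activation"],
    "activation": [],
    "log_stats": [],
}
example_priorities = assign_priorities(example_graph)
print(example_priorities)
```

In this example, `load_weights` blocks two downstream requests and receives the high level, while `log_stats` blocks nothing and receives the low level.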
Per claim 5, Wang discloses: wherein the assigned corresponding priority levels include at least a high priority level, a medium priority level, and a low priority level (¶0046; One or more configuration options can be set at the same time, and different configuration options have different priorities. For example, the task allocation based on type matching has the highest priority, the task allocation based on a specified flag has the second highest priority, and the task allocation based on load has the lowest priority).
Per claim 6, Wang discloses: wherein the memory controller is configured to initiate a data request in a priority queue with the high priority level when memory bandwidth associated with the memory is available (¶0046; The configuration options indicating task allocation may include task allocation based on type matching (e.g., the computation units for processing computation subtasks are selected according to the type of the computation subtasks), task allocation based on load (e.g., task allocation is performed according to the queuing status of the task queues of the computation units), task allocation based on a specified flag (e.g., computation subtasks with a specified flag are distributed to particular computation units indicated by this flag), etc).
Per claim 7, Wang discloses: wherein the memory controller is configured to initiate a data request in a priority queue with the medium priority level when memory bandwidth associated with the memory is available and after a first threshold number of data requests have been fulfilled (¶0046; The configuration options indicating task allocation may include task allocation based on type matching (e.g., the computation units for processing computation subtasks are selected according to the type of the computation subtasks), task allocation based on load (e.g., task allocation is performed according to the queuing status of the task queues of the computation units), task allocation based on a specified flag (e.g., computation subtasks with a specified flag are distributed to particular computation units indicated by this flag), etc. … the allocation of the computation subtasks to different computation units, the synchronization of the task queues, the data dependence and synchronization of the computation subtasks and the like can be flexibly adjusted by various different configuration options, so that the operation mode for each computation unit can be flexibly set for various application scenarios, and the hardware performance and computation efficiency of the heterogeneous processor itself can be fully utilized and exerted).
Per claim 8, Wang discloses: wherein the memory controller is configured to initiate a data request in a priority queue with the low priority level when memory bandwidth associated with the memory is available and after a second threshold number of data requests have been fulfilled (¶0046; The configuration options indicating task allocation may include task allocation based on type matching (e.g., the computation units for processing computation subtasks are selected according to the type of the computation subtasks), task allocation based on load (e.g., task allocation is performed according to the queuing status of the task queues of the computation units), task allocation based on a specified flag (e.g., computation subtasks with a specified flag are distributed to particular computation units indicated by this flag), etc. … the allocation of the computation subtasks to different computation units, the synchronization of the task queues, the data dependence and synchronization of the computation subtasks and the like can be flexibly adjusted by various different configuration options, so that the operation mode for each computation unit can be flexibly set for various application scenarios, and the hardware performance and computation efficiency of the heterogeneous processor itself can be fully utilized and exerted).
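For illustration only, the conditions recited in claims 6-8 (high priority requests initiated when memory bandwidth is available, medium and low priority requests initiated only after a first and a second threshold number of data requests have been fulfilled) can be sketched as follows. This is a minimal sketch of one possible reading of the claim language and is not drawn from Wang or from the application. The class and method names are hypothetical, and counting all fulfilled requests in a single running total is an assumption, since the claims do not state which requests are counted.

```python
from collections import deque


class ThresholdSchedulerSketch:
    """Hypothetical scheduler gating lower-priority queues on a fulfilled count."""

    def __init__(self, first_threshold, second_threshold):
        self.queues = {"high": deque(), "medium": deque(), "low": deque()}
        self.first_threshold = first_threshold    # gate for the medium queue
        self.second_threshold = second_threshold  # gate for the low queue
        self.fulfilled = 0  # assumed: running total of all fulfilled requests

    def enqueue(self, level, request):
        self.queues[level].append(request)

    def initiate_next(self, bandwidth_available=True):
        # No request is initiated unless memory bandwidth is available.
        if not bandwidth_available:
            return None
        if self.queues["high"]:
            level = "high"
        elif self.queues["medium"] and self.fulfilled >= self.first_threshold:
            level = "medium"
        elif self.queues["low"] and self.fulfilled >= self.second_threshold:
            level = "low"
        else:
            return None  # no queue is currently eligible
        self.fulfilled += 1
        return self.queues[level].popleft()


scheduler = ThresholdSchedulerSketch(first_threshold=2, second_threshold=3)
scheduler.enqueue("low", "l1")
scheduler.enqueue("medium", "m1")
scheduler.enqueue("high", "h1")
scheduler.enqueue("high", "h2")

served = [scheduler.initiate_next() for _ in range(4)]
print(served)
```

Under this reading, a medium or low priority request that arrives before its threshold is reached waits in its queue, so `initiate_next` can return `None` even when requests are pending.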
Per claim 9, Wang discloses: wherein to analyze the machine learning workload, the processor is configured to determine a current portion of the machine learning workload (¶0044; … The task queue corresponding to each the computation units is configured to store computation subtasks to be executed by the computation unit. The AI processor further includes a controller. Upon receiving a computation graph corresponding to a neural network program to be processed, the controller performs functional analysis on the computation graph according to the current load conditions of the computation clusters and the characteristics of different computation units included in the computation clusters, partitions the computation graph into a plurality of computation subtasks, and distributes the computation subtasks to the corresponding task queues of the computation units that can process this type of computation subtasks).
Per claim 10, Wang discloses: wherein the processor is configured to indicate the assigned corresponding priority levels based on the determined current portion of the machine learning workload (¶0046; One or more configuration options can be set at the same time, and different configuration options have different priorities. For example, the task allocation based on type matching has the highest priority, the task allocation based on a specified flag has the second highest priority, and the task allocation based on load has the lowest priority …¶0044; … the controller performs functional analysis on the computation graph according to the current load conditions of the computation clusters and the characteristics of different computation units included in the computation clusters, partitions the computation graph into a plurality of computation subtasks, and distributes the computation subtasks to the corresponding task queues of the computation units that can process this type of computation subtasks).  
Per claim 11, Wang discloses: wherein the determined current portion corresponds to a compute heavy portion of the machine learning workload (¶0044; … the controller performs functional analysis on the computation graph according to the current load conditions of the computation clusters and the characteristics of different computation units included in the computation clusters, partitions the computation graph into a plurality of computation subtasks, and distributes the computation subtasks to the corresponding task queues of the computation units that can process this type of computation subtasks).
Per claim 12, Wang discloses: wherein during the compute heavy portion, the processor is configured to assign a data request corresponding to a compute operation to a different priority queue than a data request corresponding to a communication operation (¶0046; One or more configuration options can be set at the same time, and different configuration options have different priorities. For example, the task allocation based on type matching has the highest priority, the task allocation based on a specified flag has the second highest priority, and the task allocation based on load has the lowest priority … ¶0044; … the controller performs functional analysis on the computation graph according to the current load conditions of the computation clusters and the characteristics of different computation units included in the computation clusters, partitions the computation graph into a plurality of computation subtasks, and distributes the computation subtasks to the corresponding task queues of the computation units that can process this type of computation subtasks).
Per claim 13, Wang discloses: wherein the determined current portion corresponds to a communication heavy portion of the machine learning workload (¶0044; … the controller performs functional analysis on the computation graph according to the current load conditions of the computation clusters and the characteristics of different computation units included in the computation clusters, partitions the computation graph into a plurality of computation subtasks, and distributes the computation subtasks to the corresponding task queues of the computation units that can process this type of computation subtasks).
Per claim 14, Wang discloses: wherein during the communication heavy portion, the processor is configured to assign a data request corresponding to a compute operation to a different priority queue than a data request corresponding to a communication operation (¶0046; One or more configuration options can be set at the same time, and different configuration options have different priorities. For example, the task allocation based on type matching has the highest priority, the task allocation based on a specified flag has the second highest priority, and the task allocation based on load has the lowest priority … ¶0044; … the controller performs functional analysis on the computation graph according to the current load conditions of the computation clusters and the characteristics of different computation units included in the computation clusters, partitions the computation graph into a plurality of computation subtasks, and distributes the computation subtasks to the corresponding task queues of the computation units that can process this type of computation subtasks).
Claims 15-19 are the method claims corresponding to system claims 1-5 and are rejected for the same reasons set forth in connection with the rejection of claims 1-5.
Claim 20 is the method claim corresponding to system claim 1 and is rejected for the same reasons set forth in connection with the rejection of claim 1.
Remark
Examiner respectfully requests, in response to this Office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist Examiner in prosecuting the application.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BABOUCARR FAAL whose telephone number is (571)270-5073. The examiner can normally be reached M-F 8:30-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tom VO, can be reached at (571)272-3642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BABOUCARR FAAL/Primary Examiner, Art Unit 2138