Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

CLAIM INTERPRETATION

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as "configured to" or "so that"; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Such claim limitation(s) is/are: “means for obtaining,” “means for comparing,” “means for identifying neural network unit,” “means for identifying a target processor,” and “means for migrating” in claim 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-30 are rejected under 35 U.S.C. 103 as being unpatentable over Tu et al. (US 2014/0115363, hereinafter Tu) in view of Merrill et al. (US 2016/0210550, hereinafter Merrill).

Regarding claim 1, Tu discloses 
A method for workload re-allocation (paragraph [0008]: Based on the operational mode and its associated performance goal(s), an active workload of the processing components may be reallocated across the processing components based on the individual performance capabilities of each) in a system-on-a-chip (“SoC”) having a plurality of heterogeneous processors (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228), comprising:
obtaining, by a workload allocation control processor, a plurality of measurements, each characterizing operation (paragraph [0022]: it will be understood that the terms "thermal" and "thermal energy" may be used in association with a device or component capable of generating or dissipating energy that can be measured in units of "temperature") of a corresponding SoC processor (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed);
comparing, by the workload allocation control processor, each of the measurements with one or more thresholds (paragraph [0022]: it will be understood that the terms "thermal" and "thermal energy" may be used in association with a device or component capable of generating or dissipating energy that can be measured in units of "temperature"; claim 9: wherein a recognized mode-decision condition is associated with the PS mode and comprises a temperature reading that exceeds a predetermined threshold);
identifying, by the workload allocation control processor, a workload executing on one of the SoC processors (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed) based on metadata associated with the workload (paragraph [0023]: one of ordinary skill in the art will acknowledge that use of these "thermal" terms in the present disclosure may be related to process load distributions, workload burdens and power consumption; paragraph [0029]: one of ordinary skill in the art will recognize that the performance characteristics associated with any given processing component may vary in relation with the operating temperature of that processing component, the power supplied to that processing component, etc.) and based on a result of a comparison of the measurements (paragraph [0022]: it will be understood that the terms "thermal" and "thermal energy" may be used in association with a device or component capable of generating or dissipating energy that can be measured in units of "temperature"; paragraph [0029]: one of ordinary skill in the art will recognize that the performance characteristics associated with any given processing component may vary in relation with the operating temperature of that processing component, the power supplied to that processing component, etc.) with the one or more thresholds (paragraph [0041]: the PS mode-decision conditions outlined in the FIG. 2 graph are not offered as an exhaustive list of the triggers that may be used to point a MAM module to a PS mode and, as such, one of ordinary skill in the art will recognize that other triggers or conditions within a PCD may be used to indicate that workloads should be allocated or reallocated to processing components with low power consumption characteristic; claim 9: wherein a recognized mode-decision condition is associated with the PS mode and comprises a temperature reading that exceeds a predetermined threshold);
identifying, by the workload allocation control processor, a target processor (paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228) based on the metadata associated with the workload and based on the result of the comparison (paragraph [0041]: the PS mode-decision conditions outlined in the FIG. 2 graph are not offered as an exhaustive list of the triggers that may be used to point a MAM module to a PS mode and, as such, one of ordinary skill in the art will recognize that other triggers or conditions within a PCD may be used to indicate that workloads should be allocated or reallocated to processing components with low power consumption characteristic; claim 9: wherein a recognized mode-decision condition is associated with the PS mode and comprises a temperature reading that exceeds a predetermined threshold); and 
migrating, by the workload allocation control processor, the workload from the one of the processors to the target processor (paragraph [0042]: the NNP region may not be fixed, and may be dynamically allocated ... the workload reallocation across the processing components 222, 224, 226, 228 may be based on determination of an operational mode). 
Tu does not explicitly disclose a neural network unit. Merrill discloses a neural network unit (paragraph [0007]: the architecture may allow for leveling and load balancing to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage; paragraph [0066]: when an appropriate configuration is available, data associated with each user request may be sent through the Network API 158 to an initiator 155, which may be tightly coupled 150 to one or more of the same or different types of processors 156. In one example, the dispatcher 153 may assign user requests to a specific NNP, being controlled by an initiator 155. In another example, the initiator 155 may assign user requests to one or more of the processors 156 it controls. The types of neural network processors 156 may include, but are not limited to, a reconfigurable interconnect NNP, a fixed-architecture NNP, a GPU, standard multi-processors, and/or virtual machines; paragraph [0067]: the Load Balancer 157 may manage the neural network queues 159 for performance, power, thermal stability, and/or wear-leveling of the NNPs, such as leveling the number of power-down cycles or leveling the number of configuration changes ... The Admin API 149 may include tools to monitor the queues and may control the Load Balancer's 157 priorities for loading or dropping configurations based on the initiator resources 155, the configurations power and/or performance and the neural network queue depths). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by performing load balancing to manage the neural network queues 159 for performance, power, and thermal stability of the NNPs, as taught by Merrill.
The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).
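For illustration only, the sequence of steps addressed above for claim 1 (obtaining measurements, comparing them with a threshold, identifying a workload from its metadata, identifying a target processor, and migrating) can be sketched as follows. This is a minimal sketch and is not taken from Tu, Merrill, or the present application; the names Workload, Processor, reallocate, the power_mw metadata field, the selection rules, and the 85.0 degree threshold are all hypothetical assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    name: str
    # Hypothetical metadata: estimated power draw (mW) on each processor, keyed by name.
    power_mw: dict


@dataclass
class Processor:
    name: str
    temperature_c: float  # assumed measurement characterizing operation
    workloads: list = field(default_factory=list)


def reallocate(processors, threshold_c):
    """Move one workload off each processor whose temperature exceeds the threshold."""
    migrations = []
    # Obtain the measurements and compare each one with the threshold.
    hot = [p for p in processors if p.temperature_c > threshold_c]
    cool = [p for p in processors if p.temperature_c <= threshold_c]
    for src in hot:
        if not src.workloads or not cool:
            continue
        # Identify a workload from metadata (assumed rule: highest power on src).
        wl = max(src.workloads, key=lambda w: w.power_mw[src.name])
        # Identify a target from metadata (assumed rule: lowest power for this workload).
        dst = min(cool, key=lambda p: wl.power_mw[p.name])
        # Migrate the workload from the source to the target.
        src.workloads.remove(wl)
        dst.workloads.append(wl)
        migrations.append((wl.name, src.name, dst.name))
    return migrations


if __name__ == "__main__":
    cpu = Processor("cpu", 92.0, [
        Workload("conv", {"cpu": 900, "dsp": 400, "npu": 250}),
        Workload("fc", {"cpu": 300, "dsp": 200, "npu": 350}),
    ])
    dsp = Processor("dsp", 60.0)
    npu = Processor("npu", 55.0)
    print(reallocate([cpu, dsp, npu], threshold_c=85.0))
    # prints [('conv', 'cpu', 'npu')]
```

In this sketch the comparison result and the metadata are both consulted when picking the workload and the target; other selection rules could be substituted without changing the overall flow.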
Regarding claim 10, referring to claim 1, Tu discloses A system for workload re-allocation in a system-on-a-chip (“SoC”), comprising: a plurality of heterogeneous SoC processors; and a workload allocation control processor configured to: ... (Fig. 4; paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228).
Regarding claim 19, referring to claim 1, Tu discloses A system for workload re-allocation in a system-on-a-chip (“SoC”), comprising: means for obtaining a plurality of measurements, each characterizing operation of a corresponding SoC processor; means for ... (Figs. 2 and 4; paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228).
Regarding claim 28, referring to claim 1, Tu discloses A computer program product for workload re-allocation in a system-on-a-chip (“SoC”), the computer program product comprising a non-transitory computer-readable medium having stored thereon in computer-executable form instructions executable on a processor system to: ... (Figs. 2 and 4; paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228; paragraph [0089]: computer-readable media).

Regarding claims 2, 11, 20, and 29, Tu discloses 
wherein identifying a workload comprises identifying the workload from among a plurality of workloads concurrently executing (paragraph [0008]: Based on the operational mode and its associated performance goal(s), an active workload of the processing components may be reallocated across the processing components based on the individual performance capabilities of each) on the one of the SoC processors (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228).
Tu does not explicitly disclose the neural network unit from among a plurality of neural network units. Merrill discloses the neural network unit from among a plurality of neural network units (paragraph [0007]: the architecture may allow for leveling and load balancing to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage; paragraph [0033]: the NNP architecture may simultaneously write multiple words on input bus 25 and output multiple words on the output bus 28 in a single clock cycle; paragraph [0066]: when an appropriate configuration is available, data associated with each user request may be sent through the Network API 158 to an initiator 155, which may be tightly coupled 150 to one or more of the same or different types of processors 156. In one example, the dispatcher 153 may assign user requests to a specific NNP, being controlled by an initiator 155. In another example, the initiator 155 may assign user requests to one or more of the processors 156 it controls. The types of neural network processors 156 may include, but are not limited to, a reconfigurable interconnect NNP, a fixed-architecture NNP, a GPU, standard multi-processors, and/or virtual machines; paragraph [0067]: the Load Balancer 157 may manage the neural network queues 159 for performance, power, thermal stability, and/or wear-leveling of the NNPs, such as leveling the number of power-down cycles or leveling the number of configuration changes ... The Admin API 149 may include tools to monitor the queues and may control the Load Balancer's 157 priorities for loading or dropping configurations based on the initiator resources 155, the configurations power and/or performance and the neural network queue depths). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by performing load balancing to manage the neural network queues 159 for performance, power, and thermal stability of the NNPs, as taught by Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 3, 12, 21, and 30, Tu discloses 
wherein the workload comprises a workload of one or more workloads, each ... concurrently executing (paragraph [0008]: Based on the operational mode and its associated performance goal(s), an active workload of the processing components may be reallocated across the processing components based on the individual performance capabilities of each) on the one of the SoC processors (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228).
Tu does not explicitly disclose wherein the neural network unit comprises a neural network layer or a neural network of one or more neural networks, each having a plurality of layers concurrently executing on the plurality of ... processors. Merrill discloses wherein the neural network unit comprises a neural network layer or a neural network of one or more neural networks, each having a plurality of layers concurrently executing on the plurality of ... processors (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing neural network layers on NNPs, as taught by Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 4, 13, and 22, Tu discloses 
wherein: the metadata is contained in an input file having workload information and comprises a power consumption level of each workload (paragraph [0023]: one of ordinary skill in the art will acknowledge that use of these "thermal" terms in the present disclosure may be related to process load distributions, workload burdens and power consumption; paragraph [0029]: one of ordinary skill in the art will recognize that the performance characteristics associated with any given processing component may vary in relation with the operating temperature of that processing component, the power supplied to that processing component, etc.) executing on each SoC processor (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228); and
the measurements comprise temperatures associated with execution of the plurality of workloads (paragraph [0022]: it will be understood that the terms "thermal" and "thermal energy" may be used in association with a device or component capable of generating or dissipating energy that can be measured in units of "temperature"; paragraph [0029]: one of ordinary skill in the art will recognize that the performance characteristics associated with any given processing component may vary in relation with the operating temperature of that processing component, the power supplied to that processing component, etc.) on each of the plurality of SoC processors (paragraph [0005]: Various embodiments of methods and systems for mode-based workload reallocation in a portable computing device that contains a heterogeneous, multi-processor system on a chip ("SoC") are disclosed; paragraph [0042]: FIG. 3 is a functional block diagram illustrating an embodiment of an on-chip system 102 for mode-based workload reallocation in a heterogeneous, multi-core PCD 100 ... the processing component(s) 110 is depicted as a group of heterogeneous processing engines 222, 224, 226, 228 for illustrative purposes only and may represent a single processing component having multiple, heterogeneous cores 222, 224, 226, 228 or multiple, heterogeneous processors 222, 224, 226, 228).
Tu does not explicitly disclose neural network information ... neural network unit. Merrill discloses neural network information ... neural network unit (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing neural network layers on NNPs, as taught by Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 5, 14, and 23, Tu discloses 
wherein identifying the workload from among a plurality of workloads concurrently executing on the one of the SoC processors comprises identifying the workload indicated by the metadata as having a highest power consumption of all workloads executing on the one of the SoC processors (paragraph [0022]: it will be understood that the terms "thermal" and "thermal energy" may be used in association with a device or component capable of generating or dissipating energy that can be measured in units of "temperature"; paragraph [0029]: one of ordinary skill in the art will recognize that the performance characteristics associated with any given processing component may vary in relation with the operating temperature of that processing component, the power supplied to that processing component, etc.; paragraph [0030]: consider an exemplary heterogeneous multi-core processor which may include a number of different processing cores generally ranging in performance capacities from low to high (notably, one of ordinary skill in the art will recognize that an exemplary heterogeneous multi-processor system on a chip ("SoC") which may include a number of different processing components, each containing one or more cores, may also be considered)).
Tu does not explicitly disclose neural network unit from among a plurality of neural network units ... all neural network units. Merrill discloses neural network unit from among a plurality of neural network units ... all neural network units (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing neural network layers on NNPs, as taught by Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 6, 15, and 24, Tu discloses 
wherein identifying a target processor comprises identifying the SoC processor (paragraph [0030]: consider an exemplary heterogeneous multi-core processor which may include a number of different processing cores generally ranging in performance capacities from low to high (notably, one of ordinary skill in the art will recognize that an exemplary heterogeneous multi-processor system on a chip ("SoC") which may include a number of different processing components, each containing one or more cores, may also be considered)) indicated by the metadata as executing the identified workload at a lowest power consumption of all processors (paragraph [0041]: the PS mode-decision conditions outlined in the FIG. 2 graph are not offered as an exhaustive list of the triggers that may be used to point a MAM module to a PS mode and, as such, one of ordinary skill in the art will recognize that other triggers or conditions within a PCD may be used to indicate that workloads should be allocated or reallocated to processing components with low power consumption characteristic).
Tu does not explicitly disclose a neural network unit. Merrill discloses a neural network unit (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing neural network layers on NNPs, as taught by Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 7, 16, and 25, Tu discloses 
wherein: the metadata is contained in an input file having workload information and comprises a processing load level of each workload executing (paragraph [0020]: wherein a recognized mode-decision condition is associated with the PS mode and comprises a total processing capacity utilization of the processing components that is below a predetermined threshold) on each SoC processor (paragraph [0030]: consider an exemplary heterogeneous multi-core processor which may include a number of different processing cores generally ranging in performance capacities from low to high (notably, one of ordinary skill in the art will recognize that an exemplary heterogeneous multi-processor system on a chip ("SoC") which may include a number of different processing components, each containing one or more cores, may also be considered)); and
the measurements comprise processor utilization associated with execution of the plurality of workloads (paragraph [0040]: Other exemplary mode-decision conditions illustrated in FIG. 2 as possible triggers for a HPP mode include detection of a performance benchmark, a core utilization greater than some threshold (e.g., >90%)) on each of the SoC processors (paragraph [0030]: consider an exemplary heterogeneous multi-core processor which may include a number of different processing cores generally ranging in performance capacities from low to high (notably, one of ordinary skill in the art will recognize that an exemplary heterogeneous multi-processor system on a chip ("SoC") which may include a number of different processing components, each containing one or more cores, may also be considered)).
Tu does not explicitly disclose neural network information ... neural network unit ... the plurality of neural network units. Merrill discloses neural network information ... neural network unit ... the plurality of neural network units (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing layers of NNPs of Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 8, 17, and 26, Tu discloses 
wherein identifying the workload from among a plurality of workloads concurrently executing on the one of the SoC processors comprises identifying the workload indicated by the metadata as having a highest processing load of all workloads executing on the one of the SoC processors (paragraph [0020]: wherein a recognized mode-decision condition is associated with the PS mode and comprises a total processing capacity utilization of the processing components that is below a predetermined threshold; paragraph [0030]: consider an exemplary heterogeneous multi-core processor which may include a number of different processing cores generally ranging in performance capacities from low to high (notably, one of ordinary skill in the art will recognize that an exemplary heterogeneous multi-processor system on a chip ("SoC") which may include a number of different processing components, each containing one or more cores, may also be considered)).
Tu does not explicitly disclose neural network unit from among a plurality of neural network units ... all neural network units. Merrill discloses neural network unit from among a plurality of neural network units ... all neural network units (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing layers of NNPs of Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Regarding claims 9, 18, and 27, Tu discloses 
wherein identifying a target processor comprises identifying the processor indicated by the metadata as executing the identified workload at a lowest processing load of all processors (paragraph [0020]: wherein a recognized mode-decision condition is associated with the PS mode and comprises a total processing capacity utilization of the processing components that is below a predetermined threshold; paragraph [0030]: consider an exemplary heterogeneous multi-core processor which may include a number of different processing cores generally ranging in performance capacities from low to high (notably, one of ordinary skill in the art will recognize that an exemplary heterogeneous multi-processor system on a chip ("SoC") which may include a number of different processing components, each containing one or more cores, may also be considered); paragraph [0031]: Recognizing that certain cores in a heterogeneous processor are better suited to process a given workload than other cores when the PCD is in certain modes of operation, a mode-based workload reallocation algorithm can be leveraged to reallocate workloads to the processing core or cores which offer the best performance in the context of the given mode).
Tu does not explicitly disclose neural network unit. Merrill discloses neural network unit (paragraph [0065]: It is further contemplated that one or more of the fixed-architecture NNPs in 156 may be equivalent to 120 in FIG. 12, and may include a plurality of FPGAs, which may be reconfigured for each neural network, or layer of neural network, by the generator 152 ... It is further contemplated that any configuration may be composed of layers that may be executed on more than one type of processor or NNP and that the cache 154 may be a combination of volatile and non-volatile memories and may contain transient and/or permanent data). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teaching of Tu by executing layers of NNPs of Merrill. The motivation would have been to achieve near-optimal throughput across heterogeneous processing units with widely varying individual throughput capabilities, while minimizing the cost of processing including power usage (Merrill paragraph [0007]).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SISLEY KIM whose telephone number is (571)270-7832. The examiner can normally be reached on 9:30 A.M. - 6:30 P.M.
	If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Emerson Puente, can be reached on (571)272-3652. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SISLEY N KIM/
Primary Examiner, Art Unit 2196
4/28/2022