DETAILED ACTION
Claims 1-20 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7, 8, 10-15, 17, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chai et al. (US PGPUB US 2020/0134461 A1), in view of Hamilton et al. (US PGPUB US 2015/0121272 A1).

Regarding claim 1, Chai teaches a method for processing a computing task, comprising: 
determining parameter data of multiple layers associated with a neural network model in response to receiving a computing task based on the neural network model (¶ [0004]: Machine learning algorithms have recently made rapid progress using deep neural networks (DNNs); ¶ [0031]: DNN 106 has a plurality of layers 108. Each of layers 108 may include a respective set of artificial neurons. Layers 108 include an input layer 108A, an output layer 108N, and one or more hidden layers (e.g., layers 108B through 108M). Layers 108 may include fully connected layers, convolutional layers, pooling layers, and/or other types of layers.);
ranking a given number of layers of the multiple layers on the basis of the parameter data so as to obtain a layer list (¶ [0008]: storing a set of weights of the DNN and a set of bit precision values of the DNN, the DNN including a plurality of layers, wherein for each layer of the plurality of layers, the set of weights includes weights of the layer and the set of bit precision values includes a bit precision value of the layer, the weights of the layer being represented in memory using values having bit precisions equal to the bit precision value of the layer, the weights of the layer being associated with inputs to neurons of the layer; ¶ [0033]: memory 102 stores a set of low-precision weights 116 for DNN 106 (which may be referred to herein as a first set of weights), a set of high-precision weights 114 (which may be referred to herein as a second set of weights), and a set of bit precision values 118 (i.e., rank));
ranking multiple computing resources on the basis of status information of the multiple computing resources so as to obtain a resource list (¶ [0135]: machine learning system 104 (FIG. 1) may first review the system architecture on which neural network software architecture 1100 needs to operate in inference mode. For example, machine learning system 104 may determine that, for a 1-bit BNN, the best processor would be an FPGA because FPGAs have fine grain programmable units that can support binary operations. (i.e., rank); Fig. 12 shows multiple system architectures);
determining a mapping between a corresponding layer among the multiple layers and a corresponding computing resource among the multiple computing resources, the mapping indicating the corresponding computing resource among the multiple computing resources is to process parameters associated with the corresponding layer among the multiple layers (Fig. 12; ¶ [0030]: mapping of DNNs in the neural network software architecture to processors of a hardware architecture; ¶ [0136]: The system architecture parameters are used to map the neural network software architecture to the appropriate processors in the system architecture. For instance, machine learning system 104 of computing system 100 (FIG. 1) map the neural network software architecture to appropriate processors in the system architecture. Machine learning system 104 may use a cost function to select the best mapping (e.g., a best fit algorithm can be used). The cost function can be one of size, weight, power and cost (SWaPC). For example, given the selection of an 8-bit CNN, machine learning system 104 may use the cost function to select a mapping that provides a lower system SWaPC. For instance, machine learning system 104 may evaluate various potential mappings of DNNs in a neural network software architecture to processors in a hardware architecture. Machine learning system 104 may use a mapping cost function to select a mapping of DNNs in the neural network software architecture to processors in the hardware architecture; ¶ [0137]: FIG. 12 shows an example mapping of a neural network software architecture to a system architecture. More specifically, machine learning system 104 may map 8-bit CNN 1102 of FIG. 11 to 32-bit floating point CPU 1206 of FIG. 12. The same 8-bit CNN, if mapped to an FPGA, may incur higher computation resources (e.g., higher usage of memory and a FPGA fabric to support floating point computations). Furthermore, in the example of FIG. 12, machine learning system 104 may map 1-bit BNN 1108 to 1-bit FPGA 1208, map 8-bit MLP 1110 to 16-bit CPU 1202, map 32-bit LSTM 1106 to 64-bit floating point GPU 1204, and map 4-bit CNN 1104 to 8-bit DSP 1210. After machine learning system 104 maps a DNN to a processor, the processor may ; and
causing allocation of the corresponding computing resource to process the parameters associated with the corresponding layer based on the mapping (¶ [0137]: More specifically, machine learning system 104 may map 8-bit CNN 1102 of FIG. 11 to 32-bit floating point CPU 1206 of FIG. 12. The same 8-bit CNN, if mapped to an FPGA, may incur higher computation resources (e.g., higher usage of memory and a FPGA fabric to support floating point computations). Furthermore, in the example of FIG. 12, machine learning system 104 may map 1-bit BNN 1108 to 1-bit FPGA 1208, map 8-bit MLP 1110 to 16-bit CPU 1202, map 32-bit LSTM 1106 to 64-bit floating point GPU 1204, and map 4-bit CNN 1104 to 8-bit DSP 1210. After machine learning system 104 maps a DNN to a processor, the processor may execute the DNN. For instance, in the example of FIG. 12, GPU 1206 may execute CNN 1102).

	While Chai ranks layers based on weights and determines the best (i.e., rank) resources out of a set of heterogeneous resources, Chai does not expressly disclose a layer list, a resource list, and determining a mapping between a corresponding layer among the multiple layers and a corresponding computing resource among the multiple computing resources on the basis of the layer list and the resource list. 

	However, Hamilton teaches a layer list (Fig. 5, Step 7.8.3; ¶ [0038]: At Step 7.8.3, define list/index of task/subtask/element; wherein a subtask of a task corresponds to a layer of neural network model), a resource list (Fig. 5, Step 7.8.2 Define list of resources), and determining a mapping between a corresponding layer among the multiple layers and a corresponding computing resource among the multiple computing resources on the basis of the layer list and the resource list (Fig. 5, Steps 7.8.4 and 7.8.5; ¶ [0038]: Go to Step 7.8.4. At Step 7.8.4, relate (and can display) a relationship connection or lack of connection between Resource and Task/subtask/element (T2RESOURCE) Go to Step 7.8.5. At Step 7.8.5, determine if there is a relationship between each task/subtask/element and at least one Resource?: This is a yes or no question, Go to Step 7.8.6 else go to Step 7.8.3. At Step 7.8.6, if answer to 7.8.5 is Yes: Go to Step 7.8.10. At Step 7.8.10, Output T2RESOURCE, connections, connection changes, RESOURCES, INDEX).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hamilton with the teachings of Chai to correlate resources in a list with a list of tasks to ensure optimal resource allocation to tasks. The modification would have been motivated by a desire to ensure that the subtasks/layers are executed by an appropriate resource.
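For illustration only, the sequence of steps recited in claim 1 (rank the layers, rank the resources, then map one list onto the other) can be sketched as follows. This is a minimal sketch under stated assumptions and is not an implementation disclosed by Chai, Hamilton, or the application: the names Layer, Resource, num_params, free_capacity, and build_mapping are hypothetical, and the descending sort order and round-robin pairing are assumed solely for the example.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    num_params: int  # hypothetical stand-in for the "parameter data" of a layer


@dataclass
class Resource:
    name: str
    free_capacity: float  # hypothetical stand-in for "status information"


def build_mapping(layers, resources):
    """Return a dict mapping each layer name to a resource name."""
    # Layer list: layers ranked by parameter count, largest first (assumed order).
    layer_list = sorted(layers, key=lambda l: l.num_params, reverse=True)
    # Resource list: resources ranked by status, most free capacity first (assumed order).
    resource_list = sorted(resources, key=lambda r: r.free_capacity, reverse=True)
    # Mapping: pair the i-th ranked layer with the i-th ranked resource,
    # wrapping around when there are more layers than resources.
    return {
        layer.name: resource_list[i % len(resource_list)].name
        for i, layer in enumerate(layer_list)
    }


if __name__ == "__main__":
    layers = [Layer("conv", 100), Layer("fc", 1000), Layer("out", 10)]
    resources = [Resource("gpu0", 0.2), Resource("gpu1", 0.9)]
    print(build_mapping(layers, resources))
```

In this sketch the largest layer is paired with the least-loaded resource; any other pairing rule could be substituted without changing the list-based structure.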

Regarding claim 2, Chai teaches wherein obtaining the layer list comprises: 
obtaining a corresponding number of parameters comprised by a corresponding layer among the multiple layers on the basis of the parameter data and ranking the multiple layers on the basis of the corresponding number so as to obtain the layer list (¶ [0008]: storing a set of weights of the DNN and a set of bit precision values of the DNN, the DNN including a plurality of layers, wherein for each layer of the plurality of layers, the set of weights includes weights of the layer and the set of bit precision values includes a bit precision value of the layer, the weights of the layer being represented in memory using values having bit 

Regarding claim 3, Chai teaches wherein determining the mapping comprises: 
for a first layer among the multiple layers, selecting a first computing resource matching with a first number of parameters of the first layer from the multiple computing resources so as to process parameters associated with the first layer (¶ [0136]: The system architecture parameters are used to map the neural network software architecture to the appropriate processors in the system architecture. For instance, machine learning system 104 of computing system 100 (FIG. 1) map the neural network software architecture to appropriate processors in the system architecture. Machine learning system 104 may use a cost function to select the best mapping (e.g., a best fit algorithm can be used). The cost function can be one of size, weight, power and cost (SWaPC). For example, given the selection of an 8-bit CNN, machine learning system 104 may use the cost function to select a mapping that provides a lower system SWaPC. For instance, machine learning system 104 may evaluate various potential ; and
determining the mapping on the basis of the first layer and the first computing resource (¶ [0137]: FIG. 12 shows an example mapping of a neural network software architecture to a system architecture. More specifically, machine learning system 104 may map 8-bit CNN 1102 of FIG. 11 to 32-bit floating point CPU 1206 of FIG. 12. The same 8-bit CNN, if mapped to an FPGA, may incur higher computation resources (e.g., higher usage of memory and a FPGA fabric to support floating point computations). Furthermore, in the example of FIG. 12, machine learning system 104 may map 1-bit BNN 1108 to 1-bit FPGA 1208, map 8-bit MLP 1110 to 16-bit CPU 1202, map 32-bit LSTM 1106 to 64-bit floating point GPU 1204, and map 4-bit CNN 1104 to 8-bit DSP 1210. After machine learning system 104 maps a DNN to a processor, the processor may execute the DNN. For instance, in the example of FIG. 12, GPU 1206 may execute CNN 1102.).

	In addition, Hamilton teaches selecting, on the basis of the resource list, a first computing resource matching with a first number of parameters of the first layer (Fig. 5, Steps 7.8.4 and 7.8.5; ¶ [0038]: Go to Step 7.8.4. At Step 7.8.4, relate (and can display) a relationship connection or lack of connection between Resource and Task/subtask/element (T2RESOURCE) Go to Step 7.8.5. At Step 7.8.5, determine if there is a relationship between each task/subtask/element and at least one Resource?: This is a yes or no question, Go to Step 7.8.6 .

Regarding claim 4, Chai teaches further comprising: 
determining a first resource allocation required for processing parameters associated with the first layer (¶ [0137]: machine learning system 104 may map 8-bit CNN 1102 of FIG. 11 to 32-bit floating point CPU 1206 of FIG. 12.); 
updating the status information on the basis of the first resource allocation and ranking the multiple computing resources on the basis of the updated status information so as to obtain an updated resource list (¶ [0145]: if one of the hardware resources becomes unavailable (e.g., lost power, network connection lost, etc.) a BitNet training method can be invoked to map the neural network software architecture to a new set of system architecture parameters. Specifically, a new mapping is performed on the neural network software architecture (e.g., neural network software architecture 1100 of FIG. 11) to a new system architecture (e.g., a subset of processors in FIG. 12).).
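For illustration only, the update-and-re-rank behavior recited in claims 4 and 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not a method disclosed by Chai or the application: the function name assign_with_updates, the use of a single numeric capacity as the status information, and the choice of the top-ranked resource are all hypothetical.

```python
def assign_with_updates(layer_params, capacity):
    """Assign layers to resources, re-ranking resources after each allocation.

    layer_params: dict of layer name -> parameter count (hypothetical input)
    capacity:     dict of resource name -> free capacity (hypothetical status)
    """
    remaining = dict(capacity)  # working copy of the status information
    mapping = {}
    # Visit layers from largest to smallest parameter count (assumed order).
    for layer, n in sorted(layer_params.items(), key=lambda kv: kv[1], reverse=True):
        # Re-rank the resources on the basis of the updated status information.
        resource_list = sorted(remaining, key=remaining.get, reverse=True)
        chosen = resource_list[0]  # resource at the head of the updated list
        mapping[layer] = chosen
        # Update the status to reflect the allocation just made.
        remaining[chosen] -= n
    return mapping, remaining


if __name__ == "__main__":
    print(assign_with_updates({"a": 50, "b": 30, "c": 20}, {"r1": 100, "r2": 40}))
```

Because the list is re-sorted after every allocation, a resource that was ranked first for one layer may drop in rank for the next.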

Regarding claim 5, Chai teaches further comprising: 
for a second layer neighboring the first layer among the multiple layers, selecting, on the basis of the updated resource list, a second computing resource matching with a second number of parameters of the second layer from the multiple computing resources so as to process parameters associated with the second layer and determining the mapping on the basis of the second layer and the second computing resource (¶ [0144]: the AI system may anticipate its resource needs by determining which parts of the neural network software 

Regarding claim 7, Chai teaches wherein determining status information of the multiple computing resources comprises: 
monitoring resource information of the multiple computing resources, and determining status information of the multiple computing resources on the basis of the resource information (¶ [0144]: computing system may identify a plurality of different 

Regarding claim 8, Chai teaches wherein resource information of the multiple computing resources comprises at least any one indicator of: 
processing capacity information, memory resource information and bandwidth resource information of a corresponding computing resource among the multiple computing resources (¶ [0144]: computing system may identify a plurality of different scenarios. In some examples, different scenarios may be different system architectures. In some examples, different scenarios may involve the same set of processors but with differences in other parameters, such as available bandwidth, available remaining battery life, remaining allocable memory space, processor workload, and so on.).

Regarding claim 10, Chai teaches wherein the multiple computing resources are multiple graphics processing units (¶ [0134]: system architecture 1200 comprising a heterogeneous set of processors. In the example of FIG. 12, the processors include a CPU 1202, a GPU 1204, a GPU 1206, a FPGA 1208, and a DSP 1210.).

Regarding claim 11, it is a system type claim having similar limitations as claim 1 above. Therefore, it is rejected under the same rationale above. The additional limitations “at least one processor; a volatile memory; and a memory coupled to the at least one processor and having instructions stored thereon, the instructions, when executed by the at least one processor, causing the apparatus to perform acts comprising” are taught by Chai in at least ¶ [0171]: “The techniques described in this disclosure may also be embodied or encoded in a computer-readable medium, such as a computer-readable storage medium, containing instructions. Instructions embedded or encoded in a computer-readable storage medium may cause a programmable processor, or other processor, to perform the method, e.g., when the instructions are executed.”

Regarding claim 12, it is a system type claim having similar limitations as claim 2 above. Therefore, it is rejected under the same rationale above.

Regarding claim 13, it is a system type claim having similar limitations as claim 3 above. Therefore, it is rejected under the same rationale above.

Regarding claim 14, it is a system type claim having similar limitations as claim 4 above. Therefore, it is rejected under the same rationale above.

Regarding claim 15, it is a system type claim having similar limitations as claim 5 above. Therefore, it is rejected under the same rationale above.

Regarding claim 17, it is a system type claim having similar limitations as claim 7 above. Therefore, it is rejected under the same rationale above.

Regarding claim 18, it is a system type claim having similar limitations as claim 8 above. Therefore, it is rejected under the same rationale above.

Regarding claim 20, it is a media/product type claim having similar limitations as claim 1 above. Therefore, it is rejected under the same rationale above.

Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chai and Hamilton, as applied to claim 1, in further view of Hassan Hussein et al. (US PGPUB US 2020/0120707 A1).

Regarding claim 6, Chai teaches wherein the layer list is ranked in order of the corresponding number (¶ [0008]: storing a set of weights of the DNN and a set of bit precision values of the DNN, the DNN including a plurality of layers, wherein for each layer of the plurality of layers, the set of weights includes weights of the layer and the set of bit precision values includes a bit precision value of the layer, the weights of the layer being represented in memory using values having bit precisions equal to the bit precision value of the layer, the weights of the layer being associated with inputs to neurons of the layer; ¶ [0033]: memory 102 stores a set of low-precision weights 116 for DNN 106 (which may be referred to herein as a first set of weights), a set of high-precision weights 114 (which may be referred to herein as a second set of weights), and a set of bit precision values 118 (i.e., rank)), the resource list is ranked in order of status information of the multiple computing resources (¶ [0135]: machine learning system 104 (FIG. 1) may first review the system architecture on which neural network software architecture 1100 needs to operate in inference mode. For example, machine learning system 104 i.e., rank)).

	In addition, while Chai describes layer sets and resource sets, Chai does not expressly disclose the sets as being lists. 

	However, Hamilton teaches a layer list (Fig. 5, Step 7.8.3; ¶ [0038]: At Step 7.8.3, define list/index of task/subtask/element; wherein a subtask of a task corresponds to a layer of neural network model), a resource list (Fig. 5, Step 7.8.2 Define list of resources).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hamilton with the teachings of Chai to correlate resources in a list with a list of tasks to ensure optimal resource allocation to tasks. The modification would have been motivated by a desire to ensure that the subtasks/layers are executed by an appropriate resource.

Chai and Hamilton do not expressly disclose selecting the second computing resource comprises: selecting a computing resource at an endpoint position of the updated resource list as the second computing resource.

However, Hassan Hussein teaches selecting the second computing resource comprises: selecting a computing resource at an endpoint position of the updated resource list as the second computing resource (¶ [0041]: For example, the JRP physical resource assigned to a i.e., least ranked)).

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Hassan Hussein with the teachings of Chai and Hamilton to allocate the first ranked resource to a first device and a least ranked resource to a second device. The modification would have been motivated by a desire to avoid access contentions and thereby increase reliability. See at least ¶ [0041].

Regarding claim 16, it is a system type claim having similar limitations as claim 6 above. Therefore, it is rejected under the same rationale above.

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chai and Hamilton, as applied to claim 1, in further view of Laird et al. (US PGPUB US 2019/0068519 A1).

Regarding claim 9, Chai and Hamilton do not expressly disclose wherein determining status information of the multiple computing resources on the basis of the resource information further comprises: 
for a given computing resource among the multiple computing resources, determining importance of a corresponding indicator among the multiple indicators for the given computing resource on the basis of the computing task; and determining status information for the given computing resource on the basis of the importance of the corresponding indicator and the corresponding indicator.

However, Laird teaches wherein determining status information of the multiple computing resources on the basis of the resource information further comprises: 
for a given computing resource among the multiple computing resources, determining importance of a corresponding indicator among the multiple indicators for the given computing resource on the basis of the computing task and determining status information for the given computing resource on the basis of the importance of the corresponding indicator and the corresponding indicator (¶ [0018]: The network resource optimization system may process the resource request and determine or identify one or more network resources that satisfy the required resource parameters. The identified network resources may then be evaluated or ranked to determine a subset of network resources using a score and/or a number of other factors. Some of these factors may include, but is not limited to, past response times, acceptance of communication requests (e.g., the number of times previous communication requests were accepted and whether communication sessions were established between the network resource and network nodes), availability, quality of the resource, performance 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Laird with the teachings of Chai and Hamilton to determine the ranking of a resource based on its availability. The modification would have been motivated by a desire to ensure that the resources included in the set of resources are available. See at least Laird’s ¶ [0052].
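For illustration only, the claim 9 limitation discussed above (weighting each resource indicator by a task-dependent importance to arrive at status information) can be sketched as follows. This is a minimal sketch under stated assumptions, not a computation disclosed by Laird, Chai, Hamilton, or the application: the weighted sum, the function names status_score and rank_resources, and the indicator keys are hypothetical.

```python
def status_score(indicators, weights):
    """Combine indicators into one status value using task-dependent weights.

    indicators: dict of indicator name -> measured value (hypothetical keys)
    weights:    dict of indicator name -> importance for the computing task
    Indicators with no weight are treated as having zero importance.
    """
    return sum(weights.get(name, 0.0) * value for name, value in indicators.items())


def rank_resources(resources, weights):
    """Return resource names ordered from highest to lowest status score."""
    return sorted(
        resources,
        key=lambda name: status_score(resources[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    resources = {
        "gpu0": {"compute": 0.5, "memory": 0.8, "bandwidth": 0.2},
        "gpu1": {"compute": 0.9, "memory": 0.1, "bandwidth": 0.9},
    }
    # Assumed importance for a compute-heavy task.
    print(rank_resources(resources, {"compute": 2.0, "memory": 1.0}))
```

Changing the weights for a different task (for example, favoring memory over compute) can change the resulting order without changing the measured indicators.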

Regarding claim 19, it is a system type claim having similar limitations as claim 9 above. Therefore, it is rejected under the same rationale above.
Response to Arguments
Applicant's arguments filed on 03/04/2021 have been fully considered but they are not persuasive.
In the Remarks, Applicant argues:
(I)	Applicant’s claimed arrangement, however, recites that determining parameter data of multiple layers associated with a neural network model in response to receiving a computing task based on the neural network model. Applicant’s claimed arrangement further recites causing allocation of the corresponding computing resource to process the parameters associated with the 
Hamilton fails to supplement the fundamental deficiencies of Chai. In particular, Hamilton is cited for the alleged disclosure of a layer list. See the Office Action at page 11. Accordingly, Hamilton fails to supplement the fundamental deficiencies of Chai, and the collective teachings of Chai and Hamilton fail to disclose or suggest the newly-added features of claim 1.
In view of the above, examiner respectfully submits the following.
As to point (I)
Examiner respectfully disagrees with the Applicant for at least the following reasons. Chai teaches deep neural networks (DNNs) that have multiple hidden layers (see at least ¶ [0004]) and a “DNN that receives a description of a hardware architecture and a description of a problem that a network software architecture is designed to solve” in at least ¶ [0137]. With this information, the system eventually determines a best-fit mapping between layers and processors of a system architecture. Therefore, Chai reasonably teaches determining parameter data of multiple layers associated with a neural network model because, in much the same way, an ideal processor is selected based on the requirements of a layer. Accordingly, Chai teaches the argued limitation and the rejection is maintained. Regarding the amended limitation “causing allocation of the corresponding computing resource to process the parameters associated with the corresponding layer based on the mapping,” it is respectfully submitted that Chai teaches in at least ¶ [0137]: “After machine learning system 104 maps a DNN to a processor, the processor may execute the DNN. For instance, in the example of FIG. 12, GPU 1206 may execute CNN 1102.” Accordingly, Chai reasonably teaches the amended limitation.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JORGE A CHU JOY-DAVILA whose telephone number is (571)270-0692.  The examiner can normally be reached on Monday-Friday, 9:00am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng-Ai T An can be reached on (571)-272-3756.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications 






/JORGE A CHU JOY-DAVILA/Primary Examiner, Art Unit 2195