Reasons for Allowance
The following is a statement of reasons for the indication of allowable subject matter:
The prior art of record, Chandra et al. Pub. No. US 2019/0266015 A1 (hereafter Chandra) teaches a method, comprising: obtaining a dynamic system model (DNN models in at least Fig. 1 and receiving workloads in at least Fig. 11) based on a relation between an amount of at least one resource (The profiler 110 uses training data to generate a performance model for a DNN model. This performance model is used to determine predicted resource requirements for an instance of the DNN model, e.g., a DNN workload. Creating a performance model may be a supervised learning problem, where given a value of Sr, Ci, Bs, and Pc, the performance model predicts the time taken and peak RAM usage of a DNN workload in at least ¶ [0047] and ¶ [0042]) for a plurality of workloads (Systems, methods, and computer-executable instructions for scheduling neural network workloads on an edge device. A performance model for each neural network model is received  in at least abstract and scheduling workloads in at least Fig. 11) and at least one predefined service metric (FIGS. 4A, 4B, and 4C are graphs showing training data used for the profiler in accordance with respective examples. FIG. 4A illustrates maximum 402 and minimum 404 run time values for varying batch sizes. FIG. 4B illustrates maximum 412 and minimum 414 values of run time for varying precision values. FIG. 4C illustrates the run time for various DNN models for varying CPU core usage. These figures show that each metric may play a crucial role in controlling the time and memory taken by a DNN model for execution in at least ¶ [0047] and ¶ [0042]);
obtaining an instantaneous value of the at least one predefined service metric (The core allocator allocates of one or more cores 116A-116D to each DNN workload based on resource requirement of a DNN workload and current system utilization. The DNN parameter allocator takes the input from the allocator 106 and profiler 110 to assign various DNN parameters to each of the DNN workloads to maximize a specified optimization criteria … The allocator 106 uses the learned performance model profiler for each DNN model and current system utilization. The allocator 106 formulates allocation of DNN workloads as an optimization problem for assigning system resource and DNN parameters to each DNN workload while maximizing the specified optimization criteria and following the constraints that arise from hardware limitations in at least ¶ [0035] – [0036]);
applying to a given controller associated with a given one of the plurality of workloads (DNN model applied to controllers in at least Fig. 1, allocator, scheduler, profiler, et cetera): (i) the dynamic system model (DNN models in at least Fig. 1 and DNN model received and applied to controllers in at least Fig. 11), (ii) an interference effect that aggregates an amount of an allocation of resources to two or more additional workloads of the plurality of workloads (At 1140, image streams are received. The image streams are mapped to DNN workloads. At 1150, the DNN workloads are scheduled to be executed. At 1160, the DNN workloads are executed at the scheduled time on the assigned processing core. In addition, the corresponding image stream is provided to the DNN workload. After a period of time, the scheduling process may run again to continue to schedule any uncompleted DNN workload as well as new DNN workloads <plural, two or more> in at least ¶ [0138] and Fig. 11) on a performance of the given one of the plurality of workloads, and wherein the given controller determines an adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads based at least in part on the interference effect (to optimize the core allocations and the parameter allocations, the DNN parameters may be randomly initialized. For given DNN parameters, the optimal core allocation scheme is calculated for DNN workloads. Then, given the core allocation scheme, the optimal DNN parameters are determined. The optimal core allocation is then determined again, this time using the optimized DNN parameters. Using the latest core allocation, the DNN parameters may be optimized again. This process repeats until there is convergence. The core and DNN parameter allocators are described in greater detail below in at least ¶ [0053]; Examiner notes that each iteration reevaluates core allocations to reach the most optimized allocation. That is, as more workloads are scheduled (as in ¶ [0138]), the optimal core allocation will be changed by the aggregate effect of the additional workloads since the previous core allocation); and
(iii) parameters for optimizing the instantaneous value of the at least one predefined service metric and a target value for the at least one predefined service metric, wherein the given controller determines an optimization to the amount of the at least one resource for the given one of the plurality of workloads based at least in part on the parameters; and initiating, by the given controller, an application of the determined optimization to the amount of the at least one resource to the given one of the plurality of workloads (to optimize the core allocations and the parameter allocations, the DNN parameters may be randomly initialized. For given DNN parameters, the optimal core allocation scheme is calculated for DNN workloads. Then, given the core allocation scheme, the optimal DNN parameters are determined. The optimal core allocation is then determined again, this time using the optimized DNN parameters. Using the latest core allocation, the DNN parameters may be optimized again in at least ¶ [0053] and Fig. 11);
wherein the method is performed by at least one processing device of the given controller, wherein the at least one processing device comprises a processor coupled to a memory (Computing device 1200 may include a hardware processor 1202 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a hardware processor core, or any combination thereof), a main memory 1204 and a static memory 1206, some or all of which may communicate with each other via a link (e.g., bus) 1208 in at least ¶ [0140] and  a computer-readable storage media or machine-readable storage media may include any medium that is capable of storing, encoding, or carrying instructions for execution by the computing device 1200 and that cause the computing device 1200 to perform any one or more of the techniques of the present disclosure, or that is capable of storing, encoding or carrying data structures used by or associated with such instructions in at least ¶ [0144]).
Chandra teaches utilizing deep neural network models for optimizing resource allocations to workloads in view of instantaneous utilization and target values based on metrics. However, although Chandra clearly iterates to refine optimization (see at least ¶ [0053]) and accommodate additional workloads (see at least ¶ [0138] and Fig. 11), Chandra does not specifically recite that the workloads themselves are iterative. Further, Chandra teaches corrective adjustment (i.e., optimizing iteratively, see again the aforementioned citations) but does not specifically teach that these optimized corrections are pursuant to a difference between the instantaneous value and a target value.
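The alternating procedure quoted from Chandra ¶ [0053] above (fix the DNN parameters, choose the best core allocation, fix that allocation, choose the best parameters, repeat until convergence) can be illustrated with a minimal sketch. This is not Chandra's implementation: the function names, the candidate lists, and the toy cost function are hypothetical, an exhaustive search over small candidate lists stands in for whatever optimizer Chandra actually uses, and the starting parameter is fixed rather than randomly initialized so that the result is reproducible.

```python
def alternating_optimize(cost, core_options, param_options, init_params, max_rounds=50):
    """Alternate between the best core allocation for fixed parameters and the
    best parameters for a fixed core allocation until neither choice changes."""
    params = init_params
    cores = None
    for _ in range(max_rounds):
        # Step 1: hold the parameters fixed and pick the cheapest core allocation.
        new_cores = min(core_options, key=lambda c: cost(c, params))
        # Step 2: hold that allocation fixed and pick the cheapest parameters.
        new_params = min(param_options, key=lambda p: cost(new_cores, p))
        # Convergence: a full round changed neither choice.
        if new_cores == cores and new_params == params:
            break
        cores, params = new_cores, new_params
    return cores, params


def toy_cost(cores, batch_size):
    """Hypothetical cost: run time falls with more cores, each core carries a
    small charge, and batch sizes far from 8 are penalized."""
    return batch_size / cores + 0.5 * cores + abs(batch_size - 8)


best = alternating_optimize(
    toy_cost,
    core_options=[1, 2, 4],
    param_options=[4, 8, 16],
    init_params=16,
)
print(best)  # (4, 8)
```

In this toy case the loop settles after two rounds. Re-running it with a longer list of workloads or a different cost function would generally move the chosen allocation, which is the re-evaluation behavior the Examiner relies on above.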
However, in analogous art, Dirac et al. Pat. No. US 10,257,275 B1 (hereafter Dirac) teaches iterative workloads (The optimizer implements a plurality of iterations of execution of the model, interleaved with observation collection intervals. During a given observation collection interval, tunable parameter settings suggested by the previous model execution iteration are used in the execution environment, and the observations collected during the interval are used as inputs for the next model execution iteration. When an optimization goal is attained, the tunable settings that led to achieving the goal are stored in at least abstract and col. 4 lines 49-59);
a difference between the instantaneous value of the at least one predefined service metric and a target value for the at least one predefined service metric (The optimization tool may determine a definition of, or a formula for, an objective function which is to be maximized or minimized with respect to the optimization target execution environment in various embodiments. An objective function may be identified in various ways in different embodiments, e.g., based on input from the client or based on the types of applications being run. Objective functions may sometimes be referred to as “loss functions”. While a number of different types of performance-related execution results (such as throughput, response times, consumed CPU-minutes, network bandwidth used, etc.) may be collected from the execution environment in some cases, a scalar objective function (an optimization objective expressible and computable as a single value) may be used in various embodiments for Bayesian optimization using Gaussian processes. The value of the objective function for a given observation collection interval may be calculated using a combination of raw execution results by the optimization tool in some embodiments. For example, an objective function such as “total cost in dollars” may be computed using a collection of raw metrics such as “CPU-minutes consumed”, “network bandwidth used”, “storage space used”, in combination with cost metrics such as “dollars per CPU-minute”, “dollars-per-megabyte-of-bandwidth”, etc., in at least col. 3 lines 32-55), wherein the given controller determines an adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads based at least in part on the difference (In at least some embodiments, the optimization tool may determine the boundaries (e.g., start and end times) of the observation collection intervals as discussed below.
The iterations of model execution and observation collection may be repeated until either the targeted extremum of the objective function has been attained (at least to within some tolerance limit), or until the resources available for the optimization task have been exhausted. The combination of tunable parameter settings that correspond to the attainment of the optimization goal (or the combination of tunable parameter settings that came closest to the optimization goal) may be stored in various embodiments, e.g., in a persistent storage repository accessible to the optimization tool and/or in a knowledge base in at least col. 4 line 59 – col. 5 line 6 and col. 15 lines 39-55) and
initiating, by the given controller, an application of the determined adjustment to the amount of the at least one resource to the given one of the plurality of iterative workloads (The service 620 may select the appropriate resources for the execution environment corresponding to each of the computation requests 615, and initiate the execution of the program code 625 indicated in the computation requests in at least col. 12 lines 46-50).
It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to combine the iterative workloads of Dirac, and Dirac's corrective adjustment corresponding to a difference between the instantaneous value and a target value, with the systems and methods of Chandra, resulting in a system in which the workloads of Chandra are iterative as in Dirac and the corrective optimization of Chandra corresponds to a difference between the instantaneous value and a target value as in Dirac. A person having ordinary skill in the art would have been motivated to make this combination, with a reasonable expectation of success, for the purpose of iterating on the optimization such that the parameters may be tuned to further increase performance and efficiency (see at least Dirac col. 15 lines 39-55 and abstract).
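The Dirac passages relied upon above describe a scalar objective (for example, "total cost in dollars" computed from raw metrics and cost rates) that is re-evaluated after each observation collection interval until the target is attained within a tolerance or the optimization budget is exhausted. A minimal sketch of that control flow follows. It is illustrative only: all names, rates, and metric values are hypothetical, and a plain sweep over candidate settings is substituted for the Bayesian optimization using Gaussian processes that Dirac describes.

```python
def total_cost_dollars(raw_metrics, rates):
    """Scalar objective: weight each raw metric by its per-unit cost and sum."""
    return sum(raw_metrics[name] * rates[name] for name in rates)


def tune(observe, candidates, rates, target, tolerance, budget):
    """Evaluate one candidate setting per observation interval.

    Stops when the difference between the observed objective and `target` is
    within `tolerance`, or when `budget` intervals have been used.
    Returns (best_setting, best_value, goal_attained).
    """
    best_setting, best_value = None, float("inf")
    for used, setting in enumerate(candidates):
        if used >= budget:
            break  # resources available for the optimization task are exhausted
        value = total_cost_dollars(observe(setting), rates)
        if value < best_value:
            best_setting, best_value = setting, value
        if value - target <= tolerance:
            return best_setting, best_value, True
    # Goal not attained: report the setting that came closest.
    return best_setting, best_value, False


def observe(cpus):
    """Hypothetical observation interval: raw metrics for a given CPU count."""
    return {"cpu_minutes": 60.0 + 5.0 * cpus, "bandwidth_mb": 1000.0 / cpus}


rates = {"cpu_minutes": 0.02, "bandwidth_mb": 0.01}  # hypothetical dollars per unit
setting, value, attained = tune(
    observe, candidates=[1, 2, 4, 8], rates=rates,
    target=4.0, tolerance=0.2, budget=10,
)
print(setting, attained)
```

With these made-up numbers the sweep stops at the third candidate (4 CPUs), where the objective is about 4.10 dollars and therefore within 0.2 of the 4.0 target.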
As can be seen, neither Chandra nor Dirac anticipates or renders obvious the combination set forth in the independent claims. The claims further require a self-allocation effect of the given one of the plurality of iterative workloads on the given one of the plurality of iterative workloads with respect to the at least one predefined service metric, wherein the self-allocation effect is determined separately from the interference effect of the one or more additional iterative workloads of the plurality of iterative workloads on the given one of the plurality of iterative workloads with respect to the at least one predefined service metric. That is, Chandra teaches utilizing deep neural network models for optimizing resource allocations to workloads in view of instantaneous utilization and target values based on metrics, iterates to refine optimization (see at least ¶ [0053]), accommodates additional workloads (see at least ¶ [0138] and Fig. 11), and performs corrective adjustment (i.e., optimizing iteratively, see again the aforementioned citations), and Dirac teaches iterative workloads and optimized corrections pursuant to a difference between the instantaneous value and a target value. The combination of Chandra and Dirac nonetheless falls short of teaching the aforementioned self-allocation effect determined separately from the interference effect and with respect to a predefined service metric, within the context of the remainder of the claim.
The aforementioned limitations and reasons are in conjunction with all other claim limitations and the structure and environment which are not specifically recited in the quotes or expounded upon in the reasons. The Notice of Allowability is based on the totality of the claims. Thus, for at least the foregoing reasons, the prior art of record neither anticipates nor renders obvious the present invention as set forth in the claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRADLEY A TEETS, whose telephone number is (571) 272-3338. The examiner can normally be reached Monday through Friday, 6 am to 2 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Meng An, can be reached at (571) 272-3756. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRADLEY A TEETS/
Primary Examiner, Art Unit 2195