DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
In response to applicant’s arguments on page 13 that the combined teachings of Spinner et al., Griffin et al., and Sigal et al. fail to teach “auto-scaling, a processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value” in claim 1, the applicant’s argument has been considered, but is not deemed persuasive.
Sigal et al. teaches auto-scaling, a processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value (paragraph 0016, “ … a feed-forward backpropagation neural network is utilized to predict completion of incoming tasks …” teaches a backpropagation neural network trained to predict completion [predetermined time value] of incoming tasks [plurality of requests];
paragraph 0041, “The load balancer and neural network model execute within the backend thread independently from the client thread. The neural network model obtains task information and input parameters from the load balancer engine and in addition obtains observed resource utilization and analyzes the information. The information is utilized for training the neural network to predict resource utilization for future incoming tasks…” teaches prediction of resource utilization for a plurality of incoming tasks [the plurality of requests in the service master queue];
paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer … The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization for example. The computer program product obtains a resource utilization result, i.e., an actual observed resource utilization associated with the incoming task that is executed on a database server residing in a server cluster at 402. This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory.  The computer program product uses the observed resource utilization result to train a neural network and the observed result is associated with the incoming task name and its set of associated input parameters at 403. The computer program product provides predicted resource utilization for an incoming task having an already observed incoming task name although the newly inbound task may utilize different input parameters at 404. The computer program product optionally provides a connection to a predicted least busy server based on the predicted resource utilization for the new incoming task at 405.” teaches resource utilization parameter including CPU time to execute a task,
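teaches observed CPU time to execute [total execution time] for each incoming task out of a plurality of incoming tasks [plurality of requests],
teaches predicted CPU time to execute [predetermined time value] for each incoming task out of a plurality of incoming tasks [plurality of requests], and
teaches potential assignment of incoming task to an idle server in plurality of servers [auto-scaling, a processing node number] based upon observed and predicted resource utilization parameters).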
Based upon the above argument, the Examiner respectfully disagrees because the combination of Spinner et al., Griffin et al., and Sigal et al. does teach the limitations recited in claim 1. Therefore, the rejections of claim 1 and its dependent claims are proper and are maintained.
In response to applicant’s arguments on page 13 that the combined teachings of Spinner et al., Griffin et al., and Sigal et al. fail to teach “auto-scale a processing node number responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value” in claim 8, the applicant’s argument has been considered, but is not deemed persuasive.
Sigal et al. teaches auto-scaling, a processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value 
(paragraph 0041, “The load balancer and neural network model execute within the backend thread independently from the client thread. The neural network model obtains task information and input parameters from the load balancer engine and in addition obtains observed resource utilization and analyzes the information. The information is utilized for training the neural network to predict resource utilization for future incoming tasks…” teaches prediction of resource utilization for a plurality of incoming tasks [the plurality of requests in the service master queue];  
paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer … The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization for example. The computer program product obtains a resource utilization result, i.e., an actual observed resource utilization associated with the incoming task that is executed on a database server residing in a server cluster at 402. This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory.  The computer program product uses the observed resource utilization result to train a neural network and the observed result is associated with the incoming task name and its set of associated input parameters at 403. The computer program product provides predicted resource utilization for an incoming task having an already observed incoming task name although the newly inbound task may utilize different input parameters at 404. The computer program product optionally provides a connection to a predicted least busy server based on the predicted resource utilization for the new incoming task at 405.” teaches resource utilization parameter including CPU time to execute a task,
 teaches observed CPU time to execute [total execution time] for each incoming task out of a plurality of incoming tasks [plurality of requests], 
teaches predicted CPU time to execute [predetermined time value] for each incoming task out of a plurality of incoming tasks [plurality of requests], and 
teaches potential assignment of incoming task to an idle server in plurality of servers [auto-scale a processing node number]  based upon observed and predicted resource utilization parameters).
Based upon the above argument, the Examiner respectfully disagrees because the combination of Spinner et al., Griffin et al., and Sigal et al. does teach the limitations recited in claim 8. Therefore, the rejections of claim 8 and its dependent claims are proper and are maintained.
In response to applicant’s arguments on page 13 that the combined teachings of Spinner et al., Griffin et al., and Sigal et al. fail to teach “auto-scaling, via the processor, a processing node number responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value” in claim 15, the applicant’s argument has been considered, but is not deemed persuasive.
Sigal et al. teaches auto-scaling, a processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value 
(paragraph 0041, “The load balancer and neural network model execute within the backend thread independently from the client thread. The neural network model obtains task information and input parameters from the load balancer engine and in addition obtains observed resource utilization and analyzes the information. The information is utilized for training the neural network to predict resource utilization for future incoming tasks…” teaches prediction of resource utilization for a plurality of incoming tasks [the plurality of requests in the service master queue];  
paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer … The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization for example. The computer program product obtains a resource utilization result, i.e., an actual observed resource utilization associated with the incoming task that is executed on a database server residing in a server cluster at 402. This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory.  The computer program product uses the observed resource utilization result to train a neural network and the observed result is associated with the incoming task name and its set of associated input parameters at 403. The computer program product provides predicted resource utilization for an incoming task having an already observed incoming task name although the newly inbound task may utilize different input parameters at 404. The computer program product optionally provides a connection to a predicted least busy server based on the predicted resource utilization for the new incoming task at 405.” teaches resource utilization parameter including CPU time to execute a task,
 teaches observed CPU time to execute [total execution time] for each incoming task out of a plurality of incoming tasks [plurality of requests], 
teaches predicted CPU time to execute [predetermined time value] for each incoming task out of a plurality of incoming tasks [plurality of requests], and 
teaches potential assignment of incoming task to an idle server in plurality of servers [auto-scaling, via the processor, a processing node number]  based upon observed and predicted resource utilization parameters).
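Based upon the above argument, the Examiner respectfully disagrees because the combination of Spinner et al., Griffin et al., and Sigal et al. does teach the limitations recited in claim 15. Therefore, the rejections of claim 15 and its dependent claims are proper and are maintained.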
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-10, 12-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Spinner et al. (“Runtime Vertical Scaling of Virtualized Applications via Online Model Estimation”) in view of Griffin et al. (US 8,738,860 B1) and in further view of Sigal et al. (US 2008/0222646 A1).
Regarding Claim 1 and analogous Claim 8,
Spinner et al. teaches a computer implemented method for creating an auto-scaled predictive analytics model (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]) comprising: 
retrieving via the processor, a number of available processing nodes based on the time required for each of the requests ( p. 162, Algorithm 1   

[Image media_image1.png (greyscale): reproduction of Algorithm 1 of Spinner et al.]

teaches determining desired number of vCPUs [a number of available processing nodes] dependent upon the variables Rdown[v] and Rcur[v];  
	p. 161, section IV, paragraph 5 “… The function AnalyseModel takes a performance submodel Mv, the current queue length Qv and the number of vCPUs av for VM v as the input. It then calculates the expected residence time for newly arriving requests …” teaches the variables Rdown[v] and Rcur[v] being expected residence times for newly arriving requests [the time required for each of the requests]); and 
p. 161, section III, paragraph 15 

[Image media_image2.png (greyscale): reproduction of the cited passage of Spinner et al., including Equation 6]

teaches the linear regression model of Equation 6 [auto-scaled predictive analytics model] based on the number of vCPUs in a VM [node number] and average queue length [queue size]).
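For illustration only, the scale-down and scale-up decision that the cited passages of Spinner et al. describe can be sketched as follows. This is a minimal sketch under stated assumptions, not a reproduction of Algorithm 1: the function `expected_residence_time`, the `VM` record, and the latency target are hypothetical placeholders, because Equation (1) of the reference is not reproduced in this action.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class VM:
    vcpus: int           # current number of vCPUs
    queue_length: float  # current average queue length
    demand: float        # hypothetical per-request CPU demand in seconds


def expected_residence_time(vm: VM, vcpus: int) -> float:
    # Placeholder model (an assumption, NOT Equation (1) of the reference):
    # residence time grows with queued work and shrinks with more vCPUs.
    return vm.demand * (1.0 + vm.queue_length / vcpus)


def decide(vms: Dict[str, VM], target: float) -> Tuple[str, Optional[str]]:
    """Return ("down", vm_name), ("up", vm_name) or ("hold", None)."""
    r_cur = {n: expected_residence_time(v, v.vcpus) for n, v in vms.items()}
    total_cur = sum(r_cur.values())  # end-to-end latency with current vCPUs

    if total_cur > target:
        # Target violated: scale up the VM where one extra vCPU helps most.
        gain = {n: r_cur[n] - expected_residence_time(v, v.vcpus + 1)
                for n, v in vms.items()}
        return "up", max(gain, key=gain.get)

    # Target met: try removing one vCPU from each VM that has more than one.
    candidates = {}
    for n, v in vms.items():
        if v.vcpus > 1:
            r_down = expected_residence_time(v, v.vcpus - 1)
            t_down = total_cur - r_cur[n] + r_down
            if t_down <= target:
                candidates[n] = t_down
    if candidates:
        # Pick the VM whose vCPU removal has the least impact on latency.
        return "down", min(candidates, key=candidates.get)
    return "hold", None
```

The sketch mirrors only the two-part structure quoted above (scale down when the target is still met with one vCPU fewer, otherwise scale up when the target is violated); the actual performance model in the reference may differ.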
Spinner et al. does not appear to explicitly teach determining, via a processor, whether a queue size of a service master queue is greater than zero; responsive to determining that the queue size is greater than zero, fetching, via the processor, a count of requests in a plurality of requests in the service master queue, and a type for each of the requests; deriving via the processor, a respective value for time required for each of the requests; and … auto-scaling, a processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value.
Griffin et al. teaches determining, via a processor, whether a queue size of a service master queue is greater than zero (col. 17, lines 4-9 “… Software that configures the processors to implement the ring buffer also ensure that the enqueue operation will block or return an error whenever the user attempts to enqueue an item when the ring buffer's data storage is already full or dequeue an item when the ring buffer is empty…” teaches determining an empty ring buffer [determining, via a processor, whether a queue size of a service master queue is greater than zero] via generating an error when attempting to remove an item from said empty ring buffer); and 
responsive to determining that the queue size is greater than zero, fetching, via the processor, a count of requests in a plurality of requests in the service master queue, and a type for each of the requests … in the service master queue (col. 17, lines 10-17, “FIG. 4A depicts an example of the ring buffer mechanism. Items are enqueued by writing to the “tail” slot identified by a tail pointer 35, and dequeued by reading from the “head” slot identified by a head pointer 36. Item 1, to which the head pointer 36 points, is the oldest item and next to be dequeued, and item 3, to which the tail pointer 35 points, is the youngest, most recently enqueued, item” and col. 18, lines 4-27 “ … a “fetch and add if greater than or equal to zero” (fetchaddgez) operation allows the implementation of a more efficient ring buffer queue. This shared memory operation adds an integer value to the value in a specified memory location, atomically writes the result back to the memory location if the result was greater than or equal to zero, and returns the original value in the memory location … As part of implementing the shared memory queue, the processor is configured to construct an in-memory, integer value that contains a pair of values in different bit ranges: a credit count in its high bits and a ring buffer slot count in its low bits. One of these pairs is used for the enqueue operation, and another pair is used for the dequeue operation … the fetchaddgez operation can be used to atomically decrement the credit count and increment the slot count …” teaches fetching a 
Spinner et al. and Griffin et al. are considered analogous art because they are directed to efficient computing in parallel processing environments.
In view of the teachings of Spinner et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Griffin et al. at the time the application was filed in order to develop reconfigurable customized logic circuits capable of processing multiple applications simultaneously, thus increasing performance and consuming less power (cf. Griffin et al., col. 1, lines 17-39, “FPGAs (Field Programmable Gate Arrays) and ASICs (Application Specific Integrated Circuits) are two exemplary approaches for implementing customized logic circuits. The cost of building an ASIC includes the cost of verification, the cost of physical design and timing closure, and the NRE (non-recurring costs) of creating mask sets and fabricating the ICs. Due to the increasing costs of building an ASIC, FPGAs became increasingly popular. Unlike an ASIC, an FPGA is reprogrammable in that it can be reconfigured for each application. Similarly, as protocols change, an FPGA design can be changed even after the design has been shipped to customers, much like Software can be updated. However, FPGAs are typically more expensive, often costing 10 to 100 times more than an ASIC. FPGAs typically consume more power for performing comparable functions as an ASIC and their performance can be 10 to 20 times worse than that of an ASIC. Multicore systems (e.g., tiled processors) use parallel processing to achieve some features of both ASICs and FPGAs. For example, some multicore systems are power efficient like an ASIC because they use custom logic for some functions, and reconfigurable like FPGAs because they are programmable in software.”).
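For illustration only, the “fetch and add if greater than or equal to zero” operation quoted from Griffin et al. above can be sketched as follows. This is a minimal sketch under stated assumptions: a `threading.Lock` stands in for the hardware atomicity, the 16-bit slot field width is assumed, and the names `PackedCounter` and `reserve_slot` are hypothetical. Slot-count overflow into the credit field is ignored.

```python
import threading

SLOT_BITS = 16                      # low bits: ring buffer slot count (assumed width)
SLOT_MASK = (1 << SLOT_BITS) - 1


class PackedCounter:
    """Emulates one memory word holding (credit count << SLOT_BITS) | slot count."""

    def __init__(self, credits: int, slot: int = 0) -> None:
        self._value = (credits << SLOT_BITS) | slot
        self._lock = threading.Lock()   # stands in for hardware atomicity

    def fetchaddgez(self, delta: int) -> int:
        """Add delta; store the sum only if it is >= 0; return the original value."""
        with self._lock:
            original = self._value
            result = original + delta
            if result >= 0:
                self._value = result
            return original

    def value(self) -> int:
        with self._lock:
            return self._value


# A single addend that decrements the credit count and increments the slot count.
TAKE_CREDIT_AND_SLOT = -(1 << SLOT_BITS) + 1


def reserve_slot(counter: PackedCounter, ring_size: int):
    """Return a slot index, or None when no credit is left."""
    original = counter.fetchaddgez(TAKE_CREDIT_AND_SLOT)
    if original + TAKE_CREDIT_AND_SLOT < 0:
        return None                 # sum was negative, so nothing was written
    return (original & SLOT_MASK) % ring_size
```

With two credits, the first two calls to `reserve_slot` return slots 0 and 1, and the third returns `None` without changing the stored word, which corresponds to the quoted behavior of writing back only a non-negative result.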
Spinner et al. in view of Griffin et al. does not appear to explicitly teach … deriving via the processor, a respective value for time required for each of the requests; and … auto-scaling, a processing node number using backpropagation machine learning responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value.
	Sigal et al. teaches … deriving via the processor, a respective value for time required for each of the requests (paragraph 0070, “FIG. 5 shows a graphical representation of the resource utilization per unit time for the servers in a cluster. Specifically, a new incoming task 500 that will take a certain percentage of CPU namely "CPn", on an idle server for a given execution time namely "Tnew" and that will take a certain amount of memory "menm" until the task completes. Predicted server resource utilization chart 501 shows the predicted CPU, memory utilization and completion times (that is updated when a task actually completes in one or more embodiments) of the tasks executing on a first server …” teaches predicting completion times of each task in a plurality of tasks [deriving via the processor, a respective value for time required for each of the tasks]); and 
… auto-scaling, a processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value (paragraph 0016, “ … a feed-forward backpropagation neural network is utilized to predict completion of incoming tasks …” teaches a backpropagation neural network trained to predict completion [predetermined time value] of incoming tasks [plurality of requests];
paragraph 0041, “The load balancer and neural network model execute within the backend thread independently from the client thread. The neural network model obtains task information and input parameters from the load balancer engine and in addition obtains observed resource utilization and analyzes the information. The information is utilized for training the neural network to predict resource utilization for future incoming tasks…” teaches prediction of resource utilization for a plurality of incoming tasks [the plurality of requests];  
paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer … The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization for example. The computer program product obtains a resource utilization result, i.e., an actual observed resource utilization associated with the incoming task that is executed on a database server residing in a server cluster at 402. This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory.  The computer program product uses the observed resource utilization result to train a neural network and the observed result is associated with the incoming task name and its set of associated input parameters at 403. The computer program product provides predicted resource utilization for an incoming task having an already observed incoming task name although the newly inbound task may utilize different input parameters at 404. The computer program product optionally provides a connection to a predicted least busy server based on the predicted resource utilization for the new incoming task at 405.” teaches resource utilization parameter including CPU time to execute a task,
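teaches observed CPU time to execute [total execution time] for each incoming task out of a plurality of incoming tasks [plurality of requests],
teaches predicted CPU time to execute [predetermined time value] for each incoming task out of a plurality of incoming tasks [plurality of requests], and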
teaches potential assignment of incoming task to an idle server in plurality of servers [auto-scaling, a processing node number]  based upon observed and predicted resource utilization parameters).
Spinner et al., Griffin et al., and Sigal et al. are considered analogous art because they are directed to effective management of computing resources within a computer cluster.
In view of the teachings of Spinner et al. in view of Griffin et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Sigal et al. at the time the application was filed in order to preemptively assign incoming tasks to servers based on predicted CPU utilization of said tasks, thus enabling scalability, decreasing latency, and optimizing performance (cf. Sigal et al., paragraph 0013, “One or more embodiments of the invention enable a preemptive neural network database load balancer. Embodiments of the invention are predictive in nature and are configured to observe, learn and predict the resource utilization that given incoming tasks utilize. Predictive load balancing allows for efficient execution and use of system resources. Efficient use of system resources allows for lower hardware costs, since the hardware is utilized in a more efficient manner …”).
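For illustration only, the sequence of steps recited in claim 1 (queue-size check, count and type of requests, per-request time, total compared against a predetermined time value) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the lookup `time_per_type` is a hypothetical stand-in for a learned per-request estimate, the backpropagation training itself is omitted, and the sizing rule (ceiling of total time over the limit) is assumed.

```python
import math
from collections import Counter
from typing import Dict, List


def autoscale(queue: List[str], time_per_type: Dict[str, float],
              current_nodes: int, time_limit: float) -> int:
    """Return the node count after one scaling check.

    queue holds one request type per pending request; time_per_type is a
    hypothetical lookup standing in for a learned per-request time estimate.
    """
    if len(queue) == 0:            # queue size is not greater than zero
        return current_nodes
    counts = Counter(queue)        # count of requests, grouped by type
    total_time = sum(time_per_type[t] * n for t, n in counts.items())
    if total_time <= time_limit:   # total does not exceed the predetermined value
        return current_nodes
    # Assumed sizing rule: enough nodes that each node's share fits the limit.
    needed = math.ceil(total_time / time_limit)
    return max(current_nodes, needed)
```

Under these assumptions, eight queued requests totaling 20 seconds against a 10-second limit would raise a one-node pool to two nodes, while an empty queue or a total within the limit leaves the node count unchanged.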
Regarding Claim 2,
	Spinner et al. in view of Griffin et al. and in further view of Sigal et al. teaches the method of claim 1.
Sigal et al. further teaches storing, in an operatively connected computer memory, the value for time required for each of the requests in the plurality of requests in the service master queue (paragraphs 0040-0041, “[0040]… The load balancer engine is responsible for collection for example through a "listener" of all needed information (CPU, memory, disk, network utilization) with respect to the tasks running in the cluster of servers (shown as the lower rectangle in FIG. 1). The listener continuously collects resource utilization information from the servers via and the load balancer engine calls the neural network model in order to continuously predict which server has the lowest current and predicted loads.… [0041] … Upon request from the load balancer engine, a given task with particular input parameters results in the neural network returning predicted resource utilization to the load balancer. The load balancer then assigns the incoming task to a particular server based upon the predicted and observed resource utilization of a given server and the predicted resource utilization of the particular incoming task …” teaches collection of predicted resource utilization for each of the tasks [each of the requests] in the plurality of tasks [plurality of requests];  
paragraph 0069, “…The task name is used as a handle to associate observed and predicted resource utilization … This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory …” teaches resource utilization being CPU time to execute [value for time];
paragraph 0027, “Embodiments of the invention gather information related to the … list of tasks when the tasks are instantiated by clients utilizing the system.  This information, together with the information about resource utilization … is dynamically stored, analyzed and then used for training a neural network…” and paragraph 0069, “ … Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer …” teaches resource utilization [value for time] stored in tangible memory medium of computer [operatively connected computer memory]).
Spinner et al., Griffin et al., and Sigal et al. are combinable for the same rationale as set forth above with respect to claim 1.
Regarding Claim 3,
	Spinner et al. in view of Griffin et al. and in further view of Sigal et al. teaches the method of claim 2.
	Sigal et al. further teaches wherein the predetermined time value for each of the requests in the plurality of requests in the service master queue changes dynamically with respect to time (paragraph 0068, “… When the data in the database changes over time which alters the number of records, number of images, BLOBs, PDF files, etc., the CPU execution time and memory utilization may also change depending on the particular task. As the observed values change over time, the neural network employed may thus learn and alter predictions which allows for more accurate preemptive load balancing…” teaches predicted CPU execution time [predetermined time value] dynamically changing with time based on the data in the database changing over time).
Spinner et al., Griffin et al., and Sigal et al. are combinable for the same rationale as set forth above with respect to claim 1.
Regarding Claim 5,
	Spinner et al. in view of Griffin et al. and in further view of Sigal et al. teaches the method of claim 1.
	Spinner et al. further teaches … summing a plurality of execution times associated with each of the requests (p. 6, Algorithm 1, lines 5 and 6 & p. 5, section IV, paragraph 6 “ In order to determine possible candidate VMs for scale down, the algorithm calculates the expected end-to-end latency Tdown if one vCPU is removed from any of the VMs (line 5 and 6)…”  and  p. 6 Algorithm 1, line 3 & p. 5, section IV, paragraph 5 “ … The function AnalyseModel takes a performance submodel Mv, the current queue length Qv and the number of vCPUs av for VM v as the input. It then calculates the expected residence time for newly arriving requests using Equation (1) with the current number of vCPUs (line 3) or with one vCPU less (line 4)…” teaches summation of the values of the variable Rcur[v] on a plurality of VMs [summing a plurality of execution times] for newly arriving requests).
Spinner et al. in view of Griffin et al. does not appear to explicitly teach wherein auto-scaling the processing node number using backpropagation machine learning responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined time value comprises: retrieving the predetermined time value stored in an operatively connected computer memory; … evaluating, via the processor, whether a sum of the plurality of execution times exceeds the predetermined time value; and auto-scaling the processing node number responsive to determining that a total execution time for all of the requests in the plurality of requests exceeds a predetermined time value.
Sigal et al. teaches wherein auto-scaling the processing node number responsive to determining that a total execution time for all of the requests in the plurality of requests in the service master queue exceeds a predetermined default time value comprises: retrieving the predetermined time value stored in an operatively connected computer memory (paragraph 0027, “Embodiments of the invention gather information related to the … list of tasks when the tasks are instantiated by clients utilizing the system.  This information, together with the information about resource utilization … is dynamically stored, analyzed and then used for training a neural network…” and paragraph 0069, “ … Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer. The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization … This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory …” teaches predicted resource utilization [predetermined time value] stored in tangible memory medium of computer [operatively connected computer memory] for future use in training a neural network [retrieving the predetermined time value]);
… evaluating, via the processor, whether a sum of the plurality of execution times exceeds the predetermined time value; and auto-scaling the processing node number responsive to determining that a total execution time for all of the requests in the plurality of requests exceeds a predetermined time value (paragraph 0041, “… with a given incoming read-only task predicted to take 10 seconds of CPU to complete, and with server Slave 1 and Slave "m" having a predicted current utilization of 20 more seconds and Slave 2 having a predicted current utilization of 10 more seconds, the incoming task is assigned to Slave 2 …” teaches the plurality of execution times (10 seconds from incoming task PLUS 20 seconds of execution time of additional tasks by server slaves 1 
Spinner et al., Griffin et al., and Sigal et al. are combinable for the same rationale as set forth above with respect to claim 1.
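For illustration only, the assignment example quoted from paragraph 0041 of Sigal et al. (an incoming task predicted to take 10 seconds, Slave 1 and Slave "m" each with 20 more seconds of predicted utilization, Slave 2 with 10 more seconds) can be sketched as follows. This is a minimal sketch: the function name `assign_task` is hypothetical, and adding the task's predicted time to the chosen server's load is an assumption, not a quoted step.

```python
from typing import Dict


def assign_task(predicted_task_seconds: float,
                predicted_server_load: Dict[str, float]) -> str:
    """Pick the server with the lowest predicted remaining load.

    Assumption: the task's predicted CPU time is then added to that
    server's predicted load.
    """
    target = min(predicted_server_load, key=predicted_server_load.get)
    predicted_server_load[target] += predicted_task_seconds
    return target


# Values taken from the quoted example in paragraph 0041.
loads = {"Slave 1": 20.0, "Slave 2": 10.0, "Slave m": 20.0}
chosen = assign_task(10.0, loads)   # chosen == "Slave 2"
```

After the assignment, the sketch records 20 seconds of predicted load on Slave 2, that is, the 10 seconds already predicted plus the 10 seconds of the incoming task.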
Regarding Claim 6,
	Spinner et al. in view of Griffin et al. and in further view of Sigal et al. teaches the method of claim 1.
	Sigal et al. further teaches wherein auto-scaling the processing node number using backpropagation machine learning and responsive to determining that a total execution time for all of the requests in the plurality of requests exceeds a predetermined time value comprises writing, via the processor, to a computer memory, a value for a number of processing nodes having capacity to complete all of the requests in the plurality of requests in the service master queue (paragraph 0027, “Embodiments of the invention gather information related to the … list of tasks when the tasks are instantiated by clients utilizing the system.  This information, together with the information about resource utilization including CPU, memory, disk and/or network utilization is dynamically stored, analyzed and then used for training a neural network…” and paragraph 0069, “ … Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer. The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization … This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory …” teaches resource utilization [value for number of processing nodes having capacity to complete all of the requests] stored in tangible memory medium of computer [writing, via the processor, to a computer memory]).
Spinner et al., Griffin et al., and Sigal et al. are combinable for the same rationale as set forth above with respect to claim 1.
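For illustration only, one possible way to arrive at "a value for a number of processing nodes having capacity to complete all of the requests" is sketched below. The ceiling-division rule, the function name `nodes_needed`, and the sample numbers are assumptions made for this sketch and are not taken from the application or from Sigal et al.

```python
import math


def nodes_needed(execution_times, limit_seconds):
    """Smallest node count such that the total work, split evenly,
    fits within limit_seconds per node (assumed sizing rule)."""
    if limit_seconds <= 0:
        raise ValueError("limit_seconds must be positive")
    total = sum(execution_times)
    # At least one node is kept even when the queue is empty.
    return max(1, math.ceil(total / limit_seconds))


# Hypothetical queue of three requests and a 15-second limit.
node_count = nodes_needed([10.0, 20.0, 10.0], 15.0)
print(node_count)  # 3
```

The sketch assumes requests can be divided evenly across nodes; a real scheduler would also have to account for requests that cannot be split.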
Regarding Claim 7,
	Spinner et al. in view of Griffin et al. and in further view of Sigal et al. teaches the method of claim 6.
	Sigal et al. further teaches wherein the value for the number of processing nodes having capacity to complete all of the requests in the plurality of requests in the service master queue does not exceed a sum of the plurality of execution times by greater than a predetermined optimization value (paragraph 0028, “ … Specifically, when training a feed-forward back-propagation neural network inputs such as the task name and input parameters are stored and the error that occurs between the predicted resource utilization and the observed resource utilization are utilized to calculate the gradient of the error of the network and to find weights for the neurons that minimize the error…” and paragraph 0069, “ … Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer. The computer program product obtains an incoming task name and a first set of associated input parameters at 401. The task name is used as a handle to associate observed and predicted resource utilization … This may involve more than one resource utilization parameter, e.g., CPU time to execute and memory …” teaches resource utilization parameters including CPU time to execute [execution time] and teaches the observed resource utilization [the value for the number of processing nodes having capacity to complete all of the requests] not exceeding the predicted resource utilization [the sum of the plurality of execution times] by greater than the error [predetermined optimization value]). 
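For illustration only, the error-gradient weight update described in the quoted paragraph 0028 can be sketched with a single linear neuron. This is a simplified stand-in, not the multi-layer network of Sigal et al.: the function name `train_linear_neuron`, the learning rate, the epoch count, and the sample data are all hypothetical.

```python
def train_linear_neuron(samples, lr=0.05, epochs=2000):
    """Fit predicted = w * x + b by gradient descent on squared error.

    samples: list of (input_value, observed_utilization) pairs.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, observed in samples:
            predicted = w * x + b
            error = predicted - observed  # predicted minus observed
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * x
            b -= lr * error
    return w, b


# Hypothetical data in which observed utilization is twice the input.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w, b = train_linear_neuron(samples)
print(round(w, 3), round(abs(b), 3))  # 2.0 0.0
```

In a multi-layer network the same error signal would be propagated backward through each layer; the single neuron here only shows how the difference between predicted and observed values drives the weight update.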
Regarding Claim 9,
	Claim 9 is substantially similar to claim 2 and therefore is rejected on the same ground as claim 2.  Claim 9 is directed to a “system” that corresponds to the method of claim 2.  
	Spinner et al. further teaches a system for creating an auto-scaled predictive analytics model comprising a processor (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]).
Regarding Claim 10,
	Claim 10 is substantially similar to claim 3 and therefore is rejected on the same ground as claim 3.  Claim 10 is directed to a “system” that corresponds to the method of claim 3.  
	Spinner et al. further teaches a system for creating an auto-scaled predictive analytics model comprising a processor (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]).
Regarding Claim 12,
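	Claim 12 is substantially similar to claim 5 and therefore is rejected on the same ground as claim 5.  Claim 12 is directed to a “system” that corresponds to the method of claim 5.  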
	Spinner et al. further teaches a system for creating an auto-scaled predictive analytics model comprising a processor (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]).
Regarding Claim 13,
	Claim 13 is substantially similar to claim 6 and therefore is rejected on the same ground as claim 6.  Claim 13 is directed to a “system” that corresponds to the method of claim 6.  
	Spinner et al. further teaches a system for creating an auto-scaled predictive analytics model comprising a processor (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]).
Regarding Claim 14,
	Claim 14 is substantially similar to claim 7 and therefore is rejected on the same ground as claim 7.  Claim 14 is directed to a “system” that corresponds to the method of claim 7.  
	Spinner et al. further teaches a system for creating an auto-scaled predictive analytics model comprising a processor (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]).
Regarding Claim 15,
	Claim 15 is substantially similar to claim 1 and therefore is rejected on the same ground as claim 1.  Claim 15 is directed to a “computer program product” that corresponds to the method of claim 1.  
	Sigal et al. further teaches a computer program product for creating an auto-scaled predictive analytics model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method (paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer. Processing starts at 400. Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer…” teaches a neural network database load balancer [predictive analytics model] executed on a tangible memory medium of a computer; 
paragraph 0027, “ … In one or more embodiments of the invention, the neural network is for example a feed-forward back-propagation neural network module that is trained to predict the resource utilization and completion of incoming client tasks and determine the server that should be utilized to execute the task. In one or more embodiments, the server to utilize for an incoming task is for example the least resource bound or least utilized” teaches a neural network being used to identify a least utilized server for processing an incoming task, the least busy server possibly being unused [auto-scaled predictive analytics model]).
Regarding Claim 16,
	Claim 16 is substantially similar to claim 2 and therefore is rejected on the same ground as claim 2.  Claim 16 is directed to a “computer program product” that corresponds to the method of claim 2.  
	Sigal et al. further teaches a computer program product for creating an auto-scaled predictive analytics model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method (paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer. Processing starts at 400. Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer…” teaches a neural network database load balancer [predictive analytics model] executed on a tangible memory medium of a computer; 
paragraph 0027, “ … In one or more embodiments of the invention, the neural network is for example a feed-forward back-propagation neural network module that is trained to predict the resource utilization and completion of incoming client tasks and determine the server that should be utilized to execute the task. In one or more embodiments, the server to utilize for an incoming task is for example the least resource bound or least utilized” teaches a neural network being used to identify a least utilized server for processing an incoming task, the least busy server possibly being unused [auto-scaled predictive analytics model]).
Regarding Claim 17,
	Claim 17 is substantially similar to claim 3 and therefore is rejected on the same ground as claim 3.  Claim 17 is directed to a “computer program product” that corresponds to the method of claim 3.  
	Sigal et al. further teaches a computer program product for creating an auto-scaled predictive analytics model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method (paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer. Processing starts at 400. Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer…” teaches a neural network database load balancer [predictive analytics model] executed on a tangible memory medium of a computer; 
paragraph 0027, “ … In one or more embodiments of the invention, the neural network is for example a feed-forward back-propagation neural network module that is trained to predict the resource utilization and completion of incoming client tasks and determine the server that should be utilized to execute the task. In one or more embodiments, the server to utilize for an incoming task is for example the least resource bound or least utilized” teaches a neural network being used to identify a least utilized server for processing an incoming task, the least busy server possibly being unused [auto-scaled predictive analytics model]).
Regarding Claim 19,
	Claim 19 is substantially similar to claim 5 and therefore is rejected on the same ground as claim 5.  Claim 19 is directed to a “computer program product” that corresponds to the method of claim 5.  
	Sigal et al. further teaches a computer program product for creating an auto-scaled predictive analytics model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method (paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer. Processing starts at 400. Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer…” teaches a neural network database load balancer [predictive analytics model] executed on a tangible memory medium of a computer; 
paragraph 0027, “ … In one or more embodiments of the invention, the neural network is for example a feed-forward back-propagation neural network module that is trained to predict the resource utilization and completion of incoming client tasks and determine the server that should be utilized to execute the task. In one or more embodiments, the server to utilize for an incoming task is for example the least resource bound or least utilized” teaches a neural network being used to identify a least utilized server for processing an incoming task, the least busy server possibly being unused [auto-scaled predictive analytics model]).
Regarding Claim 20,
	Claim 20 is substantially similar to claim 6 and therefore is rejected on the same ground as claim 6.  Claim 20 is directed to a “computer program product” that corresponds to the method of claim 6.  
	Sigal et al. further teaches a computer program product for creating an auto-scaled predictive analytics model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method (paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer. Processing starts at 400. Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer…” teaches a neural network database load balancer [predictive analytics model] executed on a tangible memory medium of a computer; 
paragraph 0027, “ … In one or more embodiments of the invention, the neural network is for example a feed-forward back-propagation neural network module that is trained to predict the resource utilization and completion of incoming client tasks and determine the server that should be utilized to execute the task. In one or more embodiments, the server to utilize for an incoming task is for example the least resource bound or least utilized” teaches a neural network being used to identify a least utilized server for processing an incoming task, the least busy server possibly being unused [auto-scaled predictive analytics model]).
Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Spinner et al. (“Runtime Vertical Scaling of Virtualized Applications via Online Model Estimation”) in view of Griffin et al. (US 8,738,860 B1) and in view of Sigal et al. (US 2008/0222646 A1) and in further view of Guo et al. (“SVIS: Large Scale Video Data Ingestion into Big Data Platform”).
Regarding Claim 4,
Spinner et al. in view of Griffin et al. and in further view of Sigal et al. teaches the method of claim 1.
Spinner et al. in view of Griffin et al. and in further view of Sigal et al. does not appear to explicitly teach wherein deriving the value for time required for each of the requests comprises accessing, via the processor, an ingestion service model comprising a data store value, a data size value associated with the data store value, and a time taken value associated with the data store value.
Guo et al. teaches wherein deriving the value for time required for each of the requests comprises accessing, via the processor, an ingestion service model (p. 301, section 1, paragraph 5 “In this paper, we thereby present SVIS, a scalable and extendable video data ingestion system which can fast ingest diverse video sources into big data stores. SVIS integrates rich video processing functionalities. It is able to transcode and transform video data of different source types, and then directly pipeline them to video analytics applications. SVIS can easily scale out to support large scale video surveillance systems. It can be conveniently extended to support new video sources and data sinks” teaches an ingestion service model responsible for pipelining data from different video sources and into data stores) comprising
a data store value, a data size value associated with the data store value, and a time taken value associated with the data store value (p. 304, section 3, paragraph 1 “ … We setup a 10-node video ingestion cluster on 10 physical servers with 16GB RAM and Xeon E5-2640@2.00 GHz 4 core CPU. They were connected by 1Gbps network. We deployed 1 ingestion manager node, 8 ingestion worker nodes, and 1 RabbitMQ server node serving as the message bus …” and p. 305, section 3, paragraph 4, “ … We generated three video files with size of 1GB, 10GB and 20GB and parallel ingested them with different parallelism degrees. Greater parallelism degree indicates that more ingestion workers are employed by SVIS ingestion system. Figure 3 shows the experimental results: left figure shows the total ingestion time of each file, right figure shows the achieved ingestion bitrate of each file …” teaches the ingesting bit rate of each video file [data store value], the size of the video file [data size value associated with the data store value], and total ingestion time of each video file [time taken value associated with the data store value]).
Spinner et al., Griffin et al., Sigal et al., and Guo et al. are considered analogous art because they are directed to facilitating the efficient processing of data transmitted through a network.
It would have been obvious for a person of ordinary skill in the art, at the time the application was filed, to apply the teachings of Guo et al. to Spinner et al. in view of Griffin et al. and in further view of Sigal et al. in order to efficiently transcode and transform unstructured video data of different source types and pipeline them into video analytics applications, thus obtaining more timely video analytics results (cf. Guo et al., p. 301, section 1, paragraphs 3-5, “… There are many existing systems in big data ecosystem to realize data loading or data ingestion work … However, all these systems are not perfectly suitable to address video ingestion jobs.  They lack video processing capability, and hence it is difficult to ingest video data especially video streaming data directly from IP cameras.  They do not have good flexibility and extendability so that it is difficult to extend them to support new types of data source and data sink.  These systems usually utilize batch loading, emphasizing more on ingestion throughput rather than latency … In this paper, we therefore provide SVIS, a scalable and extendable video data ingestion system which can fast ingest diverse video sources into big data stores … SVIS can easily scale out to support large scale video surveillance systems.  It can be conveniently extended to support new video sources and data sinks …”).
Regarding Claim 11,
	Claim 11 is substantially similar to claim 4 and therefore is rejected on the same ground as claim 4.  Claim 11 is directed to a “system” that corresponds to the method of claim 4.  
	Spinner et al. further teaches a system for creating an auto-scaled predictive analytics model comprising a processor (p. 3, section III, paragraph 2 “… Due to the complexity of virtualized environments, we adopt a layered modeling approach for describing the application performance, where a virtualized system consists of a layered architecture, with each layer contributing to the externally visible application performance … the physical resource layer consists of the hardware resources (CPUs, main memory, etc.) of the physical host …” and p. 5, section III, paragraph 15 “…We use non-negative least squares regression to determine the application demand” teaches a regression model to predict application demand [predictive analytics model] and CPUs [a processor]; 
p. 5, section IV(A), paragraph 2, “ … The first part of the algorithm evaluates if the application can still fulfill its performance targets if a vCPU is removed from any of the member VMs and choose the VM which has the least impact on the application performance … The second part of the algorithm is executed if the application performance targets are or will soon be violated, and determines which VM is best scaled up to improve the application performance …”  teaches auto-scaling of VMs in response to predicting resource utilization using the regression model [predictive analytics model]).
Regarding Claim 18,
	Claim 18 is substantially similar to claim 4 and therefore is rejected on the same ground as claim 4.  Claim 18 is directed to a “computer program product” that corresponds to the method of claim 4.  
	Sigal et al. further teaches a computer program product for creating an auto-scaled predictive analytics model, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method (paragraph 0069, “FIG. 4 shows a flow chart for an embodiment of the preemptive neural network database load balancer. Processing starts at 400. Embodiments of the preemptive neural network database load balancer may be implemented as a computer program product for example which includes computer readable instruction code that executes in a tangible memory medium of a computer or server computer…” teaches a neural network database load balancer [predictive analytics model] executed on a tangible memory medium of a computer; 
paragraph 0027, “ … In one or more embodiments of the invention, the neural network is for example a feed-forward back-propagation neural network module that is trained to predict the resource utilization and completion of incoming client tasks and determine the server that should be utilized to execute the task. In one or more embodiments, the server to utilize for an incoming task is for example the least resource bound or least utilized” teaches a neural network being used to identify a least utilized server for processing an incoming task, the least busy server possibly being unused [auto-scaled predictive analytics model]).
Conclusion 
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHIAKA CHUKWUMA OKOROH/Examiner, Art Unit 2125
/MICHAEL J HUNTLEY/Primary Examiner, Art Unit 2116