Detailed Action
This action is in response to Applicant's communications filed 23 September 2022.  
Claims 1-3, 5, 7-11, 13, 15, and 18-20 were amended.  No claims were cancelled, withdrawn, or added.  Therefore, claims 1-20 are pending in this Application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments/Arguments
Applicant's amendments/arguments, filed 23 September 2022, regarding the interpretation of claims 1-20 under 35 U.S.C. 112(f) have been fully considered and are sufficient to avoid interpretation under 35 U.S.C. 112(f).  Accordingly, the interpretation of the claims under 35 U.S.C. 112(f) has been withdrawn.
Applicant's amendments/arguments, filed 23 September 2022, regarding the rejections of claims 11-20 under 35 U.S.C. 101 have been fully considered and are sufficient to overcome the rejections.  Accordingly, the rejections of the claims under 35 U.S.C. 101 have been withdrawn.
Applicant's arguments, filed 23 September 2022, regarding the rejections of claims 1-20 under 35 USC 103 have been fully considered but are moot because the arguments do not apply to any of the references being used in the current rejection.
Applicant’s arguments, filed 23 September 2022, with respect to the rejections of claims 1-20 under 35 USC 103 are regarding newly amended claims and are addressed in the current rejection. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1, 3-6, 8, 10, 11, 13-16, 18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cambazoglu et al. (A Refreshing Perspective of Search Engine Caching, hereinafter "Cambazoglu") in view of Mesbahi et al. (Load Balancing in Cloud Computing: A State of the Art Survey, hereinafter "Mesbahi").

Regarding Claim 1,
Cambazoglu teaches a method comprising: 
requesting, by the call prediction processing computer ("large numbers of machines grouped in clusters by functionality... These caches may be deployed in separate machines, acting as a proxy, or co-exist in the same machine with query processors" sec. 1, p. 181), a set of cacheable requests from an orchestration service computer ("At a high level, a search engine receives a query from a user" sec. 1, p. 181; "In the experiments, we use a query log obtained from the traffic of the Yahoo! Web search engine. The log contains 130,320,176 queries (65,100,647 unique), received during nine consecutive days of operation." sec. 4.1, p. 183), wherein the set of cacheable requests includes previous prediction requests previously processed by the AI computer ("Herein, we focus on result caches, which store previously computed query results. These caches may be deployed in separate machines, acting as a proxy, or co-exist in the same machine with query processors. At a high level, a search engine receives a query from a user, processes the query over its indexed documents, and returns a small set of relevant results to the user. If a previously computed set of results is cached, the query can be served directly from the cache, eliminating the need to process the query." sec. 1, p. 181);
receiving, by the call prediction processing computer from the orchestration service computer, the set of cacheable requests ("Herein, we focus on result caches, which store previously computed query results. These caches may be deployed in separate machines, acting as a proxy, or co-exist in the same machine with query processors. At a high level, a search engine receives a query from a user, processes the query over its indexed documents, and returns a small set of relevant results to the user. If a previously computed set of results is cached, the query can be served directly from the cache, eliminating the need to process the query." sec. 1, p. 181);
sending, by the call prediction processing computer, a cacheable request from the set of cacheable requests to the AI computer for processing ("Once a new query request arrives, the cache hashes the query string and determines if a corresponding entry is present or not. If a query is not cached, the engine evaluates it and caches the results, evicting another query if necessary. A result cache provides two desirable benefits: it reduces the average latency perceived by users, and it reduces the load on back-end query processors. Such a cache may run on the same machines as query processors or on separate machines. Herein, we assume that the result cache resides on separate machines and that most resources of those machines are available to the cache." sec. 3, p. 183), the cacheable request corresponding to one of the previous prediction requests ("Herein, we focus on result caches, which store previously computed query results. These caches may be deployed in separate machines, acting as a proxy, or co-exist in the same machine with query processors. At a high level, a search engine receives a query from a user, processes the query over its indexed documents, and returns a small set of relevant results to the user. If a previously computed set of results is cached, the query can be served directly from the cache, eliminating the need to process the query." sec. 1, p. 181);
receiving, by the call prediction processing computer, an output generated by the AI computer based on the cacheable request ("Once a new query request arrives, the cache hashes the query string and determines if a corresponding entry is present or not. If a query is not cached, the engine evaluates it and caches the results, evicting another query if necessary. A result cache provides two desirable benefits: it reduces the average latency perceived by users, and it reduces the load on back-end query processors. Such a cache may run on the same machines as query processors or on separate machines. Herein, we assume that the result cache resides on separate machines and that most resources of those machines are available to the cache." sec. 3, p. 183); and
storing, by the call prediction processing computer, the output in a data repository ("Once a new query request arrives, the cache hashes the query string and determines if a corresponding entry is present or not. If a query is not cached, the engine evaluates it and caches the results, evicting another query if necessary. A result cache provides two desirable benefits: it reduces the average latency perceived by users, and it reduces the load on back-end query processors. Such a cache may run on the same machines as query processors or on separate machines. Herein, we assume that the result cache resides on separate machines and that most resources of those machines are available to the cache." sec. 3, p. 183), 
wherein, based on receiving a subsequent prediction request which is the same as the one of the previous prediction requests, the output is retrieved from the data repository without the subsequent prediction request being sent to the AI computer for processing ("Note that processing a query may require cycles from hundreds of computers. Consequently, retrieving the results of a query from the disk rather than recomputing it over a huge Web index uses substantially fewer resources." sec. 4.2, p. 184).

Cambazoglu does not explicitly teach determining, by a call prediction processing computer, that an AI computer is operating below a threshold processor usage; and
based on the AI computer operating below the threshold processor usage, requesting a set of cacheable requests.

Mesbahi teaches determining, by a call prediction processing computer ("Resource Management Module" sec. IV.B, p. 70), that an AI computer ("some nodes are overloaded while others are under loaded or idle" sec. II, p. 65; "tasks can move dynamically from an overloaded node to an under-loaded one" p. 65; the overloaded and underloaded nodes teach the AI computer) is operating below a threshold processor usage ("Host under-loading detection" sec. IV.C, p. 71; "Using different service thresholds such as: remaining CPU, remaining memory and transmission rate" Table 1, p. 72; "Dynamic load balancing techniques take into consideration the current state of systems that their decisions are based on. In this technique tasks can move dynamically from an overloaded node to an under-loaded one and this is the main advantage of dynamic load balancing algorithms which can change continuously according to the current state of the system." sec. II, p. 65; "A common solution in almost all cloud load balancing mechanisms is application or VM migration that some jobs will be transferred from overloaded nodes to under-loaded ones for gaining a normal state in the system." sec. IV.C, p. 72);
based on the AI computer operating below the threshold processor usage, requesting a set of cacheable requests ("a mechanism for finding and selecting a new instance with the same configuration as previous allocated resources to transfer the rest of client tasks and requests to new resources when the system needs to a load balancing process because of a sudden increasing of the input workload. For instance, in the Amazon EC2, dynamic load balancing is done by replicating instances of the specific middleware platform. A traffic analyzer tracks a client requests and new instances of the same platform will start when the load increases to a certain threshold [59]." sec. III.C, p. 68).
Cambazoglu and Mesbahi are analogous art because both are directed to managing client requests in cloud computing. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the caching system of Cambazoglu with the load balancing of Mesbahi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve system performance for achieving optimal resource utilization, as suggested by Mesbahi ("Improving the general system performance for achieving optimal resource utilization, maximum throughput and avoiding overload" sec. II, p. 65).
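For illustration only, the flow recited in claim 1 as mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation drawn from the application or from either reference: the function names (warm_cache, handle_request), the callables fetch_cacheable and ai_predict, and the use of a dictionary as the data repository are all hypothetical.

```python
from typing import Callable, Dict, Iterable


def warm_cache(
    cpu_usage: float,
    threshold: float,
    fetch_cacheable: Callable[[], Iterable[str]],
    ai_predict: Callable[[str], str],
    repository: Dict[str, str],
) -> int:
    """Pre-compute outputs for cacheable requests while usage is low.

    All parameters are hypothetical stand-ins for the claimed elements.
    Returns the number of outputs stored in the repository.
    """
    # Determining step: do nothing unless usage is below the threshold.
    if cpu_usage >= threshold:
        return 0
    stored = 0
    # Requesting/receiving steps: obtain previously processed requests.
    for request in fetch_cacheable():
        # Sending/receiving steps: have the AI computer process the request.
        output = ai_predict(request)
        # Storing step: keep the output for later retrieval.
        repository[request] = output
        stored += 1
    return stored


def handle_request(
    request: str,
    ai_predict: Callable[[str], str],
    repository: Dict[str, str],
) -> str:
    """Serve a subsequent request from the repository when it is present."""
    if request in repository:
        # Same request as a previous one: no call to the AI computer.
        return repository[request]
    return ai_predict(request)
```

In this sketch a later request identical to a stored one is answered from the dictionary, which corresponds to the final "wherein" clause of claim 1; any other request falls through to ai_predict.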

Regarding Claim 3,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  Cambazoglu further teaches wherein the set of cacheable requests comprises the cacheable requests most likely to be requested by users ("A well-known observation from the information retrieval literature is that query frequencies follow a power-law distribution. This implies that a few queries have very high frequencies and many appear very infrequently, often just once (singleton queries). As a number of queries repeat often enough, result cache implementations in practice can achieve high hit rates. " sec. 1, p. 181).

Regarding Claim 4,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  Cambazoglu further teaches wherein the output is persisted in the data repository for a predetermined period of time ("A simple way to achieve this goal is to use a time-to-live (TTL) value and mark entries as expired once they have been in the cache for longer than TTL. Once entries are invalid, they are eventually either evicted or replaced with new results. Given that the engine is often not processing queries at full capacity, we can leverage idle cycles to re-process queries and refresh cache entries. If we use the TTL approach and TTL is long enough, we may be able to populate the cache with fresh entries that will be hits during busy times." sec. 1, p. 182).

Regarding Claim 5,
The Cambazoglu/Mesbahi combination teaches the method of claim 4.  Cambazoglu further teaches wherein the predetermined period of time is determined by the AI computer as part of the output ("A viable alternative to the freshness problem is to bound the amount of time the search engine is allowed to serve a given entry from the cache by associating a TTL value with each entry. An entry is said to be expired if the difference between the current time and the last time the entry is updated is larger than the TTL value. Otherwise, the entry is fresh. The content of an expired entry is considered stale. Hence, every hit on an expired entry is treated as a miss, and the corresponding query is forwarded to the back-end clusters. This approach sets an upper-bound on the age of a hit, but does not prevent the search engine from serving stale results, as expiration is not synchronized with updates to the index. Consequently, under this scheme, it becomes crucial to carefully select a value that provides an acceptable degree of freshness." sec. 4.4, p. 185).
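For illustration only, the TTL expiration rule quoted above (an entry is expired when the current time minus its last update time exceeds its TTL, and a hit on an expired entry is treated as a miss) can be sketched as follows. The Entry class, its field names, and the lookup function are hypothetical and are not taken from Cambazoglu or from the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    output: str        # cached result
    updated_at: float  # time of last update, in seconds (hypothetical unit)
    ttl: float         # time-to-live associated with this entry, in seconds


def lookup(cache: Dict[str, Entry], query: str, now: float) -> Optional[str]:
    """Return the cached output, or None on a miss or an expired entry."""
    entry = cache.get(query)
    if entry is None:
        return None
    # Expired when (current time - last update) is larger than the TTL.
    if now - entry.updated_at > entry.ttl:
        return None  # treated as a miss
    return entry.output
```

Storing the TTL on each entry, rather than as one global constant, is a design choice made here only so the sketch lines up with a per-output period of time as recited in claim 5.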

Regarding Claim 6,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  Cambazoglu further teaches wherein the output is received from the orchestration service computer ("At a high level, a search engine receives a query from a user, processes the query over its indexed documents, and returns a small set of relevant results to the user. If a previously computed set of results is cached, the query can be served directly from the cache, eliminating the need to process the query." sec. 1, p. 181; returning the results to the user teaches the output is received from the orchestration service computer).

Regarding Claim 8,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  Mesbahi further teaches wherein the sending further comprises sending the cacheable request to the AI computer via the orchestration service computer ("In the third level, there are service nodes. Level two which consists of some service mangers is used for dividing tasks to some subtasks and assigning them to appropriate service nodes. Finally, in the first level, there is a request manager which receives the incoming workloads and sends tasks to appropriate nodes. This algorithm uses the agent-based method for gathering required information. The proposed load balancer is represented in two phases which uses the OLB algorithm for assigning tasks to service managers and in the next phase the LBMM is used to assign subtasks to the third level. " sec. IV.A, p. 70).
Cambazoglu and Mesbahi are analogous art because both are directed to managing client requests in cloud computing. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the caching system of Cambazoglu with the load balancing of Mesbahi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve system performance for achieving optimal resource utilization, as suggested by Mesbahi ("Improving the general system performance for achieving optimal resource utilization, maximum throughput and avoiding overload" sec. II, p. 65).

Regarding Claim 10,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  Mesbahi further teaches wherein sending further comprises continuously sending cacheable requests from the set of cacheable requests to the orchestration service computer ("In the third level, there are service nodes. Level two which consists of some service mangers is used for dividing tasks to some subtasks and assigning them to appropriate service nodes. Finally, in the first level, there is a request manager which receives the incoming workloads and sends tasks to appropriate nodes. This algorithm uses the agent-based method for gathering required information. The proposed load balancer is represented in two phases which uses the OLB algorithm for assigning tasks to service managers and in the next phase the LBMM is used to assign subtasks to the third level. " sec. IV.A, p. 70) as long as the AI computer is below the threshold processor usage ("Host under-loading detection" sec. IV.C, p. 71; "Using different service thresholds such as: remaining CPU, remaining memory and transmission rate" Table 1, p. 72; "Dynamic load balancing techniques take into consideration the current state of systems that their decisions are based on. In this technique tasks can move dynamically from an overloaded node to an under-loaded one and this is the main advantage of dynamic load balancing algorithms which can change continuously according to the current state of the system." sec. II, p. 65; "A common solution in almost all cloud load balancing mechanisms is application or VM migration that some jobs will be transferred from overloaded nodes to under-loaded ones for gaining a normal state in the system." sec. IV.C, p. 72).
Cambazoglu and Mesbahi are analogous art because both are directed to managing client requests in cloud computing. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the caching system of Cambazoglu with the load balancing of Mesbahi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve system performance for achieving optimal resource utilization, as suggested by Mesbahi ("Improving the general system performance for achieving optimal resource utilization, maximum throughput and avoiding overload" sec. II, p. 65).

Regarding Claim(s) 11, 13-16, 18, and 20,
Claim(s) 11, 13-16, 18, and 20 recite(s) a system comprising a processor and computer-readable medium (Cambazoglu: "current storage technology and configurations of commodity servers," sec. 3, p. 183) storing instructions for performing functions corresponding to the method steps recited in claim(s) 1, 3-6, 8, and 10, respectively.  The Cambazoglu/Mesbahi combination teaches the limitations of claim(s) 11, 13-16, 18, and 20 as set forth above in connection with claim(s) 1, 3-6, 8, and 10.  Therefore, claim(s) 11, 13-16, 18, and 20 is/are rejected under the same rationale as respective claim(s) 1, 3-6, 8, and 10.

Claim(s) 2 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cambazoglu et al. (A Refreshing Perspective of Search Engine Caching, hereinafter "Cambazoglu") in view of Mesbahi et al. (Load Balancing in Cloud Computing: A State of the Art Survey, hereinafter "Mesbahi") and Hsu et al. (Optimizing Energy Consumption with Task Consolidation in Clouds, hereinafter "Hsu").

Regarding Claim 2,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  The Cambazoglu/Mesbahi combination does not explicitly teach wherein the threshold processor usage is about 20% or less.
Hsu teaches wherein the threshold processor usage is about 20% or less (Fig. 2, Level 1 demonstrates 0-20% CPU utilization; "In this section, we present an energy-aware task consolidation (ETC) method to optimize energy usage in cloud systems. In the energy model presented in Fig. 2b, a VM is assumed to consume a W/s in its idle state. An additional b W/s is required for executing tasks when CPU utilization is between 0% and 20%. If CPU utilization is between 20% and 50%, the additional energy consumed increases to 3b W/s. Energy is consumed at a greater rate as CPU utilization increases. For instance, when a virtual machine has 50% CPU utilization, it consumes a +5b W/s." sec. 3, p. 454).
Mesbahi and Hsu are analogous art because both are directed to energy consumption management of cloud computing systems. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the load balancing cloud computing management system of the Cambazoglu/Mesbahi combination with the CPU utilization energy consumption optimization of Hsu.  The modification would have been obvious because one of ordinary skill in the art would be motivated to reduce power consumption, as suggested by Hsu ("The simulation results show that TC can significantly reduce power consumption in a cloud system, with 17% improvement over MaxUtil" p. 452).

Regarding Claim(s) 12,
Claim(s) 12 recite(s) a system comprising a processor and computer-readable medium (Cambazoglu: "current storage technology and configurations of commodity servers," sec. 3, p. 183) storing instructions for performing functions corresponding to the method steps recited in claim(s) 2.  The Cambazoglu/Mesbahi/Hsu combination teaches the limitations of claim(s) 12 as set forth above in connection with claim(s) 2.  Therefore, claim(s) 12 is/are rejected under the same rationale as respective claim(s) 2.

Claim(s) 7 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cambazoglu et al. (A Refreshing Perspective of Search Engine Caching, hereinafter "Cambazoglu") in view of Mesbahi et al. (Load Balancing in Cloud Computing: A State of the Art Survey, hereinafter "Mesbahi") and Sharma et al. (Deep Learning Approaches for Question Answering System, hereinafter "Sharma").

Regarding Claim 7,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  The Cambazoglu/Mesbahi combination does not explicitly teach wherein the AI computer includes an artificial neural network.
Sharma teaches wherein the AI computer includes an artificial neural network ("The use of recurrent neural networks allows us to expand and apply this model to a variety of question answering tasks." sec. 1.3, p. 786).
Cambazoglu and Sharma are analogous art because both are directed to automatic computer implementation of user requests. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the computer load balancing of the Cambazoglu/Mesbahi combination with the question answering system of Sharma.  The modification would have been obvious because one of ordinary skill in the art would be motivated to implement a system that answers questions effectively and with high accuracy, as suggested by Sharma (see 1.2 Motivation, p. 786).

Regarding Claim(s) 17,
Claim(s) 17 recite(s) a system comprising a processor and computer-readable medium (Cambazoglu: "current storage technology and configurations of commodity servers," sec. 3, p. 183) storing instructions for performing functions corresponding to the method steps recited in claim(s) 7.  The Cambazoglu/Mesbahi/Sharma combination teaches the limitations of claim(s) 17 as set forth above in connection with claim(s) 7.  Therefore, claim(s) 17 is/are rejected under the same rationale as respective claim(s) 7.

Claim(s) 9 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cambazoglu et al. (A Refreshing Perspective of Search Engine Caching, hereinafter "Cambazoglu") in view of Mesbahi et al. (Load Balancing in Cloud Computing: A State of the Art Survey, hereinafter "Mesbahi") and Kelley et al. (Obtaining and Managing Answer Quality for Online Data-Intensive Services, hereinafter "Kelley").

Regarding Claim 9,
The Cambazoglu/Mesbahi combination teaches the method of claim 1.  The Cambazoglu/Mesbahi combination does not explicitly teach wherein the sending further comprises prioritizing the set of cacheable requests to select the cacheable request.
Kelley teaches wherein the sending further comprises prioritizing the set of cacheable requests to select the cacheable request ("We studied admission control on the LC.big workload. We issued two classes of queries that arrived at different TCP ports, indicating high and low priority." sec. 8, p. 11:26).
Cambazoglu and Kelley are analogous art because both are directed to managing online resource intensive services for users. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the load balancing system of the Cambazoglu/Mesbahi combination with the resource management system of Kelley.  The modification would have been obvious because one of ordinary skill in the art would be motivated to speed up execution time while maintaining flexibility for bandwidth use, as suggested by Kelley ("memoization speeds up computationally intensive components, but its increased bandwidth usage can also cause slowdown for some components. Ubora provides flexible settings for memoization, allowing each component to turn off memoization." p. 11:3).
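For illustration only, prioritizing a set of cacheable requests to select which request is sent next can be sketched as follows. This is a minimal sketch under stated assumptions: the function name select_by_priority, the integer priority classes (0 for high, 1 for low), and the use of a binary heap are hypothetical and are not taken from Kelley or from the application.

```python
import heapq
from typing import List, Tuple


def select_by_priority(requests: List[Tuple[int, str]]) -> List[str]:
    """Order requests so that lower priority numbers are sent first.

    Each item is (priority, request); ties keep their input order.
    """
    # The index breaks ties so equal-priority requests stay in arrival order.
    heap = [
        (priority, index, request)
        for index, (priority, request) in enumerate(requests)
    ]
    heapq.heapify(heap)
    ordered = []
    while heap:
        _, _, request = heapq.heappop(heap)
        ordered.append(request)
    return ordered
```

The first element of the returned list is the cacheable request that would be selected for sending under this assumed ordering.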

Regarding Claim(s) 19,
Claim(s) 19 recite(s) a system comprising a processor and computer-readable medium (Cambazoglu: "current storage technology and configurations of commodity servers," sec. 3, p. 183) storing instructions for performing functions corresponding to the method steps recited in claim(s) 9.  The Cambazoglu/Mesbahi/Kelley combination teaches the limitations of claim(s) 19 as set forth above in connection with claim(s) 9.  Therefore, claim(s) 19 is/are rejected under the same rationale as respective claim(s) 9.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES C KUO whose telephone number is (571)270-7477. The examiner can normally be reached M-F: 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES C KUO/Examiner, Art Unit 2126  
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126