Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to an amendment filed on 2/19/21.
Claims 1-20 are pending.

Response to Arguments
Applicant’s arguments have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Objections
Claim 8 is objected to because of the following informalities:
Claim 8 recites “the pixel values of the color coded region changes from green values to red values”. It is believed this would be better written as “the pixel values of the color coded region change from green values to red values”.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
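In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.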
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6, 9, 13 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over “Coordinated VM Resizing and Server Tuning: Throughput, Power Efficiency and Scalability” by Guo et al. (Guo) in view of “Multi-Agent Deep Reinforcement Learning” by Egorov et al. (Egorov) in view of US 10,558,483 to Balma et al. (Balma) in view of US 2019/0302310 to Fox et al. (Fox).

Claims 1, 9, 13 and 17: Guo discloses a data center comprising: 
software defined infrastructure in a computing environment (pg. 290, col. 2, 4th par. “a virtualized and shared server infrastructure”); and 
a computer readable medium having instructions which when executed by a processor cause the processor to:
map, by the processor, a state of the data center in a data center domain (pg. 291, col. 2, last partial par. “state space (S) is represented as a collection of state vectors (s)”), wherein the map comprises encoding state information and resource type into arrays (par. bridging pp. 291 and 292 “… i and j are the number of resource types and the number of server parameters”), wherein the map comprises:
for each type of resource included in the software defined infrastructure, representing the type of resource as one or more arrays by encoding capacity of the type of resource (pg. 289, col. 2, 2nd par. “model … VM capacity”, par. bridging pp. 291 and 292 “The elements in the state vector are … the number of the resource type”); and
representing a service level agreement (SLA) of the computing environment by encoding application response time of the computing environment (pg. 291, col. 2, 1st full par. “SLA requirement”); and
implement, by the processor, a plurality of cognitive agents to perform adaptive reinforcement learning using the map to reconfigure the software defined infrastructure based upon changes in the computing environment (pg. 290, col. 1, 1st full par. “each application has two corresponding reinforcement learning agents, one performance agent and one power agent”, pg. 291, col. 2, 3rd full par. “Reinforcement learning … consists of a set of states and several actions for each state”), wherein each cognitive agent of the plurality of cognitive agents trains and learns differently from one another (pg. 291, col. 2, 1st partial par. “feeds the power consumption data in to the power agent. The performance agent obtains the performance data via the performance monitor”), each cognitive agent of the plurality of cognitive agents evaluates a respective managed environment and respective different components within the data center based on different application programming interfaces of the software defined infrastructure (pg. 291, col. 1, last full par. “Each application consists of multiple VMs … Each application is associated with … one reinforcement learning based …”, pg. 291, par. bridging cols. “VMware's Intelligent Power Management Interface … measures the average power consumption of the server system at the VM level and feeds the power consumption data to the power agent. The performance agent obtains the performance data via the performance monitor of each application”), and each cognitive agent is configured to add, remove and migrate a computing resource (pg. 289, col. 2, last full par. “VM resizing and server parameter tuning”).

Guo does not explicitly disclose performing adaptive deep reinforcement learning; and
mapping a state of the data center as an image in a gaming domain, wherein the map comprises:
representing the type of resource as one or more arrays of pixels of the image by encoding state of the type of resource into pixel values for the one or more arrays of pixels.

Egorov teaches adaptive deep reinforcement learning (Abstract “solving reinforcement learning problems in multi-agent setting … apply deep reinforcement learning techniques”); and
mapping a state of the system as an image in a gaming domain (pg. 1, col. 2, 1st full par. “decomposing the global system state into an image like representation”, pg. 3, col. 1, 1st par. “opponent and ally channels”, see e.g. Fig. 1), wherein the map comprises:
representing the type of resources as one or more arrays of pixels of the image by encoding state data into pixels in pixel arrays (pg. 3, col. 1, 1st partial par. “encodes a different set of information from the global state in its pixel values”).

It would have been obvious to implement the plurality of cognitive agents (Guo pg. 290, col. 1, 1st full par. “reinforcement learning agents”) to perform adaptive deep reinforcement learning (Egorov Abstract “deep reinforcement learning”) by mapping state information and resource type (Guo par. bridging pp. 291 and 292 “resource types and … server parameters”) for the data center as an image encoded into pixel arrays (Egorov pg. 3, col. 1, 1st partial par. “encodes … information … in its pixel values”, Guo pg. 291, col. 2, last partial par. “state vectors (s)”). Those of ordinary skill in the art would have been motivated to do so to “alleviate some of the scalability issues encountered … for multi-agent systems” (Egorov pg. 1, col. 2, 1st partial par.).
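
For illustration only, the proposed combination can be sketched as follows. This is a minimal sketch and not the implementation of Guo, Egorov, or the claimed invention; the function name, the resource names, and the one-row-per-resource-type layout are all hypothetical assumptions chosen for the example.

```python
def encode_state_as_image(resources, width=8, max_value=255):
    """Encode each resource type as one row of grayscale pixels.

    `resources` maps a (hypothetical) resource type name to a
    (capacity, used) pair. The fraction of capacity in use decides
    how many pixels in the row are set to `max_value`.
    """
    image = []
    for name in sorted(resources):  # fixed row order per resource type
        capacity, used = resources[name]
        fraction = 0.0 if capacity <= 0 else min(max(used / capacity, 0.0), 1.0)
        filled = round(fraction * width)
        image.append([max_value] * filled + [0] * (width - filled))
    return image


# Hypothetical state vector: resource type -> (capacity, amount in use)
state = {"cpu": (16, 8), "memory": (64, 48), "storage": (1000, 250)}
image = encode_state_as_image(state)
```

Each row of `image` is an array of pixels standing for one resource type, which is the sense in which a state vector may be presented to a learning agent as an image like representation.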

Guo and Egorov do not explicitly teach encoding input/output (I/O) load of the type of resource into pixel values for the one or more arrays of pixels.

Balma teaches encoding I/O load (col. 10, lines 24-27 “resource utilization may include … network utilization”, Guo pg. 291, col. 2, last partial par. “state vectors (s)”).

It would have been obvious at the time of filing to encode I/O load (Balma col. 10, lines 24-27 “network utilization”, Guo pg. 291, col. 2, last partial par. “state vectors (s)”, Egorov pg. 1, col. 2, 1st full par. “an image like representation”). Those of ordinary skill in the art would have been motivated to do so to improve/optimize I/O usage (e.g. Balma col. 11, lines 14-17 “minimize an amount of communication traffic”).



Fox teaches representing state data as a color coded region of an image (e.g. par. [0165] “each column of pixels in the image represents a state, and each pixel in the column is color coding a state variable in the state”).

It would have been obvious at the time of filing to encode state data as a color coded region of the image (Egorov pg. 1, col. 2, 1st full par. “decomposing the global system state into an image like representation”, Fox par. [0165] “color coding a state variable in the state”). Those of ordinary skill in the art would have been motivated to do so as one of a limited number of pixel types (i.e. color or black and white/greyscale) which would have produced only the expected effects. 
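
For illustration only, a color coded region of the kind discussed above might be produced as in the following minimal sketch. It is not taken from Fox or Egorov; the function names, the linear green-to-red scale, and the region dimensions are hypothetical assumptions.

```python
def utilization_to_rgb(utilization):
    """Map a utilization in [0, 1] to an (R, G, B) pixel.

    Assumed scale: 0.0 is pure green, 1.0 is pure red, with a
    linear blend in between. Out-of-range inputs are clamped.
    """
    u = min(max(utilization, 0.0), 1.0)
    red = round(255 * u)
    return (red, 255 - red, 0)


def color_coded_region(utilization, height=2, width=3):
    """Fill a height x width block of pixels with the color for one state variable."""
    pixel = utilization_to_rgb(utilization)
    return [[pixel for _ in range(width)] for _ in range(height)]


idle_region = color_coded_region(0.0)  # all green pixels
busy_region = color_coded_region(1.0)  # all red pixels
```

Under these assumptions, the pixel values of a region move from green values toward red values as the encoded state variable increases.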

Claim 6: Guo, Egorov, Balma and Fox teach the data center of claim 1 wherein the changes in the computing environment comprise at least one of new workloads being added, new resources being available, or new constraints (e.g. Guo pg. 294, col. 1 last partial par. “workload … is increased”), and each cognitive agent of the plurality of cognitive agents is an auto-scaling agent that dynamically scales the respective different components within the data center within the respective managed environment (Guo pg. 289, col. 2, last full par. “a scalable approach for coordinated VM resizing and server parameter tuning”).

Claims 2-3, 10-11, 14-15, 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over “Coordinated VM Resizing and Server Tuning: Throughput, Power Efficiency and Scalability” by Guo et al. (Guo) in view of “Multi-Agent Deep Reinforcement Learning” by Egorov et al. (Egorov) in view of US 10,558,483 to Balma et al. (Balma) in view of US 2019/0302310 to Fox et al. (Fox) in view of US 2011/0153507 to Murthy et al. (Murthy).

Claims 2, 14 and 18: Guo, Egorov, Balma and Fox teach the data center of claims 1, 13 and 17 wherein:
the map further comprises encoding resource utilization (Guo pg. 291, col. 2, last partial par. “resource allocations”), and energy utilization (Guo pg. 291, par. bridging the cols. “average power consumption”);
the computing environment comprises a respective managed environment for each cognitive agent (Guo pg. 290, col. 1, 1st full par “one performance agent and one power agent”); and 
each cognitive agent evaluates the respective managed environment and generates a proposed action to reconfigure the software defined infrastructure based on a respective current state and respective experience (Guo pg. 290, col. 1, 1st full par. “The performance agent generates VM capacity and server configuration options. The power agent generates the prediction on power consumption”, it would at least have been obvious to make recommendations for managing/decreasing power consumption based on this prediction, pg. 291, col. 1, 1st partial par. “feeds the power consumption data into the power agent. The performance agent obtains the performance data”); and
a deep Q-network approximates Q-values for each of the plurality of cognitive agents (e.g. Guo pg. 292, col. 1, 2nd full par. "Q-Table stores the Q-value").

Guo, Egorov, Balma and Fox do not explicitly teach state information comprising data center tickets.

Murthy teaches modeling data center tickets (par. [0045] “a model based on problem tickets”). 

It would have been obvious at the time of filing to encode data center tickets (Murthy par. [0045] “a model based on problem tickets”, Guo pg. 291, col. 2, last partial par. “state vectors (s)”, Egorov pg. 1, col. 2, 1st full par. “an image like representation”). Those of ordinary skill in the art would have been motivated to do so to improve/optimize functioning of the system (e.g. Murthy par. [0045] “determines the risk for a possible configuration”).

Claim 3: Guo, Egorov, Balma, Fox and Murthy teach the data center of claim 2 wherein each cognitive agent further evaluates the respective managed environment and generates the proposed action to reconfigure the software defined infrastructure based upon a model and simulated experience (Egorov pg. 4, col. 2, 1st full par. “To train the neural network controller").

Claim 10: Guo, Egorov, Balma and Fox teach claim 9 wherein:

the computing environment comprises a respective managed environment for each cognitive agent (Guo pg. 290, col. 1, 1st full par “one performance agent and one power agent”); and 
each cognitive agent evaluates the respective managed environment and generates a proposed action to reconfigure the software defined infrastructure based on a respective current state and respective experience (Guo pg. 290, col. 1, 1st full par. “The performance agent generates VM capacity and server configuration options. The power agent generates the prediction on power consumption”, it would at least have been obvious to make recommendations for managing/decreasing power consumption based on this prediction, pg. 291, col. 1, 1st partial par. “feeds the power consumption data into the power agent. The performance agent obtains the performance data”); and
a deep Q-network approximates Q-values for each of the plurality of cognitive agents (e.g. Guo pg. 292, col. 1, 2nd full par. "Q-Table stores the Q-value"); and
each cognitive agent of the plurality of cognitive agents is an auto-scaling agent that dynamically scales the respective different components within the data center within the respective managed environment (Guo pg. 289, col. 2, last full par. “a scalable approach for coordinated VM resizing and server parameter tuning”).
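
For illustration only, the relationship between a stored Q-table and Q-value estimation can be sketched with the standard tabular Q-learning update. This is a generic textbook sketch and not the code of Guo or Egorov; the state names, action names, learning rate, and discount factor are hypothetical assumptions. A deep Q-network replaces the table lookup with a neural network that approximates the same Q-values.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[(next_state, a)] for a in actions)
    old = q_table[(state, action)]
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q_table[(state, action)]


# Hypothetical actions mirroring "add, remove and migrate a computing resource"
actions = ["add_vm", "remove_vm", "migrate_vm"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
first = q_update(q, "high_load", "add_vm", 1.0, "normal_load", actions)
second = q_update(q, "high_load", "add_vm", 1.0, "normal_load", actions)
```

With the assumed parameters, repeated positive rewards for the same state and action raise the stored value toward the reward.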



Murthy teaches modeling data center tickets (par. [0045] “a model based on problem tickets”). 

It would have been obvious at the time of filing to encode data center tickets (Murthy par. [0045] “a model based on problem tickets”, Guo pg. 291, col. 2, last partial par. “state vectors (s)”, Egorov pg. 1, col. 2, 1st full par. “an image like representation”). Those of ordinary skill in the art would have been motivated to do so to improve/optimize functioning of the system (e.g. Murthy par. [0045] “determines the risk for a possible configuration”).

Claims 11, 15 and 19: Guo, Egorov, Balma, Fox and Murthy teach claims 10, 14 and 18 wherein each cognitive agent further evaluates the respective managed environment and generates the proposed action to reconfigure the software defined infrastructure based upon a model and simulated experience (Egorov pg. 4, col. 2, 1st full par. “To train the neural network controller … "), increases in response time and energy usage that violate the SLA result in a negative reward in the gaming domain, and actions that lead to a reduction in a cost of ownership (TCO) score result in a positive reward in the gaming domain (Guo pg. 292, col. 1, 2nd full par. “The reward function of an action a … is defined using the effective throughput (ET)”, pg. 291, col. 2, 1st full par. “the objective is to meet the SLA … but also to improve the overall system throughput and to optimize resource utilization”).
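
For illustration only, the reward scheme recited above can be sketched as follows. This is not Guo's reward function, which is defined using effective throughput; the function name, the parameter names, and the reward magnitudes of -1.0, 0.0 and 1.0 are hypothetical assumptions.

```python
def reward(response_time_ms, sla_response_time_ms, prev_tco, new_tco):
    """Return a scalar reward for one action.

    Assumed rule: a response time above the SLA limit is penalized;
    otherwise an action that lowers the TCO score is rewarded.
    """
    if response_time_ms > sla_response_time_ms:
        return -1.0  # SLA violated: negative reward
    if new_tco < prev_tco:
        return 1.0   # TCO score reduced: positive reward
    return 0.0       # neither condition met: neutral


violation = reward(250, 200, prev_tco=100.0, new_tco=90.0)
saving = reward(150, 200, prev_tco=100.0, new_tco=90.0)
```

In this sketch the SLA check takes priority, so an action that lowers cost but breaks the response-time limit still receives a negative reward.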

Claims 4-5, 12, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over “Coordinated VM Resizing and Server Tuning: Throughput, Power Efficiency and Scalability” by Guo et al. (Guo) in view of “Multi-Agent Deep Reinforcement Learning” by Egorov et al. (Egorov) in view of US 10,558,483 to Balma et al. (Balma) in view of US 2019/0302310 to Fox et al. (Fox) in view of US 2011/0153507 to Murthy et al. (Murthy) in view of US 2011/0077795 to VanGilder et al. (VanGilder).

Claim 4: Guo, Egorov, Balma, Fox and Murthy teach the data center of claim 3, wherein each cognitive agent further evaluates the respective managed environment and generates the proposed action to reconfigure the software defined infrastructure based upon a policy function optimization based on energy cost, total energy use (Guo pg. 289, col. 2, 1st par. “reduce the running cost”, pg. 290, col. 1, 1st full par. “power agent” note that a determination of energy cost will at least obviously consider both the cost of the energy and the amount of energy used), and the pixel values for the one or more arrays of pixels comprise black and white values or red, green and blue values (Fox par. [0134] “RGB, grayscale, etc.”).

Guo, Egorov, Balma, Fox and Murthy do not explicitly teach optimization based on energy cost, total energy use, labor time and labor rate.

VanGilder teaches optimization based on labor time and labor rate (par. [0049] “local energy … labor costs” note that a consideration of labor costs will at least obviously include the time and rate of that labor).

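It would have been obvious at the time of filing to optimize based on energy cost, total energy use, labor time and labor rate (Guo pg. 290, col. 1, 1st full par. “power agent”, VanGilder par. [0049] “local energy … labor costs”). Those of ordinary skill in the art would have been motivated to do so to reduce the cost of operation (Guo pg. 289, col. 2, 1st par. “reduce the running cost”).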

Claim 5: Guo, Egorov, Balma, Fox, Murthy and VanGilder teach the data center of claim 4 wherein the policy function optimization reduces a total cost of ownership (TCO) of the data center (Guo pg. 289, col. 2, 1st par. “helps cloud providers to reduce the running cost”), at least one cognitive agent determines which pixel values reduce the application response time of the computing environment efficiently for improving a TCO score (Guo pg. 289, col. 2, 1st par. “reduce the running cost”), and energy use is based on: dynamic and idle power usage over time in the data center, and power usage for cooling to reduce heat in the data center (VanGilder par. [0057] “dynamic requirements such as power, cooling”, par. [0126] “approximately half of its maximum power in an idle state”), increases in response time and energy usage that violate the SLA result in a negative reward in the gaming domain, and actions that lead to a reduction in the TCO score result in a positive reward in the gaming domain (Guo pg. 292, col. 1, 2nd full par. “The reward function of an action a … is defined using the effective throughput (ET)”).

Claim 12: Guo, Egorov, Balma, Fox and Murthy teach the data center of claim 11 wherein each cognitive agent further evaluates the respective managed environment and generates the proposed action to reconfigure the software defined infrastructure based upon a policy function optimization based on energy cost, total energy use (Guo pg. 289, col. 2, 1st par. “reduce the running cost”, pg. 290, col. 1, 1st full par. “power agent” note that a determination of energy cost will at least obviously consider both the cost of the energy and the amount of energy used), the pixel values for the one or more arrays of pixels comprise black and white values or red, green and blue values (Fox par. [0134] “RGB, grayscale, etc.”), the resource type, and energy use is based on: power usage over time in the data center (e.g. Guo pg. 292, col. 1, 2nd full par. “time slot”).

Guo, Egorov, Balma, Fox and Murthy do not explicitly teach optimization based on energy cost, total energy use, labor time and labor rate.

VanGilder teaches optimization based on labor time and labor rate (par. [0049] “local energy … labor costs” note that a consideration of labor costs will at least obviously include the time and rate of that labor) and energy use is based on: dynamic and idle power usage (par. [0057] “dynamic requirements such as power, cooling”, par. [0126] “approximately half of its maximum power in an idle state”).

It would have been obvious at the time of filing to optimize based on energy cost, total energy use, labor time and labor rate (Guo pg. 290, col. 1, 1st full par “power agent”, VanGilder par. [0049] “local energy … labor costs”). Those of ordinary skill in the art would have been motivated to do so to reduce the cost of operation (Guo pg. 289, col. 2, 1st par. “reduce the running cost”).

Guo, Egorov, Balma, Murthy and VanGilder do not explicitly teach the pixel values for the one or more arrays of pixels comprise state information encoded as black and white values or red, green and blue values.

Claims 16 and 20: Guo, Egorov, Balma, Fox and Murthy teach claims 15 and 19 wherein each cognitive agent is implemented to further evaluate the respective managed environment and generate the proposed action to reconfigure the software defined infrastructure based upon a policy function optimization based on energy cost, total energy use (Guo pg. 289, col. 2, 1st par. “reduce the running cost”, pg. 290, col. 1, 1st full par. “power agent” note that a determination of energy cost will at least obviously consider both the cost of the energy and the amount of energy used), the pixel values for the one or more arrays of pixels comprise black and white values or red, green and blue values (Fox par. [0134] “RGB, grayscale, etc.”), the resource type, energy use is based on: power usage over time in the data center, and each cognitive agent of the plurality of cognitive agents is an auto-scaling agent that dynamically scales the respective different components within the data center within the respective managed environment (Guo pg. 289, col. 2, last full par. “a scalable approach for coordinated VM resizing and server parameter tuning”, pg. 291, col. 2, 1st full par. “the objective is to meet the SLA … but also to improve the overall system throughput and to optimize resource utilization”).

Guo, Egorov, Balma, Fox and Murthy do not explicitly teach optimization based on labor time and labor rate.

VanGilder teaches optimization based on labor time and labor rate (par. [0049] “local energy … labor costs” note that a consideration of labor costs will at least obviously include the time and rate of that labor) and energy use is based on: power usage for cooling to reduce heat in the data center (par. [0057] “dynamic requirements such as power, cooling”, par. [0126] “approximately half of its maximum power in an idle state”).

It would have been obvious at the time of filing to optimize based on energy cost, total energy use, labor time and labor rate (Guo pg. 290, col. 1, 1st full par “power agent”, VanGilder par. [0049] “local energy … labor costs”). Those of ordinary skill in the art would have been motivated to do so to reduce the cost of operation (Guo pg. 289, col. 2, 1st par. “reduce the running cost”).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over “Coordinated VM Resizing and Server Tuning: Throughput, Power Efficiency and Scalability” by Guo et al. (Guo) in view of “Multi-Agent Deep Reinforcement Learning” by Egorov et al. (Egorov) in view of US 10,558,483 to Balma et al. (Balma) in view of US 2019/0302310 to Fox et al. (Fox) in view of US 2018/0287902 to Chitalia et al. (Chitalia).

Claim 7: Guo, Egorov, Balma and Fox teach the data center of claim 1 wherein the software defined infrastructure comprises one or more of the following types of resources: storage … (Guo … 2nd to last par. “complex applications are hosted in the virtualized and shared server infrastructure”); and
the plurality of cognitive agents comprises a computing cognitive agent (Guo pg. 290, col. 1, 1st full par “performance agent”).

Guo, Egorov, Balma and Fox do not explicitly teach a storage cognitive agent.

Chitalia teaches a storage cognitive agent (par. [0042] “manages functions of data center 110 such as … storage … resources”, par. [0050] “machine learning to improve … orchestration, security, accounting and planning within data center 110”).

It would have been obvious at the time of filing to implement a storage cognitive agent (Guo pg. 290, 1st col. 1st full par. “reinforcement learning agents”, Chitalia par. [0042] “manages … storage”). Those of ordinary skill in the art would have been motivated to do so to monitor and optimize storage within the infrastructure (e.g. Chitalia par. [0050] “machine learning to improve … orchestration”).

Allowable Subject Matter
Claim 8 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 2020/0111038 to Ward et al. and US 2019/0347547 to Ebstyne et al. each disclose encoding data values as color coded regions of an image.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON D MITCHELL whose telephone number is (571)272-3728. The examiner can normally be reached on Monday through Thursday 7:00am - 4:30pm and alternate Fridays 7:00am - 3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached on (571)272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/JASON D MITCHELL/Primary Examiner, Art Unit 2199