DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present application, filed on 07/23/2020, claims benefit of provisional Application No. 62/905,106 filed on 09/24/2019. Claims 1-18 are pending and have been examined.

Information Disclosure Statement
The information disclosure statements (IDSs) were submitted on 12/16/2020, 01/28/2021, and 04/05/2022. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections
Claims 7-18 are objected to because of the following informalities:  
The recitation of “the method” in claim 7 line 3 should be “the computer implemented method”. 
The recitation of “The method” in claims 8-12 line 1 should be “The computer implemented method”. 
The recitation of “the processor” in claim 13 line 6 should be “the one or more processors”.
Dependent claims 8-12 are objected to based on the same rationale as claim 7.
Dependent claims 14-18 are objected to based on the same rationale as claim 13.
Appropriate correction is required.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 5-6, 11-12, and 17-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “relative success” in each of claims 5, 11, and 17 is a relative term which renders the claim indefinite. The term “relative success” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examination purposes, any degree of success as measured by any metric is considered “relative success”.
Dependent claim 6 is rejected based on the same rationale as claim 5. Dependent claim 12 is rejected based on the same rationale as claim 11. Dependent claim 18 is rejected based on the same rationale as claim 17. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 7, 9, 13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (“Exploiting Vulnerabilities of Load Forecasting Through Adversarial Attacks”) in view of Gong et al. (“Real-Time Adversarial Attacks”).
Regarding Claim 1,
Chen et al. teaches An attack system for generating perturbations of input signals to a recurrent neural network (RNN) based target system configured to receive input sensor signals and produce outputs, the attack system comprising (pg. 1 Section 1: “For simplicity, in this paper, we restrict the inputs of the algorithms to be the historical load data, time indicators and temperature information. These algorithms can be thought of as finding a mapping between the (high dimensional) input features to the forecasted time series of load values” and Figure 1 and caption: “Figure 1: The schematic of our proposed attacks on load forecasting algorithms along with the threats over power system operations. Without knowledge about the forecast model’s parameters, the attacker injects designed small, undetectable data perturbations into weather forecasts to induce abnormal system operations” teach generating perturbations of received input temperature data (correspond to input sensor signals because temperature data, under broadest reasonable interpretation, are obtained from temperature sensors) to attack target model/algorithms and produce outputs; pg. 3 Section 3.2.2 and Section 3.2.3 teach the target model/algorithm can be recurrent neural network and LSTM):
one or more processors and a non-transitory computer-readable medium having executable instructions encoded thereon such that when executed, the one or more processors perform operations of (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches a computer-based implementation using RAM (corresponds to non-transitory computer-readable medium) and CPU (corresponds to processor) to execute instructions):
...determine a magnitude of a perturbation with which to attack the RNN based target system; generating a perturbed input sensor signal having the determined magnitude (Figure 1 and caption: “Figure 1: The schematic of our proposed attacks on load forecasting algorithms along with the threats over power system operations. Without knowledge about the forecast model’s parameters, the attacker injects designed small, undetectable data perturbations into weather forecasts to induce abnormal system operations” and pg. 3-4 Section 3.3:

[Image: media_image1.png]

[Image: media_image2.png]

teach determining the magnitude of the perturbation (for example, “small, undetectable data perturbations”) with which to attack the target model/algorithm, and generating adversarial input temperature data (correspond to perturbed input sensor signal) having the “small, undetectable data perturbations”; pg. 3 Section 3.2.2 and Section 3.2.3 teach the target model/algorithm can be recurrent neural network and LSTM);
presenting the perturbed input sensor signal to the RNN based target system such that the RNN based target system produces an altered output in response to the perturbed input sensor signal (pg. 3-4 Section 3.3:

[Image: media_image1.png]

[Image: media_image2.png]

teach generating modified predictions (correspond to generating altered output) in response to adversarial input temperature data (correspond to perturbed input sensor signal) being presented to the target model/algorithm; pg. 3 Section 3.2.2 and Section 3.2.3 teach the target model/algorithm can be recurrent neural network and LSTM);
and identifying a failure mode of the RNN based target system using the altered output (Table 1 and caption: “Table 1: Forecasts errors evaluated on clean test data and adversarial data for 3 different forecast models. Allowed maximum perturbations are 4F” and Figure 6 and caption: “Figure 6: The forecast MAPE under (a). attacks to increase the load; and (b). attacks to decrease the load. Simulations are run for three times with different random seeds, and shaded area denotes the variance” teach identifying the errors (correspond to failure mode) of the target model/algorithm producing modified predictions based on adversarial data, wherein the modified predictions correspond to altered output; pg. 3 Section 3.2.2 and Section 3.2.3 teach the target model/algorithm can be recurrent neural network and LSTM).
	Chen et al. does not appear to explicitly teach training a reinforcement learning agent to determine a magnitude of a perturbation with which to attack.
	However, Gong et al. teaches training a reinforcement learning agent to determine a magnitude of a perturbation with which to attack (Figure 1 and pg. 2 first full paragraph: “we propose a new attack scheme that continuously uses observed data to approximate an optimal adversarial perturbation for future time points using a deep reinforcement learning architecture (illustrated in Figure 1)” and pg. 2 fifth full paragraph: “a more natural way of describing this problem is to view the adversarial perturbation generator as an agent and model the problem as a partially observable decision process problem, i.e., the generator continuously observes the streaming data and makes a sequence of decisions of how to make the perturbation” teach training a deep reinforcement learning architecture to determine a magnitude of a perturbation with which to attack wherein the architecture includes an agent).
Chen et al. and Gong et al. are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Gong et al. into the disclosed invention of Chen et al.
One of ordinary skill in the art would have been motivated to make this modification in order to leverage reinforcement learning to implement “real-time adversarial attacks and...how to attack a streaming-based machine learning model by designing a real-time perturbation generator that continuously uses observed data to design optimal perturbations for unobserved data” (Gong et al. pg. 8 Section 4 and Figure 1).
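For illustration only, and not as part of the cited references, the combined teaching mapped above (a perturbation whose magnitude is determined and bounded before being added to the input sensor signal) might be sketched as follows. The function names, the sample temperature values, and the proposed perturbation list standing in for the output of a trained reinforcement learning agent are all hypothetical assumptions; only the 4F bound is taken from the Table 1 caption of Chen et al.

```python
from typing import List


def clip(value: float, bound: float) -> float:
    """Limit a proposed perturbation to the interval [-bound, +bound]."""
    return max(-bound, min(bound, value))


def perturb_signal(signal: List[float], deltas: List[float],
                   max_magnitude: float) -> List[float]:
    """Add a magnitude-bounded perturbation to each sample of the signal.

    `deltas` is a hypothetical stand-in for the per-step magnitudes that a
    trained reinforcement learning agent would propose.
    """
    if len(signal) != len(deltas):
        raise ValueError("signal and deltas must have the same length")
    return [x + clip(d, max_magnitude) for x, d in zip(signal, deltas)]


# Hypothetical temperature readings (degrees F) and proposed perturbations.
temps = [70.0, 72.0, 75.0]
proposed = [1.5, -6.0, 9.0]

# 4.0 mirrors the "Allowed maximum perturbations are 4F" bound in Table 1.
perturbed = perturb_signal(temps, proposed, max_magnitude=4.0)
print(perturbed)  # [71.5, 68.0, 79.0]
```

The perturbed list is what would be presented to the target model in place of the clean readings; the target model itself is omitted from this sketch.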
Regarding Claim 3,
Chen et al. in view of Gong et al. teaches the attack system as set forth in Claim 1.
Chen et al. further teaches wherein the one or more processors (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches CPU (corresponds to processor)).
Gong et al. further teaches further perform an operation of using an attack generator to generate the perturbed input sensor signal (Figure 4(B) teaches generating perturbed input speech signal (corresponds to sensor signal) using real-time audio adversarial attack based on the real-time adversarial attack scheme of Figure 1, which uses a Real-time Adversarial Perturbation Generator (corresponds to attack generator)).
Chen et al. and Gong et al. are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Gong et al. into the disclosed invention of Chen et al.
One of ordinary skill in the art would have been motivated to make this modification in order to leverage reinforcement learning to implement “real-time adversarial attacks and...how to attack a streaming-based machine learning model by designing a real-time perturbation generator that continuously uses observed data to design optimal perturbations for unobserved data” (Gong et al. pg. 8 Section 4 and Figure 1).
Regarding Claim 7,
Claim 7 recites analogous limitations to claim 1. Therefore, claim 7 is rejected based on the same rationale as claim 1.
Chen et al. teaches A computer implemented method for generating perturbations of input signals to a recurrent neural network (RNN) based target system configured to receive input sensor signals and produce outputs, the method comprising an act of: (pg. 1 Section 1: “For simplicity, in this paper, we restrict the inputs of the algorithms to be the historical load data, time indicators and temperature information. These algorithms can be thought of as finding a mapping between the (high dimensional) input features to the forecasted time series of load values” and Figure 1 and caption: “Figure 1: The schematic of our proposed attacks on load forecasting algorithms along with the threats over power system operations. Without knowledge about the forecast model’s parameters, the attacker injects designed small, undetectable data perturbations into weather forecasts to induce abnormal system operations” teach generating perturbations of received input temperature data (correspond to input sensor signals because temperature data, under broadest reasonable interpretation, are obtained from temperature sensors) to attack target model/algorithms and produce outputs; pg. 3 Section 3.2.2 and Section 3.2.3 teach the target model/algorithm can be recurrent neural network and LSTM):
causing one or more processers to execute instructions encoded on a non-transitory computer-readable medium, such that upon execution, the one or more processors perform operations of (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches a computer-based implementation using RAM (corresponds to non-transitory computer-readable medium) and CPU (corresponds to processor) to execute instructions).
Regarding Claim 9,
Claim 9 recites analogous limitations to claim 3. Therefore, claim 9 is rejected based on the same rationale as claim 3.
Regarding Claim 13,
Claim 13 recites analogous limitations to claim 1. Therefore, claim 13 is rejected based on the same rationale as claim 1.
Chen et al. teaches A computer program product for generating perturbations of input signals to a recurrent neural network (RNN) based target system configured to receive input sensor signals and produce outputs, the computer program product comprising (pg. 1 Section 1: “For simplicity, in this paper, we restrict the inputs of the algorithms to be the historical load data, time indicators and temperature information. These algorithms can be thought of as finding a mapping between the (high dimensional) input features to the forecasted time series of load values” and Figure 1 and caption: “Figure 1: The schematic of our proposed attacks on load forecasting algorithms along with the threats over power system operations. Without knowledge about the forecast model’s parameters, the attacker injects designed small, undetectable data perturbations into weather forecasts to induce abnormal system operations” teach generating perturbations of received input temperature data (correspond to input sensor signals because temperature data, under broadest reasonable interpretation, are obtained from temperature sensors) to attack target model/algorithms and produce outputs; pg. 3 Section 3.2.2 and Section 3.2.3 teach the target model/algorithm can be recurrent neural network and LSTM):
computer-readable instructions stored on a non-transitory computer-readable medium that are executable by a computer having one or more processors for causing the processor to perform operations of (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches a computer-based implementation using RAM (corresponds to non-transitory computer-readable medium) and CPU (corresponds to processor) to execute instructions).
Regarding Claim 15,
Claim 15 recites analogous limitations to claim 3. Therefore, claim 15 is rejected based on the same rationale as claim 3.
	
Claims 2, 8, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (“Exploiting Vulnerabilities of Load Forecasting Through Adversarial Attacks”) in view of Gong et al. (“Real-Time Adversarial Attacks”) and further in view of Inkawhich et al. (“Snooping Attacks on Deep Reinforcement Learning”).
Regarding Claim 2,
Chen et al. in view of Gong et al. teaches the attack system as set forth in Claim 1.
Chen et al. further teaches wherein the one or more processors (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches CPU (corresponds to processor)).
Chen et al. in view of Gong et al. does not appear to explicitly teach further perform an operation of training the reinforcement learning agent to learn a timing for the perturbation.
However, Inkawhich et al. teaches further perform an operation of training the reinforcement learning agent to learn a timing for the perturbation (pg. 5 Section 5.5: “When we have the ability to eavesdrop on rewards and actions, we can attack with an imitator, assessor, or psychic. However, with this additional information we can improve our attacks by strategically timing the perturbations to make them less detectable” teaches training a proxy (for example, an imitator or assessor) to learn a timing for the perturbations associated with the attacks; pg. 3 Section 3.2: “In this work, we consider the state-of-the-art value-based DRL method DQN” and pg. 11 Section B.2: “The imitator architecture that we use is identical to the smaller DQN that was initially introduced in [20], and apply a Softmax operation to the logits for use with the cross-entropy classification loss. Note that we intentionally use a different architecture from the target agents in the interest of strictly adhering to black-box assumptions. The assessor uses the same architecture as the imitator, but we replace the classification layer with a single output node, and train the model with the Huber regression loss” teach the proxy, such as imitator or assessor, is trained based on the deep reinforcement learning (DRL) method deep Q-learning (DQN), thus rendering the proxy a reinforcement learning agent).
Chen et al., Gong et al., and Inkawhich et al. are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Inkawhich et al. into the disclosed invention of Chen et al. in view of Gong et al.
One of ordinary skill in the art would have been motivated to make this modification because “[w]hen we have the ability to eavesdrop on rewards and actions, we can attack with an imitator, assessor, or psychic...with this additional information we can improve our attacks by strategically timing the perturbations to make them less detectable” (Inkawhich et al. pg. 5 Section 5.5).
Regarding Claim 8,
Claim 8 recites analogous limitations to claim 2. Therefore, claim 8 is rejected based on the same rationale as claim 2.
Regarding Claim 14,
Claim 14 recites analogous limitations to claim 2. Therefore, claim 14 is rejected based on the same rationale as claim 2.

Claims 4, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (“Exploiting Vulnerabilities of Load Forecasting Through Adversarial Attacks”) in view of Gong et al. (“Real-Time Adversarial Attacks”) and further in view of Russo et al. (“Optimal Attacks on Reinforcement Learning Policies”).
Regarding Claim 4,
Chen et al. in view of Gong et al. teaches the attack system as set forth in Claim 3.
Gong et al. further teaches wherein at each time step of training of the reinforcement learning agent, the one or more processors further perform an operation of presenting unattacked sensor data comprising a known property to attack to the reinforcement learning agent (Figure 1 and pg. 2 first full paragraph: “we propose a new attack scheme that continuously uses observed data to approximate an optimal adversarial perturbation for future time points using a deep reinforcement learning architecture (illustrated in Figure 1)” and pg. 2 fifth full paragraph: “a more natural way of describing this problem is to view the adversarial perturbation generator as an agent and model the problem as a partially observable decision process problem, i.e., the generator continuously observes the streaming data and makes a sequence of decisions of how to make the perturbation” teach training a deep reinforcement learning agent by presenting to the agent sensor data that have not been perturbed (corresponds to unattacked data) but are observed (corresponds to data with known property to attack); Figure 4(B) teaches generating perturbed input speech signal (corresponds to sensor data) using real-time audio adversarial attack based on the real-time adversarial attack scheme of Figure 1; also see pg. 5 Section 3.1, which teaches a computer-based system).
Chen et al. and Gong et al. are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Gong et al. into the disclosed invention of Chen et al.
One of ordinary skill in the art would have been motivated to make this modification in order to leverage reinforcement learning to implement “real-time adversarial attacks and...how to attack a streaming-based machine learning model by designing a real-time perturbation generator that continuously uses observed data to design optimal perturbations for unobserved data” (Gong et al. pg. 8 Section 4 and Figure 1).
Chen et al. in view of Gong et al. does not appear to explicitly teach wherein the reinforcement learning agent outputs a set of parameters of a probability distribution from which a set of attack parameters are sampled by the attack generator.
However, Russo et al. teaches wherein the reinforcement learning agent outputs a set of parameters of a probability distribution from which a set of attack parameters are sampled by the attack generator (pg. 4 first to third paragraphs:
[Image: media_image3.png]
teaches the reinforcement learning agent outputting the action selected in accordance with the modified state (corresponds to parameters of the distribution [Image: media_image4.png]) from which the adversary (corresponds to attack generator) samples to obtain the reward parameters (correspond to attack parameters)).
Chen et al., Gong et al., and Russo et al. are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Russo et al. into the disclosed invention of Chen et al. in view of Gong et al.
One of ordinary skill in the art would have been motivated to make this modification to leverage adversarial attacks that “outperforms gradient methods” because “[d]eriving an optimal attack is important in order to understand how to build RL policies robust to adversarial perturbations” (Russo et al. pg. 8 Section 7).
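For illustration only, and not as a reproduction of any cited reference, the claimed arrangement addressed above (an agent that outputs parameters of a probability distribution, from which an attack generator samples attack parameters) might be sketched as follows. The Gaussian form of the distribution, the fixed mapping used in `agent_policy`, and every name and value in the block are hypothetical assumptions; a trained agent would learn this mapping rather than compute it from a fixed formula.

```python
import random
from typing import List, Tuple


def agent_policy(observation: List[float]) -> Tuple[float, float]:
    """Hypothetical stand-in for a trained agent.

    Maps unattacked sensor data to the (mean, std) of a Gaussian over
    attack parameters. The formula is illustrative only.
    """
    mean = 0.01 * sum(observation) / len(observation)
    std = 0.5
    return mean, std


def attack_generator(params: Tuple[float, float], count: int,
                     rng: random.Random) -> List[float]:
    """Sample `count` attack parameters from the agent's distribution."""
    mean, std = params
    return [rng.gauss(mean, std) for _ in range(count)]


# Hypothetical unattacked sensor readings presented to the agent.
observation = [70.0, 72.0, 75.0]
params = agent_policy(observation)

# A seeded generator keeps the sketch reproducible from run to run.
samples = attack_generator(params, 3, random.Random(0))
print(params, samples)
```

Separating the agent from the generator in this way keeps the sampling step stochastic while the agent's output stays deterministic for a given observation.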
Regarding Claim 10,
Claim 10 recites analogous limitations to claim 4. Therefore, claim 10 is rejected based on the same rationale as claim 4.
Regarding Claim 16,
Claim 16 recites analogous limitations to claim 4. Therefore, claim 16 is rejected based on the same rationale as claim 4.

Claims 5-6, 11-12, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (“Exploiting Vulnerabilities of Load Forecasting Through Adversarial Attacks”) in view of Gong et al. (“Real-Time Adversarial Attacks”) in view of Russo et al. (“Optimal Attacks on Reinforcement Learning Policies”) and further in view of Chen et al. (“Evaluation of Reinforcement Learning-Based False Data Injection Attack to Automatic Voltage Control”; hereinafter “Chen-2”).
Regarding Claim 5,
Chen et al. in view of Gong et al. in view of Russo et al. teaches the attack system as set forth in Claim 4.
Chen et al. further teaches wherein the one or more processors (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches CPU (corresponds to processor)).
Chen et al. in view of Gong et al. in view of Russo et al. does not appear to explicitly teach further perform an operation of determining a scalar value using the unattacked sensor data, the set of attack parameters, the perturbed input signal, and the altered output, wherein the scalar value represents relative success of the attack associated with the perturbed input sensor signal.
However, Chen-2 teaches further perform an operation of determining a scalar value using the unattacked sensor data, the set of attack parameters, the perturbed input signal, and the altered output, wherein the scalar value represents relative success of the attack associated with the perturbed input sensor signal (pg. 2161 Sections IV(D)-(E):
[Image: media_image5.png]

teach determining a reward value that indicates the success of the attack associated with the false data injection into the input (corresponds to perturbed input sensor signal) based on voltage data before false data injection (corresponds to unattacked sensor data) and voltage data after false data injection (corresponds to perturbed input signal), and based on a set of [Image: media_image6.png] error ratio values (correspond to the set of attack parameters) used to model the false data injection, and based on the action taken based on the observation that has been affected by false data injection (the modified action corresponds to the altered output); pg. 2161 Section A: “Finally, the agent receives a reward equal to r(s, a), which belongs to the reward set R” teaches the reward value is a value in a reward set, thus the reward value is a scalar value).
Chen et al., Gong et al., Russo et al., and Chen-2 are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Chen-2 into the disclosed invention of Chen et al. in view of Gong et al. in view of Russo et al.
One of ordinary skill in the art would have been motivated to make this modification to leverage “a novel FDI attack method, which could distort normal operations of a power system regulated by the OPF-based AVC. By formulating the attack as a POMDP, the attacker could apply reinforcement learning method to obtain effective action strategy from its experiences” (Chen-2 pg. 2168 Section IX).
Regarding Claim 6,
Chen et al. in view of Gong et al. in view of Russo et al. in view of Chen-2 teaches the attack system as set forth in Claim 5.
Chen et al. further teaches wherein the one or more processors (pg. 11 Section B: “We recorded the computation time for neural network training and the implementation time for two proposed attack algorithms. All time are recorded on a laptop with Intel 2.3GHz Core i5-8259U 4 Cores CPU and 8 GB RAM” teaches CPU (corresponds to processor)).
Chen-2 further teaches further perform an operation of providing the scalar value to the reinforcement learning agent as a reward signal, thereby improving an attack strategy of the reinforcement learning agent (pg. 2161 Sections IV(D)-(E) teach providing the reward value to the reinforcement learning agent as a reward signal; pg. 2161 Section A: “Finally, the agent receives a reward equal to r(s, a), which belongs to the reward set R” teaches the reward value is a value in a reward set, thus the reward value is a scalar value; pg. 2168 Section IX: “a novel FDI attack method, which could distort normal operations of a power system regulated by the OPF-based AVC. By formulating the attack as a POMDP, the attacker could apply reinforcement learning method to obtain effective action strategy from its experiences” and pg. 2161 Section IV(A): “The goal of an agent is to maximize its expected future discounted reward:...An exact solution to a POMDP yields the optimal action for each possible belief over the states, which is also interpreted as an optimal policy γ of the agent for interacting with its environment” teach the reinforcement learning agent is trained based on rewards that will improve the attack strategy).
Chen et al., Gong et al., Russo et al., and Chen-2 are analogous art to the claimed invention because they are directed to adversarial machine learning.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above as taught by Chen-2 into the disclosed invention of Chen et al. in view of Gong et al. in view of Russo et al.
One of ordinary skill in the art would have been motivated to make this modification to leverage “a novel FDI attack method, which could distort normal operations of a power system regulated by the OPF-based AVC. By formulating the attack as a POMDP, the attacker could apply reinforcement learning method to obtain effective action strategy from its experiences” (Chen-2 pg. 2168 Section IX).
Regarding Claim 11,
Claim 11 recites analogous limitations to claim 5. Therefore, claim 11 is rejected based on the same rationale as claim 5.
Regarding Claim 12,
Claim 12 recites analogous limitations to claim 6. Therefore, claim 12 is rejected based on the same rationale as claim 6.
Regarding Claim 17,
Claim 17 recites analogous limitations to claim 5. Therefore, claim 17 is rejected based on the same rationale as claim 5.
Regarding Claim 18,
Claim 18 recites analogous limitations to claim 6. Therefore, claim 18 is rejected based on the same rationale as claim 6.

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Wang et al. (US 2019/0042761 A1) teaches generating a quality score based on an observation and an action caused by an actor agent during a testing phase, which is relevant to Fig. 3 of the present application.




Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YING YU CHEN whose telephone number is (571)270-1484. The examiner can normally be reached Monday-Friday 7:30 am-5:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YING YU CHEN/               Primary Examiner, Art Unit 2125