DETAILED ACTION
This action is in response to claims filed 08 November 2019 for application 16/677851 filed 08 November 2019. Currently claims 1-14 and 16-21 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 3 and 16 are objected to because of the following informalities: It is unclear whether the prediction, the uncertainty estimate, and the output of an anomaly detector are recited in the alternative. The claim language appears to be missing a word or is awkwardly worded. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claim 8 is rejected under 35 U.S.C. 112(d) or pre-AIA  35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends.  Claim 8 requires “out-of-distribution data” which is not required by the alternative language of parent claim 7.  Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
Claim 21 is rejected under 35 U.S.C. 112(d) or pre-AIA  35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. It is a duplicate claim of claim 18. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 2, 7, 9 and 10 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gurevich et al. (Pairing an arbitrary regressor with an artificial neural network estimating aleatoric uncertainty).
Regarding claims 1 and 9, Gurevich discloses: 
A non-transitory computer-readable medium on which is stored program that, when executed by a computer, performs a method for estimating an uncertainty of a prediction generated by a machine learning system (“Suppose we have a standard neural network Nr = Nr(x) for regression (the regressor) with a loss function Lr . We complement it by another neural network Nq = Nq(x) (the uncertainty quantifier), and train both networks by minimizing a joint loss” p4 §2 ¶1), the method comprising:
receiving first data (p6 §3 ¶1 “training set”),
training a first machine learning model component of a machine learning system with the received first data, the first machine learning model component is trained to generate a prediction ([image: media_image1.png] p6 §3 ¶1, note: the regression network is a first machine learning model),
generating an uncertainty estimate of the prediction (p6 §3 ¶1 “Nq(x)” is an uncertainty estimate), 
training a second machine learning model component of the machine learning system with second data, the second machine learning model component is trained to generate a calibrated uncertainty estimate of the prediction ([image: media_image1.png] p6 §3 ¶1, “We will see that the choice of the functions f(z) and g(z) depends on concrete implementations of the uncertainty quantifier Nq, while λ affects the overall learning speed and the ratio between the learning speeds in clean and noisy regions.” p6 §3 ¶2, note: the quantifier network is a second machine learning model and through training calibrates the uncertainty).
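
The joint training of a regressor and an uncertainty quantifier mapped above can be sketched as follows. This is a minimal illustration only: it assumes a Gaussian negative log-likelihood as a stand-in joint loss, which is a common choice for this kind of pairing and is not necessarily the loss of the reference (that loss appears in the cited equation and may differ). The names `gaussian_nll` and `joint_loss` are hypothetical.

```python
import math


def gaussian_nll(y, mean, log_var):
    """Per-sample Gaussian negative log-likelihood, constant term dropped.

    `mean` plays the role of the regressor output and `log_var` the role of
    the uncertainty quantifier output (assumed parameterization).
    """
    return 0.5 * (log_var + (y - mean) ** 2 / math.exp(log_var))


def joint_loss(targets, means, log_vars):
    """Average loss over a batch; minimizing it fits both outputs together."""
    losses = [gaussian_nll(y, m, s) for y, m, s in zip(targets, means, log_vars)]
    return sum(losses) / len(losses)


# A prediction that is off by 2 is penalized less when the quantifier
# reports a matching variance (4) than when it reports a variance of 1.
overconfident = joint_loss([2.0], [0.0], [0.0])
matched = joint_loss([2.0], [0.0], [math.log(4.0)])
```

Under this assumed loss, the quantifier is rewarded for reporting a variance close to the squared error of the regressor, which is one sense in which training can be said to calibrate the uncertainty.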

Regarding claims 2 and 10, Gurevich discloses: The computer-readable medium of claim 1, wherein the uncertainty estimate of the prediction is generated by one of the following: the first machine learning model component, the second machine learning model component, an external machine learning model component ([image: media_image1.png] p6 §3 ¶1, “We will see that the choice of the functions f(z) and g(z) depends on concrete implementations of the uncertainty quantifier Nq, while λ affects the overall learning speed and the ratio between the learning speeds in clean and noisy regions.” p6 §3 ¶2, note: the second machine learning model generates the uncertainty estimate).

Regarding claim 7, Gurevich discloses: The computer-readable medium of claim 1, wherein the second data is one of the following: the first data, out-of-distribution data ([image: media_image1.png] p6 §3 ¶1, note: the same data is used (the first data)).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 3-6, 11-14 and 16-21 are rejected under 35 U.S.C. 103 as being unpatentable over Gurevich in view of An et al. (Variational Autoencoder based Anomaly Detection using Reconstruction Probability).

Regarding claims 3, 11 and 16, Gurevich discloses: The computer-readable medium of claim 1, wherein the second machine learning model component of the machine learning system is trained to generate the calibrated uncertainty estimate of the prediction in response to a receipt, as an input to the second machine learning component, the following:
the prediction (p6 §3 ¶1, P293 §3 ¶2),
the uncertainty estimate of the prediction (p6 §3 ¶1, “We will see that the choice of the functions f(z) and g(z) depends on concrete implementations of the uncertainty quantifier Nq, while λ affects the overall learning speed and the ratio between the learning speeds in clean and noisy regions.” P6 §3 ¶2).

However, Gurevich does not explicitly disclose: an output of at least one anomaly detector.

An teaches: an output of at least one anomaly detector (“The algorithm of the proposed method is in algorithm 4. The anomaly detection task is conducted a semi-supervised framework, using only data of normal instances for training the VAE. The probabilistic encoder fφ and decoder gθ both paramterizes an isotropic normal distribution in the latent variable space and the original input variable space, respectively. For testing, a number of samples are drawn from the probabilistic encoder of the trained VAE. For each sample from the encoder, the probabilistic decoder outputs the mean and variance parameter. Using these parameters, the probability of the original data generating from the distribution is calculated. The average probability is used as an anomaly score and is called the reconstruction probability” p8 §3.1 ¶1).
Gurevich and An are both in the same field of endeavor of neural networks with random elements and are analogous. Gurevich teaches a neural network for quantifying uncertainty. An teaches a variational autoencoder-based anomaly detector. It would have been obvious to one of ordinary skill in the art before the effective filing date to substitute either of, or part of, the neural networks taught by Gurevich with the VAE anomaly detector taught by An. One would be motivated to use the known structure of the VAE with the known application of anomaly detection to modify the generic neural networks of Gurevich.
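
The reconstruction probability quoted from An above (draw samples from the encoder, obtain a mean and variance from the decoder for each sample, and average the probability of the original data) can be sketched as follows. This is an illustration under stated assumptions: `toy_encode` and `toy_decode` are hypothetical stand-ins for a trained VAE, and the function names and sample count are not taken from the reference.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def reconstruction_probability(x, encode, decode, n_samples=100, seed=0):
    """Average probability of x under the decoder, over encoder samples."""
    rng = random.Random(seed)
    z_mean, z_var = encode(x)
    total = 0.0
    for _ in range(n_samples):
        # Draw one latent sample from the (diagonal) encoder distribution.
        z = [rng.gauss(m, math.sqrt(v)) for m, v in zip(z_mean, z_var)]
        x_mean, x_var = decode(z)
        # Probability of the original input under the decoder output.
        p = 1.0
        for xi, m, v in zip(x, x_mean, x_var):
            p *= normal_pdf(xi, m, v)
        total += p
    return total / n_samples


def toy_encode(x):
    """Hypothetical encoder: one latent dimension, small fixed variance."""
    return [0.1 * sum(x) / len(x)], [0.01]


def toy_decode(z):
    """Hypothetical decoder: two output dimensions, unit variance."""
    return [z[0], z[0]], [1.0, 1.0]


score_normal = reconstruction_probability([0.0, 0.0], toy_encode, toy_decode)
score_anomaly = reconstruction_probability([5.0, 5.0], toy_encode, toy_decode)
```

In this sketch a low score marks an input as anomalous; such a score is the kind of anomaly detector output that could be supplied as an additional input to another model component.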

Regarding claims 4 and 12, Gurevich does not explicitly disclose: The computer-readable medium of claim 3, wherein the anomaly detector is trained with the received second data for detecting deviation in the operational data.

An teaches: The computer-readable medium of claim 3, wherein the anomaly detector is trained with the received second data for detecting deviation in the operational data (“The algorithm of the proposed method is in algorithm 4. The anomaly detection task is conducted a semi-supervised framework, using only data of normal instances for training the VAE. The probabilistic encoder fφ and decoder gθ both paramterizes an isotropic normal distribution in the latent variable space and the original input variable space, respectively. For testing, a number of samples are drawn from the probabilistic encoder of the trained VAE. For each sample from the encoder, the probabilistic decoder outputs the mean and variance parameter. Using these parameters, the probability of the original data generating from the distribution is calculated. The average probability is used as an anomaly score and is called the reconstruction probability” p8 §3.1 ¶1, note: the first and second data are interpreted as the same data and the same training data would be used in the combination).

Regarding claims 5, 13, 17-19 and 21, Gurevich does not explicitly disclose: The computer-readable medium of claim 1, wherein the first machine learning model component is one of the following: a denoising neural network, a generative adversarial network, a variational autoencoder, a ladder network, a recurrent neural network.

An teaches: wherein the first machine learning model component is one of the following: a denoising neural network, a generative adversarial network, a variational autoencoder, a ladder network, a recurrent neural network (“The main difference between a VAE and an autoencoder is that the VAE is a stochastic generative model that can give calibrated probabilities, while an autoencoder is a deterministic discriminative model that does not have a probabilistic foundation. This is obvious in that VAE models the parameters of a distribution as explained above.” P7 ¶2).

Regarding claims 6, 14 and 20, Gurevich does not explicitly disclose: The computer-readable medium of claim 1, wherein the second machine learning model component is one of the following: a denoising neural network, a generative adversarial network, a variational autoencoder, a ladder network, a recurrent neural network.

An teaches: wherein the second machine learning model component is one of the following: a denoising neural network, a generative adversarial network, a variational autoencoder, a ladder network, a recurrent neural network (“The main difference between a VAE and an autoencoder is that the VAE is a stochastic generative model that can give calibrated probabilities, while an autoencoder is a deterministic discriminative model that does not have a probabilistic foundation. This is obvious in that VAE models the parameters of a distribution as explained above.” P7 ¶2).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Gurevich in view of An et al. (Variational Autoencoder based Anomaly Detection using Reconstruction Probability) and further in view of Hendrycks et al. (Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations).

Regarding claim 8, Gurevich does not explicitly disclose: The computer-readable medium of claim 7, wherein the out-of-distribution data is generated by corrupting the first machine learning model component parameters and generating the out-of-distribution data by evaluating the corrupted first machine learning model component.

Hendrycks teaches: wherein the out-of-distribution data is generated by corrupting the first machine learning model component parameters and generating the out-of-distribution data by evaluating the corrupted first machine learning model component (“Our corruption robustness benchmark consists of 15 diverse corruption types, exemplified in Figure 1. The benchmark covers noise, blur, weather, and digital categories. Research that improves performance on this benchmark should indicate general robustness gains, as the corruptions are varied and great in number. These 15 corruption types each have five different levels of severity, since corruptions can manifest themselves at varying intensities. The Supplementary Materials gives an example of a corruption type’s five different severities. Real-world corruptions also have variation even at a fixed intensity. To simulate these, we introduce variation for each corruption when possible.” P2 §3.1 ¶1).

Gurevich and Hendrycks are both in the same field of endeavor of neural networks and are analogous. Gurevich discloses a neural network system for quantifying uncertainty. Hendrycks teaches a system for training a model to account for corrupted out-of-distribution data. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the neural network with uncertainty as taught by Gurevich to use out-of-distribution corrupted data as taught by Hendrycks. One would be motivated to create a neural network that is more robust to noise and corruption (Hendrycks P2 §3.1 ¶1).
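
The graded corruption quoted from Hendrycks above (corruption types applied at five levels of severity) can be sketched for a single corruption type as follows. This is a minimal illustration assuming additive Gaussian noise on values scaled to [0, 1]; the noise levels in `SEVERITY_SIGMA` and the function name are hypothetical and are not taken from the reference.

```python
import random

# Hypothetical noise standard deviations for severity levels 1 through 5.
SEVERITY_SIGMA = {1: 0.04, 2: 0.08, 3: 0.12, 4: 0.16, 5: 0.20}


def gaussian_noise_corrupt(values, severity, seed=0):
    """Add Gaussian noise at the given severity and clip to [0, 1]."""
    rng = random.Random(seed)
    sigma = SEVERITY_SIGMA[severity]
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in values]


clean = [0.5] * 50
mild = gaussian_noise_corrupt(clean, severity=1)
severe = gaussian_noise_corrupt(clean, severity=5)

# Total absolute deviation from the clean data grows with severity.
mild_shift = sum(abs(a - b) for a, b in zip(mild, clean))
severe_shift = sum(abs(a - b) for a, b in zip(severe, clean))
```

The sketch corrupts input data, as in the quoted passage; it does not corrupt model parameters.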
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-14 and 16-21 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-4 and 8-11 of copending Application No. 16/678,179 (reference application). Although the claims at issue are not identical, they are not patentably distinct from each other because:
Instant Application, Claim 1:
A non-transitory computer-readable medium on which is stored program that, when executed by a computer, performs a method for estimating an uncertainty of a prediction generated by a machine learning system, the method comprising:
receiving first data,
training a first machine learning model component of a machine learning system with the received first data, the first machine learning model component is trained to generate a prediction,
generating an uncertainty estimate of the prediction, training a second machine learning model component of the machine learning system with second data, the second machine learning model component is trained to generate a calibrated uncertainty estimate of the prediction.

Application No. 16/678,179, Claim 1:
A non-transitory, computer-readable medium on which is stored a computer program that, when executed by a computer, performs a method for controlling a target system based on operational data of the target system, the method comprising:
receiving first data of at least one source system, training a first machine learning model component of a machine learning system with the received first data, the first machine learning model component is trained to generate a prediction on a state of the target system,
generating an uncertainty estimate of the prediction, training a second machine learning model component of the machine learning system with second data, the second machine learning model component is trained to generate a calibrated uncertainty estimate of the prediction,
the method further comprising:
receiving an operational data of the target system,
controlling the target system in accordance with the received operational data of the target system by means of selecting a control action by optimization using the first machine learning model component and arranging to apply the calibrated uncertainty estimate generated with the second machine learning model component in the optimization.

Claim 1 of the instant application is fully anticipated by claim 1 of the ‘179 application.

	
Claim 9 of the instant application is fully anticipated by claim 1 of the ‘179 application as described above.
Claims 2 and 10 of the instant application are fully anticipated by claim 2 of the ‘179 application.
Claims 3 and 11 of the instant application are fully anticipated by claim 2 of the ‘179 application.
Claims 4 and 12 of the instant application are fully anticipated by claim 2 of the ‘179 application.
Claims 5, 13, 17-19 and 21 of the instant application are fully anticipated by claim 8 of the ‘179 application.
Claims 6, 14 and 20 of the instant application are fully anticipated by claim 9 of the ‘179 application.
Claim 7 of the instant application is fully anticipated by claim 10 of the ‘179 application.
Claim 8 of the instant application is fully anticipated by claim 11 of the ‘179 application.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Hajizadeh (US 2017/0220928) and Ghosh et al. (US 2018/0341876) disclose neural networks that quantify uncertainty.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC NILSSON whose telephone number is (571)272-5246. The examiner can normally be reached M-F: 7-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ERIC NILSSON/           Primary Examiner, Art Unit 2122