Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/11/2019 was filed before the mailing date of the first office action. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.	

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 3 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 3 recites the limitation “determining, using the trained machine learning model, a probability of a time series value at an ith position of a time series of the sensor data, wherein 

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 5, 8, 12 and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hasan et al* (“Learning Temporal Regularity in Video Sequences”, herein Hasan).
*this document was listed in the IDS from 02/11/2019, therefore a copy has not been attached to this office action.
Regarding claim 1, Hasan teaches a method (the abstract recites “we propose two methods that are built upon the autoencoders for their ability to work with little to no supervision”), comprising: 
receiving training data (section 1 para. 6 recites “We train our models using multiple datasets including CUHK Avenue [8], Subway (Enter and Exit) [11], and UCSD Pedestrian datasets (Ped1 and Ped2) [12], without compensating the dataset bias [13].” (i.e. receiving training data)); 
training a machine learning model using the training data (fig. 2 shows the training process on the left side of the figure), wherein the machine learning model comprises multiple layers and utilizes convolution (section 3.2.1 para 1-2 recites “Figure 4 illustrates the architecture of our fully convolutional autoencoder. The encoder consists of convolutional layers [31] and the decoder consists of deconvolutional layers that are the reverse of the encoder with padding removal at the boundary of images. We use three convolutional layers and two pooling layers on the encoder side and three deconvolutional layers and two unpooling layers on the decoder side by considering the size of input cuboid and training data” (i.e. the machine learning model has multiple layers and uses convolution)); 
receiving sensor data as input (section 1 para. 6 and figs. 8-12 recite the use of camera videos as input data to the model, the broadest reasonable interpretation of receiving sensor data includes receiving video input from a series of cameras); 
and detecting an anomaly in the sensor data using the trained machine learning model (section 4.5 para. 1 and Table 1 recite “As our model learns the temporal regularity, it can be used for detecting anomalous events in a weakly supervised manner.”).
Regarding claim 5, Hasan teaches the method of claim 1, wherein the anomaly is indicative of a sensed operational parameter of a machine having a value outside of a normal operating range (section 2 para. 3 recites “One of the applications of our model is abnormal or anomalous event detection. The survey paper [6] contains a comprehensive review of this topic. Most video-based anomaly detection approaches involve a local feature extraction step followed by learning a model on training video. Any event that is an outlier with respect to the learned model is regarded as the anomaly” (i.e. a value outside of a normal operating range). Section 1 para. 6 and figs. 8-12 recite the use of camera videos as input data to the model, the broadest reasonable interpretation of receiving sensor data includes receiving video input from a series of cameras.).
Claim 8 is a system claim and its limitation is included in claim 1. The only difference is that claim 8 requires a system (section 2 para. 5 recites “For an end-to-end learning system for regularity in videos, we employ the convolutional autoencoder”). Therefore, claim 8 is rejected for the same reasons as claim 1.
Claim 12 is a system claim and its limitation is included in claim 5. Claim 12 is rejected for the same reasons as claim 5.
Claim 15 is a computer program product claim and its limitation is included in claim 1. The only difference is that claim 15 requires a computer program product (section 4 para. 1 recites “We learn the model using multiple video datasets, totaling 1 hour 50 minutes, and evaluate our method both qualitatively and quantitatively. We modify and use Caffe [59] for all of our experiments on NVIDIA Tesla K80 GPUs”). Therefore, claim 15 is rejected for the same reasons as claim 1.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-3, 9-10, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Hasan et al* (“Learning Temporal Regularity in Video Sequences”, herein Hasan) in view of Ahmad et al* (“Real-Time Anomaly Detection for Streaming Analytics”, herein Ahmad).
*this document was listed in the IDS from 02/11/2019, therefore a copy has not been attached to this office action.
Regarding claim 2, Hasan teaches the method of claim 1.
However, Hasan does not explicitly teach wherein detecting the anomaly comprises: 
determining, using the trained machine learning model, a probability of a time series of the sensor data; and determining that the probability fails to satisfy a threshold value, wherein the method further comprises classifying the time series of the sensor data as anomalous data.
Ahmad teaches wherein detecting the anomaly comprises: 
determining, using the trained machine learning model, a probability of a time series of the sensor data (section 3.2 para. 2 recites “Rather than thresholding the raw score directly, we model the distribution of anomaly scores and use this distribution to check for the likelihood that the current state is anomalous. The anomaly likelihood is thus a metric defining how anomalous the current state is based on the prediction history of the HTM model” (i.e. determining the probability of the time series data)); 
and determining that the probability fails to satisfy a threshold value, wherein the method further comprises classifying the time series of the sensor data as anomalous data (section 3.2 para. 3 recites “We then compute a recent short-term average of anomaly scores, and apply a threshold to the Gaussian tail probability (Q-function, (Karagiannidis & Lioumpas, 2007)) to decide whether or not to declare an anomaly. We define the anomaly likelihood (Lt) as the complement of the tail probability. We threshold Lt and report an anomaly if it is very close to 1” (i.e. if the probability fails to satisfy a threshold the data is considered anomalous)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by adding the methods of determining the time series probability and the threshold comparisons from Ahmad to the anomaly detection methods from Hasan, as Hasan and Ahmad are both directed to detecting anomalies in time series data. Ahmad section 3.2 para. 1 recites “The raw anomaly score described above represents an instantaneous measure of the predictability of the current input stream. This works well for predictable scenarios but in many practical applications, the underlying system is inherently noisy and unpredictable. In these situations it is often the change in predictability that is indicative of anomalous behavior.” Therefore, one of ordinary skill would benefit from adding the anomaly likelihood metric analysis from Ahmad, as it would improve the performance of the methods from Hasan by making them more robust in less predictable scenarios.
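For illustration only, the thresholding described in the quoted Ahmad passages can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not taken from Ahmad or from the application: the function names, the window lengths, and the epsilon value are hypothetical, and the anomaly scores are assumed to be roughly Gaussian over the longer window.

```python
import math
from statistics import mean, pstdev


def anomaly_likelihood(scores, short_window=10, long_window=100):
    """Return L_t = 1 - Q(z) for the most recent anomaly scores.

    short_window and long_window are hypothetical choices, not values
    taken from the cited reference.
    """
    history = scores[-long_window:]
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # No variation in the history: treat the current state as neutral.
        return 0.5
    # Recent short-term average of the anomaly scores.
    short_mean = mean(scores[-short_window:])
    z = (short_mean - mu) / sigma
    # Gaussian tail probability Q(z) = 0.5 * erfc(z / sqrt(2)).
    tail = 0.5 * math.erfc(z / math.sqrt(2))
    # The likelihood is the complement of the tail probability.
    return 1.0 - tail


def is_anomalous(scores, epsilon=0.01, **windows):
    """Report an anomaly when the likelihood is within epsilon of 1."""
    return anomaly_likelihood(scores, **windows) >= 1.0 - epsilon


steady = [0.1, 0.2] * 50           # predictable stream of raw scores
spiking = steady + [1.0] * 10      # same stream followed by a burst
print(is_anomalous(steady), is_anomalous(spiking))
```

In this sketch a high likelihood, rather than a low one, triggers the anomaly report, which matches the quoted statement that an anomaly is reported when Lt is very close to 1.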
Regarding claim 3, Hasan teaches the method of claim 1, 

Ahmad teaches wherein detecting the anomaly comprises: 
determining, using the trained machine learning model, a probability of a time series value at an ith position of a time series of the sensor data (section 3.2 para. 2 recites “Rather than thresholding the raw score directly, we model the distribution of anomaly scores and use this distribution to check for the likelihood that the current state is anomalous. The anomaly likelihood is thus a metric defining how anomalous the current state is based on the prediction history of the HTM model” (i.e. determining the probability of the time series data – see 112(b) rejection of claim 3 for interpretation of the “ith position”)), wherein the ith position corresponds to a temporal location and a spatial dimension of the time series value (section 2 para. 8 recites “In this paper we focus on using Hierarchical Temporal Memory (HTM) for anomaly detection. HTM is a machine learning algorithm derived from neuroscience that models spatial and temporal patterns in streaming data” (i.e. spatial and temporal dimensions of the time series values are considered)); 
and determining that the probability fails to satisfy a threshold value, wherein the method further comprises classifying the time series value as anomalous data (section 3.2 para. 3 recites “We then compute a recent short-term average of anomaly scores, and apply a threshold to the Gaussian tail probability (Q-function, (Karagiannidis & Lioumpas, 2007)) to decide whether or not to declare an anomaly. We define the anomaly likelihood (Lt) as the complement of the tail probability. We threshold Lt and report an anomaly if it is very close to 1” (i.e. if the probability fails to satisfy a threshold the data is considered anomalous)).
See claim 2 for motivation to combine.
Claim 9 is a system claim and its limitation is included in claim 2. Claim 9 is rejected for the same reasons as claim 2.
Claim 10 is a system claim and its limitation is included in claim 3. Claim 10 is rejected for the same reasons as claim 3.
Claim 16 is a computer program product claim and its limitation is included in claim 2. Claim 16 is rejected for the same reasons as claim 2.
Claim 17 is a computer program product claim and its limitation is included in claim 3. Claim 17 is rejected for the same reasons as claim 3.

Claims 4, 6-7, 11, 13-14, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hasan et al* (“Learning Temporal Regularity in Video Sequences”, herein Hasan) in view of Xu et al* (“Bayesian Wavelet PCA Methodology for Turbomachinery Damage Diagnosis Under Uncertainty”, herein Xu).
*this document was listed in the IDS from 02/11/2019, therefore a copy has not been attached to this office action.
Regarding claim 4, Hasan teaches the method of claim 1, wherein the machine learning model uses deep learning and convolution (the abstract recites “we build a fully convolutional feed-forward autoencoder to learn both the local features and the classifiers as an end-to-end learning framework.” Section 3.2.1 para 1-2 recite “Figure 4 illustrates the architecture of our fully convolutional autoencoder. The encoder consists of convolutional layers [31] and the decoder consists of deconvolutional layers that are the reverse of the encoder with padding removal at the boundary of images. We use three convolutional layers and two pooling layers on the encoder side and three deconvolutional layers and two unpooling layers on the decoder side by considering the size of input cuboid and training data”).
However, Hasan does not explicitly teach wherein the machine learning model uses factor analysis.
Xu teaches wherein the machine learning model uses factor analysis (section 3.3 para 2 recites “The PPCA (probabilistic principal component analysis) is derived from a Gaussian latent variable model which is closely related to statistical factor analysis. The factor analysis is a mathematical technique widely used to reduce the number of variables (dimensionality reduction), while identifying the underlying factors that explain the correlations among multiple variables”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the probabilistic principal component analysis methods from Xu to reduce the dimensionality of the multivariate input data from Hasan. Xu and Hasan are both directed to detecting anomalies in time series data. One of ordinary skill would benefit from using the probabilistic principal component analysis methods from Xu to simplify the multivariate input data from Hasan, which 
Regarding claim 6, Hasan teaches the method of claim 1, wherein the training data comprises training time series data (section 1 para. 6 recites “We train our models using multiple datasets including CUHK Avenue [8], Subway (Enter and Exit) [11], and UCSD Pedestrian datasets (Ped1 and Ped2) [12], without compensating the dataset bias [13].” (i.e. time series training data));
iterating through the multiple layers of the machine learning model to recompute the training time series data and to estimate output parameters, determining that output criteria are satisfied, and outputting the output parameters associated with a most recent iteration (fig. 5 shows the relationship between the length of the input time series and the number of iterations required to reach convergence. The description of fig. 5 recites “Effect of temporal length (T) of input video cuboid. (Left) X-axis is the increasing number of iterations, Y-axis is the training loss, and three plots correspond to three different values of T. (Right) X-axis is the increasing number of video frames and Y-axis is the regularity score. As T increases, the training loss takes more iterations to converge as it is more likely that the inputs with more channels have more irregularity to hamper learning regularity. On the other hand, once the model is learned, the regularity score (i.e. the output) is more distinguishable for higher values of T between regular and irregular regions.” Examiner’s Note: one of ordinary skill would understand that convergence is another manner of determining that output criteria are satisfied). 

Xu teaches wherein training the machine learning model comprises: initializing the training time series data using principal component analysis to obtain multiple layers of the machine learning model (section 3.3 para. 1 recites “After the multivariate time series data are cleaned, the probabilistic principal component analysis (PPCA) approach is developed in this section to (1) reduce data dimensionality, (2) address the multivariate correlation, and (3) consider data uncertainty. Principal component analysis (PCA) [26] is a well-established statistical method for dimensionality reduction and has been widely applied in data compression, image processing, exploratory data analysis, pattern recognition, and time series prediction” (i.e. using principal component analysis on the training data)).
See claim 4 for motivation to combine.
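For illustration only, the Examiner's Note above (convergence as a manner of determining that output criteria are satisfied) can be sketched in Python as an iteration that stops once the training loss changes by less than a tolerance and then returns the parameters of the most recent iteration. This is a minimal sketch under stated assumptions and is not taken from Hasan, Xu, or the application: the function names, the tolerance, the learning rate, and the toy quadratic loss are hypothetical.

```python
def train_until_converged(step, params, tol=1e-6, max_iters=1000):
    """Run step() until the loss changes by less than tol.

    step is a hypothetical callable that takes the current parameters and
    returns (updated_params, loss). Returns the parameters, the loss, and
    the iteration count from the most recent iteration.
    """
    prev_loss = float("inf")
    loss = prev_loss
    iteration = 0
    for iteration in range(1, max_iters + 1):
        params, loss = step(params)
        # Output criterion: the loss has stopped changing appreciably.
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return params, loss, iteration


def toy_step(w, learning_rate=0.1, target=3.0):
    """One gradient-descent update on the toy loss (w - target) ** 2."""
    w = w - learning_rate * 2.0 * (w - target)
    return w, (w - target) ** 2


w_final, final_loss, n_iters = train_until_converged(toy_step, 0.0)
print(w_final, final_loss, n_iters)
```

The stopping test compares the current loss with the loss from the previous iteration, so the returned parameters are always those of the last iteration that ran.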
Regarding claim 7, the combination of Hasan and Xu teaches the method of claim 6, wherein determining that the output criteria are satisfied comprises: 
determining a current lower bound of the training time series data responsive, at least in part, to completion of the most recent iteration; and determining that a difference between the current lower bound of the training time series data and a lower bound associated with a previous iteration satisfies a threshold value (Hasan section 4.5 para. 2 recites “We find the local minimas in the time series of regularity scores to detect abnormal events. However, these local minima are very noisy and not all of them are meaningful local minima. We use the persistence1D [61] algorithm to identify meaningful local minima and span the region with a fixed temporal window (50 frames) and group nearby expanded local minimal regions when they overlap to obtain the final abnormal temporal regions. Specifically, if two local minima are within fifty frames of one another, they are considered to be a part of same abnormal event. We consider a detected abnormal region as a correct detection if it has at least fifty percent overlap with the ground truth” (i.e. determining a lower bound of the training time series data and comparing the difference between lower bounds to satisfy a threshold value)).
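For illustration only, the grouping of local minima described in the quoted Hasan passage can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not Hasan's implementation: a simple neighbor comparison with a score cutoff is swapped in for the persistence1D algorithm, the 50-frame window is assumed to be centered on each minimum, and the function names and the `max_score` cutoff are hypothetical.

```python
def local_minima(scores, max_score):
    """Indices of local minima whose score is at or below max_score.

    A plain neighbor comparison stands in for persistence1D here.
    """
    return [
        i
        for i in range(1, len(scores) - 1)
        if scores[i] < scores[i - 1]
        and scores[i] <= scores[i + 1]
        and scores[i] <= max_score
    ]


def abnormal_regions(scores, window=50, max_score=0.5):
    """Expand each minimum by a fixed window and merge overlapping spans."""
    half = window // 2
    regions = []
    for i in local_minima(scores, max_score):
        start = max(0, i - half)
        end = min(len(scores) - 1, i + half)
        if regions and start <= regions[-1][1]:
            # Overlaps the previous region: treat as the same abnormal event.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Regularity scores for 300 frames with three dips.
scores = [1.0] * 300
scores[100], scores[130], scores[250] = 0.2, 0.1, 0.3
print(abnormal_regions(scores))
```

With a centered 50-frame window, two minima within fifty frames of one another produce overlapping spans and are merged, which is consistent with the grouping rule stated in the quoted passage.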
Claim 11 is a system claim and its limitation is included in claim 4. Claim 11 is rejected for the same reasons as claim 4.
Claim 13 is a system claim and its limitation is included in claim 6. Claim 13 is rejected for the same reasons as claim 6.
Claim 14 is a system claim and its limitation is included in claim 7. Claim 14 is rejected for the same reasons as claim 7.
Claim 18 is a computer program product claim and its limitation is included in claim 4. Claim 18 is rejected for the same reasons as claim 4.
Claim 19 is a computer program product claim and its limitation is included in claim 6. Claim 19 is rejected for the same reasons as claim 6.
Claim 20 is a computer program product claim and its limitation is included in claim 7. Claim 20 is rejected for the same reasons as claim 7.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
“Toward Automated Anomaly Identification in Large-scale Systems” (Lan et al) teaches using principal component analysis (PCA) and independent component analysis (ICA) for feature extraction in order to detect anomalies in time series data.
“Generic and Scalable Framework for Automated Time-series Anomaly Detection” (Laptev et al) teaches a combination of anomaly detection and forecasting models with an anomaly filtering layer for accurate and scalable anomaly detection on time-series data.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEAH M FEITL whose telephone number is (571)272-8350. The examiner can normally be reached on M-F 0800-1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen can be reached on (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications 
/L.M.F./ Examiner, Art Unit 2121


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121