DETAILED ACTION
Claims 1-15 have been examined.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Duplicate Claims
Applicant is advised that should claim 1 be found allowable, claim 9 will be objected to under 37 CFR 1.75 as being a substantial duplicate thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-3 and 5-15 are provisionally rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1, 5-8, and 10 of US Patent Application No. 17273709 (reference application). 

Although the conflicting claims are not identical, they are not patentably distinct from each other. Below is a side by side comparison of representative claim 1 of the instant application and representative claim 1 of US Patent Application No. 17273709.
Claim 1 of instant application (lines marked [Instant]) compared with claim 1 of 17273709 (lines marked [‘709]):

[Instant] 1. A computer-implemented method for data analysis, comprising:
[‘709] 1. A computer-implemented method for anomaly detection in an entity of interest comprising:

[Instant] obtaining a deep neural network for processing data and at least a part of a training dataset used for training the deep neural network, the deep neural network comprising a plurality of hidden layers, the training dataset including observations that are input to the deep neural network;
[‘709] (R1, label added for reference below) receiving a new observation said new observation characterizing at least one parameter of the entity;
[‘709] (R2) inputting the new observation to a deep neural network (100), the deep neural network (100) having a plurality of hidden layers and being trained using a training data set that includes possible observations that can be input to the deep neural network (100);
[‘709] (R3) obtaining a second set of intermediate output values that are output from at least one of the plurality of hidden layers of the deep neural network (100) by inputting the received new observation to the deep neural network (100);
[‘709] (R4) mapping, using a latent variable model stored in a storage medium, the second set of intermediate output values to a second set of projected values;
[‘709] (R5) determining whether or not the received new observation is an outlier with respect to the training dataset based on the latent variable model and the second set of projected values,
[‘709] calculating, by the deep neural network (100), a prediction for the new observation; and
[‘709] determining a result indicative of the occurrence of at least one anomaly in the entity based on the prediction and the determination whether or not the new observation is an outlier;
[‘709] wherein the latent variable model stored in the storage medium is constructed by:

[Instant] obtaining first sets of intermediate output values that are output from at least one of the plurality of hidden layers, each of the first sets of intermediate output values obtained by inputting a different one of the observations included in said part of the training dataset;
[‘709] obtaining first sets of intermediate output values that are output from said one of the plurality of hidden layers of the deep neural network (100), each of the first sets of intermediate output values obtained by inputting a different one of the possible observations included in at least a part of the training dataset;

[Instant] constructing, via at least one processor, a latent variable model using the first sets of intermediate output values, the latent variable model providing a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values;
[‘709] constructing the latent variable model using the first sets of intermediate output values, the latent variable model providing a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space of the latent variable model that has a dimension lower than a dimension of the sets of the intermediate outputs.

[Instant] receiving an observation to be input to the deep neural network;
[‘709] (R1, recited above, reproduced for comparison) receiving a new observation said new observation characterizing at least one parameter of the entity;
[‘709] (R2) inputting the new observation to a deep neural network (100)

[Instant] obtaining a second set of intermediate output values that are output from at least one of said plurality of hidden layers by inputting the received observation to the deep neural network;
[‘709] (R3) obtaining a second set of intermediate output values that are output from at least one of the plurality of hidden layers of the deep neural network (100) by inputting the received new observation to the deep neural network (100);

[Instant] mapping, using the latent variable model, the second set of intermediate output values to a second set of projected values; and
[‘709] (R4) mapping, using a latent variable model stored in a storage medium, the second set of intermediate output values to a second set of projected values;

[Instant] determining, via the processor, whether the received observation is an outlier with respect to the training dataset based on the latent variable model and the second set of projected values.
[‘709] (R5) determining whether or not the received new observation is an outlier with respect to the training dataset based on the latent variable model and the second set of projected values,

As shown in the comparison above, the elements of claim 1 of the instant application are recited in claim 1 of the ‘709 application, although in a different order. Elements of claim 2 of the instant application are recited in claim 10 of the ‘709 application. Elements of claim 3 of the instant application are recited in claim 5 of the ‘709 application. Elements of claims 5-7 of the instant application are recited in claims 6-8 of the ‘709 application. Elements of claims 8 and 9 of the instant application are recited in claim 1 of the ‘709 application. Elements of claim 10 of the instant application are recited in claim 1 of the ‘709 application. Claim 10 of the instant application recites a computer program product comprising computer-readable instructions, which would have been obvious in a computer-implemented method as recited in claim 1 of the ‘709 application. Elements of claims 11-15 of the instant application are recited in claims 1, 2, 5, 6, and 8 of the ‘709 application. Claims 11-15 recite a system comprising a storage medium and a processor, which would have been obvious in a computer-implemented method as recited in claim 1 of the ‘709 application.

Claim 4 is provisionally rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claim 4 of US Patent Application No. 17273709 (reference application), in view of Said et al. “Data Preprocessing for Distance-based Unsupervised Intrusion Detection” (hereinafter Said).

Claim 4 of the instant application differs from claim 4 of the ‘709 application in that claim 4 of the instant application recites the distance is Mahalanobis distance, which is not recited in claim 4 of the ‘709 application. However, this difference would have been obvious to one of ordinary skill in the art because it is obvious to use known distance metrics in the art, and Mahalanobis distance is used to take the dependency among variables into account (see at least page 4, section II.C of Said).

This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claim 10 recites a “computer program product”; however, it appears that the computer program product would reasonably be interpreted by one of ordinary skill in the art as software per se, because the computer-readable instructions recited as part of the computer program product would themselves reasonably be interpreted as software per se. As such, it is believed that the computer program product of claim 10 is reasonably interpreted as functional descriptive material per se: it is not tangibly embodied and does not include any recited hardware as part of the computer program product, and therefore does not fall within a statutory category of invention.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 7-13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Guo et al. “An Anomaly Detection Framework Based on Autoencoder and Nearest Neighbor” (hereinafter Guo), in view of Said et al. “Data Preprocessing for Distance-based Unsupervised Intrusion Detection” (hereinafter Said).

As per claim 1, Guo teaches a computer-implemented method for data analysis, comprising: 
obtaining a deep neural network for processing data and at least a part of a training dataset used for training the deep neural network (i.e., autoencoder model will be trained on normal data, see at least page 1, section II, page 2, section II.A), the deep neural network comprising a plurality of hidden layers (i.e., autoencoder is a multi-layer forward neural network, see at least page 2, section II.A, page 3, Fig. 1), the training dataset including observations that are input to the deep neural network (i.e., autoencoder is to receive the original data as the initial network input, see at least page 1, section II, page 2, section II.A, page 3, Fig. 1, Algorithm 1); 
obtaining first sets of intermediate output values that are output from at least one of the plurality of hidden layers, each of the first sets of intermediate output values obtained by inputting a different one of the observations included in said part of the training dataset (i.e., we can obtain the hidden representation vector in the middlemost network layer which represents the deep non-linear feature characteristic of the original data, see at least page 2, section II.B);
constructing, via at least one processor, a latent variable model using the first sets of intermediate output values (i.e., we use hidden representation vector to train an anomaly detection model, see at least page 2, section II.B, page 3, Fig. 1, Algorithm 1); 
receiving an observation to be input to the deep neural network (i.e., testing using testing dataset, see at least page 1, section II, page 2, section II.A, page 4, section IV.A); 
obtaining a second set of intermediate output values that are output from at least one of said plurality of hidden layers by inputting the received observation to the deep neural network (i.e., testing using testing dataset or performing anomaly detection with the anomaly detection framework, see at least page 1, section II, page 2, section II.A, page 4, section IV.A); and
determining, via the processor, whether the received observation is an outlier with respect to the training dataset based on the latent variable model (i.e., detecting outlier in the testing dataset or for anomaly detection, see at least page 1, section II, page 2, sections II.A, II.B, page 4, section IV.A).
Guo does not explicitly teach the latent variable model providing a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values, mapping, using the latent variable model, the second set of intermediate output values to a second set of projected values, determining whether the received observation is an outlier based on the second set of projected values.
Said teaches providing a mapping of first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B);
mapping the second set of intermediate output values to a second set of projected values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B);
determining, via the processor, whether the received observation is an outlier based on the second set of projected values (i.e., observations are ranked according to their outlying scores and the top M observations are declared as intrusion, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B, page 5, section III).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the latent variable model provides a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values, maps the second set of intermediate output values to a second set of projected values, and determines whether the received observation is an outlier based on the second set of projected values as similarly taught by Said because reducing dimensionality of data will decrease computational time as well as the memory requirement (see at least pages 3-4, section B of Said).
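For illustration only, and not as a reproduction of the method of Guo, Said, or the instant application, the projection step discussed above may be sketched as follows. The sketch assumes that the intermediate output values are short numeric vectors, that a single principal component is retained, and that the component is estimated by power iteration; the function names and the sample values are hypothetical.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance(rows, mu):
    """Sample covariance matrix (divides by n - 1)."""
    n, d = len(rows), len(mu)
    centered = [[x - m for x, m in zip(row, mu)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iters=200):
    """First principal direction by power iteration. Assumes a nonzero
    covariance and a start vector not orthogonal to that direction."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(x, mu, component):
    """Map a vector to its coordinate in the one-dimensional sub-space."""
    return sum((xi - mi) * ci for xi, mi, ci in zip(x, mu, component))

# Hypothetical "first sets" of intermediate output values (training data).
first_sets = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0], [4.0, 4.0, 0.0]]
mu = mean_vector(first_sets)
component = leading_component(covariance(first_sets, mu))
first_projected = [project(x, mu, component) for x in first_sets]

# Hypothetical "second set" obtained for a newly received observation.
second_set = [5.0, 5.0, 0.0]
second_projected = project(second_set, mu, component)
```

In this sketch each three-dimensional vector is reduced to one projected value, which is the sense in which the sub-space has a lower dimension than the intermediate output values.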

As per claim 2, Guo does not explicitly teach wherein the latent variable model is constructed according to principal component analysis.
Said teaches principal component analysis used in outlier detection (see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the latent variable model is constructed according to principal component analysis as similarly taught by Said because the latent variable model of Guo uses k nearest neighbor and Said teaches applying principal component analysis before applying k nearest neighbor, where PCA reduces dimensionality of data which will decrease computational time as well as the memory requirement (see at least pages 3-4, section B of Said).

As per claim 3, Guo teaches wherein determining whether the received observation is an outlier comprises: calculating a distance of the second set of values with respect to a distribution of the first sets of values (i.e., utilizing the distance of a data point from its k-th nearest neighbor, see at least page 2, section II.B); and 
determining that the received observation is an outlier with respect to the training dataset when the calculated distance is larger than a threshold value for the distance (i.e., assert the top n points in the ranking list as the outliers, see at least page 2, section II.B).
Guo does not explicitly teach Mahalanobis distance and projected values.
Said teaches Mahalanobis distance (see at least page 4, section II.C, page 5, section II.D) and projected values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the distance is Mahalanobis distance as similarly taught by Said because it is obvious to use known distance metrics in the art, and Mahalanobis distance is used to take the dependency among variables into account (see at least page 4, section II.C of Said). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the value is projected value as similarly taught by Said because reducing dimensionality of data will decrease computational time as well as the memory requirement (see at least pages 3-4, section B of Said).
As per claim 4, Guo teaches wherein the threshold value for the distance is determined based on distances, each of which is calculated for a different one of the first sets of projected values with respect to the distribution of the first sets of values (i.e., utilizing the distance of a data point from its k-th nearest neighbor, assert the top n points in the ranking list as the outliers, see at least page 2, section II.B).
Guo does not explicitly teach Mahalanobis distance and projected values.
Said teaches Mahalanobis distance (see at least page 4, section II.C, page 5, section II.D) and projected values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the distance is Mahalanobis distance as similarly taught by Said because it is obvious to use known distance metrics in the art, and Mahalanobis distance is used to take the dependency among variables into account (see at least page 4, section II.C of Said). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the value is projected value as similarly taught by Said because reducing dimensionality of data will decrease computational time as well as the memory requirement (see at least pages 3-4, section B of Said).
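For illustration only, the distance test addressed for claims 3 and 4 may be sketched as follows. The sketch assumes two-dimensional projected values, a sample covariance that is invertible, and a threshold set to the largest Mahalanobis distance observed among the first sets of projected values; that threshold rule, the function names, and the sample values are assumptions and are not taken from Guo or Said.

```python
import math

def fit_distribution(points):
    """Mean and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(p, mean, cov):
    """sqrt((p - mean)^T S^-1 (p - mean)) using the closed-form 2x2 inverse."""
    (mx, my), (sxx, sxy, syy) = mean, cov
    det = sxx * syy - sxy * sxy  # assumed nonzero
    dx, dy = p[0] - mx, p[1] - my
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))

# Hypothetical first sets of projected values (from training observations).
first_projected = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
mean, cov = fit_distribution(first_projected)

# Threshold derived from the distances of the training points themselves.
training_distances = [mahalanobis(p, mean, cov) for p in first_projected]
threshold = max(training_distances)

def is_outlier(second_projected):
    """Flag a new projected value whose distance exceeds the threshold."""
    return mahalanobis(second_projected, mean, cov) > threshold
```

Because the covariance enters the distance, correlated variables are weighted jointly instead of independently, which is the property of Mahalanobis distance relied upon in the rationale above.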

As per claim 7, Guo teaches wherein obtaining the first sets of intermediate output values and constructing the latent variable model are performed for two or more of the plurality of hidden layers (i.e., the original data are continuously compressed by several hidden layers to a more compact vector, we can obtain the hidden representation vector in the middlemost network layer which represents the deep non-linear feature characteristic of the original data, see at least page 2, sections II.A, II.B); 
wherein obtaining the second set of intermediate output values and mapping the second set of intermediate output values to the second set of values are performed for said two or more of the plurality of hidden layers (i.e., testing using testing dataset, see at least page 1, section II, page 2, section II.A, page 4, section IV.A); and 
wherein determining whether the received observation is an outlier is performed based on the latent variable model and the second set of projected values obtained for said two or more of the plurality of hidden layers (i.e., detecting outlier in the testing dataset, see at least page 1, section II, page 2, sections II.A, II.B, page 4, section IV.A).
Guo does not explicitly teach projected values.
Said teaches projected values (see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B).

As per claim 8, Guo teaches a computer-implemented method comprising: 
obtaining a deep neural network for processing data and at least a part of a training dataset used for training the deep neural network (i.e., autoencoder model will be trained on normal data, see at least page 1, section II, page 2, section II.A), the deep neural network comprising a plurality of hidden layers (i.e., autoencoder is a multi-layer forward neural network, see at least page 2, section II.A, page 3, Fig. 1), the training dataset including observations that are input to the deep neural network (i.e., autoencoder is to receive the original data as the initial network input, see at least page 1, section II, page 2, section II.A, page 3, Fig. 1, Algorithm 1); 
obtaining first sets of intermediate output values that are output from at least one of the plurality of hidden layers, each of the first sets of intermediate output values obtained by inputting a different one of the observations included in said part of the training dataset (i.e., we can obtain the hidden representation vector in the middlemost network layer which represents the deep non-linear feature characteristic of the original data, see at least page 2, section II.B); 
constructing a latent variable model using the first sets of intermediate output values (i.e., we use hidden representation vector to train an anomaly detection model, see at least page 2, section II.B, page 3, Fig. 1, Algorithm 1); and 
storing the latent variable model and the first sets of values in a storage medium (see at least page 2, section II.B, page 3, Fig. 1, Algorithm 1).
Guo does not explicitly teach the latent variable model providing a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values.
Said teaches providing a mapping of first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the latent variable model provides a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values as similarly taught by Said because reducing dimensionality of data will decrease computational time as well as the memory requirement (see at least pages 3-4, section B of Said).
As per claim 9, Guo teaches a computer-implemented method comprising: 
receiving an observation to be input to a deep neural network for processing data (i.e., testing or anomaly detection using an anomaly detection framework which integrates an autoencoder model, see at least page 1, section II, page 2, section II.A, page 4, section IV.A), the deep neural network having a plurality of hidden layers and trained using a training data set that includes observations that are input to the deep neural network (i.e., autoencoder is a multi-layer forward neural network, autoencoder model will be trained on normal data, see at least page 1, section II, page 2, section II.A, page 3, Fig. 1); 
obtaining a second set of intermediate output values that are output from at least one of the plurality of hidden layers by inputting the received observation to the deep neural network (i.e., testing using testing dataset, see at least page 1, section II, page 2, section II.A, page 4, section IV.A); and 
determining whether the received observation is an outlier with respect to the training data set based on the latent variable model and the second set of values (i.e., detecting outlier in the testing dataset or for anomaly detection, see at least page 1, section II, page 2, sections II.A, II.B, page 4, section IV.A), 
wherein the latent variable model stored in the storage medium is constructed by:
obtaining first sets of intermediate output values that are output from said one of the plurality of hidden layers, each of the first sets of intermediate output values obtained by inputting a different one of the observations included in at least a part of the training dataset (i.e., we can obtain the hidden representation vector in the middlemost network layer which represents the deep non-linear feature characteristic of the original data, see at least page 2, section II.B); and 
constructing the latent variable model using the first sets of intermediate output values (i.e., we use hidden representation vector to train an anomaly detection model, see at least page 2, section II.B, page 3, Fig. 1, Algorithm 1).
Guo does not explicitly teach mapping, using a latent variable model stored in a storage medium, the second set of intermediate output values to a second set of projected values, determining whether the received observation is an outlier based on the second set of projected values, and the latent variable model providing a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values.
Said teaches mapping the second set of intermediate output values to a second set of projected values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B);
determining, via the processor, whether the received observation is an outlier based on the second set of projected values (i.e., observations are ranked according to their outlying scores and the top M observations are declared as intrusion, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B, page 5, section III);
providing a mapping of first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values (i.e., performing PCA to project the data into a space of fewer dimensions, see at least page 2, section II, page 3, Fig. 1, pages 3-4, section B).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the latent variable model provides a mapping of the first sets of intermediate output values to first sets of projected values in a sub-space that has a dimension lower than a dimension of the first sets of the intermediate output values, maps the second set of intermediate output values to a second set of projected values, and determines whether the received observation is an outlier based on the second set of projected values as similarly taught by Said because reducing dimensionality of data will decrease computational time as well as the memory requirement (see at least pages 3-4, section B of Said).

As per claim 10, this is the computer program product claim of claim 1.  Therefore, claim 10 is rejected using the same reasons as claim 1.

As per claims 11-13 and 15, these are the system claims of claims 1-3 and 7.  Therefore, claims 11-13 and 15 are rejected using the same reasons as claims 1-3 and 7.

Claims 5, 6, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Guo, in view of Said, further in view of Yu et al. “Recursive Principal Component Analysis-Based Data Outlier Detection and Sensor Data Aggregation in IoT Systems” (hereinafter Yu).

As per claim 5, Guo does not explicitly teach wherein determining whether the received observation is an outlier comprises: determining an approximate set of intermediate output values corresponding to the second set of intermediate output values, using the latent variable model and the second set of projected values; calculating a squared approximation residual for the second set of intermediate output values and the approximate set of intermediate output values; and determining that the received observation is an outlier with respect to the training dataset when the calculated squared approximation residual is larger than a threshold value for the squared approximation residual.
Yu teaches determining whether the received observation is an outlier comprises: determining an approximate set of intermediate output values corresponding to the second set of intermediate output values, using the latent variable model and the second set of projected values (i.e., dimension of sensor data can be aggregated by projecting raw data into subspace defined by PC, all parameters in the PCA model are recursively updated to adapt to the variations in the IoT system, see at least page 2209, section IV., page 2211, section 2); 
calculating a squared approximation residual for the second set of intermediate output values and the approximate set of intermediate output values (i.e., square prediction error (SPE) score is defined as the square of residual value after extract of PCs, see at least page 2208, left column, paragraph 2, page 2209, section III); and 
determining that the received observation is an outlier with respect to the training dataset when the calculated squared approximation residual is larger than a threshold value for the squared approximation residual (i.e., if SPE(t) is out of the estimated range, outlier is detected, see at least page 2211, section b, page 2212, Algorithm 1).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that determining whether the received observation is an outlier comprises: determining an approximate set of intermediate output values corresponding to the second set of intermediate output values, using the latent variable model and the second set of projected values; calculating a squared approximation residual for the second set of intermediate output values and the approximate set of intermediate output values; and determining that the received observation is an outlier with respect to the training dataset when the calculated squared approximation residual is larger than a threshold value for the squared approximation residual as similarly taught by Yu because data outliers can generate dramatic variations in the residual value after extraction of PCs and the SPE score defined from the residual value is sensitive to data outliers (see at least page 2208, right column, paragraph 4, page 2209, right column, paragraph 5 of Yu).
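For illustration only, the three steps recited above (approximating the values from their projection, calculating a squared residual, and comparing it to a threshold) can be sketched as follows. This is a minimal sketch that assumes an ordinary, non-recursive PCA as the latent variable model; the function names, the toy data, and the threshold value of 1.0 are hypothetical and are not taken from Guo, Said, Yu, or the claims.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a plain PCA model (assumed stand-in for the latent variable model)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def squared_residual(x, mean, components):
    """Squared approximation residual (an SPE-style score) for one observation."""
    projected = (x - mean) @ components.T    # projected values
    approx = mean + projected @ components   # approximate set of values
    return float(np.sum((x - approx) ** 2))

def is_outlier(x, mean, components, threshold):
    """Flag the observation when its squared residual exceeds the threshold."""
    return squared_residual(x, mean, components) > threshold

# Hypothetical training data lying exactly on a line through the origin.
X_train = np.outer(np.linspace(-1.0, 1.0, 50), [1.0, 2.0, 2.0])
mean, components = fit_pca(X_train, n_components=1)

on_line = np.array([0.5, 1.0, 1.0])    # consistent with the training data
off_line = np.array([2.0, -1.0, 0.0])  # orthogonal to the training direction

print(is_outlier(on_line, mean, components, threshold=1.0))
print(is_outlier(off_line, mean, components, threshold=1.0))
```

In this toy setup the second observation has no component along the learned direction, so its entire squared norm remains as residual and it is flagged, while the first observation is reconstructed almost exactly.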

As per claim 6, Guo does not explicitly teach wherein the threshold value for the squared approximation residual is determined based on squared approximation residuals, each of which is calculated for a different one of the first sets of intermediate output values and the approximate set of intermediate output values corresponding to one of said first sets of intermediate output values.
Yu teaches the threshold value for the squared approximation residual is determined based on squared approximation residuals, each of which is calculated for a different one of the first sets of intermediate output values and the approximate set of intermediate output values corresponding to one of said first sets of intermediate output values (i.e., it is still an adaptive threshold, since the mean value and standard deviation of SPE score are kept updating with the newly collected sensor data, see at least page 2211, section 2). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Guo such that the threshold value for the squared approximation residual is determined based on squared approximation residuals, each of which is calculated for a different one of the first sets of intermediate output values and the approximate set of intermediate output values corresponding to one of said first sets of intermediate output values as similarly taught by Yu because data outliers can generate dramatic variations in the residual value after extraction of PCs and the SPE score defined from the residual value is sensitive to data outliers and to allow the threshold to be an adaptive threshold that is updated with newly collected data (see at least page 2208, right column, paragraph 4, page 2209, right column, paragraph 5 of Yu).
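For illustration only, a threshold derived from a collection of squared residuals, in the spirit of the mean and standard deviation of the SPE score noted above, can be sketched as follows. The function name, the multiplier k, and its default of 3.0 are assumptions made for this sketch; they are not the specific rule or parameters of Yu or of the claims.

```python
import numpy as np

def residual_threshold(training_residuals, k=3.0):
    """Threshold = mean + k * standard deviation of the training residuals.

    k is a hypothetical tuning parameter; the population standard deviation
    (NumPy's default, ddof=0) is used.
    """
    r = np.asarray(training_residuals, dtype=float)
    return float(r.mean() + k * r.std())

# Hypothetical squared residuals computed for each training observation.
training_residuals = [0.10, 0.12, 0.09, 0.11, 0.08]
threshold = residual_threshold(training_residuals)

new_residual = 0.50  # hypothetical residual of a newly received observation
print(new_residual > threshold)
```

Recomputing the threshold as new residuals are appended would make it adaptive in the sense described, although the update schedule is left open here.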

As per claim 14, this is the system claim corresponding to claim 5. Therefore, claim 14 is rejected for the same reasons as claim 5.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Mailh et al. (US 11,062,229) is cited to teach training latent variable machine learning models.
Odry et al. (US 2020/0020098) is cited to teach a first neural network being configured to reduce dimensionality of input data to a latent space and a second neural network to classify the latent variables.
Liu et al. “Neural networks with enhanced outlier rejection ability for off-line handwritten word recognition”, 2002, Pattern Recognition 35. This document is cited to teach outlier recognition using a variant of the radial basis function network that uses principal component analysis on the clusters defined by the nodes in the hidden layer.
Goldberger et al. “Neighborhood Component Analysis”, 2004, Advances in Neural Information Processing Systems 17. This document is cited to teach learning a Mahalanobis distance measure to be used in the KNN classification algorithm.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jue Louie, whose telephone number is 571-270-1655. The examiner can normally be reached M-F, 9:30 am - 5:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.



Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Jue Louie/
Primary Examiner
Art Unit 2121