Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant’s Amendment, filed September 16, 2022, has been fully considered and entered.  Accordingly, Claims 1-22 are pending in this application.  Claims 2 and 12 have been cancelled.  Claims 21 and 22 are new.  Claims 1, 11, and 20 are independent claims and have been amended.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 11, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Stocker (PG Pub. No. 2021/0034994 A1) and further in view of Ben Simhon (PG Pub. No. 2016/0210556 A1).
Regarding Claim 1, Stocker discloses a computer-implemented method for alerting metric baseline behavior change, comprising:
Stocker does not disclose:
performing an autoencoding procedure for a time-series data;
determining whether one or more change points occur in a seasonal pattern of the time-series data based on the autoencoding procedure; and
 transmitting, to a user, an alert indicating the one or more change points based on a determination that the one or more change points occur in the seasonal pattern of the time-series data.
The combination of Stocker and Ben Simhon discloses:
performing an autoencoding procedure for a time-series data (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics);
determining whether one or more change points occur in a seasonal pattern of the time-series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers) based on the autoencoding procedure (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics); and
transmitting, to a user, an alert (see Ben Simhon, paragraph [0053], where anomaly data is transmitted to a reporting system 112, which can generate anomaly alerts to system administrators) indicating the one or more change points based on a determination that the one or more change points occur in the seasonal pattern of the time-series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers).
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Ben Simhon for the benefit of generating automatic alerts of changed seasonal data (see Ben Simhon, paragraph [0005]).
Regarding Claim 11, Stocker discloses a computing device for managing a knowledge graph, comprising:
a memory storing one or more parameters or instructions for identifying related signals from a service event repository (see Stocker, paragraph [0027], where the application load balancer 12 may include one or more processing devices and/or one or more memory devices); and
at least one processor coupled to the memory (see Stocker, paragraph [0027], where the application load balancer 12 may include one or more processing devices and/or one or more memory devices), wherein the at least one processor is configured to:
Stocker does not disclose:
performing an autoencoding procedure for a time-series data;
determining whether one or more change points occur in a seasonal pattern of the time-series data based on the autoencoding procedure; and
 transmitting, to a user, an alert indicating the one or more change points based on a determination that the one or more change points occur in the seasonal pattern of the time-series data.
The combination of Stocker and Ben Simhon discloses:
performing an autoencoding procedure for a time-series data (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics);
determining whether one or more change points occur in a seasonal pattern of the time-series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers) based on the autoencoding procedure (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics); and
transmitting, to a user, an alert (see Ben Simhon, paragraph [0053], where anomaly data is transmitted to a reporting system 112, which can generate anomaly alerts to system administrators) indicating the one or more change points based on a determination that the one or more change points occur in the seasonal pattern of the time-series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers).
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Ben Simhon for the benefit of generating automatic alerts of changed seasonal data (see Ben Simhon, paragraph [0005]).
Regarding Claim 20, Stocker discloses a non-transitory computer-readable medium, comprising code executable by one or more processors for managing a knowledge graph, the code comprising code for:
Stocker does not disclose:
performing an autoencoding procedure for a time-series data;
determining whether one or more change points occur in a seasonal pattern of the time-series data based on the autoencoding procedure; and
 transmitting, to a user, an alert indicating the one or more change points based on a determination that the one or more change points occur in the seasonal pattern of the time-series data.
The combination of Stocker and Ben Simhon discloses:
performing an autoencoding procedure for a time-series data (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics);
determining whether one or more change points occur in a seasonal pattern of the time-series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers) based on the autoencoding procedure (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics); and
transmitting, to a user, an alert (see Ben Simhon, paragraph [0053], where anomaly data is transmitted to a reporting system 112, which can generate anomaly alerts to system administrators) indicating the one or more change points based on a determination that the one or more change points occur in the seasonal pattern of the time-series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers).
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Ben Simhon for the benefit of generating automatic alerts of changed seasonal data (see Ben Simhon, paragraph [0005]).
Regarding Claim 21, Stocker in view of Ben Simhon discloses the computer-readable medium of Claim 20, wherein:
Stocker does not disclose that the autoencoding procedure detects the one or more change points by examining the Euclidean distance between two adjacent encoded periods in the time series data.  The combination of Stocker and Ben Simhon discloses the autoencoding procedure (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics) detects the one or more change points (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers) by examining the Euclidean distance between two adjacent encoded periods in the time series data (see Ben Simhon, paragraph [0015], where the system includes a correlation module configured to compute a similarity value between the candidate pair of metrics. In other features, the similarity value is one of a Euclidean distance).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Ben Simhon for the benefit of generating automatic alerts of changed seasonal data (see Ben Simhon, paragraph [0005]).
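For illustration only, and without characterizing the claimed invention or the disclosure of either reference, the limitation of examining the Euclidean distance between two adjacent encoded periods can be sketched as follows. The function names, the sample encodings, and the fixed threshold are hypothetical assumptions introduced for this sketch; neither the claims nor the cited paragraphs specify them.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def change_points(encoded_periods, threshold):
    """Return each index i whose encoding lies farther than `threshold`
    from the encoding of the immediately preceding period (i - 1)."""
    return [
        i
        for i in range(1, len(encoded_periods))
        if euclidean(encoded_periods[i - 1], encoded_periods[i]) > threshold
    ]


# Hypothetical two-dimensional encodings, one vector per period.
encoded = [[0.0, 0.0], [0.1, 0.0], [0.1, 0.1], [3.0, 4.0], [3.0, 4.1]]

# Assumed threshold; a real system would have to choose or learn this value.
print(change_points(encoded, threshold=1.0))
```

In this sketch only the jump between the third and fourth encodings exceeds the assumed threshold, so a single index is reported.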
Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Stocker and Ben Simhon as applied to Claims 1, 11, 20, and 21 above, and further in view of Yeung (PG Pub. No. 2021/0042964 A1).
Regarding Claim 3, Stocker in view of Ben Simhon discloses the computer-implemented method of Claim 1, wherein performing the autoencoding procedure further comprises:
wherein determining whether the one or more change points occur in the seasonal pattern of the time-series data further comprises determining whether the one or more change points occur in the seasonal pattern of the time-series data based on the plurality of low-dimensional vectors (see Stocker, paragraph [0013], where a change-point may include a point in a variable or set of variables where there is an observed change in the trend of values; see also paragraph [0004], where each time-varying data point of the plurality of time-varying data points includes at least one variable of at least one dimension).
Stocker does not disclose:
generating, by an autoencoder, a plurality of low-dimensional vectors using temporal regularization, wherein each of the plurality of low-dimensional vectors correspond to a period in the time-series data.
Yeung discloses generating, by an autoencoder, a plurality of low-dimensional vectors using temporal regularization, wherein each of the plurality of low-dimensional vectors correspond to a period in the time-series data (see Yeung, paragraph [0261], where Auto encoder 2040 comprises a deep-learning based algorithm implemented to encode sparsely distributed point cloud data with temporal regularization in order to minimize the inconsistency of encoded data in an output sequence. In one embodiment, auto encoder 2040 uses convolutional autoencoders (CA) to perform dimensionality reduction).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Yeung for the benefit of minimizing the inconsistency of encoded data in an output sequence (see Yeung, paragraph [0261]).
Regarding Claim 13, Stocker in view of Ben Simhon discloses the computing device of Claim 11, wherein the at least one processor configured to perform the autoencoding procedure is further configured to:
wherein determining whether the one or more change points occur in the seasonal pattern of the time-series data further comprises determining whether the one or more change points occur in the seasonal pattern of the time-series data based on the plurality of low-dimensional vectors (see Stocker, paragraph [0013], where a change-point may include a point in a variable or set of variables where there is an observed change in the trend of values; see also paragraph [0004], where each time-varying data point of the plurality of time-varying data points includes at least one variable of at least one dimension).
Stocker does not disclose:
generating, by an autoencoder, a plurality of low-dimensional vectors using temporal regularization, wherein each of the plurality of low-dimensional vectors correspond to a period in the time-series data.
Yeung discloses generating, by an autoencoder, a plurality of low-dimensional vectors using temporal regularization, wherein each of the plurality of low-dimensional vectors correspond to a period in the time-series data (see Yeung, paragraph [0261], where Auto encoder 2040 comprises a deep-learning based algorithm implemented to encode sparsely distributed point cloud data with temporal regularization in order to minimize the inconsistency of encoded data in an output sequence. In one embodiment, auto encoder 2040 uses convolutional autoencoders (CA) to perform dimensionality reduction).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Yeung for the benefit of minimizing the inconsistency of encoded data in an output sequence (see Yeung, paragraph [0261]).
Claims 4-6 and 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Stocker, Ben Simhon, and Yeung, as applied to Claims 3 and 13 above, and further in view of Mestha (PG Pub. No. 2018/0159879 A1).
Regarding Claim 4, Stocker in view of Ben Simhon and Yeung discloses the computer-implemented method of Claim 3, wherein generating the plurality of low-dimensional vectors using the temporal regularization further comprises:
Stocker does not disclose:
generating, by an encoder, an input vector for each period of the time-series data;
calculating a minimized summated difference between two consecutive encoded periods of the time-series data;
calculating a summated difference between two consecutive encoded periods of the time-series data; and
generating, by a decoder, the plurality of low-dimensional vectors based on the minimized summated difference between each period of the time-series data and the reconstructed version of the input vector and the summated difference between the two consecutive encoded periods of the input time-series data.
Yeung discloses generating, by an encoder, an input vector for each period of the time-series data (see Yeung, paragraph [0261], where Auto encoder 2040 comprises a deep-learning based algorithm implemented to encode sparsely distributed point cloud data with temporal regularization in order to minimize the inconsistency of encoded data in an output sequence. In one embodiment, auto encoder 2040 uses convolutional autoencoders (CA) to perform dimensionality reduction).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Yeung for the benefit of minimizing the inconsistency of encoded data in an output sequence (see Yeung, paragraph [0261]).
Stocker in view of Yeung does not disclose the following limitations, which are disclosed by Mestha:
calculating a minimized summated difference between two consecutive encoded periods of the time-series data (see Mestha, paragraph [0071], where the system may normalize the sensor values against a base vector (e.g., a vector of 20 whose values came from a time average of a base file that had no anomalies));
calculating a summated difference between two consecutive encoded periods of the time-series data (see Mestha, paragraph [0050], where Temporal normalization may provide normalization along a time axis. Spatial normalization may be used to normalize signals along multiple nodes (e.g., sensor axis). In either case, the normalized signals may then be used to perform attack detection using feature extraction and comparisons to normal decision boundaries; see also paragraph [0073], where error between the decoder's transformation of the features, z, and the original input, referred to as a “cost function,” may then be computed … the terms W, b, and d′ may be updated using a backpropagation algorithm with stochastic gradient descent); and
generating, by a decoder, the plurality of low-dimensional vectors based on the minimized summated difference between each period of the time-series data and the reconstructed version of the input vector and the summated difference between the two consecutive encoded periods of the input time-series data (see Mestha, paragraph [0072], where the original input may be reconstructed from the hidden variables (that is, the output of the encoder step)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker and Yeung with Mestha for the benefit of deep learning model to determine parameters of the deep learning model (see Mestha, Abstract).
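For illustration only, the two summated differences recited in Claim 4 can be read as terms of a single training objective: a reconstruction error summed over periods, plus a summed difference between consecutive encoded periods. The sketch below assumes squared Euclidean differences and a scalar weight on the temporal term; both are assumptions made for this sketch and are not taken from the claims, Yeung, or Mestha.

```python
def squared_distance(u, v):
    """Sum of squared element-wise differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def temporally_regularized_loss(periods, reconstructions, encodings, weight=1.0):
    """Reconstruction error summed over all periods, plus `weight` times the
    summed squared difference between each pair of consecutive encodings."""
    reconstruction = sum(
        squared_distance(x, r) for x, r in zip(periods, reconstructions)
    )
    temporal = sum(
        squared_distance(encodings[i - 1], encodings[i])
        for i in range(1, len(encodings))
    )
    return reconstruction + weight * temporal


# Hypothetical data: three periods, their decoder outputs, and 1-D encodings.
periods = [[1.0, 2.0], [1.0, 2.0], [5.0, 6.0]]
reconstructions = [[1.0, 2.0], [1.0, 3.0], [5.0, 6.0]]
encodings = [[0.0], [0.0], [2.0]]

print(temporally_regularized_loss(periods, reconstructions, encodings, weight=0.5))
```

Minimizing an objective of this form would favor encodings that both reconstruct each period and vary smoothly from one period to the next.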
Regarding Claim 5, Stocker in view of Ben Simhon, Yeung, and Mestha discloses the computer-implemented method of Claim 4, wherein generating the input vector for each period of the time-series data further comprises:
Stocker does not disclose:
calculating an inner product between a weight matrix for a current period of the time-series data and an output of a previous weight matrix for a previous period of the time-series data;
applying a non-linear function to the inner product; and
determining corresponding parameters for the weight matrix based on a gradient descent using back-propagation.
Mestha discloses:
calculating an inner product between a weight matrix for a current period of the time-series data and an output of a previous weight matrix for a previous period of the time-series data (see Mestha, paragraph [0071], where the non-linear transformation is as follows: y = f_θ(x) = s(Wx + b), where W is an m×n weight matrix and b is a vector of size m. The term n is the dimension of the input vector and m is equal to the number of hidden variables or features);
applying a non-linear function to the inner product (see Mestha, paragraph [0071], where the non-linear transformation is as follows: y = f_θ(x) = s(Wx + b), where W is an m×n weight matrix and b is a vector of size m. The term n is the dimension of the input vector and m is equal to the number of hidden variables or features); and
determining corresponding parameters for the weight matrix based on a gradient descent using back-propagation (see Mestha, paragraph [0071], where the non-linear transformation is as follows: y = f_θ(x) = s(Wx + b), where W is an m×n weight matrix and b is a vector of size m. The term n is the dimension of the input vector and m is equal to the number of hidden variables or features; see also paragraph [0073], where the terms W, b, and d′ may be updated using a backpropagation algorithm with stochastic gradient descent).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Mestha for the benefit of deep learning model to determine parameters of the deep learning model (see Mestha, Abstract).
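For illustration only, the transformation y = s(Wx + b) cited from Mestha, followed by one gradient-descent update obtained by back-propagating through s, can be sketched as follows. The choice of a sigmoid for s, the squared-error cost, the learning rate, and all names and sample values are assumptions made for this sketch; the cited paragraphs do not fix them.

```python
import math


def sigmoid(z):
    """Logistic function, used here as the assumed non-linearity s."""
    return 1.0 / (1.0 + math.exp(-z))


def encode(W, b, x):
    """y = s(Wx + b): inner product of each row of W with x, plus bias."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
        for row, bi in zip(W, b)
    ]


def squared_error(y, target):
    """Assumed cost: half the summed squared error."""
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def gradient_step(W, b, x, target, lr=0.5):
    """One gradient-descent update of W and b for the cost above."""
    y = encode(W, b, x)
    # Chain rule through the sigmoid: dL/dz_i = (y_i - t_i) * y_i * (1 - y_i).
    delta = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, target)]
    new_W = [[w - lr * d * xi for w, xi in zip(row, x)] for row, d in zip(W, delta)]
    new_b = [bi - lr * d for bi, d in zip(b, delta)]
    return new_W, new_b


# Hypothetical values: W is m x n with m = 2 features and n = 3 inputs.
W0 = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
b0 = [0.0, 0.0]
x = [1.0, 2.0, 3.0]
target = [1.0, 0.0]

before = squared_error(encode(W0, b0, x), target)
W1, b1 = gradient_step(W0, b0, x, target)
after = squared_error(encode(W1, b1, x), target)
print(after < before)
```

The final line checks only that the single update lowers the assumed cost for these sample values.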
Regarding Claim 6, Stocker in view of Ben Simhon, Yeung, and Mestha discloses the computer-implemented method of Claim 4, wherein calculating the summated difference between the two consecutive encoded periods of the time-series data further comprises:
Stocker does not disclose:
applying regularization on one or more weights of a network; and
applying a penalization to a difference between a low-dimensional vector of two consecutive periods.
Mestha discloses:
applying regularization on one or more weights of a network (see Mestha, paragraph [0050], where Temporal normalization may provide normalization along a time axis. Spatial normalization may be used to normalize signals along multiple nodes (e.g., sensor axis). In either case, the normalized signals may then be used to perform attack detection using feature extraction and comparisons to normal decision boundaries); and
applying a penalization to a difference between a low-dimensional vector of two consecutive periods (see Mestha, paragraph [0073], where error between the decoder's transformation of the features, z, and the original input, referred to as a “cost function,” may then be computed … the terms W, b, and d′ may be updated using a backpropagation algorithm with stochastic gradient descent).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Mestha for the benefit of deep learning model to determine parameters of the deep learning model (see Mestha, Abstract).
Regarding Claim 14, Stocker in view of Ben Simhon and Yeung discloses the computing device of Claim 13, wherein the at least one processor configured to generate the plurality of low-dimensional vectors using temporal regularization is further configured to:
Stocker does not disclose:
generating, by an encoder, an input vector for each period of the time-series data;
calculating a minimized summated difference between two consecutive encoded periods of the time-series data;
calculating a summated difference between two consecutive encoded periods of the time-series data; and
generating, by a decoder, the plurality of low-dimensional vectors based on the minimized summated difference between each period of the time-series data and the reconstructed version of the input vector and the summated difference between the two consecutive encoded periods of the input time-series data.
Yeung discloses generating, by an encoder, an input vector for each period of the time-series data (see Yeung, paragraph [0261], where Auto encoder 2040 comprises a deep-learning based algorithm implemented to encode sparsely distributed point cloud data with temporal regularization in order to minimize the inconsistency of encoded data in an output sequence. In one embodiment, auto encoder 2040 uses convolutional autoencoders (CA) to perform dimensionality reduction).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Yeung for the benefit of minimizing the inconsistency of encoded data in an output sequence (see Yeung, paragraph [0261]).
Stocker in view of Yeung does not disclose the following limitations, which are disclosed by Mestha:
calculating a minimized summated difference between two consecutive encoded periods of the time-series data (see Mestha, paragraph [0071], where the system may normalize the sensor values against a base vector (e.g., a vector of 20 whose values came from a time average of a base file that had no anomalies));
calculating a summated difference between two consecutive encoded periods of the time-series data (see Mestha, paragraph [0050], where Temporal normalization may provide normalization along a time axis. Spatial normalization may be used to normalize signals along multiple nodes (e.g., sensor axis). In either case, the normalized signals may then be used to perform attack detection using feature extraction and comparisons to normal decision boundaries; see also paragraph [0073], where error between the decoder's transformation of the features, z, and the original input, referred to as a “cost function,” may then be computed … the terms W, b, and d′ may be updated using a backpropagation algorithm with stochastic gradient descent); and
generating, by a decoder, the plurality of low-dimensional vectors based on the minimized summated difference between each period of the time-series data and the reconstructed version of the input vector and the summated difference between the two consecutive encoded periods of the input time-series data (see Mestha, paragraph [0072], where the original input may be reconstructed from the hidden variables (that is, the output of the encoder step)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker and Yeung with Mestha for the benefit of deep learning model to determine parameters of the deep learning model (see Mestha, Abstract).
Regarding Claim 15, Stocker in view of Ben Simhon, Yeung, and Mestha discloses the computing device of Claim 14, wherein the at least one processor configured to generate the input vector for each period of the time-series data is further configured to:
Stocker does not disclose:
calculating an inner product between a weight matrix for a current period of the time-series data and an output of a previous weight matrix for a previous period of the time-series data;
applying a non-linear function to the inner product; and
determining corresponding parameters for the weight matrix based on a gradient descent using back-propagation.
Mestha discloses:
calculating an inner product between a weight matrix for a current period of the time-series data and an output of a previous weight matrix for a previous period of the time-series data (see Mestha, paragraph [0071], where the non-linear transformation is as follows: y = f_θ(x) = s(Wx + b), where W is an m×n weight matrix and b is a vector of size m. The term n is the dimension of the input vector and m is equal to the number of hidden variables or features);
applying a non-linear function to the inner product (see Mestha, paragraph [0071], where the non-linear transformation is as follows: y = f_θ(x) = s(Wx + b), where W is an m×n weight matrix and b is a vector of size m. The term n is the dimension of the input vector and m is equal to the number of hidden variables or features); and
determining corresponding parameters for the weight matrix based on a gradient descent using back-propagation (see Mestha, paragraph [0071], where the non-linear transformation is as follows: y = f_θ(x) = s(Wx + b), where W is an m×n weight matrix and b is a vector of size m. The term n is the dimension of the input vector and m is equal to the number of hidden variables or features; see also paragraph [0073], where the terms W, b, and d′ may be updated using a backpropagation algorithm with stochastic gradient descent).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Mestha for the benefit of using a deep learning model to determine parameters of the deep learning model (see Mestha, Abstract).
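For illustration only, and not as a characterization of how Mestha or Applicant implements these steps, the transformation y = s(Wx + b) quoted from Mestha, paragraph [0071], and a single gradient-descent update of W and b may be sketched as follows. The choice of a logistic sigmoid for s, the squared-error cost, the learning rate, and all function names are assumptions made for this sketch.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, assumed here as the non-linear function s."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(W, b, x):
    """Compute y = s(Wx + b) for an m x n matrix W, bias b (size m), input x (size n)."""
    return [
        sigmoid(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]


def loss(W, b, x, target):
    """Squared-error cost 0.5 * ||y - target||^2 (an assumed cost function)."""
    y = forward(W, b, x)
    return 0.5 * sum((y_i - t_i) ** 2 for y_i, t_i in zip(y, target))


def gradient_step(W, b, x, target, lr=0.1):
    """One gradient-descent update of W and b, with gradients obtained by back-propagation."""
    y = forward(W, b, x)
    # Back-propagated error at the pre-activation: (y - t) * s'(.), where s' = y * (1 - y).
    delta = [(y_i - t_i) * y_i * (1.0 - y_i) for y_i, t_i in zip(y, target)]
    new_W = [
        [w_ij - lr * d_i * x_j for w_ij, x_j in zip(row, x)]
        for row, d_i in zip(W, delta)
    ]
    new_b = [b_i - lr * d_i for b_i, d_i in zip(b, delta)]
    return new_W, new_b
```

In this sketch the inner product is the sum inside forward, the non-linear function is sigmoid, and gradient_step corresponds to the parameter update; repeated calls to gradient_step on the same input reduce the cost.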
Regarding Claim 16, Stocker in view of Ben Simhon, Yeung, and Mestha discloses the computing device of Claim 14, wherein the at least one processor configured to calculate the summated difference between the two consecutive encoded periods of the time-series data is further configured to:
Stocker does not disclose:
applying regularization on one or more weights of a network; and
applying a penalization to a difference between a low-dimensional vector of two consecutive periods.
Mestha discloses:
applying regularization on one or more weights of a network (see Mestha, paragraph [0050], where Temporal normalization may provide normalization along a time axis. Spatial normalization may be used to normalize signals along multiple nodes (e.g., sensor axis). In either case, the normalized signals may then be used to perform attack detection using feature extraction and comparisons to normal decision boundaries); and
applying a penalization to a difference between a low-dimensional vector of two consecutive periods (see Mestha, paragraph [0073], where error between the decoder's transformation of the features, z, and the original input, referred to as a “cost function,” may then be computed … the terms W, b, and d′ may be updated using a backpropagation algorithm with stochastic gradient descent).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Mestha for the benefit of using a deep learning model to determine parameters of the deep learning model (see Mestha, Abstract).
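For illustration only, the two limitations of Claim 16 may be read as additive terms in a cost function: a regularization term on the network weights and a penalty on the difference between the low-dimensional vectors of consecutive periods. The sketch below assumes an L2 form for both terms; the coefficients lam and mu and all function names are hypothetical and are not drawn from Mestha or from the application.

```python
def l2_weight_penalty(weights):
    """Sum of squared weights (an assumed L2 regularization term)."""
    return sum(w * w for row in weights for w in row)


def consecutive_difference_penalty(codes):
    """Sum of squared differences between the low-dimensional vectors
    of each pair of consecutive periods."""
    return sum(
        sum((cur_k - prev_k) ** 2 for cur_k, prev_k in zip(cur, prev))
        for prev, cur in zip(codes, codes[1:])
    )


def total_cost(reconstruction_error, weights, codes, lam=0.01, mu=0.1):
    """Reconstruction error plus the two penalty terms.
    lam and mu are hypothetical weighting coefficients."""
    return (
        reconstruction_error
        + lam * l2_weight_penalty(weights)
        + mu * consecutive_difference_penalty(codes)
    )
```

Here codes is a list holding one low-dimensional vector per period, in time order, so that a large jump between neighboring periods raises the cost.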
Claims 7-10 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Stocker, Ben Simhon and Yeung as applied to Claims 3 and 13 above, and further in view of Jacob (PG Pub. No. 2016/0196174 A1).
Regarding Claim 7, Stocker in view of Ben Simhon and Yeung discloses the computer-implemented method of Claim 3, wherein determining whether the one or more change points occur in the seasonal pattern of the time-series data based on the plurality of low-dimensional vectors further comprises:
Stocker does not disclose:
determining a location for each of the plurality of low-dimensional vectors; and
performing hierarchical clustering procedure for the plurality of low-dimensional vectors based on the location for each of the plurality of low-dimensional vectors.
Jacob discloses:
determining a location for each of the plurality of low-dimensional vectors (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster); and
performing hierarchical clustering procedure for the plurality of low-dimensional vectors based on the location for each of the plurality of low-dimensional vectors (see Jacob, paragraph [0010], where operators may cluster the log events based on conventional divisive clustering technique or agglomerative clustering technique, such as Hierarchical clustering technique).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Jacob for the benefit of categorizing a real-time log event (see Jacob, Abstract).
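For illustration only, an agglomerative (bottom-up) hierarchical clustering of the kind referenced in Jacob, paragraph [0010], may be sketched as follows. Single linkage, Euclidean distance, and stopping at a fixed number of clusters are assumptions made for this sketch; neither Jacob nor the claims are understood to require them.

```python
import math


def agglomerative_clusters(points, num_clusters):
    """Merge the two closest clusters repeatedly until num_clusters remain.
    Returns clusters as lists of indices into points."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Start with every vector in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

The "location" of each low-dimensional vector is represented here simply by its coordinates, which is what the distance computation operates on.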
Regarding Claim 8, Stocker in view of Ben Simhon, Yeung, and Jacob discloses the computer-implemented method of Claim 7, wherein performing the hierarchical clustering procedure for the plurality of low-dimensional vectors based on the location for each of the plurality of low-dimensional vectors further comprises:
Stocker does not disclose:
calculating a silhouette score based on a mean pairwise distance of the location for each of the plurality of low-dimensional vectors in a cluster and a mean distance of each location for each of the plurality of low-dimensional vectors in a neighboring cluster;
determining whether the silhouette score satisfies a hyperparameter threshold; and
selecting a partition based on a determination that the silhouette score satisfies the hyperparameter threshold.
Jacob discloses:
calculating a silhouette score based on a mean pairwise distance of the location for each of the plurality of low-dimensional vectors in a cluster and a mean distance of each location for each of the plurality of low-dimensional vectors in a neighboring cluster (see Jacob, paragraph [0039], where the threshold determination module 118 calculates a cluster radius and a silhouette width of each cluster … a silhouette width of a cluster is indicative of compactness of the cluster. In an example, a silhouette width of a cluster is a measure of how well an object lies within the cluster and how distant is each object from its closest neighboring cluster. The silhouette width may vary from −1 to 1);
determining whether the silhouette score satisfies a hyperparameter threshold (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster); and
selecting a partition based on a determination that the silhouette score satisfies the hyperparameter threshold (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Jacob for the benefit of categorizing a real-time log event (see Jacob, Abstract).
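For illustration only, the conventional silhouette computation (mean distance to the other members of a point's own cluster, compared against the mean distance to the nearest neighboring cluster) and a threshold comparison may be sketched as follows. The Euclidean distance, the threshold value of 0.5, and the function names are assumptions made for this sketch and are not taken from Jacob or from the application.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette over all points; requires at least two distinct labels."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # Convention for singleton clusters.
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest neighboring cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def satisfies_threshold(score, threshold=0.5):
    """Hypothetical hyperparameter check: accept the partition if the score is high enough."""
    return score >= threshold
```

Under this sketch, a partition would be selected when satisfies_threshold returns True; a score near 1 indicates compact, well-separated clusters, and a score near −1 indicates points assigned to the wrong cluster.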
Regarding Claim 9, Stocker in view of Ben Simhon, Yeung, and Jacob discloses the computer-implemented method of Claim 7, further comprising:
Stocker does not disclose determining that no change points exist based on a determination that the silhouette score fails to satisfy the hyperparameter threshold.  Jacob discloses determining that no change points exist based on a determination that the silhouette score fails to satisfy the hyperparameter threshold (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Jacob for the benefit of categorizing a real-time log event (see Jacob, Abstract).
Regarding Claim 10, Stocker in view of Ben Simhon, Yeung, and Jacob discloses the computer-implemented method of Claim 7, wherein the time-series data corresponds to seasonal time-series data that has a tendency to exhibit behavior that repeats every fixed period of time (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set from at least one data source where at least one data set includes a plurality of time-varying data points … detection model include at least one anomaly detection model trained according to a respective plurality of independent event observations, where the types of the plurality of event observations includes at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers).
Regarding Claim 17, Stocker in view of Ben Simhon and Yeung discloses the computing device of Claim 13, wherein the at least one processor configured to determine whether the one or more change points occur in the seasonal pattern of the time-series data based on the plurality of low-dimensional vectors is further configured to:
Stocker does not disclose:
determining a location for each of the plurality of low-dimensional vectors; and
performing hierarchical clustering procedure for the plurality of low-dimensional vectors based on the location for each of the plurality of low-dimensional vectors.
Jacob discloses:
determining a location for each of the plurality of low-dimensional vectors (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster); and
performing hierarchical clustering procedure for the plurality of low-dimensional vectors based on the location for each of the plurality of low-dimensional vectors (see Jacob, paragraph [0010], where operators may cluster the log events based on conventional divisive clustering technique or agglomerative clustering technique, such as Hierarchical clustering technique).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Jacob for the benefit of categorizing a real-time log event (see Jacob, Abstract).
Regarding Claim 18, Stocker in view of Ben Simhon, Yeung, and Jacob discloses the computing device of Claim 17, wherein the at least one processor configured to perform the hierarchical clustering procedure for the plurality of low-dimensional vectors based on the location for each of the plurality of low-dimensional vectors is further configured to:
Stocker does not disclose:
calculating a silhouette score based on a mean pairwise distance of the location for each of the plurality of low-dimensional vectors in a cluster and a mean distance of each location for each of the plurality of low-dimensional vectors in a neighboring cluster;
determining whether the silhouette score satisfies a hyperparameter threshold; and
selecting a partition based on a determination that the silhouette score satisfies the hyperparameter threshold.
Jacob discloses:
calculating a silhouette score based on a mean pairwise distance of the location for each of the plurality of low-dimensional vectors in a cluster and a mean distance of each location for each of the plurality of low-dimensional vectors in a neighboring cluster (see Jacob, paragraph [0039], where the threshold determination module 118 calculates a cluster radius and a silhouette width of each cluster … a silhouette width of a cluster is indicative of compactness of the cluster. In an example, a silhouette width of a cluster is a measure of how well an object lies within the cluster and how distant is each object from its closest neighboring cluster. The silhouette width may vary from −1 to 1);
determining whether the silhouette score satisfies a hyperparameter threshold (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster); and
selecting a partition based on a determination that the silhouette score satisfies the hyperparameter threshold (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Jacob for the benefit of categorizing a real-time log event (see Jacob, Abstract).
Regarding Claim 19, Stocker in view of Ben Simhon, Yeung, and Jacob discloses the computing device of Claim 17, wherein the at least one processor is further configured to:
Stocker does not disclose determining that no change points exist based on a determination that the silhouette score fails to satisfy the hyperparameter threshold.  Jacob discloses determining that no change points exist based on a determination that the silhouette score fails to satisfy the hyperparameter threshold (see Jacob, paragraph [0019], where the log event is categorized into the log category based on a comparison of the distance between the TF-IDF vector of the real-time log event and the closest cluster centroid of the cluster with the pre-determined silhouette threshold corresponding to the cluster).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Jacob for the benefit of categorizing a real-time log event (see Jacob, Abstract).
Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Stocker and Ben Simhon as applied to Claims 1, 11, 20, and 21 above, and further in view of Andoni (PG Pub. No. 2019/0228312 A1).
Regarding Claim 22, Stocker in view of Ben Simhon discloses the computer-readable medium of Claim 20, wherein:
Stocker does not disclose the autoencoding procedure is a 3 layer feed-forward neural network, each layer of the neural network including a linear function and a hyperbolic tangent function.  Andoni discloses the autoencoding procedure (see Andoni, paragraph [0005], where in an illustrative aspect, a variational autoencoder (VAE) is used as part of the described systems and methods) is a 3 layer (see Andoni, paragraph [0015], where first neural network 110 may include an input layer, an output layer, and zero or more hidden layers) feed-forward neural network (see Andoni, paragraph [0058], where parameters of the genetic algorithm 310 may include but are not limited to, mutation parameter(s), a maximum number of epochs the genetic algorithm 310 will be executed, a threshold fitness value that results in termination of the genetic algorithm 310 even if the maximum number of generations has not been reached, whether parallelization of model testing or fitness evaluation is enabled, whether to evolve a feedforward or recurrent neural network, etc.), each layer of the neural network including a linear function and a hyperbolic tangent function (see Andoni, paragraph [0048], where the activation function of a node may be a step function, sine function, continuous or piecewise linear function, sigmoid function, hyperbolic tangent function, or another type of mathematical function that represents a threshold at which the node is activated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Stocker with Andoni for the benefit of unsupervised model building for clustering and anomaly detection (see Andoni, Abstract).
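For illustration only, a three-layer feed-forward pass in which each layer applies a linear function followed by a hyperbolic tangent, as recited in Claim 22, may be sketched as follows. The layer sizes, the weight values, and the function names are hypothetical and are not taken from Andoni or from the application.

```python
import math


def dense_tanh(W, b, x):
    """One layer: linear function Wx + b followed by a hyperbolic tangent."""
    return [
        math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]


def three_layer_forward(layers, x):
    """Feed x through exactly three (W, b) layers in order, with no feedback connections."""
    if len(layers) != 3:
        raise ValueError("expected exactly three layers")
    for W, b in layers:
        x = dense_tanh(W, b, x)
    return x


# Hypothetical sizes: 4 inputs -> 3 units -> 2 units -> 2 outputs.
example_layers = [
    ([[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.1, 0.0], [0.5, 0.5, -0.5, -0.5]], [0.0, 0.0, 0.0]),
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.0, 1.0]], [0.0, 0.0]),
]
example_output = three_layer_forward(example_layers, [1.0, 0.5, -0.5, 2.0])
```

Because every layer ends in a hyperbolic tangent, each output component lies strictly between −1 and 1.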
Response to Arguments
Applicant’s Arguments, filed September 16, 2022, have been fully considered, but they are not persuasive.
Applicant argues on page 10 of Applicant’s Remarks that Stocker, alone, or in combination with Ben Simhon does not teach, disclose, or fairly suggest, all of the elements of Independent Claim 1 and equivalent Independent Claims 11 and 20.  The Examiner respectfully disagrees.
Stocker is directed toward identifying change points in time series data (see Stocker, paragraph [0004], where an embodiment of the present invention described herein includes a method for anomaly and event analysis including steps of receiving … at least one data set of at least one data stream from at least one data source, where the at least one data set includes a plurality of time-varying data points … detection model includes at least one anomaly detection model trained according to a respective plurality of independent event training data sets to identify types of the plurality of event observations, where the types of the plurality of event observations include at least one of: i) anomalies, ii) change-points, iii) patterns, or iv) outliers).  While Stocker does not explicitly disclose performing these steps using an autoencoder, Stocker discloses the use of any AI or machine learning technique (see Stocker, paragraph [0122], where the exemplary inventive computer-based systems of the present disclosure may be configured to utilize one or more exemplary AI/machine learning techniques chosen from, but not limited to, decision trees, boosting, support-vector machines, neural networks, nearest neighbor algorithms, Naive Bayes, bagging, random forests, and the like. In some embodiments and, optionally, in combination of any embodiment described above or below, an exemplary neutral network technique may be one of, without limitation, feedforward neural network, radial basis function network, recurrent neural network, convolutional network (e.g., U-net) or other suitable network).  
Accordingly, it is the position of the Examiner that Stocker suggests a combination with the autoencoder disclosed in Ben Simhon (see Ben Simhon, paragraph [0017], where the system includes a stacked autoencoder module configured to (i) train a stacked autoencoder using each of the time series portions of each of the plurality of metrics and (ii) identify output nodes in the stacked autoencoder activated by each of the time series portions of each of the plurality of metrics), and that such a combination would have been obvious to one of ordinary skill in the art in order to perform the claimed method steps of Independent Claims 1, 11, and 20.
Conclusion
The prior art made of record and not relied upon is considered pertinent to the Applicant’s disclosure:
Sonalker (PG Pub. No. 2016/0188396 A1), which concerns detecting anomalies in periodic CAN bus data using, inter alia, Radial Basis Function kernel.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARHAD AGHARAHIMI whose telephone number is (571)272-9864. The examiner can normally be reached M-F 9am - 5pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at 571-272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FARHAD AGHARAHIMI/Examiner, Art Unit 2161
/APU M MOFIZ/Supervisory Patent Examiner, Art Unit 2161