Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1 – 3 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The claims recite “and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.” It is not clear whether the second device itself is anonymous but anomalous communications emanate from it, or whether the second device does not have anomalous communication. The specification and drawings do not elucidate this aspect. For purposes of the broadest reasonable interpretation (BRI), the limitation is construed to mean that false anomalous behaviors are identified for training purposes so that new true anomalous behaviors are identified.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1 – 3 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims of U.S. Application No. 16619745 in view of Clayton et al (US 20170206464), hereafter Cla, and Kvernvik et al (US 20190334784), hereafter Kve, because the claims of the instant application and the claims of the pending application, further in view of the prior art Cla and Kve, are similar in scope, form and content.
Instant Application 16619767 (claims 1 – 3, listed first) and Pending Application 16619745 (claims 1 – 3 and 9, listed after):
1. A method of anomaly detection for network traffic communicated by devices via a computer network, the method comprising: receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; training an autoencoder for a first cluster based on a time series in the first cluster, wherein a state of the autoencoder is periodically recorded after a predetermined fixed number of training examples to define a set of trained autoencoders for the first cluster; receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window; evaluating a derivative of each vector; and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.
2. A computer system comprising: a processor and memory storing computer program code for detecting anomalies in network traffic communicated by devices via a computer network, by: receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; training an autoencoder for a first cluster based on a time series in the first cluster, wherein a state of the autoencoder is periodically recorded after a predetermined fixed number of training examples to define a set of trained autoencoders for the first cluster; receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window, evaluating a derivative of each vector; and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.
3. A non-transitory computer-readable storage element storing computer program code to, when loaded into a computer system and executed thereon, cause the computer system to detect anomalies in network traffic communicated by devices via a computer network, by: receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; training an autoencoder for a first cluster based on a time series in the first cluster, wherein a state of the autoencoder is periodically recorded after a predetermined fixed number of training examples to define a set of trained autoencoders for the first cluster; receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window, evaluating a derivative of each vector; and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.
1. A method of anomaly detection for network traffic communicated by devices via a computer network, the method comprising: clustering a set of time series, each time series including a plurality of time windows of data corresponding to network communication characteristics for a device; training an autoencoder for each cluster based on a time series in the cluster; generating a set of reconstruction errors for each autoencoder based on testing a respective autoencoder with data from time windows of at least a subset of the time series; generating a probabilistic model of reconstruction errors for each autoencoder; and generating an aggregation of the probabilistic models for, in use, detecting reconstruction errors for a time series of data corresponding to network communication characteristics for a device as anomalous.  
2. The method of claim 1, wherein the clusters are defined based on a respective autoencoder for converting each time series to a vector of features for the time series and a clustering algorithm clusters the vectors.  
3. The method of claim 1, wherein the set of reconstruction errors for a respective autoencoder is generated based on the respective autoencoder processing each time series in a corresponding cluster of time series.
9. A computer system comprising: a processor and memory storing computer program code for anomaly detection for network traffic communicated by devices via a computer network, by: clustering a set of time series, each time series including a plurality of time windows of data corresponding to network communication characteristics for a device; training an autoencoder for each cluster based on a time series in the cluster; generating a set of reconstruction errors for each autoencoder based on testing a respective autoencoder with data from time windows of at least a subset of the time series; generating a probabilistic model of reconstruction errors for each autoencoder; and generating an aggregation of the probabilistic models for, in use, detecting reconstruction errors for a time series of data corresponding to network communication characteristics for a device as anomalous.
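For orientation only, the sequence of steps recited in claim 1 of the instant application (recording autoencoder states at fixed intervals, building a vector of reconstruction errors per time window, and evaluating a derivative of each vector) can be sketched as follows. This is a minimal illustration and not the applicant's implementation: the tiny linear autoencoder, the learning rate, the use of a finite difference (np.diff) as the "derivative", and every function and variable name are assumptions made for the example. The final step of training a machine learning model on the derivatives is not shown.

```python
import numpy as np

def train_with_snapshots(windows, hidden=2, snapshot_every=10, lr=0.01, seed=0):
    """Train a toy linear autoencoder one example at a time and record its
    weights after every `snapshot_every` training examples (assumed scheme)."""
    rng = np.random.default_rng(seed)
    dim = windows.shape[1]
    enc = rng.normal(scale=0.1, size=(dim, hidden))
    dec = rng.normal(scale=0.1, size=(hidden, dim))
    snapshots = []
    for i, x in enumerate(windows, start=1):
        code = x @ enc                        # encoded window, shape (hidden,)
        err = code @ dec - x                  # reconstruction residual, shape (dim,)
        grad_dec = np.outer(code, err)        # gradient of 0.5*||err||^2 w.r.t. dec
        grad_enc = np.outer(x, err @ dec.T)   # gradient of 0.5*||err||^2 w.r.t. enc
        dec -= lr * grad_dec
        enc -= lr * grad_enc
        if i % snapshot_every == 0:
            snapshots.append((enc.copy(), dec.copy()))
    return snapshots

def reconstruction_error_vectors(snapshots, new_windows):
    """Return one row per time window and one column per recorded autoencoder."""
    errors = np.empty((len(new_windows), len(snapshots)))
    for j, (enc, dec) in enumerate(snapshots):
        recon = new_windows @ enc @ dec
        errors[:, j] = np.mean((recon - new_windows) ** 2, axis=1)
    return errors

def derivatives(error_vectors):
    """Finite difference of each error vector across successive snapshots."""
    return np.diff(error_vectors, axis=1)

rng = np.random.default_rng(1)
train = rng.normal(size=(50, 4))   # 50 synthetic training windows, 4 features each
new = rng.normal(size=(6, 4))      # 6 synthetic windows of a new time series
snaps = train_with_snapshots(train, snapshot_every=10)
errs = reconstruction_error_vectors(snaps, new)
derivs = derivatives(errs)
print(errs.shape, derivs.shape)    # (6, 5) (6, 4)
```

With 50 training examples and a snapshot every 10 examples, five autoencoder states are recorded, so each time window yields a five-element error vector and a four-element derivative.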


The pending application recites the claimed concept of the instant application but is silent on: receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; and evaluating a derivative of each vector. Cla teaches these limitations (Cla: [002, 035] the input layer nodes receive as input the latent distribution z, which includes 3 distributions at each time interval ([064] each of different time series data is received at varying different time intervals), [024-25] related to a set of connected network communication device characteristics; [026-27] one edge device includes multiple different data collection devices that collect different types of data comprising concurrently detected plurality of different time series, and memory for receiving and storing large amounts of data; [037] training the variational inference machine includes derivation of a variational lower bound of the variational approximation using a log likelihood technique which measures the difference between two different probability distributions..., and a negative reconstruction error, which is a measure of error between x and x̂).
Therefore, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of pending application 16619745 to include receiving time series data and evaluating derivatives of vectors as taught by Cla, so that the latent distribution z is an optimal input for a machine learning model ([037]).
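For reference, the variational lower bound mentioned in connection with Cla [037] is commonly written in the standard textbook form below. This is the general variational autoencoder formulation, not notation taken from Cla; the symbols q_φ (inference model), p_θ (generative model), and z (latent variable) are assumptions introduced for illustration. The first term measures the difference between two probability distributions, and the second term corresponds to the negative reconstruction error between x and x̂.

```latex
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \phi; x)
  = -\,D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z) \right)
    + \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
```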
Kve teaches training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication ([099] to identify new anomalies in online or offline data, an anomaly classifier is deployed at the end of the procedure to classify newly identified anomalies, and the trained anomaly classifier is used as a first filter to identify potential false alarms, which closely resemble anomalies but can, by their characteristic variations in performance parameters, be identified and flagged by the anomaly classifier before the corresponding data is analysed; [041] true anomalous behaviours are further investigated, and false anomalous behaviours are used to train the ANN and error calculation steps such that the false anomalous behaviour is learned by the ANN and correctly identified in future).
Therefore, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the inventions of pending application 16619745 and Cla to include training a model using derivatives as taught by Kve, so that the autoencoder learns the normal seasonal variation of network performance parameters and a value considered anomalous at one time is recognised as normal behaviour at a different time ([066]).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1 – 3 are rejected under 35 U.S.C. 103 as being unpatentable over Clayton et al (US 20170206464), hereafter Cla, and Kvernvik et al (US 20190334784), hereafter Kve.
Claim 1: Cla teaches a method of anomaly detection for network traffic communicated by devices via a computer network, the method comprising: receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; ([002, 035] the input layer nodes receive as input the latent distribution z, which includes 3 distributions at each time interval ([064] each of different time series data is received at varying different time intervals) [024-25] related to a set of connected network communication device characteristics);
training an autoencoder for a first cluster based on a time series in the first cluster, ([036] training a variational inference machine where a variational autoencoder is used to train a variational inference machine and a variational generation machine, simultaneously trained as an autoencoder using a set of time series {x.sub.1, x.sub.2, . . . x.sub.n});
wherein a state of the autoencoder is periodically recorded after a predetermined fixed number of training examples to define a set of trained autoencoders for the first cluster; ([027] the machine learning model iteratively updates the result and continuously executes using adapted data representative of collected time series data and produces a continuous result or a periodic result, where [030] a time series data adaptation module adapts time series data into one or more distributed representations (i.e., cluster), which are provided to the machine learning module);
receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; ([026-27] one edge device includes, multiple different data collection devices that collect different types of data comprising concurrently detected plurality of different time series, and memory for receiving and storing large amounts of data);
evaluating a derivative of each vector; ([037] training the variational inference machine includes derivation of a variational lower bound of the variational approximation using a log likelihood technique which measures the difference between two different probability distributions..., and a negative reconstruction error, which is a measure of error between x and x̂);
Cla teaches the concept but is silent on: for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window; and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.
But analogous art Kve teaches for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window; ([034] calculating a reconstruction error between the time series and the reconstructed time series comprises calculating individual reconstruction errors for each performance parameter and [083] combined anomaly metric is then generated by normalizing the combined reconstruction error, and [073] a vector is constructed comprising as elements the representation of time and the associated parameter value);
and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication. ([099] to identify new anomalies in online or offline data, an anomaly classifier is deployed at the end of the procedure to classify newly identified anomalies, and the trained anomaly classifier is used as a first filter to identify potential false alarms, which closely resemble anomalies but can, by their characteristic variations in performance parameters, be identified and flagged by the anomaly classifier before the corresponding data is analysed; [041] true anomalous behaviours are further investigated, and false anomalous behaviours are used to train the ANN and error calculation steps such that the false anomalous behaviour is learned by the ANN and correctly identified in future).
Therefore, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Cla to include computing reconstruction errors (REs) and training a model as taught by Kve, so that the autoencoder learns the normal seasonal variation of network performance parameters and a value considered anomalous at one time is recognised as normal behaviour at a different time ([066]).
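The Kve passages relied upon above ([034] individual reconstruction errors for each performance parameter; [083] a combined anomaly metric generated by normalizing the combined reconstruction error) can be pictured with the short sketch below. It is an illustration under stated assumptions rather than Kve's disclosed implementation: mean squared error as the error measure, summation as the way of combining per-parameter errors, min-max scaling as the normalization, and all function names are hypothetical choices, since the passages quoted in this action do not specify them.

```python
import numpy as np

def per_parameter_errors(series, reconstructed):
    """Mean squared reconstruction error for each performance parameter.
    Rows are time steps, columns are performance parameters (assumed layout)."""
    return np.mean((series - reconstructed) ** 2, axis=0)

def combined_anomaly_metric(error_history):
    """Sum per-parameter errors for each sample, then min-max normalize the
    combined errors to the range [0, 1] (assumed normalization)."""
    combined = error_history.sum(axis=1)
    span = combined.max() - combined.min()
    if span == 0:
        return np.zeros_like(combined)
    return (combined - combined.min()) / span

# Two time steps of two performance parameters and their reconstruction.
series = np.array([[1.0, 2.0], [3.0, 4.0]])
reconstructed = np.array([[1.0, 1.0], [3.0, 2.0]])
param_errors = per_parameter_errors(series, reconstructed)
print(param_errors)    # [0.  2.5]

# Per-parameter errors collected for three samples.
error_history = np.array([[0.0, 1.0], [0.5, 2.5], [1.0, 4.0]])
metric = combined_anomaly_metric(error_history)
print(metric)          # [0.  0.5 1. ]
```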
Claim 2: Cla teaches a computer system comprising: a processor and memory storing computer program code for detecting anomalies in network traffic communicated by devices via a computer network, by (Figs. 1 and 2): receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; training an autoencoder for a first cluster based on a time series in the first cluster, wherein a state of the autoencoder is periodically recorded after a predetermined fixed number of training examples to define a set of trained autoencoders for the first cluster; receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; evaluating a derivative of each vector; ([002, 035] the input layer nodes receive as input the latent distribution z, which includes 3 distributions at each time interval ([064] each of different time series data is received at varying different time intervals) [024-25] related to a set of connected network communication device characteristics; [036] training a variational inference machine where a variational autoencoder is used to train a variational inference machine and a variational generation machine, simultaneously trained as an autoencoder using a set of time series {x.sub.1, x.sub.2, . . . x.sub.n}; [027] the machine learning model iteratively updates the result and continuously executes using adapted data representative of collected time series data and produces a continuous result or a periodic result, where [030] a time series data adaptation module adapts time series data into one or more distributed representations (i.e., cluster), which are provided to the machine learning module; [026-27] one edge device includes multiple different data collection devices that collect different types of data comprising concurrently detected plurality of different time series, and memory for receiving and storing large amounts of data; [037] training the variational inference machine includes derivation of a variational lower bound of the variational approximation using a log likelihood technique which measures the difference between two different probability distributions..., and a negative reconstruction error, which is a measure of error between x and x̂);
Cla teaches the concept but is silent on: for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window; and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.
But analogous art Kve teaches for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window, and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication. ([034] calculating a reconstruction error between the time series and the reconstructed time series comprises calculating individual reconstruction errors for each performance parameter and [083] combined anomaly metric is then generated by normalizing the combined reconstruction error, and [073] a vector is constructed comprising as elements the representation of time and the associated parameter value; [099] to identify new anomalies in online or offline data, an anomaly classifier is deployed at the end of the procedure to classify newly identified anomalies, and the trained anomaly classifier is used as a first filter to identify potential false alarms, which closely resemble anomalies but can, by their characteristic variations in performance parameters, be identified and flagged by the anomaly classifier before the corresponding data is analysed; [041] true anomalous behaviours are further investigated, and false anomalous behaviours are used to train the ANN and error calculation steps such that the false anomalous behaviour is learned by the ANN and correctly identified in future).
Therefore, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Cla to include computing reconstruction errors (REs) and training a model as taught by Kve, so that the autoencoder learns the normal seasonal variation of network performance parameters and a value considered anomalous at one time is recognised as normal behaviour at a different time ([066]).
Claim 3: Cla teaches a non-transitory computer-readable storage element storing computer program code to, when loaded into a computer system and executed thereon, cause the computer system to detect anomalies in network traffic communicated by devices via a computer network, by (Figs. 1 and 2): receiving a set of training time series each including a plurality of time windows of data corresponding to network communication characteristics for a first device; training an autoencoder for a first cluster based on a time series in the first cluster, wherein a state of the autoencoder is periodically recorded after a predetermined fixed number of training examples to define a set of trained autoencoders for the first cluster; receiving a new time series including a plurality of time windows of data corresponding to network communication characteristics for the first device; evaluating a derivative of each vector; ([002, 035] the input layer nodes receive as input the latent distribution z, which includes 3 distributions at each time interval ([064] each of different time series data is received at varying different time intervals) [024-25] related to a set of connected network communication device characteristics; [036] training a variational inference machine where a variational autoencoder is used to train a variational inference machine and a variational generation machine, simultaneously trained as an autoencoder using a set of time series {x.sub.1, x.sub.2, . . . x.sub.n}; [027] the machine learning model iteratively updates the result and continuously executes using adapted data representative of collected time series data and produces a continuous result or a periodic result, where [030] a time series data adaptation module adapts time series data into one or more distributed representations (i.e., cluster), which are provided to the machine learning module; [026-27] one edge device includes multiple different data collection devices that collect different types of data comprising concurrently detected plurality of different time series, and memory for receiving and storing large amounts of data; [037] training the variational inference machine includes derivation of a variational lower bound of the variational approximation using a log likelihood technique which measures the difference between two different probability distributions..., and a negative reconstruction error, which is a measure of error between x and x̂);
Cla teaches the concept but is silent on: for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window; and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication.
But analogous art Kve teaches for each time window of the new time series, generating a vector of reconstruction errors for the first device for each autoencoder based on testing the autoencoder with data from the time window, and training a machine learning model based on the derivatives so as to define a filter for identifying subsequent time series for a second device being absent anomalous communication. ([034] calculating a reconstruction error between the time series and the reconstructed time series comprises calculating individual reconstruction errors for each performance parameter and [083] combined anomaly metric is then generated by normalizing the combined reconstruction error, and [073] a vector is constructed comprising as elements the representation of time and the associated parameter value; [099] to identify new anomalies in online or offline data, an anomaly classifier is deployed at the end of the procedure to classify newly identified anomalies, and the trained anomaly classifier is used as a first filter to identify potential false alarms, which closely resemble anomalies but can, by their characteristic variations in performance parameters, be identified and flagged by the anomaly classifier before the corresponding data is analysed; [041] true anomalous behaviours are further investigated, and false anomalous behaviours are used to train the ANN and error calculation steps such that the false anomalous behaviour is learned by the ANN and correctly identified in future).
Therefore, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Cla to include computing reconstruction errors (REs) and training a model as taught by Kve, so that the autoencoder learns the normal seasonal variation of network performance parameters and a value considered anomalous at one time is recognised as normal behaviour at a different time ([066]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See form PTO-892.  Any inquiry concerning this communication or earlier communications from the examiner should be directed to Badri -- Champakesan whose telephone number is (571)270-3867. The examiner can normally be reached M-F: 8:30am-5pm (EST). Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jorge L. Ortiz-Criado can be reached on 5712727624. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.  Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BADRINARAYANAN /Examiner, Art Unit 2496.