Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Status of the Application
The following is a Final Office Action. 

In response to Examiner's communication of 6/21/2022, Applicant responded on 9/8/2022, amending claims 1-10 and adding claims 11-12.

Claims 1-12 are pending in this application and have been examined. 












Response to Arguments - 35 USC § 101
Applicant’s arguments with respect to the rejections have been fully considered, but they are not persuasive. Therefore, these rejections are maintained. 

Applicant submits, “...the claims are directed to a method and apparatus "for improving machine learning by accounting for non-equidistant events"...the Examiner has failed to establish a prima facie case of patent ineligibility under Prong One of Step 2A as articulated in MPEP §2106...The present claims improve the operation of a computer system...claims are directed to a method and apparatus "for improving machine learning by accounting for non-equidistant events." Accordingly, the claims improve the machine learning system by accounting for non-equidistant events. Therefore, under Step 2A, Prong Two, the present claims are patent-eligible because the elements are integrated into the practical application of improving a machine learning system...the human mind alone can not "perform[] a first machine learning using the past metric data as training data to generate a trend" and "perform[] a second machine learning on the residual data to generate learned residual data" as recited in the present claims. As a result, the claims are directed to an invention that "could not, as a practical matter..."” The Examiner respectfully disagrees.

Examiner respectfully notes that, while Applicant's amendments advance prosecution, the claims do not improve the methods of decision trees or random forests, which are themselves abstract elements; rather, the claims seek to improve the prediction and the prediction results, which are abstract. Examiner further notes that, while the human mind cannot practically perform certain machine learning steps, the steps and methods claimed here (i.e., making trend predictions, decision trees, random forests) are mathematical concepts that have been practically performed by humans with pen and paper.

Analyzing under Step 2A, Prong 1:
The limitations regarding ...receives information from the ..., wherein the information includes past metric data, trains a first ... model using past metric data as training data, generates a trend using the first ... model, generates a trend prediction data from the trend, generates residual data by subtracting the trend prediction data from the training data, trains a second ... model based on the residual data, generates learned residual data using the second ... model, calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the non-equidistant events that occur in unequal and fluctuating time intervals, generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the non-equidistant events, calculates prediction result data, wherein the prediction result data is calculated as a function of both the trend prediction data and the event prediction data, determines whether a base line shift occurs based on a difference between the training data and the prediction result data in a terminal period of the training data, and on an occurrence of the base line shift, shifts the prediction result data in a future period to generate prediction data in such a manner that the prediction result data in an initial period of the future period is closer to the terminal period of the past period…, under the broadest reasonable interpretation, may be interpreted to include a human reasonably using their mind with pen and paper to perform each of these same steps; therefore, the claims are directed to a mental process.

Further, the same limitations recited above, under the broadest reasonable interpretation, are directed to mathematical concepts.

Accordingly, the claims are directed to a mental process and mathematical concepts, and thus, the claims are directed to an abstract idea under the first prong of Step 2A.


Analyzing under Step 2A, Prong 2:
This judicial exception is not integrated into a practical application under the second prong of Step 2A. 
In particular, the claims recite the additional elements beyond the recited abstract idea identified under Step 2A, Prong 1, such as:

Claims 1 and 10: an apparatus for ... machine learning, the apparatus comprising: a memory for storing data; an input/output device; and a processor, operatively coupled to and in communication with the memory and the input/output device, wherein the processor ...; the input/output device; machine learning; and a processor.

Pursuant to the broadest reasonable interpretation, as an ordered combination, each of the additional elements is a computing element recited at a high level of generality implementing the abstract idea, and thus amounts to no more than applying the abstract idea with generic computer components. Further, these additional elements generally link the abstract idea to a technical environment, namely the environment of a computer.


Analyzing under Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B. 
As noted above, the aforementioned additional elements beyond the recited abstract idea are not sufficient to amount to significantly more than the recited abstract idea because, as an ordered combination, the additional elements are no more than mere instructions to implement the idea using generic computer components (i.e., "apply it").
Additionally, as an ordered combination, the additional elements append the recited abstract idea to well-understood, routine, and conventional activities in the field, as individually evinced, as required by the Berkheimer Memo, by the applicant’s own disclosure in at least: [0024] In FIG. 2, the management server 101 can be realized using, for example, a commonly used computing machine. The management server 101 includes herein a central processing unit (CPU) 201, a memory 202, an auxiliary storage device 203, a communication interface 204, a media interface 205, and an input/output device 206. [0025] The auxiliary storage device 203 is an apparatus that records data in a data writable and readable fashion, and stores a program for specifying operations of the CPU 201 and the like. The communication interface 204 communicates with an external apparatus such as the console 105 via the network 103. The media interface 205 writes and reads data to and from an external recording medium 207. The input/output device 206 is connected to the console 105 operating the management server 101. [0026] The CPU 201 reads the program for specifying the operations of the management server 101 from the auxiliary storage device 203 to the memory 202, and executes the program using the memory 202, thereby realizing the interface 111, the data acquisition section 112, the future prediction section 113, and the analysis result display section 114 depicted in FIG. 1. It is noted that the program may be stored in the auxiliary storage device 203 in advance, acquired from the external apparatus via the communication interface 204 when needed, or acquired from the external recording medium 207 via the media interface 205. Examples of the external recording medium 207 include a hard disc, a solid state drive (SSD), an integrated circuit (IC) card, a secure digital (SD) memory card, and a digital versatile disc (DVD). [0027] It is noted that part of or all of configurations, functions, and the like of the management server 101 may be realized by hardware by, for example, being designed with an integrated circuit. [0093] A person skilled in the art can carry out the present disclosure in various other manners.

Furthermore, as an ordered combination, these elements amount to generic computer components receiving or transmitting data over a network, performing repetitive calculations, electronic record keeping, and storing and retrieving information in memory, which, as held by the courts, are well-understood, routine, and conventional. See MPEP 2106.05(d).

The present limitations are directed to an abstract idea as described above with respect to the first prong of Step 2A, i.e., a mental process (i.e., a human receiving data, a human observing data, a human making a trend prediction on observed data) and mathematical concepts (i.e., a human making trend predictions from observation), generally linked to a technical environment, i.e., a computer, as analyzed under Step 2A, Prong 2. Thus, the claims are not “an improvement in the functioning of a computer, or an improvement to other technology or technical field” and do not integrate the recited abstract idea into a practical application. Even novel and newly discovered judicial exceptions are still exceptions, despite their novelty. July 2015 Update, p. 3; see SAP America Inc. v. Investpic, LLC, No. 2017-2081, slip op. at 2 (Fed. Cir. May 15, 2018).

Simply reciting specific limitations that narrow the abstract idea does not make an abstract idea non-abstract. 79 Fed. Reg. 74631; buySAFE Inc. v. Google, Inc., 765 F.3d 1350, 1355 (Fed. Cir. 2014); see SAP America at p. 12. As discussed in SAP America, no matter how much of an advance the claims recite, when “the advance lies entirely in the realm of abstract ideas, with no plausibly alleged innovation in the non-abstract application realm,” “[a]n advance of that nature is ineligible for patenting.” Id. at p. 3.



Response to Arguments - Prior Art Rejections
Applicant’s arguments with respect to the rejections have been fully considered, but they are not persuasive. Therefore, these rejections are maintained. 

Applicant submits, “...None of the cited references teach an apparatus for improving machine learning by accounting for non-equidistant events with a processor that "generates an event prediction model based on the event prediction data" that "utilizes explanatory variables to predict the non-equidistant events" where "the non-equidistant events that occur in unequal and fluctuating time intervals" as recited in the amended claims....The additional teachings of Wang do not cure the shortcomings of Utsumi. In contrast, Wang teaches a system where "the historical data may be fitted by at least one of the following filters: Savitzky-Golay filter (ie SG filter), Kalman filter, moving window averaging filter, Butterworth filter" and "when the SG filter is used, the SG filter can preserve the characteristics of its high statistical moment while smoothing the data and filtering the noise, thus ensuring a more reliable comparison between the actual data and the predicted baseline." (Wang, Pg. 5). Accordingly, Wang teaches a system that applies a mathematic filter to smooth the data and remove the noise. ....Wang does not teach an apparatus for improving machine learning by accounting for non-equidistant events with a processor that "generates an event prediction model based on the event prediction data" that "utilizes explanatory variables to predict the non-equidistant events" where "the non-equidistant events that occur in unequal and fluctuating time intervals" as recited in the amended claims. Instead, Wang teaches a system that deliberately removes the "non-equidistant events." As a result, Wang, alone or in combination with Utsumi, fails to teach or suggest every feature recited in the amended claims....” The Examiner respectfully disagrees.


Although implied, Utsumi does not expressly disclose the following limitations, which, however, are taught by Wang:
calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the non-equidistant events that occur in unequal and fluctuating time intervals (in at least [pg5 para12] the alarm system provides a solution for integrating dynamic machine baseline prediction by implementing a plurality of machine learning methods, and the alarm system selects an appropriate machine learning method from the plurality of machine learning methods as a predictive model to historical data. Processing and predicting the corresponding dynamic baseline helps to improve the prediction accuracy of the dynamic baseline. [pg8 para9] for historical data sequentially arranged in time series, the alarm system moves the m windows of the length in time series so that the SG differentiator can repeat the intra-window by local least squares (LS). The historical data implements a local polynomial fitting, and the order of the polynomial is the smoothing order p described above; since the smoothing process of the historical data can be realized in the fitting process, the obtained local polynomial can be called a smoothing polynomial [pg9 para6] FIG. 7 is a schematic diagram of an abnormal monitoring of throughput of a data center according to an exemplary embodiment. As shown in FIG. 7, by using the abnormal monitoring scheme of this specification, accurate abnormal monitoring of the throughput of the data center can be achieved. 
For example, in Table 1 below, Sample 1, Sample 2, and Sample 3 are three samples for abnormal monitoring of data center throughput using the anomaly monitoring scheme of this specification, each sample containing 1,440 actual values; From the accuracy, recall rate, false positive rate and comprehensive evaluation indicators (such as F1-score, that is, F1 score) and other dimensions to judge the effect of abnormal monitoring, it can be seen that the abnormal monitoring program of this specification can not only effectively identify abnormalities, It also maintains a low false positive rate and meets the actual monitoring needs of the data center.)
generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the non-equidistant events (in at least [pg9 para6] FIG. 7 is a schematic diagram of an abnormal monitoring of throughput of a data center according to an exemplary embodiment. As shown in FIG. 7, by using the abnormal monitoring scheme of this specification, accurate abnormal monitoring of the throughput of the data center can be achieved. For example, in Table 1 below, Sample 1, Sample 2, and Sample 3 are three samples for abnormal monitoring of data center throughput using the anomaly monitoring scheme of this specification, each sample containing 1,440 actual values; From the accuracy, recall rate, false positive rate and comprehensive evaluation indicators (such as F1-score, that is, F1 score) and other dimensions to judge the effect of abnormal monitoring, it can be seen that the abnormal monitoring program of this specification can not only effectively identify abnormalities, It also maintains a low false positive rate and meets the actual monitoring needs of the data center.)
determines whether a base line shift occurs based on a difference between the training data and the prediction result data in a terminal period of the training data (in at least [pg7 para1-11] the present specification proposes a correction to the Z-score model in the related art to process the historical data by the modified Z-score model. The correction may include: adjusting the abnormality determination threshold from "3" to "3.5" for the "three-sigma" rule adopted by the Z-score model in the related art, that is, the absolute value of the score Mi corresponding to any time point i When the value is greater than 3.5 (ie, |Mi| > 3.5), the time point i can be marked as abnormal, which can improve the elimination effect of the absolute median (MAD) on the extreme points and avoid the negative impact of the extreme points on the statistical distribution of the data. To help improve the prediction accuracy of dynamic baselines. [pg9 para1-5] the second derivative of the dynamic baseline is the second derivative of the baseline value corresponding to each time point on the dynamic baseline. The second derivative of the actual data is the second derivative of the actual data corresponding to the actual value at each time point. The absolute difference Differ_sg is used to represent the difference in curvature between the dynamic baseline and the actual data...the alarm system may move the sliding window of the preset length in a time series, and implement a local maximum method for the baseline value in the sliding window in the dynamic baseline to determine a local maximum value in the sliding window. As a local peak; similarly, by moving the sliding window over time series, the local maximum method can be reused to determine local peaks in the baseline values within the sliding window. Then, the alarm system may use a preset area in the vicinity of each local peak as a peak area, for example, the preset area may include an adjacent time point, a time point where the interval is not greater than a preset interval, and the like, which is not limited in this specification)
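
For illustration only, the modified Z-score check described in the cited passage of Wang (pg. 7) can be sketched as follows. The sketch is not taken from Wang: the function names are hypothetical, and the scaling constant 0.6745 is an assumption drawn from the commonly used form of the modified Z-score. The passage quoted above specifies only the use of the absolute median (MAD) and the 3.5 threshold.

```python
from statistics import median


def modified_z_scores(values):
    """Return one modified Z-score per value, scaled by the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Simplification (assumption): with zero MAD, treat every point as unremarkable.
        return [0.0 for _ in values]
    # 0.6745 is assumed from the commonly used formulation, not quoted from Wang.
    return [0.6745 * (v - med) / mad for v in values]


def abnormal_points(values, threshold=3.5):
    """Return the indices i where |M_i| exceeds the threshold (3.5 in the quoted passage)."""
    scores = modified_z_scores(values)
    return [i for i, m in enumerate(scores) if abs(m) > threshold]
```

Under these assumptions, a single extreme value in an otherwise steady series is flagged, while ordinary fluctuations around the median are not.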



At the time the invention was filed, it would have been obvious for one of ordinary skill in the art to have modified the teachings of Utsumi by, ...by collecting data of performance indicators of the monitored system in the system (ie, performance data), and comparing the performance data with a predefined performance threshold, the performance data may be judged correspondingly if the performance data does not meet the performance threshold ...the alarm system provides a solution for integrating dynamic machine baseline prediction by implementing a plurality of machine learning methods, and the alarm system selects an appropriate machine learning method from the plurality of machine learning methods as a predictive model to historical data. Processing and predicting the corresponding dynamic baseline helps to improve the prediction accuracy of the dynamic baseline. ... for historical data sequentially arranged in time series, the alarm system moves the m windows of the length in time series so that the SG differentiator can repeat the intra-window by local least squares (LS). The historical data implements a local polynomial fitting, and the order of the polynomial is the smoothing order p described above; since the smoothing process of the historical data can be realized in the fitting process, the obtained local polynomial can be called a smoothing polynomial...FIG. 7 is a schematic diagram of an abnormal monitoring of throughput of a data center according to an exemplary embodiment. As shown in FIG. 7, by using the abnormal monitoring scheme of this specification, accurate abnormal monitoring of the throughput of the data center can be achieved. 
For example, in Table 1 below, Sample 1, Sample 2, and Sample 3 are three samples for abnormal monitoring of data center throughput using the anomaly monitoring scheme of this specification, each sample containing 1,440 actual values; From the accuracy, recall rate, false positive rate and comprehensive evaluation indicators (such as F1-score, that is, F1 score) and other dimensions to judge the effect of abnormal monitoring, it can be seen that the abnormal monitoring program of this specification can not only effectively identify abnormalities, It also maintains a low false positive rate and meets the actual monitoring needs of the data center...the present specification proposes a correction to the Z-score model in the related art to process the historical data by the modified Z-score model. The correction may include: adjusting the abnormality determination threshold from "3" to "3.5" for the "three-sigma" rule adopted by the Z-score model in the related art, that is, the absolute value of the score Mi corresponding to any time point i When the value is greater than 3.5 (ie, |Mi| > 3.5), the time point i can be marked as abnormal, which can improve the elimination effect of the absolute median (MAD) on the extreme points and avoid the negative impact of the extreme points on the statistical distribution of the data. To help improve the prediction accuracy of dynamic baselines...the second derivative of the dynamic baseline is the second derivative of the baseline value corresponding to each time point on the dynamic baseline. The second derivative of the actual data is the second derivative of the actual data corresponding to the actual value at each time point.
The absolute difference Differ_sg is used to represent the difference in curvature between the dynamic baseline and the actual data...the alarm system may move the sliding window of the preset length in a time series, and implement a local maximum method for the baseline value in the sliding window in the dynamic baseline to determine a local maximum value in the sliding window. As a local peak; similarly, by moving the sliding window over time series, the local maximum method can be reused to determine local peaks in the baseline values within the sliding window. Then, the alarm system may use a preset area in the vicinity of each local peak as a peak area, for example, the preset area may include an adjacent time point, a time point where the interval is not greater than a preset interval, and the like, which is not limited in this specification..., as taught by Wang, with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to the teachings of Utsumi with the motivation of, …configuring an alarm mechanism in the data center and other systems, you can monitor the running status of the monitored system in the system to detect and resolve abnormal conditions that may occur in the monitored system...by optimizing and improving the abnormality detection scheme of the alarm system 14, the monitoring operation can be more accurate and sensitive, and avoiding the false alarm of abnormality, thereby causing waste of human resources or other resources of the staff. To ensure the normal operation of the data center.
Among them, the data center is only one application object of the anomaly detection scheme provided in the present specification; in fact, in addition to the data center, the anomaly detection scheme of the present specification can be applied to any other electronic device, structure or system, this specification...to improve the prediction accuracy of the predicted baseline...To improve the accuracy of abnormal monitoring...helps to improve the accuracy of the alarm system for the corresponding performance indicators by ensuring the integrity of the historical data corresponding to each performance indicator in time series, and avoids false positives or false negatives at abnormal time points...corresponding dynamic baseline helps to improve the prediction accuracy of the dynamic baseline....can improve the elimination effect of the absolute median (MAD) on the extreme points and avoid the negative impact of the extreme points on the statistical distribution of the data. To help improve the prediction accuracy of dynamic baselines..., as recited in Wang.
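
For illustration only, the sliding-window local polynomial fitting that the quoted portions of Wang attribute to the SG filter (a window of length m moved along the time series, with a local least-squares polynomial of smoothing order p fitted inside each window) can be sketched as follows. This is a minimal sketch and not Wang's implementation: the function names, the default window length and order, and the truncation of the window at the ends of the series are all assumptions.

```python
def local_poly_value(xs, ys, order, x0):
    """Fit a least-squares polynomial of the given order to (xs, ys); return its value at x0."""
    k = order + 1
    s = [x - x0 for x in xs]  # centre on x0 so the constant term is the fitted value at x0
    # Normal equations a * c = b for the polynomial coefficients c.
    a = [[sum(v ** (i + j) for v in s) for j in range(k)] for i in range(k)]
    b = [sum(y * v ** i for v, y in zip(s, ys)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            f = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= f * a[col][j]
            b[row] -= f * b[col]
    # Back substitution.
    c = [0.0] * k
    for i in reversed(range(k)):
        c[i] = (b[i] - sum(a[i][j] * c[j] for j in range(i + 1, k))) / a[i][i]
    return c[0]


def sg_smooth(values, window=5, order=2):
    """Smooth a series by fitting a local polynomial in a window around each point."""
    half = window // 2
    if half < order:
        raise ValueError("window too short for the requested polynomial order")
    n = len(values)
    smoothed = []
    for i in range(n):
        # Assumption: near the ends of the series the window is simply truncated.
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(local_poly_value(list(range(lo, hi)), values[lo:hi], order, i))
    return smoothed
```

Under these assumptions, a series that already follows a polynomial of the chosen order passes through unchanged, while an isolated spike is attenuated, which is the smoothing behavior Applicant's argument and the quoted passages both describe.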



Claim Rejections – 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-12 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claim 1 (and similarly claim 10) recites,
“... improving ... by accounting for non-equidistant events, the ...: 
receives information from the ..., wherein the information includes past metric data, 
trains a first ... model using past metric data as training data, 
generates a trend using the first ... model, 
generates a trend prediction data from the trend, 
generates residual data by subtracting the trend prediction data from the training data, 
trains a second ... model based on the residual data, 
generates learned residual data using the second ... model, 
calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the non-equidistant events that occur in unequal and fluctuating time intervals, 
generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the non-equidistant events, 
calculates prediction result data, wherein the prediction result data is calculated as a function of both the trend prediction data and the event prediction data, 
determines whether a base line shift occurs based on a difference between the training data and the prediction result data in a terminal period of the training data, and 
on an occurrence of the base line shift, shifts the prediction result data in a future period to generate prediction data in such a manner that the prediction result data in an initial period of the future period is closer to the terminal period of the past period.”  
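
For illustration only, and without limiting the interpretation of the claim, the recited sequence of a trend, residual data, a second model learned on the residual data, combined prediction result data, and a base line shift can be sketched as follows. The sketch makes several assumptions that do not come from the claims: a least-squares line stands in for the first model, a per-phase mean of the residuals stands in for the second model, the event prediction step is collapsed into the learned residual, and the names `predict`, `tail`, and `shift_threshold` are hypothetical.

```python
from statistics import mean


def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; stand-in for the first model."""
    n = len(ys)
    x_bar, y_bar = (n - 1) / 2, mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    slope = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys)) / sxx
    return slope, y_bar - slope * x_bar


def fit_phase_means(residuals, period):
    """Mean residual at each position of a repeating cycle; stand-in for the second model."""
    return [mean(residuals[p::period]) for p in range(period)]


def predict(training, period, horizon, tail=3, shift_threshold=1.0):
    """Return (future predictions, whether a base line shift was applied)."""
    n = len(training)
    slope, intercept = fit_line(training)
    trend = [slope * x + intercept for x in range(n + horizon)]
    # Residual data: training data minus the trend prediction data.
    residuals = [y - t for y, t in zip(training, trend)]
    learned = fit_phase_means(residuals, period)
    # Prediction result data as a function of the trend and the learned residual.
    result = [trend[x] + learned[x % period] for x in range(n + horizon)]
    # Difference between training data and prediction result in the terminal period.
    gap = mean(training[n - tail:]) - mean(result[n - tail:n])
    shifted = abs(gap) > shift_threshold
    future = result[n:]
    if shifted:
        # Move the future predictions toward the level seen at the end of the training data.
        future = [v + gap for v in future]
    return future, shifted
```

In this sketch every step reduces to averaging, subtraction, and comparison against a threshold; other model choices (for example, the decision trees or random forests mentioned in the Response to Arguments) would replace the two stand-in fitting functions.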


Analyzing under Step 2A, Prong 1:
The limitations regarding ...receives information from the ..., wherein the information includes past metric data, trains a first ... model using past metric data as training data, generates a trend using the first ... model, generates a trend prediction data from the trend, generates residual data by subtracting the trend prediction data from the training data, trains a second ... model based on the residual data, generates learned residual data using the second ... model, calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the non-equidistant events that occur in unequal and fluctuating time intervals, generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the non-equidistant events, calculates prediction result data, wherein the prediction result data is calculated as a function of both the trend prediction data and the event prediction data, determines whether a base line shift occurs based on a difference between the training data and the prediction result data in a terminal period of the training data, and on an occurrence of the base line shift, shifts the prediction result data in a future period to generate prediction data in such a manner that the prediction result data in an initial period of the future period is closer to the terminal period of the past period…, under the broadest reasonable interpretation, may be interpreted to include a human reasonably using their mind with pen and paper to perform each of these same steps; therefore, the claims are directed to a mental process.

Further, the limitations regarding “…receives information from the ..., wherein the information includes past metric data, trains a first ... model using past metric data as training data, generates a trend using the first ... model, generates a trend prediction data from the trend, generates residual data by subtracting the trend prediction data from the training data, trains a second ... model based on the residual data, generates learned residual data using the second ... model, calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the non-equidistant events that occur in unequal and fluctuating time intervals, generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the non-equidistant events, calculates prediction result data, wherein the prediction result data is calculated as a function of both the trend prediction data and the event prediction data, determines whether a base line shift occurs based on a difference between the training data and the prediction result data in a terminal period of the training data, and on an occurrence of the base line shift, shifts the prediction result data in a future period to generate prediction data in such a manner that the prediction result data in an initial period of the future period is closer to the terminal period of the past period…,” under the broadest reasonable interpretation, are directed to mathematical concepts.

Accordingly, the claims are directed to a mental process and mathematical concepts, and thus, the claims are directed to an abstract idea under the first prong of Step 2A.
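
For illustration only, the arithmetic character of three of the recited steps (generating a trend, subtracting to obtain residual data, and shifting predictions on a base line shift) can be sketched as follows. This is a minimal sketch under assumptions: the straight-line trend, the `terminal` window length, the `threshold` value, and every function name are hypothetical and are not drawn from the claims or the specification.

```python
from statistics import mean


def fit_line(y):
    """Least-squares straight line through the points (0, y[0]), (1, y[1]), ...

    Returns (slope, intercept). A straight line is an assumed stand-in
    for whatever trend model is actually used.
    """
    n = len(y)
    x_bar = (n - 1) / 2
    y_bar = mean(y)
    sxx = sum((i - x_bar) ** 2 for i in range(n))
    sxy = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals(training, trend_prediction):
    """Residual data: training data minus trend prediction data."""
    return [t - p for t, p in zip(training, trend_prediction)]


def shift_if_baseline_moved(training, fitted, future, terminal=3, threshold=1.0):
    """Shift future predictions when the terminal-period gap is large.

    The gap is the mean difference between training data and fitted values
    over the last `terminal` points (assumed definition of a base line shift).
    Returns (possibly shifted future values, whether a shift was applied).
    """
    gap = mean(t - f for t, f in zip(training[-terminal:], fitted[-terminal:]))
    if abs(gap) > threshold:
        return [v + gap for v in future], True
    return list(future), False
```

Each function reduces to sums, differences, and a comparison against a threshold, which is the sense in which the analysis above characterizes the steps as mathematical concepts.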


Analyzing under Step 2A, Prong 2:
This judicial exception is not integrated into a practical application under the second prong of Step 2A. 
In particular, the claims recite the additional elements beyond the recited abstract idea identified under Step 2A, Prong 1, such as:

Claims 1 and 10: an apparatus for machine learning, the apparatus comprising: a memory for storing data; an input/output device; and a processor, operatively coupled to and in communication with the memory and the input/output device.

Pursuant to the broadest reasonable interpretation, as an ordered combination, each of these additional elements is a computing element recited at a high level of generality that implements the abstract idea, and thus amounts to no more than applying the abstract idea with generic computer components. Further, these additional elements generally link the abstract idea to a technical environment, namely the environment of a computer.


Analyzing under Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B. 
As noted above, the aforementioned additional elements beyond the recited abstract idea are not sufficient to amount to significantly more than the recited abstract idea because, as an ordered combination, the additional elements are no more than mere instructions to implement the idea using generic computer components (i.e., apply it).
Additionally, as an ordered combination, the additional elements append the recited abstract idea to well-understood, routine, and conventional activities in the field as individually evinced, as required by the Berkheimer Memo, by the applicant’s own disclosure in at least: [0024] In FIG. 2, the management server 101 can be realized using, for example, a commonly used computing machine. The management server 101 includes herein a central processing unit (CPU) 201, a memory 202, an auxiliary storage device 203, a communication interface 204, a media interface 205, and an input/output device 206. [0025] The auxiliary storage device 203 is an apparatus that records data in a data writable and readable fashion, and stores a program for specifying operations of the CPU 201 and the like. The communication interface 204 communicates with an external apparatus such as the console 105 via the network 103. The media interface 205 writes and reads data to and from an external recording medium 207. The input/output device 206 is connected to the console 105 operating the management server 101. [0026] The CPU 201 reads the program for specifying the operations of the management server 101 from the auxiliary storage device 203 to the memory 202, and executes the program using the memory 202, thereby realizing the interface 111, the data acquisition section 112, the future prediction section 113, and the analysis result display section 114 depicted in FIG. 1. It is noted that the program may be stored in the auxiliary storage device 203 in advance, acquired from the external apparatus via the communication interface 204 when needed, or acquired from the external recording medium 207 via the media interface 205. Examples of the external recording medium 207 include a hard disc, a solid state drive (SSD), an integrated circuit (IC) card, a secure digital (SD) memory card, and a digital versatile disc (DVD).
[0027] It is noted that part of or all of configurations, functions, and the like of the management server 101 may be realized by hardware by, for example, being designed with an integrated circuit. [0093] A person skilled in the art can carry out the present disclosure in various other manners.

Furthermore, as an ordered combination, these elements amount to generic computer components receiving or transmitting data over a network, performing repetitive calculations, electronic record keeping, and storing and retrieving information in memory, which, as held by the courts, are well-understood, routine, and conventional. See MPEP 2106.05(d).

Moreover, the remaining elements of dependent claims do not transform the recited abstract idea into a patent eligible invention because these remaining elements merely recite further abstract limitations that provide nothing more than simply a narrowing of the abstract idea recited in the independent claims. 

Looking at these limitations as an ordered combination adds nothing additional that is sufficient to amount to significantly more than the recited abstract idea because they simply provide instructions to use a generic arrangement of generic computer components to “apply” the recited abstract idea, perform insignificant extra-solution activity, and generally link the abstract idea to a technical environment. Thus, the elements of the claims, considered both individually and as an ordered combination, are not sufficient to ensure that the claim as a whole amounts to significantly more than the abstract idea itself. Since there are no limitations in these claims that transform the exception into a patent eligible application such that these claims amount to significantly more than the exception itself, claims 1-12 are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.

Claim Rejections – 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
Determining the scope and contents of the prior art.
Ascertaining the differences between the prior art and the claims at issue.
Resolving the level of ordinary skill in the pertinent art.
Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-6, 8, and 10-12 are rejected under 35 U.S.C. 103 as being unpatentable over WIPO Publication WO2017212880A1 to Utsumi et al. (hereinafter referred to as “Utsumi”) in view of CN Patent Publication CN109542740A to Wang et al. (hereinafter referred to as “Wang”).

As per Claim 1, Utsumi teaches: (Currently Amended) An apparatus for improving machine learning by accounting for non-equidistant events, the apparatus comprising: a memory for storing data; an input/output device; and a processor, operatively coupled to and in communication with the memory and the input/output device, wherein the processor: ([pg3 ln52-59][pg4 ln5-25]) 
receives information from the input/output device, wherein the information includes past metric data, (in at least [pg5 ln12-29] The data management device 3 receives the explanatory variable past measurement data 352 A transmitted from the data observation device 6 or the data distribution device 7 and stores it in the explanatory variable past measurement data storage unit 352. Further, the data management device 3 stores the prediction target past measurement data 351 A transmitted from the data observation device 6 or the data distribution device 7 in the prediction target past measurement data storage means 351. The explanatory variable past measurement data 352 A includes the past prediction target past measurement data 351 A such as, for example, weather data such as temperature, humidity, solar radiation, wind speed, atmospheric pressure, and data indicating occurrence / nonexistence of a sudden event such as a typhoon or an event Which can be explained.)
trains a first machine learning model using past metric data as training data, (in at least [pg4 ln49-53] The prediction calculation unit 251 calculates a prediction error by calculating a difference or the like based on the prediction target past measurement data 351A including the latest measurement value, and models the occurrence tendency of the prediction error. Thus, the prediction calculation unit 251 calculates the first prediction error amount at an arbitrary future time, and corrects the first prediction calculation result with the calculated future error amount. [pg6 ln22-27] the first prediction calculation unit 251A performs prediction. Known methods include, for example, a prediction method (prediction method) based on an arithmetic average value of a similar past period (similar days, etc.) set in advance via the information input / output terminal 4 based on the day of the week or the temperature. Other known methods include a prediction method using a single regression model or a multiple regression model, a prediction method using a neural network, a prediction method using time series analysis such as an AR model or an ARIMA model, and the like.)
generates a trend using the first machine learning model, (in at least [pg5 ln57-60] The first, second, and third prediction calculation result data 302, 304, and 305 calculate data of a section indicating a fluctuation range of the predicted value, or predicted value, in addition to future predicted values of the prediction target past measurement data 351)
generates a trend prediction data from the trend, (in at least [pg5 ln57-60] The first, second, and third prediction calculation result data 302, 304, and 305 calculate data of a section indicating a fluctuation range of the predicted value, or predicted value, in addition to future predicted values of the prediction target past measurement data 351)
generates residual data by subtracting the trend prediction data from the training data (in at least [pg11 ln2-8] the error sequence generation unit 251 B 1 first calculates a time series 310 B of prediction errors, which is a difference between the first prediction calculation result data 302 and the latest observation data 303. The error sequence generation unit 251 B 1 also generates a time series 310 A of prediction errors which is a difference between the prediction calculation result data 253 A which is the first prediction calculation result data 302 in a predetermined past period and the prediction target past measurement data 351 A in the same past period . The error sequence generation unit 251 B 1 connects the series of both prediction errors as an error sequence 310 which is one time series data.)
trains a second machine learning model based on the residual data, generates learned residual data using the second machine learning model, (in at least [pg11 ln10-17] the error model identifying unit 251 B 2 uses the time series analysis method to determine the degree in the time series model such as the AR model or the ARIMA model and estimate the coefficients. For the determination of the degree, Akaike information criterion (AIC) under several orders is calculated, and using a known method such that the degree to which the Akaike information criterion value is the smallest is applied It is also good. For the determination of the degree, a method of applying a lag whose statistically significant value of autocorrelation or partial autocorrelation of time series data is significant may be used. In the estimation of the coefficients, a known method such as estimation by the least squares method under the applied order may be applied.)
calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the ... time intervals, (in at least [pg12 ln11-35] FIG. 8 shows the prediction calculation result and the actual observation result (upper part of FIG. 8) in a certain period and the error amount of prediction in the same period (lower part of FIG. 8). Also, the current time at which the data prediction system 12 in this embodiment is operating is indicated as "current 306" in the figure. The upper graph of FIG. 8 shows the relationship between "first prediction calculation result data 302 and prediction calculation result data 253 A" calculated by the first prediction calculation unit 251 A, and "prediction target past measurement data 351 A" and " Data 303 "is obtained. In FIG. 8, it is assumed that the latest observation data is collected two periods later than "current 306". Therefore, in the second prediction calculation section 251 B, as the prediction error confirmed at the "present 306" time point, a series of errors shown in the "error series 310 of the first prediction calculation result" in the lower part of FIG. 8 is calculated. Further, the second predictive computing section 251 B generates a model of the occurrence tendency of errors from this error series, so that the error amount of the first prediction computation up to the future period preset via the information input / output terminal 4 is set to , And calculates it as "second predictive calculation result data 304" shown in FIG. Here, the generation of the error occurrence tendency model performed by the second prediction calculation unit 251 B may be performed by a known method such as a time series analysis method using an AR model or an ARIMA model as described above And the method shown in FIG. 5 and FIG. 6 may be applied as described above. Finally, the predicted value correction unit 252 adds the "second prediction calculation result data 304" to the "first prediction calculation result data 302", thereby outputting the "third prediction calculation result data 305".)
generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the ... events, (in at least [pg6 ln33-45] The second prediction calculation unit 251B acquires actual measurement values for the same period from the prediction target past measurement data 351A or the latest observation data 303 acquired from the data observation device 6. The second prediction calculation unit 251B calculates first prediction error data (error series 310) as a difference between the predicted value and the actual measured value (S402)...An error generation tendency model is created from the calculated first prediction error data, and the first prediction error amount for a predetermined future period is calculated as the second prediction calculation result data 304 from the created model. (S403). The method used when the second prediction calculation unit 251B performs the prediction is the same as the method used when the first prediction calculation unit 251A described above performs the prediction, [pg14 ln59-pg15 ln4] referring to FIG. 12, the second prediction value correction unit 252B in the second embodiment The data flow between functions and the flow of processing will be described. The second predicted value correction unit 252B in the present embodiment calculates the amount of fluctuation of the predicted value of the prediction target future period set in advance via the information input / output terminal 4 from the plurality of prediction calculation result data. [pg9 ln37-46] the total value time series of this weighing point cluster is generated by summing the acquired data for each time, and the generated total value time series is set to the time granularity set when generating the time cluster of this weighing point cluster. Divide by. 
Each divided data is classified based on, for example, a day type such as a month, a day of the week, a weekday, or a holiday, and all actual observed values of respective extreme values of the classified data are calculated. A model for predicting the extreme value using a single regression model or a multiple regression model is generated using the actual observation value of the calculated extreme value and the past actual observation value of an explanatory variable such as temperature...Using the model generated last, the predicted value of the extreme value of the prediction target period is calculated and input to the predicted value calculation unit 251A4)
calculates prediction result data, wherein the prediction result data is calculated as a function of both the trend prediction data and the event prediction data, (in at least [pg6 ln47-52] the predicted value correction unit 252 uses the first prediction calculation result data 302 calculated by the first prediction calculation unit 251A based on the second prediction calculation result data 304 calculated by the second prediction calculation unit 251B. It corrects and calculates the third prediction calculation result data 305 (S404). Specifically, for example, the prediction value of the second prediction calculation result data 304 is corrected by adding it to the prediction value of the first prediction calculation result data 302.)
... based on a difference between the training data and the prediction result data in a terminal period of the training data, and (in at least [pg12 ln11-43] FIG. 8 illustrates a prediction calculation result and an actual observation result (upper part of FIG. 8) in a certain period, and a prediction error amount (lower part of FIG. 8) in the same period. The current time when the data prediction system 12 in this embodiment is operating is indicated as “current 306” in the figure...The upper graph in FIG. 8 shows “prediction target past measurement data 351A” and “latest observation” after “first prediction calculation result data 302 and prediction calculation result data 253A” calculated by the first prediction calculation unit 251A. The state where data 303 "is obtained is shown...In FIG. 8, it is assumed that the latest observation data is collected two periods later than “Current 306”. Therefore, the second prediction calculation unit 251B calculates a series of errors indicated by “error series 310 of the first prediction calculation result” in the lower part of FIG. 8 as the prediction error confirmed at the “current 306” time point...the second prediction calculation unit 251B generates an error generation tendency model from the error series, thereby obtaining an error amount of the first prediction calculation until a future period set in advance via the information input / output terminal 4. , Calculated as “second prediction calculation result data 304” shown in FIG...for the generation of the error generation tendency model performed by the second prediction calculation unit 251B, a known method such as a method based on time series analysis using an AR model or an ARIMA model may be applied as described above. However, the method shown in FIGS. 5 and 6 may be applied as described above. 
Finally, the predicted value correction unit 252 outputs “third predicted calculation result data 305” by adding “second predicted calculation result data 304” to “first predicted calculation result data 302”...the second prediction calculation unit 251B models and predicts the fluctuation of the prediction target data that is difficult to explain by the result of the first prediction calculation unit 251A that performs prediction using the explanatory variable data 301. By correcting the first prediction calculation result data 302 of the first prediction calculation unit 251A with the second prediction calculation result data 304 that is a prediction result, fluctuation components that are difficult to explain with main explanatory variables can be obtained. Realize the reflected forecast. Third prediction calculation result data 305 is output as a prediction result [pg17 ln51-54] in FIG. 13. 15 is a graphical representation of data to be input to the second predicted value correction unit 252B in the present embodiment and data output as a result of processing by the second predicted value correction unit 252B in the present embodiment . Here, for the sake of simplicity, processing when inputting the first and fourth prediction calculation result data 302 and 701)
on an occurrence of the base line shift, shifts the prediction result data in a future period to generate prediction data in such a manner that the prediction result data in an initial period of the future period is closer to the terminal period of the past period. (in at least [pg5 ln36-59] The prediction calculation device 2 acquires the prediction target past measurement data 351A and the explanatory variable past measurement data 352A stored in the data management device 3. The prediction calculation device 2 predicts a future value at an arbitrary point in time by the first prediction calculation unit 251A, and additionally stores the prediction calculation result data 253A in the prediction calculation result data storage unit 253...the prediction calculation device 2 inputs the prediction target past measurement data 351A stored in the data management device 3 and the latest observation data 303 transmitted from the data observation device 6 to the second prediction calculation unit 251B. The prediction calculation device 2 calculates the error of the first prediction from a predetermined past date and time, and predicts the error amount at any future date and time of the first prediction by modeling the tendency of occurrence of the error.... the prediction calculation device 2 uses the second prediction calculation result data 304 output from the second prediction calculation unit 251B and the first prediction calculation result data 302 output from the first prediction calculation unit 251A. 
Input to the predicted value correction unit 252....The prediction calculation device 2 corrects the first prediction calculation result data 302 with the second prediction calculation result data 304, outputs the third prediction calculation result data 305, and transmits it to the plan creation / execution management device 5....The first, second, and third prediction calculation result data 302, 304, and 305 calculate, in addition to the future prediction value of the prediction target past measurement data 351A, the data of the section indicating the fluctuation range of the prediction value, or the prediction value [pg12 ln11-43] FIG. 8 illustrates a prediction calculation result and an actual observation result (upper part of FIG. 8) in a certain period, and a prediction error amount (lower part of FIG. 8) in the same period. The current time when the data prediction system 12 in this embodiment is operating is indicated as “current 306” in the figure...The upper graph in FIG. 8 shows “prediction target past measurement data 351A” and “latest observation” after “first prediction calculation result data 302 and prediction calculation result data 253A” calculated by the first prediction calculation unit 251A. The state where data 303 "is obtained is shown...In FIG. 8, it is assumed that the latest observation data is collected two periods later than “Current 306”. Therefore, the second prediction calculation unit 251B calculates a series of errors indicated by “error series 310 of the first prediction calculation result” in the lower part of FIG. 8 as the prediction error confirmed at the “current 306” time point...the second prediction calculation unit 251B generates an error generation tendency model from the error series, thereby obtaining an error amount of the first prediction calculation until a future period set in advance via the information input / output terminal 4. 
, Calculated as “second prediction calculation result data 304” shown in FIG...for the generation of the error generation tendency model performed by the second prediction calculation unit 251B, a known method such as a method based on time series analysis using an AR model or an ARIMA model may be applied as described above. However, the method shown in FIGS. 5 and 6 may be applied as described above. Finally, the predicted value correction unit 252 outputs “third predicted calculation result data 305” by adding “second predicted calculation result data 304” to “first predicted calculation result data 302”...the second prediction calculation unit 251B models and predicts the fluctuation of the prediction target data that is difficult to explain by the result of the first prediction calculation unit 251A that performs prediction using the explanatory variable data 301. By correcting the first prediction calculation result data 302 of the first prediction calculation unit 251A with the second prediction calculation result data 304 that is a prediction result, fluctuation components that are difficult to explain with main explanatory variables can be obtained. Realize the reflected forecast. Third prediction calculation result data 305 is output as a prediction result...[pg17 ln51-54] in FIG. 13. 15 is a graphical representation of data to be input to the second predicted value correction unit 252B in the present embodiment and data output as a result of processing by the second predicted value correction unit 252B in the present embodiment . Here, for the sake of simplicity, processing when inputting the first and fourth prediction calculation result data 302 and 701)
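
The correction flow described in the cited Utsumi passages (an error series formed as a difference, an error tendency model whose coefficients are estimated by least squares, and a corrected prediction obtained by addition) can be sketched as follows. This is a minimal sketch under assumptions: a first-order autoregressive model without intercept stands in for the "AR model or ARIMA model" the reference mentions, the sign convention (actual minus predicted) is assumed, and all function names are hypothetical rather than taken from Utsumi.

```python
def error_series(actual, predicted):
    """Prediction error per time step, assumed here as actual minus predicted."""
    return [a - p for a, p in zip(actual, predicted)]


def fit_ar1(errors):
    """Least-squares coefficient phi for the model e[t] = phi * e[t-1].

    Returns 0.0 when the denominator vanishes (all earlier errors are zero).
    """
    num = sum(e1 * e0 for e0, e1 in zip(errors, errors[1:]))
    den = sum(e0 * e0 for e0 in errors[:-1])
    return num / den if den else 0.0


def forecast_errors(last_error, phi, steps):
    """Roll the AR(1) model forward to predict future error amounts."""
    out = []
    e = last_error
    for _ in range(steps):
        e = phi * e
        out.append(e)
    return out


def corrected_prediction(first_prediction, predicted_errors):
    """Add the predicted error amount to the first prediction."""
    return [p + e for p, e in zip(first_prediction, predicted_errors)]
```

In the terms of the reference, `error_series` corresponds loosely to the error series 310, `forecast_errors` to the second prediction calculation result data 304, and `corrected_prediction` to the third prediction calculation result data 305; that correspondence is an interpretive assumption.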

Although implied, Utsumi does not expressly disclose the following limitations, which, however, are taught by Wang:
calculates event prediction data based on the learned residual data and actual measured value data, wherein the event prediction data is a set of predicted values of time series data in an intended period including a past certain period and the event prediction data includes the non-equidistant events that occur in unequal and fluctuating time intervals (in at least [pg5 para12] the alarm system provides a solution for integrating dynamic machine baseline prediction by implementing a plurality of machine learning methods, and the alarm system selects an appropriate machine learning method from the plurality of machine learning methods as a predictive model to historical data. Processing and predicting the corresponding dynamic baseline helps to improve the prediction accuracy of the dynamic baseline. [pg8 para9] for historical data sequentially arranged in time series, the alarm system moves the m windows of the length in time series so that the SG differentiator can repeat the intra-window by local least squares (LS). The historical data implements a local polynomial fitting, and the order of the polynomial is the smoothing order p described above; since the smoothing process of the historical data can be realized in the fitting process, the obtained local polynomial can be called a smoothing polynomial [pg9 para6] FIG. 7 is a schematic diagram of an abnormal monitoring of throughput of a data center according to an exemplary embodiment. As shown in FIG. 7, by using the abnormal monitoring scheme of this specification, accurate abnormal monitoring of the throughput of the data center can be achieved. 
For example, in Table 1 below, Sample 1, Sample 2, and Sample 3 are three samples for abnormal monitoring of data center throughput using the anomaly monitoring scheme of this specification, each sample containing 1,440 actual values; From the accuracy, recall rate, false positive rate and comprehensive evaluation indicators (such as F1-score, that is, F1 score) and other dimensions to judge the effect of abnormal monitoring, it can be seen that the abnormal monitoring program of this specification can not only effectively identify abnormalities, It also maintains a low false positive rate and meets the actual monitoring needs of the data center.)
generates an event prediction model based on the event prediction data, wherein the event prediction model utilizes explanatory variables to predict the non-equidistant events (in at least [pg9 para6] FIG. 7 is a schematic diagram of an abnormal monitoring of throughput of a data center according to an exemplary embodiment. As shown in FIG. 7, by using the abnormal monitoring scheme of this specification, accurate abnormal monitoring of the throughput of the data center can be achieved. For example, in Table 1 below, Sample 1, Sample 2, and Sample 3 are three samples for abnormal monitoring of data center throughput using the anomaly monitoring scheme of this specification, each sample containing 1,440 actual values; From the accuracy, recall rate, false positive rate and comprehensive evaluation indicators (such as F1-score, that is, F1 score) and other dimensions to judge the effect of abnormal monitoring, it can be seen that the abnormal monitoring program of this specification can not only effectively identify abnormalities, It also maintains a low false positive rate and meets the actual monitoring needs of the data center.)
determines whether a base line shift occurs based on a difference between the training data and the prediction result data in a terminal period of the training data (in at least [pg7 para1-11] the present specification proposes a correction to the Z-score model in the related art to process the historical data by the modified Z-score model. The correction may include: adjusting the abnormality determination threshold from "3" to "3.5" for the "three-sigma" rule adopted by the Z-score model in the related art, that is, the absolute value of the score Mi corresponding to any time point i When the value is greater than 3.5 (ie, |Mi|⟩3.5), the time point i can be marked as abnormal, which can improve the elimination effect of the absolute median (MAD) on the extreme points and avoid the negative impact of the extreme points on the statistical distribution of the data. To help improve the prediction accuracy of dynamic baselines. [pg9 para1-5] the second derivative of the dynamic baseline is the second derivative of the baseline value corresponding to each time point on the dynamic baseline. The second derivative of the actual data is the second derivative of the actual data corresponding to the actual value at each time point. The absolute difference Differ_sg is used to represent the difference in curvature between the dynamic baseline and the actual data...the alarm system may move the sliding window of the preset length in a time series, and implement a local maximum method for the baseline value in the sliding window in the dynamic baseline to determine a local maximum value in the sliding window. As a local peak; similarly, by moving the sliding window over time series, the local maximum method can be reused to determine local peaks in the baseline values within the sliding window. 
Then, the alarm system may use a preset area in the vicinity of each local peak as a peak area, for example, the preset area may include an adjacent time point, a time point where the interval is not greater than a preset interval, and the like, which is not limited in this specification)
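
For illustration only, the modified Z-score screening quoted from Wang above can be sketched as follows. This is a minimal sketch and not Wang's implementation: the function names, the sample data, and the 0.6745 scaling constant (the conventional choice for a modified Z-score) are assumptions; only the 3.5 threshold and the use of the absolute median (MAD) come from the cited passage.

```python
from statistics import median

def modified_z_scores(values):
    """Return the modified Z-score M_i of each value, based on median and MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # A MAD of zero leaves the score undefined; treat every point as normal.
        return [0.0 for _ in values]
    # 0.6745 is the conventional scaling constant (an assumption here;
    # the cited passage does not state it).
    return [0.6745 * (v - med) / mad for v in values]

def flag_abnormal(values, threshold=3.5):
    """Mark time point i as abnormal when |M_i| > threshold (3.5 per the cited passage)."""
    return [abs(m) > threshold for m in modified_z_scores(values)]

# Hypothetical historical data with one extreme point at index 5.
history = [10.0, 10.2, 9.9, 10.1, 10.0, 25.0, 10.3, 9.8]
flags = flag_abnormal(history)
```

Because the score is built on the median rather than the mean, the single extreme point does not inflate the spread estimate used to judge the remaining points.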

[Image: media_image1.png]


At the time the invention was filed, it would have been obvious for one of ordinary skill in the art to have modified the teachings of Utsumi by, ...by collecting data of performance indicators of the monitored system in the system (ie, performance data), and comparing the performance data with a predefined performance threshold, the performance data may be judged correspondingly if the performance data does not meet the performance threshold ...the alarm system provides a solution for integrating dynamic machine baseline prediction by implementing a plurality of machine learning methods, and the alarm system selects an appropriate machine learning method from the plurality of machine learning methods as a predictive model to historical data. Processing and predicting the corresponding dynamic baseline helps to improve the prediction accuracy of the dynamic baseline. ... for historical data sequentially arranged in time series, the alarm system moves the m windows of the length in time series so that the SG differentiator can repeat the intra-window by local least squares (LS). The historical data implements a local polynomial fitting, and the order of the polynomial is the smoothing order p described above; since the smoothing process of the historical data can be realized in the fitting process, the obtained local polynomial can be called a smoothing polynomial...FIG. 7 is a schematic diagram of an abnormal monitoring of throughput of a data center according to an exemplary embodiment. As shown in FIG. 7, by using the abnormal monitoring scheme of this specification, accurate abnormal monitoring of the throughput of the data center can be achieved. 
For example, in Table 1 below, Sample 1, Sample 2, and Sample 3 are three samples for abnormal monitoring of data center throughput using the anomaly monitoring scheme of this specification, each sample containing 1,440 actual values; From the accuracy, recall rate, false positive rate and comprehensive evaluation indicators (such as F1-score, that is, F1 score) and other dimensions to judge the effect of abnormal monitoring, it can be seen that the abnormal monitoring program of this specification can not only effectively identify abnormalities, It also maintains a low false positive rate and meets the actual monitoring needs of the data center...the present specification proposes a correction to the Z-score model in the related art to process the historical data by the modified Z-score model. The correction may include: adjusting the abnormality determination threshold from "3" to "3.5" for the "three-sigma" rule adopted by the Z-score model in the related art, that is, the absolute value of the score Mi corresponding to any time point i When the value is greater than 3.5 (ie, |Mi|⟩3.5), the time point i can be marked as abnormal, which can improve the elimination effect of the absolute median (MAD) on the extreme points and avoid the negative impact of the extreme points on the statistical distribution of the data. To help improve the prediction accuracy of dynamic baselines...the second derivative of the dynamic baseline is the second derivative of the baseline value corresponding to each time point on the dynamic baseline. The second derivative of the actual data is the second derivative of the actual data corresponding to the actual value at each time point. 
The absolute difference Differ_sg is used to represent the difference in curvature between the dynamic baseline and the actual data...the alarm system may move the sliding window of the preset length in a time series, and implement a local maximum method for the baseline value in the sliding window in the dynamic baseline to determine a local maximum value in the sliding window. As a local peak; similarly, by moving the sliding window over time series, the local maximum method can be reused to determine local peaks in the baseline values within the sliding window. Then, the alarm system may use a preset area in the vicinity of each local peak as a peak area, for example, the preset area may include an adjacent time point, a time point where the interval is not greater than a preset interval, and the like, which is not limited in this specification..., as taught by Wang, with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to the teachings of Utsumi with the motivation of, …configuring an alarm mechanism in the data center and other systems, you can monitor the running status of the monitored system in the system to detect and resolve abnormal conditions that may occur in the monitored system...by optimizing and improving the abnormality detection scheme of the alarm system 14, the monitoring operation can be more accurate and sensitive, and avoiding the false alarm of abnormality, thereby causing waste of human resources or other resources of the staff. To ensure the normal operation of the data center. 
Among them, the data center is only one application object of the anomaly detection scheme provided in the present specification; in fact, in addition to the data center, the anomaly detection scheme of the present specification can be applied to any other electronic device, structure or system, this specification...to improve the prediction accuracy of the predicted baseline...To improve the accuracy of abnormal monitoring...helps to improve the accuracy of the alarm system for the corresponding performance indicators by ensuring the integrity of the historical data corresponding to each performance indicator in time series, and avoids false positives or false negatives at abnormal time points...corresponding dynamic baseline helps to improve the prediction accuracy of the dynamic baseline....can improve the elimination effect of the absolute median (MAD) on the extreme points and avoid the negative impact of the extreme points on the statistical distribution of the data. To help improve the prediction accuracy of dynamic baselines..., as recited in Wang.

As per Claim 2, Utsumi teaches: (Currently Amended) The time series data prediction apparatus according to claim 1, wherein the processor further: 
calculates the trend prediction data indicating a tendency of the time series data in the intended period based on the actual measured value data.  (in at least [pg6 ln33-52]  second prediction calculation unit 251B of the prediction calculation unit 251 acquires a prediction value from the first prediction calculation result data 302 of a predetermined past period from the prediction calculation result data 253A. The second prediction calculation unit 251B acquires actual measurement values for the same period from the prediction target past measurement data 351A or the latest observation data 303 acquired from the data observation device 6. The second prediction calculation unit 251B calculates first prediction error data (error series 310) as a difference between the predicted value and the actual measured value (S402)...An error generation tendency model is created from the calculated first prediction error data, and the first prediction error amount for a predetermined future period is calculated as the second prediction calculation result data 304 from the created model. (S). The 403 method used when the second prediction calculation unit 251B performs the prediction is the same as the method used when the first prediction calculation unit 251A described above performs the prediction, and the description thereof is omitted here.) 
calculates relative value data indicating relative values of the actual measured value data to the trend prediction data in a certain period, calculates the event prediction data based on the relative value data, and shifts each value of the event prediction data using the trend prediction data as the difference.  (in at least [pg11 ln10-29] Next, the error model identifying unit 251 B 2 uses the time series analysis method to determine the degree in the time series model such as the AR model or the ARIMA model and estimate the coefficients. For the determination of the degree, Akaike information criterion (AIC) under several orders is calculated, and using a known method such that the degree to which the Akaike information criterion value is the smallest is applied…The error prediction amount calculation unit 251 B 3 calculates the prediction value of the time series error amount of the first prediction calculation result in the prediction target period using the generated model and outputs it as the second prediction calculation result data 304…With the above processing, the second prediction calculation processing in the present embodiment is completed. When the first prediction calculation is performed at predetermined intervals such as every 24 hours, for example, the series of prediction errors may be discontinuous at the boundary of the period. When performing the second prediction calculation based on the discontinuous series, it becomes impossible to obtain an appropriate second prediction calculation result. Therefore, it is possible to perform the second prediction calculation after removing the discontinuity point of the series of prediction errors (the point discontinuous at the break of the section of the engine), for example, by smoothing processing or the like. [pg12 ln11-35] FIG. 8 shows the prediction calculation result and the actual observation result (upper part of FIG. 
8) in a certain period and the error amount of prediction in the same period (lower part of FIG. 8). The upper graph of FIG. 8 shows the relationship between "first prediction calculation result data 302 and prediction calculation result data 253 A" calculated by the first prediction calculation unit 251 A, and "prediction target past measurement data 351 A" and " Data 303 "is obtained…the second predictive computing section 251 B generates a model of the occurrence tendency of errors from this error series, so that the error amount of the first prediction computation up to the future period preset via the information input / output terminal 4 is set to , And calculates it as "second predictive calculation result data 304" shown in FIG.…the generation of the error occurrence tendency model performed by the second prediction calculation unit 251 B may be performed by a known method such as a time series analysis method using an AR model or an ARIMA model as described above And the method shown in FIG. 5 and FIG. 6 may be applied as described above. Finally, the predicted value correction unit 252 (i.e. correction section) adds the "second prediction calculation result data 304" to the "first prediction calculation result data 302", thereby outputting the "third prediction calculation result data 305)
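
For illustration only, the two-stage prediction described in the cited Utsumi passages (a first prediction, an error series modeled by time series analysis, and a corrected third prediction obtained by addition) can be sketched as follows. This is a minimal sketch under stated assumptions and not Utsumi's implementation: the function names are hypothetical, the error model is fixed to an AR(1) without intercept rather than an order chosen by AIC, and the sign convention of the error (actual minus predicted) is assumed so that adding the predicted error corrects the first prediction.

```python
def fit_ar1(errors):
    """Least-squares AR(1) coefficient for e_t = phi * e_{t-1} (no intercept)."""
    num = sum(curr * prev for prev, curr in zip(errors, errors[1:]))
    den = sum(prev * prev for prev in errors[:-1])
    return num / den if den else 0.0

def forecast_errors(errors, phi, horizon):
    """Project the last observed error forward: e_{T+h} = phi**h * e_T."""
    last = errors[-1]
    return [last * phi ** h for h in range(1, horizon + 1)]

def corrected_prediction(first_future, past_predicted, past_actual):
    """Add the predicted error amount to the first prediction."""
    # Error series; the sign convention (actual - predicted) is an assumption.
    errors = [a - p for a, p in zip(past_actual, past_predicted)]
    phi = fit_ar1(errors)
    second = forecast_errors(errors, phi, len(first_future))
    return [f + s for f, s in zip(first_future, second)]

# Hypothetical data: the first prediction under-estimates by a decaying amount.
third = corrected_prediction(
    first_future=[10.0, 10.0],
    past_predicted=[10.0, 10.0, 10.0, 10.0],
    past_actual=[12.0, 11.0, 10.5, 10.25],
)
```

In this example the error series is 2, 1, 0.5, 0.25, so the fitted coefficient is 0.5 and the correction continues to halve over the future period.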

As per Claim 3, Utsumi teaches: (Currently Amended) The apparatus according to claim 2, wherein the processor further: 
generates a trend prediction model with elapsed time assumed as explanatory variables and values of the time series data assumed as objective variables based on the actual measured value data, and calculates the trend prediction data using the trend prediction model. (in at least [pg12 ln11-43] FIG. 8 shows the prediction calculation result and the actual observation result (upper part of FIG. 8) in a certain period and the error amount of prediction in the same period (lower part of FIG. 8). Also, the current time at which the data prediction system 12…In FIG. 8, it is assumed that the latest observation data is collected two periods later than "current 306" (i.e. elapsed time assumed as explanatory variables)…The upper graph of FIG. 8 shows the relationship between "first prediction calculation result data 302 and prediction calculation result data 253 A" calculated by the first prediction calculation unit 251 A, and "prediction target past measurement data 351 A" (i.e. objective variables on the basis of the actual measured value data) and " Data 303 "is obtained….the second predictive computing section 251 B generates a model of the occurrence tendency of errors from this error series, so that the error amount of the first prediction computation up to the future period preset via the information input / output terminal 4 is set to , And calculates it as "second predictive calculation result data 304" shown in FIG… the fluctuation of the prediction target data, which is difficult to explain by the result of the first prediction computation unit 251 A that performs prediction using the explanatory variable data 301, is modeled by the second prediction computation unit 251 B and predicted. 
By correcting the first prediction calculation result data 302 of the first prediction calculation unit 251 A with the second prediction calculation result data 304 which is the result of the prediction, even the variation component difficult to explain with the main explanatory variable Realize reflected predictions. The third predicted calculation result data 305 is output as a result of prediction. )


As per Claim 4, Utsumi teaches: (Currently Amended) The apparatus according to claim 3, 
wherein the trend prediction model is a linear regression model. (in at least [pg6 ln22-45]  Known methods include a prediction method using a single regression model or a multiple regression model, a prediction method using a neural network, and a prediction method using a time series analysis such as an AR model or an ARIMA model. [pg12 ln30-35] the generation of the error occurrence tendency model performed by the second prediction calculation unit 251 B may be performed by a known method such as a time series analysis method using an AR model or an ARIMA model as described above)  
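
For illustration only, a trend prediction model of the kind recited in Claims 3 and 4 (elapsed time as the explanatory variable, the time series value as the objective variable, fitted by linear regression) can be sketched as follows. This is a minimal sketch, not the method of the application or of Utsumi: the function names and sample values are hypothetical, and elapsed time is assumed to be the integer index 0, 1, 2, ... of each measurement.

```python
def fit_linear_trend(values):
    """Ordinary least squares of value against elapsed time t = 0, 1, 2, ...

    Requires at least two measurements.
    """
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

def predict_trend(slope, intercept, times):
    """Evaluate the fitted line at the given elapsed times."""
    return [intercept + slope * t for t in times]

# Hypothetical actual measured values growing by 2 per period.
slope, intercept = fit_linear_trend([1.0, 3.0, 5.0, 7.0])
future_trend = predict_trend(slope, intercept, [4, 5])
```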


As per Claim 5, Utsumi teaches: (Currently Amended) The apparatus according to claim 2, wherein the processor further: 
generates the event prediction model with calendar information assumed as explanatory variables and values of the time series data assumed as objective variables based on the relative value data, and calculates the event prediction data using the event prediction model.  (in at least [pg11 ln54-58] FIG. 21 shows a conceptual diagram of the effect. First, the prediction error in the first prediction calculation result is as shown in the graph 851 for the full year. Among them, the graph 853 of the data for the most recent seven days of the prediction target date shows the error mode of the first prediction for each day with respect to the graph 852 of the data on the prediction target date and the seven days from the same month of the previous year)


As per Claim 6, Utsumi teaches: (Currently Amended) The apparatus according to claim 5, 
wherein the event prediction model includes a decision tree model. (in at least [pg6 ln12-20] the first prediction calculation unit 251 A of the prediction calculation unit 251 obtains and receives the prediction target past measurement data 351 A and the explanatory variable past measurement data 352 A from the data management device 3. Next, based on the correlation between the value of the prediction target past measurement data 351 A and the value of the explanatory variable such as the calendar date of the explanatory variable past measurement data 352 A and the weather information, the first prediction calculation unit 251 A calculates the information input / output terminal 4 And calculates the first predicted calculation result data 302 at a plurality of future time points preset in advance through the above-described processing. Thereafter, the first predictive computation unit 251 A additionally records the prediction computation result data 253 A of the prediction computation result data storage unit 253 (S 401).  [pg11 ln44-46] clustering unit 251 A 1 and the profiling unit 251 A 2 shown in FIG. 5 are applied to the error model identifying unit 251 B 2, the weight value is applied to a decision tree learning algorithm such as CART, ID 3, random forest and the like. )


As per Claim 8, Utsumi teaches: (Currently Amended) The apparatus according to claim 1, wherein the processor further:
shifts each value of the event prediction data in and after a certain period in response to the difference between the actual measured value data and the event prediction data in the terminal period including an end of the certain period. (in at least [pg12 ln11-28] FIG. 8 shows the prediction calculation result and the actual observation result (upper part of FIG. 8) in a certain period and the error amount of prediction in the same period (lower part of FIG. 8). Also, the current time at which the data prediction system 12 in this embodiment is operating is indicated as "current 306" in the figure…In FIG. 8, it is assumed that the latest observation data is collected two periods later than "current 306". Therefore, in the second prediction calculation f 251 B, as the prediction error confirmed at the "present 306" time point, a series of errors shown in the "error series 310 of the first prediction calculation result" in the lower part of FIG. 8 is calculated.)


As per Claim 11, Utsumi teaches:  (New) The apparatus according to claim 1, 
wherein at least one of the first machine learning model or the second machine learning model is based on a Random Forest Model.  (in at least [pg16 ln45-52] The process of calculating the relationship information between each prediction calculation result and the state of the prediction target period may be set in advance through the information input / output terminal 4, for example. A known learning algorithm may be applied to this calculation process. Known algorithms include decision tree learning algorithms such as CART, ID 3, random forest, discriminator learning algorithms such as SVM (Support Vector Machine) and Naive Bayes. The known algorithm is, for example, information indicating a sequence of errors of each prediction calculation result, for example, an error sequence arrangement, result information obtained by performing frequency analysis such as Fourier transform on an error sequence, and the like as teacher labels.)

As per Claims 10 and 12, which are directed to a method (see at least Utsumi [pg1 ln11]), the claims substantially recite the subject matter of Claim 1 and are rejected based on the same reasoning and rationale.


Claims 7 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over WIPO Publication WO2017212880A1 to Utsumi et al. (hereinafter referred to as “Utsumi”) in view of CN Patent Publication CN109542740A to Wang et al. (hereinafter referred to as “Wang”), and further in view of US Patent US7558803B1 to deVille (hereinafter referred to as “deVille”).

As per Claim 7, Utsumi teaches: (Currently Amended) The apparatus according to claim 6, 
wherein the decision tree model includes … calculating candidate data that serves as a candidate of the event prediction data, and the processor further calculates, … for the event prediction data based.... (in at least [pg12 ln30-36] the generation of the error occurrence tendency model performed by the second prediction calculation unit 251 B may be performed by a known method such as a time series analysis method using an AR model or an ARIMA model as described above And the method shown in FIG. 5 and FIG. 6 may be applied as described above. Finally, the predicted value correction unit 252 adds the "second prediction calculation result data 304" to the "first prediction calculation result data 302", thereby outputting the "third prediction calculation result data 305")

Although implied, Utsumi in view of Wang does not expressly disclose the following limitations, which, however, are taught by deVille:
wherein the decision tree model includes a plurality of decision trees each calculating candidate data that serves as a candidate of the event prediction data, and the event calculation section calculates a lower limit and an upper limit for the event prediction data based on a basis of a plurality of pieces of the candidate data calculated by the decision trees (in at least [col4 ln10-30] FIG. 4 shows a graphical depiction of the decision tree 112 for group 1 that was constructed in FIG. 3. As discussed with reference to FIG. 3, the non-hierarchical rules 92 derived from group 1 are used to form the leaf nodes 126 of the decision tree 112. The result of the bottom-up formation process 100 is a decision tree 112, having a root node 122, one or more middle nodes 124, and a leaf node for each member of group 1 126. The decision tree 112 alternatively may be referred to as the rule set for group 1. 90 (Examiner notes each of the nodes also provide lower and upper limit) [col3 ln50-col4 ln10] A purity threshold property of a rule induction node (available in Enterprise Miner) specifies the minimum training leaf purity percentage required for a leaf to be ripped. Observations in leaves which meet or exceed the threshold are removed from the training data. The default setting for a purity threshold property is 100, but acceptable values are integers, such as 80 or 90. An initial rule induction process comes up with one or more rules that drive the codes into pure nodes. For example, if the process results in a rule that predicts 90% of the code 10s that are in the training data set and the purity threshold is set at 80, then this rule is used as a leaf node for code 10. The data set is then reduced by taking out the data related to code 10. On a successive iteration in this example, a rule might be determined that classifies 85% of the code 50s. Because this is above the threshold, the rule is used as a leaf node for code 50. 
Processing continues to determine rules for the remaining codes that drive them into pure nodes. It may occur that rules cannot be found that satisfy the purity threshold for one or more codes; in such a situation, the process ends with rules that only can predict such codes at a lower level of probability. )
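
For illustration only, the limitation of Claim 7 (a plurality of decision trees each calculating candidate data, with a lower limit and an upper limit derived from the candidates) can be sketched as follows. This is a minimal sketch under stated assumptions: the claim does not state how the limits are derived, so taking the minimum and maximum candidate at each time step is an assumption, the function name is hypothetical, and the sketch does not reproduce the rule induction process of deVille.

```python
def prediction_bounds(candidates_per_tree):
    """Return (lower, upper) per time step from several trees' candidates.

    candidates_per_tree[k][t] is the candidate value of tree k at time step t.
    """
    steps = zip(*candidates_per_tree)
    return [(min(step), max(step)) for step in steps]

# Hypothetical candidate data from three trees over two time steps.
bounds = prediction_bounds([[1.0, 5.0], [2.0, 4.0], [3.0, 6.0]])
```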

At the time the invention was filed, it would have been obvious for one of ordinary skill in the art to have modified the teachings of Utsumi in view of Wang by, … received containing descriptive data (e.g., qualitative and/or quantitative descriptive data) and target values. The target values are partitioned into groups, wherein each group contains a distinct subset of the target values. A rule induction engine generates a set of rules for each group based upon a set of descriptive data. The rules specify a relationship between a set of descriptive data and the target values in a group, and the rules are configured to predict target values. In a bottom-up manner, a decision tree is generated for each of the groups. The generation in a bottom-up manner includes using the rule set of a group to generate lower level nodes of the decision tree for the group before generating the upper level nodes of the decision tree for the group. The generated decision trees for each of the groups are used to output predicted target values for input data.…, as taught by deVille, with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to the teachings of Utsumi in view of Wang with the motivation of, …As an alternative, each step may permit, or require, a greater number of options. Each data value eventually is represented by one of the bottom-most nodes in the tree, called leaf nodes. The greater the number of options at each step, the lesser will be the number of steps from the top of the tree, called the root node, to the leaf nodes.…ensures that reliable, reproducible, and accurate classification/prediction rules are developed by this process...use of a metadata node to partition the input data into training and test data so as to reliably and accurately form classification rules for the groups...., as recited in deVille.


As per Claim 9, Utsumi teaches: (Currently Amended) The apparatus according to claim 2, wherein the processor further: 
repeatedly calculates the event prediction data while shifting the certain period, and (in at least [pg11 ln22-30] the first prediction calculation is performed at predetermined intervals such as every 24 hours, for example, the series of prediction errors may be discontinuous at the boundary of the period. When performing the second prediction calculation based on the discontinuous series, it becomes impossible to obtain an appropriate second prediction calculation result. Therefore, it is possible to perform the second prediction calculation after removing the discontinuity point of the series of prediction errors (the point discontinuous at the break of the section of the engine), for example, by smoothing processing or the like)
determines whether the difference between the actual measured value data and the event prediction data in the terminal period including an end of the certain period … whenever the event prediction data is calculated, (in at least [pg6 ln33-52] The second predictive computation unit 251 B acquires the actual measurement value of the same period from the prediction target past measurement data 351 A or the latest observation data 303 acquired from the data observation device 6. The second prediction calculation unit 251B calculates first prediction error data (error sequence 310) as a difference between the predicted value and the actual measurement value (S402)…on the basis of the second prediction calculation result data 304 calculated by the second prediction calculation section 251 B, the prediction value correction section 252 outputs the first prediction calculation result data 302 calculated by the first prediction calculation section 251 A And calculates the third predicted calculation result data 305 (S404). Specifically, for example, the prediction value of the second prediction calculation result data 304 is corrected by adding it to the prediction value of the first prediction calculation result data 302.)
shifts each value of the event prediction data in response to the difference in a case in which the difference … and the event prediction data is finally calculated event prediction data, and (in at least [pg6 ln33-52] The second predictive computation unit 251 B acquires the actual measurement value of the same period from the prediction target past measurement data 351 A or the latest observation data 303 acquired from the data observation device 6. The second prediction calculation unit 251B calculates first prediction error data (error sequence 310) as a difference between the predicted value and the actual measurement value (S402)…on the basis of the second prediction calculation result data 304 calculated by the second prediction calculation section 251 B, the prediction value correction section 252 outputs the first prediction calculation result data 302 calculated by the first prediction calculation section 251 A And calculates the third predicted calculation result data 305 (S404). Specifically, for example, the prediction value of the second prediction calculation result data 304 is corrected by adding it to the prediction value of the first prediction calculation result data 302.)
changes an approach of calculating the trend prediction data in a case in which the difference … and the event prediction data is not the finally calculated event prediction data. (in at least [pg12 ln25-43] the second predictive computing section 251 B generates a model of the occurrence tendency of errors from this error series, so that the error amount of the first prediction computation up to the future period preset via the information input / output terminal 4 is set to , And calculates it as "second predictive calculation result data 304" shown in FIG.…the generation of the error occurrence tendency model performed by the second prediction calculation unit 251 B may be performed by a known method such as a time series analysis method using an AR model or an ARIMA model as described above And the method shown in FIG. 5 and FIG. 6 may be applied as described above. Finally, the predicted value correction unit 252 adds the "second prediction calculation result data 304" to the "first prediction calculation result data 302", thereby outputting the "third prediction calculation result data 305"…, the fluctuation of the prediction target data, which is difficult to explain by the result of the first prediction computation unit 251 A that performs prediction using the explanatory variable data 301, is modeled by the second prediction computation unit 251 B and predicted. By correcting the first prediction calculation result data 302 of the first prediction calculation unit 251 A with the second prediction calculation result data 304 which is the result of the prediction, even the variation component difficult to explain with the main explanatory variable Realize reflected predictions. The third predicted calculation result data 305 is output as a result of prediction.)

Although implied, Utsumi in view of Wang does not expressly disclose the following limitations, which, however, are taught by deVille:
… is equal to or greater than a certain value…,  (in at least [col4 ln10-30] FIG. 4 shows a graphical depiction of the decision tree 112 for group 1 that was constructed in FIG. 3. As discussed with reference to FIG. 3, the non-hierarchical rules 92 derived from group 1 are used to form the leaf nodes 126 of the decision tree 112. The result of the bottom-up formation process 100 is a decision tree 112, having a root node 122, one or more middle nodes 124, and a leaf node for each member of group 1 126. The decision tree 112 alternatively may be referred to as the rule set for group 1. 90 (Examiner notes each of the nodes also provide lower and upper limit) [col3 ln50-col4 ln10] A purity threshold property of a rule induction node (available in Enterprise Miner) specifies the minimum training leaf purity percentage required for a leaf to be ripped. Observations in leaves which meet or exceed the threshold are removed from the training data. The default setting for a purity threshold property is 100, but acceptable values are integers, such as 80 or 90. An initial rule induction process comes up with one or more rules that drive the codes into pure nodes. For example, if the process results in a rule that predicts 90% of the code 10s that are in the training data set and the purity threshold is set at 80, then this rule is used as a leaf node for code 10. The data set is then reduced by taking out the data related to code 10. On a successive iteration in this example, a rule might be determined that classifies 85% of the code 50s. Because this is above the threshold, the rule is used as a leaf node for code 50. Processing continues to determine rules for the remaining codes that drive them into pure nodes. It may occur that rules cannot be found that satisfy the purity threshold for one or more codes; in such a situation, the process ends with rules that only can predict such codes at a lower level of probability. )

The reason and rationale to combine Utsumi, Wang and deVille are the same as recited above. 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PO HAN MAX LEE whose telephone number is (571)272-3821.  The examiner can normally be reached on Mon-Thurs 8:00 am - 7:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rutao Wu, can be reached at (571) 272-6045.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/PO HAN MAX LEE/ Examiner, Art Unit 3623

/CHARLES GUILIANO/ Primary Examiner, Art Unit 3623