DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Response to Amendments
The amendments filed 06/24/2022 have been entered. Claims 1, 3-8, 10-15, and 17-20 remain pending in the application. 
 
Applicant’s arguments with respect to the rejection(s) of claim(s) 4, 11, and 18 under 35 U.S.C. 112(b) have been fully considered and are persuasive. Therefore, the rejection set forth in the previous office action mailed 04/01/2022 has been withdrawn. 

Response to Arguments
Applicant’s arguments, filed 06/24/2022, with respect to the specification have been fully considered and are persuasive in part. 
Without conceding the propriety of the previous objection, the objection to hyperlinks within the specification has been withdrawn, solely due to the amendments made to the specification. 
However, the applicant has failed to address the objection regarding the reference cited within the specification and therefore this specification objection remains (See below for more details). 
Applicant's arguments, with respect to 35 U.S.C. 101 filed 06/24/2022 have been fully considered but they are not persuasive. 
Upon further consideration and due, at least in part, to applicant’s amendments, the claims remain ineligible under 35 U.S.C. 101. 
The examiner turns to the last limitation of at least representative claim 1 which recites: 
“provide labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events and utilize a set of labeled data based on the provided labels to train a machine learning process” 
First, the examiner notes HOW labels are provided. That is, as amended and claimed, a human can provide a label (e.g. receiving an input from a user). Clearly, under step 2A prong 1, this is a mental process as a simple observation or evaluation. Additionally, correlating the tickets to the alarms is simply just both a mental process and a mathematical concept. That is, as a mental process the functionality encompassed by “correlating” could be simply a human looking at two lists (e.g. tickets and alarms) and evaluating or observing the similarities between these two lists. As a math concept, the functionality of “correlating” is simply applying a mathematical concept. 
Second, the use of the labeled data to train a machine learning process is, under step 2A Prong 2, only generally linking the use of the judicial exception to a particular technological environment (e.g. machine learning). That is, due, at least in part, to the high level of generality at which this step is claimed, it only is generally linking to machine learning (MPEP 2106.05(h)). 
Additionally, under Step 2B, the claim language which recites: 
“and utilize a set of labeled data based on the provided labels to train a machine learning process.”
is well-known, routine, and conventional (MPEP 2106.05(d)). That is, this limitation is claiming the well-known, routine, and conventional practice of using supervised machine learning. 
Because the examiner is asserting that this claim language is WURC (well-known, routine, and conventional), the examiner must provide evidence supporting this assertion as per the Berkheimer memo. 
As evidence, the examiner refers to two articles. First, an IBM cloud education post, entitled “Supervised learning” (NPL 2020) recites: 
“Supervised learning, also known as supervised machine learning, is a subcategory of machine learning and artificial intelligence. It is defined by its use of labeled datasets to train algorithms that to classify data or predict outcomes accurately…” 
From this first piece of evidence, it is clear that the claim language is simply defining the general nature of the machine learning process that is being used and, as can be seen, it is defined (e.g. well-understood, routine, and conventional) by using labeled data to train an algorithm. In other words, because this limitation is simply claiming the definition of supervised machine learning, it is akin to claiming language such as “…and use supervised machine learning”; which again only generally links the use of the judicial exception to a particular technological environment. 
The second article, entitled “A brief introduction to Supervised Learning”, by Aiden Wilson (NPL 2019) similarly recites: 
“Supervised learning is the most common subbranch of machine learning today…Supervised machine learning algorithm are designed to learn by example. The name ‘supervised’ learning originates from the idea that training this type of algorithm is like having a teacher [e.g. the claimed “set of labeled data”] supervise the whole process.” 
Again, as can be seen, this limitation, at best is claiming the definition of a well-understood, routine, and conventional machine learning process (i.e. supervised machine learning) and it is doing so at a very high level of generality. 
For at least the reasons above, the rejection under 35 U.S.C. 101 is maintained and updated to reflect the current claim language. 
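For illustration only, the general practice described in the two cited articles (training on labeled data, then classifying new inputs) can be sketched as follows. This is a minimal sketch and is neither the applicant's claimed implementation nor code from either article. The nearest-centroid classifier, the function names train and predict, and the sample feature values and labels are all assumptions chosen for brevity.

```python
from collections import defaultdict
from math import dist


def train(labeled_data):
    """Learn one centroid per label from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in labeled_data:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    # Average the accumulated feature sums to get each label's centroid.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new input."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical labeled set: (feature vector, label). Values are invented.
labeled_data = [
    ([0.1, 0.2], "normal"),
    ([0.2, 0.1], "normal"),
    ([0.9, 0.8], "failure"),
    ([0.8, 0.9], "failure"),
]

model = train(labeled_data)
prediction = predict(model, [0.85, 0.75])  # new, previously unseen input
```

In this sketch the labeled set plays the role of the "teacher" described in the Wilson article: the prediction for a new input depends only on the labels seen during training.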

Applicant's arguments, with respect to 35 U.S.C. 103 filed 06/24/2022 have been fully considered but they are not persuasive. 
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant's arguments do not comply with 37 CFR 1.111(c) because they do not clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made. Further, they do not show how the amendments avoid such references or objections.
The following is the applicant’s entire argument: 
“…The YouTube video and Zhang fail to suggest use of multiple data source and using the correlation between the tickets and alarms for the labels…” 
On its face, the argument cannot be considered persuasive for at least the reasons above. It does not specifically address or challenge any of the citations provided by the examiner nor does it provide analysis of WHY “…The YouTube video and Zhang fail to suggest use of multiple data source and using the correlation between the tickets and alarms for the labels…” At best, it merely is a conclusory statement. 
Additionally, it is clear that the applicant has not accounted for the full teachings of either reference, and especially Zhang, who, as was cited, teaches the claim language at issue. The examiner refers to the rejection under 103 for more details. 

Specification
The listing of references in the specification is not a proper information disclosure statement.  37 CFR 1.98(b) requires a list of all patents, publications, or other information submitted for consideration by the Office, and MPEP § 609.04(a) states, "the list may not be incorporated into the specification but must be submitted in a separate paper."  Therefore, unless the references have been cited by the examiner on form PTO-892, they have not been considered.
To clarify, Paragraph [0067] makes reference to a paper entitled “Building High-level features using Large Scale Unsupervised Learning”. However, the referenced paper was not included in applicant’s IDS filed 06/18/2019 and therefore has not been considered. 
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 3-8, 10-15, and 17-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Step 1: Claims 1, 8, and 15 are drawn to a system, a method, and a non-transitory computer readable medium, respectively. Therefore, each of claims 1, 8, and 15 passes Step 1.

Analysis of Claim 1: 
Revised Step 2A: Do(es) the claim(s) recite an abstract idea and, if so, does the claim(s) contain any additional elements that integrate the abstract idea into a practical application. 
Yes, Claim 1 recites an abstract idea and No, Claim 1 does NOT include any additional elements that integrate the abstract idea into a practical application. 
Specifically, the claim recites the following limitation(s): 
A processor, and memory storing instructions, that when executed, cause the processor to…
This limitation is considered an additional element. Under Step 2A prong 2, this additional element does NOT integrate the abstract idea into a practical application because it is simply using a general purpose computer as a tool to implement the abstract idea (MPEP 2106.05(f)). This additional element must further be considered under Step 2B (see below). 
Obtain network data including first data of devices and services in the network, Performance Monitoring (PM) data associated with the devices and services and with associated timestamps, and second data including tickets, alarms, and events affecting some of the devices and services and with associated timestamps
This limitation is considered an additional element. Under Step 2A prong 2, this additional element does NOT integrate the abstract idea into a practical application because mere data gathering is considered insignificant extra-solution activity (see MPEP 2106.05(g)). This additional element must further be considered under Step 2B (see below). 
Obtain one or more target events from the second data based on associated impact in the network.
This limitation is considered part of the abstract idea. Under Step 2A Prong 1, the functionality encompassed, under BRI, of this limitation is nothing more than a mental process. For example, based upon a simple judgement or evaluation, a human could reasonably identify, based on data showing the impact, events to be solved. That is, given data, a human could reasonably identify specific events.
This limitation does not appear to contain any additional element that must be considered further. 
Determine the PM data that is statistically correlated with the one or more target events. 
This limitation is considered part of the abstract idea. Under Step 2A Prong 1, the functionality encompassed, under BRI, of this limitation is nothing more than both a mental process and mathematical concept. As a mental process, this limitation is simply equivalent to a human organizing data that has similar characteristics; in other words, nothing more than a simple evaluation or judgement. As a math concept, determining data that is “statistically correlated” is nothing more than applying a mathematical equation or concept to data.
This limitation does not appear to contain any additional elements that must be considered further.
Determine the statistically correlated PM data over a corresponding time based on the associated timestamps of the PM data and the one or more events. 
This limitation is considered part of the abstract idea. Under Step 2A Prong 1, this limitation is nothing more than a mental process. For example, given the statistically correlated PM data and associated time stamps and the target events, a human could perform a simple judgement or evaluation mentally or using a pencil and a piece of paper to group the statistically correlated data “over a period of time” and for a target event.
This limitation does not appear to contain any additional elements that must be considered further. 
Provide labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events, and utilize a set of labeled data based on the provided labels to train a machine learning process
This limitation is considered part of the abstract idea. Under Step 2A Prong 1, the functionality encompassed by this limitation, under BRI, is nothing more than a mental process. Specifically, a human given the statistically correlated PM data and the target event could apply a label by using a pencil and a piece of paper for example. Specifically, for example, a human could circle or otherwise give a description (i.e. label) of a particular data entry or multiple data entries. The examiner further refers to the response to arguments above.
This limitation contains the additional element of “…and utilize a set of labeled data based on the provided labels to train a machine learning process…” Under step 2A Prong 2, this limitation only generally links the use of the judicial exception to a particular technological environment (e.g. machine learning) (MPEP 2106.05(h)). This limitation must be further considered under step 2B. 
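For illustration only, one way labels could be derived by comparing a list of tickets against a list of alarms is sketched below. This is a minimal sketch under stated assumptions and is not taken from the claims, the specification, or the cited references. The matching rule (same device, ticket opened within a fixed window after the alarm), the dictionary field names, the function name label_alarms, and the sample records are all hypothetical.

```python
from datetime import datetime, timedelta


def label_alarms(alarms, tickets, window=timedelta(minutes=30)):
    """Label each alarm with the category of a ticket on the same device
    opened within `window` after the alarm; otherwise use 'unlabeled'."""
    labeled = []
    for alarm in alarms:
        label = "unlabeled"
        for ticket in tickets:
            same_device = ticket["device"] == alarm["device"]
            delta = ticket["time"] - alarm["time"]
            if same_device and timedelta(0) <= delta <= window:
                label = ticket["category"]
                break
        labeled.append({**alarm, "label": label})
    return labeled


# Hypothetical records; device names, times, and categories are invented.
alarms = [
    {"device": "ne-1", "time": datetime(2022, 1, 1, 10, 0), "alarm": "LOS"},
    {"device": "ne-2", "time": datetime(2022, 1, 1, 10, 5), "alarm": "BER"},
]
tickets = [
    {"device": "ne-1", "time": datetime(2022, 1, 1, 10, 20), "category": "fiber cut"},
]

labeled = label_alarms(alarms, tickets)
```

The sketch amounts to comparing two lists entry by entry, which is the same kind of comparison a human could carry out with the two lists side by side.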

Step 2B: Does the claim contain any additional elements that amount to significantly more than the judicial exception? No, representative Claim 1 does NOT contain any additional elements that amount to significantly more than the judicial exception. 
	Specifically, the examiner identified above the following additional elements: 
A processor, and memory storing instructions, that when executed, cause the processor to…
As mentioned above, this additional element merely amounts to using a general purpose computer as a tool to implement the abstract idea and therefore does not amount to significantly more than the judicial exception (MPEP 2106.05(f)). 
Obtain network data including first data of devices and services in the network, Performance Monitoring (PM) data associated with the devices and services and with associated timestamps, and second data including any of tickets, alarms, and events affecting some of the devices and services and with associated timestamps. 
As mentioned above, this additional element merely amounts to insignificant extra solution activity. Specifically though, this limitation merely amounts to mere data gathering and selecting a particular data source or type of data to be manipulated (MPEP 2106.05(g)) and therefore does NOT amount to significantly more than the judicial exception. 
“…and utilize a set of labeled data based on the provided labels to train a machine learning process.” 
The examiner refers to the response to arguments above. Specifically though, this limitation is considered well-understood, routine, and conventional (MPEP 2106.05(d)). That is, this limitation is simply claiming, and at a high level of generality, the definition of supervised machine learning which, as evidenced by the Wilson article (See response to arguments above) is the most common subbranch of machine learning today. That is, the IBM article (see above) and the Wilson article (see above) (e.g. Berkheimer evidence) demonstrate that using labeled data to train a machine learning algorithm is well-understood, routine, and conventional and therefore does NOT amount to significantly more than the judicial exception. Again, for further details please see the response to arguments above. 

Conclusion: Because Claim 1 recites an abstract idea (Revised Step 2A Prong 1) and the identified additional elements do NOT integrate the abstract idea into a practical application (Step 2A Prong 2) and the identified additional elements do NOT amount to significantly more than the judicial exception, claim 1 is NOT patent eligible. Therefore, a rejection under 35 U.S.C. 101 is appropriate. 

Analysis of Claim 3: 
	Claim 3 is dependent on Claim 1 and therefore recites the same abstract idea of Claim 1. Claim 3, however, further recites:
Subsequent to training a machine learning process with a set of labeled data based on the provided labels, obtain second PM data based on current operation of the network
This limitation is considered an additional element. First, “subsequent to training a machine learning process with a set of labeled data based on the provided labels” at Step 2A Prong 2 only generally links the use of the judicial exception to a particular technological environment or field of use (e.g. machine learning) (MPEP 2106.05(h)). Additionally, under Step 2B, and as stated above, this limitation is simply claiming at a high level of generality the use of “supervised machine learning” which is well-understood, routine, and conventional (MPEP 2106.05(d)). For Berkheimer evidence to this assertion, the examiner refers to the response to arguments above. 
“…obtain second PM data based on current operation of the network” is also considered an additional element. That is, under step 2A prong 2 and Step 2B, this limitation is considered mere data gathering, which is considered insignificant extra-solution activity (MPEP 2106.05(g)). 
Process the second PM data via the machine learning process
This limitation is considered an additional element. Under step 2A prong 2 and Step 2B, because of the high level of generality, it is simply using a general-purpose computer as a tool to implement the abstract idea (MPEP 2106.05(f)). Additionally, or alternatively, this limitation is also just generally linking the judicial exception to particular technological environment or field of use (e.g. Machine learning) (MPEP 2106.05(h)). 
Obtain predictions from the machine learning process based on labels associated with the set of labeled data.
Again, this limitation is considered an additional element. Specifically, though, at Step 2A prong 2, this limitation, due to the high level of generality, only generally links the use of the judicial exception to a particular technological environment or field of use (e.g. Machine learning) (MPEP 2106.05(h)). 
Under step 2B, this limitation is considered well-understood, routine, and conventional (MPEP 2106.05(d)). As Berkheimer evidence to this assertion, the examiner turns again to the Wilson article (See PTO-892 and above) which recites: 
“Supervised learning is the most common subbranch of machine learning today…Supervised machine learning algorithms are designed to learn by example. The name ‘supervised’ learning originates from the idea that training this type of algorithm is like having a teacher supervise the whole process…After training, supervised learning algorithm will take in new unseen inputs [e.g. second PM data] and will determine which label [e.g. predict] the new inputs will be classified as based on prior training data [e.g. as claimed “…based on labels associated with the set of labeled data.”].”
As seen, from this citation, when using, as the instant invention claims at a very high level of generality, supervised machine learning, it is well-understood, routine, and conventional that when new data is applied to this algorithm, predictions about that new data will be based on the already seen labeled data; that is simply how supervised machine learning conventionally operates and thus does not amount to significantly more than the judicial exception. 

Because Claim 3 recites the same abstract idea as claim 1 and the identified additional elements do NOT integrate the abstract idea into a practical application or amount to significantly more than the judicial exception, Claim 3 is NOT patent eligible and therefore a rejection under 35 U.S.C. 101 is appropriate. 


Analysis of Claim 4: 
Claim 4 is dependent on Claim 1 and therefore recites the same abstract idea of Claim 1. Claim 4, however, further recites “wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for prior time bins as each of the one or more target events.” This limitation is considered part of the abstract idea. Specifically, under 2A Prong 1 and BRI, the functionality encompassed by this limitation is nothing more than both a mathematical concept and mental process. As a mathematical concept, this limitation is simply applying a mathematical equation to PM data that has a “same time” and to PM data for “prior time bins.” Similarly, as a mental process, a human could reasonably correlate data based on specific time instances or ranges. This limitation contains no additional elements that must be further considered. 
Because Claim 4 recites the same abstract idea as claim 1 and itself is considered part of the abstract idea, Claim 4 is NOT patent eligible and therefore a rejection under 35 U.S.C. 101 is appropriate. 
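For illustration only, measuring correlation "at a same time" and for "prior time bins" can be sketched with a Pearson correlation coefficient computed at different lags. This is a minimal sketch; the choice of the Pearson coefficient, the function names pearson and lagged_correlation, and the sample series are assumptions, since the claim does not name a particular statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series.
    Assumes neither series is constant (non-zero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(pm, events, lag):
    """Correlate events in time bin t with PM data in bin t - lag (lag >= 0)."""
    if lag == 0:
        return pearson(pm, events)
    return pearson(pm[:-lag], events[lag:])


# Hypothetical series, one value per time bin. The PM value spikes one bin
# before each event (1 = target event occurred in that bin).
pm = [1.0, 1.0, 5.0, 1.0, 1.0, 6.0, 1.0, 1.0]
events = [0, 0, 0, 1, 0, 0, 1, 0]

same_time = lagged_correlation(pm, events, 0)      # same time bin
one_bin_prior = lagged_correlation(pm, events, 1)  # PM one bin earlier
```

With these invented values the correlation is strongly positive at a lag of one bin and negative at the same time bin, which shows why a prior-bin measurement can reveal a relationship that a same-time measurement misses.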

Analysis of Claim 5: 
Claim 5 is dependent on Claim 1 and therefore recites the same abstract idea of Claim 1. Claim 5, however, further recites “wherein the network includes any of optical network elements, [TDM] network elements, [WDM] network elements, and packet network elements.” This limitation is considered an additional element. However, under Step 2A prong 2 and Step 2B, this limitation does NOT integrate the abstract idea into a practical application and does NOT amount to significantly more than the judicial exception because selecting the types of network elements is insignificant extra-solution activity (MPEP 2106.05(g)). 
Because Claim 5 recites the same abstract idea as claim 1 and its additional elements fail Step 2A prong 2 and Step 2B, Claim 5 is NOT patent eligible and therefore a rejection under 35 U.S.C. 101 is appropriate. 
Analysis of Claim 6: 
Claim 6 is dependent on Claim 1 and therefore recites the same abstract idea of Claim 1. Claim 6, however, further recites “wherein the devices in the network include a plurality of disparate types of devices from a plurality of equipment vendors.” This limitation is considered an additional element. However, under Step 2A prong 2 and Step 2B, this limitation does NOT integrate the abstract idea into a practical application and does NOT amount to significantly more than the judicial exception because selecting the types of network elements is insignificant extra-solution activity (MPEP 2106.05(g)). 
Because Claim 6 recites the same abstract idea as claim 1 and its additional elements fail Step 2A prong 2 and Step 2B, Claim 6 is NOT patent eligible and therefore a rejection under 35 U.S.C. 101 is appropriate. 

Analysis of Claim 7: 
Claim 7 is dependent on Claim 1 and therefore recites the same abstract idea of Claim 1. Claim 7, however, further recites “wherein the associated label is based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE).” This limitation is considered an additional element. However, under Step 2A prong 2 and Step 2B, this limitation does NOT integrate the abstract idea into a practical application and does NOT amount to significantly more than the judicial exception because selecting the types of labels is insignificant extra-solution activity (MPEP 2106.05(g)). 
Because Claim 7 recites the same abstract idea as claim 1 and its additional elements fail Step 2A prong 2 and Step 2B, Claim 7 is NOT patent eligible and therefore a rejection under 35 U.S.C. 101 is appropriate. 
The examiner notes for clarity of record that claims 8, 10-14, 15, and 17-20 recite similar subject matter to that of claims 1 and 3-7. Therefore, for similar reasons, Claims 8, 10-14, 15, and 17-20 are rejected under 35 U.S.C. 101. 

Examiner’s Remarks
For clarity of record, the examiner notes the Ciena YouTube video prior art as used in the art rejection(s) below. The attached file (see PTO-892) includes the full transcription as well as pertinent screen shots. The included transcription comes from the transcription provided by YouTube. 

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 3-8, 10-15, and 17-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Representative Claim 1 (Claim 8 and Claim 15 recite similar language and are similarly rejected for the reasons below) recites, at least in part, as amended:
“…and second data including 
The examiner notes the stricken language (e.g. “any of”). As currently amended, this limitation appears to require that ALL of “tickets, alarms, and events” are included in the second data. This requirement of ALL of “tickets, alarms, and events” does NOT appear to be supported. 
Indeed, the applicant has NOT provided the paragraph numbers that support this amendment.
Turning now to the disclosure, the examiner first draws attention to Figure 24 which appears to be the best description of the above amended claim language.
In particular step “D21” recites:
“A) Inventory of Devices or Services, B) PM data from the devices or services, C) List of tickets, alarms or events affecting some of the above” 
The examiner notes the use of the term “OR”. Moving on to S21; while this step does recite “…based on the tickets, alarms, and events…”, this step clearly refers back to “C)” from above and recites “…from C)…” 
Therefore, under the Broadest Reasonable Interpretation of the claim language in light of the above disclosure the requirement of ALL “tickets, alarms, and events” is NOT supported; merely, “tickets, alarms, or events”. 
Turning now to the specification. It appears that paragraphs [00134] and [00136] set forth a similar understanding and recite, at least in part: 
“The automatic data labeling process 700 includes obtaining data including…and C) a list of tickets, alarms or events affect some of the above devices or services, associated with a timestamp…[00136] The automatic data labeling process 700 includes, based on the tickets, alarms, and events from C) in step D21…” 
Again, from the above disclosure, it is clear that the disclosure encompasses merely “tickets, alarms, or events” and does NOT have support for, as amended, “tickets, alarms, and events”.
Because the amended claim language does not have basis within the as-filed disclosure a rejection under 112(a) for new matter is appropriate. 
For purposes of examination, the limitation at issue will be interpreted as encompassing “…tickets, alarms, or events…” 
The examiner notes for clarity of record that the respective dependent claims are rejected because they depend on an appropriately rejected claim (e.g. the independent claims). 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

For clarity of record and ease of reading, the examiner notes the following: 
Any text that is bolded is a limitation of a claim. 
The “teaching” or reference citation, along with any necessary examiner notes are contained within the parentheses “()” following the bolded claim language. 
Any text that is underlined is emphasized language from reference(s) used and/or particular important examiner notes. While NOT fully reflective of the rejection as a whole, these underlined passages are indicative or otherwise reflective of key evidence.   

Claims 1, 3-8, 10-15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ciena (YouTube video: Alarm Correlation, NPL 2016, hereinafter “Ciena”) in view of Zhang et al. (“Automated IT System Failure Prediction: A deep learning approach”. NPL 2016, hereinafter “Zhang”). 
With respect to Claim 1, Ciena teaches A system comprising: a processor; and memory storing instructions that, when executed, cause the processor to obtain network data including first data of devices and services in the network, Performance Monitoring (PM) data associated with the devices and services and with associated timestamps, and second data including tickets, alarms, and events affecting some of the devices and services and with associated timestamps (Ciena. See Screen Shot at 1:10 which shows “Data acquisition”. Next, see Screen shot at 2:23. The examiner notes that the screen shot shows network component name. Showing and collecting the component name teaches “first data.” Next, see screen shot at 1:24. Note the third column which shows “Alarm”. Showing the alarm and/or alarm description (fourth column) teaches “Performance Monitoring data…” Additionally, as can be seen (sixth column) the “associated time stamp” is shown. Next, see alarm column as described above. This additionally teaches “…and second data including any of tickets, alarms, and events affecting some of the devices and services and with associated timestamps…”) 
obtain one or more target events from the second data based on associated operational impact in the network (Ciena time stamp 1:27-1:33. “Here we have some sample network data with close to 3,000 outstanding alarms. We were able to cluster the majority of these into 637 events…” Further note the screen shot at 2:23, which shows the impact of a particular alarm (see screen shot “Impact: Minor, non service affecting alarm”). The examiner notes that creating clusters (i.e. target events) based on clustering the alarm data and the associated impact of an alarm teaches “obtain one or more target events from the second data based on associated operational impact in the network”.) 
determine the PM data that is statistically correlated with the one or more target events (Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” The examiner notes that filtering alarms (i.e. PM data) associated with a selected event (i.e. one or more target events) by correlating alarms teaches “determine the PM data that is statistically correlated with the one or more target events”.).
determine the statistically correlated PM data over a corresponding time based on the associated timestamps of the PM data and the one or more target events (Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” Additionally, see screen shot at 1:24 and/or 2:23. Note that each alarm has an associated time stamp. Therefore, a person of ordinary skill in the art would infer that the correlation of the PM data is based, at least in part, on the corresponding time and the one or more target events.). 
Ciena, however, does not appear to explicitly disclose: 
provide labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events, and utilize a set of labeled data based on the provided labels to train a machine learning process
Zhang, however, teaches provide labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events…(Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Pg. 9 Col. 1 “A real failure example: We further delve into the details of some failure events from our dataset and examine what the “important” features are. For example, we found a failure event as shown in Figure 10 that is most likely due to the network failure, where the server cannot respond in time. These failure events can be effectively captured by our model and revealed by some important patterns such as P79 and P38.” The examiner notes that the features given labels are the patterns representing the performance of a component and/or network (e.g. see Figure 3). 
Therefore, the failure labels are “…for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events” as the claim requires.). 
…and utilize a set of labeled data based on the provided labels to train a machine learning process (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the statistically correlated PM data, collection of data, and determination of events as taught by Ciena modified with the providing of labels for at least the performance data as taught by Zhang because this labeling would provide a learning model with an accurate dataset for which to train on. This, in turn, would allow the training of an accurate prediction model (Zhang Pg. 2 Col. 2). 
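As an illustration only, the window-based failure labeling that Zhang describes (binary labels for samples falling within a time window W prior to a recorded failure) can be sketched as follows. The function name, the sample data, and the two-hour window are hypothetical and are not taken from Zhang, Ciena, or the claims.

```python
from datetime import datetime, timedelta


def label_samples(sample_times, failure_times, window):
    """Label a sample 1 if a recorded failure occurs within `window` after it, else 0."""
    labels = []
    for t in sample_times:
        # A sample is positive when some failure lies in [t, t + window].
        positive = any(timedelta(0) <= f - t <= window for f in failure_times)
        labels.append(1 if positive else 0)
    return labels


# Hypothetical data: hourly samples and one administrator-recorded failure.
samples = [datetime(2022, 1, 1, h) for h in range(6)]
failures = [datetime(2022, 1, 1, 4, 30)]
labels = label_samples(samples, failures, timedelta(hours=2))
print(labels)
```

Under these assumptions, only the samples taken within the two hours preceding the failure receive a positive label, and the resulting labeled set is the kind of data a supervised learning model would be trained on.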
With respect to Claim 3, the combination of Ciena and Zhang teaches subsequent to training a machine learning process with a set of labeled data based on the provided labels, obtain second PM data based on current operation of the network (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Zhang Pgs 5-6 Section V. “Our dataset has been collected from two large enterprise systems, a web server cluster (WSC) and a mailer server cluster (MSC)…Each cluster is composed of multiple components with various types of applications running on them. The historical logs span 541 days for WSC and 161 for MSC, where the system failures are recorded by the system administrators, whenever there is something wrong with a subsystem or component…For model training and evaluation, we split the dataset into training and testing sets in time order…” The examiner notes that applying training data to a machine learning process teaches “…training a machine learning process with a set of labeled data based on the provided labels…” Additionally, test data (second PM data) gathered and/or processed after the training (see Zhang c.f. Figure 1 for example) teaches “subsequent to…obtain second PM data based on current operation of the network.”).
Process the second PM data via the machine learning process (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Zhang Pgs 5-6 Section V. “Our dataset has been collected from two large enterprise systems, a web server cluster (WSC) and a mailer server cluster (MSC)…Each cluster is composed of multiple components with various types of applications running on them. The historical logs span 541 days for WSC and 161 for MSC, where the system failures are recorded by the system administrators, whenever there is something wrong with a subsystem or component…For model training and evaluation, we split the dataset into training and testing sets in time order…” The examiner notes that processing log data on a trained machine learning process and/or processing test data with the trained machine learning process teaches “process the second PM data via the machine learning process…”). 
Obtain predictions from the machine learning process based on labels associated with the set of labeled data (Zhang Pg. 2 Col. 1 “The main goal of the prediction task is to successfully signal an alert before the failure occurs.” Zhang Pg. 5 Col. 2 “To reiterate, we formalize the failure prediction problem as a binary classification problem, that is the target dt is a binary vector with 2 complementary classes. The output yt from the prediction network (Figure 4) is essentially a binary vector serving as a representation of the system status, which we can utilize to estimate the binomial distributions….”). 
With respect to Claim 4, the combination of Ciena and Zhang teaches wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for some prior time bins before each of the one or more target events (Ciena time stamp 1:27-1:33. “Here we have some sample network data with close to 3,000 outstanding alarms. We were able to cluster the majority of these into 637 events…” Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” Ciena screen shot at 1:57. The examiner notes that this screen shows the alarms (i.e. PM Data) associated with a cluster. As can be seen from the screen shot, each cluster is representative of an event (see “Clusters(Events).”). Further, it can be seen from observing each alarm that an associated timestamp is present and, in the specific alarms shown, each time stamp has a substantially similar time (e.g. 10:41:00 PM). Clustering alarms that have substantially similar timestamps, which in turn determines an event, teaches, under BRI, “wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for some prior time bins before each of the one or more target events”.).
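For illustration only, the claim language of measuring correlation "at a same time" as a target event and for "prior time bins" before it can be sketched as a lagged Pearson correlation between a binned PM series and a binned event indicator. The function names and the sample series are hypothetical and are not drawn from Ciena, Zhang, or the specification.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def lagged_correlation(pm, events, lag):
    """Correlate PM values taken `lag` bins before each event bin (lag 0 = same time)."""
    if lag == 0:
        return pearson(pm, events)
    return pearson(pm[:-lag], events[lag:])


# Hypothetical series: a PM spike occurs one time bin before each event.
pm = [0, 0, 5, 0, 0, 5, 0, 0]
events = [0, 0, 0, 1, 0, 0, 1, 0]
same_time = lagged_correlation(pm, events, 0)
one_bin_before = lagged_correlation(pm, events, 1)
print(same_time, one_bin_before)
```

In this hypothetical series the PM data is strongly correlated with the events one bin earlier and not at the same time, which is the distinction the two measurements in the claim language draw.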
With respect to Claim 5, the combination of Ciena and Zhang teaches wherein the network includes any of optical network elements, Time Division Multiplexing (TDM) network elements, Wavelength Division Multiplexing (WDM) networking elements, and packet network elements (Ciena see screen shot at time stamp 1:24. Note especially the component column (second column). Any or all of the components shown teach the claim language). 
With respect to Claim 6, the combination of Ciena and Zhang teaches wherein the devices in the network include a plurality of disparate types of devices from a plurality of equipment vendors (Ciena see screen shot at time stamp 1:24. Note especially the component column (second column). Any or all of the components shown teach the claim language. Additionally, see screen shot at 1:10 and note under data acquisition that the alarm correlation system described in the video is “multivendor ready”. Acquiring data in a multivendor ready system additionally teaches “wherein the devices in the network include a plurality of disparate types of devices from a plurality of equipment vendors”.). 
With respect to Claim 7, the combination of Ciena and Zhang teaches wherein the associated label is based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE) (Zhang Pg. 2 Col. 2 “The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by system administrators…” The examiner notes that because the labels are representative of failures on a network the labels are “…based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE)” as the claim requires.). 

With respect to Claim 8, Ciena teaches A method comprising: obtaining network data including first data of devices and services in the network, Performance Monitoring (PM) data associated with the devices and services and with associated timestamps, and second data including tickets, alarms, and events affecting some of the devices and services and with associated timestamps (Ciena. See Screen Shot at 1:10 which shows “Data acquisition”. Next, see Screen shot at 2:23. The examiner notes that the screen shot shows network component name. Showing and collecting the component name teaches “first data.” Next, see screen shot at 1:24. Note the third column which shows “Alarm”. Showing the alarm and/or alarm description (fourth column) teaches “Performance Monitoring data…” Additionally, as can be seen (sixth column) the “associated time stamp” is shown. Next, see alarm column as described above. This additionally teaches “…and second data including any of tickets, alarms, and events affecting some of the devices and services and with associated timestamps…”) 
obtaining one or more target events from the second data based on associated operational impact in the network (Ciena time stamp 1:27-1:33. “Here we have some sample network data with close to 3,000 outstanding alarms. We were able to cluster the majority of these into 637 events…” Further note the screen shot at 2:23, which shows the impact of a particular alarm (see screen shot “Impact: Minor, non service affecting alarm”). The examiner notes that creating clusters (i.e. target events) based on clustering the alarm data and the associated impact of an alarm teaches “obtain one or more target events from the second data based on associated operational impact in the network”.) 
determining the PM data that is statistically correlated with the one or more target events (Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” The examiner notes that filtering alarms (i.e. PM data) associated with a selected event (i.e. one or more target events) by correlating alarms teaches “determine the PM data that is statistically correlated with the one or more target events”.).
determining the statistically correlated PM data over a corresponding time based on the associated timestamps of the PM data and the one or more target events (Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” Additionally, see screen shot at 1:24 and/or 2:23. Note that each alarm has an associated time stamp. Therefore, a person of ordinary skill in the art would infer that the correlation of the PM data is based, at least in part, on the corresponding time and the one or more target events.). 
Ciena, however, does not appear to explicitly disclose: 
providing labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events, and utilizing a set of labeled data based on the provided labels to train a machine learning process
Zhang, however, teaches providing labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events…(Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Pg. 9 Col. 1 “A real failure example: We further delve into the details of some failure events from our dataset and examine what the “important” features are. For example, we found a failure event as shown in Figure 10 that is most likely due to the network failure, where the server cannot respond in time. These failure events can be effectively captured by our model and revealed by some important patterns such as P79 and P38.” The examiner notes that the features given labels are the patterns representing the performance of a component and/or network (e.g. see Figure 3). 
Therefore, the failure labels are “…for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events” as the claim requires.). 
…and utilizing a set of labeled data based on the provided labels to train a machine learning process (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the statistically correlated PM data, collection of data, and determination of events as taught by Ciena modified with the providing of labels for at least the performance data as taught by Zhang because this labeling would provide a learning model with an accurate dataset for which to train on. This, in turn, would allow the training of an accurate prediction model (Zhang Pg. 2 Col. 2). 
With respect to Claim 10, the combination of Ciena and Zhang teaches subsequent to training a machine learning process with a set of labeled data based on the provided labels, obtaining second PM data based on current operation of the network (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Zhang Pgs 5-6 Section V. “Our dataset has been collected from two large enterprise systems, a web server cluster (WSC) and a mailer server cluster (MSC)…Each cluster is composed of multiple components with various types of applications running on them. The historical logs span 541 days for WSC and 161 for MSC, where the system failures are recorded by the system administrators, whenever there is something wrong with a subsystem or component…For model training and evaluation, we split the dataset into training and testing sets in time order…” The examiner notes that applying training data to a machine learning process teaches “…training a machine learning process with a set of labeled data based on the provided labels…” Additionally, test data (second PM data) gathered and/or processed after the training (see Zhang c.f. Figure 1 for example) teaches “subsequent to…obtain second PM data based on current operation of the network.”).
Processing the second PM data via the machine learning process (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Zhang Pgs 5-6 Section V. “Our dataset has been collected from two large enterprise systems, a web server cluster (WSC) and a mailer server cluster (MSC)…Each cluster is composed of multiple components with various types of applications running on them. The historical logs span 541 days for WSC and 161 for MSC, where the system failures are recorded by the system administrators, whenever there is something wrong with a subsystem or component…For model training and evaluation, we split the dataset into training and testing sets in time order…” The examiner notes that processing log data on a trained machine learning process and/or processing test data with the trained machine learning process teaches “process the second PM data via the machine learning process…”). 
Obtaining predictions from the machine learning process based on labels associated with the set of labeled data (Zhang Pg. 2 Col. 1 “The main goal of the prediction task is to successfully signal an alert before the failure occurs.” Zhang Pg. 5 Col. 2 “To reiterate, we formalize the failure prediction problem as a binary classification problem, that is the target dt is a binary vector with 2 complementary classes. The output yt from the prediction network (Figure 4) is essentially a binary vector serving as a representation of the system status, which we can utilize to estimate the binomial distributions….”). 
With respect to Claim 11, the combination of Ciena and Zhang teaches wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for some prior time bins before each of the one or more target events (Ciena time stamp 1:27-1:33. “Here we have some sample network data with close to 3,000 outstanding alarms. We were able to cluster the majority of these into 637 events…” Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” Ciena screen shot at 1:57. The examiner notes that this screen shows the alarms (i.e. PM Data) associated with a cluster. As can be seen from the screen shot, each cluster is representative of an event (see “Clusters(Events).”). Further, it can be seen from observing each alarm that an associated timestamp is present and, in the specific alarms shown, each time stamp has a substantially similar time (e.g. 10:41:00 PM). Clustering alarms that have substantially similar timestamps, which in turn determines an event, teaches, under BRI, “wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for some prior time bins before each of the one or more target events”.).
With respect to Claim 12, the combination of Ciena and Zhang teaches wherein the network includes any of optical network elements, Time Division Multiplexing (TDM) network elements, Wavelength Division Multiplexing (WDM) networking elements, and packet network elements (Ciena see screen shot at time stamp 1:24. Note especially the component column (second column). Any or all of the components shown teach the claim language). 
With respect to Claim 13, the combination of Ciena and Zhang teaches wherein the devices in the network include a plurality of disparate types of devices from a plurality of equipment vendors (Ciena see screen shot at time stamp 1:24. Note especially the component column (second column). Any or all of the components shown teach the claim language. Additionally, see screen shot at 1:10 and note under data acquisition that the alarm correlation system described in the video is “multivendor ready”. Acquiring data in a multivendor ready system additionally teaches “wherein the devices in the network include a plurality of disparate types of devices from a plurality of equipment vendors”.). 
With respect to Claim 14, the combination of Ciena and Zhang teaches wherein the associated label is based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE) (Zhang Pg. 2 Col. 2 “The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by system administrators…” The examiner notes that because the labels are representative of failures on a network the labels are “…based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE)” as the claim requires.). 

With respect to Claim 15, Ciena teaches A non-transitory computer-readable medium comprising instructions for automatically labeling data from a telecommunication network, wherein the instructions, when executed, cause a processor to perform the steps of: obtaining network data including first data of devices and services in the network, Performance Monitoring (PM) data associated with the devices and services and with associated timestamps, and second data including any of tickets, alarms, and events affecting some of the devices and services and with associated timestamps (Ciena. See Screen Shot at 1:10 which shows “Data acquisition”. Next, see Screen shot at 2:23. The examiner notes that the screen shot shows network component name. Showing and collecting the component name teaches “first data.” Next, see screen shot at 1:24. Note the third column which shows “Alarm”. Showing the alarm and/or alarm description (fourth column) teaches “Performance Monitoring data…” Additionally, as can be seen (sixth column) the “associated time stamp” is shown. Next, see alarm column as described above. This additionally teaches “…and second data including any of tickets, alarms, and events affecting some of the devices and services and with associated timestamps…”) 
obtaining one or more target events from the second data based on associated operational impact in the network (Ciena time stamp 1:27-1:33. “Here we have some sample network data with close to 3,000 outstanding alarms. We were able to cluster the majority of these into 637 events…” Further note the screen shot at 2:23, which shows the impact of a particular alarm (see screen shot “Impact: Minor, non service affecting alarm”). The examiner notes that creating clusters (i.e. target events) based on clustering the alarm data and the associated impact of an alarm teaches “obtain one or more target events from the second data based on associated operational impact in the network”.) 
determining the PM data that is statistically correlated with the one or more target events (Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” The examiner notes that filtering alarms (i.e. PM data) associated with a selected event (i.e. one or more target events) by correlating alarms teaches “determine the PM data that is statistically correlated with the one or more target events”.).
determining the statistically correlated PM data over a corresponding time based on the associated timestamps of the PM data and the one or more target events (Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” Additionally, see screen shot at 1:24 and/or 2:23. Note that each alarm has an associated time stamp. Therefore, a person of ordinary skill in the art would infer that the correlation of the PM data is based, at least in part, on the corresponding time and the one or more target events.). 
Ciena, however, does not appear to explicitly disclose: 
providing labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events, and utilizing a set of labeled data based on the provided labels to train a machine learning process
Zhang, however, teaches providing labels, based on any of i) correlating the tickets to the alarms and ii) receiving an input from a user, for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events…(Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Pg. 9 Col. 1 “A real failure example: We further delve into the details of some failure events from our dataset and examine what the “important” features are. For example, we found a failure event as shown in Figure 10 that is most likely due to the network failure, where the server cannot respond in time. These failure events can be effectively captured by our model and revealed by some important patterns such as P79 and P38.” The examiner notes that the features given labels are the patterns representing the performance of a component and/or network (e.g. see Figure 3). Therefore, the failure labels are “…for the determined statistically correlated PM data with an associated label based on the associated target event of the one or more target events” as the claim requires.). 
…and utilizing a set of labeled data based on the provided labels to train a machine learning process (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the statistically correlated PM data, collection of data, and determination of events as taught by Ciena with the providing of labels for at least the performance data as taught by Zhang because this labeling would provide a learning model with an accurate dataset on which to train. This, in turn, would allow the training of an accurate prediction model (Zhang Pg. 2 Col. 2). 
With respect to Claim 17, the combination of Ciena and Zhang teaches subsequent to training a machine learning process with a set of labeled data based on the provided labels, obtaining second PM data based on current operation of the network (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Zhang Pgs 5-6 Section V. “Our dataset has been collected from two large enterprise systems, a web server cluster (WSC) and a mailer server cluster (MSC)…Each cluster is composed of multiple components with various types of applications running on them. The historical logs span 541 days for WSC and 161 for MSC, where the system failures are recorded by the system administrators, whenever there is something wrong with a subsystem or component…For model training and evaluation, we split the dataset into training and testing sets in time order…” The examiner notes that applying training data to a machine learning process teaches “…training a machine learning process with a set of labeled data based on the provided labels…” Additionally, test data (second PM data) gathered and/or processed after the training (see Zhang c.f. Figure 1 for example) teaches “subsequent to…obtain second PM data based on current operations of the network.”).
Processing the second PM data via the machine learning process (Zhang Pg. 1 c.f. Figure 1. Pg. 2 Col. 1 “As shown in Figure 1, our system first extract format patterns for heterogeneous logs by clustering similar logs together and extracting the format/structure of log clusters…These patterns are then passed to the feature representation module, where sequential features over time are extracted.” Pg. 2 Cols 1-2 Section II “Given a component (or a subsystem) of a computing system K and a collection of console logs L(K) from this component, infer the probability of a failure pi(W) occurring at this component within time window W. in order to solve problem 1 we will treat it as a supervised learning problem and in particular, as a binary classification problem for predicting failure events within a time window W prior to the occurrence of failures. The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by the system administrators…” Zhang Pgs 5-6 Section V. “Our dataset has been collected from two large enterprise systems, a web server cluster (WSC) and a mailer server cluster (MSC)…Each cluster is composed of multiple components with various types of applications running on them. The historical logs span 541 days for WSC and 161 for MSC, where the system failures are recorded by the system administrators, whenever there is something wrong with a subsystem or component…For model training and evaluation, we split the dataset into training and testing sets in time order…” The examiner notes that processing log data on a trained machine learning process and/or processing test data with the trained machine learning process teaches “process the second PM data via the machine learning process…”). 
Obtaining predictions from the machine learning process based on labels associated with the set of labeled data (Zhang Pg. 2 Col. 1 “The main goal of the prediction task is to successfully signal an alert before the failure occurs.” Zhang Pg. 5 Col. 2 “To reiterate, we formalize the failure prediction problem as a binary classification problem, that is the target dt is a binary vector with 2 complementary classes. The output yt from the prediction network (Figure 4) is essentially a binary vector serving as a representation of the system status, which we can utilize to estimate the binomial distributions….”). 
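For illustration only, and not as a characterization of Zhang's model or of the claimed invention, the train-then-predict sequence discussed for Claim 17 can be sketched as follows. The sketch substitutes a simple nearest-centroid classifier for the recurrent prediction network described in Zhang. The feature layout (`error_count`, `latency_ms`), the sample values, and the function names are all hypothetical assumptions.

```python
def train_centroids(features, labels):
    """Train on labeled data: compute the mean feature vector per label."""
    sums = {}
    counts = {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest to the new feature vector."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical historical data in time order: [error_count, latency_ms].
history = [[0, 10], [1, 12], [9, 80], [0, 11], [8, 75], [1, 9]]
# Hypothetical labels supplied by an operator: 1 = failure, 0 = normal.
labels = [0, 0, 1, 0, 1, 0]

# Step 1: train on the labeled set.
model = train_centroids(history, labels)

# Step 2: obtain later ("second") data and process it with the trained model.
second_data = [[7, 70], [0, 13]]
predictions = [predict(model, vec) for vec in second_data]
```

In this sketch the predictions are drawn from the same label set used in training, which mirrors the claim language "predictions ... based on labels associated with the set of labeled data."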
With respect to Claim 18, the combination of Ciena and Zhang teaches wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for some prior time bins before each of the one or more target events (Ciena time stamp 1:27-1:33. “Here we have some sample network data with close to 3,000 outstanding alarms. We were able to cluster the majority of these into 637 events…” Ciena time stamp 1:37-1:44 “This cluster distribution list clearly shows which events caused the largest number of alarms and therefore which clusters to target first.” Time stamp 1:55-2:13 “If we click on a cluster we see that both the map and the alarm details filter to those in the selected cluster. Next to the standard alarm information we have an inter cluster indication. Filtering to those inter clusters shows which alarms are most highly correlated to each other.” Ciena screen shot at 1:57. The examiner notes that this screen shows the alarms (i.e. PM data) associated with a cluster. As can be seen from the screen shot, each cluster is representative of an event (see “Clusters(Events).”). Further, it can be seen from observing each alarm that an associated timestamp is present and, in the specific alarms shown, each timestamp has a substantially similar time (e.g. 10:41:00 PM). Clustering alarms that have substantially similar timestamps, which in turn determines an event, teaches, under the broadest reasonable interpretation (BRI), “wherein the statistical correlation includes measuring correlation of the PM data at a same time as each of the one or more target events and measuring the correlation of the PM data for some prior time bins before each of the one or more target events”.).
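For illustration only, and not as a characterization of what Ciena discloses, one common way to measure correlation "at a same time" and "for some prior time bins" is a lagged Pearson correlation between a PM series and an event indicator series. The series values, the one-value-per-bin layout, and the function names below are hypothetical assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlations(pm, events, max_lag):
    """Correlate pm[t - lag] with events[t] for lag = 0 .. max_lag.

    lag 0 compares PM data in the same time bin as the event;
    lag k compares PM data k bins before the event.
    """
    result = {}
    for lag in range(max_lag + 1):
        if lag == 0:
            xs, ys = pm, events
        else:
            xs, ys = pm[:-lag], events[lag:]
        result[lag] = pearson(xs, ys)
    return result


# Hypothetical data: one PM value and one event flag per time bin.
pm_values = [0, 0, 5, 1, 0, 0, 6, 1, 0, 0]
event_flags = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]

corr = lagged_correlations(pm_values, event_flags, max_lag=2)
best_lag = max(corr, key=corr.get)
```

With this sample data the PM value spikes one bin before each event, so the correlation is strongest at a lag of one bin and weaker in the same bin and two bins earlier.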
With respect to Claim 19, the combination of Ciena and Zhang teaches wherein the network includes any of optical network elements, Time Division Multiplexing (TDM) network elements, Wavelength Division Multiplexing (WDM) networking elements, and packet network elements (Ciena, see screen shot at time stamp 1:24. Note especially the component column (second column). Any or all of the components shown teach the claim language). 
With respect to Claim 20, the combination of Ciena and Zhang teaches wherein the associated label is based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE) (Zhang Pg. 2 Col. 2 “The training data for our supervised learning model consist of features extracted from the console logs of various computing systems…and failure labels provided by system administrators…” The examiner notes that because the labels are representative of failures on a network, the labels are “…based on one or more of a risk assessment of network equipment, service assurance, and application Quality of Experience (QoE)” as the claim requires.). 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FEN TAMULONIS whose telephone number is (571)272-0934. The examiner can normally be reached 7:30AM-5:30PM MON-FRI EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo can be reached on (571)-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/FEN CHRISTOPHER TAMULONIS/Examiner, Art Unit 2126  
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126