DETAILED ACTION
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on December 30th, 2020 has been entered.
 This action is in response to the amendments filed on December 30th, 2020. A summary of this action:
Claims 2-6, 8-12, 14-21 have been presented for examination.
Claims 19-21, 2, 14 have been amended
Claims 19-21 are objected to for informalities
The specification is objected to for new matter
Claims 2-6, 8-12, 14-21 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement.
Claims 2-5, 8-11, 14-17, 19-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shibuya et al., US 2012/0290879 in view of Maeda et al., US 2012/0041575 and in further view of Zhang et al., “KRNN: k Rare-class Nearest Neighbour classification”, 2016
Claims 6, 12, 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shibuya et al., US 2012/0290879 in view of Maeda et al., US 2012/0041575 and in further view of Zhang et al., “KRNN: k Rare-class Nearest Neighbour classification”, 2016 in further view of Skand, “kNN(k-Nearest Neighbour) Algorithm in R”, 2017
This action is made non-Final

	Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment/Arguments
Regarding the § 103 Rejection
Applicant’s arguments with respect to the rejection under § 103 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

With regard to the applicant’s arguments on claim interpretation, these are not persuasive. The claims are given their broadest reasonable interpretation, as understood by one of ordinary skill in light of the specification. Limitations are not read in from the specification.
E.g., the applicant’s argument on page 12 recites, in part: “A feature typically represents a pattern that occurs in multiple pieces of time series data that all correspond to the same condition of a machine”. Claim 19 recites “each first time series being associated with at least one feature vector for at least one time point”.
As recited in claim 19, there is at least one feature vector associated with each first time series. This is what the claim recites. The claim does not recite “represents a pattern that occurs in multiple pieces of time series data that all correspond to the same condition of a machine...”
 Limitations are not read in from the specification. 
With regard to the arguments for the applicant’s previous claims (Remarks, page 13), wherein the time series comprise feature vectors: this was rejected because these series are received from “sensors”. The specification also conveys this, and conveys that features are extracted from the signals/time series, wherein the extraction occurs as part of the claimed system and not in the “sensors”. The applicant may amend the claims to better reflect the disclosed invention described in ¶ 25-26 for the “feature identification logic”; however, limitations are not read in from the specification.
 

Regarding the § 112(a) Rejection
	In light of the applicant’s amendments, the § 112(a) rejection is withdrawn in part, and MAINTAINED in part. 
	See below for the grounds that are maintained. 
	
In addition, the Examiner notes that on page 17 of the Remarks, the applicant submitted:
In addition, since a ratio of the "first ratio" and the "second ratio" is to be taken, as recited in Claim 19, it is meaningless to have the same denominator in the two ratios, and the description in para. [0043] of the application publication is clearly a typo...

This argument has been fully considered – see the rejection for new matter below.
Also, and as demonstrated below in the new matter rejection, the original claims, the original disclosure, and several subsequent rounds of amendments contained this “meaningless” recitation. 
	The argument that this was “clearly a typo” is not persuasive, as the prosecution history clearly shows that this scope has been claimed numerous times in different manners by the applicant. 
	The argument is not persuasive; the rejection is maintained.

In addition, the applicant submits (Remarks, page 17):
As discussed previously and indicated in the last Office Action response, the support for the above-quoted feature of Claim 19 is found in para. [0042] of the application publication, for example. That paragraph states that "The contribution may be further refined by determining the relative density of the label in the neighborhood to the density of the label in the model overall", which is defined by "( c) a number of model condition points in the model having a same label as the labeled point and ( d) the set number plurality of model condition points", as recited in Claim 19. 

	The applicant’s arguments are not persuasive.
	The specification does not support their argument. The “label density”, as recited in ¶ 42, contains no clear link to ¶ 43 and the first/second ratio.
	In addition, ¶ 43 clearly sets forth what is meant by the first/second ratio. One of ordinary skill would not reasonably infer from the specification and original claims, as filed, that ¶ 43 was disclosing a different equation than the one that was originally filed. The applicant’s interpretation is inconsistent with the disclosure as filed.

	In addition, ¶ 43 conveys and supports the equation in claim 21, not two distinct claim scopes that draw from the same support.
	The applicant’s argument requires that a person of ordinary skill somehow, in some manner, recognize that a single paragraph, ¶ 43, conveys two distinct claim scopes. This is entirely unreasonable: ¶ 43, as originally filed, conveys support for the second ratio as recited in claim 21, NOT support for the second ratio/percentage in claims 19-20.
	In other words, one of ordinary skill would not recognize that ¶ 43 explicitly supports claims 19-20, nor would they recognize that ¶ 42-43 imply some support for claims 19-20.
	To do so, they would have to infer that the disclosure in ¶ 42-43 supports two distinctly different interpretations of ¶ 43’s ratios as filed. This is unreasonable.

	In addition, claim 19 does not recite “( d) the set number plurality of model condition points” as argued by the applicant. Instead, claim 19 recites “( d) the plurality of model condition points;”, which is why claim 19 is rejected under § 112(a) for new matter.

The applicant further submits:
The Office Action also asserts that a feature vector is for the entirety of the time series but not for at least one time point. Nothing in the specification limits the scope of a feature vector. In an extreme case, the feature vector consists of one feature over multiple signals covering one time point
This is not persuasive. 
	The applicant is claiming this limited scope of a feature vector being “for at least one time point”, e.g. to encompass matter such as “one feature over multiple signals covering one time point”.
	The specification does not convey such a feature. The feature vectors are associated with the time series for a range of time, see the § 112(a) rejection.
	The applicant’s claim reflects a scope that one of ordinary skill would not reasonably ascertain, from the specification, the applicant actually possessed at the time of filing.
	And while, under the BRI, a feature vector associated with a time series may reasonably encompass a feature vector representing only a portion of said time series, nothing in the specification actually conveys this feature. At most, and as previously noted, ¶ 33 conveys that the feature vector is a function of time, but not that it is a time series. Because the feature vector is extracted from the time series in the specification, the specification at most conveys that the feature vector is a function of the time of the time series.
	The applicant has written possession for that which was originally filed. One of ordinary skill would not reasonably recognize that the applicant was in possession of undisclosed features of the claimed invention.
  
See MPEP § 2163.02:
The courts have described the essential question to be addressed in a description requirement issue in a variety of ways. An objective standard for determining compliance with the written description requirement is, "does the description clearly allow persons of ordinary skill in the art to recognize that he or she invented what is claimed."
The subject matter of the claim need not be described literally (i.e., using the same terms or in haec verba) in order for the disclosure to satisfy the description requirement. If a claim is amended to include subject matter, limitations, or terminology not present in the application as filed, involving a departure from, addition to, or deletion from the disclosure of the application as filed, the examiner should conclude that the claimed subject matter is not described in that application.
The applicant’s arguments indicate that the statement “The subject matter of the claim need not be described literally” somehow would show written possession of undisclosed features of the invention. 
This is entirely unreasonable: the feature being claimed must be disclosed in such a fashion as to “allow persons of ordinary skill in the art to recognize that he or she invented what is claimed”.
The statement “The subject matter of the claim need not be described literally” still requires that the “subject matter” be disclosed and described so as to allow a person of ordinary skill to recognize that the claimed invention is what the inventor actually invented.

The Examiner does note that while ¶ 25-26 do discuss that “feature vectors represent sets of signal data...for a particular range of time”, the claims recite “for a range of time, a first...”, i.e. that the feature vector represents the time series with which it is associated for the “range of time” of the time series.

As such, the “range of time” that the applicant appears to be relying upon for the “at least one time point” is already being used to support the verbatim phrase “range of time” in the claims, and therefore does not provide support for the “at least one time point” as recited in the present claims.
Furthermore, the applicant’s position now requires that, within this “range of time”, i.e. this “time window”/“time duration window” (¶ 25 and ¶ 26), one of ordinary skill would somehow recognize that the invention had a feature in which a feature vector is only for a single time point. There is NOTHING in the specification to even suggest such a claimed feature as being part of what the inventor of the instant invention actually invented.

The applicant also submits (Remarks, page 15):
The Office Action asserts that there is no time series in the subspace but only a feature vector of the time series. As discussed above, even a component of the feature vector can still be a time series. ...
	
There is nothing in the disclosure which provides support for the invention actually having a “feature vector” which “can still be a time series”. While the claimed invention may encompass such an embodiment under the BRI and obviousness, nothing in the disclosure suggests that this is what the inventor “invented”, i.e. one of ordinary skill would not “recognize” that the claimed invention is, in fact, what the inventor of the disclosed invention actually invented. 
	As per MPEP § 2163.02: “An objective standard for determining compliance with the written description requirement is, "does the description clearly allow persons of ordinary skill in the art to recognize that he or she invented what is claimed."”

	Nothing in the specification supports that the disclosed invention includes a “feature vector” and/or features which are “time series” as used in the claim.
	The claim, as recited, clearly sets forth that the “time series” are from the signal data.
	The claims do not preclude a feature vector from being a time series feature vector, e.g. such as taking the derivative or a filtered version of the “time series” from the signals, i.e. the claims are not limited from encompassing this scope.
	But that does not mean the applicant may claim undisclosed features/subject matter – this feature/subject matter was undisclosed and as such one of ordinary skill would NOT recognize that the applicant actually invented this subject matter. 
Nothing in the specification even recites what the “feature vectors” or features being used actually are or even what they could be; instead, the specification merely discloses using the term “feature vector”. See above for the standard used for § 112(a). At most, ¶ 25 conveys embodiments of the “Feature identification logic”, but the claims fail to even recite a scope including the “Feature identification logic”, let alone either of these embodiments in ¶ 25. Instead, the claims merely recite that the feature vectors are received/obtained, by any possible means, e.g. retrieving from a 2nd system, retrieving from storage, using feature extraction/feature identification, etc., as there is no limitation in the claims as to what actually obtains the feature vectors, let alone any limitation on how the feature vectors are actually extracted.
The claimed invention is not supported by the instant specification. 

Claim Objections
Claims 19-21 are objected to because of the following informalities:
Claim 19 recites “each first time series being associated with at least one feature vector...” and then later recites “projecting...evaluating at least one feature vector associated with the first time series” – the second recitation should recite “the at least one feature vector” as this element is already presented previously in the claim. Claims 20 and 21 contain similar recitations and are objected to under a similar rationale. 
Claim 19 recites “a second ratio of...points in the model”; however, “model” was not previously recited. This should read “a number of model condition points of the plurality of model condition points...”. Claim 21 contains a similar recitation and is objected to under a similar rationale.


Specification
The amendment filed December 30th, 2020 is objected to under 35 U.S.C. 132(a) because it introduces new matter into the disclosure.  35 U.S.C. 132(a) states that no amendment shall introduce new matter into the disclosure of the invention.  The added material which is not supported by the original disclosure is as follows: 
The amendment cancelled part of the denominator in a disclosed equation.
This materially changes the disclosed invention, i.e. the invention originally disclosed equation A; it now discloses equation B.
Applicant is required to cancel the new matter in the reply to this Office Action.

In addition, the Examiner notes that the applicant has described the original specification as reciting a “meaningless” equation.
This is of particular note – see Remarks, page 17. The applicant’s argument is that a recitation such as the originally presented one is “meaningless”, i.e. with a lack of meaning, devoid of meaning. 

The Examiner refers the applicant to present claim 21, which now recites a “meaningless” limitation.
	Clearly, this cancellation of subject matter in the specification is an introduction of new matter.

	To clarify, with reference to the prosecution history, how this amendment to the specification is new matter:

The original equation disclosed in the specification is, using the claim terms:

\[
\text{Second ratio} = \frac{\text{a number of model condition points in the model having a same label as the labeled point}}{\text{the set number of model condition points in the analysis neighborhood}}
\]

The newly amended equation is:

\[
\text{Second ratio} = \frac{\text{a number of model condition points in the model having a same label as the labeled point}}{\text{the plurality of model condition points}}
\]

This is for the claimed “second ratio”
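
For illustration only, the difference between the two denominators can be shown with hypothetical values that do not appear in the record. Assume, purely as an example, that c = 20 model condition points in the model have the same label as the labeled point, that the set number of model condition points in the analysis neighborhood is k = 5, and that the plurality of model condition points numbers N = 100 in total (reading “the plurality of model condition points” as a count is itself an assumption made for this example). Under these assumed values:

\[
\text{Second ratio (original)} = \frac{c}{k} = \frac{20}{5} = 4,
\qquad
\text{Second ratio (amended)} = \frac{c}{N} = \frac{20}{100} = 0.2
\]

For any nonzero c, the two expressions coincide only when k = N, i.e. only if the set number equaled the total number of model condition points.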

The applicant’s argument is that “it is meaningless to have the same denominator in the two ratios, and the description in para. [0043] of the application publication is clearly a typo” (Remarks, page 17)

However, the original claims as filed, as well as later amendments, all use the original denominator, i.e. the original claims and numerous amendments during the course of prosecution include such a recitation. Nothing in the prosecution history indicates this is a “typo”; instead, this amendment to the specification is clearly an attempt by the applicant to introduce new matter into the specification.

Original claim 1, 2/27/2018:
calculating a second ratio of (c) points in the model having a same label as the labeled point and (d) the set number;

In terms of the present claim language, the original claims recited:

\[
\text{Second ratio} = \frac{\text{a number of model condition points in the model having a same label as the labeled point}}{\text{the set number of model condition points in the analysis neighborhood}}
\]


Claim Amendments for claim 19 on 11/8/2019:
calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the set number of model condition points in the analysis neighborhood;

In terms of the present claim language, the claims recited:

\[
\text{Second ratio} = \frac{\text{a number of model condition points in the model having a same label as the labeled point}}{\text{the set number of model condition points in the analysis neighborhood}}
\]
                        l
                        y
                        s
                        i
                        s
                         
                        n
                        e
                        i
                        g
                        h
                        b
                        o
                        r
                        h
                        o
                        o
                        d
                        ;
                    
                
            
        

This was also recited in the claims in the amendments filed on 12/12/2019 and 02/03/2020, as well as in present claim 21.

	The amendment to the specification introduces new matter and, as such, is objected to. 

Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 2-6, 8-12, 14-21 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for 

Independent claims 19-21 are not supported by the instant specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor…at the time the application was filed, had possession of the claimed invention. Claim 19 recites (claims 20-21 recite substantially similar subject matter):
...each first time series being associated with at least one feature vector for at least one time point,...
...each second time series being associated with at least one feature vector for at least one time point,...

The above recited limitations are not supported by the instant specification. The specification sets forth that the “feature vector” is associated with the “time series”, i.e. “Feature vectors represent sets of signal data from one or more sensors for a particular range of time”. In other words, the specification sets forth that a “time series”/”time series signal” has an associated “feature vector” for the “time series”, wherein the “time series” is “signal data from one or more sensors for a particular range of time”. 
This does not support the claimed limitation of “each time series comprising at least one feature vector for at least one time point”:
1) the “time series” is “associated with the” “feature vector”, and as per figure 2 is “extracted” from the time series signal, i.e. a feature vector is extracted from the time 
2) the “feature vector” as described in the specification is “for a particular range of time”, i.e. a “time series”, in other words a “feature vector” is for the entirety of the “time series” including all “time point”, in other words the specification does not support that the “feature vector for at least one time point”, as the specification does not support that the “feature vector” is for only “one time point”, or any subset of time points less than the entirety of the “time series”. 
The closest support is in ¶ 47 which recites “The explanation process 700 continues by generating a signal projection space 704 for a first signal to analyze, and retrieving a feature vector for the signal 706. These steps are repeated for each signal and each feature vector associated with the selected signal.”
Further see ¶ 26 which recites “The feature identification logic 112 provides instructions to aggregate the multiple sets of signal data into one or more feature vectors. Feature vectors represent sets of signal data from one or more sensors for a particular range of time.”
Also see ¶ 25 which recites, in part a “window size for evaluating multiple sets of signal data” wherein the “set of signal data points within the time duration window” are “reduced” to “a feature vector of reduced dimensionality”. This is interpreted that the “feature vectors” include being used for dimensional reduction of a time series, i.e. “set of signal data points within the time duration window” are interpreted to be a “time series”. 

In addition, see figure 2 # 124-#204: 

[Figure: media_image1.png]


Claims 20-21 recite substantially similar subject matter and are rejected under a similar rationale. 
The claimed invention is not supported by the instant specification. 


Independent claims 19-20 are not supported by the instant specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor…at the time the application was filed, had possession of the claimed invention. Claim 19 recites:
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the plurality of model condition points;
calculating a contribution of each of the plurality of signals to the first condition, based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace, to form a sorted list of signals and contributions;

	The claimed invention is not supported by the written specification. The closest support is in ¶ 43 which recites “A first ratio or percent…may be formed as (1) a numerator that is the number of points in the constrained neighborhood of model data points around the point being evaluated and having a same label as the point, and (2) a denominator that is the total number of model data points in the constrained neighborhood. A second ratio or percent for finding the signal contribution may be formed as (1) a numerator that is the number of points in the model data having a same label as the point, and (2) a denominator that is the total number of model data points in the constrained neighborhood”
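	The claim is not supported by this portion, or by any other readily apparent portion, of the specification: claim 19 recites the denominator of the second ratio as “the plurality of model condition points”, whereas ¶ 43 describes that denominator as “the total number of model data points in the constrained neighborhood”. 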

Claim 20 recites:
logic to calculate, for each signal projection subspace, a second percent of model condition points in the model having a same label as the labeled point out of the plurality of model condition points
logic to calculate a contribution of each of the plurality of signals to the first condition from the first percent and the second percent calculated for the corresponding signal projection subspace,

	The claimed invention is not supported by the written specification. The closest support is in ¶ 43 which recites “A first ratio or percent…may be formed as (1) a numerator that is the number of points in the constrained neighborhood of model data points around the point being evaluated and having a same label as the point, and (2) a denominator that is the total number of model data points in the constrained neighborhood. A second ratio or percent for finding the signal contribution may be formed as (1) a numerator that is the number of points in the model data having a same label as the point, and (2) a denominator that is the total number of model data points in the constrained neighborhood”
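	The claim is not supported by this portion, or by any other readily apparent portion, of the specification: claim 20 recites the second percent as being “out of the plurality of model condition points”, whereas ¶ 43 describes the denominator of the second ratio or percent as “the total number of model data points in the constrained neighborhood”. 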


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention 

Claims 2-5, 8-11, 14-17, 19-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shibuya et al., US 2012/0290879 in view of Maeda et al., US 2012/0041575 and in further view of Zhang et al., “KRNN: k Rare-class Nearest Neighbour classification”, 2016

Regarding Claim 19.
Shibuya teaches: 
	A control method for a machine system comprising sensors generating a plurality of signals over time, the method comprising: (Shibuya, abstract, teaches an anomaly detection system for a “facility”, e.g. see ¶ 2-3 – this is for facilities that include machine systems such as a “windmill”, a “nuclear reactor”, etc. – then see figure 9A which shows an example embodiment in which 4 signals are received wherein ¶ 105 teaches these signals are “sensor signal”  and see figure 1 – this is to “diagnose” an anomaly in a machine system based on signals received from sensors)

[Figure: media_image2.png]

	receiving a labeled point comprising, for a range of time, a first time series from each of the plurality of signals, each first time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the labeled point having a first label describing a first condition of a plurality of conditions of the machine system, each of the plurality of signals having a corresponding signal projection subspace; (Shibuya, see figure 19 and ¶ 169 – 170 – this is a system which combines both the “first embodiment” and the “second embodiment” wherein sensor signal(s) are received from a facility along with an “event signal” wherein the “event signal” is used for a “mode dividing unit”, i.e. the signals from each sensor are divided into a plurality of time series due to the “mode” – see ¶174 “Meanwhile, the mode dividing unit to clarify – the system receives a plurality of sensor signals over a period of time, segments the signals into time series for each mode [each label], and extracts feature vectors for each cycle/mode, i.e. the system receives from this process a labelled point for each mode comprising time series from each signal and the associated feature vectors for each time series for each mode, e.g. for the mode “start” the system uses the label of this mode to divide/segment the received sensors, and for the segment of time series received during the “start” mode the system performs a feature extraction into a feature space, i.e. “data is extracted....every period” – see figures 9A and 9B for the plurality of signals and associated features
for more clarification also see ¶ 72 teaches “The event signal 103 is a signal indicating an operation, a failure, or a warning of the facility which is output irregularly and is constituted by a character string indicating the time and the operation, the failure, or the warning.”, e.g. a 
in regards to claim interpretation – the system of Shibuya is obviously receiving a continuous stream of data segmented into “modes”, i.e. this is for anomaly detection; it would have been obvious that a newly received “mode”, e.g. “start”, and the associated signal data/feature vectors would have been encompassed by the labelled point, i.e. the “mode” “start” obviously is a label which describes a condition of the machine system starting, wherein this “mode” comprises the time series data from the signals for that particular “mode”, and wherein the system extracts feature vectors for that “mode” 
to clarify and in regards to the feature space– see ¶ 88 “The feature amount extraction is considered using the sensor signal as it is. A window of ±1, ±2, etc., is set with respect to a predetermined time and a feature indicating a time variation of data may be extracted by a feature vector of a window width (3, 5, etc.,) x the number of sensor”, in other words a feature vector for all the signals for a “window” of time [e.g., the operating cycle such as “start”] is extracted wherein the feature vector comprises a separate feature vector for each of the “number of sensor” [e.g. see figure 9B] wherein each sensor has its own feature space [as it is obviously a feature vector that was extracted forming a feature space for each vector, also see figure 13 for an example], in other words there is a joint feature space divided into feature subspaces for each of “the number of sensors” wherein for each sensor there is a feature space for the sensor with at least one associated feature vector, also see ¶ 143 and figure 13 for an example – a “feature” space is created for each signal, e.g. the “daily mean” of the signal 

[Figure: media_image3.png]


[Figure: media_image4.png]


	projecting each of a plurality of model condition points into each signal projection subspace of the plurality of signal projection subspaces, each model condition point comprising a second time series from each of the plurality of signals, each second time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the model condition point having a second label describing one of the plurality of conditions of the machine system, the projecting comprising evaluating at least one feature vector associated with the second time series from each of the plurality of signals in the corresponding signal projection subspace; (Shibuya, see figure 19 as cited above and as detailed above, specifically see the # 1903 for the “Learning-Data selecting unit” for the model creation – this selects the “learning data” wherein ¶ 173 teaches “The sensor signal 102 output from the facility 101 is accumulated for learning in advance. The feature amount extraction unit 1901 inputs the accumulated sensor signal 102 and performs feature amount extraction to acquire the feature vector. The feature-selection unit 1902 performs data check of the feature vector output from the feature amount extraction unit 1901 and selects a feature to be used. The learning data selecting unit 1903 performs data check of the feature vector configured by the selected feature and check of the event signal 103 and selects the learning data used to create the normal model”, in other words the system accumulates sensor(s) signal data over a period of time for learning, and uses these plurality of time series as model condition points for creating a “normal model” – to clarify see figure 5 and ¶ 83-86 – the “normal model” is created by using the accumulated signals from the same sensors “during a predetermined period” and dividing these signals for the “mode”, i.e. 
¶ 91 “the normal-model creation unit 106 classifies the learning data selected in step S503 for each of the modes divided by the mode dividing unit 104 and creates the normal model for each mode in step S505.”, and then see ¶ 179 “When the normal model is created for each mode, the anomaly measurement is computed by using the normal models of all the modes and the minimum value is acquired.” and ¶ 161-163 which teach creating “plural normal models” [model condition points] for each “cycle”/mode by “random sampling” for “several cycles” and see ¶ 178 “The feature amount extraction unit 1901 inputs the sensor signal 102 and performs the same feature amount extraction as that at the learning time to acquire the feature vector”, in other words the system obtains model condition points [plural normal models] for each mode [points associated with labels for a plurality of conditions, e.g. start/stop/on/off] wherein each model condition point comprises time series from each of the sensors [random sampling over several cycles for accumulating sensor signal data] wherein each of these time series has at least one feature vector [using the “same feature...extraction”] as the labelled point and wherein these model condition points are then used for classifying (fig. 5, “classify data for each mode”) using the feature vectors – in regards to this being in the signal projection subspace see above – the system is using the same feature extraction method for both the labelled input data and the learning data, so obviously these are in the same signal projection feature spaces, e.g. 
figure 13 shows an example – obviously as figure 13 is an example comprising a plurality of points for “every one day” (¶ 143) this shows not just the feature space/feature vectors for the most recent data [the labelled point for the mode] but also shows the learning data that was previously accumulated, and ¶ 143 then clarifies, as cited above, that the period/cycle for this would also obviously be by mode, e.g. “when the starting/stopping time are known” [starting/stopping modes], to clarify – the system extracts features, e.g. the mean for each mode over several cycles of each mode, to form a feature space for each signal, i.e. each signal has a feature space which comprises the received data for each mode, including the labelled point [e.g., the point being the data for the most recent “start” cycle] wherein the system then projects the “learning data” into the same feature space for each signal as the system stores the “learning data” and extracts the feature vectors using the “same” algorithm as for the labelled point, e.g. see figure 13, also see ¶ 88 as cited above, and also see figure 6 and ¶ 93 which provides an example of a “3D feature space” and clarifies that “the dimension of the feature space may be...higher”, i.e. the feature space for each signal, in figure 6 the “evaluation data” is the projected feature vectors from the “the number of the learning data” (see ¶ 93), and in regards to evaluating the at least one feature vector in the subspace see ¶ 93-99 which provides an example of evaluating the feature vectors using a “local sub-space classifier” which creates a subspace in the feature space, e.g., for a 3D space it creates a 2D plane to evaluate the “distance between the evaluation data [learning data] and the point b [the labelled point]” for the normal model, e.g. 
this evaluation is using the “mean...and covariance matrix...of the learning data” for the feature space, also see ¶ 99 for other methods of creating “the normal model” in the feature space such as “a nearest method” or a “similarity base model” )	

[Figure: media_image5.png]

	projecting the labeled point into each signal projection subspace, comprising evaluating at least one feature vector associated with the first time series from each of the plurality of signals in the corresponding signal projection subspace; (Shibuya, as cited above, e.g. see figures 5 and 13, also see figure 17 and ¶ 158 – the “features” from both the “learning data” and the newly received labelled point [i.e. a plurality of time series labelled with a model from the facility sensors] are projected into the same feature subspace for each signal wherein ¶ 88 as cited above provides an example of feature extraction for forming a feature vector for each signal, wherein these feature vectors for each signal are then turned into “a feature vector of a window width (3, 5, etc.,) x the number of sensors.”, i.e. there is a feature vector for the in other words the system evaluates the feature vectors of both the model condition points/learning data and the labelled point, e.g. an “anomaly” to find a distance between the feature vectors to determine the “deviation” of each signal)

[Figure: media_image6.png]

	[...]
calculating a contribution of each of the plurality of signals to the first condition..., to form a sorted list of signals and contributions; (Shibuya, ¶ 117 “Further, since it is considered that a signal having a large deviation [contribution to the first condition/label] when the anomaly occurs contributes to the anomaly judgment, when the signals are displayed in the order of the large deviation from the top, it is easily verified which sensor signal has the  wherein this is used to detect an anomaly occurring in the received “event” (¶ 120) which, as per at least figure 19 as cited above, is the mode label from the event signal – in other words the system determines for each newly received mode/”cause event” [including the first label] the deviation of each signal compared to the “normal” signal/model(s) and sorts the signals into an ordered list by the “order of large deviation from the top” to determine “which sensor signal has the anomaly” [which sensor signal contributes the most to the first label/cause event] – to clarify this obviously forms a sorted list of signals based on the contribution [e.g., the “deviation”] of the signal to the event label/mode of the received data, wherein the deviation is a measure of how much the present labelled point/associated time series/associated feature vectors vary/deviate from the normal case, i.e. this shows how much each signal contributes to the first condition/label)
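
For illustration only, the ordering described in Shibuya ¶ 117 (signals displayed “in the order of the large deviation from the top”) can be sketched in Python as follows. The function name, signal names, and deviation values are hypothetical and are not taken from Shibuya or from the instant application.

```python
def sort_signals_by_deviation(deviations):
    """Return (signal, deviation) pairs ordered from largest to smallest deviation."""
    return sorted(deviations.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-signal deviations from the normal model (illustrative values only).
example_deviations = {"sensor_a": 0.12, "sensor_b": 0.87, "sensor_c": 0.45}

# The signal with the largest deviation appears first in the sorted list.
ranked = sort_signals_by_deviation(example_deviations)
```

In this sketch the first entry of the sorted list is the signal that deviates most from the normal case.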

Shibuya does not explicitly teach:
	constraining an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace;
	for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood;
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the plurality of model condition points;
	... based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace...
	and adapting the machine system's behavior based on the sorted list. 

Maeda teaches: 
constraining an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace; (As an initial matter, the inventors for both Shibuya as relied upon above and Maeda are the same, and then see Maeda the abstract – this is for identifying an abnormal “sensor signal” and see figures 3-4, Maeda is for a related invention to the Shibuya invention, then see figure 11 and the description starting in ¶ 86 – Maeda is using a method similar to a “k-NN method”, i.e. the “k” [a set number of model condition points] which are the “nearest neighbors” wherein kNN uses a “distance within a feature space” wherein ¶ 87 then teaches that this is to find the “k pieces of data with highest similarities to the unit for deciding normal range” and then see ¶ 88 and figure 13 – the system is finding “a number k” of the nearest/most similar signal data [based on “distance within a feature space”] to a newly “observed sensor signal” and based on the k-nearest neighbors of the sensor signal determines if the “observed sensor signal” is “an anomaly” and provides calculation of “a deviance of observation data”, in other words it would have been obvious that, as taken in combination with Shibuya as relied upon above, the system would have used a kNN algorithm, or a “similar method”, to constrain an analysis neighborhood for each signal/each signal’s subspace to a set number “k” of model condition points that are closest in “distance within [the] feature space” [and obviously, the subspace for the sensor/signal] in order to determine the “deviance” of a “observation data” [a labelled point] from the normal points, i.e. this detects if a newly received labelled point comprising signal data for an event period, e.g. 
“starting” [see Shibuya] is “an anomaly” by checking the deviation for the new data to the k nearest neighbors of that data from the “learning data” “within [the] feature space” for the signal [the feature subspace for the particular signal])
	and adapting the machine system's behavior based on the sorted list. (Maeda, as cited above, and as taken in combination with Shibuya teaches detecting an anomalous event and forming a list of signals wherein the list is sorted by which signals have the most deviation at the top – then see Maeda, ¶ 171 – the system is used for “condition-based maintenance” wherein “parts replacement is performed [the system is adjusted] in accordance with the conditions of the device” wherein this is based on the “normal and anomalous data” of the devices as “it is important [for condition-based maintenance] to detect outliers from normal data”, it would have been obvious to use the sorted list of signals and their deviations/contributions to adapt the machine system, e.g. by implementing a parts replacement to correct for anomalous data)

The motivation to combine would have been that 1) Maeda provides a computationally efficient means of anomaly detection using kNN wherein Maeda’s technique also provides “an 
In addition, both references are by the same inventors in a similar time period, i.e. it would have been obvious that these references were describing different aspects of the same overall system. 

Shibuya, as taken in combination with Maeda does not explicitly teach: 
	for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood;
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the plurality of model condition points;
	based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace...


Zhang teaches:
for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood; (Zhang, abstract, teaches a modification to a KNN “to address the within-class imbalance” in order to “bias classification towards the rare class” and then see § 5.2 which teaches that “A strategy is desired to compute the true positive propensity for these regions so as to distinguish and effectively rank the corresponding query instances. To this end, if the local positive interval for a query instance t (Eq. (2)) is higher than the global positive interval (Eq. (1)), intuitively it indicates that t has higher posterior positive probability than the positive prior based on the observed positive frequency in the training population...The positive posterior probability estimation for t in Eq. (3) therefore should be adjusted. Let λ denote the positive odds (P:N) in the query neighbourhood over the positive odds in the global population. As an example, if P:N = 1:5 in the query region [first ratio, this is the ratio in the neighborhood] and P:N=1:10  [second ratio, this is the ratio in the total population] in the global training population, then λ = 2....” wherein “λ takes into account the local versus global class imbalance levels and the positive odds ratio in the local neighbourhood versus the global population indicates the positive propensity for the query instance...”, and then see table 1 which also calculates the “Pos frequency” as a variation to P:N, the first ratio is merely the frequency “in the query region”, e.g. 1/6)
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the plurality of model condition points; (Zhang, § 5.2 as cited above, “P:N=1:10 in the global training 
	based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace... (Zhang, the λ above is the ratio of the first to the second ratio, e.g. (1/5)/(1/10) = “2”, wherein this forms the probability of the labelled point being in the neighborhood/”query region” adjusted for the class imbalance, wherein this adjusts the probability of the prediction)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from Shibuya, as modified above, on a system which uses a kNN for finding anomalous signals with the teachings from Zhang on a modification to the kNN’s probability estimation to account for “rare classes”. The motivation to combine would have been that the technique of Zhang would have biased the classification towards the rare class, e.g. an anomaly/outlying class, and as such Zhang would have improved the system’s ability to determine if a newly received set of time series from signals was anomalous [i.e., a rare class, it is abnormal], i.e. “These strategies more accurately characterise the rare-class distribution for accurate classification.” (Zhang, §8)
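
For illustration only, the arithmetic of the Zhang § 5.2 example quoted above (P:N = 1:5 in the query region, P:N = 1:10 in the global training population, λ = 2) can be reproduced with the following Python sketch. The function names are hypothetical and are not taken from Zhang; the sketch covers only the quoted odds-ratio example and does not implement the KRNN algorithm itself.

```python
from fractions import Fraction


def positive_odds(positives, negatives):
    """Positive odds P:N expressed as an exact fraction P/N."""
    return Fraction(positives, negatives)


def odds_ratio(local_pos, local_neg, global_pos, global_neg):
    """Positive odds in the query neighbourhood over positive odds in the global population."""
    return positive_odds(local_pos, local_neg) / positive_odds(global_pos, global_neg)


# Quoted example: P:N = 1:5 in the query region, P:N = 1:10 globally.
lam = odds_ratio(1, 5, 1, 10)

# Positive frequency in the same query region: 1 positive out of 1 + 5 points.
pos_frequency = Fraction(1, 1 + 5)
```

With these inputs `lam` evaluates to 2, matching the quoted example, and `pos_frequency` evaluates to 1/6.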
	

Regarding Claim 2
Zhang teaches:
The control method of claim 19, wherein calculating the contribution of each the signal to the first condition comprises:
	if the first ratio is less than or equal to the second ratio, setting the contribution of the signal to (first ratio/second ratio) - 1; (Zhang, as cited above, renders this obvious, i.e. this is “λ” -1 which is an obvious variant, and in terms of using λ as the contribution rate this also would have been obvious, this is the odds that the model condition points in the neighborhood have the same label as the labelled point, divided by the odds that the model condition points in the model have the same label as the labeled point)
	and otherwise, setting the contribution of the signal to (first ratio - second ratio)/(1 - second ratio). (This is also an obvious variant from Zhang using a simple rearrangement of λ, and additionally this limitation is contingent)
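
A minimal sketch of the piecewise contribution recited in claim 2, offered only to make the two branches concrete. The function name and the guard on the second ratio are assumptions added for illustration; they are not recited in the claim and are not taught by Zhang.

```python
def contribution(first_ratio: float, second_ratio: float) -> float:
    """Piecewise contribution of a signal, following the claim 2 wording.

    first_ratio:  share of same-label points in the analysis neighborhood.
    second_ratio: share of same-label points in the whole model.
    """
    # Assumed guard: both branches divide by second_ratio or (1 - second_ratio).
    if not 0.0 < second_ratio < 1.0:
        raise ValueError("second_ratio must lie strictly between 0 and 1")
    if first_ratio <= second_ratio:
        # Branch 1: (first ratio / second ratio) - 1
        return first_ratio / second_ratio - 1.0
    # Branch 2: (first ratio - second ratio) / (1 - second ratio)
    return (first_ratio - second_ratio) / (1.0 - second_ratio)
```

With a second ratio of 0.5, a first ratio of 0 yields -1, a first ratio equal to the second ratio yields 0, and a first ratio of 1 yields 1, so the two branches meet at zero.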

Regarding Claim 3
Maeda teaches: 
	The control method of claim 19, wherein constraining the analysis neighborhood for the labeled point to the set number of model condition points closest to the labeled point in the signal projection subspace comprises sorting the model condition points by a distance from the labeled point and limiting the analysis neighborhood to a number of the model condition points having the closest distance. (Maeda, ¶ 86-90, as cited above, this uses a kNN algorithm, or a similar algorithm, to select the k nearest points by “distance” such as for when k = 5 this is “the five highest pieces”, in other words obviously a kNN algorithm constrains the neighborhood of the k nearest neighbors to the closest k neighbors based on “distance within a feature space” [e.g., the subspace] wherein the closest/nearest are sorted by distance [e.g. highest] from the labelled point) 
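
The sorting and limiting step discussed above can be sketched as follows. This is an illustrative assumption rather than Maeda's implementation: the function name is hypothetical, and Euclidean distance stands in for whatever “distance within a feature space” a given system uses.

```python
import math
from typing import List, Sequence


def analysis_neighborhood(labeled_point: Sequence[float],
                          model_points: List[Sequence[float]],
                          k: int) -> List[Sequence[float]]:
    """Return the k model condition points closest to the labeled point."""
    # Sort every model condition point by its distance from the labeled point.
    ranked = sorted(model_points, key=lambda p: math.dist(p, labeled_point))
    # Limit the neighborhood to the set number k of closest points.
    return ranked[:k]


neighbors = analysis_neighborhood(
    (0.9, 0.9),
    [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)],
    k=2,
)
```

In this usage the two retained points are (1.0, 1.0) and (0.0, 0.0), in that order, because they lie nearest to the labeled point in the assumed two-dimensional subspace.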

Regarding Claim 4
Maeda teaches: 
	The control method of claim 3, wherein the distance comprises a distance between a projection of the labeled point and projections of the model condition points on a signal feature axis in the signal projection subspace. (Maeda, ¶ 86 – “distance within a feature space” is used, obviously this includes the distance between the labelled point and the model condition points in each signals feature space, wherein the distance is on a signal feature axis [it’s a distance in a coordinate system, obviously this includes the distance on the feature axis])

Regarding Claim 5
Maeda teaches: 
	The control method of claim 19, wherein adapting the machine system's behavior comprises remediating a faulty component. (Maeda, ¶ 171 teaches using the system for determining “parts replacement” [example of remediating a faulty component])



Regarding Claim 20.
Shibuya teaches: 

A machine system monitoring and alert apparatus, comprising: (Shibuya, abstract, teaches an anomaly detection system for a “facility”, e.g. see ¶ 2-3 – this is for facilities that include machine systems such as a “windmill”, a “nuclear reactor”, etc. – then see figure 9A which shows an example embodiment in which 4 signals are received wherein ¶ 105 teaches these signals are “sensor signal”  and see figure 1 – this is to “diagnose” an anomaly in a machine system based on signals received from sensors)
	a computer system comprising a plurality of inputs to receive time series signals from a plurality of sensors; (Shibuya, as cited above)

[Embedded image: media_image2.png, greyscale]

the computer system comprising a processor and a memory adapted with instructions forming:  (Shibuya, as cited above)
	logic to receive a labeled point comprising, for a range of time, a first time series from each of the plurality of signals, each first time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the labeled point having a first label describing a first condition of a plurality of conditions of the machine system, each of the plurality of signals having a corresponding signal projection subspace; (Shibuya, see figure 19 and ¶ 169 – 170 – this is a system which combines both the “first embodiment” and the “second embodiment” wherein sensor signal(s) are received from a facility along with an “event signal” wherein the “event signal” is used for a “mode dividing unit”, i.e. the signals from each sensor are divided into a plurality of time series due to the “mode” – see ¶ 174 “Meanwhile, the mode dividing unit 1908 performs mode dividing of dividing the time for each operating state based on the event signal 103” wherein ¶ 143 clarifies “As described above, when the operating cycle is regular, e.g., the operation starts and stops at the determined time of one day, data is extracted every fixed period, e.g., one day to compute the mean and the distribution. Although the period is not one day, the same applies thereto. When the operation starting/stopping time is known, data in a period which can be regarded as a normal operation is carried out may be extracted to compute the mean and the distribution [example of feature extraction] and this method may be applied even though the operating cycle is irregular.” and see ¶ 183 “In the learning data selection processing in the learning-data selection unit 1903, the method using the event signal is considered in addition to the same method as the example described by using FIG. 12.”, to clarify – the system receives a plurality of sensor signals over a period of time, segments the signals into time series for each mode [each label], and extracts feature vectors for each cycle/mode, i.e. the system receives from this process a labelled point for each mode comprising time series from each signal and the associated feature vectors for each time series for each mode, e.g. for the mode “start” the system uses the label of this mode to divide/segment the received sensors, and for the segment of time series received during the “start” mode the system performs a feature extraction into a feature space, i.e. “data is extracted....every period” – see figures 9A and 9B for the plurality of signals and associated features
for more clarification also see ¶ 72 teaches “The event signal 103 is a signal indicating an operation, a failure, or a warning of the facility which is output irregularly and is constituted by a character string indicating the time and the operation, the failure, or the warning.”, e.g. a “normal OFF”, a “start”, a “normal ON”, and a “stop”, and see ¶ 82 “An example of the sensor signal 102 is shown in FIG. 4. The sensor signal 102 is plural time-series signals...” and see figures 2C and figures 9A to 9B 
in regards to claim interpretation – the system of Shibuya is obviously receiving a continuous stream of data segmented into “modes”, i.e. this is for anomaly detection , it would have been obvious that a newly received “mode”, e.g. “start”, and the associated signal data/feature vectors would have been encompassed by the labelled point, i.e. the “mode” “start” obviously is a label which describes a condition of the machine system starting, wherein this “mode” comprises the time series data from the signals for that particular “mode”, and wherein the system extracts feature vectors for that “mode” 
to clarify and in regards to the feature space– see ¶ 88 “The feature amount extraction is considered using the sensor signal as it is. A window of ±1, ±2, etc., is set with respect to a predetermined time and a feature indicating a time variation of data may be extracted by a feature vector of a window width (3, 5, etc.,) x the number of sensor”, in other words a feature vector for all the signals for a “window” of time [e.g., the operating cycle such as “start”] is extracted wherein the feature vector comprises a separate feature vector for each of the “number of sensor” [e.g. see figure 9B] wherein each sensor has its own feature space [as it is obviously a feature vector that was extracted forming a feature space for each vector, also see figure 13 for an example], in other words there is a joint feature space divided into feature subspaces for each of “the number of sensors” wherein for each sensor there is a feature space for the sensor with at least one associated feature vector, also see ¶ 143 and figure 13 for an example – a “feature” space is created for each signal, e.g. the “daily mean” of the signal as one feature and the “Daily distribution” of the signal, in other words “the mean and the distribution” are computed for each mode for each signal, obviously forming a feature space for the signal wherein each feature space comprises the feature vectors for each mode, as the features, e.g. mean, are extracted for each cycle such as “starting/stopping”
	logic to projecting each of a plurality of model condition points into each signal projection subspace of the plurality of signal projection subspaces, each model condition point comprising a second time series from each of the plurality of signals, each second time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the model condition point having a second label describing one of the plurality of conditions of the machine system, the projecting comprising evaluating the at least one feature vector associated with the second time series from each of the plurality of signals in the corresponding signal projection subspace; (Shibuya, see figure 19 as cited above and as detailed above, specifically see the # 1903 for the “Learning-Data selecting unit” for the model creation – this selects the “learning data” wherein ¶ 173 teaches “The sensor signal 102 output from the facility 101 is accumulated for learning in advance. The feature amount extraction unit 1901 inputs the accumulated sensor signal 102 and performs feature amount extraction to acquire the feature vector. The feature-selection unit 1902 performs data check of the feature vector output from the feature amount extraction unit 1901 and selects a feature to be used. The learning data selecting unit 1903 performs data check of the feature vector configured by the selected feature and check of the event signal 103 and selects the learning data used to create the normal model”, in other words the system accumulates sensor(s) signal data over a period of time for learning, and uses these plurality of time series as model condition points for creating a “normal model” – to clarify see figure 5 and ¶ 83-86 – the “normal model” is created by using the accumulated signals from the same sensors “during a predetermined period” and dividing these signals for the “mode”, i.e. ¶ 91 “the normal-model creation unit 106 classifies the learning data selected in step S503 for each of the modes divided by the mode dividing unit 104 and creates the normal model for each mode in step S505.”, and then see ¶ 179 “When the normal model is created for each mode, the anomaly measurement is computed by using the normal models of all the modes and the minimum value is acquired.” and ¶ 161-163 which teaches creating “plural normal models” [model condition points] for each “cycle”/mode by “random sampling” for “several cycles” and see ¶ 178 “The feature amount extraction unit 1901 [...]”, in other words the system obtains model condition points [plural normal models] for each mode [points associated with labels for a plurality of conditions, e.g. start/stop/on/off] wherein each model condition point comprises time series from each of the sensors [random sampling over several cycles for accumulating sensor signal data] wherein each of these time series has at least one feature vector [using the “same feature...extraction”] as the labelled point and wherein these model condition points are then used for classifying (fig. 5, “classify data for each mode”) using the feature vectors – in regards to this being in the signal projection subspace see above – the system is using the same feature extraction method for both the labelled input data and the learning data, so obviously these are in the same signal projection feature spaces, e.g. figure 13 shows an example – obviously as figure 13 is an example comprising a plurality of points for “every one day” (¶ 143) this shows not just the feature space/feature vectors for the most recent data [the labelled point for the mode] but also shows the learning data that was previously accumulated, and ¶ 143 then clarifies, as cited above, that the period/cycle for this would also obviously be by mode, e.g. “when the starting/stopping time are known” [starting/stopping modes], to clarify – the system extracts features, e.g. the mean for each mode over several cycles of each mode, to form a feature space for each signal, i.e. each signal has a feature space which comprises the received data for each mode, including the labelled point [e.g., the point being the data for the most recent “start” cycle] wherein the system then projects the “learning data” into the same feature space for each signal as the system stores the “learning data” and extracts the feature vectors using the “same” algorithm as for the labelled point, e.g. see figure 13, also see ¶ 88 as cited above and also see figure 6 and ¶ 93 which provides an example of a “3D feature space” and clarifies that “the dimension of the feature space may be...higher”, i.e. the feature space for each signal, in figure 6 the “evaluation data” is the projected feature vectors from the “the number of the learning data” (see ¶ 93), and in regards to evaluating the at least one feature vector in the subspace see ¶ 93-99 which provides an example of evaluating the feature vectors using a “local sub-space classifier” which creates a subspace in the feature space, e.g., for a 3D space it creates a 2D plane to evaluate the “distance between the evaluation data [learning data] and the point b [the labelled point]” for the normal model, e.g. this evaluation is using the “mean...and covariance matrix...of the learning data” for the feature space, also see ¶ 99 for other methods of creating “the normal model” in the feature space such as “a nearest method” or a “similarity base model”)
	logic to project the labeled point into each signal projection subspace, comprising evaluating the at least one feature vector associated with the first time series from each of the plurality of signals in the corresponding signal projection subspace; (Shibuya, as cited above, e.g. see figures 5 and 13, also see figure 17 and ¶ 158 – the “features” from both the “learning data” and the newly received labelled point [i.e. a plurality of time series labelled with a model from the facility sensors] are projected into the same feature subspace for each signal wherein ¶ 88 as cited above provides an example of feature extraction for forming a feature vector for each signal, wherein the feature vectors for each signal are then turned into “a feature vector of a window width (3, 5, etc.,) x the number of sensors.”, i.e. there is a feature vector for the plurality of signals, and within that feature vector there is a feature vector for the “number of sensors”, e.g. the mean (fig 13) of each signal wherein ¶ 93-99 teaches then evaluating the [...], in other words the system evaluates the feature vectors of both the model condition points/learning data and the labelled point, e.g. an “anomaly”, to find a distance between the feature vectors to determine the “deviation” of each signal)
logic to calculate a contribution of each of the plurality of signals to the first condition ... to form a sorted list of signals and contributions; (Shibuya, ¶ 117 “Further, since it is considered that a signal having a large deviation [contribution to the first condition/label] when the anomaly occurs contributes to the anomaly judgment, when the signals are displayed in the order of the large deviation from the top, it is easily verified which sensor signal has the anomaly. In addition, when a past case of the cause event is displayed in the same manner as the presented result event, it is easy to accept the same phenomenon to trust the advance notice of the result event.”, wherein this is used to detect an anomaly occurring in the received “event” (¶ 120) which, as per at least figure 19 as cited above, is the mode label from the event signal – in other words the system determines for each newly received mode/“cause [...], to clarify this obviously forms a sorted list of signals based on the contribution [e.g., the “deviation”] of the signal to the event label/mode of the received data, wherein the deviation is a measure of how much the present labelled point/associated time series/associated feature vectors vary/deviate from the normal case, i.e. this shows how much each signal contributes to the first condition/label)
	and logic to display signals that are higher in the sorted list of signals as the most likely contributors for the machine system condition corresponding to the labeled point. (Shibuya, as cited above, e.g. ¶ 117 – the signals are displayed in a sorted list in order of each signal’s contribution to the labeled point)
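
As a hedged illustration of the sorted list discussed above, the following sketch orders signals from largest to smallest contribution. The function name, the signal names, and the numeric values are hypothetical and are not taken from Shibuya; the sketch assumes one contribution value per signal has already been computed.

```python
from typing import Dict, List, Tuple


def rank_signals(contributions: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort (signal, contribution) pairs with the largest contribution first."""
    return sorted(contributions.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-signal contribution values.
ranked = rank_signals({"temperature": 0.2, "pressure": 0.9, "rpm": -0.5})
top_signal = ranked[0][0]  # the signal displayed at the top of the list
```

Signals near the top of such a list would be the ones displayed as the most likely contributors.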


Shibuya does not explicitly teach:
	logic to constrain an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace; 
logic to calculate, for each signal projection subspace, a first percent of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood out of the set number of model condition points in the analysis neighborhood;
logic to calculate, for each signal projection subspace, a second percent of model condition points in the model having a same label as the labeled point out of the plurality of model condition points;
...from the first percent and the second percent calculated for the corresponding signal projection subspace...


Maeda teaches: 
	logic to constrain an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace; (As an initial matter, the inventors for both Shibuya as relied upon above and Maeda are the same, and then see Maeda the abstract – this is for identifying an abnormal “sensor signal” and see figures 3-4, Maeda is for a related invention to the Shibuya invention, then see figure 11 and the description starting in ¶ 86 – Maeda is using a method similar to a “k-NN method”, i.e. the “k” [a set number of model condition points] which are the “nearest neighbors” wherein kNN uses a “distance within a feature space” wherein ¶ 87 then teaches that this is to find the “k pieces of data with highest similarities to the unit for deciding normal range” and then see ¶ 88 and figure 13 – the system is finding “a number k” of the nearest/most similar signal data [based on “distance within a feature space”] to a newly “observed sensor signal” and based on the k-nearest neighbors of the sensor signal determines if the “observed sensor signal” is “an anomaly” and provides calculation of “a deviance of [...], in other words it would have been obvious that, as taken in combination with Shibuya as relied upon above, the system would have used a kNN algorithm, or a “similar method”, to constrain an analysis neighborhood for each signal/each signal’s subspace to a set number “k” of model condition points that are closest in “distance within [the] feature space” [and obviously, the subspace for the sensor/signal] in order to determine the “deviance” of a “observation data” [a labelled point] from the normal points, i.e. this detects if a newly received labelled point comprising signal data for an event period, e.g. “starting” [see Shibuya] is “an anomaly” by checking the deviation for the new data to the k nearest neighbors of that data from the “learning data” “within [the] feature space” for the signal [the feature subspace for the particular signal])
The motivation to combine would have been that 1) Maeda provides a computationally efficient means of anomaly detection using kNN wherein Maeda’s technique also provides “an anomaly explanation message” (Maeda, ¶ 86) in addition to the “deviance”, 2) Shibuya, ¶ 99 suggests using a “nearest method” technique [e.g., kNN], and 3) for “condition-based maintenance..., in many cases, anomalous data is rarely collected and the bigger the facility, the more difficult it is to collect anomalous data. Therefore, it is important to detect outliers from normal data.” (Maeda, ¶ 171)
	In addition, both references are by the same inventors and from a similar time period, i.e. it would have been obvious that these references were describing different aspects of the same overall system. 

Shibuya, as taken in combination with Maeda does not explicitly teach: 
logic to calculate, for each signal projection subspace, a first percent of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood out of the set number of model condition points in the analysis neighborhood;
	logic to calculate, for each signal projection subspace, a second percent of model condition points in the model having a same label as the labeled point out of the plurality of model condition points;
...from the first percent and the second percent calculated for the corresponding signal projection subspace...

Zhang teaches:
logic to calculate, for each signal projection subspace, a first percent of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood out of the set number of model condition points in the analysis neighborhood; (Zhang, abstract, teaches a modification to a KNN “to address the within-class imbalance” in order to “bias classification towards the rare class” and then see § 5.2 which teaches that “A strategy is desired to compute the true positive propensity for these regions so as to distinguish and effectively rank the corresponding query instances. To this end, if the local positive interval for a query instance t (Eq. (2)) is higher than the global positive interval (Eq. (1)), intuitively it indicates that t has higher posterior positive probability than the positive prior based on the observed positive frequency in the training population...The positive posterior probability estimation for t in Eq. (3) therefore should be adjusted. Let λ denote the positive odds (P:N) in the query neighbourhood over the positive odds in the global population. As an example, if P:N = 1:5 in the query region [first ratio, this is the ratio in the neighborhood] and P:N=1:10  [second ratio, this is the ratio in the total population] in the global training population, then λ = 2....” wherein “λ takes into account the local versus global class imbalance levels and the positive odds ratio in the local neighbourhood versus the global population indicates the positive propensity for the query instance...”, and then see table 1 which also calculates the “Pos frequency” as a variation to P:N, the first ratio is merely the frequency “in the query region”, e.g. 1/6)
	logic to calculate, for each signal projection subspace, a second percent of model condition points in the model having a same label as the labeled point out of the plurality of model condition points;(Zhang, § 5.2 as cited above, “P:N=1:10 in the global training population” and then see table 1 which also calculates the “Pos frequency” as a variation to P:N, the second ratio is merely the frequency “in the global population region”, e.g. 1/11)
...from the first percent and the second percent calculated for the corresponding signal projection subspace... (Zhang, the λ above is the ratio of the first ratio to the second ratio, e.g. (1/5)/(1/10) = “2”, wherein this forms the probability of the labelled point being in the neighborhood/“query region” adjusted for the class imbalance, and wherein this adjusts the probability of the prediction)

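It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from Shibuya, as modified above, on a system which uses a kNN for finding anomalous signals with the teachings from Zhang on a modification to the kNN’s probability estimation to account for “rare classes”. The motivation to combine would have been that the technique of Zhang would have biased the classification towards the rare class, e.g. an anomaly/outlying class, and as such Zhang would have improved the system’s ability to determine if a newly received set of time series from signals was anomalous [i.e., a rare class, as it is abnormal], i.e. “These strategies more accurately characterise the rare-class distribution for accurate classification.” (Zhang, § 8)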

Ser. No. 15/906,702 filed 2/27/2018 Gregory Olsen, et al. - GAU 2128 (Hopkins) Docket No. 60363-0027

Regarding Claim 8
Zhang teaches:
	The apparatus of claim 20, wherein calculating the contribution of each the signal to the first condition comprises:
	if the first ratio is less than or equal to the second ratio, setting the contribution of the signal to (first ratio/second ratio) - 1; (Zhang, as cited above, renders this obvious, i.e. this is “λ” -1 which is an obvious variant, and in terms of using λ as the contribution rate this also would have been obvious, this is the odds that the model condition points in the neighborhood have the same label as the labelled point, divided by the odds that the model condition points in the model have the same label as the labeled point)
	and otherwise, setting the contribution of the signal to (first ratio - second ratio)/(1 - second ratio). (This is also an obvious variant from Zhang using a simple rearrangement of λ, and this additionally this is contingent)

Regarding Claim 9
Maeda teaches: 
	The apparatus of claim 20, wherein constraining the analysis neighborhood for the labeled point to the set number of model condition points closest to the labeled point in the signal projection subspace comprises sorting the model condition points by a distance from the labeled point and limiting the analysis neighborhood to a number of the model condition points having the closest distance.  (Maeda, ¶ 86-90, as cited above, this uses a kNN algorithm, or a similar algorithm, to select the k nearest points by “distance” such as for when k = 5 this is “the five highest pieces”, in other words obviously a kNN algorithm constrains the neighborhood of the k nearest neighbors to the closest k neighbors based on “distance within a feature space” [e.g., the subspace] wherein the closest/nearest are sorted by distance [e.g. highest] from the labelled point) 

Regarding Claim 10.
Maeda teaches: 
	The apparatus of claim 9, wherein the distance comprises a distance between a projection of the labeled point and projections of the model condition points on a signal feature axis in the signal projection subspace. (Maeda, ¶ 86 – “distance within a feature space” is used, obviously this includes the distance between the labelled point and the model condition points in each signals feature space, wherein the distance is on a signal feature axis [it’s a distance in a coordinate system, obviously this includes the distance on the feature axis])

Regarding Claim 11.
Maeda, as taken in combination with Shibuya, teaches: 
	The apparatus of claim 20, wherein the first condition comprises a faulty component.  (Maeda, ¶ 171 teaches using the system for determining “parts replacement” [example of remediating a faulty component] and Maeda, as well as Shibuya, teaches that the system is to detect anomalies/abnormal events, it would have been obvious that an abnormal event would have comprised a faulty component, e.g. Shibuya ¶ 12 wherein the anomaly includes “or a repairing operation such as component replacement”)

Regarding Claim 21.
Shibuya teaches: 
	A non-transitory machine-readable storage medium storing instructions that when executed by a processor, cause the processor to execute a control method for a machine system comprising sensors generating a plurality of signals over time, the method comprising: (Shibuya, abstract, teaches an anomaly detection system for a “facility”, e.g. see ¶ 2-3 – this is for facilities that include machine systems such as a “windmill”, a “nuclear reactor”, etc. – then see figure 9A which shows an example embodiment in which 4 signals are received wherein ¶ 105 teaches these signals are “sensor signal”  and see figure 1 – this is to “diagnose” an anomaly in a machine system based on signals received from sensors)
	receiving a labeled point comprising, for a range of time, a first time series from each of the plurality of signals, each first time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the labeled point having a first label describing a first condition of a plurality of conditions of the machine system, each of the plurality of signals having a corresponding signal projection subspace; (Shibuya, see figure 19 and ¶ 169 – 170 – this is a system which combines both the “first embodiment” and the “second embodiment” wherein sensor signal(s) are received from a facility along with an “event signal” wherein the “event signal” is used for a “mode dividing unit”, i.e. the signals from each sensor are divided into a plurality of time series due to the “mode” – see ¶ 174 “Meanwhile, the mode dividing unit 1908 performs mode dividing of dividing the time for each operating state based on the event signal 103” wherein ¶ 143 clarifies “As described above, when the operating cycle is regular, e.g., the operation starts and stops at the determined time of one day, data is extracted every fixed period, e.g., one day to compute the mean and the distribution. Although the period is not one day, the same applies thereto. When the operation starting/stopping time is known, data in a period which can be regarded as a normal operation is carried out may be extracted to compute the mean and the distribution [example of feature extraction] and this method may be applied even though the operating cycle is irregular.” and see ¶ 183 “In the learning data selection processing in the learning-data selection unit 1903, the method using the event signal is considered in addition to the same method as the example described by using FIG. 12.”, to clarify – the system receives a plurality of sensor signals over a period of time, segments the signals into time series for each mode [each label], and extracts feature vectors for each cycle/mode, i.e. the system receives from this process a labelled point for each mode comprising time series from each signal and the associated feature vectors for each time series for each mode, e.g. for the mode “start” the system uses the label of this mode to divide/segment the received sensors, and for the segment of time series received during the “start” mode the system performs a feature extraction into a feature space, i.e. “data is extracted....every period” – see figures 9A and 9B for the plurality of signals and associated features
for more clarification also see ¶ 72 teaches “The event signal 103 is a signal indicating an operation, a failure, or a warning of the facility which is output irregularly and is constituted by a character string indicating the time and the operation, the failure, or the warning.”, e.g. a “normal OFF”, a “start”, a “normal ON”, and a “stop”, and see ¶ 82 “An example of the sensor signal 102 is shown in FIG. 4. The sensor signal 102 is plural time-series signals...” and see figures 2C and figures 9A to 9B 
in regards to claim interpretation – the system of Shibuya is obviously receiving a continuous stream of data segmented into “modes”, i.e. this is for anomaly detection , it would have been obvious that a newly received “mode”, e.g. “start”, and the associated signal data/feature vectors would have been encompassed by the labelled point, i.e. the “mode” “start” obviously is a label which describes a condition of the machine system starting, wherein this “mode” comprises the time series data from the signals for that particular “mode”, and wherein the system extracts feature vectors for that “mode” 
to clarify and in regards to the feature space – see ¶ 88 “The feature amount extraction is considered using the sensor signal as it is. A window of ±1, ±2, etc., is set with respect to a predetermined time and a feature indicating a time variation of data may be extracted by a feature vector of a window width (3, 5, etc.,) x the number of sensor”, in other words there is a joint feature space divided into feature subspaces for each of “the number of sensors” wherein for each sensor there is a feature space for the sensor with at least one associated feature vector, also see ¶ 143 and figure 13 for an example – a “feature” space is created for each signal, e.g. the “daily mean” of the signal as one feature and the “Daily distribution” of the signal as another, in other words “the mean and the distribution” are computed for each mode for each signal, obviously forming a feature space for the signal wherein each feature space comprises the feature vectors for each mode, as the features, e.g. mean, are extracted for each cycle such as “starting/stopping”)
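For illustration only, the windowed feature extraction quoted from Shibuya ¶ 88 can be sketched as follows. This is a minimal sketch, not code from the reference: the function name `window_features`, the row-per-time-step data layout, and the error handling at the edges of the record are all assumptions made for the example. It shows how a window of ±1 around a chosen time yields a feature vector whose length is the window width (3) times the number of sensors.

```python
def window_features(signals, t, half_width):
    """Flatten a window of +/- half_width samples around time index t.

    signals: list of rows, one row per time step, one value per sensor
             (hypothetical layout chosen for this sketch).
    Returns a list of length (2 * half_width + 1) * number_of_sensors.
    """
    lo, hi = t - half_width, t + half_width + 1
    if lo < 0 or hi > len(signals):
        # Assumption: windows that run off the record are rejected.
        raise ValueError("window extends past the recorded samples")
    return [value for row in signals[lo:hi] for value in row]


# Two sensors sampled at ten time steps: row i holds [2*i, 2*i + 1].
signals = [[2 * i, 2 * i + 1] for i in range(10)]

# Window of +/-1 around t = 4 covers rows 3, 4 and 5.
vector = window_features(signals, t=4, half_width=1)
print(vector)
```

With two sensors and a window width of 3, the resulting vector has six components, which matches the "window width x the number of sensors" sizing described in the quoted passage.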
	projecting each of a plurality of model condition points into each signal projection subspace of the plurality of signal projection subspaces, each model condition point comprising a second time series from each of the plurality of signals, each second time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the model condition point having a second label describing one of the plurality of conditions of the machine system, the projecting comprising evaluating the at least one feature vector associated with the second time series from each of the plurality of signals in the corresponding signal projection subspace; (Shibuya, see figure 19 as cited above and as detailed above, specifically see the # 1903 for the “Learning-Data selecting unit” for the model creation – this selects the “learning data” wherein “The sensor signal 102 output from the facility 101 is accumulated for learning in advance. The feature amount extraction unit 1901 inputs the accumulated sensor signal 102 and performs feature amount extraction to acquire the feature vector. The feature-selection unit 1902 performs data check of the feature vector output from the feature amount extraction unit 1901 and selects a feature to be used. The learning data selecting unit 1903 performs data check of the feature vector configured by the selected feature and check of the event signal 103 and selects the learning data used to create the normal model”, in other words the system accumulates sensor(s) signal data over a period of time for learning, and uses these plurality of time series as model condition points for creating a “normal model” – to clarify see figure 5 and ¶ 83-86 – the “normal model” is created by using the accumulated signals from the same sensors “during a predetermined period” and dividing these signals for the “mode”, i.e. 
¶ 91 “the normal-model creation unit 106 classifies the learning data selected in step S503 for each of the modes divided by the mode dividing unit 104 and creates the normal model for each mode in step S505.”, and then see ¶ 179 “When the normal model is created for each mode, the anomaly measurement is computed by using the normal models of all the modes and the minimum value is acquired.” and ¶ 161-163 which teaches creating “plural normal models” [model condition points] for each “cycle”/mode by “random sampling” for “several cycles” and see ¶ 178 “The feature amount extraction unit 1901 inputs the sensor signal 102 and performs the same feature amount extraction as that at the learning time to acquire the feature vector”, in other words the system obtains model condition points [plural normal models] for each mode [points associated with labels for a plurality of conditions, e.g. start/stop/on/off] wherein each model condition point comprises time series from each of the sensors [random sampling over several cycles for accumulating sensor signal data] wherein each of these time series has at least one feature vector [using the “same feature...extraction”] as the labelled point and wherein these model condition points are then used for classifying (fig. 5, “classify data for each mode”) using the feature vectors – in regards to this being in the signal projection subspace see above – the system is using the same feature extraction method for both the labelled input data and the learning data, so obviously these are in the same signal projection feature spaces, e.g.
figure 13 shows an example – obviously as figure 13 is an example comprising a plurality of points for “every one day” (¶ 143) this shows not just the feature space/feature vectors for the most recent data [the labelled point for the mode] but also shows the learning data that was previously accumulated, and ¶ 143 then clarifies, as cited above, that the period/cycle for this would also obviously be by mode, e.g. “when the starting/stopping time are known” [starting/stopping modes], to clarify – the system extracts features, e.g. the mean for each mode over several cycles of each mode, to form a feature space for each signal, i.e. each signal has a feature space which comprises the received data for each mode, including the labelled point [e.g., the point being the data for the most recent “start” cycle] wherein the system then projects the “learning data” into the same feature space for each signal as the system stores the “learning data” and extracts the feature vectors using the “same” algorithm as for the labelled point, e.g. see figure 13, also see ¶ 88 as cited above, and also see figure 6 and ¶ 93 which provides an example of a “3D feature space” and clarifies that “the dimension of the feature space may be...higher”, i.e. the feature space for each signal, in figure 6 the “evaluation data” is the projected feature vectors from the “the number of the learning data” (see ¶ 93), and in regards to evaluating the at least one feature vector in the subspace see ¶ 93-
	projecting the labeled point into each signal projection subspace, comprising evaluating the at least one feature vector associated with the first time series from each of the plurality of signals in the corresponding signal projection subspace; (Shibuya, as cited above, e.g. see figures 5 and 13, also see figure 17 and ¶ 158 – the “features” from both the “learning data” and the newly received labelled point [i.e. a plurality of time series labelled with a mode from the facility sensors] are projected into the same feature subspace for each signal wherein ¶ 88 as cited above provides an example of feature extraction for forming a feature vector for each signal, wherein these feature vectors for each signal are then turned into “a feature vector of a window width (3, 5, etc.,) x the number of sensors.”, i.e. there is a feature vector for the plurality of signals, and within that feature vector there is a feature vector for the “number of sensors”, e.g. the mean (fig 13) of each signal wherein ¶ 93-¶ 99 teaches then evaluating the feature vectors for both the model condition points [the learning data/evaluation data] and the labelled point in the feature space for “anomaly measurement”, e.g. evaluating the “distance between the evaluation data [the model condition points] and the point b [the labelled point]”, e.g. ¶ 116 “The distance between the feature vector at the time of anomaly judgment and each...”, in other words the system evaluates the feature vectors of both the model condition points/learning data and the labelled point, e.g. an “anomaly”, to find a distance between the feature vectors to determine the “deviation” of each signal)
[...]
calculating a contribution of each of the plurality of signals to the first condition..., to form a sorted list of signals and contributions; (Shibuya, ¶ 117 “Further, since it is considered that a signal having a large deviation [contribution to the first condition/label] when the anomaly occurs contributes to the anomaly judgment, when the signals are displayed in the order of the large deviation from the top, it is easily verified which sensor signal has the anomaly. In addition, when a past case of the cause event is displayed in the same manner as the presented result event, it is easy to accept the same phenomenon to trust the advance notice of the result event.”, wherein this is used to detect an anomaly occurring in the received “event” (¶ 120), which, as per at least figure 19 as cited above, is the mode label from the event signal – in other words the system determines for each newly received mode/“cause event” [including the first label] the deviation of each signal compared to the “normal” signal/model(s) and sorts the signals into an ordered list by the “order of large deviation from the top” to determine “which sensor signal has the anomaly” [which sensor signal contributes]; to clarify, this obviously forms a sorted list of signals based on the contribution [e.g., the “deviation”] of the signal to the event label/mode of the received data, wherein the deviation is a measure of how much the present labelled point/associated time series/associated feature vectors vary/deviate from the normal case, i.e. this shows how much each signal contributes to the first condition/label)
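For illustration only, the ordering described in Shibuya ¶ 117 (signals displayed "in the order of the large deviation from the top") can be sketched as below. The function name `rank_signals_by_deviation`, the sensor names, and the deviation values are hypothetical and chosen solely for the example; they are not taken from the reference or the claims.

```python
def rank_signals_by_deviation(deviations):
    """Return (signal_name, deviation) pairs, largest deviation first.

    deviations: dict mapping a signal name to its deviation from the
                normal model (hypothetical inputs for this sketch).
    """
    return sorted(deviations.items(), key=lambda item: item[1], reverse=True)


# Made-up deviations for three sensors.
deviations = {"temperature": 0.4, "pressure": 2.1, "vibration": 1.3}

ranked = rank_signals_by_deviation(deviations)
for name, deviation in ranked:
    print(name, deviation)
```

The signal at the top of the resulting list is the one with the largest deviation, which corresponds to the signal the passage identifies as most readily verified to have the anomaly.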


Shibuya does not explicitly teach:
	constraining an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace;
	for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood;
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the set number of model condition points in the analysis neighborhood;
	...based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace...
	and adapt the physical system's behavior based on the sorted list.


constraining an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace; (As an initial matter, the inventors for both Shibuya as relied upon above and Maeda are the same, and then see Maeda the abstract – this is for identifying an abnormal “sensor signal” and see figures 3-4, Maeda is for a related invention to the Shibuya invention, then see figure 11 and the description starting in ¶ 86 – Maeda is using a method similar to a “k-NN method”, i.e. the “k” [a set number of model condition points] which are the “nearest neighbors” wherein kNN uses a “distance within a feature space” wherein ¶ 87 then teaches that this is to find the “k pieces of data with highest similarities to the unit for deciding normal range” and then see ¶ 88 and figure 13 – the system is finding “a number k” of the nearest/most similar signal data [based on “distance within a feature space”] to a newly “observed sensor signal” and based on the k-nearest neighbors of the sensor signal determines if the “observed sensor signal” is “an anomaly” and provides calculation of “a deviance of observation data”, in other words it would have been obvious that, as taken in combination with Shibuya as relied upon above, the system would have used a kNN algorithm, or a “similar method”, to constrain an analysis neighborhood for each signal/each signal’s subspace to a set number “k” of model condition points that are closest in “distance within [the] feature space” [and obviously, the subspace for the sensor/signal] in order to determine the “deviance” of a “observation data” [a labelled point] from the normal points, i.e. this detects if a newly received labelled point comprising signal data for an event period, e.g. “starting” [see Shibuya] is “an anomaly” by checking the deviation for the new data to the k nearest neighbors of that data from the 
	and adapting the machine system's behavior based on the sorted list. (Maeda, as cited above, and as taken in combination with Shibuya teaches detecting an anomalous event and forming a list of signals wherein the list is sorted by which signals have the most deviation at the top – then see Maeda, ¶ 171 – the system is used for “condition-based maintenance” wherein “parts replacement is performed [the system is adjusted] in accordance with the conditions of the device” wherein this is based on the “normal and anomalous data” of the devices as “it is important [for condition-based maintenance] to detect outliers from normal data”, it would have been obvious to use the sorted list of signals and their deviations/contributions to adapt the machine system, e.g. by implementing a parts replacement to correct for anomalous data)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from the system of Shibuya for “detecting advance signs of anomalies” such as by determining the deviation of each sensor signal from the normal [from the learning data] with the teachings from Maeda on using learning data about normal cases for anomaly detection using an algorithm similar to K-NN and applying the system for condition based maintenance. The motivation to combine would have been that 1) Maeda provides a computationally efficient means of anomaly detection using kNN wherein Maeda’s technique also provides “an anomaly explanation message” (Maeda, ¶ 86) in addition to the “deviance” and 2) Shibuya, ¶ 99 suggests using a “nearest method” technique [e.g., kNN], and 3) for “condition-based maintenance..., in many cases, anomalous data is rarely collected 
	In addition, both references are by the same inventors in a similar time period, i.e. it would have been obvious that these references were describing different aspects of the same overall system. 

Shibuya, as taken in combination with Maeda does not explicitly teach: 
for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood;
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the set number of model condition points in the analysis neighborhood;
	...based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace...

Zhang teaches:
	for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood; (Zhang, abstract, teaches a modification to a KNN “to address the within-class imbalance” in order to “bias classification towards the rare class” and then see § 5.2 which teaches that “A strategy is desired to compute the true positive propensity for these regions so as to distinguish and effectively rank the corresponding query instances. To this end, if the local positive interval for a query instance t (Eq. (2)) is higher than the global positive interval (Eq. (1)), intuitively it indicates that t has higher posterior positive probability than the positive prior based on the observed positive frequency in the training population...The positive posterior probability estimation for t in Eq. (3) therefore should be adjusted. Let λ denote the positive odds (P:N) in the query neighbourhood over the positive odds in the global population. As an example, if P:N = 1:5 in the query region [first ratio, this is the ratio in the neighborhood] and P:N=1:10  [second ratio, this is the ratio in the total population] in the global training population, then λ = 2....” wherein “λ takes into account the local versus global class imbalance levels and the positive odds ratio in the local neighbourhood versus the global population indicates the positive propensity for the query instance...”, and then see table 1 which also calculates the “Pos frequency” as a variation to P:N, the first ratio is merely the frequency “in the query region”, e.g. 1/6)
for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the set number of model condition points in the analysis neighborhood; (Zhang, § 5.2 as cited above, “P:N=1:10 in the global training population” and then see table 1 which also calculates the “Pos frequency” as a variation to P:N, the second ratio as claimed is merely an obvious variant of Zhang, i.e. there is 1 point in Zhang’s teaching with the same label in the dataset, with 11 points 
	based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace... (Zhang, the λ above is the ratio of the first to the second ratio, e.g. (1/5)/(1/10) = “2”, wherein this forms the probability of the labelled point being in the neighborhood/“query region” adjusted for the class imbalance, wherein this adjusts the probability of the prediction; this claim is merely an obvious variant of Zhang with a simple rearrangement of the odds being determined and used)
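For illustration only, the arithmetic in the Zhang § 5.2 example quoted above can be checked with the short sketch below. The function names `positive_odds`, `odds_ratio`, and `positive_frequency` are hypothetical labels chosen for this sketch and do not appear in Zhang. The sketch computes λ as the positive odds in the query region divided by the positive odds in the global population, and also shows the corresponding positive frequencies (positives over total points) that the mapping above treats as a variation of P:N.

```python
from fractions import Fraction


def positive_odds(positives, negatives):
    """Odds P:N expressed as the exact fraction P/N."""
    return Fraction(positives, negatives)


def odds_ratio(local_pos, local_neg, global_pos, global_neg):
    """Local positive odds divided by global positive odds (lambda)."""
    return positive_odds(local_pos, local_neg) / positive_odds(global_pos, global_neg)


def positive_frequency(positives, negatives):
    """Positives as a fraction of all points, P / (P + N)."""
    return Fraction(positives, positives + negatives)


# Quoted example: P:N = 1:5 in the query region, 1:10 globally.
lam = odds_ratio(1, 5, 1, 10)
local_freq = positive_frequency(1, 5)
global_freq = positive_frequency(1, 10)

print(lam)          # 2
print(local_freq)   # 1/6
print(global_freq)  # 1/11
```

Exact fractions are used here so that the example reproduces λ = 2 without floating-point rounding; that choice is a convenience of the sketch rather than anything stated in the reference.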

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from Shibuya, as modified above, on a system which uses a kNN for finding anomalous signals with the teachings from Zhang on a modification to the kNN’s probability estimation to account for “rare classes”. The motivation to combine would have been that the technique of Zhang would have biased the classification towards the rare class, e.g. an anomaly/outlying class, and as such Zhang would have improved the system’s ability to determine if a newly received set of time series from signals was anomalous [i.e., a rare class, it is abnormal], i.e. “These strategies more accurately characterise the rare-class distribution for accurate classification.” (Zhang, §8)
	
Regarding Claim 14.

	The machine-readable storage medium of claim 21, wherein calculating the contribution of each the signal to the first condition comprises:
	if the first ratio is less than or equal to the second ratio, setting the contribution of the signal to (first ratio/second ratio) - 1;(Zhang, as cited above, renders this obvious, i.e. this is “λ” -1 which is an obvious variant, and in terms of using λ as the contribution rate this also would have been obvious, this is the odds that the model condition points in the neighborhood have the same label as the labelled point, divided by the odds that the model condition points in the model have the same label as the labeled point)
	and otherwise, setting the contribution of the signal to (first ratio - second ratio)/(1 - second ratio).  (This is also an obvious variant from Zhang using a simple rearrangement of λ, and, additionally, this limitation is contingent)
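For illustration only, the two-branch contribution calculation recited in claim 14 can be sketched as follows. The function name `signal_contribution` is hypothetical, and the input validation (requiring a non-zero second ratio and ratios no greater than 1) is an assumption added so the sketch cannot divide by zero; neither is recited in the claim.

```python
from fractions import Fraction


def signal_contribution(first_ratio, second_ratio):
    """Contribution of one signal per the two branches recited in claim 14.

    first_ratio:  same-label fraction in the analysis neighborhood.
    second_ratio: same-label fraction used as the baseline.
    """
    first = Fraction(first_ratio)
    second = Fraction(second_ratio)
    # Assumption of this sketch: ratios are fractions in [0, 1] and the
    # baseline is non-zero, so neither branch can divide by zero.
    if not (0 <= first <= 1 and 0 < second <= 1):
        raise ValueError("expected 0 <= first_ratio <= 1 and 0 < second_ratio <= 1")
    if first <= second:
        return first / second - 1
    return (first - second) / (1 - second)


print(signal_contribution(Fraction(3, 5), Fraction(1, 5)))   # 1/2
print(signal_contribution(Fraction(1, 10), Fraction(1, 5)))  # -1/2
print(signal_contribution(Fraction(1, 5), Fraction(1, 5)))   # 0
```

Under these assumptions the first branch yields a value between -1 and 0 and the second branch a value between 0 and 1, so a signal whose neighborhood is richer in same-label points than the baseline receives a positive contribution.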

Regarding Claim 15.
Maeda teaches: 
	The machine-readable storage medium of claim 21, wherein constraining the analysis neighborhood for the labeled point to the set number of model condition points closest to the labeled point in the signal projection subspace comprises sorting the model condition points by a distance from the labeled point and limiting the analysis neighborhood to a number of the condition points having the closest distance. (Maeda, ¶ 86-90, as cited above, this uses a kNN algorithm, or a similar algorithm, to select the k nearest points by “distance” such as for when k = 5 this is “the five highest pieces”, in other words obviously a kNN algorithm constrains the neighborhood of the k nearest neighbors to the closest k neighbors based on “distance within a feature space” [e.g., the subspace] wherein the closest/nearest are sorted by distance [e.g. highest] from the labelled point) 
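For illustration only, the sort-then-limit step discussed for claim 15 can be sketched as below for a single signal feature axis. The function name `analysis_neighborhood`, the (value, label) tuple layout, and the sample values are hypothetical, and the use of absolute difference on one axis as the distance is an assumption of the sketch rather than the distance measure of Maeda.

```python
def analysis_neighborhood(labeled_value, model_points, k):
    """Return the k model condition points closest to the labeled point.

    labeled_value: projection of the labeled point on one feature axis.
    model_points:  list of (projected_value, label) tuples (hypothetical).
    k:             the set number of points to keep.
    """
    ranked = sorted(model_points, key=lambda point: abs(point[0] - labeled_value))
    return ranked[:k]


# Made-up projections of five model condition points and their mode labels.
model_points = [
    (9, "start"),
    (50, "stop"),
    (13, "start"),
    (30, "normal ON"),
    (12, "stop"),
]

neighborhood = analysis_neighborhood(10, model_points, k=3)
print(neighborhood)

# Count how many neighbors share the labeled point's label ("start" here).
same_label = sum(1 for _, label in neighborhood if label == "start")
print(same_label, "of", len(neighborhood))
```

In this example the three closest points lie at distances 1, 2 and 3 from the labeled point, and two of the three carry the same label, which is the kind of count from which a neighborhood ratio could then be formed.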

Regarding Claim 16.
Maeda teaches: 
	The machine-readable storage medium of claim 15, wherein the distance comprises a distance between a projection of the labeled point and projections of the model condition points on a signal feature axis in the signal projection subspace. (Maeda, ¶ 86 – “distance within a feature space” is used, obviously this includes the distance between the labelled point and the model condition points in each signal's feature space, wherein the distance is on a signal feature axis [it’s a distance in a coordinate system, obviously this includes the distance on the feature axis])

Regarding Claim 17.
Maeda teaches: 
	The machine-readable storage medium of claim 21, wherein adapting the physical system's behavior comprises remediating a faulty component. (Maeda, ¶ 171 teaches using the system for determining “parts replacement” [example of remediating a faulty component])

Claims 6, 12, 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shibuya et al., US 2012/0290879 in view of Maeda et al., US 2012/0041575 and in further view of Zhang et al., “KRNN: k Rare-class Nearest Neighbour classification”, 2016 in further view of Skand, “kNN(k-Nearest Neighbour) Algorithm in R”, 2017

Regarding Claim 6
Shibuya, as modified above, does not explicitly teach: 
	The control method of claim 19, wherein the set number of the model condition points comprises a square root of a total number of model condition points.  

Skand teaches: 
	The control method of claim 19, wherein the set number of the model condition points comprises a square root of a total number of model condition points.  (Skand, section “Requirements for kNN”, #1 teaches “Generally k gets decided on the square root of number of data points”)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from Shibuya, as modified above, on a system which uses a kNN algorithm with the teachings from Skand on using the “square root” of the number of data points [model condition points]. The motivation to combine would have been that this value provides a k-value which reduces the “variance” while avoiding a “bias” (Skand, # 1, as cited above)
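For illustration only, the rule of thumb quoted from Skand (k decided on the square root of the number of data points) can be sketched as follows. The function name `neighborhood_size` is hypothetical, and rounding to the nearest integer with a floor of 1 is an assumption of this sketch, since the quoted passage does not say how a non-integer square root is handled.

```python
import math


def neighborhood_size(total_points):
    """Pick k as the square root of the number of model condition points.

    Rounding to the nearest integer and enforcing k >= 1 are assumptions
    of this sketch, not statements from the cited reference.
    """
    if total_points < 1:
        raise ValueError("need at least one model condition point")
    return max(1, round(math.sqrt(total_points)))


print(neighborhood_size(100))  # 10
print(neighborhood_size(50))   # 7
```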

Regarding Claim 12.

	The apparatus of claim 20, wherein the set number of the model condition points comprises a square root of a total number of model condition points. 

Skand teaches: 
	...wherein the set number of the model condition points comprises a square root of a total number of model condition points.  (Skand, section “Requirements for kNN”, #1 teaches “Generally k gets decided on the square root of number of data points”)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from Shibuya, as modified above, on a system which uses a kNN algorithm with the teachings from Skand on using the “square root” of the number of data points [model condition points]. The motivation to combine would have been that this value provides a k-value which reduces the “variance” while avoiding a “bias” (Skand, # 1, as cited above)



Regarding Claim 18.
Shibuya, as modified above, does not explicitly teach: 
	The machine-readable storage medium of claim 21, wherein the set number of the model condition points comprises a square root of a total number of model condition points. 
logic to project the labeled point into each signal projection subspace, comprising evaluating the at least one feature vector associated with the first time series from each of the plurality of signals in the corresponding signal projection subspace;
	logic to constrain an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace;
	logic to calculate, for each signal projection subspace, a first percent of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood out of the set number of model condition points in the analysis neighborhood;
	logic to calculate, for each signal projection subspace, a second percent of model condition points in the model having a same label as the labeled point out of the plurality of model condition points;
	logic to calculate a contribution of each of the plurality of signals to the first condition from the first percent and the second percent calculated for the corresponding signal projection subspace, to form a sorted list of signals and contributions;
	and logic to display signals that are higher in the sorted list of signals as the most likely contributors for the machine system condition corresponding to the labeled point. 
Skand teaches: 
	...wherein the set number of the model condition points comprises a square root of a total number of model condition points.  (Skand, section “Requirements for kNN”, #1 teaches “Generally k gets decided on the square root of number of data points”)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings from Shibuya, as modified above, on a system which uses a kNN algorithm with the teachings from Skand on using the “square root” of the number of data points [model condition points]. The motivation to combine would have been that this value provides a k-value which reduces the “variance” while avoiding a “bias” (Skand, # 1, as cited above)
Regarding Claim 21.

	A non-transitory machine-readable storage medium storing instructions that when executed by a processor, cause the processor to execute a control method for a machine system comprising sensors generating a plurality of signals over time, the method comprising:
	8Ser. No. 15/906,702 filed 2/27/2018 Gregory Olsen, et al. - GAU 2128 (Hopkins) Docket No. 60363-0027 receiving a labeled point comprising, for a range of time, a first time series from each of the plurality of signals, each first time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the labeled point having a first label describing a first condition of a plurality of conditions of the machine system, each of the plurality of signals having a corresponding signal projection subspace;
	projecting each of a plurality of model condition points into each signal projection subspace of the plurality of signal projection subspaces, each model condition point comprising a second time series from each of the plurality of signals, each second time series being associated with at least one feature vector for at least one time point, each feature vector having components corresponding to one or more signal features, the model condition point having a second label describing one of the plurality of conditions of the machine system, the projecting comprising evaluating the at least one feature vector associated with the second time series from each of the plurality of signals in the corresponding signal projection subspace;
	projecting the labeled point into each signal projection subspace, comprising evaluating the at least one feature vector associated with the first time series from each of the plurality of signals in the corresponding signal projection subspace;
	constraining an analysis neighborhood for the labeled point to a set number of model condition points closest to the labeled point as projected in each signal projection subspace;
	for each signal projection subspace, calculating a first ratio of (a) a number of model condition points in the analysis neighborhood having a same label as the labeled point in the analysis neighborhood and (b) the set number of model condition points in the analysis neighborhood;
	for each signal projection subspace, calculating a second ratio of (c) a number of model condition points in the model having a same label as the labeled point and (d) the set number of model condition points in the analysis neighborhood;
	calculating a contribution of each of the plurality of signals to the first condition, based on a ratio of the first ratio and the second ratio calculated for the corresponding signal projection subspace, to form a sorted list of signals and contributions;
	and adapt the physical system's behavior based on the sorted list.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Arbawa et al., “Soil Nutrient Content Classification for Essential Oil Plants using kNN”, 2019- see § 2.3, the kNN steps include “sort the smallest distance value to the largest”, this is a description of a general kNN classifier
Darling et al., “Toward Uncertainty Quantification for Supervised Classification”, Jan 2018, Sandia National Labs, Sandia Report SAND2018-0032, page 22 “Similarly, a probability value of a KNN prediction can be calculated using the ratio of class labels of the K nearest neighbors.” 
Basu, “A Tweak on K-Nearest Neighbor Decision Rule”, see § 2 – “The method will start with an initial value of k as ¯. If the difference between the number of representatives of the best and the second best competing classes is ¯, then the point is classified to the best competing class. Otherwise the value of the neighborhood parameter k will be increased by one and the process will continue until the point is classified to a particular class. If the test data point is not classified till the process reaches the last point of the training set, then the test data point will remain unclassified.”, in other words this constrains a neighborhood to a set value of k, and then incrementally grows the neighbor “till the last point” wherein “But unlike kNN, initially, it will take only the first ... points from the ordered set and performs the majority voting with a criterion that the difference between the total number of data points of the competing two classes (Lmax1 &Lmax2 ) among the nearest neighbors is equal to....”
Ko et al., “A New Dynamic Ensemble Selection Method for Numeral Recognition”, 2007, see § 2.1 and 2.2 for accuracy of a kNN being the “the percentage of the local training samples assigned to a class cli by this classifier that have been correctly labeled.” when taken with respect to an output class
Masuda et al., US 2015/0276557 – see abstract, see figures 11-12 and figures 16-17
Mestha et al., US 2019/0230119 – see figures 3B – 5

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas can be reached on (571) 272-2589.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/D.A.H./Examiner, Art Unit 2128                                                                                                                                                                                                        
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128