DETAILED ACTION
Response to Arguments
Applicant's arguments filed 06/09/2022 have been fully considered but they are not persuasive as to the following points of issue: 
Kelly does not mention analyzing the total aggregated data based on a plurality of detection conditions and generating a plurality of entries of usage pattern information.
Wang does not map the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly.
Applicant argues that Kelly does not mention analyzing the total aggregated data based on a plurality of detection conditions and generating a plurality of entries of usage pattern information. See pages 11-12 of Applicant’s Remarks submitted on 06/09/2022(stating that the get_activations() function of Kelly basically filters out and drops some data based on threshold power and threshold duration. However, the filtering operation is different from analyzing the content of the data based on detection conditions because the filtering operation does not transform the data attributes, but the analyzing operation transforms the data attributes). Examiner respectfully disagrees. 
In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., transform the data attributes) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Accordingly, under the broadest reasonable interpretation (BRI) of the claims in light of the specification, Kelly does teach the claim limitation of: analyzing the total aggregated data based on a plurality of detection conditions and generating a plurality of entries of usage pattern information.
Applicant argues that Wang does not teach the claim limitation of: the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly. See pages 12-14 of Applicant’s Remarks submitted on 06/09/2022(arguing that the information mapping module of claim 1 differs from Wang). In support of Applicant’s arguments, Applicant attached the following table to mark the differences between the information mapping module of claim 1 and Wang’s nonintrusive load monitoring model:

[Image: media_image1.png, Applicant's table comparing the information mapping module of claim 1 with Wang's nonintrusive load monitoring model]
 

Examiner respectfully disagrees. In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., only one mapping operation is required, power data has been transformed before being mapped, one set of output data) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Accordingly, under the broadest reasonable interpretation (BRI) of the claims in light of the specification, Wang does teach the claim limitation of: the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly.
Applicant’s arguments with respect to the third point of issue that Wang does not teach the following limitation of: analyze time correlation of the encoded data according to the corresponding timestamps to generate the first synthesized simulation data and second synthesized simulation data, respectively corresponding to a first electrical appliance and a second electrical appliance used during the unit processing period, have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument in respect to this point of issue.


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. TW107142078, filed on 11/26/2018.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-6 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Kelly, Daniel. Disaggregation of domestic smart meter energy data. Diss. Imperial College London, (2016) (“Kelly”) in view of Wang et al, Nonintrusive load monitoring based on deep learning. In International Workshop on Data Analytics for Renewable Energy Integration 2018 Sep 10 (pp. 137-147) (“Wang”) and in view of Filippi, Alessio, et al. "Multi-appliance power disaggregation: An approach to energy monitoring." 2010 IEEE International Energy Conference. IEEE, 2010 (“Filippi”).
Regarding claim 1, Kelly teaches:  a model building device for disaggregating total aggregated data outputted from a total electricity meter and measured during a unit processing period, the unit processing period comprising a plurality of timestamps(Kelly, pg. 158-160, see also table 9.1, 9.2, and 9.4,  “We used UK-DALE (see Chapter 4) as our source dataset. Each submeter in UK-DALE samples once every 6 seconds. All houses record aggregate apparent mains power once every 6 seconds… [t]he window width is decided on an appliance-by-appliance basis and varies from 128 samples (13 minutes) for the kettle to 1536 samples (2.5 hours) for the dish washer.” Kelly teaches we used UK-DALE (see Chapter 4) as our source dataset. Each submeter in UK-DALE samples once every 6 seconds. All houses record aggregate apparent mains power once every 6 seconds (i.e. a model building device for disaggregating total aggregated data outputted from a total electricity meter) the window width is decided on an appliance-by-appliance basis and varies from 128 samples (13 minutes) for the kettle to 1536 samples (2.5 hours) for the dish washer (i.e. and measured during a unit processing period, the unit processing period comprising a plurality of timestamps)), 
the model building device comprising: a usage pattern-analyzing module configured for receiving the total aggregated data, analyzing the total aggregated data based on a plurality of detection conditions, and generating a plurality of entries of usage pattern information (Kelly, pg. 158-160, see also table 9.4, “Appliance activations are extracted using NILMTK’s Electric.get_activations() method. The arguments we passed to get_activations() for each appliance are shown in Table 9.4. On simple appliances such as toasters, we extract activations by finding strictly consecutive samples above some threshold power. We then throw away any activations shorter than some threshold duration (to ignore spurious spikes). For more complex appliances such as washing machines whose power demand can drop below threshold for short periods during a cycle, NILMTK ignores short periods of sub-threshold power demand.”).
Kelly does not teach: an information mapping module electrically connected to the usage pattern-analyzing module, configured for mapping the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly.
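For illustration only, the activation extraction described in the quoted passage of Kelly can be sketched as follows. This is a minimal sketch and not NILMTK's implementation: the function name extract_activations and the parameter names on_power_threshold, min_on_samples, and min_off_samples are assumptions chosen for this example, and durations are expressed in samples rather than seconds.

```python
from typing import List, Tuple


def extract_activations(
    power: List[float],
    on_power_threshold: float,
    min_on_samples: int,
    min_off_samples: int = 0,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of candidate activations.

    Hypothetical helper written for illustration; it does not reproduce
    the signature or internals of NILMTK's Electric.get_activations().
    """
    # Step 1: find runs of strictly consecutive samples above the threshold power.
    runs: List[Tuple[int, int]] = []
    start = None
    for idx, watts in enumerate(power):
        if watts > on_power_threshold:
            if start is None:
                start = idx
        elif start is not None:
            runs.append((start, idx))
            start = None
    if start is not None:
        runs.append((start, len(power)))

    # Step 2: ignore short sub-threshold periods by merging runs whose gap
    # is at most min_off_samples (relevant to multi-phase appliances).
    merged: List[Tuple[int, int]] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] <= min_off_samples:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    # Step 3: throw away activations shorter than the threshold duration.
    return [(s, e) for s, e in merged if e - s >= min_on_samples]


# Example with made-up readings in watts.
readings = [0, 5, 2000, 2100, 0, 0, 2050, 2000, 1900, 0, 3000, 0]
simple = extract_activations(readings, 1000, 2)
tolerant = extract_activations(readings, 1000, 2, min_off_samples=2)
```

With the made-up readings above, simple contains two activations and drops the single-sample spike, while tolerant bridges the short sub-threshold gaps and returns one longer activation.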
However, Wang teaches: an information mapping module electrically connected to the usage pattern-analyzing module, configured for mapping the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly (Wang, pgs. 139-146, see also fig. 1 and fig. 2, “The overall framework of the proposed nonintrusive load monitoring model is shown in Fig. 1… After data segmentation, an embedding process is used to map the integer value of aggregated power consumption to a high dimensional vector with an embedding matrix E: $E = [voc\_size,\ embedding\_size]$ … [a]t each time step t, the encoder calculates $h_t$, the hidden state of time t, from $h_{t-1}$ and $Z_t$, the embedded input of time t. $h_t = f(Z_t, h_{t-1}),\ t = 1, 2, \ldots, N_i$ where $f$ is the inner computation rule of LSTM.” Wang teaches: from fig. 1 the low sampling-rate aggregate power consumption data is electrically connected to segmentation which is electrically connected to $N_1$-length data which is electrically connected to embedding and encode (i.e. an information mapping module electrically connected to the usage pattern-analyzing module) an embedding process is used to map the integer value of aggregated power consumption to a high dimensional vector with an embedding matrix E: $E = [voc\_size,\ embedding\_size]$ … [a]t each time step t, the encoder calculates $h_t$, the hidden state of time t, from $h_{t-1}$ and $Z_t$, the embedded input of time t. $h_t = f(Z_t, h_{t-1}),\ t = 1, 2, \ldots, N_i$ where $f$ is the inner computation rule of LSTM (i.e. configured for mapping the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly)).
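For illustration only, the embedding lookup and recurrent encoding quoted from Wang can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the embedding matrix is filled with random values rather than learned, the hidden size is assumed equal to the embedding size, and the recurrence f is replaced by a simplified tanh update that stands in for, and is not, the LSTM computation rule.

```python
import math
import random
from typing import List


def make_embedding_matrix(voc_size: int, embedding_size: int, seed: int = 0) -> List[List[float]]:
    """Build a voc_size x embedding_size matrix E (random here, learned in practice)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(embedding_size)] for _ in range(voc_size)]


def embed(power_values: List[int], E: List[List[float]]) -> List[List[float]]:
    """Map each integer power value p to row p of E, giving the embedded inputs Z_t."""
    return [E[p] for p in power_values]


def recurrent_step(z_t: List[float], h_prev: List[float]) -> List[float]:
    """Stand-in for h_t = f(Z_t, h_{t-1}); a plain tanh update, NOT an LSTM cell."""
    return [math.tanh(z + h) for z, h in zip(z_t, h_prev)]


def encode(power_values: List[int], E: List[List[float]]) -> List[List[float]]:
    """Return the hidden states h_1 ... h_N for one segment of integer power values."""
    h = [0.0] * len(E[0])  # assumed initial hidden state h_0 = 0
    hidden_states = []
    for z_t in embed(power_values, E):
        h = recurrent_step(z_t, h)
        hidden_states.append(h)
    return hidden_states


# Example: a 3-sample segment with a vocabulary of 8 power levels and 4 dimensions.
E = make_embedding_matrix(voc_size=8, embedding_size=4)
states = encode([3, 0, 7], E)
```

Each integer power value selects one row of E, so a segment of N values yields N vectors of embedding_size dimensions, and each hidden state depends on the current embedded input and the previous hidden state.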
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Wang. The motivation to do so would be to use a sequence-to-sequence model with attention, which Wang reports to be more accurate than models based on CNNs and RNNs (Wang, pg. 146, “We develop a deep learning framework based on sequence-to-sequence model and attention mechanism to perform nonintrusive load monitoring. The proposed model introduces the Encoder-Decoder architecture, and uses attention mechanism to extract the most relevant hidden states of encoder to guide the decoding process. These unique features enhance the proposed model’s ability to extract and utilize information dramatically. Tests on houses involved in training and houses not involved in training demonstrate that the proposed model can increase accuracy, recall and F1-score by 10 to 20% and reduce mean absolute error dramatically compared to the deep learning models based on CNN and RNN.”).
Kelly does not teach: and a time series-analyzing module electrically connected to the information mapping module, configured for analyzing time correlation of the encoded data according to the corresponding timestamps to generate first synthesized simulation data and second synthesized simulation data wherein the first synthesized simulation data and the second synthesized simulation data correspond to a first electrical appliance and a second electrical appliance used during the unit processing period, respectively. 
However, Filippi teaches: and a time series-analyzing module electrically connected to the information mapping module, configured for analyzing time correlation of the encoded data according to the corresponding timestamps to generate first synthesized simulation data and second synthesized simulation data wherein the first synthesized simulation data and the second synthesized simulation data correspond to a first electrical appliance and a second electrical appliance used during the unit processing period, respectively (Filippi, pgs. 92-94, see also table 1, table II, and figs. 2 and 3, “Define the $N \times 1$ vector $\tilde{i}_k = [\tilde{i}_{1,k}\ \tilde{i}_{2,k}\ \ldots\ \tilde{i}_{N,k}]^T$, and an $N \times N$ correlation matrix R with the element in the m-th row and n-th column, $R_{m,n}$, defined as $R_{m,n} = \frac{1}{T}\int_0^T i_m(t)\, i_n(t)\, dt$ … Fig. 2 shows the experimental setup…we chose a set of 11 appliances and use the demonstrator to collect the current waveforms of the different appliances…[s]ample current signatures are shown in Fig. 3…[b]ased on the normalized current signatures $i_n(t)$, we compute the correlation matrix R… [t]he entries of R are shown in Table II.” Filippi teaches: correlation matrix R with the element in the m-th row and n-th column, $R_{m,n}$, defined as $R_{m,n} = \frac{1}{T}\int_0^T i_m(t)\, i_n(t)\, dt$ [i.e. analyzing time correlation of the encoded data according to the corresponding timestamps] [t]he entries of R are shown in Table II [i.e. to generate first synthesized simulation data and second synthesized simulation data] we chose a set of 11 appliances and use the demonstrator to collect the current waveforms of the different appliances, sample current signatures are shown in Fig. 3 [i.e. wherein the first synthesized simulation data and the second synthesized simulation data correspond to a first electrical appliance and a second electrical appliance used during the unit processing period, respectively]).
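For illustration only, the correlation matrix quoted from Filippi, whose entry in row m and column n is the time average over one period T of the product of the current signatures of appliances m and n, can be approximated from uniformly sampled signatures as follows. This is a minimal sketch under stated assumptions: the function name correlation_matrix is hypothetical, the integral is replaced by a Riemann sum over K equally spaced samples of one period, and the two example signatures are synthetic sinusoids rather than measured appliance waveforms.

```python
import math
from typing import List


def correlation_matrix(signatures: List[List[float]]) -> List[List[float]]:
    """Approximate R[m][n] = (1/T) * integral over [0, T] of i_m(t) * i_n(t) dt.

    With K uniform samples over one period, dt = T / K, so the expression
    reduces to the mean of the sample-wise products.
    """
    n_samples = len(signatures[0])
    return [
        [sum(a * b for a, b in zip(sig_m, sig_n)) / n_samples for sig_n in signatures]
        for sig_m in signatures
    ]


# Two synthetic signatures over one period, scaled so that each has unit mean power.
K = 1000
sig_a = [math.sqrt(2) * math.sin(2 * math.pi * k / K) for k in range(K)]
sig_b = [math.sqrt(2) * math.cos(2 * math.pi * k / K) for k in range(K)]
R = correlation_matrix([sig_a, sig_b])
```

In this synthetic case the diagonal entries of R are approximately 1 and the off-diagonal entries are approximately 0, because the two signatures are orthogonal over a full period; signatures of real appliances would generally produce nonzero off-diagonal entries.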
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Filippi. The motivation to do so would be to incorporate correlation among appliances when designing disaggregation/detection algorithms (Filippi, pg. 91, “The approach we take in this paper relies on steady state analysis where observations are aggregated over a complete period of the electrical waveform. In particular, a current sensor measures the total current drawn which is a linear summation of the individual currents flowing through active appliances corrupted by additive noise. We translate this into a vector system model after performing matched filtering. Two forms of linear detectors - Zero Forcing (ZF) and Minimum Mean Square Error (MMSE) are then used to estimate the active/inactive states of the appliances, and in turn their power consumptions.”).
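For illustration only, the zero-forcing detection described in the quoted Filippi passage might be sketched as follows for two appliances. This is a minimal sketch under stated assumptions, not Filippi's implementation: the function names, the discrete-time correlation used in place of the integral, and the 0.5 decision threshold are all hypothetical.

```python
def correlate(a, b):
    """Discrete correlation of two equal-length sampled waveforms
    (an assumed stand-in for the correlation over one waveform period)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def zero_forcing_states(total, sig1, sig2, threshold=0.5):
    """Estimate on/off states of two appliances from the total current.

    total: sampled aggregate current over one period
    sig1, sig2: sampled current signatures of the two appliances
    threshold: hypothetical decision threshold on the estimated state
    """
    # Correlation matrix R between the appliance signatures.
    r11 = correlate(sig1, sig1)
    r12 = correlate(sig1, sig2)
    r22 = correlate(sig2, sig2)
    # Matched-filter outputs: total current correlated with each signature.
    y1 = correlate(total, sig1)
    y2 = correlate(total, sig2)
    det = r11 * r22 - r12 * r12
    if abs(det) < 1e-12:
        raise ValueError("signatures are linearly dependent; R is singular")
    # Zero forcing: solve R b = y for the state vector b (2x2 inverse).
    b1 = (r22 * y1 - r12 * y2) / det
    b2 = (r11 * y2 - r12 * y1) / det
    return b1 > threshold, b2 > threshold


sig1 = [1.0, 0.0, -1.0, 0.0]
sig2 = [1.0, 1.0, -1.0, -1.0]
only_second = zero_forcing_states(sig2, sig1, sig2)
both_on = zero_forcing_states([a + b for a, b in zip(sig1, sig2)], sig1, sig2)
```

Because the two signatures are correlated, the matched-filter outputs alone would overstate the first appliance when only the second is active; inverting R removes that cross-talk.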
Regarding claim 2, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 1, wherein the detection conditions comprises a plurality of first time-frequency filter parameters corresponding to the first electrical appliance and a plurality of second time-frequency filter parameters corresponding to the second electrical appliance(Kelly, pgs. 160-161, see also table 9.4, “Appliance activations are extracted using NILMTK’s Electric.get_activations() method. The arguments we passed to get_activations() for each appliance are shown in Table 9.4. On simple appliances such as toasters, we extract activations by finding strictly consecutive samples above some threshold power. We then throw away any activations shorter than some threshold duration (to ignore spurious spikes). For more complex appliances such as washing machines whose power demand can drop below threshold for short periods during a cycle, NILMTK ignores short periods of sub-threshold power demand.”), 
the usage pattern-analyzing module comprising:  a time-frequency detection module, comprising: a plurality of first time-frequency detectors configured for analyzing the total aggregated data according to the first time-frequency filter parameters to generate a plurality of entries of first time-frequency usage pattern information; and a plurality of second time-frequency detectors configured for analyzing the total aggregated data according to the second time-frequency filter parameters to generate a plurality of entries of second time-frequency usage pattern information(Kelly, pgs. 160-161, see also table 9.4, “Appliance activations are extracted using NILMTK’s Electric.get_activations() method. The arguments we passed to get_activations() for each appliance are shown in Table 9.4. On simple appliances such as toasters, we extract activations by finding strictly consecutive samples above some threshold power. We then throw away any activations shorter than some threshold duration (to ignore spurious spikes). For more complex appliances such as washing machines whose power demand can drop below threshold for short periods during a cycle, NILMTK ignores short periods of sub-threshold power demand.”). 
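For illustration only, the threshold-based activation extraction described in the quoted Kelly passage might be sketched as follows. This is a minimal sketch and not NILMTK's Electric.get_activations() implementation; the function name, parameter names, and the per-appliance values are hypothetical.

```python
def extract_activations(power, on_threshold, min_on, max_gap=0):
    """Return (start, end) index pairs, end exclusive, of appliance activations.

    power: list of power samples
    on_threshold: samples above this value count as 'on'
    min_on: activations shorter than this many samples are discarded
    max_gap: sub-threshold runs up to this length do not end an activation
    """
    activations = []
    start = None     # index where the current activation began
    last_on = None   # index of the most recent above-threshold sample
    for i, p in enumerate(power):
        if p > on_threshold:
            if start is None:
                start = i
            last_on = i
        elif start is not None and i - last_on > max_gap:
            # The sub-threshold run is too long: close the activation.
            if last_on + 1 - start >= min_on:
                activations.append((start, last_on + 1))
            start = None
    # Close an activation that is still open at the end of the data.
    if start is not None and last_on + 1 - start >= min_on:
        activations.append((start, last_on + 1))
    return activations


power = [0, 5, 2000, 2100, 2050, 0, 0, 1900, 0, 0]
# Hypothetical parameters for a simple appliance (no gaps tolerated).
simple = extract_activations(power, on_threshold=1000, min_on=2)
# Hypothetical parameters for a complex appliance (short dips tolerated).
complex_ = extract_activations(power, on_threshold=1000, min_on=2, max_gap=2)
```

With the same samples, the two parameter sets yield different activations, which is the sense in which each appliance has its own set of extraction arguments.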
Regarding claim 3, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 1, wherein the detection conditions comprises an edge detection condition, and the usage pattern-analyzing module comprises: an edge detection module configured for analyzing the total aggregated data based on the edge detection condition by centering each of the timestamps, and generating a plurality of entries of edge usage pattern information corresponding to the timestamps (Kelly, pgs. 162-163, “First, the mean of each sequence is subtracted from the sequence to give each sequence a mean of zero. Every input sequence is divided by the standard deviation of a random sample of the training set… We have done some preliminary experiments and found that neural nets appear to be able to generalize better if we independently cent[er] each sequence.”).
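For illustration only, the per-sequence centering and scaling described in the quoted Kelly passage might be sketched as follows. This is a minimal sketch; the function name and the example standard deviation are hypothetical.

```python
import statistics


def standardize(sequence, training_std):
    """Center one input sequence on its own mean, then scale it.

    sequence: list of power samples for a single input window
    training_std: standard deviation estimated from a random sample of
                  the training set (passed in, not recomputed per sequence)
    """
    mean = statistics.fmean(sequence)
    return [(x - mean) / training_std for x in sequence]


centered = standardize([1.0, 2.0, 3.0], training_std=2.0)
```

Each sequence is centered independently, while the divisor is shared across sequences, matching the two steps named in the quotation.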
Regarding claim 4, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 1, wherein the number of the mapping dimensions is a multiple of the number of the timestamps in the unit processing period, and the number of the mapping dimensions is less than the number of the entries of the usage pattern information (Wang, pg. 139, “After data segmentation, an embedding process is used to map the integer value of aggregated power consumption to a high dimensional vector with an embedding matrix E: E = [voc_size, embedding_size]. For each aggregated power consumption value i, it is mapped to vector E[i] and after the data segmentation process, the [N_i * 1] input sequence is transformed into [N_i * embedding_size] matrix Z.” Wang teaches: the input sequence is transformed into [N_i * embedding_size] matrix Z (i.e. the number of the mapping dimensions is a multiple of the number of the timestamps in the unit processing period) to a high dimensional vector with an embedding matrix E: E = [voc_size, embedding_size] (i.e. and the number of the mapping dimensions is less than the number of the entries of the usage pattern information)). 
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly with the above teachings of Wang for the same rationale stated at Claim 1.
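For illustration only, the embedding lookup described in the quoted Wang passage might be sketched as follows. This is a minimal sketch; the function names and the random initialization of E are assumptions, since the quotation states only the shapes involved.

```python
import random


def make_embedding(voc_size, embedding_size, seed=0):
    """Build an embedding matrix E of shape [voc_size, embedding_size].

    The random initial values are an assumption for illustration.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(embedding_size)]
            for _ in range(voc_size)]


def embed(sequence, E):
    """Map each integer power value i to its row E[i].

    A length-N_i input sequence becomes an [N_i, embedding_size] matrix Z.
    """
    return [E[i] for i in sequence]


E = make_embedding(voc_size=10, embedding_size=4)
Z = embed([3, 0, 7], E)  # N_i = 3 timestamps
```

Here Z has one row per timestamp and embedding_size columns, so its total number of elements is N_i multiplied by embedding_size.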
Regarding claim 5, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 1, wherein the time series-analyzing module comprises: a correlation analyzing module configured for generating a plurality of first-layer past-time long short-term memory neurons and a plurality of first-layer future-time long short-term memory neurons corresponding to the timestamps, wherein the correlation analyzing module generates a plurality of past-time correlation sequences according to a first-layer past-time series consisting of the first-layer past-time long short-term memory neurons, and generates a plurality of future-time correlation sequences according to a first-layer future-time series consisting of the first-layer future-time long short-term memory neurons (Kelly, pgs. 165-166, see also fig. 9, “An additional enhancement to RNNs is to use bidirectional layers. In a bidirectional RNN, there are effectively two parallel RNNs, one reads the input sequence forwards and the other reads the input sequence backwards. The output from the forwards and backwards halves of the network are combined either by concatenating them or doing an element-wise sum (we experimented with both and settled on concatenation, although element-wise sum appeared to work almost as well and is computationally cheaper)… We experimented with both RNNs and LSTMs and settled on the following architecture for energy disaggregation: 1. Input (length determined by appliance duration) 2. 1D convolution (filter size=4, stride=1, number of filters=16, activation function=linear, border mode=same) 3. bidirectional LSTM (N=128, with peepholes) 4. bidirectional LSTM (N=256, with peepholes) 5. Fully connected (N=128, activation function=TanH) 6. Fully connected (N=1, activation function=linear). At each time step, the network sees a single sample of aggregate power data and outputs a single sample of power data for the target appliance.”). 
Regarding claim 6, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 5, wherein the past-time correlation sequences and the future-time correlation sequences collectively form the time correlation sequences, the time series-analyzing module further comprising:  a basis waveform-generating module electrically connected to the correlation analyzing module, configured for generating a plurality of past-time waveforms and a plurality of future-time waveforms which are collectively defined as a plurality of common basis waveforms(Kelly, pg. 165-166, “An additional enhancement to RNNs is to use bidirectional layers. In a bidirectional RNN, there are effectively two parallel RNNs, one reads the input sequence forwards and the other reads the input sequence backwards. The output from the forwards and backwards halves of the network are combined…by concatenating them….” Kelly teaches: in a bidirectional RNN, there are effectively two parallel RNNs, one reads the input sequence forwards and the other reads the input sequence backwards (i.e. wherein the past-time correlation sequences and the future-time correlation sequences collectively form the time correlation sequences) the output from the forwards and backwards halves of the network are combined by concatenating them (i.e. the time series-analyzing module further comprising:  a basis waveform-generating module electrically connected to the correlation analyzing module, configured for generating a plurality of past-time waveforms and a plurality of future-time waveforms which are collectively defined as a plurality of common basis waveforms)).  
Regarding claim 9, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 1, further comprising: a training batch-determination module configured for dividing a plurality of augmented data received from a data processing device into a plurality of batches, wherein the augmented data comprises total augmented data, first augmented data and second augmented data corresponding to the total aggregated data, the first electrical appliance and the second electrical appliance, respectively (Kelly, pgs. 161-162, “To create synthetic aggregate data we start by extracting a set of appliance activations for five appliances across all training houses: kettle, washing machine, dish washer, microwave and fridge. To create a single sequence of synthetic data, we start with two vectors of zeros: one vector will become the input to the net; the other will become the target. The length of each vector defines the ‘window width’ of data that the network sees. We go through the five appliance classes and decide whether or not to add an activation of that class to the training sequence… [o]f course, this relatively naïve approach to synthesising aggregate data ignores a lot of structure that appears in real aggregate data. For example, the kettle and toaster might often appear within a few minutes of each other in real data, but our simple ‘simulator’ is completely unaware of this sort of structure… [e]ach network receives data in a mini-batch…for the large RNN sequences, in which case we use a batch size of 16 sequences. The code is multi-threaded so the CPU can be busy preparing one batch of data on the fly whilst the GPU is busy training on the previous batch.” Kelly teaches: each network receives data in a mini-batch for the large RNN sequences, in which case we use a batch size of 16 sequences (i.e. a training batch-determination module configured for dividing a plurality of augmented data received from a data processing device into a plurality of batches) to create synthetic aggregate data we start by extracting a set of appliance activations for five appliances across all training houses (i.e. wherein the augmented data comprises total augmented data, first augmented data and second augmented data corresponding to the total aggregated data, the first electrical appliance and the second electrical appliance, respectively)).  

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kelly, Daniel. Disaggregation of domestic smart meter energy data. Diss. Imperial College London, (2016)(“Kelly”) in view of Wang et al, Nonintrusive load monitoring based on deep learning. In International Workshop on Data Analytics for Renewable Energy Integration 2018 Sep 10 (pp. 137-147)(“Wang”)
Regarding claim 14, Kelly teaches the loading disaggregation system according to claim 13, wherein the data processing device further comprises: a data augmentation module electrically connected to the data balance module, configured for augmenting the total balanced data, the first respective balanced data and the second respective balanced data according to at least a data augmentation rule to generate a plurality of entries of total augmented data (Kelly, pgs. 157-158, “In energy disaggregation, we have the advantage that generating effectively infinite amounts of synthetic aggregate data is relatively easy by randomly combining real appliance activations… We trained our nets on both synthetic aggregate data and real aggregate data in a 50:50 ratio. We found that synthetic data acts as a regulariser. In other words, training on a mix of synthetic and real aggregate data rather than just real data appears to improve the net’s ability to generalise to unseen houses.” Kelly teaches: generating effectively infinite amounts of synthetic aggregate data is relatively easy by randomly combining real appliance activations (i.e. a data augmentation module electrically connected to the data balance module, configured for augmenting the total balanced data) We trained our nets on both synthetic aggregate data and real aggregate data in a 50:50 ratio (i.e. the first respective balanced data and the second respective balanced data according to at least a data augmentation rule to generate a plurality of entries of total augmented data)). 
Kelly does not teach: a plurality of entries of first respective augmented data and a plurality of entries of second respective augmented data, respectively, wherein the number of the entries of the total augmented data is greater than the number of the entries of the total balanced data, wherein the entries of the total augmented data, the entries of the first respective augmented data and the entries of the second respective augmented data are equal in quantity.
However, Wang teaches: a plurality of entries of first respective augmented data and a plurality of entries of second respective augmented data, respectively, wherein the number of the entries of the total augmented data is greater than the number of the entries of the total balanced data, wherein the entries of the total augmented data, the entries of the first respective augmented data and the entries of the second respective augmented data are equal in quantity (Wang, pgs. 142-143, see also Table 1, “In order to synthetize the training data, we first need to extract load activation…which is the working period power consumption data of each target appliance. The parameter setting of the data extraction is shown in Table 1. The extracted load activations are stored properly. After extracting load activations, the training data are synthetized in three steps. First, create an all-zero sequence of length Ni, which is shown in Table 1. Then put one load activation of the target appliance into the sequence entirely with 50% probability. The remaining sequence is unchanged with 50% probability. Second, for appliances except for the target appliance, put one load activation of each into the sequence with 25% probability, and this does not require the load activation to be put into the sequence entirely. Third, repeat step one and two for K times, and make a training data that includes K pieces of Ni length sequences.” Wang teaches: create an all-zero sequence of length Ni, and put one load activation of the target appliance into the sequence entirely with 50% probability. The remaining sequence is unchanged with 50% probability. (i.e. a plurality of entries of first respective augmented data and a plurality of entries of second respective augmented data, respectively) create an all-zero sequence of length Ni, and the load activation which is the working period power consumption data of each target appliance (i.e. wherein the number of the entries of the total augmented data is greater than the number of the entries of the total balanced data) put one load activation of the target appliance into the sequence entirely with 50% probability. The remaining sequence is unchanged with 50% probability (i.e. wherein the entries of the total augmented data, the entries of the first respective augmented data and the entries of the second respective augmented data are equal in quantity)).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Wang. The motivation to do so would be to use a sequence-to-sequence model with attention, which Wang reports to be more accurate than models based on CNNs and RNNs (Wang, pg. 146, “We develop a deep learning framework based on sequence-to-sequence model and attention mechanism to perform nonintrusive load monitoring. The proposed model introduces the Encoder-Decoder architecture, and uses attention mechanism to extract the most relevant hidden states of encoder to guide the decoding process. These unique features enhance the proposed model’s ability to extract and utilize information dramatically. Tests on houses involved in training and houses not involved in training demonstrate that the proposed model can increase accuracy, recall and F1-score by 10 to 20% and reduce mean absolute error dramatically compared to the deep learning models based on CNN and RNN.”).
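For illustration only, the three-step training data synthesis described in the Wang passage quoted for claim 14 might be sketched as follows. This is a minimal sketch and not Wang's code; the function names, the pairing of each aggregate sequence with a target sequence, and the example activations are hypothetical.

```python
import random


def place(seq, activation, start):
    """Add an activation into seq at offset start, clipping any overhang."""
    for k, value in enumerate(activation):
        j = start + k
        if 0 <= j < len(seq):
            seq[j] += value


def synthesize(target_acts, other_acts, n_i, k, rng):
    """Build k synthetic (aggregate, target) pairs of length n_i.

    target_acts: activations of the target appliance
    other_acts: dict of appliance name -> activations of other appliances
    """
    fitting = [a for a in target_acts if len(a) <= n_i]
    samples = []
    for _ in range(k):                       # step 3: repeat k times
        aggregate = [0.0] * n_i              # step 1: all-zero sequence
        target = [0.0] * n_i
        if fitting and rng.random() < 0.5:   # target placed entirely, 50%
            act = rng.choice(fitting)
            start = rng.randint(0, n_i - len(act))
            place(aggregate, act, start)
            place(target, act, start)
        for acts in other_acts.values():     # step 2: others, 25% each
            if rng.random() < 0.25:
                act = rng.choice(acts)
                # Partial placement is allowed, so the offset may overhang.
                start = rng.randint(-len(act) + 1, n_i - 1)
                place(aggregate, act, start)
        samples.append((aggregate, target))
    return samples


samples = synthesize([[100.0, 100.0]], {"fridge": [[50.0, 50.0, 50.0]]},
                     n_i=8, k=5, rng=random.Random(0))
```

In this sketch every iteration emits one aggregate sequence together with one target sequence, so the two collections always contain the same number of entries.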

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Kelly, Daniel. Disaggregation of domestic smart meter energy data. Diss. Imperial College London, (2016)(“Kelly”) in view of Wang et al, Nonintrusive load monitoring based on deep learning. In International Workshop on Data Analytics for Renewable Energy Integration 2018 Sep 10 (pp. 137-147)(“Wang”) and in view of Filippi, Alessio, et al. "Multi-appliance power disaggregation: An approach to energy monitoring." 2010 IEEE International Energy Conference. IEEE, 2010(“Filippi”) and further in view of Liang, Jian, et al. "Load signature study—Part II: Disaggregation framework, simulation, and applications." IEEE Transactions on Power Delivery 25.2 (2009)(“Liang”).
Regarding claim 7, Kelly in view of Wang and in view of Filippi teaches the model building device according to claim 6, wherein the time series-analyzing module further comprises: a data synthesizing module electrically connected to the basis waveform-generating module, and comprising: a plurality of waveform selecting modules electrically connected to the basis waveform-generating module, configured for selecting a portion of the common basis waveforms as a plurality of first appliance basis waveforms corresponding to the first electrical appliance, and selecting another portion of the common basis waveforms as a plurality of second appliance basis waveforms corresponding to the second electrical appliance (Kelly, pgs. 165-166, see also fig. 9.2, “An additional enhancement to RNNs is to use bidirectional layers. In a bidirectional RNN, there are effectively two parallel RNNs, one reads the input sequence forwards and the other reads the input sequence backwards. The output from the forwards and backwards halves of the network are combined…by concatenating them…[a]t each time step, the network sees a single sample of aggregate power data and outputs a single sample of power data for the target appliance… Figure 9.2 shows an example output of our LSTM network in the two ‘RNN’ rows.”); 
Kelly in view of Wang and in view of Filippi does not teach: and a plurality of waveform synthesizing modules electrically connected to the waveform selecting modules correspondingly, configured for generating the first synthesized simulation data by combining the first appliance basis waveforms, and generating the second synthesized simulation data by combining the second appliance basis waveforms. 
However, Liang teaches: and a plurality of waveform synthesizing modules electrically connected to the waveform selecting modules correspondingly, configured for generating the first synthesized simulation data by combining the first appliance basis waveforms, and generating the second synthesized simulation data by combining the second appliance basis waveforms(Liang, pg. 563, “This simulator is designed to produce more realistic consumption patterns of different household appliances during the day. First, we classify the database appliances into four types…Second, we define the expected operating periods and the probabilities of each type being on during each time segment of a day. Some of them may be switched regularly and frequently (e.g., refrigerator) and some may be switched occasionally (e.g., hair dryer or water boiler). The frequency of usage of each appliance during the simulation is determined by their preset probabilities…Fig. 4(a) and (b) shows a typical simulation result and the histogram of the number of simultaneously-operating appliances, respectively. Simulations from behavior-based simulator III are more representative of typical household power consumption, such as the one shown in Fig. 4(c)” & see also Liang, pg. 567, see also fig. 15(a),  “Fig. 15(a) shows a typical power profile within 24-hr derived from simulator III.”).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Wang and in view of Filippi and further in view of Liang. The motivation to do so would be to use historical simulations of household appliances to evaluate committee decision mechanisms (CDMs) for better recognition performance (Liang, pg. 561, “In general, although two appliances may have similar features, other features could have detectable traits so that they can be used for identification. For instance, a water boiler and air conditioner may look the same in their steady-state signatures but their transient features are noticeably different. Furthermore, different disaggregation algorithms have different recognition capabilities which depend on the features being analyzed. As such, this paper first presents a framework for load disaggregation using three committee decision mechanisms (CDMs). We also propose different random appliance switching simulators and illustrate how these simulators can help us evaluate the performance of different load disaggregation methods, particularly the CDMs.”).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Kelly, Daniel. Disaggregation of domestic smart meter energy data. Diss. Imperial College London, (2016)(“Kelly”) in view of Wang et al, Nonintrusive load monitoring based on deep learning. In International Workshop on Data Analytics for Renewable Energy Integration 2018 Sep 10 (pp. 137-147)(“Wang”) and in view of Filippi, Alessio, et al. "Multi-appliance power disaggregation: An approach to energy monitoring." 2010 IEEE International Energy Conference. IEEE, 2010(“Filippi”) and in view of Liang, Jian, et al. "Load signature study—Part II: Disaggregation framework, simulation, and applications." IEEE Transactions on Power Delivery 25.2 (2009)(“Liang”) and further in view of Matsato WO 2013145779 A2(“Matsato”).
Regarding claim 8, Kelly in view of Wang and in view of Filippi and in view of Liang teaches the model building device according to claim 7, but does not teach wherein the waveform selecting modules comprises: a first waveform selecting module configured for selecting the first appliance basis waveforms from the common basis waveforms, and determining a first ratio of each of the first appliance basis waveforms, wherein the first synthesized simulation data are formed by combining the first appliance basis waveforms according to the first ratio.
However, Matsato teaches: wherein the waveform selecting modules comprises: a first waveform selecting module configured for selecting the first appliance basis waveforms from the common basis waveforms, and determining a first ratio of each of the first appliance basis waveforms, wherein the first synthesized simulation data are formed by combining the first appliance basis waveforms according to the first ratio (Matsato, paras. 0038-0042, see also fig. 2, “In the following description, as the sum total data, for example, a sum total of current consumed by each household electrical appliance is employed, and, for example, a waveform of current consumed by each household electrical appliance is separated from waveforms of the sum total of current which is the sum total data. Fig. 2 is a diagram illustrating an outline of the waveform separation learning performing the household electrical appliance separation. In the waveform separation learning, a current waveform Y_t which is sum total data at the time point t is set as an addition value (sum total) of a current waveform W^(m) of current consumed by each household electrical appliance #m, and the current waveform W^(m) consumed by each household electrical appliance #m is obtained from the current waveform Y_t. In Fig. 2, there are five household electrical appliances #1 to #5 in the household, and, of the five household electrical appliances #1 to #5, the household electrical appliances #1, #2, #4 and #5 are in an ON state (state where power is consumed), and the household electrical appliance #3 is in an OFF state (state where power is not consumed). For this reason, in Fig. 2, the current waveform Y_t as the sum total data becomes an addition value (sum total) of current consumption W^(1), W^(2), W^(4) and W^(5) of the respective household electrical appliances #1, #2, #4 and #5.” Matsato teaches: fig. 2 and in the waveform separation learning, a current waveform Y_t which is sum total data at the time point t is set as an addition value (sum total) of a current waveform                         
                            
                                
                                    W
                                
                                
                                    (
                                    m
                                    )
                                
                            
                        
                     of current consumed by each household electrical appliance #m, and the current waveform                         
                            
                                
                                    W
                                
                                
                                    (
                                    m
                                    )
                                
                            
                        
                    consumed by each household electrical appliance #m is obtained from the current waveform                         
                            
                                
                                    Y
                                
                                
                                    t
                                
                            
                        
                    (i.e. wherein the waveform selecting modules comprises: a first waveform selecting module configured for selecting the first appliance basis waveforms from the common basis waveforms and  5determining a first ratio of each of the first appliance basis waveforms)) For this reason, in Fig. 2, the current waveform                         
                            
                                
                                    Y
                                
                                
                                    t
                                
                            
                        
                     as the sum total data becomes an addition value (sum total) of current consumption                         
                            
                                
                                    W
                                
                                
                                    (
                                    1
                                    )
                                
                            
                            
                                
                                     
                                    W
                                
                                
                                    (
                                    2
                                    )
                                
                            
                             
                            
                                
                                    W
                                
                                
                                    (
                                    4
                                    )
                                
                            
                        
                    and                         
                            
                                
                                    W
                                
                                
                                    (
                                    5
                                    )
                                
                            
                        
                    of the respective household electrical appliances #1, #2, #4 and #5 (i.e. wherein the first synthesized simulation data are formed by combining the first appliance basis waveforms according to the first ratio)).
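For illustration only, the additive relationship described in the Matsato passage quoted above (the sum total waveform $Y_t$ equals the sum of the waveforms $W^{(m)}$ of the appliances in the ON state) can be sketched as follows. The function name, the two-sample waveforms, and the numeric values are hypothetical and do not appear in Matsato; the sketch is not Matsato's waveform separation learning.

```python
def sum_total_waveform(appliance_waveforms, on_states):
    """Add the waveforms W(m) of the appliances that are ON to form Y_t."""
    length = len(next(iter(appliance_waveforms.values())))
    total = [0.0] * length
    for m, waveform in appliance_waveforms.items():
        if on_states[m]:  # appliances in the OFF state contribute nothing
            for k, sample in enumerate(waveform):
                total[k] += sample
    return total


# Hypothetical waveforms for appliances #1 to #5 (values invented for illustration).
waveforms = {
    1: [1.0, 2.0],
    2: [0.5, 0.5],
    3: [9.0, 9.0],
    4: [2.0, 1.0],
    5: [0.25, 0.25],
}
# As in Fig. 2 of Matsato: #1, #2, #4 and #5 are ON, #3 is OFF.
on = {1: True, 2: True, 3: False, 4: True, 5: True}
y_t = sum_total_waveform(waveforms, on)
```

In this sketch appliance #3 is excluded from the sum because it is in the OFF state, matching the Fig. 2 scenario quoted above.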
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Wang, in view of Filippi, in view of Liang, and further in view of Matsato. The motivation to do so would be to use pre-stored models of household consumption usage and to update only the parameters associated with the pre-stored models rather than the models themselves (Matsato, para. 0054-0058, “The overall models $\phi$ include household electrical appliance models #1 to #M which are M models (representing current consumption) of a plurality of household electrical appliances. The parameters $\phi$ of the overall models include a current waveform parameter indicating current consumption for each operating state of a household electrical appliance indicated by the household electrical appliance model #m…The model parameters $\phi$ of the overall models stored in the model storage unit 13 are referred to by the evaluation portion 21 and the estimation portion 22 of the state estimation unit 12, the label acquisition unit 15, and the data output unit 16, and are updated by a waveform separation learning portion 31, a variance learning portion 32, and a state variation learning portion 33 of the model learning unit 14…The model learning unit 14 performs model learning for updating the model parameters….”).

Claims 10-13 are rejected under 35 U.S.C. 103 as being unpatentable over Kelly, Daniel. Disaggregation of domestic smart meter energy data. Diss. Imperial College London, (2016)(“Kelly”) in view of Wang et al, Nonintrusive load monitoring based on deep learning. In International Workshop on Data Analytics for Renewable Energy Integration 2018 Sep 10 (pp. 137-147)(“Wang”) and in view of Filippi, Alessio, et al. "Multi-appliance power disaggregation: An approach to energy monitoring." 2010 IEEE International Energy Conference. IEEE, 2010(“Filippi”) and further in view of Matsato WO 2013145779 A2(“Matsato”). 
Regarding claim 10, Kelly teaches a loading disaggregation system receiving total raw data outputted from a total electricity meter, first respective raw data outputted from a first respective electricity meter corresponding to a first electrical appliance and second respective raw data outputted from a second respective electricity meter corresponding to a second electrical appliance wherein the total raw data, the first respective raw data and the second respective raw data are measured during a unit processing period (Kelly, pg. 158-160, see also table 9.1, 9.2, and 9.4, “We used UK-DALE (see Chapter 4) as our source dataset. Each submeter in UK-DALE samples once every 6 seconds. All houses record aggregate apparent mains power once every 6 seconds… [w]e train one network per target appliance. The target (i.e. the desired output of the net) is the power demand of the target appliance. The input to every net we describe in this chapter is a window of aggregate power demand. The window width is decided on an appliance-by-appliance basis and varies from 128 samples (13 minutes) for the kettle to 1536 samples (2.5 hours) for the dish washer.” Kelly teaches: We used UK-DALE (see Chapter 4) as our source dataset. Each submeter in UK-DALE samples once every 6 seconds. All houses record aggregate apparent mains power once every 6 seconds; we train one network per target appliance. The target (i.e. the desired output of the net) is the power demand of the target appliance (i.e. a loading disaggregation system receiving total raw data outputted from a total electricity meter, first respective raw data outputted from a first respective electricity meter corresponding to a first electrical appliance and second respective raw data outputted from a second respective electricity meter corresponding to a second electrical appliance) The window width is decided on an appliance-by-appliance basis and varies from 128 samples (13 minutes) for the kettle to 1536 samples (2.5 hours) for the dish washer (i.e. wherein the total raw data, the first respective raw data and the second respective raw data are measured during a unit processing period)),
loading disaggregation system comprising: a data processing device configured for processing the total raw data, the first respective raw data and the second respective raw data to generate total aggregated data, first respective verification data and second respective verification data, respectively, wherein the unit processing period comprises a plurality of timestamps (Kelly, pg. 158-160, see also table 9.1, 9.2, and 9.4, “In energy disaggregation, we have the advantage that generating effectively infinite amounts of synthetic aggregate data is relatively easy by randomly combining real appliance activations. (We define an ‘appliance activation’ to be the power drawn by a single appliance over one complete cycle of that appliance. For example, Figure 9.1 shows a single activation for a washing machine.) We trained our nets on both synthetic aggregate data and real aggregate data in a 50:50 ratio. We found that synthetic data acts as a regulariser. In other words, training on a mix of synthetic and real aggregate data rather than just real data appears to improve the net’s ability to generalise to unseen houses. For validation and testing we use only real data (not synthetic)…[t]he input to every net we describe in this chapter is a window of aggregate power demand. The window width is decided on an appliance-by-appliance basis and varies from 128 samples (13 minutes) for the kettle to 1536 samples (2.5 hours) for the dish washer.” Kelly teaches: In energy disaggregation, we have the advantage that generating effectively infinite amounts of synthetic aggregate data is relatively easy by randomly combining real appliance activations. We trained our nets on both synthetic aggregate data and real aggregate data in a 50:50 ratio (i.e. loading disaggregation system comprising: a data processing device configured for processing the total raw data, the first respective raw data and the second respective raw data to generate total aggregated data) For validation we use only real data (not synthetic) (i.e. first respective verification data and second respective verification data, respectively) The window width is decided on an appliance-by-appliance basis and varies from 128 samples (13 minutes) for the kettle to 1536 samples (2.5 hours) for the dish washer (i.e. wherein the unit processing period comprises a plurality of timestamps));
a model building device electrically connected to the data processing device, comprising: a usage pattern-analyzing module configured for receiving the total aggregated data, analyzing the total aggregated data based on a plurality of detection conditions, and generating a plurality of entries of usage pattern information (Kelly, pg. 158-160, see also table 9.4, “Appliance activations are extracted using NILMTK’s Electric.get_activations() method. The arguments we passed to get_activations() for each appliance are shown in Table 9.4. On simple appliances such as toasters, we extract activations by finding strictly consecutive samples above some threshold power. We then throw away any activations shorter than some threshold duration (to ignore spurious spikes). For more complex appliances such as washing machines whose power demand can drop below threshold for short periods during a cycle, NILMTK ignores short periods of sub-threshold power demand.”).
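For illustration only, the threshold-based extraction described in the Kelly passage quoted above (consecutive samples above a threshold power, discarding activations shorter than a threshold duration, and ignoring short sub-threshold periods) can be sketched as follows. This is not NILMTK's implementation of Electric.get_activations(); the function name, parameter names, and sample values are hypothetical.

```python
def extract_activations_sketch(power, on_threshold, min_on_samples, max_gap_samples=0):
    """Return (start, end) index pairs, end exclusive, of appliance activations."""
    activations = []
    start = None    # index where the current activation began
    last_on = None  # index of the most recent above-threshold sample
    for i, p in enumerate(power):
        if p > on_threshold:
            if start is None:
                start = i
            last_on = i
        elif start is not None and i - last_on > max_gap_samples:
            # The sub-threshold gap is too long to ignore: close the activation
            # and keep it only if it meets the minimum duration.
            if last_on + 1 - start >= min_on_samples:
                activations.append((start, last_on + 1))
            start = None
    # Close an activation that is still open at the end of the data.
    if start is not None and last_on + 1 - start >= min_on_samples:
        activations.append((start, last_on + 1))
    return activations


# Hypothetical power readings in watts (values invented for illustration).
readings = [0, 5, 0, 0, 50, 60, 0, 55, 0, 0, 0]
simple = extract_activations_sketch(readings, on_threshold=10, min_on_samples=2)
tolerant = extract_activations_sketch(
    readings, on_threshold=10, min_on_samples=2, max_gap_samples=1
)
```

Here the two detection conditions are the threshold power and the threshold duration; the optional gap tolerance stands in for the handling of short sub-threshold periods described for more complex appliances.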
Kelly does not teach: an information mapping module electrically connected to the usage pattern-analyzing module, configured for mapping the usage pattern information to form a plurality of entries of encoded data.
However, Wang teaches: an information mapping module electrically connected to the usage pattern-analyzing module, configured for mapping the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly (Wang, pgs. 139-146, see also fig. 1 and fig. 2, “The overall framework of the proposed nonintrusive load monitoring model is shown in Fig. 1… After data segmentation, an embedding process is used to map the integer value of aggregated power consumption to a high dimensional vector with an embedding matrix E: $E = [\mathit{voc\_size},\ \mathit{embedding\_size}]$… [a]t each time step t, the encoder calculates $h_t$, the hidden state of time t, from $h_{t-1}$ and $Z_t$, the embedded input of time t. $h_t = f(Z_t, h_{t-1})$, $t = 1, 2, \ldots, N_i$ where $f$ is the inner computation rule of LSTM.” Wang teaches: from fig. 1 the low sampling-rate aggregate power consumption data is electrically connected to segmentation which is electrically connected to $N_1$-length data which is electrically connected to embedding and encode (i.e. an information mapping module electrically connected to the usage pattern-analyzing module) an embedding process is used to map the integer value of aggregated power consumption to a high dimensional vector with an embedding matrix E: $E = [\mathit{voc\_size},\ \mathit{embedding\_size}]$… [a]t each time step t, the encoder calculates $h_t$, the hidden state of time t, from $h_{t-1}$ and $Z_t$, the embedded input of time t. $h_t = f(Z_t, h_{t-1})$, $t = 1, 2, \ldots, N_i$ where $f$ is the inner computation rule of LSTM (i.e. configured for mapping the usage pattern information to form a plurality of entries of encoded data having a plurality of mapping dimensions correspondingly));
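For illustration only, the embedding lookup and the recurrence $h_t = f(Z_t, h_{t-1})$ described in the Wang passage quoted above can be sketched as follows. The function names and numeric values are hypothetical. In particular, the cell used here is a simple tanh stand-in and is not an LSTM; it is included only to show the shape of the recurrence, and for simplicity it assumes the hidden size equals the embedding size.

```python
import math
import random


def build_embedding_matrix(voc_size, embedding_size, seed=0):
    """Create E with voc_size rows and embedding_size columns (random values)."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.1, 0.1) for _ in range(embedding_size)]
        for _ in range(voc_size)
    ]


def embed(power_values, E):
    """Map each integer power value to its row of E (one vector per time step)."""
    return [E[p] for p in power_values]


def toy_cell(z_t, h_prev):
    """Stand-in for f: NOT an LSTM, only an elementwise tanh update."""
    return [math.tanh(z + h) for z, h in zip(z_t, h_prev)]


def encode(Z, f, hidden_size):
    """Apply h_t = f(Z_t, h_{t-1}) over the sequence, starting from h_0 = 0."""
    h = [0.0] * hidden_size
    states = []
    for z_t in Z:
        h = f(z_t, h)
        states.append(h)
    return states


E = build_embedding_matrix(voc_size=10, embedding_size=4)
Z = embed([3, 7, 3], E)  # hypothetical integer power readings
hidden_states = encode(Z, toy_cell, hidden_size=4)
```

Each integer power value is mapped to a vector with embedding_size dimensions, which corresponds to the "plurality of mapping dimensions" discussed in the mapping above.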
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Wang. The motivation to do so would be to use a sequence-to-sequence model with attention, which Wang reports to be more accurate than models based on CNNs and RNNs (Wang, pg. 146, “We develop a deep learning framework based on sequence-to-sequence model and attention mechanism to perform nonintrusive load monitoring. The proposed model introduces the Encoder-Decoder architecture, and uses attention mechanism to extract the most relevant hidden states of encoder to guide the decoding process. These unique features enhance the proposed model’s ability to extract and utilize information dramatically. Tests on houses involved in training and houses not involved in training demonstrate that the proposed model can increase accuracy, recall and F1-score by 10 to 20% and reduce mean absolute error dramatically compared to the deep learning models based on CNN and RNN.”).
Kelly does not teach: and a time series-analyzing module electrically connected to the information mapping module, configured for analyzing time correlation of the encoded data according to the timestamps to generate first synthesized simulation data and second synthesized simulation data wherein the first synthesized simulation data and the second synthesized simulation data correspond to the first electrical appliance and the second electrical appliance, respectively. 
However, Filippi teaches: and a time series-analyzing module electrically connected to the information mapping module, configured for analyzing time correlation of the encoded data according to the timestamps to generate first synthesized simulation data and second synthesized simulation data wherein the first synthesized simulation data and the second synthesized simulation data correspond to the first electrical appliance and the second electrical appliance, respectively (Filippi, pgs. 92-94, see also table 1, table II, and figs. 2 and 3, “Define the $N \times 1$ vector $\tilde{i}_k = [\tilde{i}_{1,k}\ \tilde{i}_{2,k} \ldots \tilde{i}_{N,k}]^T$, and an $N \times N$ correlation matrix R with the element in the m-th row and n-th column, $R_{m,n}$, defined as $R_{m,n} = \frac{1}{T} \int_0^T i_m(t)\, i_n(t)\, dt$… Fig. 2 shows the experimental setup…we chose a set of 11 appliances and use the demonstrator to collect the current waveforms of the different appliances…[s]ample current signatures are shown in Fig. 3…[b]ased on the normalized current signatures $i_n(t)$, we compute the correlation matrix R… [t]he entries of R are shown in Table II.” Filippi teaches: correlation matrix R with the element in the m-th row and n-th column, $R_{m,n}$, defined as $R_{m,n} = \frac{1}{T} \int_0^T i_m(t)\, i_n(t)\, dt$ [i.e. analyzing time correlation of the encoded data according to the corresponding timestamps] [t]he entries of R are shown in Table II [i.e. to generate first synthesized simulation data and second synthesized simulation data] we chose a set of 11 appliances and use the demonstrator to collect the current waveforms of the different appliances, sample current signatures are shown in Fig. 3 [i.e. wherein the first synthesized simulation data and the second synthesized simulation data correspond to a first electrical appliance and a second electrical appliance used during the unit processing period, respectively]).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Filippi. The motivation to do so would be to incorporate correlation among appliances when designing disaggregation/detection algorithms (Filippi, pg. 91, “The approach we take in this paper relies on steady state analysis where observations are aggregated over a complete period of the electrical waveform. In particular, a current sensor measures the total current drawn which is a linear summation of the individual currents flowing through active appliances corrupted by additive noise. We translate this into a vector system model after performing matched filtering. Two forms of linear detectors - Zero Forcing (ZF) and Minimum Mean Square Error (MMSE) are then used to estimate the active/inactive states of the appliances, and in turn their power consumptions.”).
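For illustration only, the correlation entries $R_{m,n}$ quoted from Filippi above can be approximated numerically from sampled current waveforms. The sketch below is not taken from Filippi or from any cited reference; the function name `correlation_matrix`, the 50 Hz test waveforms, and the sample count are assumptions chosen to make the example self-contained.

```python
import numpy as np

def correlation_matrix(currents, dt):
    """Approximate R[m, n] = (1/T) * integral_0^T i_m(t) * i_n(t) dt.

    currents: array-like of shape (M, N), one row of N samples per appliance,
              covering exactly one period T = N * dt (assumption).
    dt:       sampling interval in seconds.
    """
    currents = np.asarray(currents, dtype=float)
    n_samples = currents.shape[1]
    period = n_samples * dt
    # Riemann-sum approximation of the integral, divided by the period T.
    return (currents @ currents.T) * dt / period

# Two hypothetical appliance currents over one 50 Hz cycle.
T = 0.02
N = 1000
t = np.arange(N) * (T / N)
i1 = np.sin(2 * np.pi * 50 * t)
i2 = np.cos(2 * np.pi * 50 * t)

R = correlation_matrix([i1, i2], T / N)
print(R)
```

A diagonal entry gives the mean-square current of one appliance, while an off-diagonal entry near zero indicates that two current signatures are close to orthogonal over the period.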
Kelly does not teach: and a model evaluation device electrically connected to the data processing device and the model building device, configured for receiving the first respective verification data and the second respective verification data from the data processing device, receiving the first synthesized simulation data and the second synthesized simulation data from the model building device, comparing the first respective verification data with the first synthesized simulation data to obtain a first similarity between the first respective verification data and the first synthesized simulation data, and comparing the second respective verification data with the second synthesized simulation data to obtain a second similarity between the second respective verification data and the second synthesized simulation data. 
However, Matsato teaches: and a model evaluation device electrically connected to the data processing device and the model building device, configured for receiving the first respective verification data and the second respective verification data from the data processing device, receiving the first synthesized simulation data and the second synthesized simulation data from the model building device, comparing the first respective verification data with the first synthesized simulation data to obtain a first similarity between the first respective verification data and the first synthesized simulation data, and comparing the second respective verification data with the second synthesized simulation data to obtain a second similarity between the second respective verification data and the second synthesized simulation data (Matsato, see also fig. 28, paras. 0469-0474, “The state estimation unit 12 has the evaluation portion 71 and the estimation portion 72, and performs state estimation for estimating operating states of a plurality of M household electrical appliances #1, #2,..., and #m by using the measurement waveform $Y_t$ from the data acquisition unit 11, and the waveform model stored in the model storage unit 13. In other words, the evaluation portion 71 obtains an evaluation value E where an extent that a current waveform Y supplied from the data acquisition unit 11 is observed is evaluated so as to be supplied to the estimation portion 72, in each household electrical appliance model #m forming the waveform model as the overall models $\phi$ stored in the model storage unit 13. The estimation portion 72 estimates an operating state $C_{t,k}^{(m)}$ at the time point t of each household electrical appliance indicated by each household electrical appliance model #m by using the evaluation value E supplied from the evaluation portion 71, for example, according to an integer programming, so as to be supplied to the model learning unit 14, the label acquisition unit 15, and the data output unit 16. Here, the estimation portion 72 solves an integer programming program of Math (30) according to the integer programming and estimates the operating state $C_{t,k}^{(m)}$ of the household electrical appliance #m... minimize $E = \left| Y_t - \sum_{m=1}^{M} \sum_{k=1}^{k^{(n)}} W_k^{(m)} \cdot C_{t,k}^{(m)} \right|$ subject to $0 \le C_{t,k}^{(m)} \in \text{integer}$ Here…E indicates an error of the measurement waveform $Y_t$ and a current waveform $\sum \sum W_k^{(m)} C_{t,k}^{(m)}$ which is sum total data observed in the waveform model as the overall models $\phi$ and the estimation portion 72 obtains an operating state $C_{t,k}^{(m)}$
which minimizes the error E.” Matsato teaches: the evaluation portion 71 obtains an evaluation value E where an extent that a current waveform Y supplied from the data acquisition unit 11 is observed is evaluated so as to be supplied to the estimation portion 72, in each household electrical appliance model #m forming the waveform model as the overall models $\phi$ stored in the model storage unit 13 (i.e. and a model evaluation device electrically connected to the data processing device and the model building device, configured for receiving the first respective verification data and the second respective verification data from the data processing device, receiving the first synthesized simulation data and the second synthesized simulation data from the model building device). The estimation portion 72 estimates an operating state $C_{t,k}^{(m)}$ at the time point t of each household electrical appliance indicated by each household electrical appliance model #m by using the evaluation value E supplied from the evaluation portion 71. The estimation portion 72 solves an integer programming program of Math (30) according to the integer programming and estimates the operating state $C_{t,k}^{(m)}$ of the household electrical appliance #m... minimize $E = \left| Y_t - \sum_{m=1}^{M} \sum_{k=1}^{k^{(n)}} W_k^{(m)} \cdot C_{t,k}^{(m)} \right|$ subject to $0 \le C_{t,k}^{(m)} \in \text{integer}$ Here E indicates an error of the measurement waveform $Y_t$ and a current waveform $\sum \sum W_k^{(m)} C_{t,k}^{(m)}$ (i.e. comparing the first respective verification data with the first synthesized simulation data to obtain a first similarity between the first respective verification data and the first synthesized simulation data, and comparing the second respective verification data with the second synthesized simulation data to obtain a second similarity between the second respective verification data and the second synthesized simulation data)).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kelly in view of Wang and in view of Matsato. The motivation to do so would be to use pre-stored models of household consumption usage and only update the parameters associated with the pre-stored models rather than the models themselves (Matsato, paras. 0054-0058, “The overall models $\phi$ include household electrical appliance models #1 to #M which are M models (representing current consumption) of a plurality of household electrical appliances. The parameters $\phi$ of the overall models include a current waveform parameter indicating current consumption for each operating state of a household electrical appliance indicated by the household electrical appliance model #m…The model parameters $\phi$ of the overall models stored in the model storage unit 13 are referred to by the evaluation portion 21 and the estimation portion 22 of the state estimation unit 12, the label acquisition unit 15, and the data output unit 16, and are updated by a waveform separation learning portion 31, a variance learning portion 32, and a state variation learning portion 33 of the model learning unit 14…The model learning unit 14 performs model learning for updating the model parameters….”).
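For illustration only, the minimization quoted from Matsato's Math (30) can be sketched as an exhaustive search over candidate operating states. This is not Matsato's implementation: the function name `estimate_states` is hypothetical, the states $C_{t,k}^{(m)}$ are restricted to 0 or 1 (an assumption narrower than the quoted non-negative integer constraint), the Euclidean norm is assumed for the error E, and the example waveforms are invented.

```python
import itertools
import numpy as np

def estimate_states(y, W):
    """Brute-force search for binary states C minimizing E = |y - sum_m sum_k W[m][k] * C[m][k]|.

    y: measured waveform Y_t as a 1D sequence.
    W: list over appliances m; W[m] is a 2D array whose row k is the
       current waveform W_k^(m) for state k of appliance m.
    Returns (states, error), where states[m][k] is 0 or 1.
    """
    y = np.asarray(y, dtype=float)
    sizes = [len(w) for w in W]
    stacked = np.vstack([np.asarray(w, dtype=float) for w in W])

    best_err, best_c = np.inf, None
    # Enumerate every 0/1 assignment; feasible only for small problems.
    for bits in itertools.product((0, 1), repeat=sum(sizes)):
        c = np.array(bits)
        err = np.linalg.norm(y - c @ stacked)  # Euclidean norm (assumption)
        if err < best_err:
            best_err, best_c = err, c

    # Split the flat assignment back into one list per appliance.
    states, start = [], 0
    for size in sizes:
        states.append(best_c[start:start + size].tolist())
        start += size
    return states, float(best_err)

# Hypothetical waveforms: appliance 1 has one state, appliance 2 has two.
W = [
    [[1.0, 0.0, 0.0]],
    [[0.0, 2.0, 0.0], [0.0, 0.0, 3.0]],
]
y = [1.0, 0.0, 3.0]

states, err = estimate_states(y, W)
print(states, err)
```

The exhaustive loop stands in for an integer programming solver and grows exponentially with the number of states, so it is suitable only for showing how the evaluation value E compares a measured waveform with a modeled sum.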
Regarding claim 11, Kelly in view of Wang and in view of Filippi and further in view of Matsato teaches the loading disaggregation system according to claim 10, wherein the loading disaggregation system operates in a model building mode or a model application mode, the loading disaggregation system operating at a training stage or a testing stage when the loading disaggregation system operates in the model building mode (Kelly, pg. 158, “We trained our nets on both synthetic aggregate data and real aggregate data in a 50:50 ratio. We found that synthetic data acts as a regulariser. In other words, training on a mix of synthetic and real aggregate data rather than just real data appears to improve the net’s ability to generalise to unseen houses. For validation and testing we use only real data (not synthetic).”), wherein the loading disaggregation system enables the data processing device, the model building device and the model evaluation device in the model building mode (Matsato, paras. 0050-52, see also fig. 3, “In other words, in Fig. 3, the state estimation unit 12 has an evaluation portion 21… [t]he evaluation portion 21 obtains an evaluation value E where the current waveform Y supplied (to the state estimation unit 12) from the data acquisition unit 11 is observed in each combination of states of a plurality of household electrical appliance models #1 to #M forming the overall models $\phi$ stored in the model storage unit 13….”), wherein the loading disaggregation system enables the data processing device and the model building device, and disables the model evaluation device in the model application mode (Matsato, paras. 0058-60, see also fig. 3, “The model learning unit 14 performs model learning for updating the model parameters of the overall models stored in the model storage unit 13, using the current waveform Y supplied from the data acquisition unit 11 and the estimation result (the operating state of each household electrical appliance) of the state estimation supplied from (the estimation portion 22…[i]n other words, in Fig. 3, the model learning unit 14 includes the waveform separation learning portion 31, the variance learning portion 32, and the state variation learning portion 33. The waveform separation learning portion 31 performs waveform separation learning for obtaining (updating) a current waveform parameter which is the model parameter by using the current waveform Y supplied (to the model learning unit 14) from the data acquisition unit 11 and the operating state of each household electrical appliance supplied from (the estimation portion 22…and updates the current waveform parameter stored in the model storage unit 13 to a current waveform parameter obtained by the waveform separation learning.” Matsato teaches: the model parameters of the overall models stored in the model storage unit 13, using the current waveform Y supplied from the data acquisition unit 11 and the estimation result of the state estimation supplied from the estimation portion 22 (i.e. wherein the loading disaggregation system enables the data processing device and the model building device, and disables the model evaluation device in the model application mode)).
Regarding claim 12, Kelly in view of Wang and in view of Filippi and further in view of Matsato teaches the loading disaggregation system according to claim 10, wherein the data processing device comprises:  a data processing module, comprising: a data sampling module configured for sampling the total raw data, the first respective raw data and the second respective raw data based on a sampling cycle to generate a plurality of entries of total sampling data, a plurality of entries of first respective sampling data and a plurality of entries of second respective sampling data, respectively(Kelly, pg. 67, see also fig. 4.2 and table 4.1, “We present the first open access UK dataset with a high temporal resolution. We recorded from five houses. Every six seconds we recorded the active power drawn by individual appliances and the whole-house apparent power demand. Additionally, in three houses, we sampled the whole-house voltage and current at 44.1 kHz (down-sampled to 16 kHz for storage) and also calculated the active power, apparent power and RMS voltage at 1 Hz. In House 1, we recorded for 3.5 years and individually recorded from almost every single appliance in the house resulting in a recording of 54 separate channels (although less channels were recorded towards the start of the dataset). We will continue to record from this house for the foreseeable future. We recorded from the four other houses for several months; each of these houses recorded between 5 to 26 channels of individual appliance data. 
Figure 4.2 provides an overview of the system design and Table 4.1 summarises the dataset.”); and a preprocessing module electrically connected to the data sampling module, configured for preprocessing the total sampling data, the first respective sampling data and the second respective sampling data to generate a plurality of entries of total preprocessed data, a plurality of entries of first respective preprocessed data and a plurality of entries of second respective preprocessed data, wherein the entries of the total sampling data, the entries of the first respective sampling data, the entries of the second respective sampling data, the entries of the total preprocessed data, the entries of the first respective preprocessed data and the entries of the second respective preprocessed data are equal in quantity (Kelly, pgs. 94-95, see also fig. 5.2, “An overview of the NILMTK pipeline is shown in Figure 5.2. Datasets are first converted to NILMTK’s standard data format which is based on the HDF5 file format. Then users can invoke NILMTK’s dataset statistics functions to identify issues with the dataset, or to compute informative statistics. NILMTK also provides a set of pre-processing functions to clean up imperfections or to re-sample data to a different timebase (e.g. downsampling 1 Hz data to 1 minutely data).”).
Regarding claim 13, Kelly in view of Wang and in view of Filippi and further in view of Matsato teaches the loading disaggregation system according to claim 12, wherein the data processing device further comprises: a data balance module electrically connected to the preprocessing module, configured for performing bootstrapping on the total preprocessed data, the first respective preprocessed data and the second respective preprocessed data according usage behavior of the first electrical appliance and the second electrical appliance to generate a plurality of entries of total balanced data, a plurality of entries of first respective balanced data and a plurality of entries of second respective balanced data, wherein the number of the entries of the total balanced data is greater than the number of the entries of the total preprocessed data, wherein the entries of the total balanced data, the entries of the first respective balanced data and the entries of the second respective balanced data are equal in quantity (Kelly, pgs. 94-95, see also fig. 5.2, “An overview of the NILMTK pipeline is shown in Figure 5.2. Datasets are first converted to NILMTK’s standard data format which is based on the HDF5 file format. Then users can invoke NILMTK’s dataset statistics functions to identify issues with the dataset, or to compute informative statistics. NILMTK also provides a set of pre-processing functions to clean up imperfections or to re-sample data to a different timebase (e.g. downsampling 1 Hz data to 1 minutely data)” & see also Kelly, pg. 158, “In energy disaggregation, we have the advantage that generating effectively infinite amounts of synthetic aggregate data is relatively easy by randomly combining real appliance activations. (We define an ‘appliance activation’ to be the power drawn by a single appliance over one complete cycle of that appliance. For example, Figure 9.1 shows a single activation for a washing machine.)
We trained our nets on both synthetic aggregate data and real aggregate data in a 50:50 ratio. We found that synthetic data acts as a regulariser. In other words, training on a mix of synthetic and real aggregate data rather than just real data appears to improve the net’s ability to generalise to unseen houses…”).  
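For illustration only, the passage quoted from Kelly (randomly combining real appliance activations into synthetic aggregate data, then training on real and synthetic data in a 50:50 ratio) can be sketched as follows. This is not Kelly's code and does not use NILMTK; the function names `synth_aggregate` and `training_batch`, the activation values, and the rule of placing exactly one activation per appliance are assumptions.

```python
import random

def synth_aggregate(activations, length, rng):
    """Build one synthetic aggregate window by summing randomly placed activations.

    activations: dict mapping appliance name -> list of activations,
                 each activation being a list of power samples.
    length:      number of samples in the output window.
    rng:         random.Random instance, for reproducibility.
    """
    aggregate = [0.0] * length
    for acts in activations.values():
        act = rng.choice(acts)[:length]          # truncate if longer than the window
        start = rng.randrange(0, length - len(act) + 1)
        for offset, power in enumerate(act):
            aggregate[start + offset] += power
    return aggregate

def training_batch(real_windows, activations, batch_size, length, rng):
    """Alternate real and synthetic windows to obtain a 50:50 mix."""
    batch = []
    for i in range(batch_size):
        if i % 2 == 0:
            batch.append(("real", rng.choice(real_windows)))
        else:
            batch.append(("synthetic", synth_aggregate(activations, length, rng)))
    return batch

# Hypothetical activations (watts) and one hypothetical real window.
rng = random.Random(0)
activations = {
    "kettle": [[2000.0, 2000.0]],
    "fridge": [[100.0, 100.0, 100.0]],
}
real_windows = [[50.0] * 10]

agg = synth_aggregate(activations, 10, rng)
batch = training_batch(real_windows, activations, 4, 10, rng)
print(agg)
print([label for label, _ in batch])
```

Because each synthetic window is a sum of known per-appliance activations, the per-appliance ground truth is available for every generated aggregate, which is what makes this kind of data usable for training.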
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Adam Clark Standke whose telephone number is (571)270-1806. The examiner can normally be reached 10AM-7PM M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Adam Clark Standke
Assistant Examiner
Art Unit 2129



/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129