Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Amendments
This action is in response to amendments filed November 30th, 2021, in which Claims 1, 3-10, 16, and 17 are amended.  Claim 2 is cancelled.  The amendments have been entered, and Claims 1 and 3-17 are currently pending, with Claims 5-9 and 14-16 currently withdrawn due to a restriction requirement.  Claims 1, 3, 4, 10-13, and 17 are examined.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1, 3, 4, 10-13, and 17 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Specifically, Claims 1, 10, and 17 recite a machine learning system … for a fraud detection system … comprising:  … obtain[ing] a dataset comprising a plurality of attributed sequences based on user behavior associated with clickstreams.  However, the specification makes clear that the “fraud detection” and “clickstream” embodiments of the 
The dependent claims are rejected for inheriting the new matter of the independent claims upon which they rely.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3, 4, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Das, US PG Pub 2021/0035141 (with an effective filing date of February 23rd, 2018) in view of Isaiah, US PG Pub 2019/0019193 (with a filing date of July 13th, 2017) and further in view of Donahue, “Long-term Recurrent Convolutional Networks for Visual Recognition and Description” (arxiv.org version 4 cited in current PTO-892, not the version cited in the PTO-892 of 9/2/2021).
Regarding Claim 1, Das teaches a machine learning system for embedding attributed sequence data comprising an attribute data part having a fixed number of attribute data elements (Das, [0082], “if each interval contains 14 features per week and the financial device holder has transacted for a month (assuming 1 month = 4 weeks), the dimensions of the interval for the financial device holder would be 4x14.  In this model, padding is not needed to make the subintervals in the data map uniform, since all subintervals have the same size”) and a sequence data part having a variable number of sequence data elements (Das, [0084], “The various layers of the CNN model may be applied to each of the d intervals of the n users to produce a series”) into a fixed length feature representation (Das, [0087] “LSTM may be used to aggregate the temporal features … to compute an output vector” see Fig. 7, the output from the LSTM 312 to be classified is a fixed length feature representation) for a fraud detection system (Das, [0003], “Consumers use financial devices (e.g. credit cards, debit cards, etc.) to complete financial transactions with merchants … There are many possible factors for why a particular financial device may become a primary financial device, including … the reliability and convenience of fraud detection” that is, a system for predicting financial primary of a financial device is part of a system that also provides fraud detection, e.g. a fraud detection system) wherein the machine learning system comprises a multilayer feedforward network having an attribute data input layer and an attribute vector output layer which comprises a first predetermined number of units, operatively coupled to a long short-term memory (LSTM) network which comprises a second number of hidden units (Das, Figs. 6 & 7, elements 302-306, see [0081], “a CNN is made up of a number of algorithmic layers, including an input layer 302, a convolution layer 304, a pooling layer 306” & [0087], “the CNN predictive model architecture 300 may be reconfigured to pass the one-dimensional extracted feature array V to an LSTM layer 312 … LSMT takes input … to compute a hidden vector sequence”), the machine learning system comprising:  a computing device; and a computer-readable storage medium comprising a set of instructions that upon execution by the computing device (Das, Claim 1, “a computer implemented method”) cause the machine learning system to:  obtain a dataset comprising a plurality of attributed sequences based on user behavior (Das, [0083-0084], “to represent the financial device holder’s transaction history over the entire sample time period” with “each of d intervals for n users” see Fig. 6, element 302) and for each attributed sequence in the dataset, train the multilayer feedforward network using the attribute data part of the attributed sequence … and train the LSTM network using the sequence part of the attributed sequence (Das, Abstract, “the predictive model trained based on historic transaction data”) wherein the training of the multilayer feedforward network is coupled with training the LSTM network such that, in use, the machine learning system is configured to output a fixed-length feature representation of input attributed sequence data which encodes dependencies between different attribute elements in the attribute data part, dependencies between different sequence data elements in the sequence data part, and dependencies between attribute data elements and sequence data elements within the attributed sequence data (Das, [0087], “LSTM may be used to aggregate the temporal aspects
Das does not teach that the attributed sequences are based on user behavior associated with clickstreams, but Isaiah teaches the analysis of financial transaction data “during online transactions” (Isaiah, Abstract), e.g. user behavior associated with clickstreams.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to, in Das, use transaction data that comes from online transactions, as does Isaiah.  The motivation to do so is that many financial credit card transactions are performed online (see Isaiah, [0037]).
Das further does not teach, but Donahue does teach, to train the multilayer feedforward network … via backpropagation with respect to a first objective function and to train the LSTM network … via backpropagation with respect to a second objective function (Donahue, pg. 4, 2nd column, 1st paragraph, “We train our LRCN models using stochastic gradient descent, with backpropagation used to compute the gradient … over mini-batches” where, pg. 4, Fig. 3 has the same structure as the CNN-LSTM of Das and where each mini-batch defines its own objective function, e.g. error on that batch).  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to train the CNN-LSTM of Das in the same manner that the CNN-LSTM of Donahue is trained (e.g. using backpropagation over mini-batches to optimize the loss function).  The motivation to do so is that this procedure successfully finds the parameters of the given CNN-LSTM model.
Regarding Claim 3, the Das/Isaiah/Donahue combination of Claim 1 teaches the machine learning system of Claim 1 (and thus the rejection of Claim 1 is incorporated).  The combination further teaches wherein an output of the attribute vector output layer is operatively coupled to an input of the attribute vector input layer of a recurrent neural network (see Das, Fig. 7, between elements 306 and 312 and equivalently Donahue, pg. 4, Fig. 3).
Regarding Claim 4, the Das/Isaiah/Donahue combination of Claim 3 teaches the machine learning system of Claim 3 (and thus the rejection of Claim 3 is incorporated).  Das further teaches wherein the attribute vector input layer of the recurrent neural network comprises a hidden state of the LSTM network at a first evaluation step (Das, [0087], (Formula 9) is both an input layer of the attribute vector V, i.e. “to pass the one-dimensional extracted feature V to an LSTM layer” and a hidden state $h_t$ of the LSTM network at each evaluation step); the first predetermined number of attribute vector output layer units is equal to the second predetermined number of hidden units (Claim 1 requires only that the units comprise a predetermined number, and each set of units comprises at least one predetermined unit (as well as other units), and one is equal to one), and the fixed-length feature representation of input attributed sequence data comprises a hidden state of the LSTM network at a final evaluation step (Das, [0087], (Formula 10), the output $o_t$ is a hidden state of the network that is output after H iterations, see Fig. 7).
Claim 10 recites the method that is performed by the system of Claim 1, and is thus rejected for reasons set forth in the rejection of Claim 1.  Similarly, Claim 17 recites a computer program product comprising the instructions recited in Claim 1, and is thus also rejected for reasons set forth in the rejection of Claim 1.

Claims 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Das, in view of Isaiah and Donahue, and further in view of Bhat et al., “Identifying Nontechnical Power Loss via Spatial and Temporal Deep Learning.”
Regarding Claim 11, the Das/Isaiah/Donahue combination of Claim 10 teaches the method of Claim 10 (and thus the rejection of Claim 10 is incorporated).  Das further teaches wherein the first predetermined number of attribute vector output units is equal to the second predetermined number of LSTM network hidden units (Claim 10 requires only that the units comprise a predetermined number, and each set of units comprises at least one predetermined unit (as well as other units), and one is equal to one).  
Das does not teach, but Bhat teaches wherein the multilayer feedforward network comprises:  an encoder having an encoder input layer which comprises the attribute data input layer and an encoder output layer which comprises the attribute vector output layer (Bhat, pg. 273, 2nd column, last paragraph – pg. 274, 1st column, 1st paragraph & Fig. 3, “An autoencoder is an unsupervised learning algorithm that tries to replicate the input … This forces the model to learn features that can be used to effectively represent the data” where the “code” is the output of the encoder/learned features) and a decoder having a decoder input layer coupled to the encoder output layer, and a decoder output layer which comprises a reconstructed estimate of an input to the encoder input layer, and wherein the first objective function comprises a distance measure between the input to the encoder input layer and the reconstructed estimate (Bhat, pg. 273, 2nd column, last paragraph – pg. 274, 1st column, 1st paragraph & Fig. 3, “An autoencoder is an unsupervised learning algorithm that tries to replicate the input … the model then tries to recreate the signal as close as possible to the original signal”) and training the multi-layer feedforward neural network comprises: iteratively performing steps of forward- and back-propagation with the attribute data part of the attributed sequence as input to the encoder input layer until the distance measure satisfies a first convergence target (Bhat, pg. 274, 1st column, 2nd paragraph, “The autoencoder is trained using the backpropagation algorithm” with pg. 275, 1st column, 1st paragraph, “for 400 iterations”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the autoencoder of Bhat to provide features to the LSTM of Das rather than the CNN and to train the autoencoder in the recited manner.  The motivation to do so is that the autoencoder will “learn features that can be used to effectively represent the data” (Bhat, pg. 275, 1st column, 1st paragraph) – i.e. the CNN of Das and the autoencoder of Bhat are two possible choices of feature extractor.
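For illustration only, the recited training loop (forward- and back-propagation repeated until a distance measure between the encoder input and the reconstructed estimate satisfies a convergence target) can be sketched as follows.  This is a minimal sketch and is not taken from Bhat, Das, or the application: the linear one-unit code layer, the function name train_autoencoder, the learning rate, the initial weights, and the convergence target are all assumptions chosen for brevity.

```python
def train_autoencoder(data, lr=0.05, target=1e-4, max_iters=5000):
    """Train a tiny linear autoencoder with a one-unit code layer.

    Hypothetical illustration: every hyperparameter here is an assumption.
    Returns the encoder weights, decoder weights, and the loss history.
    """
    dim = len(data[0])
    w = [0.1] * dim  # encoder weights (assumed initial values)
    v = [0.1] * dim  # decoder weights (assumed initial values)
    scale = 1.0 / (len(data) * dim)
    history = []
    for _ in range(max_iters):
        grad_w = [0.0] * dim
        grad_v = [0.0] * dim
        loss = 0.0
        for x in data:
            # Forward propagation: encoder output ("code"), then reconstruction.
            code = sum(wk * xk for wk, xk in zip(w, x))
            err = [vj * code - xj for vj, xj in zip(v, x)]
            loss += sum(e * e for e in err)
            # Back-propagation: error reaching the code unit, then each weight.
            back = sum(e * vj for e, vj in zip(err, v))
            for j in range(dim):
                grad_v[j] += 2.0 * err[j] * code
                grad_w[j] += 2.0 * back * x[j]
        loss *= scale  # mean-squared distance between input and reconstruction
        history.append(loss)
        if loss < target:  # distance measure satisfies the convergence target
            break
        w = [wk - lr * scale * g for wk, g in zip(w, grad_w)]
        v = [vj - lr * scale * g for vj, g in zip(v, grad_v)]
    return w, v, history


# Usage: two-element attribute vectors that lie on a line, so a one-unit
# code layer is enough to represent them.
samples = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]
encoder, decoder, losses = train_autoencoder(samples)
```

In this sketch the stopping test plays the role of the “first convergence target,” while a fixed iteration cap (such as the 400 iterations quoted from Bhat) is the alternative stopping rule represented by max_iters.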
Regarding Claim 12, the Das/Isaiah/Donahue/Bhat combination of Claim 11 teaches the training method of Claim 11 (and thus the rejection of Claim 11 is incorporated).  Das further teaches wherein the second objective function comprises a likelihood measure of incorrect prediction of a next sequence item at each one of a plurality of training step times of the LSTM (Das, Abstract, “output a probability of primary financial device change” & [0084], “to produce a series of $p_n$ probabilities” e.g. prediction at each of a plurality of step times & [0086], “when building the neural network predictive model based on historical data, p can be compared to a known (or predetermined label) value of 0 or 1 to calculate an error rate”) and iteratively repeating the plurality of training time steps (Das, [0086], “the parameters may be … adjusted/learned through successive iterations.  This provides the benefit of a self-improving predictive model”) until the likelihood satisfies a second convergence target (Donahue, pg. 4, 1st column, last paragraph, “we optimize parameters”), each iteration comprising:  at a first training time step, copying the output of the attribute vector output layer to a hidden state of the LSTM network (Donahue, pg. 4, Fig. 3, output of the CNN/feature extractor goes into the LSTM, equivalent to Das, [0087], (Formula 9)), and at a final training time step, computing the likelihood measure (Donahue, pg. 4, 1st column, last paragraph, “we optimize parameters to minimize the expected log likelihood”).  The training steps of Donahue have already been incorporated into the combination invention in the rejection of the independent claim.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Das, in view of Isaiah, Donahue and Bhat, and further in view of “Loss Functions” from the “ML Cheatsheet” at ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html (verified online at least as of January, 2018, by the Internet Archive).
Regarding Claim 13, the Das/Isaiah/Donahue/Bhat combination of Claim 12 teaches the training method of Claim 12 (and thus the rejection of Claim 12 is incorporated).  The combination further teaches wherein the likelihood measure comprises a categorical cross-entropy loss function (Donahue, pg. 4, 2nd column, 1st paragraph shows the loss/likelihood as the log of the probability, which is identified in “Loss Functions” as “Cross-Entropy,” which “measures the performance of a classification model whose output is a probability,” such as that of Das/Donahue, and is shown as the log in the “Code” box).  The combination does not show wherein the distance measure comprises a mean-squared error loss function, but “Loss Functions” shows a mean-squared-error loss function (“Loss Functions,” 2nd-to-last page).  It would have been obvious to .
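For reference, the two loss functions named in Claim 13 can be written in their standard textbook forms as below.  This is a minimal sketch and is not reproduced from the “Loss Functions” page, Donahue, or the application; the function names and the small eps guard against log(0) are assumptions.

```python
import math


def mean_squared_error(inputs, reconstruction):
    """Distance measure: mean of squared element-wise differences."""
    return sum((x - r) ** 2 for x, r in zip(inputs, reconstruction)) / len(inputs)


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Likelihood measure: negative log of the probability of the true class.

    eps is an assumed guard so that a predicted probability of 0 does not
    raise a math domain error.
    """
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))


# Usage: a reconstruction that misses one of two attribute values by 2, and a
# three-class prediction that puts probability 0.5 on the true class.
mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
cce = categorical_cross_entropy([0, 1, 0], [0.25, 0.5, 0.25])
```

With a one-hot label, the cross-entropy sum reduces to the negative log of the single probability assigned to the true class, which is the “log of the probability” form relied upon above.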
Response to Arguments
Applicant’s arguments filed February 10th, 2022 have been fully considered, but are not fully persuasive.
Applicant’s amendments are such that 35 U.S.C. 112(f) is not invoked by the current claim language.
Applicant’s arguments regarding the prior art rejections of all the claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN M SMITH whose telephone number is (469)295-9104. The examiner can normally be reached Monday - Friday, 8:30 am - 5 pm Central.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BRIAN M SMITH/Primary Examiner, Art Unit 2122