DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner Suggestion
With respect to claim 5, Examiner notes claim limitation “wherein the cache stores only one set of encoded states corresponding to the user account at a time” may be written in such a way that narrows claim scope more than Applicant intends. Examiner suggests rephrasing the “corresponding to the user account at a time” to something broader, such as “corresponding to a given user account at a time”.
 
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 9 
With respect to claim 9, it is unclear as to what is meant by “previously generated” features, as they are “from the current” payment transaction. The claim as written is unclear as to what extracted features were previously generated via claim 1.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Based upon consideration of all relevant factors with respect to the claims as a whole, claims 1-20 are determined to be directed to an abstract idea. The Examiner has identified system claim 1 as the claim that represents the claimed invention for analysis and is analogous to method claim 10 and non-transitory computer readable storage medium of claim 17 (i.e., same rationale of claim 1 (below), is similarly applied to claims 10 and 17 (mutatis mutandis)). The rationale for the aforementioned determination of patent ineligibility under 35 USC §101 is explained below:

With respect to Step 1 of the 2019 PEG analysis, the claims are directed to a system, an article of manufacture, or a method, each of which is a statutory category of invention (Step 1 of 2019 PEG analysis: YES).

With respect to Step 2A Prong I of the 2019 PEG analysis, claims 1-20 recite, as a whole, a method of organizing human activity because the claims recite (additional elements, emphasized in bold, are considered parsed from the remaining elements, which recite the abstract idea):

A system, [comprising]: one or more hardware processors; and a memory storing computer-executable instructions, [that in response to execution by the one or more] hardware processors, causes the system to perform operations comprising: receiving a request to process a current payment transaction between a payment provider and a user having a user account with the payment provider; determining a state of a recurrent neural network (RNN) fraud model for the user account by accessing a cache storing a set of encoded states for a plurality of nodes included in the RNN fraud model, the set of encoded states being previously calculated based on execution of the RNN fraud model with respect to a prior transaction of the user account, wherein the prior transaction is an immediately preceding transaction to the current payment transaction; executing the RNN fraud model based on data associated with the current payment transaction and the set of encoded states stored in the cache, wherein the executing the RNN fraud model includes encoding a new set of encoded states for the plurality of nodes; updating the cache to store the new set of encoded states in place of the set of encoded states; and determining a risk level corresponding to the current payment transaction based on an output of the executing the RNN fraud model.

Under broadest reasonable interpretation, these are fundamental economic principles and/or practices of mitigating risk by determining risk levels of payment transactions / transaction requests. Thus, the claim recites an abstract idea (Step 2A Prong I: Yes).

Addressing Step 2A Prong II of the 2019 PEG analysis, this judicial exception is not integrated into a practical application. The claims as a whole merely describe how to generally apply the abstract idea using generic computer components, including a system, processor, [non-transitory] memory, and cache (See MPEP 2106.05(f)), which amounts to no more than mere instructions to implement the abstract idea by adding the words “apply it” (or an equivalent). Furthermore, the RNN and the updated encoded states merely generally link the use of the judicial exception to a particular technological environment (See MPEP 2106.05(h)). Additionally, the sending / receiving of request information adds insignificant extra-solution activity to the judicial exception (See MPEP 2106.05(g)). Simply implementing the abstract idea on the aforementioned generic hardware is not a practical application of the abstract idea. Accordingly, when considered separately and as an ordered combination, these additional elements do not integrate the abstract idea into a practical application. The claims are directed to an abstract idea. (Step 2A Prong II: NO, the additional claimed elements are not integrated into a practical application).

Addressing Step 2B of the 2019 PEG analysis, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As previously discussed with respect to Step 2A Prong II, the claims as a whole merely describe how to generally apply the abstract idea using generic computer components, including a system, processor, [non-transitory] memory, and cache (See MPEP 2106.05(f)), which amounts to no more than mere instructions to implement the abstract idea by adding the words “apply it” (or an equivalent). Furthermore, the RNN and the updated encoded states merely generally link the use of the judicial exception to a particular technological environment (See MPEP 2106.05(h)). The step of the system receiving the request, previously considered extra-solution activity, has been further evaluated here and determined to be well-understood, routine, and conventional activity in the field. The specification does not provide any indication that the claimed receiving of the request is performed by anything other than a generic form of data transmission, and the court decision in OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (See MPEP 2106.05(d)(II)) indicates that a computer merely sending/receiving information over a network is a well-understood, routine, and conventional function when claimed at a high level of generality (as is the case here). Accordingly, when considered separately and as an ordered combination, nothing in the claim adds significantly more (i.e., an inventive concept) to the abstract idea. Thus, claims 1, 10, and 17 are not patent eligible. (Step 2B: NO. The claims do not amount to significantly more).

With respect to the dependent claims, the dependent claims have been given the full analysis, including analyzing the additional limitations both individually and as an ordered combination. The dependent claims, when analyzed both individually and in combination, are also held to be patent ineligible under 35 U.S.C. 101 because of the same reasoning as above and because the additional limitations recited fail to establish that the claims are not directed to an abstract idea.

With respect to claims 2-3, 6-9, 12-14, and 18-19, they also merely generally link the use of the judicial exception to a particular technological environment of RNNs (See MPEP 2106.05(h)), and accordingly do not indicate that the previously mentioned additional elements are successfully integrated / amounting to significantly more, either alone or in combination. For these reasons these dependent claims are also not patent eligible.

With respect to claims 4 and 11, they do not recite any further additional elements outside of the abstract idea, and accordingly do not indicate that the previously mentioned additional elements are successfully integrated / amounting to significantly more, either alone or in combination. For these reasons these dependent claims are also not patent eligible.

With respect to claims 5 and 16, they generally apply the generic cache (See MPEP 2106.05(f)), such that it amounts to no more than mere instructions to implement the abstract idea by adding the words “apply it” (or an equivalent) and accordingly do not indicate that the previously mentioned additional elements are successfully integrated / amounting to significantly more, either alone or in combination. For these reasons these dependent claims are also not patent eligible.

With respect to claims 15 and 20, they generally apply the generic cache (See MPEP 2106.05(f)), which amounts to no more than mere instructions to implement the abstract idea by adding the words “apply it” (or an equivalent), and they generally link the use of the judicial exception to a particular technological environment of RNNs (See MPEP 2106.05(h)), and accordingly do not indicate that the previously mentioned additional elements are successfully integrated / amounting to significantly more, either alone or in combination. For these reasons these dependent claims are also not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
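3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.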


Claims 1-4, 6-7, 9-12, 14-15, 17-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over United States Application Publication No. US-20200314101-A1 to Zhang (hereinafter Zhang) in further view of Non-Patent Literature “Learning to Remember More with Less Memorization”1 to Le (“Le”).
With respect to claim 1, Zhang discloses: A system (Fig. 1, 104 Processing Computer):

[Image omitted: Fig. 1 of Zhang]



comprising: one or more hardware processors (Fig. 9 in further view of ¶¶107, 118 of Zhang):

[Image omitted: Fig. 9 of Zhang]

¶107 of Zhang: FIG. 9 shows a block diagram of a processing computer 1200 that may be used in embodiments of the invention. Processing computer 1200 may be for example, processing computer 104 of FIG. 1. Processing computer 1200 may comprise a memory 1220, a processor 1240, […]

and a memory storing computer-executable instructions, that in response to execution by the one or more hardware processors, causes the system to perform operations comprising (¶107 of Zhang): 

¶107 of Zhang: […] Processing computer 1200 may be for example, processing computer 104 of FIG. 1. Processing computer 1200 may comprise a memory 1220, a processor 1240, […]. The processing computer 1200 may also comprise a computer readable medium 1280, which may comprise code, executable by the processor 1240, for implementing methods according to embodiments. […]

receiving a request to process a current payment transaction (authorization request message, ¶27 of Zhang) between a payment provider (authorizing computer 106) and a user having a user account (user, ¶¶21, 50 of Zhang) with the payment provider (authorizing entity, ¶23 of Zhang); (See Fig. 1, circled 1 of Zhang, in further view of ¶¶50-51, and ¶¶21, 23, 25, 27 of Zhang):

[Image omitted: Fig. 1 of Zhang]

¶¶50-51 of Zhang: In step 1, a user may use the access device 102 to initiate a transaction with a resource provider, and the user may input payment credentials into the access device 102 […] The access device 102 may then generate an authorization request message. [¶51] […] the processing computer 104 can receive the authorization request message from the access device 102. […]

¶21 of Zhang: A “user” may include an individual or a computational device. […] the user may be a cardholder, account holder, or consumer.

¶23 of Zhang: An “authorizing entity” may be an entity that authorizes a request, typically using an authorizing computer to do so. An authorizing entity may be an issuer, a governmental agency, a document repository, an access administrator, etc.

An “issuer” may be a financial institution, such as a bank, that creates and maintains financial accounts for account holders. An issuer or issuing bank may issue and maintain financial accounts for consumers. The issuer of a particular consumer account may determine whether or not to approve or deny specific transactions. An issuer may authenticate a consumer and release funds to an acquirer if transactions are approved (e.g., a consumer's account has sufficient available balance and meets other criteria for authorization or authentication).

¶27 of Zhang: An “authorization request message” may be a message that is sent to request authorization for an interaction. […] An authorization request message according to some embodiments may comply with ISO 8583, which is a standard for systems that exchange electronic transaction information associated with a payment made by a user using a payment device or payment account.

Examiner’s Note: Examiner interprets the “between a payment provider and a user” recitation as further qualifying the “receiving a request to process a current payment transaction” limitation.

determining a state (cell state, c(t) and/or hidden state h(t) Fig. 3) of a recurrent neural network (RNN) fraud model for the user account by accessing […] a set of encoded states (hidden states, h(t-1), and/or cell states c(t-1)) for a plurality of nodes (inputs / outputs of LSTM cells) included in the RNN fraud model, 
(Examiner notes ¶¶17, 47 [disclosing fraud RNN model], Fig. 5 in further view of ¶¶85-87 [disclosing structure where states of RNNs are determined], and Fig. 3, refs 315, 304, 306, 335, 312, 345, in further view of ¶71 of Zhang [disclosing how the state of the RNN fraud model is determined from a set of previous states]. Examiner interprets the “plurality of nodes” claim limitation to include inputs/outputs of an LSTM cell, under broadest reasonable interpretation):

¶¶17, 47 of Zhang: Embodiments of the invention include[s] […] approach to incorporating authorization decisions from an authorizing computer into an analytical model residing at a processing computer. The analytical model can be a deep recurrent neural network (RNN) with long short-term memory (LSTM) where authorization decisions are embedded into the inner structure of the deep recurrent neural network. An LSTM is a unit of an RNN that can effectively retain information, […] Authorization decisions for interactions may include […] fraud flags. [¶47 of Zhang:] For example, the processing computer 104 may use the analytical model to predict if the authorization request is fraudulent.




¶85 of Zhang: FIG. 5 shows a block diagram of an analytical model 500 according to embodiments. The analytical model may be a deep recurrent neural network (RNN). The analytical model 500 may comprise an embedding layer 510, one or more LSTM cells 530A and 530B, and a predictive layer 540.

¶87 of Zhang: […] The first LSTM cell 530A may maintain a cell state c1(t) and a hidden state h1(t) for each user in the network. The cell state c(t) may be a vector that stores information about a user's interactions over a long time scale (i.e., a long period of time). The hidden state h(t) may be a vector that stores information about the user's interactions over a short time scale (i.e., a short period of time).



¶¶71, 73 of Zhang [in view of Figs. 3 and 5]: The input vector x(t) 305 and the hidden state h(t−1) 315 may also pass through an input activation layer 330 that is a tanh neural network layer. The input activation layer 330 may use the tanh function to transform the inputs to values between −1 and 1. The information in the cell state c(t−1) 325 and the hidden state h(t−1) 315 may be within the range of −1 to 1 already, thus in order to meaningfully add new information to the cell state c(t−1) 325, the input can be scaled to that range as well. Other embodiments may use a different activation function to scale the inputs. The input gate 304 may be a pointwise multiplication of the output of the input activation layer 330 and the output of the input gate layer 340, which results in a vector of information that should be added to the cell state c(t−1) 325. A pointwise addition operation 306 can add this vector of information from the input gate 304 to the cell state c(t−1) 325. The cell state c(t−1) 325 is thus updated to an updated cell state c(t) 335 by removing information with the forget gate 302 and adding information with the input gate 304. […][¶73 of Zhang:] The updated cell state c(t) 335 can pass through a pointwise tanh function 308 to transform the values of the updated cell state c(t) 335 between −1 and 1. As with the input activation layer 330, this may be to ensure that the output is scaled correctly. The output gate 312 may perform a pointwise multiplication of the tanh function 308 and the output of the output gate layer 350 to generate an updated hidden vector h(t) 345. […]. The updated hidden vector h(t) 345 may also be output from the LSTM cell, and may be sent to another LSTM cell or a neural network layer.
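
For illustration only, the gate operations described in ¶¶71 and 73 of Zhang can be sketched in Python as a single LSTM time step. This is a minimal sketch under stated assumptions and is not taken from Zhang: the function names, the `params` layout, and the use of plain lists are hypothetical, and the recurrence shown is the textbook LSTM form (forget gate, input gate, tanh input activation, output gate) rather than any particular embodiment.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Return weights @ vec + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: (x(t), h(t-1), c(t-1)) -> (c(t), h(t)).

    `params` (hypothetical layout) maps each of "forget", "input",
    "candidate", and "output" to a (weights, bias) pair that acts on the
    concatenation [h(t-1), x(t)].
    """
    z = h_prev + x_t  # list concatenation of previous hidden state and input
    f = [sigmoid(a) for a in affine(*params["forget"], z)]        # forget gate
    i = [sigmoid(a) for a in affine(*params["input"], z)]         # input gate layer
    g = [math.tanh(a) for a in affine(*params["candidate"], z)]   # tanh input activation
    o = [sigmoid(a) for a in affine(*params["output"], z)]        # output gate layer
    # Remove information via the forget gate, then add the gated new information.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Squash the updated cell state with tanh and gate it to get the hidden state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return c_t, h_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so the sketch simply halves the prior cell state. That degenerate case makes it easy to see that c(t) and h(t) are computed from c(t−1), h(t−1), and x(t).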

Examiner’s Note (1): With respect to the “encoded” limitation, Examiner notes the states of Zhang are understood to be “encoded” because the input layer encodes the data forwarded to the hidden layer containing the LSTMs (Fig. 5 in further view of ¶86 of Zhang). Examiner further takes the stance that information within the hidden layers, which come after the embedding layer (e.g., encoding layer) and before the predictive layer (i.e., output/softmax/decoding layer), is generally understood to be encoded:

[Image omitted: Fig. 5 of Zhang]

¶86 of Zhang: The embedding layer 510 may encode the inputs.
Examiner’s Note (2): With respect to the “for a plurality of nodes” limitation, Examiner interprets “node” to refer to any input/output of a given LSTM cell. Examiner also notes that calculating inputs/outputs using the associated vectors to determine subsequent states is understood to involve accessing the vectors in the long short-term memory (LSTM) cells. Examiner notes that at least Fig. 3 shows a plurality of nodes (i.e., inputs / outputs of cells, e.g., c(t-1), h(t-1), etc.). Furthermore, ¶87 of Zhang explicitly states: “Each LSTM cell 530 may comprise 256 hidden nodes”.

Examiner’s Note (3): With respect to the “for the user account” limitation, Examiner notes the transaction is with respect to a user (as shown above), which includes account holders. The user submitting the payment authorization request is understood to be performing the transaction with respect to an account of the user, per Zhang’s definition of “user” as including an account holder (i.e., fraud detection by the model is for the user account) (See ¶21 of Zhang):

	¶21 of Zhang: A “user” may include an individual or a computational device. In some embodiments, a user may be associated with one or more personal accounts and/or devices. In some embodiments, the user may be a cardholder, account holder, or consumer.

the set of encoded states (e.g., c(t-1), h(t-1)) being previously calculated based on execution of the RNN fraud model with respect to a prior transaction of the user account, 


¶61 of Zhang [addressing the “with respect to a prior transaction of the user account” limitation]: After the authorization decision is sent back to the processing computer, embodiments of the invention can allow the processing computer to incorporate the authorization decision into an analytical model to enhance the accuracy for the subsequent transactions.

¶64 of Zhang [addressing the “with respect to a prior transaction of the user account” limitation]: the authorization decision features 220 may still be used to update the analytical model 230, which can be immediately available to analyze the next interaction. Therefore, authorization decisions can not only be used during training but may also be stored and updated at runtime. This may be done in real time, or substantially close to real time.

¶67 in further view of ¶101 of Zhang: time t […] time step t−1. [106] Each time step represents an interaction […]

¶101 of Zhang: […] A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated in the LSTM to form a cell state c(t) and a hidden state h(t) for the current time step. […]

Examiner’s Note: Examiner notes the “interactions” of Zhang are understood to include authorization requests/responses indicative of pending transactions corresponding to the user’s account, in view of at least ¶¶27-28 of Zhang:

¶¶27-28 of Zhang: An “authorization request message” may be a message that is sent to request authorization for an interaction. […] An authorization request message according to some embodiments may comply with ISO 8583, which is a standard for systems that exchange electronic transaction information associated with a payment made by a user using a payment device or payment account. […] An “authorization response message” may be a message reply to an authorization request message. The authorization response message may be generated, for example, by a secure data server, an issuing financial institution, a payment processing network, a processing gateway, etc. The authorization response message may include, for example, one or more of the following status indicators: Approval—interaction was approved; Decline—interaction was not approved;

wherein the prior transaction is an immediately preceding transaction to the current payment transaction; (Fig. 3, note the notation c(t-1) -> c(t) and h(t-1) -> h(t)) (See also ¶76 of Zhang, elucidating that the updated states, e.g., (c(t), h(t)), correspond to the current input (e.g., the current transaction authorization request), in further view of ¶¶67, 101 of Zhang):

[Image omitted: Fig. 3 of Zhang]

¶76 of Zhang [in view of Fig. 3 (above)]: […] the current input vector x(t) […]

¶67 in further view of ¶101 of Zhang: time t […] time step t−1. [106] Each time step represents an interaction […]

¶101 of Zhang: […] A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated in the LSTM to form a cell state c(t) and a hidden state h(t) for the current time step. […]

executing the RNN fraud model based on data associated with the current payment transaction and the set of encoded states stored […], (Fig. 3 in further view of Fig. 5 (above), and aforementioned citations indicating encoded states (above) in aforementioned calculations, and ¶¶74, 77 of Zhang):

¶74 of Zhang: the output gate layer 350 may receive information from the cell state c(t−1) 325 and/or the updated cell state c(t) 335 in addition to the input x(t) 305 and the hidden state h(t−1) 315 when determining what information to output.

¶77 of Zhang: The final prediction from the analytical model can be calculated from the hidden state h(t) […] a softmax function may be used to convert the hidden state h(t) into probabilities for each potential label. If there are only two possible categories for the prediction, other activation functions may be used to convert the hidden state h(t) into probabilities, such as a sigmoid function.
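
For illustration only, the conversion of a hidden state h(t) into label probabilities described in ¶77 of Zhang can be sketched as follows. This is a minimal sketch, not Zhang's implementation: the linear read-out in `label_logits` and its `weights` and `bias` arguments are assumptions introduced here, since the quoted passage names only the softmax and sigmoid functions.

```python
import math


def softmax(logits):
    """Convert a list of real-valued scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function, usable when there are only two possible labels."""
    return 1.0 / (1.0 + math.exp(-z))


def label_logits(h_t, weights, bias):
    """Hypothetical linear read-out: one score per potential label."""
    return [sum(w * h for w, h in zip(row, h_t)) + b
            for row, b in zip(weights, bias)]
```

For two labels the two functions agree: the softmax probability of the second label equals the sigmoid of the difference between the second and first scores, which is consistent with the passage's remark that a sigmoid may be used when there are only two categories.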

wherein the executing the RNN fraud model includes encoding a new set of encoded states for the plurality of nodes; (¶¶74-75, 87, 101 of Zhang, in view of Fig. 3):
¶74 of Zhang: the output gate layer 350 may receive information from the cell state c(t−1) 325 and/or the updated cell state c(t) 335 in addition to the input x(t) 305 and the hidden state h(t−1) 315 when determining what information to output.

¶75 of Zhang: Mathematically, in a general LSTM, the state vectors c(t) and h(t) at time step t can be concatenated into (c(t), h(t)) which can be updated based on state vectors for the previous time step t−1, c(t−1) and h(t−1), as well as a current input vector x(t) 

¶87 of Zhang: The first LSTM cell 530A may update a cell state c1(t−1) and a hidden state h1(t−1) from a previous time step with the new input data x(t) using the method described with reference to FIG. 3. […] Each LSTM cell 530 may comprise 256 hidden nodes. […] [i.e., executing the RNN fraud model includes encoding a new set of encoded states for a plurality of nodes, such as nodes corresponding to c(t), h(t)]
¶101 of Zhang: A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated in the LSTM to form a cell state c(t) and a hidden state h(t) for the current time step.

updating the [data] to store the new set of encoded states […]; (At least ¶103 of Zhang):
¶103 of Zhang:  A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated to form a cell state c(t) and a hidden state h(t) for the current time step.
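
For illustration only, the pattern of updating a precursor state into the state for the current time step, and keeping the result for the next interaction, can be sketched as follows. This is a minimal sketch under stated assumptions and is not asserted to be Zhang's storage mechanism: the class `AccountStateStore`, the in-memory mapping keyed by an account identifier, and the stand-in recurrence `toy_step` (which is not a real LSTM) are all hypothetical.

```python
import math


class AccountStateStore:
    """Hypothetical store that keeps only the latest (c, h) pair per account."""

    def __init__(self, state_size):
        self._state_size = state_size
        self._states = {}

    def load(self, account_id):
        """Return the stored (c, h) pair, or zero vectors for a new account."""
        zero = [0.0] * self._state_size
        return self._states.get(account_id, (zero, list(zero)))

    def store(self, account_id, c_t, h_t):
        """Overwrite the previous pair so only the newest state is retained."""
        self._states[account_id] = (list(c_t), list(h_t))


def toy_step(x_t, c_prev, h_prev):
    """Stand-in recurrence (an assumption, not an LSTM): decay and add input."""
    c_t = [0.5 * c + x for c, x in zip(c_prev, x_t)]
    h_t = [math.tanh(c) for c in c_t]
    return c_t, h_t


def score_interaction(store, account_id, x_t):
    """Load the precursor state, compute the current state, and store it."""
    c_prev, h_prev = store.load(account_id)
    c_t, h_t = toy_step(x_t, c_prev, h_prev)
    store.store(account_id, c_t, h_t)
    return h_t
```

Calling `score_interaction` twice for the same account shows the second call starting from the state written by the first, while a different account starts from zero vectors.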

and determining a risk level (e.g., risk score / risk probability) corresponding to the current payment transaction based on an output of the executing the RNN fraud model. (At least ¶¶85, 101 of Zhang disclose a risk score (i.e., risk level) corresponding to the current payment request (denoted as the t-th step, as explained above). Examiner also notes ¶¶17, 47, 65, 79, 85, 89, 101, 106 of Zhang as relevant):

¶85 of Zhang: […] One output ŷc(t) may be an interaction label, which may include a security risk score. […].

¶101 of Zhang: A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated in the LSTM to form a cell state c(t) and a hidden state h(t) for the current time step. The precursor cell state c(t−1) and the precursor hidden state h(t−1) may be updated by a method such as that described in FIG. 3. The analytical model may then output an interaction ŷc(t) […] For example, the interaction label ŷc(t) may be a security risk score […].

[…] processing computer outputs such as fraud or non-fraud. […]

¶47 of Zhang: For example, the processing computer 104 may use the analytical model to predict if the authorization request is fraudulent.

¶89 of Zhang: For example, the predictive layer 540 may output a probability that an interaction is fraudulent. The predictive layer 540 may output a value for each possible output. For example, the analytical model may be configured to classify an interaction in one of six categories or risk labels: 0 for a normal interaction, 1 for a fraudulent interaction.

¶65 of Zhang: The risk labels may be based on or related to a risk score. The risk score can be the probability that an interaction is likely to be fraudulent. For example, the risk score may be a value between 0 and 1. A risk score value of close to 1 may indicate that the interaction has a very high likelihood being fraudulent. Because the analytical model may determine a classification for each interaction, the analytical model may be considered a machine learning classifier.

[…] current input and past information while making future predictions […].

¶106 of Zhang: The risk score may take on values between 0 and 1, with 0 representing an interaction that is likely not fraudulent and/or with minimal risk, and 1 representing an interaction that is high risk and/or likely fraudulent. When there is a fraudulent interaction, that may be represented by both the risk label and the risk score.
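
For illustration only, the relationship between a risk score in the range 0 to 1 and a binary risk label (0 for a normal interaction, 1 for a fraudulent interaction) can be sketched as follows. This is a minimal sketch: the function name `risk_label` and the 0.5 default threshold are assumptions introduced here, as the cited passages do not state how a label is derived from a score.

```python
def risk_label(risk_score, threshold=0.5):
    """Map a risk score in [0, 1] to a label: 1 (fraudulent) or 0 (normal).

    The threshold is a hypothetical parameter; the cited passages do not
    specify one.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    return 1 if risk_score >= threshold else 0
```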

Zhang fails to teach, but Le discloses: accessing a cache storing […] a set of encoded states (hidden states) / encoded states (hidden states) stored in the cache, (Figure 1 and Figure 1 Caption of Le): 


[Image omitted: Figure 1 of Le]


Figure 1 Caption of Le: […] hidden states are pushed into the cache. When the writing time comes, the controller attends to the cache, chooses suitable states and accesses the memory. The cache is then emptied.

Page 5, §2.2.1, “Local Optimal Design” of Le: we introduce a cache of size L to store the hidden states of the controller during a write interval.

and, updating the cache to store the new set of encoded states in place of the set of encoded states; (At least abstract, Page 3 - §2.1.2 of Le, and Algorithm 1 of Le):

Abstract of Le: […] we introduce modifications to the original solution, resulting in a solution termed Cached Uniform Writing. This method aims to balance between maximizing memorization and forgetting via overwriting mechanisms.

Page 3, §2.1.2 of Le: In slot-based MANNs, memory M is a set of D memory slots. A write at step t can be represented by the controller’s hidden state ht, which accumulates inputs over several timesteps (i.e., x1, ...,xt). If another write happens at step t+k, the state ht+k’s information containing timesteps xt+1, ..., xt+k is stored in the memory. […] During writing, overwriting may happen, replacing an old write with a new one.


[Image omitted: Algorithm 1 of Le]


Examiner’s Note: With respect to Algorithm 1 (above), Examiner takes the stance that, for each given time step t of T (e.g., line 1 of the algorithm), the hidden states of the model of Le (ht, for t = 1, 2, 3, etc.) are appended to the cache (e.g., line 2 of the algorithm), and that, when the cache is full (e.g., line 3 of the algorithm), the corresponding hidden states are cleared from the cache (e.g., line 9, which is nested in line 3’s conditional statement). Examiner further notes that this occurs for T > L (e.g., line 3 in view of line 1 of Algorithm 1), and that any given t where t > L results in updating the cache with newer (corresponding) hidden states, as it is understood the algorithm will store new sets of encoded states (hidden states) in the cache (e.g., line 2 of the algorithm, particularly in view of the for loop of lines 1 and 14 encapsulating the entire append/clear steps), which, under broadest reasonable interpretation, constitutes updating the cache to store the new set of encoded states in place of the set of encoded states. Le further states: “[…] This ensures that after ⌊T/L⌋ writes, all memory slots should be filled and the model has to learn to overwrite.” [i.e., the version suggested to be optimal by Le has the total number of time steps (T) exceeding the cache length (L), which necessarily results in Algorithm 1 of Le causing newer hidden states (corresponding to t > L) to replace the previous time steps’ hidden states (corresponding to t ≤ L), which were removed in the previously occurring cache clear step].
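
For illustration only, the append/clear behavior discussed in the note above can be sketched as follows. This is a minimal sketch under stated assumptions and is not a reproduction of Algorithm 1 of Le: the function name `run_cached_writes` is hypothetical, the `choose` callback is a placeholder for the attention step (by default it simply takes the most recent state), and the memory is modeled as an unbounded list rather than a fixed number of slots.

```python
def run_cached_writes(hidden_states, cache_size, choose=None):
    """Append each hidden state to a cache; when the cache holds `cache_size`
    states, pick one, write it to memory, and empty the cache.

    Returns the written states and whatever remains in the cache at the end.
    """
    if choose is None:
        # Placeholder for the attention step: take the most recent state.
        choose = lambda cache: cache[-1]
    cache, memory = [], []
    for h_t in hidden_states:          # one iteration per time step t
        cache.append(h_t)              # push the current hidden state
        if len(cache) == cache_size:   # cache is full, so writing time has come
            memory.append(choose(cache))
            cache.clear()              # later states replace the cleared ones
    return memory, cache
```

With T = 7 states labeled "h1" through "h7" and a cache size of L = 3, the sketch performs ⌊7/3⌋ = 2 writes ("h3" and "h6"), and the cache finally holds only "h7"; the states from t ≤ 3 were cleared before the states from t > 3 were appended.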

Accordingly, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention to have the RNN fraud model of Zhang incorporate the optimized cache implementation of hidden states of Le, resulting in the hidden states of LSTMs of Zhang to be stored in cache(s), and updating the cache to store the new set of encoded states in place of the set of encoded states, in order to advantageously help the fraud model of Zhang learn to consider the importance of each timestep in local intervals via attention weights, and select a best representative hidden state for LSTM cells of Zhang (§2.2.1 of Le): 
§2.2.1 of Le: we perform an attention over the cache to choose the best representative hidden state. The model will learn to assign attention weights to the elements in the cache. This mechanism helps the model consider the importance of each timestep input in the local interval and thus relax the equal contribution assumption […]

With respect to claim 2, Zhang in view of Le disclose all the claim elements of claim 1. Furthermore, Zhang discloses: wherein the operations further comprise: 

training the RNN fraud model based on a set of transaction features that have been previously computed based on transaction information associated with previous transactions of a plurality of other user accounts with the payment provider. (Fig. 4, 308 of Zhang, in further view of ¶84 of Zhang):

[Figure: media_image11.png, Fig. 4 of Zhang]

¶84 of Zhang: In step 308, the analytical model may analyze the interaction data features and the authorization decision features. The analytical model may analyze the training data associated with each user. As the analytical model processes the training data, LSTM in the analytical model can update a cell state and a hidden state. For each interaction in the training data that the analytical model processes, it may output a predicted interaction label and a predicted authorization decision label. The output may be risk score and/or a risk label. The analytical model may then calculate classification loss by comparing the predicted interaction label to the actual interaction label and comparing the predicted authorization label to the actual interaction label. The analytical model can recursively process the training data to minimize the classification loss. When training the analytical model, dropouts may be applied in each LSTM, with a dropout probability of 0.5.

¶81 of Zhang: In step 302, the processing computer can receive prior authorization request data from a plurality of past interactions. The prior authorization request data may form part of a training dataset. […] For example, the prior authorization request data may be derived from interaction histories of a plurality of users […]

¶21 of Zhang: A “user” may include an individual or a computational device. In some embodiments, a user may be associated with one or more personal accounts and/or devices. In some embodiments, the user may be a cardholder, account holder, or consumer.

With respect to claim 3, Zhang in view of Le discloses the limitations of claim 2. Furthermore, Zhang discloses: wherein the set of encoded states are updated in response to each new transaction processed by the RNN fraud model. (¶¶61, 64 of Zhang. Examiner also notes ¶¶67, 73, 61, 64, 101 of Zhang also disclose this at a more granular level in view of Fig. 3 of Zhang):

¶61 of Zhang: After the authorization decision is sent back to the processing computer, embodiments of the invention can allow the processing computer to incorporate the authorization decision into an analytical model to enhance the accuracy for the subsequent transactions.

¶64 of Zhang: the authorization decision features 220 may still be used to update the analytical model 230, which can be immediately available to analyze the next interaction. Therefore, authorization decisions can not only be used during training but may also be stored and updated at runtime. This may be done in real time, or substantially close to real time.


[Figure: media_image12.png, Fig. 3 of Zhang]

¶67 in further view of ¶101 of Zhang: time t […] time step t−1. [106] Each time step represents an interaction [As noted in parent claim 1, Examiner notes it is understood the interactions may comprise transaction authorization requests (i.e., transactions)]

¶73 of Zhang: The output gate 312 may perform a pointwise multiplication of the tanh function 308 and the output of the output gate layer 350 to generate an updated hidden vector h(t) 345. In other embodiments, the operation of the tanh function 308 may correspond to the activation function of the input activation layer 330. The updated hidden vector h(t) 345 and the updated cell vector c(t) 335 can then be used by the LSTM cell at the next time step t+1. The updated hidden vector h(t) 345 may also be output from the LSTM cell, and may be sent to another LSTM cell or a neural network layer.

¶101 of Zhang: […] A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated in the LSTM to form a cell state c(t) and a hidden state h(t) for the current time step. […]
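
As a hedged illustration of the state update described in ¶¶73 and 101 of Zhang, the following Python sketch applies the standard textbook LSTM recurrence, c(t) = f·c(t−1) + i·g and h(t) = o·tanh(c(t)). Only the output step (pointwise multiplication of the output gate activations and the tanh of the cell state) is taken from the quoted text; the remaining formulation, the function name `lstm_state_update`, and the gate values supplied directly as inputs are assumptions for illustration and are not taken from Zhang.

```python
import math


def lstm_state_update(c_prev, f, i, g, o):
    """Return (c_t, h_t) from the previous cell state and gate activations.

    Assumption: gate activations f, i, g, o are given directly instead of
    being computed from learned weights, to keep the sketch short.
    """
    # Updated cell state: keep part of the old state, add the gated candidate.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Updated hidden state: output gate times tanh of the new cell state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return c_t, h_t


c_t, h_t = lstm_state_update(c_prev=[1.0], f=[0.5], i=[1.0], g=[0.0], o=[1.0])
```

The returned pair would then serve as the precursor states c(t−1) and h(t−1) for the next time step.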

With respect to claim 4, Zhang in view of Le discloses all the claim elements of parent claim 2. Zhang further discloses:

wherein the transaction information comprises at least one of networking information associated with a user device of the user, a user identifier, a transaction identifier, or a geographic location of the user. (At least ¶76 in view of ¶80 of Zhang discloses user identifiers included as elements of the transaction information used as input. Examiner also notes ¶¶98, 83, 63 of Zhang as relevant): 

¶76 of Zhang: the current input vector x(t) may consist of interaction data features created from interaction information (e.g., user identifier, […]

¶80 of Zhang: The analytical model may be formed and trained using interaction data from prior authorization request messages and prior authorization response messages from an authorizing computer [.] The analytical model may be run on a processing computer such as processing computer 104 in FIG. 1.

¶98 of Zhang: The second authorization request message may have been received by the processing computer for a second interaction between the user and a resource provider. The resource provider may be the same resource provider as the first interaction or a different resource provider. Example second interaction data may comprise an interaction type, a timestamp, and a device identifier of an access device where the second authorization request originated. For example, a vector of interaction data for a log-in authorization request may be [logInReq, 9 PM, mobile48207].

¶83 of Zhang: […] For example, the interaction data features may be [$20, 1 PM, Target, e-commerce].

¶63 of Zhang: Interaction data features 210 may include an interaction value, a time stamp, an interaction location, etc.

With respect to claim 6, Zhang in view of Le disclose all the elements of parent claim 4. Furthermore, Zhang discloses: wherein 

the new set of encoded states are calculated based on the data associated with the current payment transaction and the set of encoded states […]. (¶¶61, 64 of Zhang. Examiner also notes ¶¶67, 73, 61, 64, 101 of Zhang disclosing that the hidden states are always calculated based on the current payment transaction and all previous hidden states stored within the one or more LSTMs):

¶61 of Zhang: After the authorization decision is sent back to the processing computer, embodiments of the invention can allow the processing computer to incorporate the authorization decision into an analytical model to enhance the accuracy for the subsequent transactions.

¶64 of Zhang: the authorization decision features 220 may still be used to update the analytical model 230, which can be immediately available to analyze the next interaction. Therefore, authorization decisions can not only be used during training but may also be stored and updated at runtime. This may be done in real time, or substantially close to real time.


[Figure: media_image12.png, Fig. 3 of Zhang]

¶67 in further view of ¶101 of Zhang: time t […] time step t−1. [106] Each time step represents an interaction [As noted in parent claim 1, Examiner notes it is understood the interactions may comprise transaction authorization requests (i.e., transactions)]

¶73 of Zhang: The output gate 312 may perform a pointwise multiplication of the tanh function 308 and the output of the output gate layer 350 to generate an updated hidden vector h(t) 345. In other embodiments, the operation of the tanh function 308 may correspond to the activation function of the input activation layer 330. The updated hidden vector h(t) 345 and the updated cell vector c(t) 335 can then be used by the LSTM cell at the next time step t+1. The updated hidden vector h(t) 345 may also be output from the LSTM cell, and may be sent to another LSTM cell or a neural network layer.

¶101 of Zhang: […] A precursor cell state from the previous time step c(t−1) and a precursor hidden state from the previous time step h(t−1) may be updated in the LSTM to form a cell state c(t) and a hidden state h(t) for the current time step. […]

Zhang fails to disclose, but Le discloses: set of encoded states stored in the cache (Examiner notes same obviousness rationale of parent claim 1 in applying Le applies here as well): 
Figure 1 of Le: Writing mechanism in Cached Uniform Writing. During non-writing intervals, the controller hidden states are pushed into the cache. When the writing time comes, the controller attends to the cache, chooses suitable states and accesses the memory. The cache is then emptied.

Page 5, §2.2.1, “Local Optimal Design” of Le: we introduce a cache of size L to store the hidden states of the controller during a write interval.

With respect to claim 7, Zhang in view of Le disclose all the claim limitations of parent claim 1. Furthermore, Zhang discloses: wherein the RNN fraud model comprises at least one of a long short-term memory RNN or a gated recurrent units RNN. (¶¶17, 38 of Zhang):

¶17 of Zhang: The analytical model can be a deep recurrent neural network (RNN) with long short-term memory (LSTM) where authorization decisions are embedded into the inner structure of the deep recurrent neural network.

¶38 of Zhang: An LSTM may be comprised of a cell and gates that control the flow information into and out of the cell.

With respect to claim 9, Zhang in view of Le disclose all claim elements of parent claim 1. Furthermore, Zhang discloses: wherein the data associated with the current payment transaction includes previously generated features extracted from the current payment transaction. (¶63 of Zhang. Examiner also notes ¶¶96-97 of Zhang as relevant):

¶63 of Zhang: Interaction data features 210 and authorization decision features 220 may be extracted from a set of training data. For example, the training data may be the interaction history of a user, including authorization responses from an issuer for each transaction.

¶96 of Zhang: […] the processing computer may extract authorization response data from the first authorization response message. For example, the authorization response data may comprise an authorization decision and a reason code, such as [approved, 00]. The processing computer may then input the authorization response data as authorization decision features and the analytical model may encode the authorization decision features. For example, the analytical model may encode the authorization decision features as [0, 00] where “0” represents an approved interaction as opposed to “1” for a declined transaction.

¶97 of Zhang: the analytical model may be updated by the processing computer with the first authorization response message (as authorization decision features) and the first authorization request message (as interaction features) to form an updated analytical model. The authorization response data may be associated in the analytical model with the first interaction data. One or more LSTM cells of the analytical model may determine whether to add the authorization response data to cell states and hidden states.

Examiner’s Note: Examiner notes that the features extracted from the current payment transaction are “previously generated”, as they must be generated prior to extraction, generally. Examiner further notes ¶47 of Applicant specification: 
	¶47 of Applicant specification: Referring back to Fig. 2, at step 208, a user application, such as application 122 communicates with the service provider system 102 to complete a current transaction between a user account and the service provider. At step 210, the model processing module 108 extracts the set of features (e.g., the features previously selected by the training module 106 to train the RNN fraud model 220) from current transaction information associated with the current transaction.

With respect to claim 10, it is rejected under the same rationale as claim 1 (above), mutatis mutandis. (Examiner notes the encoded states of claim 1 correspond to hidden states of claim 10).

With respect to claim 11, Zhang in view of Le discloses all the elements of parent claim 1. Furthermore, Zhang discloses: determining whether to process the current transaction based on the risk score. (¶52 of Zhang):

¶52 of Zhang: If the processing computer 104 determines that the risk score is too high, the processing computer 104 may not send the authorization request message to the authorizing computer 106 and may instead send a failure message or decline to the access device 102.
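
The decision described in ¶52 of Zhang can be sketched, for illustration only, as a simple threshold check. The function name `route_authorization_request`, the returned labels, and the threshold of 0.8 are hypothetical values chosen for this sketch and do not come from Zhang.

```python
def route_authorization_request(risk_score, threshold=0.8):
    """Decide what to do with a request given its risk score.

    The default threshold is a hypothetical value used for illustration.
    """
    if risk_score > threshold:
        # Risk score too high: do not forward; return a failure or decline.
        return "decline"
    # Otherwise forward the request to the authorizing computer.
    return "forward"
```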

With respect to claim 12, it is rejected under the same rationale as claim 1 (above), mutatis mutandis. (See mapping of “determining a risk level” limitation of claim 1 corresponding to a fraud score / probability of Zhang).

With respect to claim 14, Zhang in view of Le discloses the limitations of parent claim 10. Furthermore, Zhang discloses: wherein the current transaction information associated with the current transaction includes a features vector corresponding to the plurality of features being extracted from the current transaction. (At least ¶¶33, 67, 76, 83 of Zhang):

¶33 of Zhang: A “machine learning model” may include an application of artificial intelligence that provides systems with the ability to automatically learn and improve from experience without explicitly being programmed. A machine learning model may include a set of software routines and parameters that can predict an output of a process (e.g., identification of an attacker of a computer network, authentication of a computer, a suitable recommendation based on a user search query, etc.) based on a “feature vector” or other input data. […]

¶67 of Zhang [referring to Fig. 3 showing LSTM cell structure]: The input vector x(t) 305 may comprise interaction data features (e.g., interaction value, time stamp) and/or authorization decision features (e.g., reason codes).

¶76 of Zhang: the current input vector x(t) may consist of interaction data features created from interaction information (e.g., user identifier, resource provider identifier) and authorization decision features created from authorization response messages (e.g., authorization decision, reason code).

¶83 of Zhang: The analytical model can encode the interaction data features and the authorization decision features as embedding vectors. For example, the interaction data features may be [$20, 1 PM, Target, e-commerce]. The analytical model may encode that information as [20, 13, 5, 3] where 13 represents the time stamp in hours, 5 represents a resource provider identifier (e.g., Target is 5th on a list of resource providers), and 3 represents an interaction type (e.g., e-commerce is 3rd on a list of interaction types).
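
The encoding example of ¶83 of Zhang can be reproduced with the short Python sketch below. The lookup lists are hypothetical: only the positions of “Target” (5th) and “e-commerce” (3rd) come from the quoted example, and every other entry, along with the function names, is a placeholder assumed for illustration.

```python
# Hypothetical lookup lists; only the positions of "Target" and
# "e-commerce" are taken from the quoted example.
RESOURCE_PROVIDERS = ["provider_1", "provider_2", "provider_3", "provider_4", "Target"]
INTERACTION_TYPES = ["type_1", "type_2", "e-commerce"]


def hour_24(label):
    """Convert a label such as "1 PM" to an hour in 24-hour time."""
    hour_text, meridiem = label.split()
    hour = int(hour_text) % 12
    return hour + 12 if meridiem.upper() == "PM" else hour


def encode_interaction(amount, time_label, provider, interaction_type):
    """Encode raw interaction data as a numeric feature vector."""
    return [
        int(amount.lstrip("$")),                        # interaction value
        hour_24(time_label),                            # time stamp in hours
        RESOURCE_PROVIDERS.index(provider) + 1,         # 1-based list position
        INTERACTION_TYPES.index(interaction_type) + 1,  # 1-based list position
    ]


encoded = encode_interaction("$20", "1 PM", "Target", "e-commerce")
```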

With respect to claim 15, it is rejected under the same rationale as claim 1 (above), mutatis mutandis. (See claim 1 mapping of Le combination and obviousness statement, particularly with respect to “updating the cache to store the new set of encoded states in place of the set of encoded states;” limitation mapping).

With respect to claim 17, Zhang discloses: A non-transitory computer readable medium storing computer-executable instructions that in response to execution by one or more hardware processors, causes a service provider system to perform operations comprising: 

¶31 of Zhang: A “memory” may be any suitable device or devices that can store electronic data. A suitable memory may comprise a non-transitory computer readable medium that stores instructions that can be executed by a processor to implement a desired method.
¶¶107-108 of Zhang: Processing computer 1200 may be for example, processing computer 104 of FIG. 1. Processing computer 1200 may comprise a memory 1220, a processor 1240, and a network interface 1260. The processing computer 1200 may also comprise a computer readable medium 1280, which may comprise code, executable by the processor 1240, for implementing methods according to embodiments. [¶108:] The memory 1220 may be implemented using any combination of any number of non-volatile memories (e.g., flash memory) […]

¶47 of Zhang: In some embodiments the processing computer 104 may be part of a payment processing network. In other embodiments the processing computer 104 may be part of an access gateway. The processing computer 104 may process authorization requests from the access device 102 using an analytical model.

the processing computer 104 may be a payment processing network (e.g., Visa), and the authorizing computer 106 may be an issuer computer.

Examiner’s Note: Examiner takes the stance Visa payment network is a form of payment provider.

With respect to the remaining claim limitations of claim 17, they are rejected under the same rationale as claim 1 (above), mutatis mutandis. 

With respect to claim 18, it is rejected under the same rationale as claim 1 (above), mutatis mutandis. (Specifically, see mapping of “wherein the executing the RNN fraud model includes encoding a new set of encoded states for the plurality of nodes;” and “the set of encoded states being previously calculated based on execution of the RNN fraud model with respect to a prior transaction of the user account,” limitations of claim 1)

With respect to claim 20, it is rejected under the same rationale as claim 1 (above), mutatis mutandis. (Specifically, see mapping of Le reference corresponding to: “updating the cache to store the new set of encoded states in place of the set of encoded states” limitation)

Claims 5, 8, 13, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Le, as applied in corresponding parent claims, in further view of United States Application Publication No.  US-20200065812-A1 to Walters (“Walters”).

With respect to claim 5, Zhang in view of Le discloses all the limitations of parent claim 1. Zhang further discloses: [wherein the cache stores only one] set of encoded states corresponding to the user account [at a time]. (¶¶87-88 in further view of ¶21 of Zhang): 

¶87 of Zhang: The first LSTM cell 530A may maintain a cell state c_1(t) and a hidden state h_1(t) for each user in the network. [Examiner notes ¶88 is mostly copy/paste of ¶87, but with respect to another LSTM cell].

¶21 of Zhang: A “user” may include an individual or a computational device. In some embodiments, a user may be associated with one or more personal accounts and/or devices. In some embodiments, the user may be a cardholder, account holder […]

¶104 of Zhang: Subsequent interactions from the user may be analyzed using the modified cell state c′(t) and the modified hidden state h′(t).

Zhang fails to disclose, but Le discloses: wherein the cache stores [only one] set of encoded states [corresponding to the user account at a time] (Examiner notes same obviousness rationale of parent claim 1 applies here): 

Figure 1 of Le: Writing mechanism in Cached Uniform Writing. During non-writing intervals, the controller hidden states are pushed into the cache. When the writing time comes, the controller attends to the cache, chooses suitable states and accesses the memory. The cache is then emptied.

Page 5, §2.2.1, “Local Optimal Design” of Le: we introduce a cache of size L to store the hidden states of the controller during a write interval.

Zhang in view of Le fails to disclose, but Walters suggests: [wherein the cache stores] only one [set of encoded states corresponding to the] user account at a time. (At least abstract, ¶¶12, 22, in further view of ¶47 of Walters):
abstract of Walters: Logic may detect fraudulent transactions. Logic may determine, by a neural network based on the data about a transaction, a deviation of the transaction from a range of purchases predicted for the customer, wherein the neural network is pretrained to predict purchases by the customer based on a purchase history of the customer. […]

¶12 of Walters: […] an instance of that neural network is assigned to a specific customer and retrains or continues to train based on the purchase history of that specific customer, […]

¶22 of Walters: In one embodiment, the neural network is only trained with that customer's purchase history.

¶¶46-47 of Walters: the generative neural network 1605 and the discriminative neural network 1660 may comprise Long Short-Term Memory (LSTM) neural networks, […]. [¶47:] An LSTM is a basic deep learning model and capable of learning long-term dependencies. […] The LSTM internal units have a hidden state augmented with nonlinear mechanisms to allow the state to propagate […]

Accordingly, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention that the model of Zhang in view of Le could be copied for each user account, as suggested by Walters, resulting in any given (copied) fraud model’s corresponding cache storing only one set of encoded states (hidden states) corresponding to a given user account at a time, as disclosed by Walters (shown above), in order to advantageously recognize transaction patterns specific to a given customer (¶13 of Walters):

¶13 of Walters: […] advantageously training the neural network to recognize specific transaction patterns of that specific customer. As a result, determinations by the neural network about non-fraudulent transactions are based on predicted transactions for each customer.
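
For illustration, the limitation at issue (a cache holding only one set of encoded states per user account at a time) can be sketched as a keyed store in which each update overwrites the prior entry. The class `HiddenStateCache`, its method names, and the account identifier below are hypothetical and are not taken from Zhang, Le, or Walters.

```python
class HiddenStateCache:
    """Keeps at most one set of hidden states per user account."""

    def __init__(self, initial_states):
        self._initial = list(initial_states)
        self._states = {}  # account id -> most recent set of hidden states

    def get(self, account_id):
        # Accounts with no stored states start from the initial set.
        return list(self._states.get(account_id, self._initial))

    def update(self, account_id, new_states):
        # Assignment replaces any earlier entry, so only one set is kept.
        self._states[account_id] = list(new_states)


cache = HiddenStateCache(initial_states=[0.0, 0.0])
cache.update("acct-1", [0.1, 0.2])   # states after a first transaction
cache.update("acct-1", [0.3, 0.4])   # newer states stored in their place
```

In this sketch a second account would receive its own entry, while each account still maps to a single, most recent set of states.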

With respect to claim 8, Zhang in view of Le discloses all the elements of parent claim 1. Furthermore, Zhang discloses: RNN fraud model (¶¶17, 47 of Zhang):

¶¶17, 47 of Zhang: Embodiments of the invention include[s] […] approach to incorporating authorization decisions from an authorizing computer into an analytical model residing at a processing computer. The analytical model can be a deep recurrent neural network (RNN) with long short-term memory (LSTM) where authorization decisions are embedded into the inner structure of the deep recurrent neural network. An LSTM is a unit of an RNN that can effectively retain information, […] Authorization decisions for interactions may include […] fraud flags. [¶47 of Zhang:] For example, the processing computer 104 may use the analytical model to predict if the authorization request is fraudulent.

a transaction between the payment provider and a second user having a user account with the payment provider (See claim 18 of Zhang in further view of at least Fig. 1, circled 1 of Zhang, ¶¶50-51, and ¶¶21, 23, 25, 27 of Zhang):

Claim 18 of Zhang: wherein the long short-term memory comprises a cell state and a hidden state for each of a plurality of users, the plurality of users including the user.

[Figure: media_image3.png, Fig. 1 of Zhang]

¶¶50-51 of Zhang: In step 1, a user may use the access device 102 to initiate a transaction with a resource provider, and the user may input payment credentials into the access device 102 […] The access device 102 may then generate an authorization request message. [¶51] […] the processing computer 104 can receive the authorization request message from the access device 102. […]

¶21 of Zhang: A “user” may include an individual or a computational device. […] the user may be a cardholder, account holder, or consumer.

¶23 of Zhang: An “authorizing entity” may be an entity that authorizes a request, typically using an authorizing computer to do so. An authorizing entity may be an issuer, a governmental agency, a document repository, an access administrator, etc.

¶25 of Zhang [In view of ¶23 above]: An “issuer” may be a financial institution, such as a bank, that creates and maintains financial accounts for account holders. An issuer or issuing bank may issue and maintain financial accounts for consumers. The issuer of a particular consumer account may determine whether or not to approve or deny specific transactions. An issuer may authenticate a consumer and release funds to an acquirer if transactions are approved (e.g., a consumer's account has sufficient available balance and meets other criteria for authorization or authentication).

¶27 of Zhang: An “authorization request message” may be a message that is sent to request authorization for an interaction. […] An authorization request message according to some embodiments may comply with ISO 8583, which is a standard for systems that exchange electronic transaction information associated with a payment made by a user using a payment device or payment account.

Zhang fails to disclose, but Le discloses: the cache stores […] encoded states corresponding to […] RNN […] model (Examiner notes same obviousness rationale of parent claim 1 applies here): 
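Figure 1 of Le: Writing mechanism in Cached Uniform Writing. During non-writing intervals, the controller hidden states are pushed into the cache. When the writing time comes, the controller attends to the cache, chooses suitable states and accesses the memory. The cache is then emptied.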

Page 5, §2.2.1, “Local Optimal Design” of Le: we introduce a cache of size L to store the hidden states of the controller during a write interval.

Zhang in view of Le fails to disclose, but Walters suggests: wherein the cache (Fraud Detection Logic Circuitry, Fig. 1A, 1015; Fig. 9, 1037, 1047, in further view of ¶¶9, 84 of Walters):

[Figure: media_image13.png]


¶9 of Walters: FIG. 4 […] a system including a multiple-processor platform, a chipset, buses, and accessories such as the server shown in FIGS. 1A-1B; […]

¶84 of Walters: The fraud detection logic circuitry 4026 may represent circuitry configured to implement the functionality of fraud detection for neural network support within the processor core(s) 4020 or may represent a combination of the circuitry within a processor and a medium to store all or part of the functionality of the fraud detection logic circuitry 4026 in memory such as cache,

[the cache] stores a second set of encoded states (hidden states) corresponding to a second instance of the RNN […] model (e.g., Neural Network 1037 / 1047 within Fraud detection logic circuitry 1015) [that has been executed with respect to a transaction between the payment provider and a second user having a second user account with the payment provider]. (issuer, ¶67 of Walters)

(At least Fig. 1A, abstract, ¶¶41, 12, 22, in further view of ¶47 of Walters disclose that a cache (Fig. 1A, Fraud Detection Logic Circuitry 1015) may store multiple instances of models, such as LSTMs comprising hidden states)

abstract of Walters: Logic may detect fraudulent transactions. Logic may determine, by a neural network based on the data about a transaction, a deviation of the transaction from a range of purchases predicted for the customer, wherein the neural network is pretrained to predict purchases by the customer based on a purchase history of the customer. […]

¶12 of Walters: […] an instance of that neural network is assigned to a specific customer and retrains or continues to train based on the purchase history of that specific customer, […]

¶40 of Walters: An RNN is a class of artificial neural network where connections between nodes form a directed graph along a sequence. This allows the RNN to exhibit dynamic temporal behavior for a time sequence. RNNs can use their internal state (memory) to process sequences of inputs

¶22 of Walters: In one embodiment, the neural network is only trained with that customer's purchase history.

¶¶46-47 of Walters: the generative neural network 1605 and the discriminative neural network 1660 may comprise Long Short-Term Memory (LSTM) neural networks, […]. [¶47:] An LSTM is a basic deep learning model and capable of learning long-term dependencies. […] The LSTM internal units have a hidden state augmented with nonlinear mechanisms to allow the state to propagate […]

¶67 of Walters: […] payment instrument issuer. The payment instrument issuer may comprise a server to perform fraud detection based on the instance of the neural network 2010 that is trained for this specific customer […]

Accordingly, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention that the model of Zhang in view of Le could be copied to result in multiple instances (with multiple corresponding hidden states) stored in cache, as suggested by Walters, resulting in a second set of encoded states corresponding to a second instance of the RNN fraud model that has been executed with respect to a transaction between the payment provider and a second user having a second user account with the payment provider, in order to advantageously recognize transaction patterns specific to a given customer (¶13 of Walters):
¶13 of Walters: […] retrains or continues to train based on the purchase history of that specific customer, advantageously training the neural network to recognize specific transaction patterns of that specific customer. As a result, determinations by the neural network about non-fraudulent transactions are based on predicted transactions for each customer.

With respect to claim 13, Zhang in view of Le discloses all the elements of parent claim 10. Furthermore, Zhang discloses: […] training the RNN fraud model using a plurality of features that were previously generated based on prior transactions of a plurality of other user accounts with the service provider. (¶¶84, 81 in further view of ¶21 of Zhang disclose that features extracted from interactions (e.g., transactions) and authorization decision features corresponding to a plurality of other accountholders are used to train the model).

¶84 of Zhang: In step 308, the analytical model may analyze the interaction data features and the authorization decision features. The analytical model may analyze the training data associated with each user. As the analytical model processes the training data, LSTM in the analytical model can update a cell state and a hidden state. For each interaction in the training data that the analytical model processes, it may output a predicted interaction label and a predicted authorization decision label. The output may be risk score and/or a risk label. The analytical model may then calculate classification loss by comparing the predicted interaction label to the actual interaction label and comparing the predicted authorization label to the actual interaction label. The analytical model can recursively process the training data to minimize the classification loss. When training the analytical model, dropouts may be applied in each LSTM, with a dropout probability of 0.5.

¶81 of Zhang: In step 302, the processing computer can receive prior authorization request data from a plurality of past interactions. The prior authorization request data may form part of a training dataset. […] For example, the prior authorization request data may be derived from interaction histories of a plurality of users […]

¶21 of Zhang: A “user” may include an individual or a computational device. […] the user may be a cardholder, account holder, or consumer.

Zhang in view of Le fails to disclose, but Walters discloses: prior to the running of the RNN fraud model [training the RNN fraud model]. (¶¶12, 19 of Walters):

¶12 of Walters: Moreover, embodiments may train the neural networks based on a transaction history or purchase history for each specific customer. Many embodiments pretrain the neural network based on purchase histories of multiple customers or all customers. Thereafter, an instance of that neural network is assigned to a specific customer and retrains or continues to train based on the purchase history of that specific customer, advantageously training the neural network to recognize specific transaction patterns of that specific customer. As a result, determinations by the neural network about non-fraudulent transactions are based on predicted transactions for each customer.

¶19 of Walters: In many embodiments, a service on one or more servers may pretrain the neural networks with the multiple customers' transaction data and each specific customer's transaction data prior to operating the neural network in inference mode to detect fraudulent transactions for a customer.

Examiner’s Note: Examiner interprets the limitation of “prior to running” as meaning the aforementioned running is with respect to a specific user, and that the prior training 

¶32 of Applicant Specification: According to certain embodiments, each user account may be provided with an initial set of hidden states (stored in the cache 112) before running the RNN fraud model for the first time with respect to transactions of the user account. The initial set of hidden states may [be] the same or may be different for each user account. Over time, however, the set of hidden states for each user account are likely to change based on different transaction information for the different transactions in which each of the user accounts participate.
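
For illustration only, the arrangement described in ¶32 (an initial set of hidden states per user account, held in a cache and replaced as that account's transactions are processed) can be sketched as follows. This is a minimal sketch and not Applicant's implementation: the names HiddenStateCache, initial_state, step, and process_transaction, the hidden size of 4, and the toy tanh update standing in for an RNN cell are all assumptions introduced here.

```python
import math
from typing import Dict, List

HIDDEN_SIZE = 4  # hypothetical size, chosen only for this sketch


def initial_state() -> List[float]:
    """Initial hidden states given to an account before its first run."""
    return [0.0] * HIDDEN_SIZE


def step(hidden: List[float], features: List[float]) -> List[float]:
    """Toy recurrent update standing in for a real RNN cell (assumption)."""
    return [math.tanh(0.5 * h + 0.1 * x) for h, x in zip(hidden, features)]


class HiddenStateCache:
    """Holds exactly one set of hidden states per user account."""

    def __init__(self) -> None:
        self._states: Dict[str, List[float]] = {}

    def get(self, account_id: str) -> List[float]:
        # An account seen for the first time receives the initial states.
        if account_id not in self._states:
            self._states[account_id] = initial_state()
        return self._states[account_id]

    def put(self, account_id: str, hidden: List[float]) -> None:
        # Overwrite, so only the latest set is kept for this account.
        self._states[account_id] = hidden


def process_transaction(cache: HiddenStateCache, account_id: str,
                        features: List[float]) -> List[float]:
    """Read the account's cached states, update them, and store them back."""
    hidden = step(cache.get(account_id), features)
    cache.put(account_id, hidden)
    return hidden
```

In this sketch every account starts from the same initial states, and the states of different accounts diverge only as their respective transactions are processed, which mirrors the last sentence of ¶32.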

Accordingly, it would have been obvious to one having ordinary skill in the art prior to the effective filing date of the claimed invention that an instance of the model of Zhang in view of Le could be assigned to each specific user, as suggested by Walters, with the model pretrained prior to the running of the RNN model for the specific assigned user, in order to advantageously increase the robustness of the training, enabling the model to learn common sequences of transactions and thereby decreasing the likelihood of false positives (¶17 of Walters):

¶17 of Walters: […] In several embodiments, the neural network is initially pretrained with sets of transactions from multiple customers to train the neural network about common sequences of transactions. Some embodiments select different sets of transactions from the multiple customers to train the neural network with transaction sequences that have different counts […], advantageously increasing the robustness of the neural network's ability to recognize non-fraudulent transactions. […]

With respect to claim 16, it is rejected under the same rationale as claim 5 (above), mutatis mutandis. (Particularly, see mapping of claim 5 pertaining to Walters, and corresponding obviousness statement).

With respect to claim 19, it is rejected under the same rationale as claim 13 (above), mutatis mutandis. (Particularly, see mapping of claim 13 pertaining to Walters, and corresponding obviousness statement).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 
United States Application Publication No. US-20200092392-A1 to Seelam (“Seelam”), disclosing scheduling cache for datasets caching (¶65) corresponding to distributed deep learning jobs (¶63 in view of ¶15).

Non-Patent Literature, “IMPROVING NEURAL LANGUAGE MODELS WITH A CONTINUOUS CACHE” to Grave (“Grave”)2, disclosing use of a cache on neural network hidden states (Fig. 1 caption) to pre-train (Page 1, last paragraph), improving prediction without additional training (Page 1, last paragraph).

Non-Patent Literature, “4 Major Challenges facing Fraud Detection; Ways to Resolve Them using Machine Learning” to Razorthink (“RazorThink”)3, disclosing a use case of LSTM to detect fraud based on IP address (i.e., network identifier associated with payment device of user) and city (i.e., geolocation) to determine whether or not a transaction is fraudulent: “For example, an LSTM (Long Short Term Memory) deep learning model is useful for detecting fraud in a sequence of events. If a user logs in with a new IP address from a different city, changes his street address on file, then purchases an expensive item on an e-commerce site, LSTM might flag this transaction as fraudulent. None of these events alone is indicative of fraud, but the sequence of all three is.”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK A MALKOWSKI whose telephone number is (313)446-6624.  The examiner can normally be reached on Monday - Friday 9:30AM-7:00PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ryan Donlon can be reached on (571) 270-3602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.A.M./Examiner, Art Unit 3695

/ABDULMAJEED AZIZ/Primary Examiner, Art Unit 3695

1 See PTO-892 reference “U”
2 See PTO-892 reference “V”
3 See PTO-892 reference “W”