DETAILED ACTION
This action is in response to communications filed on 04/05/2021 in which claims 1-20 are still pending.
This action is non-final.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant claims the benefit of a National Stage of International Application No. PCT/US2018/055039, filed October 9, 2018, which is acknowledged.

Drawings
The drawings were received on 04/05/2021.  These drawings are acceptable.

Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 04/05/2021 have been considered by the examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-6, 8-10, and 14-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 1, the claim recites the limitation “receiving, by the computer, current data for an interaction, wherein the current data”, which is unclear. The claim refers to receiving current data, but the clause “wherein the current data” is recited with no limiting features. It is unclear what the wherein clause adds to the limitation. Should the next limitation be considered part of the wherein clause, or should the wherein clause be omitted? Because the intended scope of the claim limitation is unclear, the claim is rendered indefinite.
Regarding claims 2-6 and 8-10 that are dependent on claim 1, the claims do not resolve the noted issues in claim 1 and are therefore appropriately rejected.  

Regarding claims 4-6, the claims recite the term/phrase “behavioral plane” in the claim limitations. The term is not a term of art and the claims do not provide a definition or standard for determining the intended scope of the term “behavioral plane”. In addition, the specification merely recites claim language and does not provide a definition or standard for determining the intended scope of the term.
Regarding claims 14-16, the claims recite similar limitations as claims 4-6 respectively and are therefore rejected under the same rationale.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 4-7, 11-12, and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Walters et al. (US Pub. No. 2020/0065813, hereinafter ‘Walt’) in view of Gupta et al. (US Pub. No. 2019/0391901, hereinafter ‘Gupta’).

Regarding independent claim 1, Walt teaches a method comprising: receiving, by a computer, historical interaction data, wherein the historical interaction data includes a plurality of historical interactions, each historical interaction associated with a plurality of data fields, the data fields being interdependent; (in 0067: … the fraud detection logic circuitry 2000 receives the transaction data [claimed historical interaction data] 2005 as an input and provides the input to the instance of the neural network 2010 trained for the specific customer.)
(claimed assigned weight as neural network layer weights applied to transaction data, in 0063: … Backward propagation of the error may adjust weights [claimed assigned weight for training adjustment] and biases in the layers of the neural network to reduce the error. The backward propagation of the error may effectively adjust the range of predicted transactions responsive to the transaction data [including claimed plurality of data fields] that caused the neural network to output the error.)
generating, by the computer, a neural network using the plurality of weights and the plurality of data fields; (in 0063: A backprop 2046 logic circuitry of the trainer 2040 may train the neural network [claimed generated neural network] 2010 by backward propagation of the error that is output by the neural network 2010 in response to the training data [claimed using weights and plurality of data fields]. Backward propagation of the error may adjust weights [claimed using weights to generate claimed trained neural network] and biases in the layers of the neural network to reduce the error…)
identifying, by the computer, using the neural network, a first plurality of … indicators indicative of a first class, the first class being different from a second class; receiving, by the computer, a second plurality of … indicators derived from data relating to compromised accounts; (in 0016: Note that while the neural network is trained to predict a transaction by a customer or to detect a non-fraudulent transaction, fraud detection logic circuitry may detect fraud based on a determination by the neural network of the error associated with predicting a transaction conducted by the customer or classifying the transaction [claimed identification of a first plurality of feature indicators indicative of a first class, the first class being different from a second class] conducted by the customer. In other words, the neural network learns what a non-fraudulent transaction [claimed first class] for a customer looks like in terms of the data input to the neural network for the transaction. When data for a non-fraudulent transaction is input at the input layer of the neural network, the neural network may output an error that is small to indicate that the transaction closely matches a predicted transaction or transaction classification [claimed second class]. On the other hand, when data for a fraudulent transaction is provided at the input layer of the neural network, the neural network may output a large error indicating that the transaction does not match a predicted transaction well or transaction classification well.
Thus, the fraud detection logic circuitry may determine if a transaction is likely fraudulent [claimed second class] by comparing the error output by the neural network to a deviation threshold.; And where the neural network is trained to learn the respective claimed features for determining a respective predicted class/classification, in 0012-0017: …More specifically, embodiments may train neural networks to learn what a non-fraudulent transaction is and when transaction data for a fraudulent transaction is input into the neural network, the neural network may produce an error indicative of the difference between a non-fraudulent transaction and the transaction provided as the input…Thereafter, an instance of that neural network is assigned to a specific customer and retrains or continues to train based on the purchase history of that specific customer, advantageously training the neural network to recognize specific transaction patterns [claimed identified plurality of feature indicators indicative of each respective class for a trained predicted class] of that specific customer. As a result, determinations by the neural network about non-fraudulent transactions are based on predicted transactions for each customer… Thus, the fraud detection logic circuitry may determine if a transaction is likely fraudulent by comparing the error output by the neural network to a deviation threshold. For transactions in which the output error falls below the deviation threshold, the fraudulent detection logic circuitry may consider the transaction to be non-fraudulent.
For transactions in which the output error reaches or exceeds the deviation threshold, the fraudulent detection logic circuitry may consider the transaction to be fraudulent or to potentially be fraudulent…; And where the data is classified based on extracted features as the recited plurality of transaction data sets of time series data and patterns associated with classifying each respective detected class, in 0012-0017 and in 0026 & 0028: The first set of server(s) 1010 may pretrain the neural network 1017 to detect, classify, and/or predict non-fraudulent transactions by training the neural network 1017 with sets of transactions from multiple customers. Each set of transactions may comprise a sequence of transactions that occur in a series such as a time series or time sequence… In several embodiments, one or more the server(s) 1010 may perform fraud detection with the neural networks 1037 and 1047. For example, the when the customer associated with the customer device 1030 completes a transaction such as purchasing gas, the fraud detection logic circuitry 1015 may apply transaction data that describes the purchase as a tensor to the input layer of the neural network… If the customer buys the gas at the same gas station for about the same amount, at about the same time, on about the same day of the week that the customer normally purchases gas for a vehicle, the error output from the neural network 1037 will likely be very small if not nil. On the other hand, if one or more of these factors deviate significantly from the customer's purchase history and/or from the sequences of transactions learned from training on transaction data from multiple customers, the error output from the neural network 1037 may be large.)
updating, a … distribution component in the computer using the first plurality of feature indicators and the second plurality of feature indicators; (in 0065: The fraud detection logic circuitry 2000 may then retrain [claimed updating] each of the instances 2012 of the neural network 2010 with specific customer purchase history data 2024. The specific customer purchase history data 2024 represents a set of the transaction data for a specific customer and there is a set for each customer, or at least each customer in the group of customers [claimed distribution of using claimed feature indicators]. In other words, an instance 2012 of the neural network 2010 is pretrained with the purchase history of a specific customer so that the fraud detection via the instance 2012 for a specific customer is, advantageously, based on that specific customer's purchase history.; And the set/group of customer transaction data indicative of the class features, in 0066-0067: In many embodiments, the retraining with the specific customer purchase history data 2024 may occur with sets of transactions that the trainer 2040 selects from the specific customer purchase history data 2024. The random 2042 logic circuitry may occasionally or periodically generate a set of random, non-fraudulent transactions in a time sequence or time series for training one of the instances 2012 of the neural network 2010. Furthermore, the fuzzy 2044 logic circuitry may adjust values of the transaction data from the specific customer purchase history data 2024. For instance, the fuzzy 2044 logic circuitry may change a price of a grocery bill or other transaction, the time of a dinner bill, or the like.
…Once the fraud detection logic circuitry 2000 retrains one of the instances 2012 for a specific customer, the instance of the neural network 2010 can perform fraud detection [claimed using the first plurality of feature indicators and the second plurality of feature indicators] for the specific customer and, in several embodiments, continue to train with new, non-fraudulent transactions completed by the customer.... The payment instrument issuer may comprise a server to perform fraud detection based on the instance of the neural network 2010 that is trained for this specific customer or may hire a third party to perform the fraud detection... The fraud determiner 2030 may determine if the error or deviation output by the instance of the neural network 2010 pretrained for the specific customer indicates that the transaction is non-fraudulent or might be fraudulent [claimed using the first plurality of feature indicators and the second plurality of feature indicators]…)
receiving, by the computer, current data for an interaction, wherein the current data; (claimed current time as data used to predict a future value, in 0014: In many embodiments, the neural network may pretrain on a server of, e.g., a payment instrument issuer, with function approximation, or regression analysis, or classification. Function approximation may involve time series prediction and modeling. A time series is a series of data points indexed (or listed or graphed) in time order… Neural networks may perform time series analysis to extract meaningful statistics and other characteristics of the data such as time series forecasting to predict future values based on previously observed values [including claimed receiving, by the computer, current data for an interaction].; And in 0026-0027: The first set of server(s) 1010 may pretrain the neural network 1017 to detect, classify, and/or predict non-fraudulent transactions by training the neural network 1017 with sets of transactions from multiple customers. Each set of transactions may comprise a sequence of transactions that occur in a series such as a time series or time sequence… a second set of one or more server(s) 1010 may continue to train or retrain one or more instances of the neural network 1017 with purchase history of one or more customers. For example, some embodiments fully train the neural network 1017 with the transaction data from multiple customers prior to training an instance of the neural network 1017 with purchase history of a specific customer… For example, the when the customer associated with the customer device 1030 completes a transaction such as purchasing gas, the fraud detection logic circuitry 1015 may apply transaction data that describes the purchase as a tensor to the input layer of the neural network 1037 [claimed receiving, by the computer, current data for an interaction].
The neural network 1037 may operate in inference mode and output an indication of error associated with the purchase.)
 applying, by the computer, the … distribution component to the current data; (claimed application as input of received data to the classification, in 0028: …The neural network 1037 may operate in inference mode and output an indication of error associated with the purchase. The error may represent a difference between the purchase of the gas and a predicted range of transactions that the neural network 1037 determines based on the pretraining and/or the continued training with new transactions for this customer. If the customer buys the gas at the same gas station for about the same amount, at about the same time, on about the same day of the week that the customer normally purchases gas for a vehicle, the error output from the neural network 1037 will likely be very small if not nil. On the other hand, if one or more of these factors deviate [claimed applying, by the computer, the … distribution component to the current data] significantly from the customer's purchase history and/or from the sequences of transactions learned from training on transaction data from multiple customers, the error output from the neural network 1037 may be larger [claimed applying, by the computer, the … distribution component to the current data].)
and scoring, by the computer, the interaction using the … distribution component. (error as score, in 0016: Note that while the neural network is trained to predict a transaction by a customer or to detect a non-fraudulent transaction, fraud detection logic circuitry may detect fraud based on a determination by the neural network of the error associated with predicting a transaction conducted by the customer or classifying the transaction conducted by the customer… When data for a non-fraudulent transaction is input at the input layer of the neural network, the neural network may output an error that is small to indicate that the transaction closely matches a predicted transaction or transaction classification…; And using the distribution as the claimed data classes for detecting fraudulent and non-fraudulent transactions, in 0016 and in 0065: The fraud detection logic circuitry 2000 may then retrain each of the instances 2012 of the neural network 2010 with specific customer purchase history data 2024. The specific customer purchase history data 2024 represents a set of the transaction data for a specific customer and there is a set for each customer, or at least each customer in the group of customers. In other words, an instance 2012 of the neural network 2010 is pretrained with the purchase history of a specific customer so that the fraud detection via the instance 2012 for a specific customer is, advantageously, based on that specific customer's purchase history.)
Examiner notes that the claimed computer is disclosed in Walt as the server system depicted in Fig. 1A, in 0033-0034: FIG. 1B depicts an embodiment for an apparatus 1100 such as one of the server(s) 1010, the customer device 1030, and/or the customer device 1040 shown in FIG. 1A. The apparatus 1100 may be a computer in the form of a smart phone, a tablet, a notebook, a desktop computer, a workstation, or a server. The apparatus 1100 can combine with any suitable embodiment of the systems, devices, and methods disclosed herein. The apparatus 1100 can include processor(s) 1110, a non-transitory storage medium 1120, communication interface 1130,… The processor(s) 1110 may operatively couple with a non-transitory storage medium 1120. The non-transitory storage medium 1120 may store logic, code, and/or program instructions executable by the processor(s) 1110 for performing one or more instructions including the fraud detection logic circuitry 1125…
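For illustration only, the deviation-threshold comparison quoted above from Walt (0016-0017) can be sketched as follows. This is a minimal sketch and not code from the reference: the function names, the use of a mean squared difference as the error measure, and the numeric values are all assumptions made for the example.

```python
from typing import Sequence


def prediction_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean squared difference between predicted and observed transaction values.

    The error measure is an assumption; the quoted passages only say that the
    neural network outputs an error for the transaction.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def classify_transaction(error: float, deviation_threshold: float) -> str:
    """Compare the output error to a deviation threshold.

    An error below the threshold is treated as non-fraudulent; an error that
    reaches or exceeds the threshold is treated as potentially fraudulent.
    """
    if error < deviation_threshold:
        return "non-fraudulent"
    return "potentially fraudulent"


if __name__ == "__main__":
    # Hypothetical values: a purchase close to the prediction, then a distant one.
    threshold = 0.5
    print(classify_transaction(prediction_error([1.0, 2.0], [1.1, 2.1]), threshold))
    print(classify_transaction(prediction_error([1.0, 2.0], [1.0, 4.0]), threshold))
```

The boundary case (error equal to the threshold) is classified as potentially fraudulent, matching the quoted "reaches or exceeds" language.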

While Walt teaches the use of transaction data for training a neural network to detect non-fraudulent and fraudulent classes based on the distribution of the data features to learn patterns in the historical transaction data, and teaches capturing a plurality of features associated with a respective customer in the multiple-customer database via the networked server environment depicted in Fig. 1A and Fig. 2, Walt does not expressly disclose the data features captured in a networked/server environment and processed as part of the learning and training process as recited in the claim limitations:
identifying, by the computer, using the neural network, a first plurality of feature indicators indicative of a first class, the first class being different from a second class; receiving, by the computer, a second plurality of feature indicators derived from data relating to compromised accounts;
updating, a probability distribution component in the computer using the first plurality of feature indicators and the second plurality of feature indicators; 
applying, by the computer, the probability distribution component to the current data;
and scoring, by the computer, the interaction using the probability distribution component.
	However, Gupta does expressly teach the claim limitations:
identifying, by the computer, using the neural network, a first plurality of feature indicators indicative of a first class, the first class being different from a second class; (claimed features extracted based on the respective classifier as claimed first and second classifiers, in 0021-0022: In FIG. 1, the functionality of the detector 101 has been logically organized into an anomaly feature extractor 107, a fuzzy rule-based classifier 109, and an artificial neural network (ANN) 111… The anomaly feature extractor 107 extracts memory anomaly features from the time-series dataset 105 by reducing the size of input to be supplied to the fuzzy rule-based classifier 107 and the ANN 111 and by deriving memory anomaly features from one or more metrics indicated in the time-series dataset 105. To focus the analysis (i.e., reduce the input size for analysis by the classifiers), the anomaly feature extractor 107 selects values of other metrics that correlate with the GC operation invocations by time… The fuzzy rule-based classifier 109 is a set of rules for pattern-based detection of memory anomalies. The rules are weighted. The weights of breached or satisfied rules are aggregated into probabilities or confidence values associated with corresponding labels of a first classification "anomaly" and a second classification "no anomaly"…)
updating, a probability distribution component in the computer using the first plurality of feature indicators and the second plurality of feature indicators; (updating as aggregation of probabilities associated with the classes, in 0021-0022: … The anomaly feature extractor 107 can supply the severity value to the detected anomaly interface 113 directly or pass it through the fuzzy rule-based classifier 109. The anomaly feature extractor 107 assembles the extracted features into an input vector represented as v(m1, m2, m3, m4, mn), which flows to the fuzzy rule-based classifier 109. The fuzzy rule-based classifier 109 is a set of rules for pattern-based detection of memory anomalies. The rules are weighted. The weights of breached or satisfied rules are aggregated into probabilities or confidence values associated with corresponding labels of a first classification "anomaly" and a second classification "no anomaly"…; And in 0040-0042: Although the discussed memory anomaly detector reduces the features fed into the classifiers to achieve a lightweight solution, the classification process still consumes resources. While canonical behavior or a baseline for an application can be established for selective use of the classifiers, an application's canonical behavior cannot be presumed as static… An adaptive canonical behavior filter builds a sample dataset from observed time-series values of memory related metrics and then performs kernel density estimation on the sample dataset. With the resulting probability density function, the adaptive canonical behavior filter filters out subsequently observed time-series values of the memory related metrics that fall within a canonical behavior range that is specified/configured… The adaptive canonical behavior filter 501 comprises a multivariate kernel density estimator 507 and a probability based filter 509. The adaptive canonical behavior filter 501 operates in two phases that can overlap.
In the first phase, the multivariate kernel density estimator ("estimator") 507 determines a probability density function for a next window of time-series data or prospective time-series dataset. With the probability density function, the probability based filter 509 filters metric time slices based on a probability range defined for canonical behavior.)
applying, by the computer, the probability distribution component to the current data; (in 0046: … After an initial probability density function has been built, the adaptive filter can be programmed to filter time slices for a defined time period or number of time slices before resuming the first phase and operating in both phases: 1) applying the probability density function to obtained time slices for a current time window, and 2) using the time slices of the current time window build a new sample dataset to build a new probability density function, thus adapting the canonical baseline.)
and scoring, by the computer, the interaction using the probability distribution component. (probability values as claimed scoring, in 0044: The probability based filter 509 applies the probability density function to time slices received subsequent to determination of the probability density function. A probability range will be defined for canonical behavior. After generating a probability value from the probability density function for a time slice of metric values, the probability based filter 509 determines whether the probability value falls within the defined probability range for canonical behavior...)
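For illustration only, the two-phase behavior quoted above from Gupta (0040-0046) can be sketched as building a density estimate from a sample dataset and then checking whether the value it yields for a new observation falls within a configured range. This is a minimal sketch under stated assumptions and not code from the reference: it uses a univariate Gaussian kernel where Gupta describes a multivariate estimator, and the function names, bandwidth, and range bounds are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a density function estimated from samples with a Gaussian kernel.

    Univariate simplification (assumption); the bandwidth is chosen by the caller.
    """
    if not samples or bandwidth <= 0:
        raise ValueError("samples must be non-empty and bandwidth must be positive")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def is_canonical(density_value: float, canonical_range: Tuple[float, float]) -> bool:
    """True when the density value lies inside the configured canonical range."""
    low, high = canonical_range
    return low <= density_value <= high


if __name__ == "__main__":
    # Phase 1: build the density from a hypothetical sample window.
    density = gaussian_kde([10.0, 10.5, 9.8, 10.2, 10.1], bandwidth=0.5)
    # Phase 2: filter new observations against a hypothetical canonical range.
    for value in (10.0, 14.0):
        print(value, is_canonical(density(value), (0.05, 1.0)))
```

Note that a kernel density estimate returns a density rather than a probability in the strict sense; the sketch follows the quoted passage in comparing that value against a configured range.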
The Walt and Gupta references would have been recognized by those of ordinary skill in the art as useful for applicant’s purpose in developing information processing systems that use learning algorithms for data classification tasks on time series data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for processing transaction time series data for anomaly detection and fraud classifications using neural network classifiers as disclosed by Walt with the method of information processing of time series data using fuzzy rules and neural networks for classifying of data anomalies based on probability density distributions as disclosed by Gupta.
One of ordinary skill in the art would have been motivated to combine the disclosed teachings of Walt and Gupta in order to enable the use of smaller artificial neural networks by allowing the memory anomaly detector to derive “additional features from the correlated values to present a smaller input vector to the two classifiers: a fuzzy rule-based classifier and an artificial neural network” (Gupta, 0017); Doing so would allow “the memory anomaly detector to be "lightweight" because it is less 

Regarding claim 2, the rejection of claim 1 is incorporated and Walt in combination with Gupta teaches the method of claim 1, wherein the neural network is a first neural network, the plurality of data fields is a first plurality of data fields, and wherein the method further comprises: receiving, by a computer, recent interaction data, wherein the recent interaction data includes a plurality of recent interactions, each recent interaction associated with a second plurality of data fields, the data fields being interdependent; (claimed received data fields as new current data sets interdepending based on the monitored related customers and transaction data field data captured in the recent time series interaction transaction datasets, in 0014: In many embodiments, the neural network may pretrain on a server of, e.g., a payment instrument issuer, with function approximation, or regression analysis, or classification. Function approximation may involve time series prediction and modeling. A time series is a series of data points indexed (or listed or graphed) in time order… Neural networks may perform time series analysis to extract meaningful statistics and other characteristics of the data such as time series [including claimed receiving, by recent data for an interaction] forecasting to predict future values based on previously observed values; And in 0026-0027: … Each set of transactions may comprise a sequence of transactions that occur in a series such as a time series or time sequence… a second set of one or more server(s) 1010 may continue to train or retrain one or more instances of the neural network 1017 with purchase history of one or more customers [claimed interdependent data series transaction datasets].
For example, some embodiments fully train the neural network 1017 with the transaction data from multiple customers prior to training an instance of the neural network 1017 with purchase history of a specific customer… For example, the when the customer associated with the customer device 1030 completes a transaction such [transaction data as claimed interaction data having claimed data fields] as purchasing gas, the fraud detection logic circuitry 1015 may apply transaction data that describes the purchase as a tensor to the input layer of the neural network 1037...)
generating, by the computer, a second neural network using a second plurality of weights and the second plurality of fields associated with the recent interaction data; identifying, by the computer, using the second neural network, a third plurality of feature indicators indicative of the first class; (claimed plurality of neural networks as depicted in Fig. 1A, processing identified features of the fraud detection classes, including claimed first class, in 0033: FIG. 1B depicts an embodiment for an apparatus 1100 such as one of the server(s) 1010, the customer device 1030, and/or the customer device 1040 shown in FIG. 1A… The processor(s) 1110 may comprise processing circuitry to implement fraud detection logic circuitry 1115 such as the fraud detection logic circuitry 1015, 1035, or 1045 in FIG. 1A.; And claimed plurality of weights used to model neural network models for training identified features in the transaction data as claimed fields associated with interaction data indicative of data detect class features, as depicted in Fig. 1A, in 0063-0064: A backprop 2046 logic circuitry of the trainer 2040 may train the neural network 2010 by backward propagation of the error that is output by the neural network 2010 in response to the training data. Backward propagation of the error may adjust weights [claimed a second neural network using a second plurality of weights and the second plurality of fields associated with the recent interaction data] and biases in the layers of the neural network to reduce the error. The backward propagation of the error may effectively adjust the range of predicted transactions responsive to the transaction data that caused the neural network to output the error. After pretraining the neural network 2010 with the multiple customer purchase history data 2022, the fraud detection logic circuitry 2000 may create multiple instances 2012 of the neural network 2010.
In some embodiments, the fraud detection logic circuitry 2000 may create one of the instances 2012 for every customer. In other embodiments, the fraud detection logic circuitry 2000 may create one of the instances 2012 for every customer within a group of customers [a plurality of features associated with a third customer as claimed a third plurality of feature indicators indicative of the first class]… The fraud detection logic circuitry 2000 may select the group of customers based on various criteria, based on requests to "opt-in" from the group of customers, based on a list of customers provided to the fraud detection logic circuitry 2000, and/or based on other criteria.)
and updating, the … distribution component in the computer using the first plurality of feature indicators and the second plurality of feature indicators comprises updating the … distribution component in the computer using the first plurality of feature indicators, the second plurality of feature indicators, and the third plurality of feature indicators. (in 0065: The fraud detection logic circuitry 2000 may then retrain [claimed updating] each of the instances 2012 of the neural network 2010 with specific customer purchase history data 2024 [feature indicators of customer transactions]. The specific customer purchase history data 2024 represents a set of the transaction data for a specific customer and there is a set for each customer [claimed plurality of feature indicators including claimed first, second, and third], or at least each customer in the group of customers [claimed distribution of using claimed feature indicators]. In other words, an instance 2012 of the neural network 2010 is pretrained with the purchase history of a specific customer so that the fraud detection via the instance 2012 for a specific customer is, advantageously, based on that specific customer's purchase history.; And in 0066-0067: In many embodiments, the retraining with the specific customer purchase history data 2024 may occur with sets of transactions that the trainer 2040 selects from the specific customer purchase history data 2024. The random 2042 logic circuitry may occasionally or periodically generate a set of random, non-fraudulent transactions in a time sequence or time series for training one of the instances 2012 of the neural network 2010.
Furthermore, the fuzzy 2044 logic circuitry may adjust values of the transaction data from the specific customer purchase history data 2024. For instance, the fuzzy 2044 logic circuitry may change a price of a grocery bill or other transaction, the time of a dinner bill, or the like. … Once the fraud detection logic circuitry 2000 retrains one of the instances 2012 for a specific customer, the instance of the neural network 2010 can perform fraud detection [claimed use of the claimed plurality of feature indicators for the respective customer] for the specific customer and, in several embodiments, continue to train with new, non-fraudulent transactions completed by the customer… The payment instrument issuer may comprise a server to perform fraud detection based on the instance of the neural network 2010 that is trained for this specific customer or may hire a third party to perform the fraud detection… The fraud determiner 2030 may determine if the error or deviation output by the instance of the neural network 2010 pretrained for the specific customer indicates that the transaction is non-fraudulent or might be fraudulent [claimed distribution component for determining fraud using the claimed plurality of feature indicators]…)

	
	Walt teaches using features of the data associated with the plurality of data sets, including the claimed first, second, and third sets of features, as extracted from a plurality of networked devices for classification of data features by the classification model, as disclosed above. However, Walt does not expressly disclose the use of probability distributions using the classification feature indicators, as recited in the claim limitation:
and updating, the probability distribution component in the computer using … plurality of feature indicators and the … plurality of feature indicators comprises updating the probability distribution component in the computer using the … plurality of feature indicators, the … plurality of feature indicators, and the … plurality of feature indicators.
	Gupta does expressly teach the use of probability distributions using the classification feature indicators, as recited in the claim limitation:
and updating, the probability distribution component in the computer using … plurality of feature indicators and the … plurality of feature indicators comprises updating the probability distribution component in the computer using the … plurality of feature indicators, the … plurality of feature indicators, and the … plurality of feature indicators. (updating as aggregation of probabilities associated with the classes, in 0021-0022: … The anomaly feature extractor 107 can supply the severity value to the detected anomaly interface 113 directly or pass it through the fuzzy rule-based classifier 109. The anomaly feature extractor 107 assembles the extracted features into an input vector represented as v(m1, m2, m3, m4, mn) [claimed use of feature indicators], which flows to the fuzzy rule-based classifier 109. The fuzzy rule-based classifier 109 is a set of rules for pattern-based detection of memory anomalies. The rules are weighted. The weights of breached or satisfied rules are aggregated into probabilities or confidence values associated with corresponding labels of a first classification "anomaly" and a second classification "no anomaly"…; And in 0040-0042: Although the discussed memory anomaly detector reduces the features fed into the classifiers to achieve a lightweight solution, the classification process still consumes resources. While canonical behavior or a baseline for an application can be established for selective use of the classifiers, an application's canonical behavior cannot be presumed as static… An adaptive canonical behavior filter builds a sample dataset from observed time-series values of memory related metrics and then performs kernel density estimation on the sample dataset.
With the resulting probability density function, the adaptive canonical behavior filter filters out subsequently observed time-series values of the memory related metrics that fall within a canonical behavior range that is specified/configured… The adaptive canonical behavior filter 501 comprises a multivariate kernel density estimator 507 and a probability based filter 509…; And plurality of data features collected from the network environment devices, in 0017: A memory anomaly detector has been designed that is lightweight and non-intrusive. The lightweight, non-intrusive memory anomaly detector has been designed to extract features for classification by a rule-based classifier until a second classifier has been trained by the rule-based classifier. The memory anomaly detector correlates values in time-series data for selected memory related metrics ("correlated features"). This data can be efficiently collected by probes or agents without being intrusive with the application component (e.g., virtual machines (VMs)) being monitored [claimed plurality of feature indicators associated with a plurality of monitored devices for classifying input data]…)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Walt and Gupta for the same reasons disclosed above.
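For illustration only, the kernel density estimation filter quoted from Gupta (0040-0042) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Gupta's implementation: the one-dimensional Gaussian kernel, the bandwidth, the density cutoff, and the names gaussian_kde_density and is_canonical are all hypothetical choices made for the example.

```python
import math

def gaussian_kde_density(samples, x, bandwidth):
    """Estimate the probability density at x from 1-D samples using a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

def is_canonical(samples, x, bandwidth=1.0, min_density=0.05):
    """Return True when x lies in the high-density (canonical) range.

    Assumption: a value counts as canonical when its estimated density
    is at least min_density; such values would be filtered out before
    reaching the classifiers.
    """
    return gaussian_kde_density(samples, x, bandwidth) >= min_density

# Hypothetical baseline of observed metric values (e.g., memory usage in percent).
baseline = [50.0, 51.0, 49.5, 50.5, 50.2, 49.8]
print(is_canonical(baseline, 50.1))  # value near the baseline
print(is_canonical(baseline, 80.0))  # value far from the baseline
```

In this sketch, a value close to the baseline samples falls within the canonical range, while a distant value does not and would be passed on for classification.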

Regarding claim 4, the rejection of claim 1 is incorporated and Walt in combination with Gupta teaches the method of claim 1, further comprising: generating a behavioral plane for a user profile with respect to time. (in 0014-0015: … Neural networks may perform time series analysis to extract meaningful statistics and other characteristics of the data such as time series forecasting to predict future values based on previously observed values. Classification may involve pattern and sequence recognition [claimed generating a behavioral plane for a user profile with respect to time], novelty detection, and sequential decision making. Neural networks may perform sequence learning such as sequence prediction, sequence generation, sequence recognition, and sequential decision making to classify transactions…; And claimed user profile as recognized patterns of transactions specific to the customer, in 0074: … the flowchart may proceed to pretrain the neural network based on the purchase history of a specific customer to train the neural network to recognize patterns of transactions specific to the customer (element 3020). In several embodiments, the fraud detection logic circuitry may train an instance of the neural network on a server of the payment instrument issuer or a server by a third party to pretrain the neural network. In other embodiments, an instance of the neural network that is pretrained with multiple customers' transaction histories is communicated to a customer device so the customer device can perform retraining with that customer's purchase history.)

Regarding claim 5, the rejection of claim 4 is incorporated and Walt in combination with Gupta teaches the method of claim 4, wherein a plurality of feature indicator weights belong to the user profile, and wherein the plurality of feature indicator weights are mapped in the behavioral plane. (claimed customer transaction features as transaction history specific to the customer patterns mapped using the neural network as part of training, in 0074: … the flowchart may proceed to pretrain the neural network based on the purchase history of a specific customer to train the neural network to recognize patterns of transactions specific to the customer (element 3020). In several embodiments, the fraud detection logic circuitry may train an instance of the neural network on a server of the payment instrument issuer or a server by a third party to pretrain the neural network…; and in 0063: A backprop 2046 logic circuitry of the trainer 2040 may train the neural network 2010 by backward propagation of the error that is output by the neural network 2010 in response to the training data. Backward propagation of the error may adjust weights and biases in the layers of the neural network to reduce the error. The backward propagation of the error may effectively adjust the range of predicted transactions responsive to the transaction data that caused the neural network to output the error.)

Regarding claim 6, the rejection of claim 5 is incorporated and Walt in combination with Gupta teaches the method of claim 5, further comprising: determining a number of deviations of feature (in 0038: In further embodiments, the fraud detection logic circuitry 1115 may also determine whether the transaction data is associated with a fraudulent location or blacklisted transaction data. In response to a determination that the transaction data is associated with a fraudulent location or blacklisted transaction data, a determination that the deviation exceeds the deviation threshold, or a weighted determination based on a combination thereof [claimed determined number of deviations of feature indicator weights for the plurality of current feature indicators from the behavioral plane, used to determine classification of the transaction], generate a communication for the customer, the communication to identify the transaction…; And 0044: … In the present embodiment, during training, the output of the objective function logic circuitry 1550 should be less than a deviation threshold because the training data is known to represent non-fraudulent transactions. When operating in inference mode, the fraud detection logic circuitry, such as the fraud detection logic circuitry 1115 shown in FIG. 1B, may compare the output of the objective function logic circuitry 1550 against the deviation threshold to determine if the error indicates a potentially fraudulent transaction or a non-fraudulent transaction.)
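As a hedged illustration of the inference-mode comparison quoted from Walt (0044), the sketch below compares an error value against a deviation threshold. It is not Walt's implementation: the function name classify_transaction, the numeric values, and the choice to treat an error equal to the threshold as non-fraudulent are assumptions made for the example.

```python
def classify_transaction(error, deviation_threshold):
    """Label a transaction from the model's output error.

    Assumption: an error strictly greater than the threshold is treated
    as potentially fraudulent; anything else is treated as non-fraudulent.
    """
    if error > deviation_threshold:
        return "potentially fraudulent"
    return "non-fraudulent"

# Hypothetical error values and threshold.
print(classify_transaction(error=0.02, deviation_threshold=0.10))
print(classify_transaction(error=0.35, deviation_threshold=0.10))
```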

Regarding claim 7, the rejection of claim 1 is incorporated and Walt in combination with Gupta teaches the method of claim 1, wherein the current data is received in an authorization request message from an access device operated by a resource provider. (claimed data resource provided as transaction server for receiving data as network messages as depicted in Fig. 1A, in 0014: In many embodiments, the neural network may pretrain on a server of, e.g., a payment instrument issuer, with function approximation, or regression analysis, or classification…; And in 0025: In the present embodiment, the server(s) 1010 may represent one or more servers owned and/or operated by a company that provides services. In some embodiments, the server(s) 1010 represent more than one company that provides services. For example, a first set of one or more server(s) 1010 may provide services including pretraining a neural network 1017 of a fraud detection logic circuitry 1015 with transaction data from more than one customer…)

Regarding independent claim 11, Walt teaches a computer comprising: a processor; and a computer readable medium, the computer readable medium comprising code, executable by the processor, for implementing a method comprising: (server system as claimed computer depicted in Fig. 1A, in 0033-0034: FIG. 1B depicts an embodiment for an apparatus 1100 such as one of the server(s) 1010, the customer device 1030, and/or the customer device 1040 shown in FIG. 1A. The apparatus 1100 may be a computer in the form of a smart phone, a tablet, a notebook, a desktop computer, a workstation, or a server. The apparatus 1100 can combine with any suitable embodiment of the systems, devices, and methods disclosed herein. The apparatus 1100 can include processor(s) 1110, a non-transitory storage medium 1120, communication interface 1130,… The processor(s) 1110 may operatively couple with a non-transitory storage medium 1120. The non-transitory storage medium 1120 may store logic, code, and/or program instructions executable by the processor(s) 1110 for performing one or more instructions including the fraud detection logic circuitry 1125…)
Examiner notes that the limitations of claim 11 are similar to those of claim 1 and are therefore rejected under the same rationale.

Regarding claims 12 and 14-17, the limitations are similar to those of claims 2 and 4-7, respectively, and are therefore rejected under the same rationale.

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Walters et al. (US Pub. No. 2020/0065813, hereinafter ‘Walt’) in view of Gupta et al. (US Pub. No. 2019/0391901, hereinafter ‘Gupta’) in further view of Keeler et al. (US Pat. No. 6,243,696, hereinafter ‘James’).

Regarding claim 3, the rejection of claim 2 is incorporated and Walt in combination with Gupta teaches the method of claim 2, wherein the first neural network is an … neural network and the second neural network is an … neural network. (plurality of neural networks as depicted in Fig. 1A, for processing data in a network server environment using online servers and offline network devices, in 0032-0033: In other embodiments, the server(s) 1010 may pretrain the neural network 1017 with sets of transaction data from multiple customers and pretrain instances of the neural network 1017 such as the neural networks 1037 and 1047 with purchase histories of specific customers. Thereafter, the server(s) 1010 may transmit the neural network 1037 to the customer device 1030 and the neural network 1047 to the customer device 1040. In such embodiments, the fraud detection logic circuitry 1035 may perform fraud detection for transactions by the customer associated with the customer device 1030 and the fraud detection logic circuitry 1045 may perform fraud detection for transactions by the customer associated with the customer device 1040. In further embodiments, the fraud detection logic circuitry 1035 may verify that new transactions are non-fraudulent, add the new, non-fraudulent transactions to training data, and train the neural network 1037 with the new transaction data… FIG. 1B depicts an embodiment for an apparatus 1100 such as one of the server(s) 1010, the customer device 1030, and/or the customer device 1040 shown in FIG. 1A. The apparatus 1100 may be a computer in the form of a smart phone, a tablet, a notebook, a desktop computer, a workstation, or a server. The apparatus 1100 can combine with any suitable embodiment of the systems, devices, and methods disclosed herein…)
Walt teaches the use of a plurality of neural networks for predicting classification labels in time series data, such as transaction data monitored in network environments, as disclosed above. Gupta also discloses learning patterns in monitored time series data for data classification.
However, Walt and Gupta do not expressly teach the use of offline and online combinations of neural network models for classifying time series data, as recited in the claim 3 limitation:
wherein … first neural network is an offline neural network and … second neural network is an online neural network.
James does expressly teach the use of offline and online combinations of neural network models for classifying time series data, as recited in the claim 3 limitation:
wherein … first neural network is an offline neural network and … second neural network is an online neural network. (plurality of classifier neural network models for classifying time series data, in 7:28-35: The neural network models that are utilized for time-series prediction and control require that the time-interval between successive training patterns be constant. Since the data that comes in from real-world systems is not always on the same time scale, it is desirable to time-merge the data before it can be used for training or running the neural network model…; And claimed use of online and offline systems for training models, in 20:47-65: The historical database 310 is utilized to collect information from the DCS 304… The run-time model 300 is operable to generate the predicted output and store it in the historical database 310 or in the DCS 304 under a predicted tag [classification]. Typically, the system administrator will define what predicted tags are available and will define whether another user can write to that tag. In conjunction with the run-time model 300 [claimed second on-line model], there is provided an off-line system. The off-line system is comprised of an off-line modeling process block 320 and a trainer 322. The off-line modeling block 320 [including claimed first offline model], as described hereinabove, is the system for preprocessing data and generating a model for performing a prediction. The trainer 322 is operable to vary the data such that the various "what-ifs" and setpoints can be varied…)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for processing time series data for model classifications using online and offline learning models as disclosed by James with the method of information processing of time series data using neural networks for classifying of data anomalies based on probability density distributions as collectively disclosed by Gupta and Walt.
One of ordinary skill in the art would have been motivated to combine the disclosed teachings of Walt, Gupta, and James in order to enable data processing over different time scales and to process time series historical data during the operational state of the monitored environment (James, 1:60-2:50). Doing so would provide for processing data using a combination of run-time current data and prior data stored in the data used to generate a predictive output using online and offline learning models (James, 2:34-2:50).
Regarding claim 13, the limitations are similar to those of claim 3 and are therefore rejected under the same rationale.

Claims 8-9 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Walters et al. (US Pub. No. 2020/0065813, hereinafter ‘Walt’) in view of Gupta et al. (US Pub. No. 2019/0391901, hereinafter ‘Gupta’) in further view of Telford-Reed et al. (US Pat. No. 11,062,002, hereinafter ‘Reed’).

Regarding claim 8, the rejection of claim 7 is incorporated and Walt in combination with Gupta teaches the method of claim 7, further comprising: declining access to a resource provided by the resource provider if the score exceeds a threshold. (claimed declining access to the storage resource for adding new training data, in 0070: … In other embodiments, the add 2050 logic circuitry may add the transaction data 2005 to the new transaction data for training 2026 upon determination that the error does not exceed the deviation threshold, along with an indication that the transaction data 2005 is not confirmed to be non-fraudulent. New transaction data for training 2026 that is not confirmed may not be unavailable for training purposes until the confirm 2048 logic circuitry changes the indication to indicate that the transaction data 2005 is verified as non-fraudulent…)
While Walt teaches the use of threshold criteria for providing access to server/provider resources for updating the training dataset for modeling and detecting fraudulent activity, and the use of the threshold criteria for determining resource access for processing the current transaction as training data, Walt and Gupta do not disclose declining access to a resource provided by the resource provider in the sense of declining to process a transaction and declining use of the resource provider's payment resource.
Reed does disclose processing a transaction as the claimed service provided by the resource provider. (in 21:42-44: … or check a fraud score associated with the transaction to determine whether to approve or decline the transaction…; And 27:19-23: … flagging up a potential fraudulent use of the payment instrument to a third party; flagging up a potential fraudulent use of the payment instrument to the payment instrument holder; and temporarily disabling the payment instrument…; And 18:42-51: … In the negative, the method moves to step 385 in which the payment is declined and optionally a further action may be carried out. The further action may include any combination of: contacting a third party such as the card issuer to flag up the potential fraudulent use of the payment instrument; contacting the payment instrument holder to flag up the potential fraudulent use of the payment instrument; and temporarily disabling the payment instrument. The invention is however not limited in this respect and any other action deemed appropriate to the skilled person upon detection of a potential fraudulent use of a payment instrument can additionally or alternatively be taken in step 385.)
The Walt, Gupta, and Reed references would have been recognized by those of ordinary skill in the art as useful for applicant’s purpose in developing information processing systems using learning algorithms for data classification tasks for processing monitored events in network environments.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for processing transaction activity for managing secure data processing services in network environments as disclosed by Reed with the method of information processing of time series data using neural networks for classifying of data anomalies based on probability density distributions as collectively disclosed by Gupta and Walt.
One of ordinary skill in the art would have been motivated to combine the disclosed teachings of Walt, Gupta, and Reed in order to enable secure information processing in fraud detection systems by identifying incorrect secure information in transaction data using a fraud score (Reed, 1:60-2:50 and 16:28-40). Doing so would allow securing transaction data by distinguishing between an authorized user accidentally mis-entering their secure data and an unauthorized user entering incorrect secure data. This prevents the inconvenience of a payment instrument being temporarily disabled or other such negative consequence due to an authorized user accidentally inputting incorrect data (Reed, 18:59-65) and helps in securely processing transaction related information over a server (Reed, 16:28-40).

Regarding claim 9, the rejection of claim 8 is incorporated and Walt in combination with Gupta teaches the method of claim 8, further comprising, if the score does not exceed the threshold, modifying the authorization request message… and transmitting the modified authorization request message to an authorizing computer. (claimed determination of scores as error values used to classify fraudulent transactions, in 0016: Note that while the neural network is trained to predict a transaction by a customer or to detect a non-fraudulent transaction, fraud detection logic circuitry may detect fraud based on a determination by the neural network of the error associated with predicting a transaction conducted by the customer or classifying the transaction conducted by the customer… When data for a non-fraudulent transaction is input at the input layer of the neural network, the neural network may output an error that is small to indicate that the transaction closely matches a predicted transaction or transaction classification…; And comparing error score to a threshold used to send notification messages, in 0069: … The fraud determiner 2030 may comprise logic circuitry to compare 2032 the error to a deviation threshold, notify 2034 the specific customer of the transaction associated with the transaction data 2005, communicate 2036 a message with a notification of the specific transaction to the specific customer and/or the payment instrument issuer [claimed modified authorization request message specific to the user or the issuer],…; And also modifying messages with transaction information, 0071: In response to an indication that the transaction data 2005 might represent a fraudulent transaction, the notify 2034 logic circuitry may generate a message that includes a description of the transaction associated with the transaction data 2005…)
While Walt teaches the use of a fraud score, and an error value, used to classify the monitored transaction patterns and transmitting messages based on the learned classifications, Walt and Gupta do not expressly disclose messages modified to include the determined fraud score.
Reed does expressly teach messages modified to include the determined fraud score. (in 16:28-40: Each biometric pattern in the set can be compared against other transaction-related information that is relevant for assessing the likelihood of the transaction being fraudulent. This transaction-related information can be, for example, a fraud score of the type known in the art, and/or information indicating whether the transaction involved a chargeback element (i.e. transmission of funds from a merchant to the holder of the payment instrument). Other suitable information will be readily identified by a skilled person having the benefit of the present disclosure. It will be appreciated that the transaction-related information is transaction specific and that the transaction-related information can be provided to server 170 along with the biometric pattern…; And the use of a conditional score for communicating notification, in 17:32-43: However, if the biometric pattern gathered in step 370 does not match at least one of the trusted patterns, then in step 385 the payment is declined. In another embodiment, if the biometric pattern gathered in step 370 does not match at least one of the trusted patterns, then this result is used to contribute towards a 'fraud score' that is a measure of the likelihood of the transaction being fraudulent. If the fraud score is found to exceed a threshold value, then fraud is deemed likely and the payment is declined. If the fraud score does not exceed the threshold value then the transaction is allowed, although it may also be flagged to an appropriate authority.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Walt, Gupta, and Reed for the same reasons disclosed above.
Regarding claims 18-19, the claims recite limitations similar to those of claims 8-9, respectively, and are therefore rejected under the same rationale.

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Walters et al. (US Pub. No. 2020/0065813, hereinafter ‘Walt’) in view of Gupta et al. (US Pub. No. 2019/0391901, hereinafter ‘Gupta’) in further view of Svenson et al. (US Pub. No. 2021/0307621, hereinafter ‘Paul’).
	
	Regarding claim 10, the rejection of claim 1 is incorporated. Walt teaches the use of a neural network trained using backpropagation to perform classification tasks for classifying transaction data. However, Walt and Gupta do not expressly teach the claim 10 limitation.
Paul does expressly teach the claim 10 limitation of the method of claim 1, wherein the neural network comprises sigmoid functions. (in 0462-0463: … The use of sigmoid based activation functions has been shown to provide improved performance, compared to binary step functions, when NNs are used in similar pattern classification tasks (such as image classification). To train each model, the training data fragments are processed to determine the network parameters (i.e. the weights of the nodes) using standard techniques (such as, for example, back propagation)…)
The Walt, Gupta, and Paul references would have been recognized by those of ordinary skill in the art as useful for applicant’s purpose in developing information processing systems using learning algorithms for data classification tasks for processing monitored events in network environments.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for training neural network models in pattern classification tasks using sigmoid node activation functions as disclosed by Paul with the method of information processing of time series data using neural networks for classifying of data anomalies based on probability density distributions as collectively disclosed by Gupta and Walt.
One of ordinary skill in the art would have been motivated to combine the disclosed teachings of Walt, Gupta, and Paul in order to enable training neural networks using standard techniques such as backpropagation (Paul, 0462). Doing so would improve the performance of neural network models for use in pattern classification tasks (Paul, 0462). This prevents the inconvenience of a payment instrument being temporarily disabled or other such negative consequence due to an authorized user accidentally inputting incorrect data (Reed, 18:59-65) and helps in securely processing transaction related information over a server (Reed, 16:28-40).
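To make the contrast quoted from Paul (0462) concrete, the following minimal sketch compares a sigmoid activation with a binary step activation. It is an illustration only and is not taken from any cited reference: the function names sigmoid and binary_step and the sample inputs are assumptions.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: a smooth activation with values strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_step(z):
    """Binary step: outputs 0 or 1 with no gradient information away from the jump."""
    return 1.0 if z >= 0 else 0.0

# Hypothetical pre-activation values for a single node.
for z in (-2.0, -0.1, 0.0, 0.1, 2.0):
    print(z, round(sigmoid(z), 4), binary_step(z))
```

The sigmoid changes gradually with its input and is differentiable everywhere, which is what allows gradient-based training such as backpropagation to adjust node weights; the binary step provides no such gradient.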

Regarding claim 20, the claim recites limitations similar to those of claim 10 and is therefore rejected under the same rationale.
	

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLUWATOSIN ALABI whose telephone number is (571)272-0516. The examiner can normally be reached Monday-Friday, 8:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J. Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/O.O.A./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129