Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is responsive to the application filed on 05/04/2022.
Claims 1-4, 6-11, 13-15, and 17-19 are pending.
Claims 1, 14, and 18 have been amended.
Claims 5, 12, 16, and 20 have been canceled.

Response to Arguments
Applicant’s arguments with respect to the rejection(s) of claim(s) 1-9 and 17-23 under 35 U.S.C. 101 have been considered but are not persuasive. The applicant argues that the claims “recite supervised and unsupervised machine learning models that each generate anomaly decisions…[,] do not contain limitations that can be practically performed in the human mind”, that the massive amount of code to execute “would be impractical if not impossible” for a human to perform, and that the “specific steps…achieve improvements in the practical applications of detecting fraud based on anomalous behavior”, and that the claims therefore overcome the 101 rejection. The examiner respectfully disagrees.
The recitations of the “remote user device”, “applying the values of the one or more predefined features to an unsupervised anomaly detection model that generates an unsupervised anomaly decision”, “applying the values of the one or more predefined features to a supervised anomaly detection model that generates a supervised anomaly decision”, “unsupervised”, “supervised”, “at least one processing device comprising a processor coupled to a memory” and “non-transitory processor-readable storage medium” remain at a high level of generality. The claimed “applying” limitations qualify as merely adding the words “apply it” (or an equivalent) to the judicial exception, as mere instructions to implement an abstract idea on a computer, or as merely using a computer as a tool to perform an abstract idea. The recitations of the remaining elements amount to mere data storing and data outputting, which are forms of insignificant extra-solution activity. Additionally, the limitations remain capable of being performed in the mind in view of the updated analysis.
See the 35 U.S.C. 101 section for the full, updated analysis of the claim limitations necessitated by applicant’s amendments.

Applicant’s arguments with respect to the rejection(s) of claim(s) 1, 14 and 18 under 35 U.S.C. 103 have been considered but are not persuasive. More specifically, the applicant argues that no art of record teaches the amended claim language of claims 1, 14, and 18 since (1) “Niu analyzes individual models in isolation to determine their performance” and Niu does not teach “that the disclosed random forests are used in ensemble techniques with other unsupervised models.” The examiner respectfully disagrees.
Due to the breadth of the claim language, Niu is found to meet all requirements set forth by the claim language (e.g., “based at least in part”, “using ensemble techniques”, etc.). Niu teaches creating performance values for the unsupervised and supervised models, where the supervised model can be a random forest of decision trees (ensemble techniques with other unsupervised models as argued). As cited: Niu, section “Supervised Learning Methods”, “Evaluation Metrics”, and “Results” teach determining “True Positive” and “False Positive” values (third anomaly decision) for the algorithms (supervised/unsupervised) based on the algorithms’ fraud sample predictions (based at least in part on the supervised anomaly decision with the unsupervised anomaly decision), and wherein the algorithm can include “using ensemble methods” of a “random forest algorithm” (using ensemble techniques).
(2) Further, the applicant argues the references do not teach the amendments since “Ramakrishnan does not determine one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision” and Ramakrishnan’s “GaussianNB model is an unsupervised model”. The examiner respectfully disagrees due to the broadness of the claim language. 
Primarily, it is noted that Niu remains cited as teaching the limitation “determining one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision”, since Niu considers supervised and unsupervised outputs for creating reasoning; to which the applicant offers no specific arguments.
Next, Ramakrishnan was cited in the alternative for teaching this limitation. Section 4.4 teaches gathering “anomaly data” (predefined features) and “deploy[ing] the baseline GaussianNB model because it was an unsupervised approach that did not require many anomaly instances” (applying…predefined features to an unsupervised anomaly detection model), then deploying “the supervised RF approach” (applying…predefined features to a supervised anomaly detection model) to combine “the RF and GaussianNB models” and “prioritize items that were identified as anomalies by both approaches (third anomaly decision)”. While section 4.4 teaches the GaussianNB later “was an unsupervised approach”, sections 3.2-3.3 teach the model was first trained on “a limited number of labeled anomaly data” (supervised) and deployed, and further teach implementing feature “log transformations” and equation tracking methods to output a “list of suspected issues” for “guid[ing] a human reviewer to the cause of the [predicted] anomaly” (determining one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision). In other words, the model was first trained on labeled data to provide results that were reported to a user, thus reading on the claim language.
See the 35 U.S.C. 103 section for the full mapping of the claim limitations necessitated by applicant’s amendments.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-4, 6-11, 13-15, and 17-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claims 1, 14, and 18 are respectively drawn to a system, method, and non-transitory computer-readable recording medium, hence each falls under one of four categories of statutory subject matter (Step 1). Nonetheless, the claims are directed to a judicially recognized exception of an abstract idea without significantly more.
Claims 1, 14, and 18 recite the following, or analogous, limitations “obtaining values of one or more predefined features…; applying the values of the one or more predefined features to an…anomaly detection model that generates an…anomaly decision; applying the values of the one or more predefined features to a…anomaly detection model that generates a…anomaly decision; determining a third anomaly decision based at least in part on the…anomaly decisions; and determining one or more reasons for the third anomaly decision by analyzing the…anomaly decision”. These limitations, as claimed, under their broadest reasonable interpretation, can be evaluated in a human mind and/or with pen and paper, except for the recitation of generic computer components (Step 2A). Other than reciting “remote user device”, “applying the values of the one or more predefined features to an unsupervised anomaly detection model that generates an unsupervised anomaly decision”, “applying the values of the one or more predefined features to a supervised anomaly detection model that generates a supervised anomaly decision”, “unsupervised”, “supervised”, “at least one processing device comprising a processor coupled to a memory” and “non-transitory processor-readable storage medium” to perform the exceptions, nothing in the claims precludes the steps from practically being performed in the human mind. For example, a human expert can
mentally/with the aid of pen and paper obtaining values of one or more predefined features (e.g. by thinking of numerical values associated with item attributes of a human), 
mentally/with the aid of pen and paper applying the values of the one or more predefined features to an…anomaly detection model that generates an…anomaly decision (e.g. by thinking of/writing out the values and determining their corresponding meaning regarding if an item is an anomaly from past anomaly examples), 
mentally/with the aid of pen and paper applying the values of the one or more predefined features to a…anomaly detection model that generates a…anomaly decision (e.g. by thinking of/writing out the values and determining their corresponding meaning regarding if an item is an anomaly), 
mentally/with the aid of pen and paper determining a third anomaly decision based at least in part on the…anomaly decisions… (e.g. by thinking of/writing out the most likely anomaly values from the determined meaning of an item being an anomaly from past anomaly examples), 
mentally/with the aid of pen and paper executing a predefined remediation step in response to the third anomaly decision (e.g. by thinking of/writing out ways to eliminate the anomaly that is detected as mapped above), and
mentally/with the aid of pen and paper determining one or more reasons for the third anomaly decision by analyzing the…anomaly decision (e.g. by thinking of/writing out the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item).
Thus, the claims recite a mental process (Step 2A, Prong 1). 
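For reference, the sequence of limitations quoted above (obtaining feature values, applying them to two models, determining a third decision, and determining reasons from the supervised decision) can be sketched in Python as follows. This is a minimal illustration only and is not drawn from the application or from the cited references: the feature names, the typical ranges, the rule thresholds, and the choice to combine the two decisions with a logical OR are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]

# Hypothetical typical ranges standing in for an unsupervised model.
TYPICAL_RANGE = {"amount": (0.0, 500.0), "logins_per_hour": (0.0, 10.0)}

# Hypothetical rules standing in for a supervised model learned from labels.
RULES: Dict[str, Callable[[Features], bool]] = {
    "amount above 400": lambda x: x["amount"] > 400.0,
    "more than 8 logins per hour": lambda x: x["logins_per_hour"] > 8.0,
}


def unsupervised_decision(x: Features) -> bool:
    """Flag the sample if any feature falls outside its typical range."""
    return any(not (lo <= x[name] <= hi) for name, (lo, hi) in TYPICAL_RANGE.items())


def violated_rules(x: Features) -> List[str]:
    """Names of the rules that the sample violates."""
    return [name for name, rule in RULES.items() if rule(x)]


def supervised_decision(x: Features) -> bool:
    """Flag the sample if at least one rule is violated."""
    return bool(violated_rules(x))


@dataclass
class Result:
    unsupervised: bool
    supervised: bool
    third: bool
    reasons: List[str]


def evaluate(x: Features) -> Result:
    u = unsupervised_decision(x)
    s = supervised_decision(x)
    third = u or s  # assumed combination; other ensemble schemes are possible
    # Reasons are derived by analyzing the supervised (rule-based) decision.
    reasons = violated_rules(x) if third else []
    return Result(u, s, third, reasons)
```

Calling `evaluate({"amount": 450.0, "logins_per_hour": 2.0})` in this sketch yields a third decision of `True` with the single reason "amount above 400", because only the rule-based stand-in flags the sample.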
Claims 1, 14, and 18 include additional elements “remote user device”, “applying the values of the one or more predefined features to an unsupervised anomaly detection model that generates an unsupervised anomaly decision”, “applying the values of the one or more predefined features to a supervised anomaly detection model that generates a supervised anomaly decision”, “unsupervised”, “supervised”, “at least one processing device comprising a processor coupled to a memory” and “non-transitory processor-readable storage medium”; however, these elements are recited at a high level of generality. The claimed “applying” limitations qualify as merely adding the words “apply it” (or an equivalent) to the judicial exception, as mere instructions to implement an abstract idea on a computer, or as merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f). Further, the recitations of the remaining elements amount to mere data storing and data outputting, which are forms of insignificant extra-solution activity.
Hence, the additional limitations, individually or in combination, are no more than mere instructions to apply the exception using generic computer components (i.e., “applying the values of the one or more predefined features to an unsupervised anomaly detection model that generates an unsupervised anomaly decision”, “applying the values of the one or more predefined features to a supervised anomaly detection model that generates a supervised anomaly decision”, “at least one processing device comprising a processor coupled to a memory” and “non-transitory processor-readable storage medium”), since they merely add the words “apply it” (or an equivalent) to the judicial exception, are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea, and they do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2; see MPEP 2106.05(f)). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
As discussed above with respect to the integration of the abstract idea into a practical application, the additional elements of using “remote user device”, “applying the values of the one or more predefined features to an unsupervised anomaly detection model that generates an unsupervised anomaly decision”, “applying the values of the one or more predefined features to a supervised anomaly detection model that generates a supervised anomaly decision”, “unsupervised”, “supervised”, “at least one processing device comprising a processor coupled to a memory” and “non-transitory processor-readable storage medium” to perform the “obtaining”, “applying”, and “determining” steps amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (Step 2B). As such, claims 1, 14, and 18 are not patent eligible.
Dependent claims 2-4, 6-11, 13, 15, 17, and 19 are also ineligible for the same reasons given with respect to claims 1, 14, and 18. The dependent claims describe additional mental processes:
mentally/with the aid of pen and paper wherein a decision logic of the…anomaly detection model is not exposed to a user (claim 2) (e.g. by thinking of logic used to deem an item value as an anomaly that is not communicated to the other human)
mentally/with the aid of pen and paper wherein the…anomaly detection model comprises a rule-based model and the determining the one or more reasons for the third anomaly decision comprises one or more of (i) identifying one or more violated rules of the rule-based model, and (ii) identifying one or more violated features of one or more violated rules of the rule-based model (claim 6) (e.g. by thinking of/writing out a decision model in a rule-based style and the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item that breaks a rule of the model)
mentally/with the aid of pen and paper wherein the…anomaly detection model comprises a nearest neighbor model and wherein the values of the one or more predefined features associated with the remote user device are assigned to a substantially closest data point in the nearest neighbor model, and wherein the determining the one or more reasons for the third anomaly decision comprises identifying an anomaly type of the substantially closest data point in the nearest neighbor model (claim 7) (e.g. by thinking of/writing out a decision model in a nearest neighbor style and the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item that is the closest to an anomaly label of the model)
mentally/with the aid of pen and paper wherein the…anomaly detection model comprises a logistic regression classifier model and the determining the one or more reasons for the third anomaly decision comprises identifying one or more of the predefined features associated with the remote user…that contributed to the…anomaly decision (claim 8) (e.g. by thinking of/writing out a decision model in a logistic regression style and the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item from the human that is the reason for the result of the model)
mentally/with the aid of pen and paper wherein the…anomaly detection model comprises a Naive Bayes classifier model that estimates a first likelihood of an anomalous class and a second likelihood of a non-anomalous class given each of the predefined features and the determining the one or more reasons for the third anomaly decision comprises identifying one or more of the predefined features associated with the remote user…that contributed to one or more of the first likelihood and the second likelihood (claim 9) (e.g. by thinking of/writing out a decision model in a Naive Bayes style with computed probabilities of classifications and the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item from the human that affected the computed probabilities of the model)
mentally/with the aid of pen and paper assigning an importance to one or more of the predefined features based on features appearing in the…anomaly detection model (claim 10) (e.g. by thinking of/writing out the determined attribute of the value that deems it the most likely anomaly value from the model and attaching a corresponding indicator to the value)
mentally/with the aid of pen and paper wherein the third anomaly decision is used to detect one or more predefined anomalies comprising one or more of a risk anomaly, a security level anomaly, a fraud likelihood anomaly, an identity assurance anomaly, and a behavior anomaly (claim 11) (e.g. by thinking of/writing out the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item from the human that is used to assist in determining an anomaly type regarding security)
mentally/with the aid of pen and paper obtaining feedback from a human analyst indicating one or more reasons for the third anomaly decision (claims 13 and 17) (e.g. by thinking of/writing out a response from the human for the determined attribute of the value that deems it the most likely anomaly value from the determined meaning of an item)
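As an illustration of the claim 9 recitation (identifying features that contributed to the anomalous-class and non-anomalous-class likelihoods), the per-feature arithmetic of a Gaussian Naive Bayes style model can be sketched in Python. This is a hedged example, not the applicant's or any reference's implementation: the Gaussian form, the omission of class priors, the helper names, and the (mean, standard deviation) parameters are assumptions chosen for the example.

```python
import math
from typing import Dict, Tuple

# Hypothetical per-class parameters: feature name -> (mean, standard deviation).
Params = Dict[str, Tuple[float, float]]


def log_gauss(x: float, mean: float, std: float) -> float:
    """Log density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)


def contributions(x: Dict[str, float], anomalous: Params, normal: Params) -> Dict[str, float]:
    """Per-feature log-likelihood ratio; positive values favor the anomalous class."""
    return {
        name: log_gauss(value, *anomalous[name]) - log_gauss(value, *normal[name])
        for name, value in x.items()
    }


def top_reason(x: Dict[str, float], anomalous: Params, normal: Params) -> str:
    """Feature that pushes the sample most strongly toward the anomalous class."""
    scores = contributions(x, anomalous, normal)
    return max(scores, key=scores.get)
```

Because Naive Bayes treats features as conditionally independent, the class log-likelihood is a sum of per-feature terms, so each term can be read as that feature's contribution.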
Again, the dependent claims continue to cover performance of the limitations in the mind, as inherited from the independent claims (Step 2A, Prong 1). Dependent claims 3-4, 15, and 19 analogously recite “wherein the supervised anomaly detection model is trained at least in part using one or more of the unsupervised anomaly decision and anomalous training data based on known anomalies”, but these elements are recited at a high level of generality and further amount to mere data storing and data outputting, which are forms of insignificant extra-solution activity. The claimed “train[ing]” limitations qualify as merely adding the words “apply it” (or an equivalent) to the judicial exception, as mere instructions to implement an abstract idea on a computer, or as merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f). Further, dependent claims 7-9 restate “remote user device” and dependent claims 2-10 restate “unsupervised”/“supervised”; the recitation of these elements is again no more than a generic computer component to apply the exceptions and generally links the use of the judicial exception to a particular technological environment or field of use, and it does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (Step 2A, Prong 2; see MPEP 2106.05(h)). As discussed above with respect to the integration of the abstract idea into a practical application, any additional elements to perform the steps in the dependent claims amount to no more than mere instructions to apply the exception using generic computer components and generally link the use of the judicial exception to a particular technological environment or field of use.
Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept (Step 2B). As such, dependent claims 2-4, 6-11, 13, 15, 17, and 19 do not amount to significantly more than an abstract idea, do not provide any inventive concept, and are therefore not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-4, 6-11, 13-15, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Niu et al. (“A Comparative Study of Credit Card Fraud Detection: Supervised versus Unsupervised”, 2019), hereinafter Niu, in view of Ramakrishnan et al. (“Anomaly Detection for an E-commerce Pricing System”, 2019), hereinafter Ramakrishnan.
Regarding claims 1, 14, and 18, Niu teaches a method, an apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to implement the following steps, and non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the following steps (sections “Abstract”, “Introduction”, and “Unsupervised Learning Methods” teach processes being “computationally expensive” which is well-known in the art to be computer computation speed/power. Computers including one or more processors communicatively coupled to memory to execute stored instructions for performing the embodiments of the disclosure):
obtaining values of one or more predefined features associated with a remote user device (sections “Abstract”, “Introduction”, last paragraph, and section “Experimental Results” teach evaluating algorithms on a gathered “credit card transaction dataset” (obtaining values of one or more predefined features) of “credit card transactions made…by European cardholders” from a “Kaggle website” hosted on a server (associated with a remote user device) that are classified (predefined)); 
applying the values of the one or more predefined features to an unsupervised anomaly detection model that generates an unsupervised anomaly decision (sections “Unsupervised Learning Methods”, “Experimental Results”, and “Results” teach testing the “supervised” and “unsupervised” algorithms (to an unsupervised anomaly detection model) on the “transaction dataset” (applying the values of the one or more predefined features) to obtain (that generates) fraud sample “predict[ion]”/“estimat[ion]…results” (an unsupervised anomaly decision)); 
applying the values of the one or more predefined features to a supervised anomaly detection model that generates a supervised anomaly decision (sections “Supervised Learning Methods”, “Experimental Results”, and “Results” teach testing the “supervised” and “unsupervised” algorithms (to a supervised anomaly detection model) on the “transaction dataset” (applying the values of the one or more predefined features) to obtain (that generates) fraud sample “predict[ion]”/“estimat[ion]…results” (a supervised anomaly decision)); 
determining a third anomaly decision based at least in part on the supervised anomaly decision with the unsupervised anomaly decision using ensemble techniques (section “Supervised Learning Methods”, “Evaluation Metrics”, and “Results” teach determining “True Positive” and “False Positive” values (third anomaly decision) for the algorithms (supervised/unsupervised) based on the algorithms’ fraud sample predictions (based at least in part on the supervised anomaly decision with the unsupervised anomaly decision), and wherein the algorithm can include “using ensemble methods” of a “random forest algorithm” (using ensemble techniques)); 
determining one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision (sections “Evaluation Metrics”, “Results”, and “Discussions” teach the “AUROC” value is calculated from the determined “True Positive” and “False Positive” values (third anomaly decision) for the algorithms’ fraud sample predictions (by analyzing the supervised anomaly decision); and the “supervised models perform slightly better than unsupervised models, at the expense of additional preprocessing procedures like outliers remove[d]” (determining one or more reasons). Further, the “True Positive” and “False Positive” values (third anomaly decision) of the models are taught to be determined based on the classifications from calculating “conditional probability” based on KNN (supervised) distances for assigning the input “to the class with the largest probability” (determining one or more reasons…by analyzing the supervised anomaly decision).), 
 
wherein the method is performed by at least one processing device comprising a processor coupled to a memory (sections “Abstract”, “Introduction”, and “Unsupervised Learning Methods” teach processes being “computationally expensive” which is well-known in the art to be computer computation speed/power. Computers including one or more processors communicatively coupled to memory to execute stored instructions for performing the embodiments of the disclosure).
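The evaluation quantities relied on in the mapping above (“True Positive” and “False Positive” values and the “AUROC” value) can be illustrated with a short Python sketch. It is not taken from Niu; the function names are hypothetical, and the AUROC is computed by the standard pairwise-ranking identity (the fraction of fraud/non-fraud pairs in which the fraud sample receives the higher score, with ties counted as one half). The sketch assumes both classes are present in the labels.

```python
from typing import List, Tuple


def tpr_fpr(labels: List[int], preds: List[int]) -> Tuple[float, float]:
    """True positive rate and false positive rate for binary predictions (1 = fraud)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def auroc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of fraud and non-fraud scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For labels `[1, 0, 1, 0]` with scores `[0.9, 0.8, 0.7, 0.1]`, three of the four fraud/non-fraud pairs are ranked correctly, so `auroc` returns 0.75; thresholding the scores at 0.75 gives a true positive rate and a false positive rate of 0.5 each.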

However, Niu does not explicitly teach executing a predefined remediation step in response to the third anomaly decision.
Ramakrishnan teaches executing a predefined remediation step in response to the third anomaly decision (sections 1 and 3.7 teach “our anomaly detection models (e.g., GaussianNB, RF) predict anomalies (supervised/unsupervised anomaly decision) and prioritize them based on business impact (the third anomaly decision). The most severe anomalies that have high business impact (the third anomaly decision) are sent to a manual review team that has a capacity to review a fixed number anomalies daily (executing a predefined remediation step in response to the third anomaly decision)…If the prices are [predicted as] anomalous and have high priority, they are not updated, and an alert is generated for a category specialist to review (executing a predefined remediation step in response to the third anomaly decision)”).
Further, Niu at least implies a method, an apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to implement the following steps, and non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the following steps, determining one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision, and wherein the method is performed by at least one processing device comprising a processor coupled to a memory (as mapped above), however Ramakrishnan teaches a method, an apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to implement the following steps, and non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device to perform the following steps (section 3.7 and A.3 “Computing Resources” teaches using “a single node with 5 CPU cores and 45GB of RAM” for performing “the approaches” of the disclosure),
determining one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision (section 4.4 teaches gathering “anomaly data” (predefined features) and “deploy[ing] the baseline GaussianNB model because it was an unsupervised approach that did not require many anomaly instances” (applying…predefined features to an unsupervised anomaly detection model), then deploying “the supervised RF approach” (applying…predefined features to a supervised anomaly detection model) to combine “the RF and GaussianNB models” and “prioritize items that were identified as anomalies by both approaches (third anomaly decision)”. While section 4.4 teaches the GaussianNB later “was an unsupervised approach”, sections 3.2-3.3 teach the model was first trained on “a limited number of labeled anomaly data” (supervised) and deployed, and further teach implementing feature “log transformations” and equation tracking methods to output a “list of suspected issues” for “guid[ing] a human reviewer to the cause of the [predicted] anomaly” (determining one or more reasons for the third anomaly decision by analyzing the supervised anomaly decision).), and
wherein the method is performed by at least one processing device comprising a processor coupled to a memory (section 3.7 and A.3 “Computing Resources” teaches using “a single node with 5 CPU cores and 45GB of RAM” for performing “the approaches” of the disclosure).
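The triage behavior quoted from Ramakrishnan (prioritizing items flagged by both approaches, ranking by business impact, sending a fixed number to manual review, and withholding the price update) can be sketched in Python as follows. This is a loose illustration under stated assumptions, not Ramakrishnan's code: the `Item` fields, the exact sort order, and the `apply_price_update` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    item_id: str
    flagged_unsupervised: bool  # e.g. output of a density-style model
    flagged_supervised: bool    # e.g. output of a classifier trained on labels
    business_impact: float      # hypothetical impact score


def triage(items: List[Item], review_capacity: int) -> List[str]:
    """Return the ids sent to manual review, most urgent first."""
    flagged = [i for i in items if i.flagged_unsupervised or i.flagged_supervised]
    # Assumed ordering: items flagged by both models first, then by impact.
    flagged.sort(
        key=lambda i: (i.flagged_unsupervised and i.flagged_supervised, i.business_impact),
        reverse=True,
    )
    return [i.item_id for i in flagged[:review_capacity]]


def apply_price_update(item_id: str, review_queue: List[str]) -> bool:
    """Remediation step: hold the update for any item awaiting review."""
    return item_id not in review_queue
```

With a review capacity of two, an item flagged by both models outranks a higher-impact item flagged by only one, and `apply_price_update` returns `False` for every queued item so that its price is left unchanged pending review.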
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Ramakrishnan’s teachings of anomaly prediction through supervised and unsupervised models and further providing explanations of how the anomalies were detected into Niu’s teaching of testing the “supervised” and “unsupervised” algorithms on a “transaction dataset” to obtain fraud sample prediction results of credit card fraud detection in order to more effectively “detect the most important anomalies” and “rel[y] on the anomaly scores from a density model to explain the anomalies detected” (Ramakrishnan, section 5).

Regarding claim 2, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein a decision logic of the unsupervised anomaly detection model is not exposed to a user (Niu, sections “Abstract”, “Introduction”, “Unsupervised Learning Methods”, “Experimental Results”, and “Results” teach locally testing the “unsupervised” algorithms (wherein a decision logic of the unsupervised anomaly detection model is not exposed to a user) on the “transaction dataset” from a “Kaggle website” to obtain fraud sample “predict[ion]”/“estimat[ion]…results”).

Regarding claim 3, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the supervised anomaly detection model is trained at least in part using the unsupervised anomaly decision (Ramakrishnan, sections 1 and 3.7 teach “our anomaly detection models (e.g., GaussianNB, RF (supervised anomaly detection model)) predict anomalies (unsupervised anomaly decision) and prioritize them based on business impact. The most severe anomalies (including unsupervised anomaly decision) that have high business impact are sent to a manual review team…who correct the problem appropriately. Finally, the feedback obtained from these items are used (using the unsupervised anomaly decision) as training data for our models (supervised anomaly detection model is trained at least in part using the unsupervised anomaly decision).”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Ramakrishnan’s teachings of anomaly prediction through supervised and unsupervised models and further providing explanations of how the anomalies were detected into Niu’s teaching of testing the “supervised” and “unsupervised” algorithms on a “transaction dataset” to obtain fraud sample prediction results of credit card fraud detection in order to more effectively “detect the most important anomalies” and “rel[y] on the anomaly scores from a density model to explain the anomalies detected” (Ramakrishnan, section 5).

Regarding claim 4, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the supervised anomaly detection model is trained at least in part using anomalous training data based on known anomalies (Niu, sections “Abstract”, “Introduction”, last paragraph, and section “Experimental Results” teach training and evaluating “supervised” algorithms on a gathered “credit card transaction dataset” (supervised anomaly detection model is trained) wherein the dataset samples are pre-classified (at least in part using anomalous training data based on known anomalies)).

Regarding claim 6, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the supervised anomaly detection model comprises a rule-based model and the determining the one or more reasons for the third anomaly decision comprises one or more of (i) identifying one or more violated rules of the rule-based model, and (ii) identifying one or more violated features of one or more violated rules of the rule-based model (Examiner note: Applicant’s spec, page 9, lines 10-12 teaches “using a rule-based model, such as a RuleFit algorithm, that trains a set of short (low-depth) decision trees to induce a weighted set of short-width decision rules”. Here, while the “RuleFit algorithm” is taught to be “rule-based”, it is shown to train a set of “decision trees[’]…decision rules”, and decision trees are therefore interpreted as also being “rule-based model[s]”.
Niu, sections “Supervised Learning Methods”, “Evaluation Metrics”, “Results”, and “Discussions” teach a supervised algorithm including a decision tree ensemble (supervised anomaly detection model comprises a rule-based model) and the “AUROC” value is calculated from the determined “True Positive” and “False Positive” values (third anomaly decision) for the algorithms’ fraud sample predictions to retune a tree’s leaf parameters found in “the sink in a n-dimensional plane” for a “weak learner” tree (determining the one or more reasons for the third anomaly decision comprises one or more of (i) identifying one or more violated rules of the rule-based model).).

Regarding claim 7, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the supervised anomaly detection model comprises a nearest neighbor model and wherein the values of the one or more predefined features associated with the remote user device are assigned to a substantially closest data point in the nearest neighbor model, and wherein the determining the one or more reasons for the third anomaly decision comprises identifying an anomaly type of the substantially closest data point in the nearest neighbor model (Niu, sections “Supervised Learning Methods”, “Evaluation Metrics”, “Results”, and “Discussions” teach a supervised algorithm including a “K-Nearest Neighbor” algorithm (supervised anomaly detection model comprises a nearest neighbor model), that “runs through the whole dataset (wherein the values of the one or more predefined features associated with the remote user device) computing d between x and each training observation (are assigned to a substantially closest data point in the nearest neighbor model)”, and calculating “conditional probability” based on distances for assigning the input “to the class with the largest probability” (comprises identifying an anomaly type of the substantially closest data point in the nearest neighbor model) in order to “classify test transaction data into normal and abnormal categories”. Then the “True Positive” and “False Positive” values (third anomaly decision) of the model are determined based on the classifications (determining the one or more reasons for the third anomaly decision)).

Regarding claim 8, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the supervised anomaly detection model comprises a logistic regression classifier model and the determining the one or more reasons for the third anomaly decision comprises identifying one or more of the predefined features associated with the remote user device that contributed to the supervised anomaly decision (Niu, sections “Supervised Learning Methods”, “Evaluation Metrics”, “Results”, and “Discussions” teach a supervised algorithm including a “Logistic Regression” algorithm (supervised anomaly detection model comprises a logistic regression classifier model), that “estimate the probability of a categorical response based on one or more predictor variables x. It allows one to say that the presence of a predictor increases (or decreases) the probability of a given outcome by a specific percentage” for the “transaction dataset” from a “Kaggle website” and calculating “True Positive” and “False Positive” values (determining the one or more reasons for the third anomaly decision comprises identifying one or more of the predefined features associated with the remote user device that contributed to the supervised anomaly decision)).

Regarding claim 9, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the supervised anomaly detection model comprises a Naive Bayes classifier model that estimates a first likelihood of an anomalous class and a second likelihood of a non-anomalous class given each of the predefined features and the determining the one or more reasons for the third anomaly decision comprises identifying one or more of the predefined features associated with the remote user device that contributed to one or more of the first likelihood and the second likelihood (Ramakrishnan, sections 3.2-3.3 teach using a “GaussianNB model”, and while it is taught that “it was an unsupervised approach” the model was first trained on “a limited number of labeled anomaly data” (supervised) before moving to unsupervised training (supervised anomaly detection model comprises a Naive Bayes classifier model), that predicts a “likelihood”/“probability distribution” and/or “anomaly score” that is compared to a “threshold” to determine “an anomaly” or non-anomaly status for “each feature” (that estimates a first likelihood of an anomalous class and a second likelihood of a non-anomalous class given each of the predefined features); sections 3.2-3.3 and 3.7 teach obtaining the feature data from a “fileserver” to train the models and implementing feature “log transformations” and equation tracking methods to output a “list of suspected issues” for “guid[ing] a human reviewer to the cause of the [predicted] anomaly” (determining the one or more reasons for the third anomaly decision comprises identifying one or more of the predefined features associated with the remote user device that contributed to one or more of the first likelihood and the second likelihood)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Ramakrishnan’s teachings of anomaly prediction through supervised and unsupervised models, and of further providing explanations of how the anomalies were detected, into Niu’s teaching of testing the “supervised” and “unsupervised” algorithms on a “transaction dataset” to obtain fraud sample prediction results for credit card fraud detection, in order to more effectively “detect the most important anomalies” and “rel[y] on the anomaly scores from a density model to explain the anomalies detected” (Ramakrishnan, section 5).

Regarding claim 10, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches assigning an importance to one or more of the predefined features based on features appearing in the supervised anomaly detection model (Niu, sections “Supervised Learning Methods”, “Evaluation Metrics”, “Results”, and “Discussions” teach a supervised algorithm (supervised anomaly detection model), that “estimate the probability of a categorical response based on one or more predictor variables x. It allows one to say that the presence of a predictor increases (or decreases) the probability of a given outcome by a specific percentage” for the “transaction dataset” from a “Kaggle website” and calculating “True Positive” and “False Positive” values (assigning an importance to one or more of the predefined features based on features appearing in the supervised anomaly detection model)).

Regarding claim 11, the combination of Niu and Ramakrishnan teaches all the claim limitations of claim 1 above; and further teaches wherein the third anomaly decision is used to detect one or more predefined anomalies comprising one or more of a risk anomaly, a security level anomaly, a fraud likelihood anomaly, an identity assurance anomaly, and a behavior anomaly (Ramakrishnan, section 4.4 teaches combining “the RF and GaussianNB models” to predict on the gathered “anomaly data” and “prioritize items that were identified as anomalies by both approaches (third anomaly decision)”. Sections 1 and 3.7 teach “our anomaly detection models (e.g., GaussianNB, RF) predict anomalies (third anomaly decision is used) and prioritize them based on business impact (is used to detect one or more predefined anomalies comprising one or more of a risk anomaly)” and further that the anomaly detection includes “if an item’s price is more than a few standard deviations from its average historical price (is used to detect one or more predefined anomalies comprising one or more of…a behavior anomaly)”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Ramakrishnan’s teachings of anomaly prediction through supervised and unsupervised models, and of further providing explanations of how the anomalies were detected, into Niu’s teaching of testing the “supervised” and “unsupervised” algorithms on a “transaction dataset” to obtain fraud sample prediction results for credit card fraud detection, in order to more effectively “detect the most important anomalies” and “rel[y] on the anomaly scores from a density model to explain the anomalies detected” (Ramakrishnan, section 5).

Regarding claims 13 and 17, the combination of Niu and Ramakrishnan teaches all the claim limitations of claims 1 and 14 above; and further teaches obtaining feedback from a human analyst indicating one or more reasons for the third anomaly decision (Ramakrishnan, sections 1 and 3.7 teach “our anomaly detection models (e.g., GaussianNB, RF) predict anomalies (third anomaly decision) and prioritize them based on business impact (third anomaly decision). The most severe anomalies that have high business impact are sent to a manual review team…who correct the problem appropriately. Finally, the feedback obtained from these items are used as training data for our models (obtaining feedback from a human analyst indicating one or more reasons for the third anomaly decision).”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Ramakrishnan’s teachings of anomaly prediction through supervised and unsupervised models, and of further providing explanations of how the anomalies were detected, into Niu’s teaching of testing the “supervised” and “unsupervised” algorithms on a “transaction dataset” to obtain fraud sample prediction results for credit card fraud detection, in order to more effectively “detect the most important anomalies” and “rel[y] on the anomaly scores from a density model to explain the anomalies detected” (Ramakrishnan, section 5).

Regarding claims 15 and 19, the combination of Niu and Ramakrishnan teaches all the claim limitations of claims 14 and 18 above; and further teaches wherein the supervised anomaly detection model is trained at least in part using one or more of the unsupervised anomaly decision and anomalous training data based on known anomalies (Ramakrishnan, sections 1, 3.7, and 4.4 teach gathering labeled “anomaly data” (using…anomalous training data based on known anomalies) and deploying “the supervised RF approach” for training (supervised anomaly detection model is trained at least in part), and further “our anomaly detection models (e.g., GaussianNB, RF (supervised anomaly detection model)) predict anomalies (unsupervised anomaly decision) and prioritize them based on business impact. The most severe anomalies (including unsupervised anomaly decision) that have high business impact are sent to a manual review team…who correct the problem appropriately. Finally, the feedback obtained from these items are used (using the unsupervised anomaly decision) as training data for our models (supervised anomaly detection model is trained at least in part using the unsupervised anomaly decision).”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Ramakrishnan’s teachings of anomaly prediction through supervised and unsupervised models, and of further providing explanations of how the anomalies were detected, into Niu’s teaching of testing the “supervised” and “unsupervised” algorithms on a “transaction dataset” to obtain fraud sample prediction results for credit card fraud detection, in order to more effectively “detect the most important anomalies” and “rel[y] on the anomaly scores from a density model to explain the anomalies detected” (Ramakrishnan, section 5).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/C.M./Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123