DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-19 are pending and have been examined.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Drawings
The drawings are objected to because the label for element “108-2” in Fig. 1 is pointing to blank space. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claims 1 and 11 are objected to because of the following informalities: 
The second “determining” paragraph should recite “at least one proxy variable[[s]]”
The last paragraph should recite “curation of a data set”. 
Claim 11 is further objected to because the third-from-last paragraph is missing a final semicolon.
Claim 13 is objected to because the last paragraph should recite “de-prejudice[[ing]] the training data”
Claim 18 is objected to because the preamble should recite “… of claim 17, wherein the…”
Claims 10 and 19 are objected to under 37 CFR 1.75(c) as being in improper form because a multiple dependent claim should refer to other claims in the alternative only.  See MPEP § 608.01(n).  For claim 10, Examiner is interpreting the limitation “through the method described in claim 1” as if all the limitations from claim 1 had been recited. Likewise for claim 19, Examiner is interpreting the limitation “through the method described in claim 11” as if all the limitations from claim 11 had been recited. Appropriate correction is required.

Claim Rejections - 35 USC § 112
Claims 1-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites “determining, by the anomaly detection device, based on the intra-cohort variation when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and the weight of the at least one proxy variable independent of the at least one prejudicing variable”. The claim is indefinite because it is unclear how the claim should be interpreted. Possible interpretations include:
“determining… when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and WHEN the weight of the at least one proxy variable IS independent of the at least one prejudicing variable”
“determining… when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and DETERMINING the weight of the at least one proxy variable independent of the at least one prejudicing variable”
For examining purposes, Examiner is interpreting claim 1 as if it had recited the limitation “determining… when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and WHEN the weight of the at least one proxy variable IS independent of the at least one prejudicing variable”. Claims 2-10 are rejected for failing to cure the deficiencies of claim 1 upon which they depend. 

Claim 11 recites “determine based on the intra-cohort variation when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and the weight of the at least one proxy variable independent of the at least one prejudicing variable de-prejudice the feature set of the AI model based on explanatory power of the at least one proxy variable independent of the prejudicing variables”. The claim is indefinite because it is unclear how the claim should be interpreted. Possible interpretations include:
“determine… when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and WHEN the weight of the at least one proxy variable IS independent of the at least one prejudicing variable”
“determine… when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and DETERMINING the weight of the at least one proxy variable independent of the at least one prejudicing variable”
For examining purposes, Examiner is interpreting claim 11 as if it had recited the limitation “determine… when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and WHEN the weight of the at least one proxy variable IS independent of the at least one prejudicing variable”. Claims 12-19 are rejected for failing to cure the deficiencies of claim 11 upon which they depend.

Claim 3 recites “in response to determining the bias” in the last line. The claim is indefinite because it is unclear whether “the bias” refers to “a bias” which was determined in Claim 1 line 6 or “bias” which was determined in Claim 3 line 2. 

Claim 13 recites “in response to determining the bias” in the last line. The claim is indefinite because it is unclear whether “the bias” refers to “a bias” which was determined in Claim 11 line 7 or “bias” which was determined in Claim 13 line 3. 

TERMINOLOGY
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). 
on the inside or within, and it defines the prefix “inter-” as between or among. The term “intra-cohort” in claims 1 and 11 is used by the claim to mean “among cohorts” while the accepted meaning is “on the inside, within cohorts”. The term is indefinite because the specification does not clearly redefine the term. The limitations “intra-cohort variation among the plurality of cohorts” recited in claims 1 and 11 would make more sense if they had recited “inter-cohort variation among the plurality of cohorts”. Claims 2-10 are rejected for failing to cure the deficiencies of claim 1 upon which they depend. Claims 12-19 are rejected for failing to cure the deficiencies of claim 11 upon which they depend.
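For illustration of the distinction drawn above, the following is a minimal sketch, not taken from the application, that computes both quantities for a toy data set. The function name `cohort_variation`, the cohort labels, and the use of population variance as the measure of variation are assumptions made for illustration only; the claims do not specify how variation is measured.

```python
from statistics import mean, pvariance


def cohort_variation(cohorts):
    """Return (intra, inter) variation for a mapping of cohort name -> values.

    Hypothetical helper: "intra" is the average variance of the values
    inside each cohort; "inter" is the variance of the cohort means.
    """
    # Intra-cohort: how much the values differ within a single cohort.
    intra = mean(pvariance(values) for values in cohorts.values())
    # Inter-cohort: how much the cohorts differ from one another.
    inter = pvariance([mean(values) for values in cohorts.values()])
    return intra, inter


# Toy data (assumed): identical values inside each cohort, different across cohorts.
scores = {"cohort_a": [1.0, 1.0, 1.0], "cohort_b": [5.0, 5.0, 5.0]}
intra, inter = cohort_variation(scores)
print(intra, inter)
```

In this toy data every value within a cohort is identical, so the intra-cohort figure is zero while the inter-cohort figure is not. “Behavioral differences among the plurality of cohorts” corresponds to the second quantity.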

ANTECEDENT BASIS
Claims 1 and 11 recite the limitation “the weight” in the third “determining” paragraph. There is insufficient antecedent basis for this limitation in the claim. For examining purposes, Examiner is interpreting the claims as if they had recited the limitation “a weight”. Claims 2-10 are rejected for failing to cure the deficiencies of claim 1 upon which they depend. Claims 12-19 are rejected for failing to cure the deficiencies of claim 11 upon which they depend.

Claims 5 and 15 recite the limitation “the weightage”. There is insufficient antecedent basis for this limitation in the claims. For examining purposes, Examiner is interpreting the claims as if they had recited the limitation “a weightage”.

Claims 5 and 15 recite the limitation "the de-prejudicing the AI model" in line 1.  There is insufficient antecedent basis for this limitation in the claims because the claims lack an explicit step of de-prejudicing the AI model. For examining purposes, Examiner interprets the limitation as comprising all the steps in Claim 1 and Claim 11, respectively.

Claim 7 recites the limitation “the de-prejudicing the data set” in line 1. There is insufficient antecedent basis for this limitation in the claim because the data set was never de-prejudiced. For examining purposes, Examiner interprets the claim as if it had recited the limitation “the de-prejudicing the training data [[set]]”, which corresponds to the limitation of claim 16 “wherein to de-prejudice the training data”.

Claims 8-10 and 17-19 recite the limitation "the de-prejudiced AI model".  There is insufficient antecedent basis for this limitation in the claim. For examining purposes, Examiner is interpreting the first instance of the limitation in each of claims 8 and 17 as the limitation “a [[the]] de-prejudiced AI model”.

	Claim 9 recites the limitation "the true positive prediction" in line 3.  There is insufficient antecedent basis for this limitation in the claim. For examining purposes, Examiner is interpreting the claim as if it had recited the limitation “a [[the]] true positive prediction”. Claim 10 is rejected for failing to cure the deficiencies of claim 9 upon which it depends.

Claims 10 and 19 recite the limitation "the human-in-the-loop" in the last line.  There is insufficient antecedent basis for this limitation in the claims. For examining purposes, Examiner is interpreting the claims as if they had recited the limitation “a [[the]] human-in-the-loop”.

RELATIVE TERMS
The term "sufficient" in Claim 6 is a relative term which renders the claim indefinite.  The term "sufficient" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.  “Sufficient” is a subjective term as discussed in MPEP 2173.05(b)(IV). For examining purposes, examiner is interpreting “sufficient data” as “a predetermined quantity of data”.

The term “relevant” in Claim 6 is a relative term which renders the claim indefinite.  The term "relevant" is not defined by the claim, the specification, especially paragraph [0034] starting at the line marked 15, does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.  “Relevant” is a subjective term as discussed in MPEP 2173.05(b)(IV). The broadest reasonable interpretation of “prediction relevant features” is every single feature.

The term "higher" in Claim 7 line 3 and Claim 16 line 4 is a relative term which renders the claim indefinite.  The term "higher" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.  “Higher” is a relative term as discussed in MPEP 2173.05(b)(I). For examining purposes, examiner is interpreting claims 7 and 16 as if they had recited “to reveal bias levels higher than a predetermined threshold”.

The term "anomalous" in claims 8 and 17 is a relative term which renders the claim indefinite.  The term "anomalous" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.  “Anomalous” is a subjective term as discussed in MPEP 2173.05(b)(IV). For examining purposes, examiner is interpreting claims 8 and 17 as if they had recited “data containing at least one anomaly” instead of “anomalous data”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


CLAIMS 1-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

CLAIM 1
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
determining whether the AI model reveals a bias, based on one or more prejudicing variables; 
determining when a feature set of the AI model includes at least one proxy variables associated with at least one prejudicing variable from the one or more prejudicing variables; 
building AI cohort models for a plurality of cohorts associated with values of each of the one or more prejudicing variables; 
identifying intra-cohort variation among the plurality of cohorts, wherein the intra-cohort variation indicates behavioral differences among the plurality of cohorts; 
determining based on the intra-cohort variation when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and the weight of the at least one proxy variable independent of the at least one prejudicing variable; 
de-prejudicing the feature set of the AI model based on explanatory power of the at least one proxy variable independent of the prejudicing variables; and 

	Each of the above limitations is a mathematical operation. Each of the determining limitations is also a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
training and testing an AI model based on labelled training data;
an anomaly detection device
“Training and testing” and an anomaly detection device generally link the abstract ideas to the particular technological environment of machine learning, and they are not an improvement to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 2 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 
building the AI model based on historical data associated with a plurality of users.
This limitation is a mathematical operation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 3 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 
determining bias across the one or more prejudicing variables through a multi-level Bayesian analysis; and 
iteratively de-prejudicing the feature set used by the AI model and de-prejudicing the training data in response to determining the bias.
Each of the above limitations is a mathematical operation. Each of the determining limitations is also a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 4 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 
iteratively removing one or more of the proxy variables having no explanatory power independent of the one or more prejudicing variables.
This limitation is a mathematical operation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 5 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 

Each of the above limitations is a mathematical operation. Each of the determining limitations is also a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 6 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 
wherein separate AI cohort models are retained for each of the plurality of cohorts of the one or more prejudicing variable based on a set of conditions comprising: availability of sufficient data, behavior of each of the plurality of cohorts is different based on different prediction relevant features, and predictive power associated with each of the separate AI cohort model is higher than that of the AI model.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 7 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 
curating data by removing data received from one or more entities, 
wherein the one or more entities are determined by a multi-level Bayesian analysis to reveal higher bias levels; and 
curating the data based on sampling techniques to remove residual bias after de-prejudicing the feature set.
These limitations are mathematical operations. Additionally, determining the one or more entities is a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.
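For illustration only, “curating the data based on sampling techniques” could take many forms, and the claim does not name one. The sketch below assumes one simple possibility, downsampling every cohort to the size of the smallest cohort so that no cohort dominates the training data. The function name `balance_by_downsampling`, the record layout, and the fixed seed are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_downsampling(records, key, seed=0):
    """Downsample each group (by records[i][key]) to the smallest group size."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    # The smallest cohort sets the per-cohort sample size.
    target = min(len(group) for group in groups.values())
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


# Assumed toy records: three from cohort "a", one from cohort "b".
records = [
    {"cohort": "a", "label": 1},
    {"cohort": "a", "label": 0},
    {"cohort": "a", "label": 1},
    {"cohort": "b", "label": 0},
]
balanced = balance_by_downsampling(records, key="cohort")
print(len(balanced))
```

Equalizing cohort sizes is only one reading; a technique that reweights or oversamples records would fall within the same claim language.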

CLAIM 8 incorporates the rejection of claim 1. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations: 
detecting, by the de-prejudiced AI model, anomalous data from a new set of inputs provided to the de-prejudiced AI model.
This limitation is a mathematical operation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 9 incorporates the rejection of claim 8. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 8 are incorporated. The claim recites the following limitations: 
attributing a set of causal features to each anomalous prediction of the de-prejudiced AI model that influenced the true positive prediction by the de-prejudiced AI model; and 

refine the attributed set of causal features.
These limitations are mathematical operations. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
receiving feedback from investigators
Receiving feedback is mere data-gathering, which is an insignificant extra-solution activity. See MPEP 2106.05(g). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. Additionally, the receiving feedback is well-understood, routine, conventional activity of receiving or transmitting data over a network. See MPEP 2106.05(d)(II)(i): 
The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network, e.g., using the Internet to gather data.
The claim is not patent eligible.

CLAIM 10 incorporates the rejection of claim 9. 
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 9 are incorporated. The claim recites the following limitations: 
using rule-based checks as a redressal mechanism to provide re-classified results to correct the bias of the de-prejudiced AI model; and 
de-prejudicing the feedback received from the investigator through the method described in claim 1, using the feedback as additional training data for the de-prejudiced AI model to learn from the human-in-the-loop.
These limitations are mathematical operations. Further, the limitation “learn from the human-in-the-loop” is a method of organizing human activity. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 11
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
determine whether the AI model reveals a bias, based on one or more prejudicing variables; 

build AI cohort models for a plurality of cohorts associated with values of each of the one or more prejudicing variables; 
identify intra-cohort variation among the plurality of cohorts, wherein the intra-cohort variation indicates behavioral differences among the plurality of cohorts; 
determine based on the intra-cohort variation when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable and the weight of the at least one proxy variable independent of the at least one prejudicing variable 
de-prejudice the feature set of the AI model based on explanatory power of the at least one proxy variable independent of the prejudicing variables; and 
de-prejudice the training data, by the anomaly detection device, of any residual bias after de-prejudicing the feature set through curation of data set using sampling techniques.
Each of the above limitations is a mathematical operation. Each of the determining limitations is also a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper but for the recitation of a processor. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
An anomaly detection device
a processor
a memory 
processor instructions 
train and test an AI model, based on labelled training data; 

Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 12 incorporates the rejection of claim 11. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. The claim recites the following limitations: 
build the AI model based on historical data associated with a plurality of users.
This limitation is a mathematical operation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
processor
A processor amounts to no more than mere instructions to implement an abstract idea on a computer, as discussed in MPEP 2106.05(f). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 13 incorporates the rejection of claim 11. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. The claim recites the following limitations: 
determine bias across the one or more prejudicing variables through a multi-level Bayesian analysis; and 
iteratively de-prejudice the feature set used by the AI model and de-prejudicing the training data in response to determining the bias.
Each of the above limitations is a mathematical operation. Each of the determining limitations is also a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper but for the recitation of a processor. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
processor instructions
processor
Processor instructions and a processor amount to no more than mere instructions to implement an abstract idea on a computer, as discussed in MPEP 2106.05(f). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 14 incorporates the rejection of claim 11. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. The claim recites the following limitations: 
iteratively removing one or more of the proxy variables having no explanatory power independent of the one or more prejudicing variables.
This limitation is a mathematical operation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 15 incorporates the rejection of claim 11. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. The claim recites the following limitations: 

Each of the above limitations is a mathematical operation. Each of the determining limitations is also a mental process of evaluating which can reasonably be performed in one’s mind with the aid of pencil and paper but for the recitation of a processor. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites no additional elements. Accordingly, there are no additional elements that integrate the abstract idea into a practical application or impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 16 incorporates the rejection of claim 11. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. The claim recites the following limitations: 
curate data by removing data received from one or more entities, 
wherein the one or more entities are determined by a multi-level Bayesian analysis to reveal higher bias levels; and 
curate the data based on sampling techniques to remove residual bias after de-prejudicing the feature set.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
processor instructions
Processor instructions amount to no more than mere instructions to implement an abstract idea on a computer, as discussed in MPEP 2106.05(f). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 17 incorporates the rejection of claim 11. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 11 are incorporated. The claim recites the following limitations: 
detect anomalous data from a new set of data provided to the de-prejudiced AI model.
This limitation is a mathematical operation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
processor instructions

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 18 incorporates the rejection of claim 17. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 17 are incorporated. The claim recites the following limitations: 
attribute a set of causal features to each anomalous prediction of the de-prejudiced AI model that influenced true positive prediction by the de-prejudiced AI model; and 
refine the attributed set of causal features.
These limitations are mathematical operations. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
processor instructions
receive feedback from investigators
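Processor instructions amount to no more than mere instructions to implement an abstract idea on a computer, as discussed in MPEP 2106.05(f). Receiving feedback is mere data-gathering, which is an insignificant extra-solution activity, as discussed in MPEP 2106.05(g). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.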
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. Additionally, receiving feedback is a well-understood, routine, conventional activity of receiving or transmitting data over a network. See MPEP 2106.05(d)(II)(i). The claim is not patent eligible.

CLAIM 19 incorporates the rejection of claim 18. 
Step 1: The claim recites a system, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 18 are incorporated. The claim recites the following limitations: 
use rule-based checks as a redressal mechanism to provide re-classified results to correct the bias of the de-prejudiced AI model; and 
de-prejudice the feedback received from the investigator through the method described in claim 11, using the feedback to be used as additional training data for the de-prejudiced AI model to learn from the human-in-the-loop.
These limitations are mathematical operations. Further, the limitation “learn from the human-in-the-loop” is a method of organizing human activity. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
processor instructions
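Processor instructions amount to no more than mere instructions to implement an abstract idea on a computer, as discussed in MPEP 2106.05(f). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.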
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 1-2, 4-6, 8, 11-12, 14-15 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings” to Bolukbasi et al., hereinafter “Bolukbasi,” in view of “Efficient Estimation of Word Representations in Vector Space” to Mikolov et al., hereinafter “Mikolov,” and further in view of US 2017/0279830 A1 to Mermoud et al., hereinafter “Mermoud.”

Regarding CLAIM 1, Bolukbasi teaches: A method for de-prejudicing Artificial Intelligence (AI) based anomaly detection, the method comprising:  
training and testing, by an anomaly detection device, an AI model… (An “anomaly detection device” is disclosed by the authors’ computer system which generated Figs. 4-8 and Table 1. Bolukbasi teaches at the top of p. 3: “The primary embedding studied in this paper is the popular publicly-available word2vec [24, 25] embedding trained on a corpus of Google News texts consisting of 3 million English words and terms into 300 dimensions, which we refer to here as the w2vNEWS.” Testing an AI model is taught by the experiments of occupational stereotypes and analogies exhibiting stereotypes in section 4, pp. 6-8.)
determining, by the anomaly detection device, whether the AI model reveals a bias, based on one or more prejudicing variables; (A prejudicing variable is a direct gender bias. Determining whether the AI model reveals a bias corresponds to determining the direct gender bias associated with a word embedding. Bolukbasi at §5.2 defines the direct gender bias of an embedding as DirectBias_c.)
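As an illustrative sketch only, the direct bias measure cited above can be computed as follows. The sketch assumes that Bolukbasi §5.2 defines DirectBias_c as the mean of |cos(w, g)|^c over a set of gender-neutral word vectors, where g is the gender direction and c controls strictness. The function names, variable names, and toy vectors are hypothetical and do not come from the reference.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def direct_bias(neutral_vectors, g, c=1.0):
    """Mean of |cos(w, g)|**c over the supplied gender-neutral word vectors.

    Assumed form of DirectBias_c; g is the (hypothetical) gender direction.
    """
    total = sum(abs(cosine(w, g)) ** c for w in neutral_vectors)
    return total / len(neutral_vectors)

# Toy 2-D data (hypothetical): one word orthogonal to g, one at 45 degrees.
g = [1.0, 0.0]
neutral_words = [[0.0, 1.0], [1.0, 1.0]]
bias = direct_bias(neutral_words, g)
```

A value near zero would indicate that the listed words carry little component along the gender direction; larger values would indicate more direct bias.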
determining, by the anomaly detection device, when a feature set of the AI model includes at least one proxy variables associated with at least one prejudicing variable from the one or more prejudicing variables; (The proxy variable(s) include the gender direction g on p. 8 §5.1, also called the gender subspace g on p. 9 §5.1, where g is identified by combining several directions such as $\overrightarrow{she} - \overrightarrow{he}$ and $\overrightarrow{woman} - \overrightarrow{man}$. The gender subspace is also captured by the vector direction $\overrightarrow{softball} - \overrightarrow{football}$ – see p. 8 ¶ Indirect gender bias and p. 10 § 5.3. Another proxy variable is the gender direction of any given word embedding such as $\overrightarrow{softball}$ and $\overrightarrow{football}$.)
building, by the anomaly detection device, AI cohort models for a plurality of cohorts associated with values of each of the one or more prejudicing variables; (Cohorts are the genders male and female. Under the broadest reasonable interpretation, the AI cohort model for “male” is shown as the set of word embeddings in the right quadrants in Fig. 7 at p. 11, and the AI cohort model for “female” is shown as the set of word embeddings in the left quadrants.)
identifying, by the anomaly detection device, intra-cohort variation among the plurality of cohorts, wherein the intra-cohort variation indicates behavioral differences among the plurality of cohorts; (Intra-cohort (or inter-cohort) variations indicating behavioral differences are interpreted as female and male stereotypes – see p. 7, ¶ Occupational stereotypes.)
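determining, by the anomaly detection device, based on the intra-cohort variation when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable (A proxy variable has explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-neutral, that is, the word embedding is independent from the concept of gender. A proxy variable has no explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-specific, that is, the word embedding is not independent from the concept of gender. According to p. 10 §5.3, β(w,v) is the gender component to the similarity between two word vectors w and v. Determining when word embeddings w and v are gender-neutral is made when the absolute value of β(w,v) is low. Bolukbasi at page 10 states: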

[Image: media_image1.png]

[Image: media_image2.png]

and the weight of the at least one proxy variable independent* (Interpreted as “and determining when a weight of the at least one proxy variable is independent”) of the at least one prejudicing variable; (The broadest reasonable interpretation of a weight is β. Page 10 teaches that a value of β = 0 means a word is completely independent of gender: “Note that β(w, w) = 0, which is reasonable since the similarity of a word to itself should not depend on gender contribution.” Relatively low absolute values of β indicate relatively low dependence on gender contribution.)
de-prejudicing, by the anomaly detection device, the feature set of the AI model based on explanatory power of the at least one proxy variable independent of the prejudicing variables; and (Bolukbasi teaches at p. 11, § 6 that gender-specific word embeddings are “neutralized and equalized” or “softened”:

[Image: media_image3.png])

However, Bolukbasi does not explicitly teach: [training and testing] based on labelled training data; 
de-prejudicing the training data, by the anomaly detection device, of any residual bias after de-prejudicing the feature set, through curation of data set based on sampling techniques.
But Mikolov teaches: [training and testing] based on labelled training data; (Mikolov at p. 4, § 3.1 teaches using a continuous bag-of-words model to train an embedding, where the label is the context of the surrounding words:  “we have obtained the best performance… by building a log-linear classifier with four future and four history words at the input, where the training criterion is to correctly classify the current (middle) word.” Mikolov’s embedding is word2vec – see footnote 4 on p. 11.)
Bolukbasi and Mikolov are in the same field of endeavor as the claimed invention, namely machine learning for word embeddings. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have trained Bolukbasi’s word2vec embedding using a continuous bag-of-words model as disclosed by Mikolov to generate w2vNEWS. A motivation for training word2vec using CBOW is to predict the current word based on the surrounding words. (Mikolov, p. 4 §3.1). Bolukbasi’s testing of the model is based on Mikolov’s labelled training data because the trained model is the one being tested. 
However, the combination of Bolukbasi and Mikolov does not explicitly teach: de-prejudicing the training data, by the anomaly detection device, of any residual bias after de-prejudicing the feature set, through curation of data set based on sampling techniques.
But Mermoud teaches: de-prejudicing the training data, by the anomaly detection device, of any… bias…, through curation of data set based on sampling techniques. (Mermoud teaches at ¶ [0095]: “At step 825… the device may exclude the set of sample data from a training set for the behavioral analytics model. In particular, in response to determining that the anomaly was a true positive, the device may take steps to ensure that the set of sample data that triggered the anomaly is not used to train the model.”)
	Mermoud is in the same field of endeavor as the claimed invention, namely machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have excluded, according to Mermoud’s teachings, the set of sample data that triggered the anomaly from the training set for Bolukbasi/Mikolov’s model. A motivation for the combination is to protect Bolukbasi/Mikolov’s model from “learning” anomalous behaviors over time which would reduce the ability of the model to detect anomalous conditions. (Mermoud at ¶ [0095]). This happens because, while the first occurrences of an event will be detected as a statistical anomaly, repeated occurrences will see their likelihood increase as a result of the integration of the samples in the underlying model, leading to the model eventually treating the anomalous conditions as the new normal (Mermoud ¶ [0069]).
In this combination, Mermoud debiases the training data after Bolukbasi debiases the feature set (residual bias after de-prejudicing the feature set), with motivation to protect Bolukbasi/Mikolov’s model from “learning” anomalous behaviors over time.
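The exclusion step quoted from Mermoud ¶ [0095] can be illustrated with a minimal sketch: sample sets that triggered a confirmed (true positive) anomaly are left out of the training set. The function name, the record layout, and the `confirmed_anomaly` flag are hypothetical and are used here only to show the filtering logic; they are not taken from Mermoud.

```python
def curate_training_sets(sample_sets, is_true_positive_anomaly):
    """Return only the sample sets that did not trigger a confirmed anomaly."""
    return [s for s in sample_sets if not is_true_positive_anomaly(s)]

# Hypothetical records: each sample set carries a confirmed-anomaly flag.
sample_sets = [
    {"id": 1, "confirmed_anomaly": False},
    {"id": 2, "confirmed_anomaly": True},
    {"id": 3, "confirmed_anomaly": False},
]

training_sets = curate_training_sets(
    sample_sets, lambda s: s["confirmed_anomaly"]
)
kept_ids = [s["id"] for s in training_sets]
```

In this toy example only the sample sets with ids 1 and 3 remain available for training.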

Regarding CLAIM 2, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1, 
Further, Bolukbasi teaches: further comprising building the AI model based on historical data associated with a plurality of users. (The AI model was trained on news articles (i.e., historical data) written by authors (i.e., a plurality of users). Bolukbasi at the top of p. 3 teaches: “The primary embedding studied in this paper is the popular publicly-available word2vec embedding trained on a corpus of Google News texts consisting of 3 million English words and terms into 300 dimensions, which we refer to here as the w2vNEWS. One might have hoped that the Google News embedding would exhibit little gender bias because many of its authors are professional journalists.”)

	Regarding CLAIM 4, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1, 
Further, Bolukbasi teaches: wherein the de-prejudicing the feature set of the AI model comprises iteratively removing one or more of the proxy variables having no explanatory power independent of the one or more prejudicing variables. (A proxy variable has no explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-specific, that is, the word embedding is not independent from the concept of gender. Fig. 3 shows the removal of the gender component from multiple word embeddings. The limitation “iteratively” is broadly interpreted as removing multiple gender word embeddings.)
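As a sketch of what removing the gender component from a word embedding can look like, the following assumes a single gender direction g and a neutralize step of the form w := (w − (w·g)g) / ‖w − (w·g)g‖, applied to each word in turn. This form, the helper names, and the toy vectors are assumptions made for illustration and are not quoted from Bolukbasi.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def neutralize(w, g):
    """Remove the component of w along g and rescale to unit length.

    Undefined when w lies entirely along g (the residual would be zero).
    """
    g = normalize(g)
    proj = dot(w, g)
    residual = [a - proj * b for a, b in zip(w, g)]
    return normalize(residual)

# Toy 3-D data (hypothetical): remove the g component from several words.
g = [1.0, 0.0, 0.0]
words = {"w1": [0.6, 0.8, 0.0], "w2": [0.5, 0.0, 0.5]}
neutralized = {name: neutralize(vec, g) for name, vec in words.items()}
```

After this step each neutralized vector has a zero component along g, so its cosine similarity with the gender direction is zero.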

Regarding CLAIM 5, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1, 
Further, Bolukbasi teaches: wherein the de-prejudicing the AI model* (Interpreted as “the de-prejudicing the feature set of the AI model”) comprises determining the weightage associated with one or more of the proxy variables having explanatory power independent of the one or more prejudicing variables. (A proxy variable has explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-neutral. Calculating a value of β(w, v) at p. 10 is determining a weightage.)

	Regarding CLAIM 6, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1, 
Further, Bolukbasi teaches: wherein separate AI cohort models are retained for each of the plurality of cohorts of the one or more prejudicing variable based on a set of conditions comprising: (The broadest reasonable interpretation is that the male and female cohort models are retained based on the conditions. Bolukbasi teaches that soft bias correction “seeks to preserve pairwise inner products between all the word vectors while minimizing the projection of the gender neutral words onto the gender subspace” (p. 13, ¶ Step 2b). Soft bias correction is needed to preserve “certain distinctions that are valuable in certain applications. For instance, one may wish a language model to assign a higher probability to the phrase to grandfather a regulation) than to grandmother a regulation since grandfather has a meaning that grandmother does not – equalizing the two removes this distinction. The Soften algorithm reduces the differences between these sets while maintaining as much similarity to the original embedding as possible, with a parameter that controls this trade-off.” (p. 11, last paragraph))
availability of sufficient* data, (Interpreted as “a predetermined quantity of data.” P. 12 teaches “word sets W” at Step 1.)
behavior of each of the plurality of cohorts is different based on different prediction relevant features, and (“Behavior” is interpreted as feminine behavior for the female cohort model and masculine behavior for the male cohort model. Bolukbasi Fig. 7 shows the word embedding “actresses” 
predictive power associated with each of the separate AI cohort model is higher than that of the AI model. (Bolukbasi teaches: “For instance, one may wish a language model to assign a higher probability to the phrase to grandfather a regulation) than to grandmother a regulation since grandfather has a meaning that grandmother does not – equalizing the two removes this distinction.” (p. 11, last paragraph))

	Regarding CLAIM 8, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1 and de-prejudiced AI model
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: further comprising detecting, by the model, anomalous data from a new set of inputs provided to the model.
But Mermoud teaches: further comprising detecting, by the model, anomalous data from a new set of inputs provided to the model. (Mermoud at ¶[0076] discloses: “a given model 410 may use a training set of n-number of prior sets of samples, to determine whether the next set of sample data represents an anomaly in the network.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Mermoud’s system into the combination of Bolukbasi, Mikolov and Mermoud’s system by performing multiple iterations of training, with a motivation to take as input empirical data and recognize complex patterns in these data. (Mermoud at ¶ [0045]: “In general, machine learning is concerned with the design and the development of techniques that take as input empirical data (such as network statistics and performance indicators), and recognize complex patterns in these data.”)

Regarding CLAIM 11, Bolukbasi teaches: An anomaly detection device for de-prejudicing Artificial Intelligence (AI) based anomaly detection, the system comprising: a processor; and a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, causes the processor to: (The results on p. 13 at § 8 are evidence of a processor, memory, and processor instructions.)
train and test an AI model,… (An anomaly detection device is taught by the authors’ computer system which generated Figs. 4-8 and Table 1. Bolukbasi teaches that the pre-trained AI model is word2vec and the trained AI model is w2vNEWS as stated at the top of p. 3: “The primary embedding studied in this paper is the popular publicly-available word2vec [24, 25] embedding trained on a corpus of Google News texts consisting of 3 million English words and terms into 300 dimensions, which we refer to here as the w2vNEWS.” Testing an AI model is taught by the experiments of occupational stereotypes and analogies exhibiting stereotypes in section 4, pp. 6-8.)
determine whether the AI model reveals a bias, based on one or more prejudicing variables; (A prejudicing variable is a direct gender bias. Determining whether the AI model reveals a bias corresponds to determining the direct gender bias associated with a word embedding. Bolukbasi at §5.2 defines the direct gender bias of an embedding as DirectBias_c.)
determine when a feature set of the AI model includes at least one proxy variables associated with at least one prejudicing variable from the one or more prejudicing variables that are inducing bias; (The proxy variable(s) include the gender direction g on p. 8 §5.1, also called the gender subspace g on p. 9 §5.1, where g is identified by combining several directions such as $\overrightarrow{she} - \overrightarrow{he}$ and $\overrightarrow{woman} - \overrightarrow{man}$. The gender subspace is also captured by the vector direction $\overrightarrow{softball} - \overrightarrow{football}$ – see p. 8 ¶ Indirect gender bias and p. 10 § 5.3. Another proxy variable is the gender direction of any given word embedding such as $\overrightarrow{softball}$ and $\overrightarrow{football}$.)
build AI cohort models for a plurality of cohorts associated with values of each of the one or more prejudicing variables; (Cohorts are the genders male and female. Under the broadest reasonable interpretation, the AI cohort model for “male” is shown as the set of word embeddings in the right quadrants in Fig. 7 at p. 11, and the AI cohort model for “female” is shown as the set of word embeddings in the left quadrants.)
identify intra-cohort variation among the plurality of cohorts, wherein the intra-cohort variation indicates behavioral differences among the plurality of cohorts; (Intra-cohort (or inter-cohort) variations indicating behavioral differences are interpreted as female and male stereotypes – see p. 7, ¶ Occupational stereotypes.)
determine based on the intra-cohort variation when the at least one proxy variable has explanatory power independent of the at least one prejudicing variable (A proxy variable has explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-neutral, that is, the word embedding is independent from the concept of gender. A proxy variable has no explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-specific, that is, the word embedding is not independent from the concept of gender. According to p. 10 §5.3, β(w,v) is the gender component to the similarity between two word vectors w and v. Determining when word embeddings w and v are gender-neutral is made when the absolute value of β(w,v) is low. Bolukbasi at page 10 states:

[Image: media_image1.png]

[Image: media_image2.png]

and the weight of the at least one proxy variable independent* (Interpreted as “and determining when a weight of the at least one proxy variable is independent”) of the at least one prejudicing variable (The broadest reasonable interpretation of a weight is β. Page 10 teaches that a value of β = 0 means a word is completely independent of gender: “Note that β(w, w) = 0, which is reasonable since the similarity of a word to itself should not depend on gender contribution.” Relatively low absolute values of β indicate relatively low dependence on gender contribution.)
de-prejudice the feature set of the AI model based on explanatory power of the at least one proxy variable independent of the prejudicing variables; and (Bolukbasi teaches at p. 11, § 6 that gender-specific word embeddings are “neutralized and equalized” or “softened”:
[media_image3.png: excerpt reproduced from Bolukbasi, p. 11, § 6])

However, Bolukbasi does not explicitly teach: [train and test] based on labelled training data; 
de-prejudice the training data, by the anomaly detection device, of any residual bias after de-prejudicing the feature set through curation of data set using sampling techniques.
But Mikolov teaches: [train and test] based on labelled training data; (Mikolov at p. 4, § 3.1 teaches using a continuous bag-of-words model to train an embedding, where the label is the context of the surrounding words:  “we have obtained the best performance… by building a log-linear classifier with four future and four history words at the input, where the training criterion is to correctly classify the current (middle) word.” Mikolov’s embedding is word2vec – see footnote 4 on p. 11.)
Bolukbasi’s testing of the model is based on Mikolov’s labelled training data because the trained model is the one being tested.
However, the combination of Bolukbasi and Mikolov does not explicitly teach: de-prejudice the training data, by the anomaly detection device, of any residual bias after de-prejudicing the feature set through curation of data set using sampling techniques. 
But Mermoud teaches this limitation at ¶ [0095]: “At step 825… the device may exclude the set of sample data from a training set for the behavioral analytics model. In particular, in response to determining that the anomaly was a true positive, the device may take steps to ensure that the set of sample data that triggered the anomaly is not used to train the model.”
	Mermoud is in the same field of endeavor as the claimed invention, namely machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have excluded, according to Mermoud’s teachings, the set of sample data that triggered the anomaly from the training set for Bolukbasi/Mikolov’s model. A motivation for the combination is to protect Bolukbasi/Mikolov’s model from “learning” anomalous behaviors over time which would reduce the ability of the model to detect anomalous conditions. (Mermoud at ¶ [0095]). This happens because, while the first occurrences of an event will be detected as a statistical anomaly, repeated occurrences will see their likelihood increase as a result of the integration of the samples in the underlying model, leading to the model eventually treating the anomalous conditions as the new normal (Mermoud ¶ [0069]).
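The exclusion step quoted from Mermoud ¶ [0095] can be illustrated with a minimal sketch. It is offered only under stated assumptions and is not Mermoud’s implementation: the `SampleSet` record, its fields, and the `curate_training_set` function are hypothetical names introduced here, and the feedback flag stands in for whatever confirmation mechanism the device actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SampleSet:
    """Hypothetical record for one set of sample data seen by the model."""
    sample_id: str
    features: List[float]
    flagged_anomalous: bool
    # None means no feedback was received for this sample set (assumption).
    feedback_true_positive: Optional[bool] = None


def curate_training_set(candidates: List[SampleSet]) -> List[SampleSet]:
    """Keep every sample set except those confirmed as true-positive anomalies.

    Excluding confirmed anomalies keeps repeated anomalous events from being
    absorbed into the model and eventually treated as normal behavior.
    """
    return [
        s for s in candidates
        if not (s.flagged_anomalous and s.feedback_true_positive is True)
    ]


if __name__ == "__main__":
    pool = [
        SampleSet("a", [0.1], flagged_anomalous=False),
        SampleSet("b", [9.9], flagged_anomalous=True, feedback_true_positive=True),
        SampleSet("c", [5.0], flagged_anomalous=True, feedback_true_positive=False),
    ]
    print([s.sample_id for s in curate_training_set(pool)])
```

In this sketch, flagged sample sets that lack feedback stay in the training pool; that is a choice made for the illustration, not something the reference is cited for.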

	Regarding CLAIM 12, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 11, 
Further, Bolukbasi teaches: wherein the processor is further configured to build the Al model based on historical data associated with a plurality of users. (The AI model was trained on news articles (i.e., historical data) written by authors (i.e., a plurality of users). Bolukbasi at the top of p. 3 teaches: “The primary embedding studied in this paper is the popular publicly-available word2vec embedding trained on a corpus of Google News texts consisting of 3 million English words and terms into 300 dimensions, which we refer to here as the w2vNEWS. One might have hoped that the Google News embedding would exhibit little gender bias because many of its authors are professional journalists.”)

	Regarding CLAIM 14, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 11, 
Further, Bolukbasi teaches: wherein the de-prejudicing the feature set of the Al model set comprises iteratively removing one or more of the proxy variables having no explanatory power independent of the one or more prejudicing variables. (A proxy variable has no explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-specific, that is, the word embedding is not independent from the concept of gender. Fig. 3 shows the removal of the gender component from multiple word embeddings. The limitation “iteratively” is broadly interpreted as removing multiple gender word embeddings.)
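The removal of a gender component from multiple word embeddings, as relied on above, can be illustrated with a minimal sketch. It assumes a single unit-length gender direction `g` and re-normalizes each embedding after subtracting its projection onto `g`; the function names and the toy vectors are hypothetical, and the reference itself should be consulted for the exact procedure.

```python
import math
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def unit(a: Sequence[float]) -> Vector:
    norm = math.sqrt(dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in a]


def neutralize(w: Sequence[float], g: Sequence[float]) -> Vector:
    """Subtract the component of w along the gender direction g, then re-normalize."""
    g_hat = unit(g)
    projection = dot(w, g_hat)
    residual = [wi - projection * gi for wi, gi in zip(w, g_hat)]
    # Raises ValueError if w lies entirely along g (nothing is left to keep).
    return unit(residual)


def neutralize_all(words: Sequence[Sequence[float]], g: Sequence[float]) -> List[Vector]:
    """Apply the same removal to each embedding in turn."""
    return [neutralize(w, g) for w in words]


if __name__ == "__main__":
    gender_direction = [1.0, 0.0, 0.0]           # toy direction (assumption)
    embeddings = [[0.6, 0.8, 0.0], [0.3, 0.0, 0.9]]
    for vec in neutralize_all(embeddings, gender_direction):
        print(vec, "component along g:", dot(vec, gender_direction))
```

After the call, each returned vector has a near-zero component along `g`, which is the sense in which the gender component is removed for every embedding processed.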

Regarding CLAIM 15, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 11, 
Further, Bolukbasi teaches: wherein the de-prejudicing the AI model* (Interpreted as “the de-prejudicing the feature set of the AI model”) comprises determining weightage associated with the proxy variables having explanatory power independent of the one or more prejudicing variables. (A proxy variable has explanatory power independent of the prejudicing variable(s) when the system determines that a word embedding is gender-neutral. Calculating a value of β(w,v) at p. 10 is determining a weightage.)
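The β(w,v) value treated above as a weightage can be illustrated with a minimal sketch. The expression used is β(w,v) = (w·v − (w⊥·v⊥)/(‖w⊥‖‖v⊥‖)) / (w·v), where w⊥ is w with its component along the gender direction g removed. That expression is stated here as an assumption about Bolukbasi § 5.3 and is not reproduced in this action, so it should be checked against the reference; the function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def unit(a: Sequence[float]) -> List[float]:
    norm = math.sqrt(dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in a]


def beta(w: Sequence[float], v: Sequence[float], g: Sequence[float]) -> float:
    """Share of the similarity between w and v attributed to the direction g."""
    w_hat, v_hat, g_hat = unit(w), unit(v), unit(g)
    # Remove each word's component along the gender direction.
    w_perp = [wi - dot(w_hat, g_hat) * gi for wi, gi in zip(w_hat, g_hat)]
    v_perp = [vi - dot(v_hat, g_hat) * gi for vi, gi in zip(v_hat, g_hat)]
    perp_norms = math.sqrt(dot(w_perp, w_perp)) * math.sqrt(dot(v_perp, v_perp))
    similarity = dot(w_hat, v_hat)
    if perp_norms == 0.0 or similarity == 0.0:
        raise ValueError("beta is undefined for these vectors")
    return (similarity - dot(w_perp, v_perp) / perp_norms) / similarity


if __name__ == "__main__":
    g = [1.0, 0.0, 0.0]                       # toy gender direction (assumption)
    w = [0.6, 0.8, 0.0]
    print(beta(w, w, g))                      # a word compared with itself
    print(beta(w, [0.6, 0.0, 0.8], g))        # similarity arising only along g
```

Under this expression a word compared with itself yields a value of approximately zero, consistent with the passage quoted from page 10, and a pair whose similarity arises only along g yields a value near one.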

	Regarding CLAIM 17, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 11, 
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: wherein the processor instructions are further configured to detect anomalous data from a new set of data provided to the de-prejudiced Al model.
But Mermoud teaches: wherein the processor instructions are further configured to detect anomalous data from a new set of data provided to the de-prejudiced Al model. (Mermoud at ¶[0076] discloses: “a given model 410 may use a training set of n-number of prior sets of samples, to determine whether the next set of sample data represents an anomaly in the network.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Mermoud’s system into the combination of Bolukbasi, Mikolov and Mermoud’s system by performing multiple iterations of training, with a motivation to take as input empirical data and recognize complex patterns in these data. (Mermoud at ¶ [0045]: “In general, machine learning is concerned with the design and the development of techniques that take as input empirical data (such as network statistics and performance indicators), and recognize complex patterns in these data.”)

Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Bolukbasi, in view of Mikolov and Mermoud, and further in view of “A Multi-Level Bayesian Analysis of Racial Bias in Police Shootings at the County-Level in the United States, 2011–2014” to Ross, hereinafter “Ross.”

Regarding CLAIM 3, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1 further comprising: 
Further, Bolukbasi teaches: determining bias across the one or more prejudicing variables (A prejudicing variable is a direct gender bias. Determining the bias is interpreted as determining the direct gender bias associated with a word embedding. Bolukbasi at §5.2 defines the direct gender bias of an embedding as DirectBias_c.)
iteratively de-prejudicing the feature set used by the Al model and (The limitation “iteratively” is broadly interpreted as de-biasing multiple word embeddings. Bolukbasi teaches debiasing multiple word embeddings in Appendix G on p. 21-24.)
Further, Mermoud teaches: de-prejudicing the training data in response to determining the bias. (Mermoud teaches at ¶ [0095]: “At step 825… the device may exclude the set of sample data from a training set for the behavioral analytics model. In particular, in response to determining that the anomaly was a true positive, the device may take steps to ensure that the set of sample data that triggered the anomaly is not used to train the model.”)
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: through a multi-level Bayesian analysis 
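But Ross teaches: determining bias through a multi-level Bayesian analysis (Ross’ abstract discloses: “A geographically-resolved, multi-level Bayesian model is used… to investigate the extent of racial bias in the shooting of American civilians by police officers in recent years.” Ross at p. 22, 27, and 30-32 discloses a multi-level Bayesian analysis on the risk of being shot by police using levels of county, race/ethnicity, and one’s status of being armed or unarmed.)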
Ross is in the same field of endeavor as the claimed invention, determining bias using multi-level Bayesian analysis. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Ross’ system into the combination of Bolukbasi, Mikolov and Mermoud’s system by adapting the word embedding-based bias of Bolukbasi with the multi-level Bayesian analysis for bias from Ross to yield predictable results. One would have been motivated to do so because multi-level Bayesian analysis is effective at determining bias in hierarchically structured data.

	Regarding CLAIM 13, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 11, wherein the processor instructions further cause the processor to:
Further, Bolukbasi teaches: determine bias across the one or more prejudicing variables (A prejudicing variable is a direct gender bias. Determining the bias is interpreted as determining the direct gender bias associated with a word embedding. Bolukbasi at §5.2 defines the direct gender bias of an embedding as DirectBias_c.)
iteratively de-prejudice the feature set used by the Al model and (The limitation “iteratively” is broadly interpreted as de-biasing multiple word embeddings. Bolukbasi teaches debiasing multiple word embeddings in Appendix G on p. 21-24.)
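Further, Mermoud teaches: de-prejudicing the training data in response to determining the bias. (Mermoud teaches at ¶ [0095]: “At step 825… the device may exclude the set of sample data from a training set for the behavioral analytics model. In particular, in response to determining that the anomaly was a true positive, the device may take steps to ensure that the set of sample data that triggered the anomaly is not used to train the model.”)
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: through a multi-level Bayesian analysis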
But Ross teaches: determine bias through a multi-level Bayesian analysis (Ross’ abstract discloses: “A geographically-resolved, multi-level Bayesian model is used… to investigate the extent of racial bias in the shooting of American civilians by police officers in recent years.” Ross at p. 22, 27, and 30-32 discloses a multi-level Bayesian analysis on the risk of being shot by police using levels of county, race/ethnicity, and one’s status of being armed or unarmed.)
Ross is in the same field of endeavor as the claimed invention, determining bias using multi-level Bayesian analysis. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Ross’ system into the combination of Bolukbasi, Mikolov and Mermoud’s system by adapting the word embedding-based bias of Bolukbasi with the multi-level Bayesian analysis for bias from Ross to yield predictable results. One would have been motivated to do so because multi-level Bayesian analysis is effective at determining bias in hierarchically structured data.
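The direct gender bias measure relied on above for the “determine bias” limitation can be illustrated with a minimal sketch. It assumes the measure is the average of |cos(w, g)|^c over a set N of words expected to be gender-neutral, with g a gender direction and c a strictness parameter. That form is stated as an assumption about Bolukbasi § 5.2 and should be checked against the reference; the function names and toy vectors are hypothetical.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if denom == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot(a, b) / denom


def direct_bias(neutral_words: Sequence[Sequence[float]],
                g: Sequence[float],
                c: float = 1.0) -> float:
    """Average of |cos(w, g)|**c over the supplied gender-neutral embeddings."""
    if not neutral_words:
        raise ValueError("need at least one embedding")
    return sum(abs(cosine(w, g)) ** c for w in neutral_words) / len(neutral_words)


if __name__ == "__main__":
    g = [1.0, 0.0]                      # toy gender direction (assumption)
    words = [[1.0, 0.0], [0.0, 1.0]]    # one fully aligned word, one orthogonal word
    print(direct_bias(words, g))
```

A value near zero indicates that the listed embeddings have little component along g; larger values indicate more direct bias under this measure.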

Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Bolukbasi, in view of Mikolov and Mermoud, and further in view of “Learning Programs: A Hierarchical Bayesian Approach” to Liang et al., hereinafter “Liang.”

Regarding CLAIM 7, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 1, wherein the de-prejudicing the data set* (Interpreted as “the training data”) comprises: 
Further, Bolukbasi teaches: to reveal higher bias levels (Abstract - “direct and indirect gender biases in embeddings”)
Further, Mermoud teaches: and curating the data based on sampling techniques to remove residual bias after de-prejudicing the feature set. (Mermoud ¶ [0095])
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: curating data by removing data received from one or more entities, wherein the one or more entities are determined by a multi-level Bayesian analysis  
	But Liang teaches: curating data by removing data received from one or more entities, wherein the one or more entities are determined by a multi-level Bayesian analysis (Liang Abstract: “Since the program for a single task is underdetermined by its data, we introduce a nonparametric hierarchical Bayesian prior over programs which shares statistical strength across multiple tasks.” Liang p. 7, col. 2 states: “We use B transformations on the composition of user actions at the top of the candidate structure; this corresponds to forming a hierarchical grouping via tree rotations. Second, we allow extraction/unextraction of string-typed primitive combinators to the top of the program using a single program transformation, as in Figure 1(b).” In Fig. 7, primitive combinators include (delete s i) and (delete-selection s i).)
	Liang is in the same field of endeavor as the claimed invention, namely machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Liang’s system into the combination of Bolukbasi, Mikolov and Mermoud’s system by deleting data using a hierarchical grouping, with a motivation “to reveal shared subprograms via safe transformations.” (Conclusion: “We have presented a hierarchical Bayesian model of combinator programs which enables multi-task sharing of subprograms. One of the main new ideas is refactoring to reveal shared subprograms via safe transformations.”)

Regarding CLAIM 16, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 11, wherein to de-prejudice the training data, the processor instructions are further configured to: 
Further, Bolukbasi teaches: to reveal higher bias levels; (Abstract - “direct and indirect gender biases in embeddings”)
Further, Mermoud teaches: and curate the data based on sampling techniques to remove residual bias after de-prejudicing the feature set. (Mermoud ¶ [0095])
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: curate data by removing data received from one or more entities, wherein the one or more entities are determined by a multi-level Bayesian analysis
But Liang teaches: curate data by removing data received from one or more entities, wherein the one or more entities are determined by a multi-level Bayesian analysis (Liang Abstract: “Since the program for a single task is underdetermined by its data, we introduce a nonparametric hierarchical Bayesian prior over programs which shares statistical strength across multiple tasks.” Liang p. 7, col. 2 states: “We use B transformations on the composition of user actions at the top of the candidate structure; this corresponds to forming a hierarchical grouping via tree rotations. Second, we allow extraction/unextraction of string-typed primitive combinators to the top of the program using a single program transformation, as in Figure 1(b).” In Fig. 7, primitive combinators include (delete s i) and (delete-selection s i).)
	Liang is in the same field of endeavor as the claimed invention, namely machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Liang’s system into the combination of Bolukbasi, Mikolov and Mermoud’s system by deleting data using a hierarchical grouping, with a motivation “to reveal shared subprograms via safe transformations.” (Conclusion: “We have presented a hierarchical Bayesian model of combinator programs which enables multi-task sharing of subprograms. One of the main new ideas is refactoring to reveal shared subprograms via safe transformations.”)

Claims 9 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Bolukbasi, in view of Mikolov and Mermoud, and further in view of “‘Why Should I Trust You?’ Explaining the Predictions of Any Classifier” to Ribeiro et al., hereinafter “Ribeiro.”

Regarding CLAIM 9, the combination of Bolukbasi, Mikolov and Mermoud teaches: The method of claim 8 and de-prejudiced AI model
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: further comprising 
attributing a set of causal features to each anomalous prediction of the… Al model that influenced the true positive prediction by the… Al model; and 
receiving feedback from investigators to refine the attributed set of causal features.
But, Mermoud teaches: anomalous prediction and true positive prediction (¶ [0080]: “In general, machine learning results fall into one of four categories: true positives, true negatives, false positives, and false negatives. In the context of anomaly detection, true positives refer to detected anomalies that are indeed anomalies.”)
receiving feedback from investigators to refine the attributed set of causal features. (Mermoud at ¶ [0093] describes end users rating the relevancy of the detected anomaly on a scale.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have gathered feedback from end users, as taught by Mermoud, about the model’s prediction with a motivation to confirm a true positive prediction. (¶ [0072] “The device determines that the anomaly was a true positive based on the received feedback. The device excludes the set of sample data from a training set for the behavioral analytics model, in response to determining that the anomaly was a true positive.”)
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: further comprising attributing a set of causal features to each… prediction of the… Al model that influenced the… prediction by the… Al model; and 
	But Ribeiro teaches this limitation. Ribeiro Fig. 1 at p. 1136, reproduced and annotated below, shows an example in which a model predicts that a patient has the flu, and LIME highlights the symptoms that led to the prediction. The symptoms are the causal features.

[media_image4.png: Ribeiro Fig. 1, p. 1136, annotated]

	Ribeiro is in the same field of endeavor as the claimed invention, namely explaining the factors contributing to a machine learning model’s prediction. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Ribeiro’s LIME to reveal the features contributing to the true positive prediction made by the combination’s de-prejudiced AI model. A motivation for the combination is to improve the trustworthiness of the de-prejudiced AI model and its predictions (Ribeiro’s Abstract: “Such understanding [into how the model makes a prediction] also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one.”)

Regarding CLAIM 18, the combination of Bolukbasi, Mikolov and Mermoud teaches: The anomaly detection device of claim 17 and de-prejudiced Al model
However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: the processor instructions are further configured to: 
attribute a set of causal features to each anomalous prediction of the… Al model that influenced true positive prediction by the... Al model; and 
receive feedback from investigators to refine the attributed set of causal features.
But, Mermoud teaches: anomalous prediction and true positive prediction (¶ [0080]: “In general, machine learning results fall into one of four categories: true positives, true negatives, false positives, and false negatives. In the context of anomaly detection, true positives refer to detected anomalies that are indeed anomalies.”)
receive feedback from investigators to refine the attributed set of causal features. (Mermoud at ¶ [0093] describes end users rating the relevancy of the detected anomaly on a scale.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have gathered feedback from end users, as taught by Mermoud, about the model’s prediction with a motivation to confirm a true positive prediction. (¶ [0072] “The device determines that the anomaly was a true positive based on the received feedback. The device excludes the set of sample data from a training set for the behavioral analytics model, in response to determining that the anomaly was a true positive.”)
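However, the combination of Bolukbasi, Mikolov and Mermoud does not explicitly teach: attribute a set of causal features to each... prediction of the… AI model that influenced… prediction by the... AI model; and 
But Ribeiro teaches this limitation. As discussed above for claim 9, Ribeiro Fig. 1 at p. 1136 shows an example in which a model predicts that a patient has the flu, and LIME highlights the symptoms that led to the prediction. The symptoms are the causal features.
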
Ribeiro is in the same field of endeavor as the claimed invention, namely explaining the factors contributing to a machine learning model’s prediction. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Ribeiro’s LIME to reveal the features contributing to the true positive prediction made by the combination’s de-prejudiced AI model. A motivation for the combination is to improve the trustworthiness of the de-prejudiced AI model and its predictions (Ribeiro’s Abstract: “Such understanding [into how the model makes a prediction] also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one.”)

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Bolukbasi, in view of Mikolov, Mermoud, and Ribeiro, and further in view of “Toward Harnessing User Feedback for Machine Learning” to Stumpf et al., hereinafter “Stumpf.”

	Regarding CLAIM 10, the combination of Bolukbasi, Mikolov, Mermoud, and Ribeiro teaches: The method of claim 9 and de-prejudiced AI model
Further, Mermoud teaches: further comprising: using rule-based checks as a redressal mechanism to provide re-classified results to correct the bias of the… model; and (Mermoud at ¶ [0093] describes end users rating the relevancy of the detected anomaly on a scale to confirm whether it was a true positive.)
However, the combination of Bolukbasi, Mikolov, Mermoud, and Ribeiro does not explicitly teach: de-prejudicing the feedback received from the investigator, by the anomaly detection device, through the method described in claim 1, 
Mermoud teaches that user feedback is a binary rating or a rating on a scale (¶ [0093]). However, this type of feedback is numerical, and it cannot be de-prejudiced by Bolukbasi’s debiasing model which operates on word embeddings. Stumpf teaches a mock machine learning model receiving written feedback from a participant or human-in-the-loop, and this feedback may be de-prejudiced by Bolukbasi’s debiasing model (Stumpf p. 89, col. 1 discloses feedback from Participant 13:
[media_image5.png: feedback from Participant 13, reproduced from Stumpf p. 89, col. 1])

Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have received written feedback from the user about the model’s decisions, as taught by Stumpf, and de-prejudiced it using Bolukbasi’s debiasing algorithm, with a motivation to improve the model’s accuracy. (Stumpf p. 82, col. 1-2: “Therefore, approaches have begun to emerge in which the user and machine learning component communicate with each other to improve the machine’s accuracy or otherwise supplement the machine’s inferences by incorporating user feedback.”)
Further, Mikolov teaches: using the feedback as additional training data for the… AI model to learn from the human-in-the-loop. (Mikolov teaches using a continuous bag-of-words model to train an embedding at p. 4, § 3.1.)

	Regarding CLAIM 19, the combination of Bolukbasi, Mikolov, Mermoud, and Ribeiro teaches: The anomaly detection device of claim 18 and de-prejudiced AI model
Further, Mermoud teaches: wherein the processor instructions are further configured to: use rule-based checks as a redressal mechanism to provide re-classified results to correct the bias of the… AI model; and (Mermoud at ¶ [0093] describes end users rating the relevancy of the detected anomaly on a scale to confirm whether it was a true positive.)

However, the combination of Bolukbasi, Mikolov, Mermoud, and Ribeiro does not explicitly teach: de-prejudice the feedback received from the investigator, by the anomaly detection device, through the method described in claim 11, 
Mermoud teaches that user feedback is a binary rating or a rating on a scale (¶ [0093]). However, this type of feedback is numerical, and it cannot be de-prejudiced by Bolukbasi’s debiasing model which operates on word embeddings. Stumpf teaches a mock machine learning model receiving written feedback from a participant or human-in-the-loop, and this feedback may be de-prejudiced by Bolukbasi’s debiasing model (Stumpf p. 89, col. 1 discloses feedback from Participant 13:
[media_image5.png: feedback from Participant 13, reproduced from Stumpf p. 89, col. 1])

Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have received written feedback from the user about the model’s decisions, as taught by Stumpf, and de-prejudiced it using Bolukbasi’s debiasing algorithm, with a motivation to improve the model’s accuracy. (Stumpf p. 82, col. 1-2: “Therefore, approaches have begun to emerge in which the user and machine learning component communicate with each other to improve the machine’s accuracy or otherwise supplement the machine’s inferences by incorporating user feedback.”)
Further, Mikolov teaches: using the feedback to be used as additional training data for the… AI model to learn from the human-in-the-loop. (Mikolov teaches using a continuous bag-of-words model to train an embedding at p. 4, § 3.1.)

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. “Local Interpretable Model-agnostic Explanations of Bayesian Predictive Models via Kullback-Leibler Projections” to Peltola teaches a method for explaining predictions of Bayesian predictive models based on Ribeiro’s LIME technique.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ASHER JABLON/Examiner, Art Unit 2122                                                                                                                                                                                                        
/ERIC NILSSON/Primary Examiner, Art Unit 2122