Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Application 16/217,574 filed 12/12/2018 has been examined.
In this Office Action, claims 1-25 are currently pending.


Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2, 10, and 17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
The term “assuming only oracle access” in claims 2, 10, and 17 is a relative term which renders the claims indefinite. The term “assuming only oracle access” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.




Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 1-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites:
generating a contrastive explanation for a decision.
The limitation of generating a contrastive explanation for a decision, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting a “computer-implemented method”, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the computer-implemented method language, generating an explanation in the context of this claim encompasses the user manually determining generic “explanations” for generic “decisions”. Similarly, the limitations of highlighting and determining, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. For example, but for the databases language, highlighting and determining in the context of this claim encompasses the user manually generating a listing of generic features and new values based on generic “decision(s)”. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion).
Further, these concepts also recite “Certain Methods of Organizing Human Activity” (such as commercial or legal interactions, including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; and business relations), where generating a contrastive explanation for a decision is a method of human activity in commercial or legal interactions.
Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element: using a computer-implemented method to perform the generating, highlighting, and determining steps. The databases/processor in these steps is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of generating “explanations”) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computer-implemented method to perform the generating, highlighting, and determining steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Dependent claims 2-8 merely add further details of the abstract steps/elements recited in claim 1 without integrating the idea into a practical application; or including an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of an abstract idea to a particular technological environment. Therefore, dependent claims 2-8 are also directed towards non-statutory subject matter.

Independent claims 9, 16, and 24-25 are also rejected as ineligible subject matter under 35 U.S.C. 101 for substantially the same reasons as method claim 1. The components (i.e., medium/system/methods) described in independent claims 9, 16, and 24-25 do not provide for integrating the abstract idea into a practical application. At best, the claims merely provide alternate environments in which to implement the abstract idea.

Dependent claims 10-15 and 17-23 merely add further details of the abstract steps/elements
recited in claim 1 without integrating the idea into a practical application; or including an
improvement to another technology or technical field, an improvement to the functioning of the
computer itself, or meaningful limitations beyond generally linking the use of an abstract idea to
a particular technological environment. Therefore, dependent claims 10-15 and 17-23 are also
directed towards non-statutory subject matter.









Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-7, 9-22, and 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over Merrill et al., US Pub. No. 2019/0378210 A1.

As to claim 1 (and substantially similar claims 9 and 16), Merrill discloses a computer-implemented method for model agnostic contrastive explanations for interpreting a deep neural network (DNN),
(Merrill [0113] In some embodiments, the differentiable model is a perceptron, a feed-forward neural network, an autoencoder, a probabilistic network, a convolutional neural network, a radial basis function network, a multilayer perceptron, a deep neural network, or a recurrent neural network)
the model agnostic contrastive explanation method comprising:
generating a contrastive explanation for a decision 
(Merrill [0029] In some embodiments, the model evaluation system 120 explains single models. In some embodiments, the model evaluation system 120 explains ensembles by explaining each sub-model of the ensemble and combining explanations of each sub-model by using an ensembling function of the ensemble;
see also [0032] In some embodiments, model evaluation and explanation system 120 uses score decompositions to determine important features of a model (or ensemble) that impact scores generated by the model (or ensemble);
see also [0033] In some embodiments, the model evaluation system evaluates and explains the model (or ensemble) by generating score explanation information for a specific score generated by the ensemble model for a particular input data set.)

of a classifier trained on structured data;
(Merrill [0122] In some embodiments, S210 includes: generating a reference input data set (reference data point) representative of the reference population. In some embodiments, S210
includes: selecting a reference population from training data (e.g., of the modelling system 110); for each numerical feature represented by the input data sets (reference data points);
see also [0135] In some embodiments, an autoencoder neural network is trained on a prior set of decompositions, and caused to predict new decompositions;
see also [0160] generating a reference data point or set of points from a given reference population may rely upon different techniques depending on the type of input variable, e.g., numerical (float or integer precision), categorical (ordinal or unstructured))

highlighting an important feature that justifies the decision; 
(Merrill [0131] In some embodiments, the model evaluation system uses a decomposition generated for a model score (e.g., at one or more of S230, S240, S250) to generate feature importance information and provide the generated feature importance information to the operator device; see also [0034] the model evaluation system evaluates the model (or ensemble) by generating information that allows the operator to determine whether the disparate impact has adequate business justification.)

and
determining a minimal set of new values for features that alter the decision
(Merrill teaches determining a reduced/decomposed set of features, i.e., a minimal set of new values; see [0132] In some embodiments, the model evaluation system uses a decomposition generated for a model score to generate adverse action information (e.g., at S271) (as described herein) and provide the generated adverse action information to the operator device 171.
[0133] In some embodiments, the model evaluation system uses a decomposition generated for a model score to generate disparate impact information (e.g., at S272) (as described herein) and provide the generated disparate impact information to the operator device 171.
see also [0147] model evaluation system collapses the set of decompositions by a statistic such as a mode, rather than retaining MxN decompositions, an aggregate measure is generated for each test data point, thereby collapsing the number of decompositions generated to N.).

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to provide/generate decomposition values as taught by Merrill, since it was known in the art that classification systems provide decomposition techniques for understanding the influence of a particular feature across a range of values for that feature for a given population, also known as feature influence. This is roughly equivalent to, but offers greater insight than, partial dependence plots; for reference, partial dependence plots attempt to demonstrate the "averaged" effect upon the predicted output of a specific feature over its input range (continuous) or space (categorical) by providing a line plot. Decomposition for feature influence provides a plot with increased fidelity; rather than a line plot, the actual distribution of points is provided, offering additional insight. In some embodiments the model evaluation system receives a model and generates a decomposition for feature influence plot which is provided to the operator device (Merrill [0134]).

As to claim 2, Merrill discloses the computer-implemented method of claim 1, wherein the generating generates the contrastive explanation for any classification model assuming only oracle access
(Merrill [0030] In some embodiments, the decomposition is used to generate explanation information for the model.;
See also [0130] Applications of Decomposition
[0131] In some embodiments, the model evaluation system uses a decomposition generated for a model score (e.g., at one or more of S230, S240, S250) to generate feature importance information and provide the generated feature importance information to the operator device 171.).

As to claim 3, Merrill discloses the computer-implemented method of claim 1, wherein the generating is only able to query class probabilities for a desired input to generate the contrastive explanation (Merrill [0150] In some embodiments, a decomposition is a set of vectors, where the univariate distributions will be of interest. For example, for a given feature, the model evaluation
system applies statistical methods, e.g., median, mode, estimation of probability distribution function.).

As to claim 4, Merrill discloses the computer-implemented method of claim 1, wherein, for real-valued features, the generating sets a base value as a median value of the feature (Merrill [0150] In some embodiments, a decomposition is a set of vectors, where the univariate distributions will be of interest. For example, for a given feature, the model evaluation system applies statistical methods, e.g., median, mode, estimation of probability distribution function.;
see also [0160] In some embodiments, generating a reference data point or set of points from a given reference population may rely upon different techniques depending on the type of input
variable, e.g., numerical (float or integer precision), categorical (ordinal or unstructured), and boolean (true or false). In some embodiments, for each numerical feature, selecting an average or median value among the reference population is performed by the model evaluation system. In some embodiments, for each categorical, the median or mode value is selected for ordinals and the mode is selected for unstructured categoricals.).

As to claim 5, Merrill discloses the computer-implemented method of claim 1, wherein, for categorical features, the generating sets a base value as a mode for that feature (Merrill [0150] In some embodiments, a decomposition is a set of vectors, where the univariate distributions will be of interest. For example, for a given feature, the model evaluation system applies statistical methods, e.g., median, mode, estimation of probability distribution function.;
see also [0160] In some embodiments, generating a reference data point or set of points from a given reference population may rely upon different techniques depending on the type of input
variable, e.g., numerical (float or integer precision), categorical (ordinal or unstructured), and boolean (true or false). In some embodiments, for each numerical feature, selecting an average or median value among the reference population is performed by the model evaluation system. In some embodiments, for each categorical, the median or mode value is selected for ordinals and the mode is selected for unstructured categoricals.;
see also [0147] model evaluation system collapses the set of decompositions by a statistic such as a mode, rather than retaining MxN decompositions, an aggregate measure is generated for each test data point, thereby collapsing the number of decompositions generated to N. ).
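
For illustration only, the reference-value selection described in the passage of Merrill [0160] quoted above (median for numerical features, mode for unstructured categorical features) can be sketched as follows. This is a minimal sketch and is not taken from Merrill or from the application; the function name `reference_point`, the dictionary layout, and the sample population are hypothetical.

```python
from statistics import median, mode


def reference_point(population, numerical, categorical):
    """Build one reference data point from a reference population.

    population  -- list of dicts mapping feature name -> value (hypothetical layout)
    numerical   -- names of numerical features; the median is selected
    categorical -- names of unstructured categorical features; the mode is selected
    """
    ref = {}
    for name in numerical:
        ref[name] = median([row[name] for row in population])
    for name in categorical:
        ref[name] = mode([row[name] for row in population])
    return ref


# Hypothetical reference population, for illustration only.
population = [
    {"income": 40, "state": "CA"},
    {"income": 55, "state": "NY"},
    {"income": 70, "state": "CA"},
]
ref = reference_point(population, ["income"], ["state"])
print(ref)
```

In this sketch the numerical feature receives the middle value of the population and the categorical feature receives its most frequent value, which is the sense in which "median" and "mode" are read in the mapping above.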

As to claim 6, Merrill discloses the computer-implemented method of claim 1, wherein the generating uses a zeroth-order optimization
(Merrill [0159] In some embodiments, the reference is a vector representing a fixed point. In some embodiments, the reference is a vector representing a zero vector.)
to estimate a gradient of designed loss functions for pertinent positives (PP) and pertinent negatives (PN) for the contrastive explanation
(Merrill [0046] In some embodiments, the differentiable model decomposition module (e.g., 122) uses integrated gradients;
[0116] In some embodiments, the model evaluation system 120 determines each derivative of the differentiable model for each selected value of each feature i by using a gradient operator to determine the derivatives for each selected value;
see also [0136] For example, given a population and an ensemble of two submodels with consistent feature inputs, certain features may routinely create strong and positive influences for both sub-models (constructive interference), while other features may routinely create strong and positive influences in one sub-model but be counteracted by strong and negative influences by the other (destructive interference).).

As to claim 7, Merrill discloses the computer-implemented method of claim 1, wherein, for categorical features, the generating uses one of a frequency map approach (FMA) and a simplex sampling approach to handle the categorical features (Merrill [0124] In some embodiments features with categorical values are encoded as numerics using a suitable method such as one-hot encoding or another mapping specified by the modeler.).

Referring to claim 10, this dependent claim recites similar limitations as claim 2;
therefore, the arguments above regarding claim 2 are also applicable to claim 10.

Referring to claim 11, this dependent claim recites similar limitations as claim 3;
therefore, the arguments above regarding claim 3 are also applicable to claim 11.

Referring to claim 12, this dependent claim recites similar limitations as claim 4;
therefore, the arguments above regarding claim 4 are also applicable to claim 12.

Referring to claim 13, this dependent claim recites similar limitations as claim 5;
therefore, the arguments above regarding claim 5 are also applicable to claim 13.

Referring to claim 14, this dependent claim recites similar limitations as claim 6;
therefore, the arguments above regarding claim 6 are also applicable to claim 14.

Referring to claim 15, this dependent claim recites similar limitations as claim 7;
therefore, the arguments above regarding claim 7 are also applicable to claim 15.

Referring to claim 17, this dependent claim recites similar limitations as claim 2;
therefore, the arguments above regarding claim 2 are also applicable to claim 17.

Referring to claim 18, this dependent claim recites similar limitations as claim 3;
therefore, the arguments above regarding claim 3 are also applicable to claim 18.

Referring to claim 19, this dependent claim recites similar limitations as claim 4;
therefore, the arguments above regarding claim 4 are also applicable to claim 19.

Referring to claim 20, this dependent claim recites similar limitations as claim 5;
therefore, the arguments above regarding claim 5 are also applicable to claim 20.

Referring to claim 21, this dependent claim recites similar limitations as claim 6;
therefore, the arguments above regarding claim 6 are also applicable to claim 21.

Referring to claim 22, this dependent claim recites similar limitations as claim 7;
therefore, the arguments above regarding claim 7 are also applicable to claim 22.



As to claim 24, Merrill discloses a computer-implemented method for model agnostic contrastive explanations for interpreting a deep neural network (DNN), 
(Merrill [0113] In some embodiments, the differentiable model is a perceptron, a feed-forward
neural network, an autoencoder, a probabilistic network, a convolutional neural network, a radial
basis function network, a multilayer perceptron, a deep neural network, or a recurrent neural
network) 
the model agnostic contrastive explanations method comprising:
generating a contrastive explanation for a decision
(Merrill [0029] In some embodiments, the model evaluation system 120 explains single models. In some embodiments, the model evaluation system 120 explains ensembles by explaining each sub-model of the ensemble and combining explanations of each sub-model by using an ensembling function of the ensemble;
see also [0032] In some embodiments, model evaluation and explanation system 120 uses score decompositions to determine important features of a model (or ensemble) that impact scores generated by the model (or ensemble);
see also [0033] In some embodiments, the model evaluation system evaluates and explains the model (or ensemble) by generating score explanation information for a specific score generated by the ensemble model for a particular input data set.)
of a classifier trained on structured data
(Merrill [0122] In some embodiments, S210 includes: generating a reference input data set
(reference data point) representative of the reference population. In some embodiments, S210
includes: selecting a reference population from training data (e.g., of the modelling system 110);
for each numerical feature represented by the input data sets (reference data points);
see also [0135] In some embodiments, an autoencoder neural network is trained on a prior set
of decompositions, and caused to predict new decompositions;
see also [0160] generating a reference data point or set of points from a given reference population may rely upon different techniques depending on the type of input variable, e.g., numerical (float or integer precision), categorical (ordinal or unstructured))

based only on a query class probabilities for a desired input
(Merrill [0150] In some embodiments, a decomposition is a set of vectors, where the
univariate distributions will be of interest. For example, for a given feature, the model evaluation
system applies statistical methods, e.g., median, mode, estimation of probability distribution
function.;
see also Merrill [0113] In some embodiments, the differentiable model is a perceptron, a feed-forward neural network, an autoencoder, a probabilistic network).

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to provide/generate decomposition values as taught by Merrill, since it was known in the art that classification systems provide decomposition techniques for understanding the influence of a particular feature across a range of values for that feature for a given population, also known as feature influence. This is roughly equivalent to, but offers greater insight than, partial dependence plots; for reference, partial dependence plots attempt to demonstrate the "averaged" effect upon the predicted output of a specific feature over its input range (continuous) or space (categorical) by providing a line plot. Decomposition for feature influence provides a plot with increased fidelity; rather than a line plot, the actual distribution of points is provided, offering additional insight. In some embodiments the model evaluation system receives a model and generates a decomposition for feature influence plot which is provided to the operator device (Merrill [0134]).


As to claim 25, Merrill discloses a computer-implemented method for model agnostic contrastive explanations for interpreting a deep neural network (DNN), 
(Merrill [0113] In some embodiments, the differentiable model is a perceptron, a feed-forward
neural network, an autoencoder, a probabilistic network, a convolutional neural network, a radial
basis function network, a multilayer perceptron, a deep neural network, or a recurrent neural
network)
the model agnostic contrastive explanations method comprising:
providing contrastive explanations for decisions 
(Merrill [0029] In some embodiments, the model evaluation system 120 explains single models. In some embodiments, the model evaluation system 120 explains ensembles by explaining each sub-model of the ensemble and combining explanations of each sub-model by using an ensembling function of the ensemble;
see also [0032] In some embodiments, model evaluation and explanation system 120 uses score decompositions to determine important features of a model (or ensemble) that impact scores generated by the model (or ensemble);
see also [0033] In some embodiments, the model evaluation system evaluates and explains the model (or ensemble) by generating score explanation information for a specific score generated by the ensemble model for a particular input data set.)

of any black box classifier 
(Merrill [0062] In other embodiments, the model evaluation system 120 includes modules that implement black box evaluation methods such as permutation importance.)

that are differentiable or not learned on tabular data 
(Merrill Fig. 4 item 121: “Differentiable Model Decomposition” Module 122
see also [0031] In some embodiments, the model evaluation and explanation system (e.g., 120 of FIGS. 1A and 1B) uses the non-differentiable model decomposition module (e.g., 121) and a differentiable model decomposition module (e.g., 122) to decompose scores generated by each sub-model of an ensemble model (e.g., a model of modeling system 110 of FIGS. 1A and 1B) that includes at least one non-differentiable model and at least one differentiable model;
see also [0038] In some embodiments, the model evaluation system is not used to explain an ensemble, but to compare two or more models, which may comprise a mixture of tree and differentiable models;
see also [0057-0060]).

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to provide/generate decomposition values as taught by Merrill, since it was known in the art that classification systems provide decomposition techniques for understanding the influence of a particular feature across a range of values for that feature for a given population, also known as feature influence. This is roughly equivalent to, but offers greater insight than, partial dependence plots; for reference, partial dependence plots attempt to demonstrate the "averaged" effect upon the predicted output of a specific feature over its input range (continuous) or space (categorical) by providing a line plot. Decomposition for feature influence provides a plot with increased fidelity; rather than a line plot, the actual distribution of points is provided, offering additional insight. In some embodiments the model evaluation system receives a model and generates a decomposition for feature influence plot which is provided to the operator device (Merrill [0134]).




Claims 8 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Merrill et al., US Pub. No. 2019/0378210 A1, in view of Galitsky et al., US Pub. No. 2019/0236134.
As to claim 8, Merrill does not disclose:
embodied in a cloud-computing environment;
 
However, Galitsky discloses:
the computer-implemented method of claim 1, embodied in a cloud-computing environment (Galitsky [0046] FIG. 27 is a simplified block diagram of components of a system environment by which services provided by the components of an aspect system may be offered as cloud services in accordance with an aspect.)

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to offer classification models using a cloud environment as taught by Galitsky, since it was known in the art that services provided by a cloud infrastructure system can dynamically scale to meet the needs of its users (Galitsky [0284]).

Referring to claim 23, this dependent claim recites similar limitations as claim 8;
therefore, the arguments above regarding claim 8 are also applicable to claim 23.





Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

Fenoglio et al., US Pub. No. 2020/0022016 A1 teaches a network quality assessment service that monitors a network and obtains multimodal data indicative of a plurality of measurements from the network and subjective perceptions of the network by users of the network. The network quality assessment service uses the obtained multimodal data as input to one or more neural network-based models. The network quality assessment service maps, using a conceptual space, outputs of the one or more neural network-based models to symbols. The network quality assessment service applies a symbolic reasoning engine to the symbols, to generate a conclusion regarding the monitored network. The network quality assessment service provides an indication of the conclusion to a user interface;
and
Hargras et al., US Pub. No. 2022/0036221 A1 teaches a method of determining and explaining an artificial intelligence, AI, system employing an opaque model from a local or global point of view, the method comprising the steps of providing an input and a corresponding output of the opaque model; sampling the opaque model around the input to generate training data samples; performing feature selection to determine dominant features generating a Type-2 Fuzzy
Logic Model, FLM; training the Type-2 FLM with the training data samples; and inputting the input into the Type-2 FLM to provide an explanation of the output from the opaque model.




CONTACT INFORMATION
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EVAN S ASPINWALL whose telephone number is (571)270-7723. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at 571-270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Evan Aspinwall/
Primary Examiner, Art Unit 2152
4/29/2022