DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 04/11/2022 has been entered.
Status of Claims
The following claims are pending in this office action: 1-20
The following claims are amended: 1, 5, 7, 9, and 16
The following claims are new: None
The following claims are cancelled: None
The following claims are rejected: 1-20
Response to Arguments
Applicant’s arguments filed on 04/11/2022 to address the 35 U.S.C. 101 rejection have been fully considered; however, they are not persuasive. Applicant argues that the “computer functionality is improved by the claimed invention” (see Applicant’s arguments, pages 8-13). Examiner respectfully disagrees, as the claims still recite an abstract idea without significantly more. As currently claimed, the limitations of claim 1 recite a mental process involving grouping sets of data and subsequently generating a tree of the grouped data, where the tree is used to explain the grouped data. The computing device is considered to be an additional element that is merely used to apply the mental process. Limitations that the courts have found not to be enough to qualify as "significantly more" when recited in a claim with a judicial exception include adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea - see MPEP 2106.05(f). Thus, the 35 U.S.C. 101 rejection is maintained.
Applicant’s arguments filed on 04/11/2022 to address the 35 U.S.C. 103 rejection have been fully considered; however, they are not persuasive. Applicant argues that Ribeiro does not teach or suggest “providing multilevel explanation tree, where the multilevel explanation tree to explain one or more predictions of the machine learning model of clusters of data” (see Applicant’s arguments, pages 13-14). Examiner respectfully disagrees. Examiner notes that Ribeiro has not been cited to teach “providing multilevel explanation tree, where the multilevel explanation tree to explain one or more predictions of the machine learning model”, as noted in the Non-Final rejection dated 09/16/2021. Rather, Hetherington is cited to teach this limitation: Para. [0035] and Para. [0047] of Hetherington disclose using a multilevel explanation tree to demonstrate explainability of a machine learning model, which reflects how the machine learning model classified data. Therefore, Examiner respectfully asserts that the combination of the cited art sufficiently teaches the limitation recited in the claims.
Applicant also argues that the cited prior art does not teach the claim language as recited in claim 5 (see Applicant’s remarks, page 15). Examiner respectfully disagrees and notes that Applicant’s arguments are high level and merely conclusory. Examiner notes that the combination of the cited prior art sufficiently teaches all aspects of what is being claimed in claim 5, as Hetherington is shown to teach the multilevel explanation tree (see Para. [0024]), and Section 3.3 of Ribeiro discloses the sampling and differences of a sampled neighborhood of data points. Thus, Examiner respectfully asserts that the combination of the cited art sufficiently teaches the limitation recited in the claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
	Claim 1 recites a method of machine learning, the method comprising: receiving by a computing device a pre-trained artificial intelligence model with one or more predictions; generating by the computing device a multilevel explanation tree, linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions; and utilizing by the computing device the multilevel explanation tree to explain one or more predictions of the machine learning model to provide machine learning.
	The limitation of generating by the computing device a multilevel explanation tree, linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “by the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “by the computing device” language, “generating” in the context of the claim encompasses a user generating a multilevel decision tree where the user associates data points to the various nodes of the multilevel decision tree.
	The limitation of utilizing by the computing device the multilevel explanation tree to explain one or more predictions of the machine learning model of clusters of data to provide machine learning covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “by the computing device” and “the machine learning model” and “machine learning”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “by the computing device” and “the machine learning model” and “machine learning” language, “utilizing” in the context of the claim encompasses a user explaining a multilevel decision tree to another user.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
The judicial exception is not integrated into a practical application. In particular, the claim recites additional elements – the computing device, the machine learning model, and machine learning. These additional elements are recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computing model. Further, the claim recites the receiving step (receiving a pre-trained artificial intelligence model). The receiving step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of the computing device, the machine learning model, and machine learning amount to no more than mere instructions to apply the exception using a generic computing component. Limitations that the courts have found not to be enough to qualify as "significantly more" when recited in a claim with a judicial exception include adding the words “apply it” (or an equivalent) with the judicial exception, mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP 2106.05(f)), and generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP 2106.05(h)). Mere instructions to apply an exception using generic computing components cannot provide an inventive concept. Further, the receiving step is considered to be an extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “Receiving or transmitting data over a network, e.g., using the Internet to gather data” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Thereby, a conclusion that the claimed receiving step is well-understood, routine, conventional activity is supported under Berkheimer.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 2 recites the method of claim 1, further comprising receiving by the computing device a dataset for the pre-trained artificial intelligence model including the plurality of training datapoints. This limitation, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element – the computing device. The computing device is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Further, the claim recites the receiving step (receiving a dataset). The receiving step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of the computing device amounts to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept. Further, the receiving step is considered to be an extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “Receiving or transmitting data over a network, e.g., using the Internet to gather data” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Thereby, a conclusion that the claimed receiving step is well-understood, routine, conventional activity is supported under Berkheimer.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 3 recites the method of claim 2, further comprising receiving by the computing device a coordinate wise map of the plurality of training datapoints. This limitation, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element – the computing device. The computing device is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Further, the claim recites the receiving step (receiving a coordinate wise map). The receiving step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of the computing device amounts to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept. Further, the receiving step is considered to be an extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “Receiving or transmitting data over a network, e.g., using the Internet to gather data” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Thereby, a conclusion that the claimed receiving step is well-understood, routine, conventional activity is supported under Berkheimer.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 4 recites the method of claim 3, further comprising sampling by the computing device a neighborhood of datapoints around each of the training datapoints. This limitation, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “the computing device” language, “sampling” in the context of the claim encompasses a user analyzing a set of data and subsequently sampling data similar to a chosen datapoint.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element – the computing device. The computing device is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of the computing device amounts to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 5 recites the method of claim 4, wherein the generating by the computing device the multilevel explanation tree, links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, leaves of the multilevel explanation tree representing the neighborhood of datapoints around each of the training datapoints and distances between leaves of the multilevel explanation tree indicating differences between values of the neighborhood of datapoints, and wherein a linear or non-linear local explainability is implemented, further comprising executing the machine learning by utilizing by the computing device the multilevel explanation tree to explain one or more predictions of the machine learning model of clusters of data.
	The limitation of wherein the generating by the computing device the multilevel explanation tree, links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, leaves of the multilevel explanation tree representing the neighborhood of datapoints around each of the training datapoints and distances between leaves of the multilevel explanation tree indicating differences between values of the neighborhood of datapoints, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind. This limitation merely describes the different aspects/properties of the generated multilevel explanation tree.
	The limitation of wherein a linear or non-linear local explainability is implemented, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. This limitation merely describes the mode of explainability that is used when generating the multilevel explanation tree.
	The limitation of further comprising executing the machine learning by utilizing by the computing device the multilevel explanation tree to explain one or more predictions of the machine learning model of clusters of data, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device” and “machine learning” and “machine learning model”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “the computing device” and “machine learning” and “machine learning model” language, “executing” in the context of the claim encompasses drawing a decision tree to explain an output.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element – the computing device. The computing device is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of the computing device, machine learning, and the machine learning model amount to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 6 recites the method of claim 1, wherein the generating by the computing device the multilevel explanation tree, links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, leaves of the multilevel explanation tree representing the neighborhood of datapoints around each of the training datapoints and distances between leaves of the multilevel explanation tree indicating differences between values of the neighborhood of datapoints, and wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model.
	The limitation of wherein the generating by the computing device the multilevel explanation tree, links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, leaves of the multilevel explanation tree representing the neighborhood of datapoints around each of the training datapoints and distances between leaves of the multilevel explanation tree indicating differences between values of the neighborhood of datapoints, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind. This limitation merely describes the different aspects/properties of the generated multilevel explanation tree.
	The limitation of wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device” and “the machine learning model”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “the computing device” and “the machine learning model” language, “utilizing” in the context of the claim encompasses a user explaining a decision tree to another user, particularly explaining why the decision tree is created the way it is.
This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements – the computing device and the machine learning model. The computing device and the machine learning model are recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computing model. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of the computing device and the machine learning model amount to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 7 recites the method of claim 1, further comprising receiving by the computing device a dataset for the pre-trained artificial intelligence model including the plurality of training datapoints, wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model, wherein leaves of the multilevel explanation tree provides local sample-wise explanations, a root of the multilevel explanation tree provides global dataset-level explanation, and intermediate levels of the multilevel explanation tree provides explanations of clusters of data.
	The limitation of wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device” and “the machine learning model”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “the computing device” and “the machine learning model” language, “utilizing” in the context of the claim encompasses a user explaining a decision tree to another user, particularly explaining why the decision tree is created the way it is.
	The limitation of wherein leaves of the multilevel explanation tree provides local sample-wise explanations, a root of the multilevel explanation tree provides global dataset-level explanation, and intermediate levels of the multilevel explanation tree provides explanations of clusters of data, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the computing device”, nothing in the claim limitation precludes the step from practically being performed in the mind. This limitation merely describes the different aspects/properties of the generated multilevel explanation tree.
This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements – the computing device and the machine learning model. The computing device and the machine learning model are recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computing model. Further, the claim recites the receiving step (receiving a dataset). The receiving step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of the computing device and the machine learning model amount to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept. Further, the receiving step is considered to be an extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “Receiving or transmitting data over a network, e.g., using the Internet to gather data” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Thereby, a conclusion that the claimed receiving step is well-understood, routine, conventional activity is supported under Berkheimer.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 8 recites the method according to claim 1 being cloud implemented. This limitation, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim limitation precludes the step from practically being performed in the mind.
The judicial exception is not integrated into a practical application. In particular, the claim does not recite any additional elements. Accordingly, this does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Limitations that the courts have found not to be enough to qualify as "significantly more" when recited in a claim with a judicial exception include generally linking the use of the judicial exception to a particular technological environment or field of use – see MPEP § 2106.05(h). As discussed above with respect to the integration of the abstract idea into a practical application, no additional elements are cited.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 12 recites the system according to claim 11, further comprising sampling a neighborhood of datapoints around each of the training datapoints, wherein leaves of the multilevel explanation tree provides local sample-wise explanations, a root of the multilevel explanation tree provides global dataset-level explanation, and intermediate levels of the multilevel explanation tree provides explanations of clusters of data.
The limitation of sampling a neighborhood of datapoints around each of the training datapoints, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, “sampling” in the context of the claim encompasses a user analyzing a set of data and subsequently sampling data similar to a chosen datapoint.
The limitation of wherein leaves of the multilevel explanation tree provides local sample-wise explanations, a root of the multilevel explanation tree provides global dataset-level explanation, and intermediate levels of the multilevel explanation tree provides explanations of clusters of data, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim limitation precludes the step from practically being performed in the mind. This limitation merely describes the different aspects/properties of the generated multilevel explanation tree.
This judicial exception is not integrated into a practical application. In particular, the claim only recites one additional element – the computing device. The computing device is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The judicial exception is not integrated into a practical application. In particular, the claim does not recite any additional elements. Accordingly, this does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, no additional elements are recited.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 15 recites the system according to claim 9, wherein the utilizing includes utilizing of leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model. This limitation, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the machine learning model”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “machine learning model” language, “utilizing” in the context of the claim encompasses a user explaining a decision tree to another user, particularly explaining why the decision tree is created the way it is.
This judicial exception is not integrated into a practical application. In particular, the claim recites only one additional element: the machine learning model. The machine learning model is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of the machine learning model amounts to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 17 recites the computer program product according to claim 16, further comprising: receiving a dataset for the pre-trained artificial intelligence model including the plurality of training datapoints; and receiving a coordinate wise map of the plurality of training datapoints, wherein leaves of the multilevel explanation tree provides local sample-wise explanations, a root of the multilevel explanation tree provides global dataset-level explanation, and intermediate levels of the multilevel explanation tree provides explanations of clusters of data.
	The limitation of wherein leaves of the multilevel explanation tree provides local sample-wise explanations, a root of the multilevel explanation tree provides global dataset-level explanation, and intermediate levels of the multilevel explanation tree provides explanations of clusters of data, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim limitation precludes the step from practically being performed in the mind. This limitation merely describes the different aspects/properties of the generated multilevel explanation tree.
The judicial exception is not integrated into a practical application. In particular, the claim does not recite any additional elements. Further, the claim recites the receiving step (receiving a dataset and a coordinate wise map). The receiving step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity. Accordingly, this does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, no additional elements are recited. Further, the receiving step is considered to be an extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(d)(II) indicate that merely “Receiving or transmitting data over a network, e.g., using the Internet to gather data” is a well-understood, routine, conventional function when it is claimed in a merely generic manner (as it is in the present claim). Thereby, a conclusion that the claimed receiving step is well-understood, routine, conventional activity is supported under Berkheimer.
This claim is not patent eligible under 35 U.S.C. 101.
	Claim 20 recites the computer program product according to claim 16, wherein the utilizing of the leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model, and the computer program product being cloud implemented.
	The limitation of wherein the utilizing of the leaves of the multilevel explanation tree representing the neighborhood of datapoints to explain one or more predictions of the machine learning model, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, other than reciting “the machine learning model”, nothing in the claim limitation precludes the step from practically being performed in the mind. For example, but for the “machine learning model” language, “utilizing” in the context of the claim encompasses a user explaining a decision tree to another user, particularly explaining why the decision tree is created the way it is.
	The limitation of the computer program product being cloud implemented, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. That is, nothing in the claim limitation precludes the step from practically being performed in the mind.
This judicial exception is not integrated into a practical application. In particular, the claim recites only one additional element: the machine learning model. The machine learning model is recited at a high level of generality (i.e., as a generic model performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computing model. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of the machine learning model amounts to no more than mere instructions to apply the exception using a generic computing component. Mere instructions to apply an exception using generic computing components cannot provide an inventive concept. Limitations that the courts have found not to be enough to qualify as "significantly more" when recited in a claim with a judicial exception include generally linking the use of the judicial exception to a particular technological environment or field of use (see MPEP § 2106.05(h)).
This claim is not patent eligible under 35 U.S.C. 101.
	Claims 9-11 and 13-14 are rejected on the same grounds as claims 1-3 and 5-6, respectively.
	Claims 16 and 18-19 are rejected on the same grounds as claims 1 and 5-6, respectively.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6-9, 14-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pub. No. US20200302318 to Hetherington, et al. (hereinafter, “Hetherington”), in view of “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier” to Ribeiro, et al. (hereinafter, “Ribeiro”).
As per claim 1, Hetherington teaches a method of machine learning, the method comprising:
receiving by a computing device a pre-trained artificial intelligence model with one or more predictions; (Hetherington, Para. [0067] discloses “In step 201, a machine learning (ML) model classifies examples according to mutually exclusive labels. Typically, the ML model was trained before step 201, and the examples to classify are numerous and realistic.”)
generating by the computing device a multilevel explanation tree, (Hetherington, Para. [0024] discloses “A decision tree that contains tree nodes is received or generated.” and Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and utilizing by the computing device the multilevel explanation tree to explain one or more predictions of the machine learning model of [[clusters of data]] to provide machine learning. (Hetherington, Para. [0047] discloses “More importantly, rules 121-128 may be used to explain why the ML model already classified an (unfamiliar or familiar) example with a particular label. In other words, rules 121-128 may demonstrably provide ML explainability (MLX) for the ML model…” and Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.” (The generated tree describes how the ML model generated its predictions. Note that Ribeiro also explains predictions of a machine learning model. The multilevel explanation tree of Hetherington additionally utilizes data output from the model to thereby provide an explanation where the data gathered is grouped into different nodes or clusters. Note that Ribeiro additionally produces clusters of data or a model which is to be used within the combination of the two references, as Ribeiro is primarily used to group neighbors of data points, i.e., clusters of data))
Hetherington fails to explicitly teach:
linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions
clusters of data
However, Ribeiro teaches:
linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” and Algorithm 1, and Section 5.4 discloses “Specifically, on training and validation sets (80/20 split of the original training data), each artificial feature appears in 10% of the examples in one class, and 20% of the other, while on the test instances, each artificial feature appears in 10% of the examples in each class” (Sampling neighborhood of points of data which are then “linked” to an explanation of a prediction via a label. LIME addresses model explainability of data that is used in predictions))
clusters of data (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (These clusters of data are applied to the ML model from Hetherington as seen above))
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify generating a multilevel explanation tree as disclosed by Hetherington to use linking as disclosed by Ribeiro. The combination would have been obvious because a person of ordinary skill in the art would be motivated to “present an explanation that is locally faithful” such that the multilevel tree may further explain to a user how a prediction was reached (Ribeiro, Section 3.3).
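
For illustration only, the perturbation sampling described in the passage quoted from Ribeiro, Section 3.3 may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from either reference: the function name sample_neighborhood, the parameters recover and num_samples, and the toy classifier in the usage lines are hypothetical, and the proximity weighting (π_x) that Ribeiro also describes is omitted.

```python
import random

def sample_neighborhood(x_prime, f, recover, num_samples=5, seed=0):
    """Sample perturbed binary vectors z' around x' and label each with f(z).

    x_prime : list of 0/1 values (interpretable representation; assumed input)
    f       : the model being explained, called on the recovered sample z
    recover : hypothetical helper mapping z' back to the original representation
    """
    rng = random.Random(seed)
    nonzero = [i for i, v in enumerate(x_prime) if v != 0]
    if not nonzero:
        raise ValueError("x_prime must have at least one nonzero element")
    samples = []
    for _ in range(num_samples):
        # The number of elements to draw is itself sampled uniformly.
        k = rng.randint(1, len(nonzero))
        # Draw k nonzero positions of x' uniformly at random.
        kept = set(rng.sample(nonzero, k))
        z_prime = [1 if i in kept else 0 for i in range(len(x_prime))]
        # Recover z and use f(z) as the label linked to this neighbor.
        z = recover(z_prime)
        samples.append((z_prime, f(z)))
    return samples

# Usage with a toy model (an assumption): predicts 1 when at least two features are on.
toy_model = lambda z: int(sum(z) >= 2)
to_original = lambda z_prime: [float(v) for v in z_prime]
neighbors = sample_neighborhood([1, 0, 1, 1], toy_model, to_original)
```

Each returned pair associates one sampled neighbor with the model's prediction for it, which is the sense in which the rejection reads the sampled neighborhood as "linked" to the predictions via a label.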

	As per claim 6, the combination of Hetherington and Ribeiro, as shown above, teaches the method of claim 1. Hetherington further teaches:
wherein the generating by the computing device the multilevel explanation tree, (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and distances between leaves of the multilevel explanation tree indicating (Hetherington, Para. [0114] discloses “A decision tree may have many more levels and nodes than shown, and the explanatory value of the nodes decreases based on distance from the root node.”)
and wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
the neighborhood of datapoints around each of the training datapoints (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
differences between values of the neighborhood of datapoints, (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (Different clusters of data get sampled))
the neighborhood of datapoints to explain one or more predictions of the machine learning model. (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model. ... The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to π_x) and far away from x (low weight from π_x). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by π_x.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.
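
For illustration only, the tree structure described in the passage quoted from Hetherington, Para. [0035] (each node having a condition based on a feature, a relational operator, and a split value) may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from the reference: the names Node, OPS, explain, and tree, and the example features and split values, are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Relational operators a node condition may use (an assumed, illustrative set).
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

@dataclass
class Node:
    feature: Optional[str] = None      # feature tested at this node
    op: Optional[str] = None           # relational operator, a key of OPS
    split: Optional[float] = None      # split value compared against the feature
    if_true: Optional["Node"] = None   # child taken when the condition holds
    if_false: Optional["Node"] = None  # child taken otherwise
    label: Optional[str] = None        # set only on leaf nodes

def explain(node, example):
    """Walk from the root to a leaf and return (label, conditions evaluated)."""
    path = []
    while node.label is None:
        holds = OPS[node.op](example[node.feature], node.split)
        path.append(f"{node.feature} {node.op} {node.split} is {holds}")
        node = node.if_true if holds else node.if_false
    return node.label, path

# Hypothetical three-level tree: a root, one intermediate node, and three leaves.
tree = Node("A", "<", 5,
            if_true=Node(label="x"),
            if_false=Node("B", ">=", 2,
                          if_true=Node(label="y"),
                          if_false=Node(label="z")))

label, path = explain(tree, {"A": 7, "B": 3})
```

The list of evaluated conditions returned for an example corresponds to a root-to-leaf rule of the kind the rejection relies on to explain why a given label was assigned.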

As per claim 7, the combination of Hetherington and Ribeiro, as shown above, teaches the method of claim 1. Hetherington further teaches:
further comprising receiving by the computing device a dataset for the pre-trained artificial intelligence model including [[the plurality of training datapoints]] (Hetherington, Para. [0067] discloses “In step 201, a machine learning (ML) model classifies examples according to mutually exclusive labels. Typically, the ML model was trained before step 201, and the examples to classify are numerous and realistic.”)
wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
Ribeiro further teaches:
the neighborhood of datapoints to explain one or more predictions of the machine learning model. (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model. ... The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to π_x) and far away from x (low weight from π_x). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by π_x.”)
[[wherein leaves of the multilevel explanation tree provides]] local sample-wise explanations, (Ribeiro, Section 3.3 discloses “Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case),” (Hetherington’s multilevel tree has leaf nodes))
[[a root of the multilevel explanation tree provides]] global dataset-level explanation (Ribeiro, Section 8 discloses “We also introduced SP-LIME, a method to select representative and non-redundant predictions, providing a global view of the model to users” (Hetherington’s multilevel tree has a root node)),
[[and intermediate levels of the multilevel explanation tree provides]] explanations of clusters of data. (Ribeiro, Section 1 discloses “In this paper, we propose providing explanations for individual predictions as a solution to the ‘trusting a prediction’ problem,” Fig. 2 provides an example explanation of a cluster of data, and Section 3.3 discloses sampling data for local exploration and explanation (Hetherington’s multilevel tree has intermediate levels of the tree))
the plurality of training datapoints (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model. ... The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to π_x) and far away from x (low weight from π_x). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by π_x.” (This plurality of training datapoints is applied to the ML model from Hetherington as seen above))
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 8, the combination of Hetherington and Ribeiro, as shown above, teaches the method according to claim 1. Hetherington further teaches:
	being cloud implemented (Hetherington, Para. [0144] discloses “The above-described basic computer hardware and software and cloud computing environment presented for purpose of illustrating the basic underlying computer components that may be employed for implementing the example embodiment(s).”)

	As per claim 9, Hetherington teaches a system for explaining one or more predictions of a machine learning model comprising:
	a computer comprising a memory storing computer instructions (Hetherington, Para. [0121] discloses “Computer system 700 also includes a main memory 706, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 702 for storing information and instructions to be executed by processor 704.”)
	a processor configured to execute the computer instructions (Hetherington, Para. [0121] discloses “Computer system 700 also includes a main memory 706, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 702 for storing information and instructions to be executed by processor 704.”)
receive a pre-trained artificial intelligence model with one or more predictions; (Hetherington, Para. [0067] discloses “In step 201, a machine learning (ML) model classifies examples according to mutually exclusive labels. Typically, the ML model was trained before step 201, and the examples to classify are numerous and realistic.”)
generate a multilevel explanation tree, (Hetherington, Para. [0024] discloses “A decision tree that contains tree nodes is received or generated.” and Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and utilize the multilevel explanation tree to explain one or more predictions of the machine learning model of [[clusters of data]] to provide machine learning. (Hetherington, Para. [0047] discloses “More importantly, rules 121-128 may be used to explain why the ML model already classified an (unfamiliar or familiar) example with a particular label. In other words, rules 121-128 may demonstrably provide ML explainability (MLX) for the ML model…” and Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.” (The generated tree describes how the ML model generated its predictions. Note that Ribeiro also explains predictions of a machine learning model. The multilevel explanation tree of Hetherington additionally utilizes data output from the model to thereby provide an explanation where the data gathered is grouped into different nodes or clusters. Note that Ribeiro additionally produces clusters of data or a model which is to be used within the combination of the two references, as Ribeiro is primarily used to group neighbors of data points, i.e., clusters of data))
Hetherington fails to explicitly teach:
linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions
clusters of data
However, Ribeiro teaches:
linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” and Algorithm 1, and Section 5.4 discloses “Specifically, on training and validation sets (80/20 split of the original training data), each artificial feature appears in 10% of the examples in one class, and 20% of the other, while on the test instances, each artificial feature appears in 10% of the examples in each class” (Sampling neighborhood of points of data which are then “linked” to an explanation of a prediction via a label. LIME addresses model explainability of data.))
clusters of data (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (These clusters of data are applied to the ML model from Hetherington as seen above))

It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 14, the combination of Hetherington and Ribeiro, as shown above, teaches the system according to claim 9. Hetherington further teaches:
wherein the generating the multilevel explanation tree, (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and distances between leaves of the multilevel explanation tree indicating (Hetherington, Para. [0114] discloses “A decision tree may have many more levels and nodes than shown, and the explanatory value of the nodes decreases based on distance from the root node.”)
wherein the utilizing the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
the neighborhood of datapoints around each of the training datapoints (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
differences between values of the neighborhood of datapoints, (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (Different clusters of data get sampled))
the neighborhood of datapoints to explain one or more predictions of the machine learning model. (Ribeiro, Section 3.3 discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^d′ (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model. ... The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to π_x) and far away from x (low weight from π_x). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by π_x.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 15, the combination of Hetherington, and Ribeiro, as shown above teaches the system according to claim 9, Hetherington further teaches:
wherein the utilizing includes the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
the neighborhood of datapoints to explain one or more predictions of the machine learning model. (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model…. The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to πx) and far away from x (low weight from πx). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by πx.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.
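For illustration only, the tree structure described in the cited passage of Hetherington, Para. [0035] (each node carrying a condition built from a feature, a relational operator, and a split value) may be sketched as follows. The class name `Node`, the function `explain`, the labels, and the split values are hypothetical and are not taken from Hetherington or from the claims.

```python
import operator
from dataclasses import dataclass
from typing import Union

# Relational operators a node condition may use (an assumed, illustrative set).
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

@dataclass
class Node:
    feature: str                   # feature tested by this node, e.g. "A"
    op: str                        # relational operator, a key of OPS
    split: float                   # split value compared against the feature
    if_true: Union["Node", str]    # subtree or leaf label when the condition holds
    if_false: Union["Node", str]   # subtree or leaf label otherwise

def explain(node, example):
    """Walk from the root to a leaf, recording each condition that was evaluated."""
    path = []
    while isinstance(node, Node):
        holds = OPS[node.op](example[node.feature], node.split)
        path.append(f"{node.feature} {node.op} {node.split} is {holds}")
        node = node.if_true if holds else node.if_false
    return node, path  # node is now a leaf label

# Hypothetical two-level tree over features A and B.
tree = Node("A", "<", 5.0, if_true="label_0",
            if_false=Node("B", ">=", 2.0, if_true="label_1", if_false="label_0"))
label, path = explain(tree, {"A": 7.0, "B": 3.0})
```

In this sketch the recorded path plays the role of a rule that explains why an example received its label; a real surrogate tree would be fitted to the model's outputs rather than written by hand.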

	As per claim 16, Hetherington teaches a computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions readable and executable by a computer to cause the computer to perform a method, comprising (Hetherington, Para. [0124] discloses “Such instructions may be read into main memory 706 from another storage medium, such as storage device 710. Execution of the sequences of instructions contained in main memory 706 causes processor 704 to perform the process steps described herein.”):
receive a pre-trained artificial intelligence model with one or more predictions; (Hetherington, Para. [0067] discloses “In step 201, a machine learning (ML) model classifies examples according to mutually exclusive labels. Typically, the ML model was trained before step 201, and the examples to classify are numerous and realistic.”)
generate a multilevel explanation tree, (Hetherington, Para. [0024] discloses “A decision tree that contains tree nodes is received or generated.” and Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and utilize the multilevel explanation tree to explain one or more predictions of the machine learning model of [[clusters of data]] to provide machine learning. (Hetherington, Para. [0047] discloses “More importantly, rules 121-128 may be used to explain why the ML model already classified an (unfamiliar or familiar) example with a particular label. In other words, rules 121-128 may demonstrably provide ML explainability (MLX) for the ML model…” and Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.” (Tree generated describes how the ML model generated its predictions. Note that Ribeiro also explains predictions of a machine learning model. The multilevel explanation tree of Hetherington additionally utilizes data output from the model to thereby provide an explanation where the data gathered is grouped into different nodes or clusters. Note that Ribeiro additionally produces clusters of data or a model which is to be used within the combination of the two references, as Ribeiro is primarily used to group neighbors of data points, i.e., clusters of data))
Hetherington fails to explicitly teach:
linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions
clusters of data
However, Ribeiro teaches:
linking neighborhood of datapoints around each of a plurality of training datapoints to the one or more predictions (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” And Algorithm 1, and Section 5.4 discloses “Specifically, on training and validation sets (80/20 split of the original training data), each artificial feature appears in 10% of the examples in one class, and 20% of the other, while on the test instances, each artificial feature appears in 10% of the examples in each class” (Sampling neighborhood of points of data which are then “linked” to an explanation of a prediction via a label. LIME addresses model explainability of data.))
clusters of data (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (These clusters of data are applied to the ML model from Hetherington as seen above))
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.
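For illustration only, the neighborhood sampling described in the quoted passage of Ribeiro, Section 3.3 may be sketched as follows: nonzero elements of a binary instance are drawn uniformly at random, the number of draws is itself sampled uniformly, and each perturbed sample is labeled with the model's prediction. The names `sample_neighborhood` and `toy_model`, the exponential proximity weight, and the simplification that the model consumes the binary vector directly (rather than a recovered original representation) are assumptions of this sketch and are not taken from Ribeiro, Hetherington, or the claims.

```python
import math
import random

def sample_neighborhood(x_prime, black_box, num_samples, rng, kernel_width=0.75):
    """Draw perturbed binary samples around x_prime and label each with the model."""
    nonzero = [i for i, v in enumerate(x_prime) if v == 1]
    neighborhood = []
    for _ in range(num_samples):
        # The number of nonzero elements to keep is itself drawn uniformly.
        k = rng.randint(1, len(nonzero))
        kept = set(rng.sample(nonzero, k))
        z_prime = [1 if i in kept else 0 for i in range(len(x_prime))]
        # The model's prediction on the perturbed sample serves as its label.
        label = black_box(z_prime)
        # Assumed proximity weight: samples with fewer removed elements weigh more.
        distance = sum(a != b for a, b in zip(x_prime, z_prime)) / len(x_prime)
        weight = math.exp(-(distance ** 2) / kernel_width ** 2)
        neighborhood.append((z_prime, label, weight))
    return neighborhood

def toy_model(z):
    # Hypothetical stand-in for a trained classifier: predicts 1 when feature 0 is present.
    return 1 if z[0] == 1 else 0

rng = random.Random(0)          # fixed seed so the sketch is repeatable
x_prime = [1, 0, 1, 1]          # hypothetical instance with three nonzero elements
samples = sample_neighborhood(x_prime, toy_model, num_samples=5, rng=rng)
```

Each returned triple pairs a neighboring datapoint with the prediction it received, which is the sense in which the rejection reads the sampled neighborhood as being "linked" to the one or more predictions.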

As per claim 18, the combination of Hetherington, and Ribeiro, as shown above teaches the computer program product according to claim 16, Hetherington further teaches:
wherein the generating the multilevel explanation tree, (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and distances between leaves of the multilevel explanation tree indicating (Hetherington, Para. [0114] discloses “A decision tree may have many more levels and nodes than shown, and the explanatory value of the nodes decreases based on distance from the root node.”)
and wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
the neighborhood of datapoints around each of the training datapoints (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
differences between values of the neighborhood of datapoints, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (Different clusters of data get sampled))
and wherein a linear or non-linear local explainability is implemented. (Ribeiro, Section 3.3. discloses “Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by πx.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.
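For illustration only, the "locally faithful (linear in this case)" explanation referenced in the quoted passage of Ribeiro, Section 3.3 may be sketched as a proximity-weighted linear fit around a single point. The function `local_linear_fit`, the one-dimensional setting, the Gaussian-style weight standing in for πx, and the example model f(x) = x² are assumptions of this sketch; Ribeiro's own procedure is not reproduced here.

```python
import math

def local_linear_fit(f, x0, offsets, kernel_width):
    """Fit y = intercept + slope * x to f near x0, weighting samples by proximity to x0."""
    xs = [x0 + d for d in offsets]
    ys = [f(x) for x in xs]                      # model outputs serve as labels
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    # Closed-form weighted least squares for a single feature.
    cov = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical non-linear model explained near x0 = 1 with symmetric offsets.
slope, intercept = local_linear_fit(lambda x: x * x, 1.0,
                                    [-0.2, -0.1, 0.0, 0.1, 0.2], kernel_width=0.25)
```

With symmetric offsets the fitted slope comes out close to 2, the derivative of x² at x0 = 1, which shows how a linear surrogate can be faithful near the explained point even though the underlying model is non-linear.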

As per claim 19, the combination of Hetherington, and Ribeiro, as shown above teaches the computer program product according to claim 16, Hetherington further teaches:
wherein the generating by the computing device the multilevel explanation tree, (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and distances between leaves of the multilevel explanation tree indicating (Hetherington, Para. [0114] discloses “A decision tree may have many more levels and nodes than shown, and the explanatory value of the nodes decreases based on distance from the root node.”)
and wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
the neighborhood of datapoints around each of the training datapoints (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
differences between values of the neighborhood of datapoints, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (Different clusters of data get sampled))
the neighborhood of datapoints to explain one or more predictions of the machine learning model. (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model…. The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to πx) and far away from x (low weight from πx). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by πx.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 20, the combination of Hetherington, and Ribeiro, as shown above teaches the computer program product according to claim 16, Hetherington further teaches:
	the computer program product being cloud implemented (Hetherington, Para. [0144] discloses “The above-described basic computer hardware and software and cloud computing environment presented for purpose of illustrating the basic underlying computer components that may be employed for implementing the example embodiment(s).”)
and wherein the utilizing by the computing device includes the leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
the neighborhood of datapoints to explain one or more predictions of the machine learning model. (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model…. The primary intuition behind LIME is presented in Figure 3, where we sample instances both in the vicinity of x (which have a high weight due to πx) and far away from x (low weight from πx). Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by πx.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

Claims 2 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Hetherington, in view of Ribeiro, and further in view of U.S. Pub. No. US 20200097858 A1 to Baikalov, et al. (hereinafter, “Baikalov”).
	As per claim 2, the combination of Hetherington and Ribeiro as shown above teaches the method of claim 1, the combination of Hetherington and Ribeiro fails to explicitly teach:
further comprising receiving by the computing device a dataset for [[the pre-trained artificial intelligence model]] including the plurality of training datapoints
However, Baikalov teaches:
further comprising receiving by the computing device a dataset for [[the pre-trained artificial intelligence model]] including the plurality of training datapoints (Baikalov, Para. [0019] discloses “As shown in FIG. 2, the prediction process 20 has a training portion 22 and a prediction portion 24. In a training phase, the process may be supplied with subsets of training data (training observations) from a training dataset 26 comprising previously analyzed data similar to and representative of the type of target data to be analyzed and processed for which a prediction or outcome is desired.”)
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify generating a multilevel explanation tree as disclosed by Hetherington to use a plurality of training datapoints as disclosed by Baikalov. The combination would have been obvious because a person of ordinary skill in the art would be motivated to provide understandings and insights of a machine learning model in view of data such that an individual is able to fully understand the reasoning paradigm of the machine learning model, which increases the amount of trust and believability one has in the machine learning model’s predictions.

As per claim 10, the combination of Hetherington and Ribeiro, as shown above teaches the system according to claim 9, the combination of Hetherington and Ribeiro fails to explicitly teach:
further comprising receiving a dataset for [[the pre-trained artificial intelligence model]] including the plurality of training datapoints
However, Baikalov teaches:
further comprising receiving a dataset for [[the pre-trained artificial intelligence model]] including the plurality of training datapoints (Baikalov, Para. [0019] discloses “As shown in FIG. 2, the prediction process 20 has a training portion 22 and a prediction portion 24. In a training phase, the process may be supplied with subsets of training data (training observations) from a training dataset 26 comprising previously analyzed data similar to and representative of the type of target data to be analyzed and processed for which a prediction or outcome is desired.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington as modified with the teachings of Baikalov for at least the same reasons as discussed above in claim 2.

Claims 3-5, 11-13, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Hetherington, in view of Ribeiro, further in view of Baikalov, and further in view of Scalar Valued Functions to MIT (hereinafter, “MIT”)
As per claim 3, the combination of Hetherington, Ribeiro, and Baikalov as shown above teaches the method of claim 2, the combination of Hetherington, Ribeiro, and Baikalov fails to explicitly teach:
[[further comprising receiving by the computing device]] a coordinate wise map [[of the plurality of training datapoints]]
However, MIT teaches:
a coordinate wise map (MIT, Definition discloses “A scalar valued function is a function that takes one or more values but returns a single value” (Note that MIT discloses a Scalar valued function which can be applied to datapoints which subsequently acts on datapoints “coordinate wise” in that individual points are manipulated independently thus producing a “coordinate wise map” of the original data points))
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify generating a multilevel explanation tree as disclosed by Hetherington to use a scalar valued function to produce a coordinate wise map as disclosed by MIT. The combination would have been obvious because a person of ordinary skill in the art would be motivated to improve the effective range of datapoints such that the model may perform more effectively.
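For illustration only, the reading applied above (a scalar-valued function acting on each coordinate of each datapoint independently) may be sketched as follows. The names `scalar_fn` and `coordinate_wise_map`, the choice of `log1p` as the function, and the sample datapoints are hypothetical and are not taken from MIT, Hetherington, or the claims.

```python
import math

def scalar_fn(value):
    # Hypothetical scalar-valued function: takes one value and returns a single value.
    return math.log1p(value)

def coordinate_wise_map(datapoints, fn):
    """Apply fn to every coordinate of every datapoint independently."""
    return [[fn(coordinate) for coordinate in point] for point in datapoints]

points = [[0.0, 1.0], [3.0, 7.0]]            # two hypothetical 2-D training datapoints
mapped = coordinate_wise_map(points, scalar_fn)
```

Because each coordinate is transformed without reference to any other, the output has the same shape as the input, which is the property the rejection relies on when characterizing the result as a "coordinate wise map" of the original datapoints.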

	As per claim 4, the combination of Hetherington, Ribeiro, Baikalov, and MIT as shown above teaches the method of claim 3, Ribeiro further teaches:
further comprising sampling by the computing device a neighborhood of datapoints around each of the training datapoints. (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 5, the combination of Hetherington, Ribeiro, Baikalov, and MIT as shown above teaches the method of claim 4, Hetherington further teaches:
wherein the generating by the computing device the multilevel explanation tree, (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and distances between leaves of the multilevel explanation tree indicating (Hetherington, Para. [0114] discloses “A decision tree may have many more levels and nodes than shown, and the explanatory value of the nodes decreases based on distance from the root node.”)
further comprising executing the machine learning by utilizing by the computing device the multilevel explanation tree to explain one or more predictions of the machine learning model of [[clusters of data]] (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
	Ribeiro further teaches:
links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
the neighborhood of datapoints around each of the training datapoints (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
differences between values of the neighborhood of datapoints, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (Different clusters of data get sampled))
and wherein a linear or non-linear local explainability is implemented. (Ribeiro, Section 3.3. discloses “Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by πx.”)
clusters of data (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (These clusters of data are applied to the ML model from Hetherington as seen above))
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 11, the combination of Hetherington, Ribeiro, and Baikalov as shown above teaches the system according to claim 10, the combination of Hetherington, Ribeiro, and Baikalov fails to explicitly teach:
[[further comprising receiving]] a coordinate wise map [[of the plurality of training datapoints]]
However, MIT teaches:
a coordinate wise map (MIT, Definition discloses “A scalar valued function is a function that takes one or more values but returns a single value” (Note that MIT discloses a Scalar valued function which can be applied to datapoints which subsequently acts on datapoints “coordinate wise” in that individual points are manipulated independently thus producing a “coordinate wise map” of the original data points))
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington as modified with the teachings of MIT for at least the same reasons as discussed above in claim 3.

	As per claim 12, the combination of Hetherington, Ribeiro, Baikalov, and MIT as shown above teaches the system according to claim 11, Ribeiro further teaches:
[[wherein leaves of the multilevel explanation tree provides]] local sample-wise explanations, (Ribeiro, Section 3.3 discloses “Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case),” (Hetherington’s multilevel tree has leaf nodes))
[[a root of the multilevel explanation tree provides]] global dataset-level explanation (Ribeiro, Section 8 discloses “We also introduced SP-LIME, a method to select representative and non-redundant predictions, providing a global view of the model to users” (Hetherington’s multilevel tree has a root node))
[[and intermediate levels of the multilevel explanation tree provides]] explanations of clusters of data. (Ribeiro, Section 1 discloses “In this paper, we propose providing explanations for individual predictions as a solution to the “trusting a prediction” problem,” Fig. 2 provides an example explanation of a cluster of data, and Section 3.3. discloses sampling data for local exploration and explanation (Hetherington’s multilevel tree has intermediate levels of the tree))
further comprising sampling a neighborhood of datapoints around each of the training datapoints. (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.

As per claim 13, the combination of Hetherington, Ribeiro, Baikalov, and MIT as shown above teaches the system according to claim 12, Hetherington further teaches:
wherein the generating the multilevel explanation tree, (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
leaves of the multilevel explanation tree representing (Hetherington, Para. [0035] discloses “Tree nodes 111-114 form a decision tree that exactly or approximately reflects how the ML model classified/labeled examples 0-7 based on the values of their features A-C. Node 111 is the root of the tree. Each tree node 111-114 has a condition that is based on a feature, a relational operator, and a split value.”)
and distances between leaves of the multilevel explanation tree indicating (Hetherington, Para. [0114] discloses “A decision tree may have many more levels and nodes than shown, and the explanatory value of the nodes decreases based on distance from the root node.”)
	Ribeiro further teaches:
links the neighborhood of datapoints around each of the training datapoints to the one or more predictions, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
the neighborhood of datapoints around each of the training datapoints (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.”)
differences between values of the neighborhood of datapoints, (Ribeiro, Section 3.3. discloses “We sample instances around x′ by drawing nonzero elements of x′ uniformly at random (where the number of such draws is also uniformly sampled). Given a perturbed sample z′ ∈ {0, 1}^{d′} (which contains a fraction of the nonzero elements of x′), we recover the sample in the original representation z ∈ R^d and obtain f(z), which is used as a label for the explanation model.” (Different clusters of data get sampled))
and wherein a linear or non-linear local explainability is implemented. (Ribeiro, Section 3.3 discloses “Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case), where the locality is captured by π_x.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.
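For illustration only, the sampling step quoted from Ribeiro, Section 3.3 can be sketched as follows. This is a simplified sketch, not the LIME implementation: the function name sample_neighborhood, the recover argument (assumed here to default to the identity mapping from the interpretable representation back to the original representation), and the seed parameter are hypothetical.

```python
import random

def sample_neighborhood(x_prime, f, num_samples=5, recover=lambda z: z, seed=0):
    """Sample perturbed instances around a binary vector x_prime and label them with f.

    Each sample keeps a uniformly chosen number of the nonzero elements of
    x_prime, drawn uniformly at random; f(recover(z_prime)) is the label.
    """
    rng = random.Random(seed)
    nonzero = [i for i, v in enumerate(x_prime) if v != 0]
    if not nonzero:
        raise ValueError("x_prime has no nonzero elements to draw from")
    samples = []
    for _ in range(num_samples):
        k = rng.randint(1, len(nonzero))      # number of draws, uniformly sampled
        kept = set(rng.sample(nonzero, k))    # which nonzero elements survive
        z_prime = [1 if i in kept else 0 for i in range(len(x_prime))]
        samples.append((z_prime, f(recover(z_prime))))
    return samples

# Hypothetical black-box model: counts active features.
neighborhood = sample_neighborhood([1, 0, 1, 1, 0], f=sum, num_samples=20)
```

Each returned pair links one perturbed datapoint to the prediction obtained for it, which is the pairing the quoted passage describes as the label for the explanation model.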

As per claim 17, the combination of Hetherington and Ribeiro, as shown above, teaches the computer program product according to claim 16. Ribeiro further teaches:
[[wherein leaves of the multilevel explanation tree provides]] local sample-wise explanations, (Ribeiro, Section 3.3 discloses “Even though the original model may be too complex to explain globally, LIME presents an explanation that is locally faithful (linear in this case),” (Hetherington’s multilevel tree has leaf nodes))
[[a root of the multilevel explanation tree provides]] global dataset-level explanation, (Ribeiro, Section 8 discloses “We also introduced SP-LIME, a method to select representative and non-redundant predictions, providing a global view of the model to users” (Hetherington’s multilevel tree has a root node))
[[and intermediate levels of the multilevel explanation tree provides]] explanations of clusters of data. (Ribeiro, Section 1 discloses “In this paper, we propose providing explanations for individual predictions as a solution to the ‘trusting a prediction’ problem”; Fig. 2 provides an example explanation of a cluster of data; and Section 3.3 discloses sampling data for local exploration and explanation (Hetherington’s multilevel tree has intermediate levels of the tree))
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Ribeiro for at least the same reasons as discussed above in claim 1.
The combination of Hetherington and Ribeiro fails to explicitly teach:
further comprising receiving by the computing device a dataset for [[the pre-trained artificial intelligence model]] including the plurality of training datapoints
However, Baikalov teaches:
further comprising receiving by the computing device a dataset for [[the pre-trained artificial intelligence model]] including the plurality of training datapoints (Baikalov, Para. [0019] discloses “As shown in FIG. 2, the prediction process 20 has a training portion 22 and a prediction portion 24. In a training phase, the process may be supplied with subsets of training data (training observations) from a training dataset 26 comprising previously analyzed data similar to and representative of the type of target data to be analyzed and processed for which a prediction or outcome is desired.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of Baikalov for at least the same reasons as discussed above in claim 2.
The combination of Hetherington and Ribeiro fails to explicitly teach:
	[[receiving]] a coordinate wise map [[of the plurality of training datapoints]]
However, MIT teaches:
a coordinate wise map (MIT, Definition discloses “A scalar valued function is a function that takes one or more values but returns a single value” (Note that MIT discloses a scalar valued function that can be applied to datapoints. Because such a function acts on each point independently, it operates “coordinate wise” and thus produces a “coordinate wise map” of the original datapoints))
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Hetherington with the teachings of MIT for at least the same reasons as discussed above in claim 3.
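For illustration only, one possible reading of the interpretation noted above, in which a scalar valued function is applied independently to each coordinate of each datapoint, can be sketched as follows. The function name coordinate_wise_map and the example function g are hypothetical and do not appear in MIT.

```python
def coordinate_wise_map(points, g):
    """Apply the scalar valued function g to every coordinate of every point.

    Each coordinate is transformed independently of the others, so the
    output has the same shape as the input.
    """
    return [[g(c) for c in point] for point in points]

# Hypothetical scalar valued function: squares a single value.
mapped = coordinate_wise_map([[1, 2], [3, 4]], lambda v: v * v)
```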
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAMZA RAZZAQ MUGHAL whose telephone number is (571)272-8833. The examiner can normally be reached M-TR 7:30-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached at 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/H.R.M./Examiner, Art Unit 2123
/NICHOLAS KLICOS/Primary Examiner, Art Unit 2145