DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are presented for examination.

Continued Examination under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on September 14, 2022 has been entered.

Response to Amendment
Applicant’s arguments have not overcome the objections to the specification and drawings.  Therefore, those objections are maintained.  Examiner’s response to Applicant’s arguments with respect to the specification and drawings is provided in the section entitled “Response to Arguments” infra.
However, Applicant’s amendment has obviated the objections to the claims.  Therefore, those objections are withdrawn.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on June 13, 2022 (x2) and December 15, 2022 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Specification
The specification is objected to for containing various minor informalities.  Examiner has attached a marked-up copy of the specification indicating where errors have occurred.  To the extent that the markings are not self-explanatory and are not corrected, Examiner will enumerate the objections in a subsequent Office Action.

Drawings
The drawings are objected to because reference character 502 (Fig. 4B) appears to be written on a shaded background; see 37 CFR § 1.84(p)(3).  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 101
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).
Claim 1
Step 1: The claim is directed to a method; therefore, it is directed to the statutory category of processes.
Step 2A Prong 1:  The claim recites, inter alia, “providing … one or more performance views based on … performance information, the one or more performance views including a plurality of graphical elements associated with a plurality of feature clusters, wherein the plurality of feature clusters include subsets of test instances from [a] plurality of test instances based on associated feature labels, and wherein the one or performance views includes an indication of … accuracy data corresponding to at least one feature cluster from the plurality of feature clusters.”  Providing performance views comprising graphical elements associated with feature clusters based on performance information could potentially be performed on pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  The claim recites that the providing of one or more performance views is performed using a “graphical user interface”.  However, mere recitation that a process that is performable either mentally or with pen and paper is instead to be performed using a generic variety of computer software is a mere instruction to apply the exception to a generic computer.  See MPEP § 2106.05(f).
The remainder of the claim recites “receiving, at a client device, a performance report including performance information for a machine learning system, the machine learning system including one or more machine learning models trained to generate an output based on an input object provided as input to the machine learning system, wherein the performance information comprises: a plurality of outputs of the machine learning system for a plurality of test instances; accuracy data of the plurality of outputs, wherein the accuracy data include identified errors between outputs from the plurality of outputs and associated ground truth data corresponding to the plurality of test instances; and feature data associated with the plurality of test instances, the feature data comprising a plurality of feature labels associated with characteristics of the plurality of test instances.”  The recitation of receiving a performance report is directed to the insignificant extra-solution activity of mere data gathering (although the data that are gathered are very specific).  See MPEP § 2106.05(g).  The recitation that the performance report is for a machine learning system does not alter the analysis; the source of the report is not relevant to the final determination that the receipt of the report itself is insignificant extra-solution activity.  Note that the claim does not positively recite the training or use of the machine learning system; it merely indicates that the system is the source of the data.
Step 2B:  The claim does not contain significantly more than the judicial exception.  As noted above, the recitation that the providing of the performance views is to be performed using a GUI is a mere instruction to apply an exception using a generic computer.  See MPEP § 2106.05(f).  Moreover, the recitation of receiving a performance report comprising performance information is the insignificant extra-solution activity of mere data gathering, as noted above.  See MPEP § 2106.05(g).  As an ordered whole, the claim is directed to a method, potentially performable with pen and paper, of visualizing the performance of a machine learning model using various metrics of performance.  Nothing in the claim provides significantly more than this.  As such, the claim is not patent eligible. 
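For illustration only, the "indication of accuracy data corresponding to at least one feature cluster" discussed above can be sketched as a simple grouping computation. This is a minimal sketch under stated assumptions: the test instances, the feature labels ("outdoor", "night", etc.), and the function name `cluster_accuracy` are hypothetical and are drawn from neither the application nor any cited reference.

```python
from collections import defaultdict

# Hypothetical test instances: (feature labels, model output, ground truth).
TEST_INSTANCES = [
    ({"outdoor", "daylight"}, "horse", "horse"),
    ({"outdoor", "night"}, "giraffe", "horse"),
    ({"indoor", "daylight"}, "cow", "cow"),
    ({"outdoor", "night"}, "cow", "cow"),
]


def cluster_accuracy(instances):
    """Group instances by feature label and return the accuracy of each group."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for labels, output, truth in instances:
        for label in labels:
            totals[label] += 1
            if output == truth:
                correct[label] += 1
    return {label: correct[label] / totals[label] for label in totals}


accuracy_by_cluster = cluster_accuracy(TEST_INSTANCES)
for label in sorted(accuracy_by_cluster):
    print(f"{label}: {accuracy_by_cluster[label]:.2f}")
```

Each feature label here defines one cluster (the subset of test instances carrying that label), and the printed value is the fraction of that subset whose output matched the ground truth.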

Claim 2
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites “detecting a selection of a graphical element from the plurality of graphical elements associated with a combination of one or more feature labels; and providing a visualization of the accuracy data associated with a subset of outputs from the plurality of outputs corresponding to a subset of test instances corresponding to the combination of one or more feature labels.”  Detecting a selection of a graphical element could be performed by merely observing that a user has selected the element.  Providing the visualization could be performed by drawing out the visualization with the claimed properties using a pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 3
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites that “the plurality of graphical elements comprises a list of selectable features corresponding to the plurality of feature clusters, wherein the selectable features are ranked within the list based on measures of correlation between the plurality of feature clusters and identified errors from the accuracy data.”  The ranking of features within a list based on correlation between clusters and error data could potentially be performed mentally.  The providing of performance views comprising selectable graphical elements is still potentially mentally performable insofar as the claimed “selection” could merely correspond to a mental selection of an element.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).
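For illustration only, the ranking of features "based on measures of correlation between the plurality of feature clusters and identified errors" might be sketched as follows. The claim does not specify a particular correlation measure; this sketch assumes a Pearson correlation between binary cluster membership and a binary error flag. The records and the names `pearson` and `rank_features` are hypothetical.

```python
import math

# Hypothetical records: (feature labels of a test instance, whether an error was identified).
RECORDS = [
    ({"night"}, True),
    ({"night", "blurry"}, True),
    ({"daylight"}, False),
    ({"daylight", "blurry"}, False),
    ({"night"}, False),
    ({"daylight"}, False),
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(records):
    """Return (label, correlation) pairs, most error-correlated label first."""
    labels = sorted({label for label_set, _ in records for label in label_set})
    errors = [1 if is_error else 0 for _, is_error in records]
    scored = []
    for label in labels:
        membership = [1 if label in label_set else 0 for label_set, _ in records]
        scored.append((label, pearson(membership, errors)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_features(RECORDS)
for label, score in ranked:
    print(f"{label}: {score:+.3f}")
```

With the hypothetical records above, "night" ranks first because both identified errors fall in that cluster.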

Claim 4
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites that “providing the one or more performance views comprises providing a global performance view for the plurality of feature clusters, the global performance view including a visual representation of the accuracy data with respect to multiple feature clusters of the plurality of feature clusters, … wherein the plurality of graphical elements includes selectable portions of the global performance view associated with the multiple feature clusters.”  Accuracy data with respect to multiple feature clusters can be visually represented with pen and paper, and this pen-and-paper representation can include “selectable” portions of the view insofar as these portions can be selected mentally.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 5
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites “detecting a selection of a graphical element corresponding to a first feature cluster from the plurality of feature clusters; … wherein providing the one or more performance views comprises providing a cluster performance view for the first feature cluster, the cluster performance view comprising a visualization of the accuracy data for a first subset of outputs from the plurality of outputs associated with the first feature cluster.”  The detecting limitation may encompass the mental determination that an element has been selected.  The limitation reciting providing of a cluster performance view comprising a visualization of accuracy data for a subset of outputs may encompass the drawing, using pen and paper, of a visual representation of the results of the model that has the claimed properties.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 6
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites that “the cluster performance view comprises a multi-branch visualization of the accuracy data for the plurality of outputs, wherein the multi-branch visualization comprises: a first branch including an indication of the accuracy data associated with the first subset of outputs from the plurality of outputs associated with the first feature cluster; and a second branch including an indication of the accuracy data associated with a second subset of outputs from the plurality of outputs not associated with the first feature cluster.”  The providing of a cluster performance view containing multiple branches, each branch indicating accuracy data for subsets of data from a cluster, may be performed merely by drawing a visual representation of the data with the claimed properties.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 7
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites “detecting a selection of the first branch; detecting a selection of an additional graphical element corresponding to a second feature cluster from the plurality of feature clusters; and providing a third branch including an indication of the accuracy data associated with a third subset of outputs associated with a combination of feature labels shared by the first cluster and the second feature cluster.”  The two limitations reciting the detection of a selection could encompass the mental perception that a user has selected a branch and a graphical element.  The providing of a third branch with the claimed properties could encompass merely drawing a third branch with the claimed properties.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 8
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites that “the multi-branch visualization of the accuracy data for the plurality of outputs comprises: a root node representative of the plurality of outputs for the plurality of test instances; a first level including a first node representative of the first subset of outputs and a second node representative of the second subset of outputs; and a second level including a third node representative of the third subset of outputs.”  The providing of a multi-branch visualization of the accuracy data with the claimed properties may encompass merely drawing the multi-branch visualization using pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).
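For illustration only, the multi-branch structure recited in claims 6-8 (a root node for all outputs, a first level splitting on membership in a first feature cluster, and a second level for a combination of feature labels) can be sketched as a small tree. The data, the `Node` class, and the helper names are hypothetical; nothing here is taken from the application's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    outcomes: list  # True where the output matched the ground truth
    children: list = field(default_factory=list)

    @property
    def accuracy(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else None


# Hypothetical test results: (feature labels, output was correct).
INSTANCES = [
    ({"night"}, False),
    ({"night", "blurry"}, False),
    ({"night"}, True),
    ({"daylight"}, True),
    ({"daylight", "blurry"}, True),
]


def subset(instances, predicate):
    """Correctness flags of the instances whose labels satisfy the predicate."""
    return [ok for labels, ok in instances if predicate(labels)]


def build_tree(instances, first, second):
    """Root -> (in first cluster, not in first cluster) -> (first and second)."""
    root = Node("all", subset(instances, lambda labels: True))
    in_first = Node(first, subset(instances, lambda labels: first in labels))
    not_first = Node(f"not {first}", subset(instances, lambda labels: first not in labels))
    both = Node(
        f"{first} & {second}",
        subset(instances, lambda labels: first in labels and second in labels),
    )
    in_first.children.append(both)
    root.children.extend([in_first, not_first])
    return root


def render(node, depth=0):
    """Return indented text lines, one per node, with count and accuracy."""
    acc = node.accuracy
    acc_text = "n/a" if acc is None else f"{acc:.2f}"
    lines = [f"{'  ' * depth}{node.name}: n={len(node.outcomes)}, accuracy={acc_text}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


tree = build_tree(INSTANCES, "night", "blurry")
print("\n".join(render(tree)))
```

The first-level nodes correspond to the claimed first and second branches, and the second-level node corresponds to the third branch for outputs sharing both feature labels.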

Claim 9
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites that “providing the one or more performance views further comprises providing an instance view associated with a selected feature cluster, wherein the instance view comprises a display of a test instance, a display of an output from the machine learning system for the test instance, and a display of at least a portion of the ground truth data for the test instance.”  An instance view associated with a selected cluster comprising a display of a test instance, a display of a machine learning output, and a display of ground truth data can be provided by merely drawing the instance view using pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not meaningfully integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed using generic computer equipment does not amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 10
Step 1: A process, as above.
Step 2A Prong 1:  The claim recites the same mental process as in claim 1.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  The claim recites, inter alia, “provid[ing] failure information to a training system, the failure information comprising an indication of one or more feature labels from the plurality of feature labels associated with a threshold rate of identified errors from the accuracy data….”  This limitation recites the insignificant extra-solution activity of mere data gathering.  See MPEP § 2106.05(g).  The claim then recites that the providing is performed via a “selectable option” provided by a “graphical user interface of a client device,” which amounts to a mere instruction to apply the judicial exception using generic computer equipment.  See MPEP § 2106.05(f).  Finally, the claim recites “causing the training system to refine at least one machine learning model of the machine learning system based on selectively identified training data associated with the one or more feature labels.”  Insofar as this limitation is only nominally or tangentially related to the claimed invention, which is directed predominantly to the providing of the performance views themselves rather than the refining of a machine learning system based thereon, the limitation recites insignificant post-solution activity.  See MPEP § 2106.05(g).
Step 2B:  The claim does not contain significantly more than the judicial exception.  The providing limitation, in addition to being insignificant extra-solution activity, is also directed to the well-understood, routine, and conventional activity of receiving or transmitting data over a network.  See MPEP § 2106.05(d)(II); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network).  As noted above, the recitation that this providing is performed via a GUI is a mere instruction to apply the exception using generic computer equipment.  See MPEP § 2106.05(f).  In addition to being insignificant post-solution activity, the refining of a machine learning model based on selectively identified training data associated with feature labels is well-understood, routine, and conventional.  See MPEP § 2106.05(d); Jenson (US 20180218283) paragraph 26 (disclosing that conventional approaches may retrain (i.e., refine) machine learning models when a change occurs based on a pattern).

Claim 11
Step 1: The claim is directed to a “system, comprising: one or more processors; memory in electronic communication with the one or more processors; and instructions stored in the memory, the instructions being executable by the one or more processors”; therefore, the claim is directed to the statutory category of machines.
Step 2A Prong 1:  The claim recites the same mental processes as in claim 1.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Unlike claim 1, claim 11 recites that the method is performed by a “system, comprising: one or more processors; memory in electronic communication with the one or more processors; and instructions stored in the memory, the instructions being executable by the one or more processors”.  However, mere recitation that the judicial exception is to be performed by generic computing equipment cannot integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claim 1.
Step 2B:  The claim does not contain significantly more than the judicial exception.  Unlike claim 1, claim 11 recites that the method is performed by a “system, comprising: one or more processors; memory in electronic communication with the one or more processors; and instructions stored in the memory, the instructions being executable by the one or more processors”.  However, mere recitation that the judicial exception is to be performed by generic computing equipment cannot amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claim 1.

Claims 12-13 and 15
Step 1: A machine, as above.
Step 2A Prong 1:  The claims recite the same mental processes as in claims 2, 5, and 10, respectively.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  The mere recitation that the method is to be performed using one or more processors and a server device does not meaningfully integrate the judicial exception into a practical application, as it constitutes a mere instruction to apply the exception on a generic computer.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claims 2, 5, and 10, respectively.
Step 2B:  The claims do not contain significantly more than the judicial exception.  The mere recitation that the method is to be performed using one or more processors and a server device does not amount to significantly more than the judicial exception, as it constitutes a mere instruction to apply the exception on a generic computer.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claims 2, 5, and 10, respectively.

Claim 14
Step 1: A machine, as above.
Step 2A Prong 1:  The claim recites that “providing the one or more performance views further comprises providing an instance view associated with the first feature cluster, wherein the instance view comprises a display of a test instance from the first feature cluster and associated accuracy data for the test instance.”  This limitation could encompass the drawing, with pen and paper, of an instance view containing a test instance and associated accuracy data for the test instance.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Mere recitation that the judicial exception is to be performed with generic computing equipment cannot integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).
Step 2B:  The claim does not contain significantly more than the judicial exception.  Mere recitation that the judicial exception is to be performed with generic computing equipment cannot amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).

Claim 16
Step 1: The claim is directed to a “non-transitory computer readable storage medium storing instructions thereon”; therefore, it is directed to the statutory category of articles of manufacture.
Step 2A Prong 1:  The claim recites the same mental processes as in claim 1.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Unlike claim 1, claim 16 recites that the method is performed by a “non-transitory computer readable storage medium storing instructions thereon that, when executed by one or more processors, causes a client device to [perform the method]”.  However, mere recitation that the judicial exception is to be performed by generic computing equipment cannot integrate the judicial exception into a practical application.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claim 1.
Step 2B:  The claim does not contain significantly more than the judicial exception.  Unlike claim 1, claim 16 recites that the method is performed by a “non-transitory computer readable storage medium storing instructions thereon that, when executed by one or more processors, causes a client device to [perform the method]”.  However, mere recitation that the judicial exception is to be performed by generic computing equipment cannot amount to significantly more than the judicial exception.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claim 1.

Claims 17-20
Step 1:  An article of manufacture, as above.
Step 2A Prong 1:  The claims recite the same mental processes as in claims 2, 5, and 9-10, respectively.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  The mere recitation that the method is to be performed using a processor executing instructions stored on a non-transitory computer-readable storage medium does not meaningfully integrate the judicial exception into a practical application, as it amounts to a mere instruction to apply the judicial exception on a generic computer.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claims 2, 5, and 9-10, respectively.
Step 2B:  The claims do not contain significantly more than the judicial exception.  The mere recitation that the method is to be performed using a processor executing instructions stored on a non-transitory computer-readable storage medium does not amount to significantly more than the judicial exception, as it amounts to a mere instruction to apply the judicial exception on a generic computer.  See MPEP § 2106.05(f).  With that exception, the analysis at this step mirrors that of claims 2, 5, and 9-10, respectively.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 4-5, and 9-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Dasgupta et al. (US 10719301) (“Dasgupta”).
Regarding claim 1, Dasgupta discloses “[a] method (may also be embodied as a system comprising one or more processors and a memory containing instructions that cause a device to perform the method, see Dasgupta, col. 60, ll. 52-62 (disclosing a computer system that performs the method comprising processors coupled to a system memory via an I/O interface)), comprising: 
receiving, at a client device, a performance report including performance information for a machine learning system, the machine learning system including one or more machine learning models trained to generate an output based on an input object provided as input to the machine learning system (model development environment may provide a model diagnosis interface that may be used to generate a body of performance metrics from model performance metrics in a repository – Dasgupta, col. 11, ll. 24-37; model experiment may involve a single machine learning model trained in one or more training runs; a model tester generates test results for the model, which are saved as model performance metrics [performance information, collectively comprising a performance report] – id. at col. 10, ll. 1-25; see also Fig. 1 (depicting a model development environment containing model performance metrics that are sent to a development workflow containing various interfaces)), wherein the performance information comprises: 
a plurality of outputs of the machine learning system for a plurality of test instances (input samples to a production model [i.e., test instances] and results of the production model [outputs] may be saved in the same data store and periodically provided to the model development environment – Dasgupta, col. 18, l. 66-col. 19, l. 5); 
accuracy data of the plurality of outputs, wherein the accuracy data include identified errors between outputs from the plurality of outputs and associated ground truth data corresponding to the plurality of test instances (Dasgupta Fig. 17B depicts a confusion matrix comparing the output of a classifier to the ground truth; non-diagonal entries correspond to errors (e.g., one horse is misclassified as a giraffe)); and 
feature data associated with the plurality of test instances, the feature data comprising a plurality of feature labels associated with characteristics of the plurality of test instances (Dasgupta Fig. 15 discloses that each image in the test set [test instance] is associated with a feature label (e.g., horse, cow, giraffe); Fig. 24 shows that this label is associated with certain characteristics of the images such as a head of a horse); and 
providing, via a graphical user interface, one or more performance views based on the performance information (model development environment may provide a model diagnosis interface, which may include a series of multiple GUIs or webpages; the interface may be used to generate a body of performance metrics from model performance metrics in a repository; the interface may allow users to view the performance metrics in different ways – Dasgupta, col. 11, ll. 38-49; see also col. 8, l. 51-col. 9, l. 3 (disclosing that the interfaces may also include a media data management interface, model experiment interface, and result notification interface)), the one or more performance views including a plurality of graphical elements associated with a plurality of feature clusters (media annotation system may generate training and test sets for an active learning classifier used to annotate media samples using a clustering technique – Dasgupta, col. 44, ll. 10-18; feature vectors are extracted from the samples, and a clustering technique is used to cluster the feature vectors; the feature vectors are displayed [as graphical elements] in a graphical user interface with the clustering – id. at col. 44, ll. 29-58), wherein the plurality of feature clusters include subsets of test instances from the plurality of test instances based on associated feature labels (the images [test instances] in the image set may be clustered using a clustering technique and the clustering is used to determine different clusters of images with similar features in the image set – Dasgupta, col. 36, ll. 1-12; see also Figs. 14 (depicting the set of feature vectors corresponding to the images divided into subsets via clustering); 15 (showing that each of the images is associated with a label like “cow,” “horse,” or “giraffe”)), and wherein the one or performance views includes an indication of the accuracy data corresponding to at least one feature cluster from the plurality of feature clusters (Dasgupta Fig. 17B shows a GUI that depicts a confusion matrix showing how many test images the classifier classified correctly vs. incorrectly [accuracy data]; in this instance, 4/5 horses were classified correctly and all cows and giraffes were classified correctly; see also col. 36, ll. 1-12 (disclosing that the images in the image set may be clustered using a clustering technique and that the clustering is used to determine different clusters of images with similar features in the image set)).”
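For illustration only, the kind of confusion matrix relied upon above (4 of 5 horses classified correctly, one horse misclassified as a giraffe, all cows and giraffes classified correctly) can be sketched as follows. The counts of cow and giraffe images are hypothetical, as are the function names; this is a sketch of the general technique, not a reproduction of the Dasgupta figure.

```python
from collections import Counter

CLASSES = ["cow", "giraffe", "horse"]

# (ground truth, predicted) pairs. The horse counts follow the discussion above;
# the cow and giraffe counts (3 each) are hypothetical.
PAIRS = (
    [("horse", "horse")] * 4
    + [("horse", "giraffe")]
    + [("cow", "cow")] * 3
    + [("giraffe", "giraffe")] * 3
)


def confusion_matrix(pairs, classes):
    """Rows are ground-truth classes; columns are predicted classes."""
    counts = Counter(pairs)
    return [[counts[(truth, pred)] for pred in classes] for truth in classes]


def off_diagonal_errors(matrix, classes):
    """Map (ground truth, predicted) to count for every nonzero off-diagonal entry."""
    return {
        (classes[i], classes[j]): matrix[i][j]
        for i in range(len(classes))
        for j in range(len(classes))
        if i != j and matrix[i][j] > 0
    }


matrix = confusion_matrix(PAIRS, CLASSES)
errors = off_diagonal_errors(matrix, CLASSES)
for truth, row in zip(CLASSES, matrix):
    print(f"{truth:>8}: {row}")
print(errors)
```

The diagonal entries count correct classifications, and each nonzero off-diagonal entry is an identified error between an output and its ground truth.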

Claim 11 is a system claim corresponding to method claim 1 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 16 is a non-transitory computer-readable storage medium claim corresponding to method claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 2, Dasgupta discloses “detecting a selection of a graphical element from the plurality of graphical elements associated with a combination of one or more feature labels (Dasgupta Fig. 22B depicts the system detecting a user selecting an entry in a confusion matrix [graphical element] that is associated with ground truth label “horse” but for which the model made the prediction “giraffe” [feature labels]); and
providing a visualization of the accuracy data associated with a subset of outputs from the plurality of outputs corresponding to a subset of test instances corresponding to the combination of one or more feature labels (Dasgupta Fig. 22B depicts a popup window with pictures of horses [test instances] with ground truth label “horse” [feature label] that were incorrectly labeled as giraffes [output = “giraffe”], selected from a confusion matrix that shows that 35 horses were incorrectly labeled as giraffes by the model [confusion matrix = visualization of accuracy data]).”  

Claim 12 is a system claim corresponding to method claim 2 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 17 is a non-transitory computer-readable storage medium claim corresponding to method claim 2 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 4, Dasgupta discloses that “providing the one or more performance views comprises providing a global performance view for the plurality of feature clusters (Dasgupta Fig. 22B depicts a confusion matrix showing how many images of cows/horses/giraffes were correctly and incorrectly classified [so it is global insofar as it shows the accuracy for all available categories of images]; see also col. 36, ll. 13-21 (disclosing that the feature vector of each image is associated with a cluster)), the global performance view including a visual representation of the accuracy data with respect to multiple feature clusters of the plurality of feature clusters (Dasgupta Fig. 22B depicts a confusion matrix showing how many images of cows/horses/giraffes were correctly and incorrectly classified [so it provides a visual representation of the accuracy of the model in classifying multiple categories of images]), and wherein the plurality of graphical elements includes selectable portions of the global performance view associated with the multiple feature clusters (Dasgupta Fig. 22B shows that the elements of the confusion matrix are selectable and that, when selected, they show examples of images that were incorrectly classified, e.g., horses misclassified as giraffes).”  

Regarding claim 5, Dasgupta discloses “detecting a selection of a graphical element corresponding to a first feature cluster from the plurality of feature clusters (user interface may be used to load an image set (or other media dataset) via a load button; the user interface may include a view button that allows users visually to inspect [select] the images or other media samples [graphical elements] to be loaded – Dasgupta, col. 35, ll. 38-49; images in the image set may be clustered using a clustering technique – id. at col. 36, ll. 1-12; see also Fig. 14 (showing a view button on the GUI that shows the feature vectors of the images organized into clusters)); and 
wherein providing the one or more performance views comprises providing a cluster performance view for the first feature cluster, the cluster performance view comprising a visualization of the accuracy data for a first subset of outputs from the plurality of outputs associated with the first feature cluster (Dasgupta Fig. 17B discloses a confusion matrix [cluster performance view] showing how many images were incorrectly versus correctly classified [accuracy data; output = output of classifier]; one block, for instance, shows that one horse was incorrectly classified as a giraffe [i.e., it is a subset of outputs]; see also col. 36, ll. 1-12 (disclosing that each image in the image set may be associated with a cluster)).”  

Claim 13 is a system claim corresponding to method claim 5 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 18 is a non-transitory computer-readable storage medium claim corresponding to method claim 5 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 9, Dasgupta discloses “providing the one or more performance views further comprises providing an instance view associated with a selected feature cluster, wherein the instance view comprises a display of a test instance, a display of an output from the machine learning system for the test instance, and a display of at least a portion of the ground truth data for the test instance (Dasgupta Fig. 23 discloses a GUI screen [instance view] containing pictures [displays] of horses classified as giraffes [test instances] listed under a column labeled “horse [ground truth data for the test instance] classified as giraffe [output from the machine learning system for the test instance]”; see also col. 36, ll. 1-12 (disclosing that the system may cluster the images in the image set using a clustering technique [so each image is associated with a cluster])).”

Claim 19 is a non-transitory computer-readable storage medium claim corresponding to method claim 9 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 10, Dasgupta discloses “providing, via the graphical user interface of the client device, a selectable option to provide failure information to a training system (samples that are difficult to classify may be selected for labeling; once human(s) annotate new samples, the classifier may be retrained with new data which are the most confusing samples [failure information] to the classifier’s current state [since humans labeling the confusing samples results in the system being retrained, the labeling itself is the option to provide the information to the training system] – Dasgupta, col. 32, ll. 3-15), the failure information comprising an indication of one or more feature labels from the plurality of feature labels associated with a threshold rate of identified errors from the accuracy data (samples that are difficult to classify may be selected for labeling; once human(s) annotate new samples, the classifier may be retrained with new data which are the most confusing samples [failure information, containing labels after humans have labeled them] to the classifier’s current state – Dasgupta, col. 32, ll. 3-15; in determining whether a performance aberration of the model is detected, if one or more performance metrics such as precision, recall, or F1 score falls below a specified threshold, an aberration may be detected [i.e., the labels may be associated with a threshold error rate] – Dasgupta, col. 29, l. 63-col. 30, l. 12); and 
causing the training system to refine at least one machine learning model of the machine learning system based on selectively identified training data associated with the one or more feature labels (samples that are difficult to classify [selectively identified training data] may be selected for labeling; once human(s) annotate new samples, the classifier may be retrained [refined] with new data which are the most confusing samples [failure information] to the classifier’s current state – Dasgupta, col. 32, ll. 3-15).”
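As a non-authoritative sketch of the threshold test relied on above (an aberration is detected if precision, recall, or F1 score falls below a specified threshold), the following computes the three metrics per label and flags any label for which at least one metric is below the threshold. The function names, the per-label counts, and the threshold of 0.8 are all hypothetical; none is drawn from Dasgupta.

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def aberrant_labels(counts, threshold):
    """Return labels for which any of the three metrics falls below `threshold`."""
    flagged = []
    for label, (tp, fp, fn) in counts.items():
        if any(value < threshold for value in prf1(tp, fp, fn).values()):
            flagged.append(label)
    return flagged


# Hypothetical (tp, fp, fn) counts per label.
counts = {"cow": (9, 1, 1), "horse": (3, 0, 4), "giraffe": (6, 4, 0)}
flagged = aberrant_labels(counts, threshold=0.8)
```

With these assumed counts, "horse" is flagged on recall and "giraffe" on precision, so both labels would be candidates for additional annotation and retraining.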

Claim 20 is a non-transitory computer-readable storage medium claim corresponding to method claim 10 and is rejected for the same reasons as given in the rejection of that claim.  Similarly, claim 15 is a system claim corresponding to method claim 10 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 14, Dasgupta discloses “providing the one or more performance views further comprises providing an instance view associated with the first feature cluster, wherein the instance view comprises a display of a test instance from the first feature cluster and associated accuracy data for the test instance (Dasgupta Fig. 23 discloses a GUI screen [instance view] containing pictures of horses classified as giraffes [test instances] and their associated closest images from the horse and giraffe datasets [associated accuracy data]; see also col. 36, ll. 1-12 (disclosing that the system may cluster the images in the image set using a clustering technique [so each image is associated with a cluster])).”

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Dasgupta in view of Dixon et al. (CA 2955330) (“Dixon”).
Regarding claim 3, the rejection of claim 1 is incorporated.  Dasgupta further discloses that “the plurality of graphical elements comprises a list of selectable features corresponding to the plurality of feature clusters (Dasgupta Fig. 17B discloses that a user may select an entry [selectable feature] from a confusion matrix corresponding to an incorrectly classified sample, e.g., a horse classified as a giraffe; see also col. 36, ll. 13-21 (disclosing that the feature vector of each image is associated with a cluster))….”
Dasgupta appears not to disclose explicitly the further limitations of the claim.  However, Dixon discloses that “the selectable features are ranked within the list based on measures of correlation between the plurality of feature clusters and identified errors from the accuracy data (users may be clustered into groups using dimension reduction techniques; the system may display information about why a group of users were clustered together; one way to do this is to find the top X dimensions in the low-dimension space that the cluster as a whole differs most on from the population average; the divergence of the cluster’s distribution of answers from the general population’s distribution in each dimension in the sub-space may be used to rank dimensions in terms of how well they explain what is unique about each cluster – paragraph 176; one way to cluster users may be to pick an initial random grouping of users and iteratively move users between clusters to minimize how much users differ from each other in their own cluster; the process may continue until a threshold amount of error has been reached [i.e., the ranking of features is based at least in part on how much each feature contributes to the decrease in the clustering error] – id. at paragraph 178).”  
Dixon and the instant application both relate to the ranking of features in clusters for machine learning systems and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta to rank the features in each cluster based on correlations between the clusters and error data, as disclosed by Dixon, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would enhance the explainability of the system by allowing the system to display information about why certain data instances were clustered together.  See Dixon, paragraph 176.
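The ranking attributed to Dixon above (finding the top X dimensions on which a cluster differs most from the population average) can be sketched as follows. This is an illustrative simplification, not Dixon's implementation: it uses the absolute difference of means as a stand-in for the divergence between distributions, and the function name `rank_dimensions` and the sample data are hypothetical.

```python
from statistics import mean


def rank_dimensions(cluster_rows, population_rows, top_x):
    """Return indices of the top_x dimensions where the cluster mean
    departs most from the population mean (largest gap first)."""
    n_dims = len(population_rows[0])
    gaps = []
    for d in range(n_dims):
        cluster_mean = mean(row[d] for row in cluster_rows)
        population_mean = mean(row[d] for row in population_rows)
        gaps.append((abs(cluster_mean - population_mean), d))
    # Sort by descending gap; break ties by the lower dimension index.
    gaps.sort(key=lambda pair: (-pair[0], pair[1]))
    return [d for _, d in gaps[:top_x]]


# Hypothetical data: four members of the population, two of them in the cluster.
population = [[0, 1, 5], [1, 1, 1], [0, 2, 0], [1, 2, 2]]
cluster = population[:2]
top_dims = rank_dimensions(cluster, population, top_x=2)
```

Here the cluster departs most from the population on dimension 2 and next on dimension 1, so those two dimensions would be listed first as explaining what is distinctive about the cluster.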

Claims 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Dasgupta in view of Moore et al. (US 20180336487) (“Moore”).
Regarding claim 6, Dasgupta appears not to disclose explicitly the further limitations of the claim.  However, Moore discloses that “the cluster performance view comprises a multi-branch visualization of the accuracy data for the plurality of outputs (node value contributions for internal nodes and entrance nodes of a tree ensemble may be calculated; illustrated environment also shows node value contribution determination operations – Moore, paragraph 74;  to determine the node contribution value for entrance node 302 when an instance splits to leaf node 310, the expected node output value for entrance node 302 is subtracted from the node output value for leaf node 310 – id. at paragraph 77 [note that, since contribution values are calculated using differences between expected and actual values, they qualify as accuracy data]; see also Fig. 3 (depicting a tree comprising multiple branches of entrance nodes, internal nodes, and leaf nodes)), wherein the multi-branch visualization comprises: 
a first branch including an indication of the accuracy data associated with the first subset of outputs from the plurality of outputs associated with the first feature cluster (Moore Fig. 3 and paragraph 75 disclose a tree structure with multiple branches, of which the leftmost branch [first branch] ends in leaf node 318 [first subset of outputs] whose node contribution [accuracy data] is calculated by subtracting the expected node output value from node 308 from the node output value for leaf node 318; paragraph 32 discloses that a predicted hypothesis may be generated for each data instance and the instances may be clustered according to their feature impact rankings, and the accuracy of the tree ensemble predictions [outputs] may be measured against the true labels for those instances [i.e., each set of features associated with a specific branch of the tree is associated with a cluster]); and 
a second branch including an indication of the accuracy data associated with a second subset of outputs from the plurality of outputs not associated with the first feature cluster (Moore Fig. 3 and paragraph 77 disclose a tree structure with multiple branches, of which a second branch ends in leaf node 310 [second subset of outputs] whose node contribution [accuracy data] is calculated by subtracting the expected node output value from node 302 from the node output value for leaf node 310; paragraph 32 discloses that a predicted hypothesis may be generated for each data instance and the instances may be clustered according to their feature impact rankings, and the accuracy of the tree ensemble predictions [outputs] may be measured against the true labels for those instances [i.e., each set of features associated with a specific branch of the tree is associated with a cluster, and the branch ending in leaf node 310 is associated with a different cluster of features from the branch ending in leaf node 318]).”
Moore and the instant application both relate to explainable AI systems and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta to allow the performance of the model to be viewed as a multi-branch tree containing branches with indications of accuracy data for subsets of the input dataset, as disclosed by Moore, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would increase the interpretability of the machine learning system being visualized by disclosing which features are more and less important to the overall classification.  See Moore, paragraph 19.
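The node contribution calculation cited from Moore (the expected node output value of the parent node is subtracted from the node output value of the leaf node that an instance reaches) can be illustrated with a short sketch. The reference numerals follow the mapping of Moore Fig. 3 used above; the numeric values and the names `node_contribution`, `expected_output`, `leaf_output`, and `parent_of` are hypothetical.

```python
def node_contribution(child_output, parent_expected_output):
    """Contribution of a split: child output minus the parent's expected output."""
    return child_output - parent_expected_output


# Hypothetical values keyed by the reference numerals of Moore Fig. 3.
expected_output = {302: 0.50, 308: 0.25}       # entrance node, internal node
leaf_output = {310: 0.75, 318: 0.0, 320: 1.0}  # leaf nodes
parent_of = {310: 302, 318: 308, 320: 308}     # leaf -> node it splits from

contributions = {
    leaf: node_contribution(leaf_output[leaf], expected_output[parent_of[leaf]])
    for leaf in leaf_output
}
```

Under these assumed values, each branch carries its own signed contribution figure, which is the per-branch quantity that the rejection treats as an indication of accuracy data.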

Regarding claim 7, the rejection of claim 6 is incorporated.  Dasgupta further discloses “detecting a selection of an additional graphical element (Dasgupta Fig. 17B discloses a GUI with a popup window that appears pursuant to a user selection of an entry in a confusion matrix suggesting that a horse is incorrectly classified as a giraffe)….”
Dasgupta appears not to disclose explicitly the further limitations of the claim.  However, Moore discloses “detecting a selection of the first branch (Moore Fig. 3 and paragraphs 74-77 disclose a tree structure containing an entrance node 302 and an internal node 308 that branches off into leaf nodes 318 and 320 [i.e., whenever the system traverses the tree from entrance node 302 to internal node 308, it has selected the first branch]); 
detecting a selection of an additional … element corresponding to a second feature cluster from the plurality of feature clusters (Moore Fig. 3 and paragraphs 76-77 disclose a tree structure containing an entrance node 302 and a leaf node 320 connected by an internal node 308 [additional element = classification associated with leaf node 320]; paragraph 32 discloses that each data instance is associated with a cluster of features [so a data instance that arrives at leaf node 320 must in general have a different set of features from a data instance that arrives at leaf node 318, or in other words, it is associated with a second feature cluster]); and 
providing a third branch including an indication of the accuracy data associated with a third subset of outputs associated with a combination of feature labels shared by the first cluster and the second feature cluster (Moore Fig. 3 discloses a tree structure containing an entrance node 302 and a leaf node 320 connected by an internal node 308 [path going from entrance node 302 to leaf node 320 = third branch; output of leaf node 320 = third subset of outputs]; paragraph 76 discloses that the node contribution for internal node 308 when an instance splits from node 308 to 320 is calculated at operation 316 by subtracting the expected node output value for internal node 308 from the node output value for leaf node 320 [node contribution value calculated at operation 316 = accuracy data associated with the third subset]; note also that the clusters of features for instances that arrive at leaf node 318 [first cluster] must share common features with the instances that arrive at leaf node 320 [second cluster] because both travel through internal node 308).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta to provide a third branch containing accuracy data for a third set of output data for inputs whose clusters share features, as disclosed by Moore, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would increase the interpretability of the machine learning system being visualized by disclosing which features are more and less important to the overall classification.  See Moore, paragraph 19.

Regarding claim 8, Dasgupta, as modified by Moore, discloses that “the multi-branch visualization of the accuracy data for the plurality of outputs comprises: 
a root node representative of the plurality of outputs for the plurality of test instances (Moore Fig. 3 depicts a tree structure with an entrance node [root node]; paragraph 20 clarifies that an entrance node is the first node in a tree ensemble through which an instance is run and that the instance may travel through internal nodes until finally reaching a leaf node and an output is generated [i.e., the entrance node represents the instance that could be associated with any possible output, depending upon the path the instance traverses down the tree]); 
a first level including a first node representative of the first subset of outputs and a second node representative of the second subset of outputs (Moore Fig. 3 depicts a tree structure containing an internal node 308 [first node] that branches off into leaf node 318 [first subset of outputs] and leaf node 310 [second node, representative of a second subset of outputs]); and 
a second level including a third node representative of the third subset of outputs (Moore Fig. 3 depicts a tree structure containing a second level that contains, inter alia, leaf node 320 [third node, representative of a third subset of outputs]).”  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta such that the multi-branch visualization comprises a root node, a first level representative of two sets of outputs, and a second level representative of a third set of outputs, as disclosed by Moore, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would increase the interpretability of the machine learning system being visualized by disclosing which features are more and less important to the overall classification.  See Moore, paragraph 19.

Response to Arguments
Applicant's arguments filed September 14, 2022 (“Remarks”) have been fully considered but they are, except insofar as a rejection has been withdrawn, not persuasive.
Applicant first requests an enumeration of the specification objections and requests guidance as to whether a substitute specification should be submitted or merely a set of replacement paragraphs.  Remarks at 9.  Either of the two alternatives would be acceptable.  Given the numerosity of the objections and the fact that most of the marked errors are self-explanatory, Examiner requests that Applicant make at least the corrections in those instances where the error is obvious; Examiner is happy to enumerate the exact reasons for objection with respect to the errors that remain.
Applicant then argues that it should not be required to modify Figure 4B because the number 502 written on a shaded background is not a reference character.  Remarks at 9-10.  However, 37 CFR § 1.84(p)(3), which Examiner cited in the objection, recites that “[n]umbers, letters, and reference characters must measure at least .32 cm. (1/8 inch) in height. They should not be placed in the drawing so as to interfere with its comprehension. Therefore, they should not cross or mingle with the lines. They should not be placed upon hatched or shaded surfaces.”  (Emphasis added.)  In other words, no text should be placed on a shaded surface, regardless of whether the text corresponds to a reference character.
Applicant then argues that the claims as amended are eligible because (a) identifying feature clusters, generating a performance report, and providing performance views allegedly cannot be performed in the mind; and (b) the claimed subject matter allegedly represents a technical solution of providing performance views including graphical elements representative of subsets of test instances based on clusters of feature labels to the technical problem of difficulty evaluating the failures and inaccuracies of black-box machine learning models, which allegedly represents more than a drafting effort designed to monopolize an abstract idea.  Remarks at 10-12.  However, regarding (a), the identification of feature clusters can certainly be performed in the mind, and even assuming arguendo that generating a performance report and providing performance views cannot be performed in the mind, which Examiner does not concede at least because the report can be generated in one’s mind and the performance views can be mentally visualized, the performance views could be developed using a pen and paper upon receipt of the relevant data.  Regarding (b), the supposed technical solution to a technical problem proffered is not reflected in the claim language itself.  The claims in general are not directed to improving the machine learning models themselves by clarifying to the user where they are failing and allowing the user to take appropriate action.  Rather, they are directed merely to informing the user about the existence of failures in a visually intelligible format that could in any event be developed mentally or with pen and paper.  Claim 10, which does recite the refinement of a machine learning model based on training data, is still ineligible because the recitation of refinement is ancillary to the general inventive concept of providing the performance views.  
In other words, the claim is not directed to improving the model, but rather to providing the performance information.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7:50a-5:50p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/RYAN C VAUGHN/Examiner, Art Unit 2125