DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
	This Office Action is in response to the communication filed on 02/18/2020.
	Claims 1-15 are being considered on the merits.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 2/18/2020 has been considered. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, an initialed and dated copy of Applicant's IDS form 1449 filed 2/18/2020 is attached to the instant Office action.
Drawings
	The drawings filed on 2/18/2020 are accepted. 

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 and 5 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kahng et al. (“ACTIVIS: Visual Exploration of Industry-Scale Deep Neural Network Models”, Jan 2018, IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 24, NO. 1; hereinafter “Kahng”).

Regarding claim 1, Kahng teaches a method of:
determining a correctness of a prediction performed by a deep learning model with respect to input data (Kahng, pg. 90, sec. 3.1.1: “Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)”. Examiner notes that the broadest reasonable interpretation of correctness means accuracy.)
extracting, by a prediction validation device, a neuron activation pattern of at least one layer of the deep learning model with respect to the input data (Kahng, pg. 92, sec. 4.2 and pg. 93, fig. 3: “For example, in Fig.3, the neurons are sorted based on the average activation values for the class ‘LOC’. Sorting facilitates activation comparison and helps reveal patterns, such as spotting instances that are positively correlated with their true class in terms of the activation pattern”. Examiner notes that the broadest reasonable interpretation of a prediction validation device is a computer or system such as the ACTIVIS proposed by Kahng)
generating, by the prediction validation device, an activation vector based on the neuron activation pattern of the at least one layer of the deep learning model (Kahng, pg. 92, sec. 4.2: “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.” Examiner notes that the broadest reasonable interpretation of a prediction validation device is a computer or system such as the ACTIVIS proposed by Kahng).
determining, by the prediction validation device, the correctness of the prediction performed by the deep learning model with respect to the input data using a prediction validation model and based on the activation vector, (Kahng, pg. 89, sec. 1; pg. 90, sec. 3.1.1; pg. 92, sec. 4.2: “A developer can visualize a deep learning model using ACTIVIS by adding only a few lines of code” “users can train a model by picking a relevant workflow from a collection of existing workflows and specifying several input parameters for the selected workflow…Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)” “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.”).
wherein the prediction validation model is a machine learning model that has been generated and trained using a plurality of training activation vectors derived from correctly predicted test dataset and incorrectly predicted test dataset of the deep learning model (Kahng, pg. 92, fig. 2: “Fig.2. ACTIVIS integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B. The neuron activation panel’s matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances shown on the left, misclassified on the right.” Examiner notes that the broadest reasonable interpretation of a “model” is a simplified description of a system or process to assist calculations and predictions such as those output by the ACTIVIS program).
providing, by the prediction validation device, the correctness of the prediction performed by the deep learning model with respect to the input data for at least one of subsequent rendering or subsequent processing. (Kahng, pg. 89, sec. 1; pg. 90, sec. 3.1.1; and pg. 96, sec. 6.1.2: “A developer can visualize a deep learning model using ACTIVIS by adding only a few lines of code” “Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)” “One of the main components of ACTIVIS is the visual representation of activations that helps users easily recognize patterns and anomalies. As Carol interacted with the visualization, she gleaned a number of new insights, and a few hints for how to debug deep learning models in general. She interactively selected many different instances and added them to the neuron activation matrix to see how they activated neurons.” Examiner notes that the broadest reasonable interpretation of a prediction validation device is a computer or system such as the ACTIVIS proposed by Kahng. Examiner additionally notes that the broadest reasonable interpretation of “subsequent rendering” means subsequent representation of data (i.e. visualization) after a first representation, including where a user selects different instances to view newly rendered visualizations).
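For illustration, the sequence of steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicant's implementation and not code from Kahng: the function names hidden_activations and validate_prediction, the toy dense layer, and the nearest-centroid stand-in for the prediction validation model are all hypothetical.

```python
import math

def hidden_activations(x, weights, biases):
    """ReLU outputs of one dense layer, i.e. a toy 'neuron activation pattern'."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def validate_prediction(activation_vector, centroid_correct, centroid_incorrect):
    """Hypothetical validation model (assumption): nearest centroid.

    Returns True when the activation vector lies at least as close to the
    centroid of vectors from correctly predicted test data as to the centroid
    of vectors from incorrectly predicted test data.
    """
    d_ok = math.dist(activation_vector, centroid_correct)
    d_bad = math.dist(activation_vector, centroid_incorrect)
    return d_ok <= d_bad

# Toy layer with 2 inputs and 3 neurons (illustrative values only).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.5]

# Extract the activation pattern and use it directly as the activation vector.
vec = hidden_activations([2.0, 1.0], W, b)

# Centroids assumed to have been computed earlier from labeled test data.
is_correct = validate_prediction(vec, [2.0, 1.0, 0.0], [0.0, 0.0, 0.5])
```

The resulting Boolean corresponds to the "correctness of the prediction" that the claim recites as being provided for subsequent rendering or processing.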

Regarding claim 5, Kahng teaches the method of claim 1 (above). Kahng further teaches: 
generating and training the prediction validation model using the plurality of training activation vectors. (Kahng, pg. 92, sec. 4.2 and Fig. 2: “The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison. The neuron activation matrix, shown at Fig. 2B.1, illustrates this concept of comparing multiple instances and instance subsets, using the TREC question classification dataset [25]. The dataset consists of 5,500 question sentences and each sentence is labeled by one of six categories (e.g., is a question asking about location?).” Examiner notes that the broadest reasonable interpretation of “training activation vectors” is activation vectors derived from data used to train a model. Examiner additionally notes that the broadest reasonable interpretation of a “vector” means a matrix with one row or column, as shown in Fig. 2)
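The "vector of average activations for a subset" quoted from Kahng above can be illustrated with a short sketch. The function name average_activation_vector and the sample values are assumptions chosen for illustration; Kahng does not publish this code.

```python
def average_activation_vector(activations):
    """Column-wise mean of per-instance activation vectors for one subset.

    `activations` is a list of equal-length lists, one list per instance.
    """
    if not activations:
        raise ValueError("subset is empty")
    n = len(activations)
    return [sum(column) / n for column in zip(*activations)]

# Two instances, three neurons each (illustrative values only).
subset = [[0.0, 2.0, 4.0],
          [2.0, 2.0, 0.0]]
avg = average_activation_vector(subset)
```

Each entry of avg is the mean activation of one neuron across the instances in the subset, which is the quantity Kahng places side by side for comparison in the neuron activation matrix.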

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Kahng in view of Arzani (US 2021/0012239 A1, hereinafter “Arzani”).

Regarding claim 2, Kahng teaches the method of claim 1 (above). Kahng does not explicitly disclose: 
the at least one layer comprises at least one of a dense layer and a long short-term memory (LSTM) layer of the deep learning model
However, Arzani teaches: 
the at least one layer comprises at least one of a dense layer and a long short-term memory (LSTM) layer of the deep learning model (Arzani, para. 0055: “the model selection process might favor a deep learning neural network with a long short-term memory layer and/or a semantic word embedding layer…As another example, given the substantial training and memory budgets, the model selection process may select densely-connected network layers.”) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Arzani into Kahng. Kahng teaches ACTIVIS, an interactive visualization system which allows machine learning models to perform prediction validation and output visualized results; Arzani teaches automatic selection and generation of machine learning models. One of ordinary skill would have been motivated to combine the teachings of Arzani into Kahng in order to automatically select a machine learning model and hyperparameters given any number of constraints, which would allow someone without machine-learning expertise to nonetheless evaluate and obtain an appropriate machine learning model (Arzani, para. 0082).

Claims 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Kahng in view of Montoro (US 2017/0193335 A1, hereinafter “Montoro”).
Regarding claim 3, Kahng teaches the method of claim 1 (above). Kahng does not explicitly disclose: 
generating and training the deep learning model using annotated training data from training dataset 
testing the deep learning model using test data from test dataset
However, Montoro teaches: 
generating and training the deep learning model using annotated training data from training dataset (Montoro, para. 0043 and 0045: “The methods disclosed herein, also called as WiseNet™, consist of a novel encoding method that transforms raw data, which may be non-visual, into images which form the input of a Deep Learning prediction model as illustrated in FIG. 6.” “WiseNet methods can be used for supervised learning, when we have labeled training data”)
testing the deep learning model using test data from test dataset (Montoro, para. 0043 and 0072: “The methods disclosed herein, also called as WiseNet™, consist of a novel encoding method that transforms raw data, which may be non-visual, into images which form the input of a Deep Learning prediction model as illustrated in FIG. 6.” “The optimal set of hyper-parameters was determined using the test dataset and optimizing the log-loss. Looking at the results, the WiseNet methods outperforms all the other models in every metric studied.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Montoro into Kahng. Kahng teaches ACTIVIS, an interactive visualization system for interpreting large-scale deep learning models and results; Montoro teaches leveraging of various machine learning architectures in the context of classifying and encoding non-visual data. One of ordinary skill would have been motivated to combine the teachings of Montoro into Kahng to implement and use known machine learning architectures best suited for each user’s objective rather than blindly attempting to use any machine learning architecture (Montoro, paras. 0061-0064).

Regarding claim 4, Kahng and Montoro teach the method of claim 3 (above). Kahng further teaches: 
segregating the test dataset into the correctly predicted test dataset and the incorrectly predicted test dataset (Kahng, pg. 92, fig. 2: “Fig.2. ACTIVIS integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B. The neuron activation panel’s matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances shown on the left, misclassified on the right.” Examiner notes that the broadest reasonable interpretation of “test dataset” means any dataset used to test (i.e. verify) the predictions of a model, such that results of a prediction applied to such dataset can be validated as correctly or incorrectly classified).  
extracting a plurality of neuron activation patterns of the at least one layer of the deep learning model with respect to the correctly predicted test dataset and the incorrectly predicted test dataset; (Kahng, pg. 88, fig. 1 and pg. 92, fig. 2: “Fig. 1. ACTIVIS integrates several coordinated views to support exploration of complex deep neural network models, at both instance and subset-level. 1. Our user Susan starts exploring the model architecture, through its computation graph overview (at A). Selecting a data node (in yellow) displays its neuron activations (at B). 2. The neuron activation matrix view shows the activations for instances and instance subsets; the projected view displays the 2-D projection of instance activations. 3. From the instance selection panel (at C), she explores individual instances and their classification results. 4. Adding instances to the matrix view enables comparison of activation patterns across instances, subsets, and classes, revealing causes for misclassification.” “Fig.2. ACTIVIS integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B. The neuron activation panel’s matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances shown on the left, misclassified on the right.” Examiner notes that Fig 1(A) illustrates the layers of the model architecture from which the user is selecting and displays the (extracted) activation pattern. Examiner additionally notes that the broadest reasonable interpretation of “test dataset” means any dataset used to test (i.e. verify) the predictions of a model, such that results of a prediction applied to such dataset can be validated as correctly or incorrectly classified).
generating the plurality of training activation vectors based on the plurality of neuron activation patterns of the at least one layer of the deep learning model. (Kahng, pg. 92, sec. 4.2 and Fig. 2: “The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison. The neuron activation matrix, shown at Fig. 2B.1, illustrates this concept of comparing multiple instances and instance subsets, using the TREC question classification dataset [25]. The dataset consists of 5,500 question sentences and each sentence is labeled by one of six categories (e.g., is a question asking about location?).” Examiner notes that the classification of results as correctly or incorrectly classified implies that the dataset is a “test” dataset where such data is labeled. Examiner additionally notes that Fig 2 illustrates a network with multiple layers (i.e. a deep learning model)).
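The three limitations of claim 4 (segregating, extracting, and generating) can be sketched as follows. This is an illustrative sketch only: segregate, training_activation_vectors, and the toy predict and extract functions are hypothetical names and stand-ins, and do not come from the application or from Kahng.

```python
def segregate(test_set, predict):
    """Split labeled test inputs by whether the model predicted them correctly."""
    correct, incorrect = [], []
    for x, label in test_set:
        (correct if predict(x) == label else incorrect).append(x)
    return correct, incorrect

def training_activation_vectors(correct, incorrect, extract):
    """Pair each activation vector with 1 (correctly predicted) or 0 (incorrectly predicted)."""
    return ([(extract(x), 1) for x in correct] +
            [(extract(x), 0) for x in incorrect])

# Toy stand-ins (assumptions): a threshold "model" and a two-neuron ReLU "layer".
def predict(x):
    return int(x[0] > 0.5)

def extract(x):
    return [max(0.0, x[0]), max(0.0, 1.0 - x[0])]

test_set = [([0.9], 1), ([0.2], 0), ([0.6], 0)]
correct, incorrect = segregate(test_set, predict)
vectors = training_activation_vectors(correct, incorrect, extract)
```

The labeled pairs in vectors are the kind of input on which a separate prediction validation model, such as the one recited in claim 1, could then be trained.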


Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Kahng in view of Montoro and further in view of Arzani. 

Regarding claim 6, Kahng teaches the method of claim 1 (above). Kahng does not explicitly disclose: 
the machine learning model comprises one of a support vector machine (SVM) model, a random forest model, an extreme gradient boosting model, and an artificial neural network (ANN) model
However, Arzani teaches:
the machine learning model comprises one of a support vector machine (SVM) model, a random forest model…(Arzani, para. 0054: “In this example, assume the classification model pool 210 includes a logistic regression model type, a decision tree model type, a random forest model type, a Bayesian network model type, a support vector machine model type, and a deep learning neural network model type.”) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Arzani into Kahng as set forth above with respect to claim 2.
Neither Kahng nor Arzani teaches:
…an extreme gradient boosting model, and an artificial neural network (ANN) model 
However, Montoro teaches: 
…an extreme gradient boosting model, and an artificial neural network (ANN) model (Montoro, paras. 0072 and 0097: “We considered the performance of four well-known machine learning algorithms compared to the best performing WiseNet model: randomForests, generalized linear models (GLM), generalized boosted machines (GBM) and extreme gradient boosting (xgboost).” “In some embodiments, the machine learning may include training an artificial neural network based on the plurality of non-visual data and the plurality of classifications.” )
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Montoro into Kahng and Arzani. Kahng teaches ACTIVIS, an interactive visualization system which allows machine learning models to perform prediction validation and output visualized results; Arzani teaches automatic selection and generation of machine learning models; Montoro teaches leveraging of various machine learning architectures in the context of classifying and encoding non-visual data. One of ordinary skill would have been motivated to combine the teachings of Montoro into Kahng and Arzani to implement and use known machine learning architectures best suited for each user’s objective rather than blindly attempting to use any machine learning architecture (Montoro, paras. 0061-0064).

Claims 7-8, 10-12 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Kahng in view of Knighton (US 2021/0166111 A1, hereinafter “Knighton”).

Regarding claim 7, Kahng teaches the method of claim 1 (above). Knighton further teaches:
the deep learning model comprises at least one of a multilayer perceptron (MLP) model, a convolutional neural network (CNN) model, a recursive neural network (RNN) model, a recurrent neural network (RNN) model, or a long short-term memory (LSTM) model. (Knighton, para. 0077: “each decoder is a neural network such as a convolutional neural network (CNN), a multilayer perceptron, a feed-forward neural network, a recursive neural network, a recurrent neural network (RNN), a deep neural network, a shallow neural network, a fully-connected neural network, a sparsely-connected neural network, a convolutional neural network that comprises a fully-connected neural network (FCNN), a fully convolutional network without a fully-connected neural network, a deep stacking neural network, a deep belief network, a residual network, echo state network, liquid state machine, highway network, maxout network, long short-term memory (LSTM) network, recursive neural network grammar (RNNG), gated recurrent unit (GRU), pre-trained and frozen neural networks, and so on.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Knighton into Kahng. Kahng teaches ACTIVIS, an interactive visualization system which allows machine learning models to perform prediction validation and output visualized results; Knighton teaches systems and methods of training processing engines wherein such systems can access information from various sources while keeping the sources private from each other. One of ordinary skill would have been motivated to combine the teachings of Knighton into Kahng to produce and evaluate various representations or classifications of input data using any of several machine learning techniques and multiple sources of data which could provide insights and predictions for achieving users’ various objectives (Knighton, para. 0015).


Regarding claim 8, Kahng teaches a system for:
extract a neuron activation pattern of at least one layer of the deep learning model with respect to the input data (Kahng, pg. 92, sec. 4.2 and pg. 93, fig. 3: "For example, in Fig.3, the neurons are sorted based on the average activation values for the class ‘LOC’. Sorting facilitates activation comparison and helps reveal patterns, such as spotting instances that are positively correlated with their true class in terms of the activation pattern”)
generate an activation vector based on the neuron activation pattern of the at least one layer of the deep learning model; (Kahng, pg. 92, sec. 4.2: “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.”).
determine the correctness of the prediction performed by the deep learning model with respect to the input data using a prediction validation model and based on the activation vector, (Kahng, pg. 89, sec. 1; pg. 90, sec. 3.1.1; pg. 92, sec. 4.2: “A developer can visualize a deep learning model using ACTIVIS by adding only a few lines of code” “users can train a model by picking a relevant workflow from a collection of existing workflows and specifying several input parameters for the selected workflow…Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)” “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.”).
wherein the prediction validation model is a machine learning model that has been generated and trained using a plurality of training activation vectors derived from correctly predicted test dataset and incorrectly predicted test dataset of the deep learning model (Kahng, pg. 92, fig. 2: “Fig.2. ACTIVIS integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B. The neuron activation panel’s matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances shown on the left, misclassified on the right.” Examiner notes that the model being illustrated in Fig. 2 is a convolutional neural network.).
provide the correctness of the prediction performed by the deep learning model with respect to the input data for at least one of subsequent rendering or subsequent processing. (Kahng, pg. 89, sec. 1; pg. 90, sec. 3.1.1; pg. 92, sec. 4.2; and pg. 96, sec. 6.1.2: “A developer can visualize a deep learning model using ACTIVIS by adding only a few lines of code” “Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)” “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.” “One of the main components of ACTIVIS is the visual representation of activations that helps users easily recognize patterns and anomalies. As Carol interacted with the visualization, she gleaned a number of new insights, and a few hints for how to debug deep learning models in general. She interactively selected many different instances and added them to the neuron activation matrix to see how they activated neurons.” Examiner notes that the broadest reasonable interpretation of “subsequent rendering” means any rendering (i.e. visualization) after a first rendering, including where a user selects different instances to view newly rendered visualizations).
Kahng does not explicitly disclose:

determining a correctness of a prediction performed by a deep learning model with respect to input data, the system comprising: a processor and a memory communicatively coupled to the processor, wherein the memory stores processor-executable instructions, which, on execution, causes the processor to  
However, Knighton teaches: 
determining a correctness of a prediction performed by a deep learning model with respect to input data, the system comprising: a processor and a memory communicatively coupled to the processor, wherein the memory stores processor-executable instructions, which, on execution, causes the processor to (Knighton, para. 0202: “A system implementation of the technology disclosed includes one or more processors coupled to memory. The memory is loaded with computer instructions to train processing engines.”) 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Knighton into Kahng. Kahng teaches ACTIVIS, an interactive visualization system which allows machine learning models to perform prediction validation and output visualized results; Knighton teaches systems and methods of training processing engines wherein such systems can access information from various sources while keeping the sources private from each other. One of ordinary skill would have been motivated to combine the teachings of Knighton into Kahng to produce and evaluate various representations or classifications of input data using any of several machine learning techniques and multiple sources of data which could provide insights and predictions for achieving users’ various objectives (Knighton, para. 0015).

Regarding claim 10, Kahng and Knighton teach the system of claim 8 (above). Kahng further teaches:
the processor-executable instructions further cause the processor to: generate and train the deep learning model using annotated training data from training dataset (Kahng, pg. 91, sec. 3.1.1: “One natural way for users at Facebook to understand complex models is by tracking how an individual example (i.e., training or test instance) behaves inside the models; users often have their own collection of example instances, for which they know their characteristics and ground truth labels.” Examiner notes that the broadest reasonable interpretation of “annotated data” means labeled data)
test the deep learning model using test data from test dataset (Kahng, pg. 91, sec. 3.1.1: “One natural way for users at Facebook to understand complex models is by tracking how an individual example (i.e., training or test instance) behaves inside the models; users often have their own collection of example instances, for which they know their characteristics and ground truth labels.” Examiner notes that the broadest reasonable interpretation of “testing” means taking measures to check quality or performance, such as when users test model performance with known data).

Regarding claim 11, Kahng and Knighton teach the system of claim 10 (above). Kahng further teaches:
the processor-executable instructions further cause the processor to: segregate the test dataset into the correctly predicted test dataset and the incorrectly predicted test dataset; (Kahng, pg. 92, fig. 2: “Fig.2. ACTIVIS integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B. The neuron activation panel’s matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances shown on the left, misclassified on the right.” Examiner notes that the broadest reasonable interpretation of “test dataset” means any dataset used to test (i.e. verify) the predictions of a model, such that results of a prediction applied to such dataset can be validated as correctly or incorrectly classified).  
extract a plurality of neuron activation patterns of the at least one layer of the deep learning model with respect to the correctly predicted test dataset and the incorrectly predicted test dataset  (Kahng, pg. 88, fig. 1: "Fig. 1. ACTIVIS integrates several coordinated views to support exploration of complex deep neural network models, at both instance and subset-level. 1. Our user Susan starts exploring the model architecture, through its computation graph overview (at A). Selecting a data node (in yellow) displays its neuron activations (at B). 2. The neuron activation matrix view shows the activations for instances and instance subsets; the projected view displays the 2-D projection of instance activations. 3. From the instance selection panel (at C), she explores individual instances and their classification results. 4. Adding instances to the matrix view enables comparison of activation patterns across instances, subsets, and classes, revealing causes for misclassification.” Examiner notes that the classification of results as correctly or incorrectly classified implies that the dataset is a “test” dataset where such data is labeled. Examiner additionally notes that Fig 1(A) illustrates the layers of the model architecture from which the user is selecting and viewing (i.e. extracting) the activation pattern).
generate the plurality of training activation vectors based on the plurality of neuron activation patterns of the at least one layer of the deep learning model (Kahng, pg. 92, sec. 4.2 and Fig. 2: “The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison. The neuron activation matrix, shown at Fig. 2B.1, illustrates this concept of comparing multiple instances and instance subsets, using the TREC question classification dataset [25]. The dataset consists of 5,500 question sentences and each sentence is labeled by one of six categories (e.g., is a question asking about location?).” Examiner notes that the classification of results as correctly or incorrectly classified implies that the dataset is a “test” dataset where such data is labeled. Examiner additionally notes that Fig 2 illustrates a network with multiple layers (i.e. a deep learning model)).


Regarding claim 12, Kahng and Knighton teach the system of claim 8 (above). Kahng further teaches:
the processor-executable instructions further cause the processor to generate and train the prediction validation model using the plurality of training activation vectors. (Kahng, pg. 92, sec. 4.2 and Fig. 2: “The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison. The neuron activation matrix, shown at Fig. 2B.1, illustrates this concept of comparing multiple instances and instance subsets, using the TREC question classification dataset [25]. The dataset consists of 5,500 question sentences and each sentence is labeled by one of six categories (e.g., is a question asking about location?).” Examiner notes that the broadest reasonable interpretation of “training activation vectors” is activation vectors derived from labeled data used to train a model. Examiner additionally notes that the broadest reasonable interpretation of a “vector” means a matrix with one row or column, as shown in Fig. 2)

Regarding claim 14, Kahng and Knighton teach the method of claim 8 (above). Knighton further teaches:
the deep learning model comprises at least one of a multilayer perceptron (MLP) model, a convolutional neural network (CNN) model, a recursive neural network (RNN) model, a recurrent neural network (RNN) model, or a long short-term memory (LSTM) model. (Knighton, para. 0077: “each decoder is a neural network such as a convolutional neural network (CNN), a multilayer perceptron, a feed-forward neural network, a recursive neural network, a recurrent neural network (RNN), a deep neural network, a shallow neural network, a fully-connected neural network, a sparsely-connected neural network, a convolutional neural network that comprises a fully-connected neural network (FCNN), a fully convolutional network without a fully-connected neural network, a deep stacking neural network, a deep belief network, a residual network, echo state network, liquid state machine, highway network, maxout network, long short-term memory (LSTM) network, recursive neural network grammar (RNNG), gated recurrent unit (GRU), pre-trained and frozen neural networks, and so on.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Knighton into Kahng as set forth above with respect to claim 8.

Regarding claim 15, Kahng teaches:
extracting a neuron activation pattern of at least one layer of the deep learning model with respect to the input data; (Kahng, pg. 92, sec. 4.2 and pg. 93, fig. 3: “For example, in Fig.3, the neurons are sorted based on the average activation values for the class ‘LOC’. Sorting facilitates activation comparison and helps reveal patterns, such as spotting instances that are positively correlated with their true class in terms of the activation pattern”)
generating an activation vector based on the neuron activation pattern of the at least one layer of the deep learning model; (Kahng, pg. 92, sec. 4.2: “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.”).
determining the correctness of the prediction performed by the deep learning model with respect to the input data using a prediction validation model and based on the activation vector (Kahng, pg. 89, sec. 1; pg. 90, sec. 3.1.1; pg. 92, sec. 4.2: “A developer can visualize a deep learning model using ACTIVIS by adding only a few lines of code” “users can train a model by picking a relevant workflow from a collection of existing workflows and specifying several input parameters for the selected workflow…Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)” “we compute the average activations for instances within the subsets. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison.”).
wherein the prediction validation model is a machine learning model that has been generated and trained using a plurality of training activation vectors derived from correctly predicted test dataset and incorrectly predicted test dataset of the deep learning model; (Kahng, pg. 92, fig. 2: “Fig.2. ACTIVIS integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B. The neuron activation panel’s matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances shown on the left, misclassified on the right.” Examiner notes that the model being illustrated in Fig. 2 is a convolutional neural network.).
providing, the correctness of the prediction performed by the deep learning model with respect to the input data for at least one of subsequent rendering or subsequent processing. (Kahng, pg. 89, sec. 1; pg. 90, sec. 3.1.1 and pg. 96, sec. 6.1.2: “A developer can visualize a deep learning model using ACTIVIS by adding only a few lines of code” “Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy)” “One of the main components of ACTIVIS is the visual representation of activations that helps users easily recognize patterns and anomalies. As Carol interacted with the visualization, she gleaned a number of new insights, and a few hints for how to debug deep learning models in general. She interactively selected many different instances and added them to the neuron activation matrix to see how they activated neurons.” Examiner notes that the broadest reasonable interpretation of “correctness” means accuracy. Examiner additionally notes that the broadest reasonable interpretation of “subsequent rendering” means subsequent representation of data (i.e., visualization) after a first representation, including where a user selects different instances to view newly rendered visualizations).
Kahng does not explicitly disclose:
A non-transitory computer-readable medium storing computer-executable instructions 
However, Knighton teaches:
A non-transitory computer-readable medium storing computer-executable instructions (Knighton, para. 0199: “Other implementations may include a non-transitory computer readable storage medium storing instructions executable by a processor to perform a method as described above”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Knighton into Kahng as set forth above with respect to claim 8.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Kahng in view of Knighton and further in view of Arzani.
Regarding claim 9, Kahng and Knighton teach the method of claim 8 (above). Arzani further teaches:
at least one layer comprises at least one of a dense layer and a long short-term memory (LSTM) layer of the deep learning model (Arzani, para. 0055: “the model selection process might favor a deep learning neural network with a long short-term memory layer and/or a semantic word embedding layer…As another example, given the substantial training and memory budgets, the model selection process may select densely-connected network layers.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Arzani into Kahng and Knighton. Kahng teaches ACTIVIS, an interactive visualization system which allows machine learning models to perform prediction validation and output visualized results; Knighton teaches systems and methods of training processing engines; Arzani teaches automatic selection and generation of machine learning models. One of ordinary skill would have been motivated to combine the teachings of Arzani into Kahng and Knighton in order to automatically select a machine learning model and hyperparameters given any number of constraints, which would allow someone without machine-learning expertise to nonetheless evaluate and obtain an appropriate machine learning model (Arzani, para. 0082).


Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Kahng in view of Knighton, further in view of Arzani and further in view of Montoro. 

Regarding claim 13, Kahng and Knighton teach the method of claim 8 (above). Neither Kahng nor Knighton explicitly discloses:
the machine learning model comprises one of a support vector machine (SVM) model, a random forest model, an extreme gradient boosting model, and an artificial neural network (ANN) model. 
However, Arzani teaches:
the machine learning model comprises one of a support vector machine (SVM) model, a random forest model…(Arzani, para. 0054: “In this example, assume the classification model pool 210 includes a logistic regression model type, a decision tree model type, a random forest model type, a Bayesian network model type, a support vector machine model type, and a deep learning neural network model type.”) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Arzani into Kahng and Knighton as set forth above with respect to claim 9.
Kahng, Knighton, and Arzani do not explicitly teach:
…an extreme gradient boosting model, and an artificial neural network (ANN) model 
However, Montoro teaches: 
…an extreme gradient boosting model, and an artificial neural network (ANN) model (Montoro, paras. 0072 and 0097: “We considered the performance of four well-known machine learning algorithms compared to the best performing WiseNet model: randomForests, generalized linear models (GLM), generalized boosted machines (GBM) and extreme gradient boosting (xgboost).” “In some embodiments, the machine learning may include training an artificial neural network based on the plurality of non-visual data and the plurality of classifications.” )
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Montoro into Kahng and Knighton. Kahng teaches ACTIVIS, an interactive visualization system which allows machine learning models to perform prediction validation and output visualized results; Knighton teaches systems and methods of training processing engines; Montoro teaches leveraging of various machine learning architectures in the context of classifying and encoding non-visual data. One of ordinary skill would have been motivated to combine the teachings of Montoro into Kahng to implement and use known machine learning architectures best suited for each user’s objective rather than blindly attempting to use any machine learning architecture (Montoro, paras. 0061-0064).

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Patel, et. al. (US 2021/0125732 A1) teaches a system and method with a federated learning model for medical prediction applications associated with geotemporal data.
Ravindran, et. al. (US 10,002,322 B1) teaches systems and methods for predicting transactions using a long short-term memory network. 
Su, et. al. (WO 2019/048506 A1) teaches training methods for machine learning assisted optical proximity error correction.
Gong, et. al. (“Improving accuracy of rutting prediction for mechanistic-empirical pavement design guide with deep neural networks”, 28 Sep 2018, Construction and Building Materials) teaches the use of neural networks and random forests in prediction modeling for performance of asphalt concrete.
Kim, et. al. (“Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)”, 2018, Proceedings of the 35th International Conference on Machine Learning) teaches that Concept Activation Vectors (CAVs) provide an interpretation of a neural net’s internal state in terms of human-friendly concepts.
Murdoch, et. al. (“Definitions, methods, and applications in interpretable machine learning”, 29 Oct 2019, PNAS: Vol. 16, No. 44) teaches a framework for evaluation of machine learning models including predictive accuracy, descriptive accuracy, and relevancy. 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sally T. Nguyen whose telephone number is (571) 272-3406. The examiner can normally be reached Monday - Thursday, 9:00am - 5:00pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/STN/Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123