DETAILED ACTION
The request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 14 January 2021 has been entered. Furthermore, this action is in response to the amendments filed 15 December 2020 for application 15/494027 filed on 22 May 2017.  Currently claims 9, 13-17, and 20-23 are pending. Claims 1-8, 10-12, 18-19, and 24-25 have been canceled. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
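The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).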
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 9, 13-17, and 20-23 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-8 of copending Application No. 15/814451 (reference application) in view of Ivanov et al. (“In silico assessment of adverse drug reactions and associated mechanism”, Drug Discovery Today, Vol. 21, Number 1, January 2016, pp. 58-71). 


Co-pending Application 15/814451, claims 1, 2/1, 3/2, and 4/3:

Claim 1. A computer-implemented method for generating a framework for analyzing adverse drug reactions comprising:
receiving, to a processor, a plurality of drug chemical structures,
receiving, to the processor, a plurality of known drug-adverse drug reaction associations;
and constructing, by the processor, a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known adverse-drug reaction associations

Claim 2/1. Further comprising analyzing the deep learning frameworks to determine a substructure related to one of the plurality of adverse drug reactions and generating a substructure-adverse drug reaction association

Claim 3/2. further comprising determining substructure-adverse drug reaction associations

Claim 4/3. further comprising ranking substructure-adverse drug reaction associations.

Instant Application, claim 9:

A computer program analyzing adverse drug reactions, the computer program product comprising:
a computer readable storage medium readable by a processing circuit and storing program instructions for execution by the processing circuit for performing a method comprising:
receiving a plurality of drug chemical structures;
receiving a plurality of known drug-adverse drug reaction associations;
constructing a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known drug-adverse drug reaction associations;
analyzing the deep learning frameworks to determine a set of substructure-adverse drug reaction associations;
calculating, for each substructure-adverse drug reaction association, a p-value using a chi-squared test to evaluate a relative association strength between each substructure and each adverse drug reaction association;
ranking the substructure-adverse drug reaction associations according to statistical significance using the p values;
and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association.

Co-pending application 15/814451 fails to teach the following portions of instant claim 9: the p-value calculation using a chi-squared test, ranking “according to statistical significance using the p values,” and the redesigning step. However, Ivanov et al. teach the limitations “calculating, for each substructure-adverse drug reaction association, a p-value using a chi-squared test to evaluate a relative association strength between each substructure and each adverse drug reaction association; … according to statistical significance using the p values; and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association.” ([Abstract, p. 58, “Introduction”, p. 61, “ADR”, p. 61, “Structure-related data”, p. 65, “Prediction of ADRs with various drug features and associated analyses”, p. 69, “Concluding remarks”, Box 1], where the relative degree of substructure (structural fragment)-adverse drug reaction association is determined through a chi-squared test or Fisher’s exact test (also a chi-squared test), where it is well known that either test computes p-values to determine this relative (ranked) strength, and where this statistical analysis is used to facilitate drug development (design/re-design).)

It would have been obvious at the time of the filing of the applicant’s invention to modify the teachings of co-pending application 15/814451 by incorporating “calculating, for each substructure-adverse drug reaction association, a p-value using a chi-squared test to evaluate a relative association strength between each substructure and each adverse drug reaction association; … according to statistical significance using the p values; and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association.” as taught by Ivanov et al., in order to use a statistical analysis such as Chi-square or Fisher Exact test in evaluating substructure-adverse reaction association in “in silico” predictive models to efficiently develop drugs by avoiding the adverse drug reactions that are most likely to be accurately predicted in the design of any candidate drug (Ivanov, [Abstract, p. 65, “Prediction of ADRs with various drug features and associated analyses”, p. 69, “Concluding Remarks”]).

In addition, it is noted that the difference between independent instant claim 9 and claims 1-4 of co-pending application ‘451 is that claims 1-4 of co-pending application ‘451 are directed to a computer-implemented method, whereas instant claim 9 is directed to a computer program for performing the method. Thus, claims 1-4 and claim 9 are obvious variations of one another.
Co-pending Application 15/814451, claim 5/4: further comprising grouping substructures and related adverse drug reactions with biclustering.
Instant Application, claim 13/9: wherein the method further comprises grouping substructures and related adverse drug reactions with biclustering.
The difference between dependent instant claim 13/9 and co-pending application ‘451 claim 5/4 is that dependent claim 5/4 of co-pending application ‘451 is a computer-implemented method and the instant dependent claim 13/9 is a computer program for performing the method. Thus, claims 5/4 and 13/9 are obvious variations of one another.

Co-pending Application 15/814451, claim 6/5: further comprising outputting a predicted drug-adverse drug reaction association.
Instant Application, claim 14/13: wherein the method further comprises outputting a chemical substructure-adverse drug reaction association.
The difference between dependent instant claim 14/13 and co-pending application ‘451 claim 6/5 is that dependent claim 6/5 of co-pending application ‘451 is a computer-implemented method and the instant dependent claim 14/13 is a computer program for performing the method. Thus, claims 6/5 and 14/13 are obvious variations of one another.

Co-pending Application 15/814451, claim 7/5: further comprising outputting a substructure-adverse drug reaction map.
Instant Application, claim 15/13: wherein the method further comprises outputting a substructure-adverse drug reaction map.
The difference between dependent instant claim 15/13 and co-pending application ‘451 claim 7/5 is that dependent claim 7/5 of co-pending application ‘451 is a computer-implemented method and the instant dependent claim 15/13 is a computer program for performing the method. Thus, claims 7/5 and 15/13 are obvious variations of one another.

Co-pending Application 15/814451, claim 8/1: wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints using a plurality of hidden layers and pooling the plurality of neighborhood-based fingerprints.
Instant Application, claim 16/9: wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints using a plurality of hidden layers and pooling the plurality of neighborhood-based fingerprints.
The difference between dependent instant claim 16/9 and co-pending application ‘451 claim 8/1 is that dependent claim 8/1 of co-pending application ‘451 is a computer-implemented method and the instant dependent claim 16/9 is a computer program for performing the method. Thus, claims 8/1 and 16/9 are obvious variations of one another.



Claims 17 and 20-23 of the instant application are also rejected over claims 1-8, respectively, of co-pending application ‘451 for the same reasons as set forth above for claims 9 and 13-16, respectively, of the instant application. In addition, claims 1-8 of co-pending application ‘451 are directed to a computer-implemented method, whereas instant claims 17 and 20-23 are directed to a system with processors and memory.

This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 9, 16-17, and 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (“Deep Learning Based Regression and Multi-class Models for Acute Oral Toxicity Prediction”, https://arxiv.org/pdf/1704.04718v1.pdf, arXiv:1704.04718v1 [stat.ML] 16 April 2017, pp. 1-36), hereinafter referred to as Xu, in view of Ivanov et al. (“In silico assessment of adverse drug reactions and associated mechanism”, Drug Discovery Today, Vol. 21, Number 1, January 2016, pp. 58-71), hereinafter referred to as Ivanov.

In regards to claim 9, Xu teaches:

A computer program analyzing adverse drug reactions, the computer program product comprising: a computer readable storage medium readable by a processing circuit and storing program instructions for execution by the processing circuit for performing a method comprising: ([p. 7, “MGE-CNN”, Figure 1], The MGE-CNN architecture takes the canonical SMILES string of a small molecule as input, and produces a score capable of describing a value or label about toxicity. Figure 1A and 1B show this architecture and its high-level pseudocode with the steps of MGE-CNN feedforward process, wherein a machine learning framework for predicting toxicity/adverse reactions is implemented using the computer implemented MGE-CNN framework shown in Figures 1A and 1B which implements code corresponding to the indicated pseudo code.)

receiving a plurality of drug chemical structures; ([p. 7, “MGE-CNN”, p. 11 “Data Collection and Preparation”, Figure 1], The MGE-CNN architecture takes the canonical SMILES string of a small molecule as input…, wherein drug chemical structures in the form of potentially toxic chemicals and drugs are received as input into the MGE-CNN architecture.)

receiving a plurality of known drug-adverse drug reaction associations; ([Abstract, p. 11 “Data Collection and Preparation”, Table 1], Median lethal death, LD50, is a general indicator of compound acute oral toxicity (AOT)., The AOT database provided by Li et al. [15], the largest data set for oral LD50 in rat, was used in this study. All data was from three sources: 1) the admetSAR database [45]; 2) the MDL Toxicity Database (version 2004.1) [46], and 3) the Toxicity Estimation Software Tool (TEST version 4.1) [47] program from the U.S. EPA…. Finally, the training and validation sets included 8080 and 2045 compounds, respectively, with measured LD50 values adopted from the admetSAR database. Two external data sets contained 1673 (from MDL Toxicity Database) and 375 (from TEST) compounds. Based on the U.S. EPA definition of toxicity [49], all compounds were divided into four categories based on their levels of toxicity., wherein training (and evaluation) data consists of molecular compounds and their associated (known) toxicity (LD50).)

constructing a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known drug-adverse drug reaction associations. ([Abstract, p. 9, “Hyperparameter Optimization”, Figure 1], In this study, a deep learning architecture composed of multi-layer convolution neural network was used to develop three types of high-level predictive models: regression model (deepAOT-R), multi-classification (deepAOT-C) model and multitask model (deepAOT-CR) for AOT evaluation., An appropriate set of hyperparameters must be selected before applying deep learning framework for a new data set, which is a time-consuming and tedious task [41]. The hyperparameters of MGE-CNN include the length of fingerprint (FPL), the depth of fingerprint (FPD), the width of convolution kernel (CKW), the size of hidden units in the output layer (HLS), the L2 penalty of cost function (L2P), the scale of initial weights (IWS) and the step size of learning rate (LRS). The ranges of these parameters are shown in Table S1, as recommended by Duvenaud et al. (github.com/HIPS/neural-fingerprint/issues/2). In order to reduce computational costs, a simplified parameter range was used as follows:…; wherein various deep learning models (regression model (deepAOT-R), multi-classification (deepAOT-C) model and multitask model (deepAOT-CR)) are generated through a respective deep learning framework for learning the drug compound/structure-toxicity/adverse reaction associations (for new/alternative compounds/molecular structures/substructures) based on the known compound-toxicity associations and the molecular substructures (directly input into the framework).)

analyzing the deep learning frameworks to determine a set of substructure-adverse drug reaction associations; ([Abstract, p. 9, “Hyperparameter Optimization”, pp. 12-13, “Forward and backward exploration of fingerprints”, p. 23, “Backward Exploration of Fingerprints”, Figure 1, Figure 4A, Figure 4B], We further performed automatic feature learning, a key essence of deep learning, to map the corresponding activation values into fragment space and derive AOT-related chemical substructures by reverse mining of the features., In order to determine what these models actually predict, the forward and backward exploration approach was applied for “Fingerprint” layer. The forward exploration was implemented by extracting the values of “Fingerprint” layer (deep fingerprints) to construct MLR and SVM models. This could demonstrate the support degree that these features provided in the shallow machine learning decision-making system. While assessing the performance of shallow models with deep fingerprints, increased performance would suggest optimized predictive features from this MGE-CNN architecture. The backward exploration is that after linear regression, the most linear-negative-correlation feature was selected from the |FPL|-dimensional “Fingerprint” layer. Further analysis examined that related atoms and their neighboring atoms, with the most prominent contribution to this feature were reversely calculated out, which was called activation fragment. The activation fragment is highlighted in a drawing of each compound presented in category I. These highlighted fragments were considered by prediction models to be substructures most related to AOT, which an inference to toxicity fragments., The backward exploration of the “Fingerprint” layer was expected to provide an understanding of fingerprint activation…. After linear regression, the most negative correlation feature of the fingerprints was calculated, which represented the most toxic feature. Comparing activation values of this feature, nine values were determined to contribute most to feature activation. The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B)., The forward exploration was implemented by extracting the values of “Fingerprint” layer (deep fingerprints) to construct MLR and SVM models.; wherein the parameters (hidden layers) of each deep learning framework are analyzed using automatic feature learning (SVM and/or MLR models, Figure 1) through forward or backward exploration to deduce/determine toxicities associated with substructures/activation fragments (Figure 4, Table 6).)

calculating, for each substructure-adverse drug reaction association, … test to evaluate a relative association strength between each substructure and each adverse drug reaction association; ranking the substructure-adverse drug reaction associations … ([p. 7, “MGE-CNN”, pp. 12-13, “Forward and backward exploration of fingerprints”, p. 23, “Backward Exploration of Fingerprints”, Figure 1A, Figure 1B, Figure 4A, Figure 4B], The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector $z^L_l \in \mathbb{R}^{|FPL|}$, $l \in \{1, 2, \ldots, |FPD|\}$, then these vectors are summed as $z_x \in \mathbb{R}^{|FPL|}$ representing this molecule. Then $z_x$ is used as input of the subsequent neural network in the output layer for executing the following operation: $score = f(z_x W^{output}_H + b^{output}_H) W^{output}_O + b^{output}_O$ (1), where $W^{output}_H \in \mathbb{R}^{|FPL| \times |HLS|}$ is the weight matrix of hidden layer in the output layer, $W^{output}_O \in \mathbb{R}^{|HLS| \times d^{out}_{all}}$ is the weight matrix of output layer in the output layer, and $b^{output}_H \in \mathbb{R}^{1 \times |HLS|}$ and $b^{output}_O \in \mathbb{R}^{1 \times d^{out}_{all}}$ are bias terms. $d^{out}_{all} = 1$ for RMs, $d^{out}_{all} = 4$ for MCMs. The 4-dimensional vector is transformed with softmax function representing the probability of four classes. $p(i|x) = e^{score(x)_i} / \sum_{j=1}^{4} e^{score(x)_j}$ is the probability of category i, where $score(x)_i$ is the score for category i…., The backward exploration is that after linear regression, the most linear-negative-correlation feature was selected from the |FPL|-dimensional “Fingerprint” layer. Further analysis examined that related atoms and their neighboring atoms, with the most prominent contribution to this feature were reversely calculated out, which was called activation fragment. The activation fragment is highlighted in a drawing of each compound presented in category I. These highlighted fragments were considered by prediction models to be substructures most related to AOT, which an inference to toxicity fragments., After linear regression, the most negative correlation feature of the fingerprints was calculated, which represented the most toxic feature. Comparing activation values of this feature, nine values were determined to contribute most to feature activation. The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B).; wherein the association between each substructure as represented by a sub-graph in a particular layer/iteration of the learning framework and a respective level of toxicity is quantified through the application of either the regression (MLR) or multiclass (SVM) model in the form of a score or probability (relative strength of association), and these associations are ranked in the backward exploration to select the most prominent substructure/fragment contribution to the toxicity.)
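For illustration only, the backward exploration quoted above from Xu can be sketched roughly as follows. This sketch is not taken from Xu. It assumes a Pearson correlation in place of the linear regression Xu describes, it ranks whole compounds rather than atoms and their neighbors, and the function names and toy values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def most_negative_feature(fingerprints, targets):
    """Return (index, correlation) of the fingerprint feature most negatively
    correlated with the target values."""
    n_features = len(fingerprints[0])
    corrs = [pearson([row[j] for row in fingerprints], targets)
             for j in range(n_features)]
    best = min(range(n_features), key=lambda j: corrs[j])
    return best, corrs[best]

def top_activations(fingerprints, feature, k):
    """Indices of the k compounds with the largest activation of one feature."""
    order = sorted(range(len(fingerprints)),
                   key=lambda i: fingerprints[i][feature], reverse=True)
    return order[:k]

# Hypothetical toy data: 4 compounds x 3 fingerprint features.
fingerprints = [
    [0.9, 0.1, 0.5],
    [0.7, 0.2, 0.5],
    [0.2, 0.8, 0.5],
    [0.1, 0.9, 0.5],
]
# Hypothetical LD50-like targets (a lower value is taken to mean more toxic).
targets = [1.0, 2.0, 4.0, 5.0]

feature, r = most_negative_feature(fingerprints, targets)
top = top_activations(fingerprints, feature, k=2)
print(feature, round(r, 3), top)
```

In this toy example the first feature is selected and the two compounds that activate it most strongly are returned; in Xu, the corresponding activations are instead mapped back to substructures (activation fragments).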
However, Xu does not explicitly teach … a p value using a chi-squared test … according to statistical significance using the p values; and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association. Xu uses neural network-based means (MLR, SVM) to quantify the strength of the substructure-toxicity association, not a statistical test per se. Although Xu indicates that the deep learning framework is an “in silico” method for reducing costs and time (Abstract), he does not explicitly disclose that this increased efficiency relates to drug development/redesign.
However, Ivanov, in the analogous art of drug design using computer-based ADR predictions, teaches:

calculating, for each substructure-adverse drug reaction association, a p value using a chi-squared test to evaluate a relative association strength between each substructure and each adverse drug reaction association; ranking the substructure-adverse drug reaction associations according to statistical significance using the p values; ([p. 61, “ADR”, p. 61, “Structure-related data”, p. 65, “Prediction of ADRs with various drug features and associated analyses”, Box 1], The goal is to identify drugs that have a greater proportion of a particular event among their reported events compared with the reported frequencies for other drugs. Signals are detected by comparing the observed reporting rates between a drug–event pair to the expected reporting rate derived from other drug–event pairs. Under the null hypothesis that the event occurred by chance, the observed and expected rates will be equivalent and their ratio equal to one. When this ratio is much larger than one the null hypothesis is rejected., It is generally accepted that chemical structures determine all chemical properties and biological activities of pharmacological substances [42], for example interactions with biological targets., ADR mechanisms are caused by many different drug features, which can be divided into chemical aspects, such as structural fragments, scaffolds and biological features like protein targets, genes, pathways or biological processes. Correlations between drug features and ADRs can be calculated by comparing drug–feature profiles between ADR causing and non-ADR-causing drugs. This can be done separately for each feature by simple statistical analyses, such as Fisher’s exact test [51,53,63–65], chi-square test [47], Mann–Whitney statistics [59,60], disproportionality analysis [33,34] and others [56–58,67,79,87].; wherein drug structural feature-adverse reactions are deduced/determined using a chi-squared test (which may include the Fisher’s exact test) and wherein it is noted that the chi-squared test/Fisher’s exact test determines (relative/rankable) statistical significance of the deviation from a null hypothesis as quantified by the p-value (as is well known in the general theory of statistical analysis).)

and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association. ([Abstract, p. 58, “Introduction”, p. 65, “Prediction of ADRs with various drug features and associated analyses”, p. 69, “Concluding remarks”], These features have also been used for the creation of predictive models that enable estimation of ADRs during the early stages of drug development. In this review, we discuss various in silico approaches to predict these features for a certain drug, estimate correlations with ADRs, establish causal relationships between selected features and ADR mechanisms and create corresponding predictive models., Therefore, many serious ADRs can be detected only at the last stages of clinical trials, making ADRs the second most common cause of drug development attrition [3,4]., Another important observation from Table 3 is that the accuracy of predictive models based only on chemical features is comparable to the accuracies of models based on biological features. This is not surprising because the chemical structure determines the DTI and pathway interaction profiles of a drug as well as the resulting phenotypic effects. Therefore, SAR-based models can be used for ADR prediction at the earliest stages of drug development when other information besides structural formulae is not available., ADR-related features can be used in various in vitro and in silico approaches to estimate ADRs at early stages of drug development and to gain understanding about the general pathophysiological ADR mechanisms.; wherein the analysis, through predictive (statistical) modeling of molecular structure-adverse reaction associations, is used at the early stages of drug design (i.e., the resultant information is used to guide drug design/re-design).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov to apply a chi-squared test with the calculation of a p-value to determine the relative ranked degree/strength of association between a substructure and an adverse reaction and to redesign a drug while accounting for these learned associations.  The modification would have been obvious because one of ordinary skill would have been motivated to use a statistical analysis such as Chi-square or Fisher Exact test in evaluating substructure-adverse reaction association in “in silico” predictive models to efficiently develop drugs by avoiding the adverse drug reactions that are most likely to be accurately predicted in the design of any candidate drug (Ivanov, [Abstract, p. 65, “Prediction of ADRs with various drug features and associated analyses”, p. 69, “Concluding Remarks”]).
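For illustration only, the kind of statistical analysis discussed above (a chi-squared test yielding a p-value for each substructure-adverse reaction association, followed by ranking on the p-values) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Ivanov, Xu, or the claims: it assumes one 2x2 contingency table of drug counts per association, uses the Pearson chi-squared statistic without continuity correction, and the substructure names, reaction names, and counts are hypothetical. For one degree of freedom the p-value equals erfc(sqrt(chi2/2)), which avoids any dependency outside the standard library.

```python
from math import erfc, sqrt

def chi2_p_value(a, b, c, d):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 table of counts.

    a: drugs with the substructure and with the ADR
    b: drugs with the substructure and without the ADR
    c: drugs without the substructure and with the ADR
    d: drugs without the substructure and without the ADR
    Returns (chi2 statistic, p-value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

def rank_associations(tables):
    """Sort (association, chi2, p) tuples by ascending p-value."""
    scored = [(name, *chi2_p_value(*counts)) for name, counts in tables.items()]
    return sorted(scored, key=lambda item: item[2])

# Hypothetical counts, for illustration only.
tables = {
    ("nitro group", "hepatotoxicity"): (30, 10, 10, 50),
    ("amide", "rash"): (20, 20, 20, 40),
    ("phenyl", "nausea"): (10, 10, 40, 40),
}
ranked = rank_associations(tables)
for name, chi2, p in ranked:
    print(name, round(chi2, 2), p)
```

The test is two-sided, so a small p-value indicates only that the substructure and the reaction are associated; the direction of the association would have to be read from the counts themselves.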

In regards to claim 16, the rejection of claim 9 is incorporated and Xu further teaches wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints using a plurality of hidden layers and pooling the plurality of neighborhood-based fingerprints. ([p. 7, “MGE-CNN”, p. 20 “Forward Exploration of Fingerprints”, Figure 1A, Figure 1B], The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector $z^L_l \in \mathbb{R}^{|FPL|}$, $l \in \{1, 2, \ldots, |FPD|\}$, then these vectors are summed as $z_x \in \mathbb{R}^{|FPL|}$ representing this molecule. Then $z_x$ is used as input of the subsequent neural network in the output layer., For this purpose, fingerprints were extracted from the “Fingerprint” layer in the well-trained deep models, then the whole data set was transferred into a matrix of N (number of compounds) × FPL, which was a featurization and vectorization process for compounds.; wherein the fingerprints represented/encoded by each of the succession of layers in the deep learning framework are pooled to form a matrix of a number of compounds × FPL (the dimensionality of the fingerprint) prior to being input into the MLR or SVM networks such that each hidden layer of the convolutional neural network corresponds to a molecular neighborhood in a compound of different breadth (Figure 1A) and wherein associated pooling operations are also seen in lines 19-20 and 26 of the pseudocode (Figure 1B) corresponding to the extraction of features from each layer by summing across deep neural network parameters that represent the sub-structures/fingerprints and summing over the layers to obtain the $z_x$ from which the score/strength of substructure/fingerprint/subgraph-toxicity (Equation 1) is computed.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov for the same reasons as pointed out for claim 9.
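For illustration only, the pooling of per-layer fingerprint vectors into z_x and the scoring operation of Equation 1 quoted from Xu can be sketched as below. This is a simplified sketch and not Xu's implementation: the activation f is not specified in the quoted passage and is assumed here to be a ReLU, and the weights, biases, and layer vectors are hypothetical toy values.

```python
from math import exp

def pool_layers(layer_vectors):
    """Sum the per-layer fingerprint vectors element-wise to obtain z_x."""
    return [sum(values) for values in zip(*layer_vectors)]

def matvec(vec, mat):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v * mat[i][j] for i, v in enumerate(vec))
            for j in range(len(mat[0]))]

def relu(vec):
    """Assumed activation f; the quoted passage does not name one."""
    return [max(0.0, x) for x in vec]

def score(z_x, w_h, b_h, w_o, b_o, f=relu):
    """Equation 1: score = f(z_x W_H + b_H) W_O + b_O."""
    hidden = f([h + b for h, b in zip(matvec(z_x, w_h), b_h)])
    return [o + b for o, b in zip(matvec(hidden, w_o), b_o)]

def softmax(scores):
    """p(i|x) = exp(score_i) / sum_j exp(score_j), shifted by the max for stability."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy sizes: |FPD| = 3 layers, |FPL| = 2, |HLS| = 2, 4 output classes.
layer_vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
w_h = [[1.0, 0.0], [0.0, 1.0]]
b_h = [0.0, 0.0]
w_o = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
b_o = [0.0, 0.0, 0.0, 0.0]

z_x = pool_layers(layer_vectors)
probs = softmax(score(z_x, w_h, b_h, w_o, b_o))
print(z_x, probs)
```

With these toy values the pooled vector is [1.5, 1.5], and the four class probabilities sum to one, with the first two classes equally likely and more likely than the last two.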

Claim 17 is also rejected because it is just a system implementation of the same subject matter of claim 9 which can be found in Xu and Ivanov.

In regards to claim 21, the rejection of claim 17 is incorporated and Xu further teaches wherein the processor is configured to output a significant chemical substructure. ([pp. 10-12, section 3.2, Figure 8] We looked for possible associations between all neurons of the networks and 1429 toxicophores, that were available as described in Section 2.3.2. We checked the associations using a U-test, in which a neuron was characterized by its activation over the compounds of the training set and a toxicophore was characterized by its presence/absence in the training set compounds… Next we investigated the correlation of known toxicophores to neurons in different layers to quantify their matching. To this end, we used the rank-biserial correlation which is compatible to the previously used U-test…  Visual inspection of the results also confirmed that lower layers tended to learn smaller features, often focusing on single functional groups, such as sulfonic acid groups (see row 1 and 2 of Figure 8), while in higher layers the correlations tended to be with larger toxicophore clusters (row 3 of Figure 8)… Most importantly, these learned toxicophore structures demonstrated that Deep Learning can support finding new chemical knowledge that is encoded in its hidden units., wherein a result of the analysis of the significant association between neuron activations and toxicophore is the generation of substructures shown in context with the full chemical structure (Figure 8) in which the association between activations in each layer and the corresponding substructure is illustrated.)  
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov for the same reasons as pointed out for claim 9.

In regards to claim 22, the rejection of claim 17 is incorporated and Xu further teaches wherein the processor is configured to output a substructure-adverse drug reaction map. ([pp. 10-12, section 3.2, p. 7 of Supplementary Material, Section 5, Figure 8] Visual inspection of the results also confirmed that lower layers tended to learn smaller features, often focusing on single functional groups, such as sulfonic acid groups (see row 1 and 2 of Figure 8), while in higher layers the correlations tended to be with larger toxicophore clusters (row 3 of Figure 8)… Most importantly, these learned toxicophore structures demonstrated that Deep Learning can support finding new chemical knowledge that is encoded in its hidden units.; We present the software libraries DeepTox depends on … matplotlib …; wherein results are generated/outputted in the form of figures of pertinent fingerprint (molecular fragment) substructures from the analysis, through the neuronal activations in the deep neural network, the toxicity/adverse reaction association with compound structures, and wherein these results are maps of the pertinent substructures as they are structurally and contextually represented in each compound/drug.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov for the same reasons as pointed out for claim 9.

Claim 23/17 is also rejected because it is just a system implementation of the same subject matter of claim 16/9 which can be found in Xu and Ivanov.

Claims 13-15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Xu, in view of Ivanov, and in further view of Harpaz et al. (“Biclustering of Adverse Drug Events in FDA’s Spontaneous Reporting System”, Clin. Pharmacol. Ther. 2011 February, 89(2):243-250), hereinafter referred to as Harpaz.

In regards to claim 13, the rejection of claim 9 is incorporated and Xu and Ivanov do not further teach wherein the method further comprises grouping substructures and related adverse drug reactions with bi-clustering. Although Xu makes use of the Tanimoto distance (p. 22, Table 5) for associating/correlating deep fingerprints (substructures) with known molecular topological structure-based fingerprints as a demonstration of the interpretability of his analytical framework, he does not explicitly disclose a bi-clustering operation for grouping different structures/sub-structures according to their ADRs. Although Ivanov (p. 65) discusses the clustering of chemicals according to ADR mechanisms, he also does not explicitly disclose “bi-clustering”.
However, Harpaz, in the analogous environment of associating adverse drug events to drug molecular structure, teaches wherein the method further comprises grouping substructures and related adverse drug reactions with biclustering ([p. 7, Section 4, Biclustering],  The clustering data matrix consisted of m rows each representing a different drug, and n columns each representing a different AE, where each cell in the matrix initially contained GPS’ EBGM association strength value corresponding to the i-th drug and the j-th AE. This matrix was then transformed into a binary data matrix, where each cell contained either a 1 or 0, representing the states of “strongly associated” or “weakly associated” respectively. The transformation was performed by selecting an “association strength threshold”, and reflects the fundamental idea that drugs and AEs are either associated due to a causal relationship or not, and that in reality discovering this type of relationship is the ultimate goal., wherein biclustering is used to identify associations between drug groups and adverse reaction/events to quantify that association and provide insight into the causality of adverse drug events.).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu and Ivanov to incorporate the teachings of Harpaz to have applied biclustering of drug features/substructures and their related toxicities/adverse reactions either for model evaluation clustering or for analysis of the association between learned substructures and toxicities/adverse reactions. The modification would have been obvious because one of ordinary skill would have been motivated to use biclustering to provide causal insight into drug/drug feature groups and toxicity/adverse reaction modes in order to potentially infer new or previously unrecognized adverse drug events. (Harpaz, [Abstract, pp. 7-8, Section 4]).
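
For illustration only, the data preparation quoted from Harpaz (a drug-by-AE matrix of association strengths, transformed into a binary matrix by an "association strength threshold") and a subsequent grouping step can be sketched as below. This is a minimal sketch under stated assumptions, not Harpaz's algorithm and not code from the record: the function names `binarize` and `seed_biclusters`, the `>=` comparison, the minimum bicluster sizes, and the toy values are all hypothetical. The grouping step is a simple row-seeded search for all-ones submatrices, swapped in for whatever biclustering procedure the reference actually applies.

```python
from typing import List, Sequence, Tuple

Bicluster = Tuple[Tuple[int, ...], Tuple[int, ...]]  # (drug rows, AE columns)


def binarize(matrix: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Map each association strength to 1 ("strongly associated") or 0.

    Using >= for the cut-off is an assumption made for this sketch.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in matrix]


def seed_biclusters(binary: Sequence[Sequence[int]],
                    min_rows: int = 2, min_cols: int = 2) -> List[Bicluster]:
    """Find all-ones submatrices by using each drug row as a seed.

    For every row, take the set of AE columns it is associated with, then
    collect every drug row associated with all of those columns. This is a
    simple illustrative search, not an exhaustive biclustering algorithm.
    """
    found = set()
    for seed in binary:
        cols = tuple(j for j, v in enumerate(seed) if v)
        if len(cols) < min_cols:
            continue
        rows = tuple(i for i, row in enumerate(binary) if all(row[j] for j in cols))
        if len(rows) >= min_rows:
            found.add((rows, cols))
    return sorted(found)


# Toy association strengths (assumed): 3 drugs (rows) by 3 adverse events (columns).
strengths = [
    [3.0, 2.5, 0.1],
    [2.2, 4.0, 0.3],
    [0.2, 0.1, 5.0],
]
binary = binarize(strengths, threshold=2.0)
clusters = seed_biclusters(binary)
```

In this toy example the first two drugs share the first two adverse events, so they form one bicluster; the third drug is strongly associated with only one event and falls below the assumed minimum size.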

In regards to claim 14, the rejection of claim 13 is incorporated and Xu further teaches wherein the method further comprises outputting a chemical substructure-adverse drug reaction association. ([p. 23, “Backward Exploration of Fingerprints”, Figure 4, Table 6] Comparing activation values of this feature, nine values were determined to contribute most to feature activation. The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B). There were mainly two classes of highlighted fragments, α,β-Unsaturated nitriles (TA626) and alyl (thio)phosphates (TA776) for RM4, while TA776 and thiocarbonyl (TA374) for CM1. The three fragments have been reported to be toxicity structural alerts.; wherein results from the deep learning framework analysis are generated/outputted in the form of figures and tables of significant substructures-toxicity associations.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov and Harpaz for the same reasons as pointed out for claims 9 and 13, respectively.

In regards to claim 15, the rejection of claim 13 is incorporated and Xu further teaches wherein the method further comprises outputting a substructure-adverse drug reaction map. ([p. 23, “Backward Exploration of Fingerprints”, Figure 4, Table 6] Comparing activation values of this feature, nine values were determined to contribute most to feature activation. The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B). There were mainly two classes of highlighted fragments, α,β-Unsaturated nitriles (TA626) and alyl (thio)phosphates (TA776) for RM4, while TA776 and thiocarbonyl (TA374) for CM1. The three fragments have been reported to be toxicity structural alerts.; wherein results are generated/outputted in the form of figures of pertinent fingerprint (molecular fragment) substructures from the analysis, through the neuronal activations in the deep neural network, the toxicity/adverse reaction association with compound structures, and wherein these results are maps of the pertinent substructures as they are structurally and contextually represented in each compound/drug.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov and Harpaz for the same reasons as pointed out for claims 9 and 13, respectively.

Claim 20/19 is also rejected because it is just a system implementation of the same subject matter of claim 14/13 which can be found in Xu, Ivanov, and Harpaz.


Response to Arguments
Applicant’s arguments with respect to claims 9, 13-17, and 20-23 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
	
Dougal Maclaurin (“Modeling, Inference and Optimization with Composable Differentiable Procedures”, PhD Dissertation, Harvard University, April 2016, pp. 1-135) teaches the identification of (differentiable) neural graph fingerprints in a deep learning framework for predicting molecular responses (solubility, toxicity) according to substructures identified through analysis/pooling of representations of those substructures within the hidden layers of that framework.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT LEWIS KULP whose telephone number is (571)272-7983.  The examiner can normally be reached on M, Th, F 8-5:30; Tu 8-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on 571-272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ROBERT LEWIS KULP/Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122