DETAILED ACTION
This action is in response to the amendments filed 1 July 2021 for application 15/814451 filed 16 November 2017. Claims 1 and 4-7 are currently pending. Claims 2, 3, 8, and 9 have been canceled. All references in each of the IDS statements have been considered. The claim objections and rejections under 35 U.S.C. 101 have been withdrawn in light of the amendments.

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Double Patenting
A rejection based on double patenting of the “same invention” type finds its support in the language of 35 U.S.C. 101 which states that “whoever invents or discovers any new and useful process... may obtain a patent therefor...” (Emphasis added). Thus, the term “same invention,” in this context, means an invention drawn to identical subject matter. See Miller v. Eagle Mfg. Co., 151 U.S. 186 (1894); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Ockert, 245 F.2d 467, 114 USPQ 330 (CCPA 1957).
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159.  See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1 and 4-7 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 9 and 13-15 of copending Application No. 15/494027 (reference application). The methods in instant claims 1 and 4-7 are anticipated by the methods recited in the body of program product claims 9 and 13-15. Specifically, claims 1, 4, 5, 6, and 7 of the instant application are rejected over claims 9, 9, 13, 14, and 15 of the reference application, respectively. Although the claims at issue are not identical, they are not patentably distinct from each other according to the following analysis:


Claim comparison: Instant Application versus Co-Pending Application 15/494027
Instant Application, Claim 1 (compared with Claim 9 of the co-pending application):

A computer-implemented method for generating a framework for analyzing adverse drug reactions comprising: 

receiving, to a processor, a plurality of drug chemical structures, 

receiving, to the processor, a plurality of known drug-adverse drug reaction associations; 

and constructing, by the processor, a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known adverse-drug reaction associations

wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints for each of the plurality of drug chemical structures using a plurality of hidden layers and generating a convolutional feature map comprising a plurality of convolutional steps, wherein each convolutional step encodes a respective neighborhood-based fingerprint at a respective hidden layer, 

wherein each fingerprint is obtained by starting from multiple atoms belonging to said fingerprint and resultant redundancies are removed by mapping each fingerprint into a lower dimension using a single layer of the deep learning framework; 

analyzing the deep learning frameworks to determine a set of substructure-adverse drug reaction associations; 

and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association

Co-Pending Application 15/494027, Claim 9:
A computer program analyzing adverse drug reactions, the computer program product comprising:
a computer readable storage medium readable by a processing circuit and storing program instructions for execution by the processing circuit for performing a method comprising:
receiving a plurality of drug chemical structures;
receiving a plurality of known drug-adverse drug reaction associations; 
constructing a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known drug-adverse drug reaction associations;
wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints for each of the plurality of drug chemical structures using a plurality of hidden layers and generating a convolutional feature map comprising a plurality of convolutional steps, wherein each convolutional step encodes a respective neighborhood-based fingerprint at a respective hidden layer, 
wherein each fingerprint is obtained by starting from multiple atoms belonging to said fingerprint and resultant redundancies are removed by mapping each fingerprint into a lower dimension using a single layer of the deep learning framework;
analyzing the deep learning frameworks to determine a set of substructure-adverse drug reaction associations; 

calculating, for each substructure-adverse drug reaction association, a p-value using a chi-squared test to evaluate a relative association strength between each substructure and each adverse drug reaction association;
ranking the substructure-adverse drug reaction associations according to statistical significance using the p values;

and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association.

The method in instant claim 1 is anticipated by the method recited in the body of program product claim 9 of the ‘027 application.
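Instant Application, Claim 4/3 (compared with Claim 9 of the co-pending application):
further comprising ranking substructure-adverse drug reaction associations.
Co-Pending Application 15/494027, Claim 9:
A computer program analyzing adverse drug reactions, the computer program product comprising: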
a computer readable storage medium readable by a processing circuit and storing program instructions for execution by the processing circuit for performing a method comprising:
receiving a plurality of drug chemical structures;
receiving a plurality of known drug-adverse drug reaction associations; 
constructing a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known drug-adverse drug reaction associations;
wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints for each of the plurality of drug chemical structures using a plurality of hidden layers and generating a convolutional feature map comprising a plurality of convolutional steps, wherein each convolutional step encodes a respective neighborhood-based fingerprint at a respective hidden layer, 
wherein each fingerprint is obtained by starting from multiple atoms belonging to said fingerprint and resultant redundancies are removed by mapping each fingerprint into a lower dimension using a single layer of the deep learning framework;
analyzing the deep learning frameworks to determine a set of substructure-adverse drug reaction associations; 

calculating, for each substructure-adverse drug reaction association, a p-value using a chi-squared test to evaluate a relative association strength between each substructure and each adverse drug reaction association;

ranking the substructure-adverse drug reaction associations according to statistical significance using the p values;
and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association.

The method in instant claim 4/3 is anticipated by the method recited in the body of program product claim 9 of the ‘027 application. 
Instant Application, Claim 5/4:
further comprising grouping substructures and related adverse drug reactions with biclustering.
Co-Pending Application 15/494027, Claim 13/9:
wherein the method further comprises grouping substructures and related adverse drug reactions with biclustering.
The method in instant claim 5/4 is anticipated by the method recited in the body of program product claim 13/9 of the ‘027 application. 
Instant Application, Claim 6/5:
further comprising outputting a predicted drug-adverse drug reaction association.
Co-Pending Application 15/494027, Claim 14/13:
wherein the method further comprises outputting a chemical substructure-adverse drug reaction association.
The method in instant claim 6/5 is anticipated by the method recited in the body of program product claim 14/13 of the ‘027 application. 
Instant Application, Claim 7/5:
further comprising outputting a substructure-adverse drug reaction map.
Co-Pending Application 15/494027, Claim 15/13:
wherein the method further comprises outputting a substructure-adverse drug reaction map.
The method in instant claim 7/5 is anticipated by the method recited in the body of program product claim 15/13 of the ‘027 application. 




This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
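
For illustration only, the p-value calculation and ranking steps recited in claim 9 of the reference application can be sketched as follows. This is a minimal sketch and not the implementation of either application: it assumes each substructure-adverse drug reaction pair is summarized by a 2x2 contingency table of drug counts, and it assumes a Pearson chi-squared test with one degree of freedom and no continuity correction. The function names, substructure labels, and counts are hypothetical.

```python
import math


def chi2_p_value(a, b, c, d):
    """Pearson chi-squared p-value for a 2x2 table (1 degree of freedom).

    a: drugs with the substructure and with the ADR
    b: drugs with the substructure, without the ADR
    c: drugs without the substructure, with the ADR
    d: drugs without the substructure and without the ADR
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: treated here as no evidence of association
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def rank_associations(tables):
    """Sort (substructure, adr) pairs by ascending p-value (most significant first)."""
    scored = [(pair, chi2_p_value(*counts)) for pair, counts in tables.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical counts, for illustration only.
example_tables = {
    ("nitrile", "hepatotoxicity"): (30, 10, 20, 40),
    ("phenyl", "hepatotoxicity"): (25, 25, 25, 25),
}
ranked = rank_associations(example_tables)
for pair, p in ranked:
    print(pair, p)
```

In this sketch a smaller p-value places an association earlier in the ranking; any significance cutoff would be a separate design choice not addressed here.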

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (“Deep Learning Based Regression and Multi-class Models for Acute Oral Toxicity Prediction”, https://arxiv.org/pdf/1704.04718v1.pdf, arXiv:1704.04718v1 [stat.ML] 16 April 2017, pp. 1-36), hereinafter referred to as Xu, in view of Ivanov et al. (“In silico assessment of adverse drug reactions and associated mechanism”, Drug Discovery Today, Vol. 21, Number 1, January 2016, pp. 58-71), hereinafter referred to as Ivanov.

In regards to claim 1, Xu teaches A computer-implemented method for generating a framework for analyzing adverse drug reactions comprising: ([p. 7, “MGE-CNN”, Figure 1], The MGE-CNN architecture takes the canonical SMILES string of a small molecule as input, and produces a score capable of describing a value or label about toxicity. Figure 1A and 1B show this architecture and its high-level pseudocode with the steps of MGE-CNN feedforward process, wherein a machine learning framework for analyzing and predicting toxicity/adverse reactions is implemented/generated using the computer-implemented MGE-CNN framework shown in Figures 1A and 1B, which implements code corresponding to the indicated pseudocode.) receiving, to a processor, a plurality of drug chemical structures, ([p. 7, “MGE-CNN”, p. 11 “Data Collection and Preparation”, Figure 1], The MGE-CNN architecture takes the canonical SMILES string of a small molecule as input…, wherein drug chemical structures in the form of potentially toxic chemicals and drugs are received as input into the MGE-CNN architecture.) receiving, to the processor, a plurality of known drug-adverse drug reaction associations; ([Abstract, p. 11 “Data Collection and Preparation”, Table 1], Median lethal death, LD50, is a general indicator of compound acute oral toxicity (AOT)., The AOT database provided by Li et al., 15 the largest data set for oral LD50 in rat, was used in this study. All data was from three sources: 1) the admetSAR database;45 2) the MDL Toxicity Database (version 2004.1),46 and 3) the Toxicity Estimation Software Tool (TEST version 4.1)47 program from the U.S. EPA…. Finally, the training and validation sets included 8080 and 2045 compounds, respectively, with measured LD50 values adopted from the admetSAR database. Two external data sets contained 1673 (from MDL Toxicity Database) and 375 (from TEST) compounds. 
Based on the U.S. EPA definition of toxicity,49 all compounds were divided into four categories based on their levels of toxicity., wherein training (and evaluation) data consists of molecular compounds and their associated (known) toxicity (LD50).) constructing, by the processor, a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known adverse-drug reaction associations  ([Abstract, p. 9, “Hyperparameter Optimization” , Figure 1],  In this study, a deep learning architecture composed of multi-layer convolution neural network was used to develop three types of high-level predictive models: regression model (deepAOT-R), multi-classification (deepAOT-C) model and multitask model (deepAOT-CR) for AOT evaluation., An appropriate set of hyperparameters must be selected before applying deep learning framework for a new data set, which is a timeconsuming and tedious task.41 The hyperparameters of MGE-CNN include the length of fingerprint (FPL), the depth of fingerprint (FPD), the width of convolution kernel (CKW), the size of hidden units in the output layer (HLS), the L2 penalty of cost function (L2P), the scale of initial weights (IWS) and the step size of learning rate (LRS). 
The ranges of these parameters are shown in Table S1, as recommended by Duvenaud et al. (github.com/HIPS/neural-fingerprint/issues/2) In order to reduce computational costs, a simplified parameter range was used as follows:…; wherein various deep learning models (regression model (deepAOT-R), multi-classification (deepAOT-C) model and multitask model (deepAOT-CR)) are generated through a respective deep learning framework for learning the drug compound/structure-toxicity/adverse reaction associations (for new/alternative compounds/molecular structures/substructures) based on the known compound-toxicity associations and the molecular substructures (directly input into the framework).) wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints for each of the plurality of drug chemical structures using a plurality of hidden layers and generating a convolutional feature map comprising a plurality of convolutional steps, wherein each convolutional step encodes a respective neighborhood-based fingerprint at a respective hidden layer, ([p. 7, “MGE-CNN”, p. 20 “Forward Exploration of Fingerprints”, Figure 1A, Figure 1B] The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector z_Ll ∈ R^|FPL|, l ∈ {1, 2, ..., |FPD|}, then these vectors are summed as z_x ∈ R^|FPL| representing this molecule. 
Then z_x is used as input of the subsequent neural network in the output layer., For this purpose, fingerprints were extracted from the “Fingerprint” layer in the well-trained deep models, then the whole data set was transferred into a matrix of N (number of compounds) × FPL, which was a featurization and vectorization process for compounds.; wherein the CNN-based deep learning framework (Figure 1A) computes fingerprints (z’s in the algorithm pseudocode of Figure 1B, especially z_x but also including Z_layer and z_Ll) in the fingerprint layer as well as across the layers of the CNN corresponding to each compound (each molecule with m atoms) such that these fingerprints are formed/encoded using each layer of a CNN and such that distinct fingerprints are encoded for each atom “a” in that molecule along with its respective neighbors in that molecule (lines 12-20 of Figure 1B), with each fingerprint having dimension FPL (the feature dimension) and with the fingerprints thereby formed relative to different atoms in that molecule (and pooled across these atoms and their respective neighbors and CNN layers - lines 20-26 of Figure 1B) used as an output fingerprint (fingerprint layer) that is subsequently used to determine the score (equation 1) z_x associated with substructure/fingerprint/subgraph-toxicity.) wherein each fingerprint is obtained by starting from multiple atoms belonging to said fingerprint and resultant redundancies are removed by mapping each fingerprint into a lower dimension using a single layer of the deep learning framework; ([p. 7, “MGE-CNN”, p. 20 “Forward Exploration of Fingerprints”, Figure 1A, Figure 1B, Table 2] The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector z_Ll ∈ R^|FPL|, l ∈ {1, 2, ..., |FPD|}, then these vectors are summed as z_x ∈ R^|FPL| representing this molecule. 
Then zx is used as input of the subsequent neural network in the output layer., For this purpose, fingerprints were extracted from the “Fingerprint” layer in the well-trained deep models, then the whole data set was transferred into a matrix of N (number of compounds) × FPL, which was a featurization and vectorization process for compounds.; wherein as noted above (and seen in Figure 1B, lines 12-24), fingerprints are formed for each atom and its neighbors for each (hidden) layer of the CNN and pooled (see Figure 1A – especially the summation over the softmax output layer) with the resultant fingerprint output then passing through a single hidden layer indicated by the function W_H^output having dimension FPL x HLS prior to classification or regression (see lines 2 and 28 of Figure 1B as well as the previous discussion on the computation of the score), wherein this hidden layer reduces the dimensionality of the fingerprint from FPL to HLS for test configurations in which HLS<FPL which, as can be seen in table 2, is the case for 5 of the 10 best performing multi-classification models but 4 out of the 5 of these models for which both test accuracy metrics exceeded 90% (viz., CM2, CM3, CM5, CM8), and wherein it is noted that Xu teaches the underlying functionality of performing the mapping of the fingerprints to a lower dimension via the single layer which inherently performs the removal of the redundancy (i.e., the effect of the mapping is inherently the removal of redundancy). analyzing the deep learning frameworks to determine a set of substructure-adverse drug reaction associations; ….([Abstract, p. 9, “Hyperparameter Optimization”, pp. 12-13, “Forward and backward exploration of fingerprints”, p. 
23, “Backward Exploration of Fingerprints”, Figure 1, Figure 4A, Figure 4B], We further performed automatic feature learning, a key essence of deep learning, to map the corresponding activation values into fragment space and derive AOT-related chemical substructures by reverse mining of the features., In order to determine what these models actually predict, the forward and backward exploration approach was applied for “Fingerprint” layer. The forward exploration was implemented by extracting the values of “Fingerprint” layer (deep fingerprints) to construct MLR and SVM models. This could demonstrate the support degree that these features provided in the shallow machine learning decision-making system. While assessing the performance of shallow models with deep fingerprints, increased performance would suggest optimized predictive features from this MGE-CNN architecture. The backward exploration is that after linear regression, the most linear-negative-correlation feature was selected from the |F P L|-dimensional “Fingerprint” layer. Further analysis examined that related atoms and their neighboring atoms, with the most prominent contribution to this feature were reversely calculated out, which was called activation fragment. The activation fragment is highlighted in a drawing of each compound presented in category I. These highlighted fragments were considered by prediction models to be substructures most related to AOT, which an inference to toxicity fragments., The backward exploration of the “Fingerprint” layer was expected to provide an understanding of fingerprint activation…. After linear regression, the most negative correlation feature of the fingerprints was calculated, which represented the most toxic feature. Comparing activation values of this feature, nine values were determined to contribute most to feature activation. 
The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B)., The forward exploration was implemented by extracting the values of “Fingerprint” layer (deep fingerprints) to construct MLR and SVM models.; wherein the parameters (hidden layers) of each deep learning framework are analyzed using automatic feature learning (SVM and/or MLR models, Figure 1) through forward or backward exploration to deduce/determine toxicities associated with substructures/activation fragments (Figure 4, Table 6).) 
However, Xu does not explicitly teach redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association. Although Xu indicates that the deep learning framework is an “in silico” method for reducing costs and time (Abstract), he does not explicitly disclose that this increased efficiency is relative to drug development/redesign.
However, Ivanov, in the analogous art of drug design using computer-based ADR predictions, teaches and redesigning a candidate substructure of a candidate drug to avoid a determined substructure-adverse drug reaction association. ([Abstract, p. 58, “Introduction”, p. 65, “Prediction of ADRs with various drug features and associated analyses”, p. 69, “Concluding remarks”], These features have also been used for the creation of predictive models that enable estimation of ADRs during the early stages of drug development. In this review, we discuss various in silico approaches to predict these features for a certain drug, estimate correlations with ADRs, establish causal relationships between selected features and ADR mechanisms and create corresponding predictive models., Therefore, many serious ADRs can be detected only at the last stages of clinical trials, making ADRs the second most common cause of drug development attrition [3,4]., Another important observation from Table 3 is that the accuracy of predictive models based only on chemical features is comparable to the accuracies of models based on biological features. This is not surprising because the chemical structure determines the DTI and pathway interaction profiles of a drug as well as the resulting phenotypic effects. Therefore, SAR-based models can be used for ADR prediction at the earliest stages of drug development when other information besides structural formulae is not available., ADR-related features can be used in various in vitro and in silico approaches to estimate ADRs at early stages of drug development and to gain understanding about the general pathophysiological ADR mechanisms.; wherein the analysis, through predictive (statistical) modeling of molecular structure-adverse reaction associations is used at the early stages of drug design (i.e., the resultant information is used to guide drug design/re-design).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov to determine the relative strength of association between a substructure and an adverse reaction and to redesign a drug while accounting for these learned associations.  The modification would have been obvious because one of ordinary skill would have been motivated to use a statistical analysis in evaluating substructure-adverse reaction association in “in silico” predictive models to efficiently develop drugs by avoiding the adverse drug reactions that are most likely to be accurately predicted in the design of any candidate drug (Ivanov, [Abstract, p. 65, “Prediction of ADRs with various drug features and associated analyses”, p. 69, “Concluding Remarks”]).

In regards to claim 4, the rejection of claim 1 is incorporated and Xu further teaches further comprising ranking substructure-adverse drug reaction associations. ([p. 7, “MGE-CNN”, pp. 12-13, “Forward and backward exploration of fingerprints”, p. 23, “Backward Exploration of Fingerprints”, Figure 1A, Figure 1B, Figure 4A, Figure 4B], The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector z_Ll ∈ R^|FPL|, l ∈ {1, 2, ..., |FPD|}, then these vectors are summed as z_x ∈ R^|FPL| representing this molecule. Then z_x is used as input of the subsequent neural network in the output layer for executing the following operation: score = f(z_x W_H^output + b_H^output) W_O^output + b_O^output (1) where W_H^output ∈ R^(|FPL|×|HLS|) is the weight matrix of hidden layer in the output layer, W_O^output ∈ R^(|HLS|×d_all^out) is the weight matrix of output layer in the output layer, and b_H^output ∈ R^(1×|HLS|) and b_O^output ∈ R^(1×d_all^out) are bias terms. d_all^out = 1 for RMs, d_all^out = 4 for MCMs. The 4-dimensional vector is transformed with softmax function representing the probability of four classes. p(i|x) = e^(score(x)_i) / ∑_(j=1)^4 e^(score(x)_j) is the probability of category i, where score(x)_i is the score for category i…., The backward exploration is that after linear regression, the most linear-negative-correlation feature was selected from the |FPL|-dimensional “Fingerprint” layer. Further analysis examined that related atoms and their neighboring atoms, with the most prominent contribution to this feature were reversely calculated out, which was called activation fragment. The activation fragment is highlighted in a drawing of each compound presented in category I. These highlighted fragments were considered by prediction models to be substructures most related to AOT, which an inference to toxicity fragments., After linear regression, the most negative correlation feature of the fingerprints was calculated, which represented the most toxic feature. 
Comparing activation values of this feature, nine values were determined to contribute most to feature activation. The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B).; wherein the association between each substructure, as represented by a sub-graph in a particular layer/iteration of the learning framework, and a respective level of toxicity is quantified through the application of either the regression (MLR) or multiclass (SVM) model in the form of a score or probability (relative strength of association), and these associations are ranked in the backward exploration to select the most prominent substructure/fragment contribution to the toxicity.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov for the same reasons as pointed out for claim 1.
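
As an aid to reading the output-layer operation quoted from Xu above (equation (1) followed by the softmax), the computation can be sketched as below. This is a minimal sketch under stated assumptions, not Xu's code: the activation f is assumed here to be a ReLU because the quoted passages do not identify it, and all weights, biases, and dimensions (|FPL| = 3, |HLS| = 2, four output classes) are hypothetical values chosen only so that HLS < FPL.

```python
import math


def matvec(vec, mat):
    """Multiply a row vector (length n) by an n x m matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def relu(values):
    """Assumed activation f; the quoted text does not specify which activation is used."""
    return [max(0.0, v) for v in values]


def score(z_x, W_H, b_H, W_O, b_O):
    """score = f(z_x W_H + b_H) W_O + b_O, following the quoted equation (1)."""
    hidden = relu([h + b for h, b in zip(matvec(z_x, W_H), b_H)])
    return [o + b for o, b in zip(matvec(hidden, W_O), b_O)]


def softmax(scores):
    """p(i|x) = exp(score_i) / sum_j exp(score_j), shifted by the max for numerical stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical fingerprint and parameters: |FPL| = 3, |HLS| = 2, 4 classes.
z_x = [1.0, 0.5, -1.0]
W_H = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
b_H = [0.0, 0.1]
W_O = [[0.3, -0.2, 0.1, 0.0], [0.1, 0.4, -0.3, 0.2]]
b_O = [0.0, 0.0, 0.0, 0.0]

class_scores = score(z_x, W_H, b_H, W_O, b_O)
class_probs = softmax(class_scores)
print(class_probs)
```

The hidden layer in this sketch maps the 3-dimensional fingerprint to 2 dimensions before the class scores are computed, which mirrors the FPL-to-HLS mapping discussed for claim 1 when HLS < FPL.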

Claims 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Xu, in view of Ivanov, and in further view of Harpaz et al. (“Biclustering of Adverse Drug Events in FDA’s Spontaneous Reporting System”, Clin. Pharmacol. Ther., February 2011, 89(2):243-250), hereinafter referred to as Harpaz.

In regards to claim 5, the rejection of claim 4 is incorporated and Xu and Ivanov do not further teach further comprising grouping substructures and related adverse drug reactions with biclustering. Although Xu makes use of the Tanimoto distance (p. 22, Table 5) for associating/correlating deep fingerprints (substructures) with known molecular topological structure-based fingerprints as a demonstration of the interpretability of his analytical framework, he does not explicitly disclose a biclustering operation for grouping different structures/substructures according to their ADRs. Although Ivanov (p. 65) discusses the clustering of chemicals according to ADR mechanisms, he also does not explicitly disclose “biclustering”.
However, Harpaz, in the analogous environment of associating adverse drug events to drug molecular structure, teaches further comprising grouping substructures and related adverse drug reactions with biclustering. ([p. 7, Section 4, Biclustering],  The clustering data matrix consisted of m rows each representing a different drug, and n columns each representing a different AE, where each cell in the matrix initially contained GPS’ EBGM association strength value corresponding to the i-th drug and the j-th AE. This matrix was then transformed into a binary data matrix, where each cell contained either a 1 or 0, representing the states of “strongly associated” or “weakly associated” respectively. The transformation was performed by selecting an “association strength threshold”, and reflects the fundamental idea that drugs and AEs are either associated due to a causal relationship or not, and that in reality discovering this type of relationship is the ultimate goal., wherein biclustering is used to identify associations between drug groups and adverse reaction/events to quantify that association and provide insight into the causality of adverse drug events.).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu and Ivanov to incorporate the teachings of Harpaz to have applied biclustering of drug features/substructures and their related toxicities/adverse reactions either for model evaluation clustering or for analysis of the association between learned substructures and toxicities/adverse reactions. The modification would have been obvious because one of ordinary skill would have been motivated to use biclustering to provide causal insight into drug/drug feature groups and toxicity/adverse reaction modes in order to potentially infer new or previously unrecognized adverse drug events. (Harpaz, [Abstract, pp. 7-8, Section 4]).
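
The matrix transformation quoted from Harpaz above (association strengths thresholded into “strongly associated” and “weakly associated” states) can be illustrated with the following minimal sketch. It is offered under stated assumptions only: the threshold comparison is assumed to be greater-than-or-equal, the association values and names are hypothetical, and the grouping step is a deliberately naive stand-in (drugs sharing an identical set of strongly associated events) rather than the biclustering algorithm used by Harpaz.

```python
from collections import defaultdict


def binarize(matrix, threshold):
    """Map each association strength to 1 (strongly associated) or 0 (weakly associated)."""
    return [[1 if value >= threshold else 0 for value in row] for row in matrix]


def naive_biclusters(binary, drugs, events):
    """Group drugs that share the same set of strongly associated events.

    Returns (drug group, event group) pairs containing at least two drugs.
    This is a simplified illustration, not a general biclustering algorithm.
    """
    groups = defaultdict(list)
    for drug, row in zip(drugs, binary):
        profile = tuple(event for event, bit in zip(events, row) if bit)
        if profile:
            groups[profile].append(drug)
    return [(members, list(profile)) for profile, members in groups.items() if len(members) > 1]


# Hypothetical drugs, adverse events, and association strengths.
drugs = ["drug_a", "drug_b", "drug_c"]
events = ["nausea", "rash"]
strengths = [[3.1, 0.4], [2.8, 0.2], [0.5, 4.0]]

binary = binarize(strengths, threshold=2.0)
biclusters = naive_biclusters(binary, drugs, events)
print(binary)
print(biclusters)
```

In the proposed combination, rows would correspond to learned substructures rather than whole drugs; the sketch keeps drug rows only to match the quoted description.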

In regards to claim 6, the rejection of claim 5 is incorporated and Xu further teaches further comprising outputting a predicted drug-adverse drug reaction association. ([pp. 7-8, “MGE-CNN”, p. 12, Forward and backward exploration of Fingerprints, Figure 1A, Figure 1B] Then z_x is used as input of the subsequent neural network in the output layer for executing the following operation: score = f(z_x W_H^output + b_H^output) W_O^output + b_O^output (1) where W_H^output ∈ R^(|FPL|×|HLS|) is the weight matrix of hidden layer in the output layer, W_O^output ∈ R^(|HLS|×d_all^out) is the weight matrix of output layer in the output layer, and b_H^output ∈ R^(1×|HLS|) and b_O^output ∈ R^(1×d_all^out) are bias terms. d_all^out = 1 for RMs, d_all^out = 4 for MCMs. The 4-dimensional vector is transformed with softmax function representing the probability of four classes. p(i|x) = e^(score(x)_i) / ∑_(j=1)^4 e^(score(x)_j) is the probability of category i, where score(x)_i is the score for category i…. During “Fingerprint analysis”, the well-trained deep fingerprints of small molecules were used to develop shallow models, MLR and SVM, to predict AOT values or labels., In order to determine what these models actually predict, the forward and backward exploration approach was applied for “Fingerprint” layer. The forward exploration was implemented by extracting the values of “Fingerprint” layer (deep fingerprints) to construct MLR and SVM models., wherein a predicted drug-adverse drug reaction association with substructures/fingerprints is computed during the forward exploration operation, in which an MLR or SVM (multi-class classifier) is used to predict the toxicity (AOT) associated with each fingerprint (molecular substructure) discerned from the deep learning/CNN architecture.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov and Harpaz for the same reasons as pointed out for claims 1 and 5, respectively.
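For illustration only, the output-layer computation quoted from Xu in the rejection of claim 6 above (equation 1 followed by the softmax) can be sketched as follows. This is a minimal sketch and not Xu's code: the activation f is not identified in the quoted passage, so ReLU is assumed here, and the helper names (matvec, relu, softmax, output_layer_score) and all numeric values are hypothetical.

```python
import math

def matvec(row, matrix):
    """Multiply a 1 x n row vector by an n x m matrix given as a list of rows."""
    return [sum(row[i] * matrix[i][j] for i in range(len(row)))
            for j in range(len(matrix[0]))]

def relu(v):
    # Assumed activation; the quoted passage only names it "f".
    return [max(0.0, x) for x in v]

def softmax(scores):
    """p(i|x) = exp(score_i) / sum_j exp(score_j), computed stably."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def output_layer_score(z_x, W_H, b_H, W_O, b_O, f=relu):
    """score = f(z_x W_H + b_H) W_O + b_O, per equation (1) as quoted."""
    hidden = f([h + b for h, b in zip(matvec(z_x, W_H), b_H)])
    return [s + b for s, b in zip(matvec(hidden, W_O), b_O)]

# Toy sizes (hypothetical): FPL = 3, HLS = 2, d_out = 4 classes.
z_x = [0.5, 1.0, -0.5]                      # 1 x FPL fingerprint
W_H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # FPL x HLS
b_H = [0.0, 0.0]                            # 1 x HLS
W_O = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0]]                # HLS x d_out
b_O = [0.0, 0.0, 0.0, 0.0]                  # 1 x d_out

score = output_layer_score(z_x, W_H, b_H, W_O, b_O)
probs = softmax(score)                      # class probabilities
```

With d_out = 4 the softmax yields one probability per class, corresponding to the multi-class case described in the quoted passage; with d_out = 1 the score itself would serve as the regression output.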

In regards to claim 7, the rejection of claim 5 is incorporated and Xu further teaches further comprising outputting a substructure-adverse drug reaction map. ([p. 23, “Backward Exploration of Fingerprints”, Figure 4, Table 6] Comparing activation values of this feature, nine values were determined to contribute most to feature activation. The nine values could be mapped into different substructures, thereby suggesting that these substructures were the most correlative to the explored toxicity feature (Figure 4A and 4B). There were mainly two classes of highlighted fragments, α,β-Unsaturated nitriles (TA626) and alyl (thio)phosphates (TA776) for RM4, while TA776 and thiocarbonyl (TA374) for CM1. The three fragments have been reported to be toxicity structural alerts.; wherein results are generated/outputted in the form of figures of pertinent fingerprint (molecular fragment) substructures from the analysis, through the neuronal activations in the deep neural network, of the toxicity/adverse reaction association with compound structures, and wherein these results are maps of the pertinent substructures as they are structurally and contextually represented in each compound/drug.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Xu to incorporate the teachings of Ivanov and Harpaz for the same reasons as pointed out for claims 1 and 5, respectively.

Response to Arguments
Applicant's arguments filed 1 July 2021 have been fully considered but they are not persuasive. 

Specifically, Applicant argues:
Xu teaches that "Firstly, given an input SMILES string (x), a molecular structural graph is converted by the RDKit toolbox. The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector..., then these vectors are summed...representing this molecule. Then z_x is used as input of the subsequent neural network in the output layer for executing the following operation... The 4-dimensional vector is transformed with softmax function representing the probability of four classes." (Page 7, internal citations omitted). With respect to fingerprint analysis, Xu further teaches that "During 'Fingerprint analysis', the well-trained deep fingerprints of small molecules were used to develop shallow models, MLR and SVM, to predict AOT values or labels. Simultaneously, the most relevant feature among deep fingerprint for each compound was calculated based on linear regression with least squares fitting, then traced back to the atomic level, and mapped onto AOT activation fragments. These activated fragments were then used to compare with reported toxicity alerts (TAs) to validate the inference capability for TAs." (Page 7).
Applicant respectfully notes that identifying the most relevant features (e.g., as calculated based on linear regression with least squares fitting) and tracing back to the atomic level for mapping onto activation fragments is not the same as "wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints for each of the plurality of drug chemical structures using a plurality of hidden layers and generating a convolutional feature map comprising a plurality of convolutional steps, wherein each convolutional step encodes a respective neighborhood-based fingerprint at a respective hidden layer, wherein each fingerprint is obtained by starting from multiple atoms belonging to said fingerprint and resultant redundancies are removed by mapping each fingerprint into a lower dimension using a single layer of the deep learning framework," as currently recited. In short, Xu does not teach or fairly contemplate a series of convolutional steps where each fingerprint is obtained multiple times by starting from different atoms of the structure or where the resultant redundancies are removed in the manner claimed. Instead, Xu teaches tracing back to the atomic level once the most relevant feature is already determined. (See, e.g., pages 7, 8, and 12-13, "the most relevant feature among deep fingerprint for each compound was calculated based on linear regression with least squares fitting, then traced back to the atomic level, and mapped onto AOT activation fragments"... "the backward exploration is that after linear regression, the most linear-negative-correlation features [is] selected"). In other words, Xu does not appear to appreciate that one could build fingerprints from different centers or offer any solution for addressing the redundancies that emerge from such a process. 
AMENDMENT AND RESPONSE TO NON-FINAL OFFICE ACTION 
Page 8 of 9 Serial Number: 15/494,027
Examiner’s Response
The Examiner respectfully disagrees and notes that, during examination, a claim must be given its broadest reasonable interpretation consistent with the specification (MPEP 2173.01(I), MPEP 2111.01(II)). Specifically, Xu does teach “wherein constructing the deep learning framework comprises defining a plurality of neighborhood-based fingerprints for each of the plurality of drug chemical structures using a plurality of hidden layers and generating a convolutional feature map comprising a plurality of convolutional steps, wherein each convolutional step encodes a respective neighborhood-based fingerprint at a respective hidden layer,” because Xu teaches a CNN-based deep learning framework (Figure 1A) in which fingerprints (the z’s in the algorithm pseudocode of Figure 1B, especially z_x formed in the “fingerprint layer” but also including Z_layer and z_Ll formed across the layers of the CNN) corresponding to each compound (each molecule with m atoms) are formed/encoded using each layer of a CNN, such that distinct fingerprints are encoded for each atom “a” in that molecule along with its respective neighbors in that molecule (lines 12-20 of Figure 1B). Each fingerprint has dimension FPL (the feature dimension), and the fingerprints thereby formed relative to different atoms in that molecule are pooled across these atoms and CNN layers (lines 20-26 of Figure 1B) to generate an output fingerprint z_x (fingerprint layer) that is subsequently used to determine the score (equation 1) associated with substructure/fingerprint/subgraph-toxicity. (viz., [p. 7, “MGE-CNN”, p. 20 “Forward Exploration of Fingerprints”, Figure 1A, Figure 1B, Table 2] The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector z_Ll ∈ R^|FPL|, l ∈ {1, 2, ..., |FPD|}, then these vectors are summed as z_x ∈ R^|FPL| representing this molecule. Then z_x is used as input of the subsequent neural network in the output layer. ... For this purpose, fingerprints were extracted from the “Fingerprint” layer in the well-trained deep models, then the whole data set was transferred into a matrix of N (number of compounds) × FPL, which was a featurization and vectorization process for compounds.)
Furthermore, Xu also teaches “wherein each fingerprint is obtained by starting from multiple atoms belonging to said fingerprint and resultant redundancies are removed by mapping each fingerprint into a lower dimension using a single layer of the deep learning framework” because, as noted above (and seen in Figure 1B, lines 12-24), Xu teaches that fingerprints are formed for each atom and its neighbors for each (hidden) layer of the CNN and pooled (see Figure 1A, especially the summation over the softmax output layer), with the resultant fingerprint output then passing through a single (hidden) layer, indicated by the application of the transformation W_H^output having dimension FPL x HLS, prior to classification or regression (see lines 2 and 28 of Figure 1B as well as the previous discussion on the computation of the score). This hidden layer reduces the dimensionality of the fingerprint from FPL to HLS for hyperparameter test configurations in which HLS < FPL, which, as can be seen in Table 2, is the case for 5 of the 10 best performing multi-classification models and also for 4 of the 5 of these models for which both test accuracy metrics exceeded 90% (viz., CM2, CM3, CM5, CM8). (viz., [p. 7, “MGE-CNN”, p. 20 “Forward Exploration of Fingerprints”, Figure 1A, Figure 1B] The sub-graph from each layer (or iteration) is encoded into a fixed-sized vector z_Ll ∈ R^|FPL|, l ∈ {1, 2, ..., |FPD|}, then these vectors are summed as z_x ∈ R^|FPL| representing this molecule. Then z_x is used as input of the subsequent neural network in the output layer. ... For this purpose, fingerprints were extracted from the “Fingerprint” layer in the well-trained deep models, then the whole data set was transferred into a matrix of N (number of compounds) × FPL, which was a featurization and vectorization process for compounds.)
It is noted that Xu teaches the underlying functionality of mapping the fingerprints to a lower dimension via the single layer, which inherently removes the redundancy (i.e., the effect of the mapping is inherently the removal of redundancy).
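For illustration only, the reading of Xu set out in the Examiner's Response above (per-atom neighborhood encodings pooled across atoms and layers into a fingerprint of length FPL, followed by a single layer mapping FPL to HLS with HLS < FPL) can be sketched as follows. This is a simplified sketch under stated assumptions and is not the pseudocode of Xu's Figure 1B: the neighbor aggregation by summation, the tanh update, the weight names W_mix and W_fp, the function names, and the toy molecule are all hypothetical.

```python
import math

def matvec(row, matrix):
    """Multiply a 1 x n row vector by an n x m matrix given as a list of rows."""
    return [sum(row[i] * matrix[i][j] for i in range(len(row)))
            for j in range(len(matrix[0]))]

def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def molecule_fingerprint(atom_feats, neighbors, layers, fpl):
    """Pool per-atom, per-layer neighborhood encodings into one length-FPL vector."""
    z_x = [0.0] * fpl
    feats = [list(f) for f in atom_feats]
    for W_mix, W_fp in layers:                 # one convolutional step per layer
        new_feats = []
        for a, own in enumerate(feats):        # start from every atom in turn
            pooled = list(own)
            for n in neighbors[a]:             # add features of bonded neighbors
                pooled = [p + q for p, q in zip(pooled, feats[n])]
            updated = [math.tanh(x) for x in matvec(pooled, W_mix)]
            new_feats.append(updated)
            contribution = softmax(matvec(updated, W_fp))
            z_x = [z + c for z, c in zip(z_x, contribution)]
        feats = new_feats
    return z_x

def reduce_dimension(z_x, W_H):
    """Single layer mapping the fingerprint from FPL to HLS dimensions."""
    return matvec(z_x, W_H)

# Toy 3-atom chain 0-1-2 with 2 features per atom (hypothetical values).
atom_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
W_mix = [[0.5, -0.5], [0.25, 0.75]]
W_fp = [[1.0, 0.0, -1.0, 0.5], [0.0, 1.0, 0.5, -1.0]]
layers = [(W_mix, W_fp), (W_mix, W_fp)]
FPL, HLS = 4, 2

z_x = molecule_fingerprint(atom_feats, neighbors, layers, FPL)
W_H = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]   # FPL x HLS, HLS < FPL
hidden = reduce_dimension(z_x, W_H)

# Relabeling the atoms (reversing the chain) leaves the pooled fingerprint unchanged.
z_x_relabelled = molecule_fingerprint(atom_feats[::-1], neighbors, layers, FPL)
```

Because the per-atom contributions are summed, the pooled vector does not depend on which atom is visited first; whether that pooling and the subsequent FPL-to-HLS mapping meet the claim language is the question addressed in the response above, and the sketch does not resolve it.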

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
	
Steven Michael Kearnes  (“Finding Needles in a Haystack: Molecular Similarity and Machine Learning for Drug Discovery Applications”, PhD Dissertation, Stanford University, June 2016, pp. 1-281) teaches a convolution-based deep learning framework for deriving robust molecular graph fingerprints in which a fuzzy histogram technique is used to achieve (atomic) order invariance in these features/fingerprints.   

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT LEWIS KULP whose telephone number is (571)272-7983.  The examiner can normally be reached on M, Th, F 8-5:30; Tu 8-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on 571-272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ROBERT LEWIS KULP/Examiner, Art Unit 2122                                                                                                                                                                                                        
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122