DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status
Claims 1-27 are pending.
Claims 1-27 are under examination.

Specification
The disclosure is objected to because it contains embedded hyperlinks (page 22, lines 1, 22 and 29; page 23, line 4). Applicant is required to delete or modify the embedded hyperlinks; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-27 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
"Claims directed to nothing more than abstract ideas (such as a mathematical formula or equation), natural phenomena, and laws of nature are not eligible for patent protection" (MPEP 2106.04 § I). Abstract ideas include mathematical concepts, and procedures for evaluating, analyzing or organizing information, which are a type of mental process (MPEP 2106.04(a)(2)). The claims as a whole, considering all claim elements both individually and in combination, do not amount to significantly more than the abstract idea of “classifying genetic variants”.
Step 1: The Four Categories of Statutory Subject Matter (MPEP 2106.03)
The claims are directed to a method, which is one of the categories of statutory subject matter.
Step 2A, Prong One: Whether the Claims Set Forth or Describe a Judicial Exception (MPEP 2106.04 § II.A.1)
Mathematical concepts recited in the claims include “convolutional neural network”; “processing an input sequence”; “producing an intermediate convolved feature”; “metadata correlator”; "correlating the variant with a set of metadata features"; “fully-connected neural network”; "processing a feature sequence" and “outputting classification scores”.
Hence, the claims explicitly recite numerous elements that, individually and in combination, constitute abstract ideas. The claims must therefore be examined further to determine whether they integrate that abstract idea into a practical application (MPEP 2106.04(d)).
	Step 2A, Prong Two: Whether the Claims Contain Additional Elements that Integrate the Judicial Exception(s) into a Practical Application (MPEP 2106.04 § II.A.2)
Claim 1 recites an additional element that is not an abstract idea: "a variant classifier, running on one or more processors operated in parallel and coupled to memory", i.e., a generic computer. Claims 23 and 27 recite another additional element: “a non-transitory computer readable storage medium,” i.e., executable instructions to implement the abstract idea using a generic computer. The claims do not describe any specific computational steps by which the computer performs or carries out the abstract idea, nor do they provide any details of how specific structures of the computer are used to implement these functions. The claims state nothing more than that a generic computer performs the functions that constitute the abstract idea, and are therefore mere instructions to apply the abstract idea using a computer. As such, the claims do not integrate that abstract idea into a practical application (see MPEP 2106.04(d) § I; and MPEP 2106.05(f)).
None of the dependent or analogous (process) claims recite any additional non-abstract elements; they are all directed to further aspects of the information being analyzed, the manner in which that analysis is performed, or the mathematical operations performed on the information.
Because the claims recite an abstract idea, and do not integrate that abstract idea into a practical application, the claims are directed to that abstract idea. Claims that are directed to abstract ideas must be examined further to determine whether the additional elements besides the abstract idea render the claims significantly more than the abstract idea. Claims that are directed to abstract ideas and that raise a concern of preemption of those abstract ideas must be examined to determine what elements, if any, they recite besides the abstract idea, and whether these additional elements constitute inventive concepts that are sufficient to render the claims significantly more than the abstract idea (MPEP 2106.05).
Step 2B: Whether the Claims Contain Additional Elements that Amount to an Inventive Concept (MPEP 2106.05)
As explained above, the mere instructions to implement the abstract idea using a computer are, when considered individually, insufficient to constitute an inventive concept that would render the claims significantly more than an abstract idea (see MPEP 2106.05(f)).
When the claims are considered as a whole, they do not integrate the abstract idea into a practical application; they do not confine the use of the abstract idea to a particular technology; they do not solve a problem rooted in or arising from the use of a particular technology; they do not improve a technology by allowing the technology to perform a function that it previously was not capable of performing; and they do not provide any limitations beyond generally linking the use of the abstract idea to a broad technological environment (i.e. computerized analysis of sequence data). See MPEP 2106.05(a) and 2106.05(h).

Conclusion: Claims are Directed to Non-statutory Subject Matter
For these reasons, the claims, when the limitations are considered individually and as a whole, are directed to an abstract idea and lack an inventive concept. Hence, the claimed invention does not constitute significantly more than the abstract idea, so the claims are rejected under 35 USC § 101 as being directed to non-statutory subject matter.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-27 are rejected under 35 U.S.C. 103 as being unpatentable over Xiong et al. (WO 2018/006152; published 1/11/2018; on IDS of 24 Mar 2022) in view of Patterson & Gibson (Deep learning: A practitioner’s approach. O’Reilly Media, Inc.; published July 2017), Ramachandram & Taylor (IEEE Signal Processing Magazine, 34(6): pp. 96-108; published 11/13/2017) and Torracinta et al. (bioRxiv, 079087; published 11/4/2016; on IDS of 24 Mar 2022).  
Claim 1 is directed to a system comprising a "variant classifier, running on one or more processors operating in parallel and coupled to a memory", comprising
(a)	"a convolutional neural network having at least two convolutional layers and each [layer] having at least five convolutional filters trained over one thousand to millions of gradient update iterations to
	(i)	process an input sequence …" and
	(ii)	"produce an intermediate convolved feature";
(b)	"a metadata correlator that correlates the variant with…metadata features which represent”
	(i) “mutation characteristics of the variant”,
	(ii) “read mapping statistics of the variant”, and
	(iii) “occurrence frequency of the variant”;
and
(c)	“a fully-connected neural network having at least two fully-connected layers trained over the one thousand to millions of gradient update iterations to
	(i) 	process a feature sequence derived from a combination of the intermediate convolved feature and the metadata features” and
	(ii) 	“output classification scores for likelihood that the variant is… somatic… germline… or noise”.
Claim 22 is directed to the method performed by said system.
Claim 23 is directed to a computer program product that implements said method.
With respect to claims 1 and 22-23, Xiong teaches “neural networks [that] take as input biological sequences and additional information and output molecular phenotypes" (Abstract), comprising  “at least three layers” (pg. 23, claim 1a) with “one or more… configured as convolutional layers” (pg. 23, claim 1b), and having 
“a first layer… configured to” (pg. 23 claim 1a)
“obtain a biological sequence” (pg. 23 claim 1a) and
“produce outputs” (pg. 23 claim 1a),
 “each layer other than the first layer configured to” (pg. 23, claim 1a)
 “receive inputs from the produced outputs of one or more prior layers” (pg. 23 claim 1a) and
“produce outputs” (pg. 23 claim 1a),
		wherein “at least one… configured as a fully connected layer” (pg. 24, claim 5);
“a last layer representing a molecular phenotype… configured to” (pg. 23, claim 1a)
“receive inputs from the produced outputs of one or more prior layers” (pg. 23 claim 1a) and
“produce outputs” (pg. 23 claim 1a);
a final output of “values that represent… computed relevance scores” (pg. 13, section 0058; cover pg., Figure 1);  
use of “position dependent tracks obtained with structural, biochemical, population and evolutionary data of biological sequences,” (pg. 19, section 0091), including “tracks of common and rare mutations in human populations” (pg. 20 section 0093), i.e. mutation characteristics of the variant, occurrence frequency of the variant; and
functions that “identify significant filter responses while repressing spurious responses caused by insufficient… matches between the filters and input sequences” (pg. 11, section 0051), i.e. identification of noise.
Xiong does not teach at least two convolution layers; at least two fully-connected layers; at least five convolutional filters per convolutional layer; or gradient updating performed over at least one thousand iterations.
Patterson teaches “number of layers” as a “network architecture decision” to be made based on “memory requirements” (pg. 418). Patterson further teaches numerous other “parameters we tune to make networks train better and faster… called hyperparameters” (pg. 54). One such is “filter count… a hyperparameter value for each convolutional layer… [which] can be chosen freely” (pg. 254).  Thus, Patterson teaches optimizing the number of layers and convolutional filters per layer.
Patterson further states that “an epoch is a full pass over the entire input dataset” and that it is standard to “train on multiple epochs of a dataset before finding training convergence” (pg. 493), i.e. the number of epochs is an optimized variable. Patterson similarly teaches optimizing a parameter called “mini-batch size”, teaching disadvantages to using a batch size either too small or too large for the desired task (pg. 62). Patterson further states that “number of parameter updates per epoch is just the total number of examples in [the] training set divided by the mini-batch size” (pg. 495), i.e., the number of gradient update iterations is a function of the number of training examples, mini-batch size, and epochs, all of which are parameters that are routinely optimized. Thus, Patterson teaches optimizing the number of iterations over which gradient updating is performed. Given such teachings, these differences between the claimed invention and the prior art amount to routine optimization of parameters within prior art ranges, which is insufficient to patentably distinguish the invention from the prior art (MPEP 2144.05 § II).
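For illustration only, the arithmetic quoted from Patterson can be sketched as follows. The function name and the example figures are hypothetical and do not come from any cited reference; rounding a final partial mini-batch up to one additional update is likewise an assumption, since the quoted passage describes a simple division.

```python
import math


def gradient_update_iterations(num_examples: int, batch_size: int, epochs: int) -> int:
    """Total parameter updates = (updates per epoch) * (number of epochs)."""
    if num_examples <= 0 or batch_size <= 0 or epochs <= 0:
        raise ValueError("all arguments must be positive integers")
    # Assumption: a final, partially filled mini-batch still triggers one update.
    updates_per_epoch = math.ceil(num_examples / batch_size)
    return updates_per_epoch * epochs


# Hypothetical figures: 64,000 training examples, mini-batch size 128, 10 epochs.
print(gradient_update_iterations(64_000, 128, 10))  # 5000
```

Under these assumed figures, a modest training run already exceeds one thousand gradient update iterations.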
Neither Xiong nor Patterson teaches combining convolved features with metadata features prior to fully-connected layers.
	Ramachandram discusses neural network-implemented methods using multimodal data, “data from different sensors observing a common phenomenon” in which the data is used “in a complementary manner toward learning a complex task” (pg. 96). Ramachandram states that the use of multimodal data “yield[s] a richer representation that could be used to produce much improved performance compared to using only a single modality” (pg. 97, l. column). It compares conventional multimodal learning (CML) with deep multimodal learning (DML) and discusses several functional advantages of the latter (pg. 98, Table 1). One discussed advantageous routine feature of DML is implementation of “intermediate fusion,” wherein features are combined after the initial layers of the system and prior to the final layers and output (pg. 98, Table 1; pg. 103, Figure 2). In this way, Ramachandram teaches combining convolved features with metadata features prior to final (in the case of Xiong, fully-connected) layers. In discussing the general applicability of such methods to neural network-implemented systems, Ramachandram exemplifies studies that utilize intermediate fusion of “gene expression, DNA methylation, and drug response” features, i.e. biological metadata as in Xiong, Torracinta, and the instant application (pg. 101, Table 3).
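For illustration only, intermediate fusion of the kind described above might be sketched as follows. This is not the implementation of Ramachandram, Xiong, or the instant application. The layer sizes, the use of global max pooling, fusion by simple concatenation, the random untrained weights, and every function and variable name are assumptions made for the example.

```python
import math
import random

BASES = "ACGT"


def one_hot_sequence(seq):
    """Encode a DNA string as a list of 4-channel one-hot rows."""
    return [[1.0 if b == base else 0.0 for b in BASES] for base in seq]


def conv1d_relu(rows, filters):
    """'Valid' 1-D convolution followed by ReLU.

    rows: list of positions, each a list of channel values.
    filters: list of filters, each a list of K rows of per-channel weights.
    """
    k = len(filters[0])
    out = []
    for start in range(len(rows) - k + 1):
        window = rows[start:start + k]
        out.append([
            max(0.0, sum(w * x
                         for frow, xrow in zip(f, window)
                         for w, x in zip(frow, xrow)))
            for f in filters
        ])
    return out


def global_max_pool(feature_map):
    """Collapse the position axis, keeping one value per filter."""
    return [max(col) for col in zip(*feature_map)]


def dense(vec, weights, biases, relu=True):
    """Fully-connected layer: one weight row and one bias per output unit."""
    out = [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, n_rows, n_cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_cols)] for _ in range(n_rows)]


def make_params(rng, n_meta, n_filters=5, kernel=3, hidden=8, n_classes=3):
    """Random, untrained weights (assumption: sizes chosen only for the demo)."""
    return {
        "conv1": [rand_matrix(rng, kernel, len(BASES)) for _ in range(n_filters)],
        "conv2": [rand_matrix(rng, kernel, n_filters) for _ in range(n_filters)],
        "fc1": (rand_matrix(rng, hidden, n_filters + n_meta), [0.0] * hidden),
        "fc2": (rand_matrix(rng, n_classes, hidden), [0.0] * n_classes),
    }


def classify(seq_rows, metadata, params):
    h = conv1d_relu(seq_rows, params["conv1"])
    h = conv1d_relu(h, params["conv2"])
    convolved = global_max_pool(h)        # intermediate convolved feature
    fused = convolved + list(metadata)    # intermediate fusion by concatenation
    h = dense(fused, *params["fc1"])
    return softmax(dense(h, *params["fc2"], relu=False))


rng = random.Random(0)
params = make_params(rng, n_meta=3)
# Hypothetical metadata values standing in for three metadata features.
scores = classify(one_hot_sequence("ACGTTGCAACGT"), [0.12, 37.0, 0.004], params)
print(dict(zip(("somatic", "germline", "noise"), scores)))
```

Because the weights are untrained, the printed scores carry no meaning; the sketch shows only where in the data flow the metadata features join the convolved features.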
None of Xiong, Patterson, or Ramachandram teaches metadata features specifying read mapping statistics; or prediction of likelihood that a given variant is somatic or germline.
Torracinta teaches "an adaptive approach to calling somatic variations" that uses "a deep feed-forward neural network with semi-simulated data" (Abstract). This neural network-implemented system incorporates metadata features specifying read mapping statistics, determined via sequence alignment to a reference genome (pg. 4, lines 123-129). Torracinta further teaches “model probability (that [a given] site is a somatic mutation)” (pg. 6, Figure 3, caption) as final output, and a “pair design (germline/somatic)” feature (pg. 11, lines 337-338), i.e. prediction of likelihood that a given variant is somatic or germline. Claims 22 and 23 share the same limitations as claim 1 and amount to process and composition implementations, respectively, of the same invention. Thus, these claims are similarly made obvious by Xiong in view of Patterson, Ramachandram and Torracinta.
With respect to claims 2-12, as stated above, Xiong teaches the use of “position dependent tracks obtained with structural, biochemical, population and evolutionary data of biological sequences,” (pg. 19, section 0091) including
“protein secondary structure” (pg. 20 section 0093), i.e. whether a variant is nonsynonymous, impact on protein function, 
“tracks of common and rare mutations in human populations” (pg. 20 section 0093), i.e. allele frequencies in sequenced populations and ethnic sub-populations, likelihoods of identifying ethnic makeup of an individual expressing the variant, and 
“evolutionary conservation scores” (pg. 20 section 0093), i.e. conservativeness of the variant across multiple species.
 	which may be applied to “intermediate-level convolutional filters that act on feature maps generated by lower level convolutional filters” (pg.20, section 0097). 
	Xiong does not teach metadata features specifying: whether the variant is a single-nucleotide polymorphism, an insertion, or a deletion; quality parameters of read mapping; the variant’s clinical effect, drug sensitivity and histocompatibility; frequency of the variant in sequenced cancerous tumors; and at least one base mutated by the variant at the target position in a reference sequence.
	Torracinta teaches use of metadata features including “data collected at each site for observed bases or indels” versus “the genotype of the reference” (pg. 5, Figure 2, caption), i.e. whether the variant is a single-nucleotide polymorphism, insertion or deletion, base mutations at target positions versus reference genome, and “average probability of base error (derived from quality scores)” (pg. 5, Figure 2, caption) i.e. quality parameters of read mapping. Torracinta suggests use of their method “to identify sites of somatic variations in… tumors” (pg. 1, lines 43-44) and teaches a system wherein the output includes information on the probability of a site being mutated given the prior distribution of mutated sites in the training set (pg. 9, lines 247-248). Thus, one embodiment of the system evaluates frequency of the variant in sequenced cancerous tumors.
	Neither Xiong nor Torracinta teaches metadata features specifying the variant’s clinical effect, drug sensitivity and histocompatibility.
	Ramachandram teaches “models that implement multimodal fusion in biomedical applications involving genomic, proteomic, and drug data” (pg. 100, l. column), i.e. neural network-implemented systems utilizing fusion of features such as clinical effect, drug sensitivity and histocompatibility.
	With respect to claims 13-14, Xiong teaches a neural network-implemented system wherein all neural network components, including convolutional and fully-connected layers, are trained end-to-end (pg. 31, Figure 4).
	With respect to claims 15-16, Patterson states that “rectified linear units (ReLU) are the current state of the art because they have proven to work in many different situations” (pg. 131) and that “deep networks using ReLU activation functions… train well without using pretraining techniques” (pg. 132). Patterson additionally states that “to accelerate training in CNNs [one] can normalize the activations of the previous layer at each batch” (pg. 264) and that such “batch normalization in CNNs has been shown to speed up training… reduc[ing] the sensitivity of training towards weight initialization and… reducing the need for other types of regularization” (pg. 264). In this way, Patterson teaches general advantage to the use of rectified linear unit layers and batch normalization layers in neural network architecture.
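For illustration only, the two operations discussed with respect to claims 15-16 can be sketched as follows. This is a minimal sketch and not Patterson's code: the learned scale and shift parameters and the running statistics used by practical batch normalization layers are omitted, and the function names and sample values are hypothetical.

```python
import math


def relu(values):
    """Rectified linear unit: negative activations are clamped to zero."""
    return [max(0.0, v) for v in values]


def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) across the mini-batch.

    Each feature is shifted to zero mean and scaled to approximately unit
    variance; eps guards against division by zero.
    """
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(c) / n for c in columns]
    variances = [sum((v - m) ** 2 for v in c) / n for c, m in zip(columns, means)]
    return [
        [(v - m) / math.sqrt(var + eps) for v, m, var in zip(row, means, variances)]
        for row in batch
    ]


# Hypothetical activations of a previous layer: 2 examples, 2 features.
activations = [[1.0, -2.0], [3.0, 2.0]]
normalized = batch_norm(activations)
rectified = [relu(row) for row in normalized]
print(normalized)
print(rectified)
```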
	With respect to claim 17, Xiong exemplifies use of an input “subsequence of length 600 nucleotides centered at the 3 prime end of exon 6” having a “single-nucleotide substitution located 100 nucleotides from the 3 prime splice site of exon 6” (pg. 8, sections 0038-0039). This renders an input wherein the variant is flanked by at least 19 bases on each side. Xiong additionally teaches that “the functional meaning of a variant is context dependent”, and “the neighboring sequence may be different in two different patients”, rendering “the variant [itself] different” (pg. 8, section 0039). This gives clinical importance to “viewing [this] substitution in… different situations” to understand “impact on gene expression” (pg. 8, section 0039). Three examples of contexts to view said substitution in are provided: the scale of “splice site region variants of length 600 nucleotides”, the scale of “BRCA1 [gene] variants of length 81,189 nucleotides” or the scale of “chromosome 17 variants of length 83 million nucleotides” (pg. 8, section 0039). This suggests that the number of flanking bases should be routinely adjusted by researchers depending on clinical objectives.
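For illustration only, an input sequence in which a variant is flanked by a chosen number of bases on each side might be assembled as follows. The function name, the 0-based coordinate convention, the restriction to single-nucleotide substitutions, and the toy reference sequence are all assumptions for the example and are not taken from Xiong.

```python
def flanked_window(reference: str, variant_pos: int, alt_base: str, flank: int) -> str:
    """Return the variant base at the center with `flank` reference bases per side.

    Assumptions: `variant_pos` is a 0-based index into `reference`, and the
    variant is a single-nucleotide substitution.
    """
    if flank < 0 or len(alt_base) != 1:
        raise ValueError("flank must be non-negative and alt_base a single base")
    start, end = variant_pos - flank, variant_pos + flank + 1
    if start < 0 or end > len(reference):
        raise ValueError("reference too short for the requested flank")
    return reference[start:variant_pos] + alt_base + reference[variant_pos + 1:end]


# Toy reference of 60 bases; position 30 holds "G" and is substituted with "T".
reference = "ACGT" * 15
window = flanked_window(reference, variant_pos=30, alt_base="T", flank=19)
print(len(window), window[19])  # 39 T
```

The `flank` argument is the adjustable quantity: changing it widens or narrows the sequence context presented to the network.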
	With respect to claims 18-19, Torracinta teaches concatenation of sequence and metadata features. The concatenation is rendered as “a fixed-length input vector” (pg. 11, lines 328-331), i.e. one-dimensional array. 
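For illustration only, concatenating sequence and metadata features into a fixed-length one-dimensional array might look like the sketch below. It is not Torracinta's feature encoding: the one-hot scheme, the function names, and the three metadata values are hypothetical.

```python
BASES = "ACGT"


def one_hot(base: str) -> list:
    """Four-element indicator vector for a single base."""
    return [1.0 if base == b else 0.0 for b in BASES]


def build_input_vector(window: str, metadata: list) -> list:
    """Flatten the one-hot sequence and append the metadata features.

    For a fixed window length and a fixed number of metadata features,
    the result always has len(window) * 4 + len(metadata) elements.
    """
    flat_sequence = [x for base in window for x in one_hot(base)]
    return flat_sequence + [float(m) for m in metadata]


# Hypothetical metadata values standing in for three metadata features.
vector = build_input_vector("ACGTA", [0.12, 37.0, 0.004])
print(len(vector))  # 23
```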
	With respect to claim 20, Patterson teaches a standard mathematical operation where two input vectors are combined to render a “tensor product” (pg. 23). They define a tensor as “a multidimensional array… of rank 3 or above” (pg. 21), “a matrix of three or more dimensions” (pg. 301) and later exemplify a “four-dimensional tensor input for CNNs” (pg. 598). Patterson thereby teaches input sequences encoded as >2-dimensional arrays.
	With respect to claim 21, as explained above, Patterson teaches optimizing the number of convolutional filters per layer.
	Claims 24-25 are directed to the system of claims 2-12, wherein “the input sequence has a variant… and has a set of metadata features correlated with the variant”. In other words: the metadata features are input initially, along with the sequence data. 
	With respect to claims 24-25, Xiong teaches a neural network-implemented system where the initial convolutional layers “take as input biological sequences and additional information” (cover pg., Abstract and Figure 1).
	Claims 26 and 27 share the same limitations as claim 24 and amount to process and composition implementations, respectively, of the same invention. Thus, these claims are similarly made obvious by Xiong, Patterson, Torracinta and Ramachandram.
	An invention would have been obvious to one of ordinary skill in the art if it simply applies known techniques to a known method. Prior to the time of invention, said practitioner could have optimized the number of layers and convolutional filters, as is standard in the art and taught as such by Patterson, in the neural network system of Xiong. Given that Patterson teaches general advantage to such techniques for all neural network-implemented systems, and the system of Xiong is a neural network-implemented system, said practitioner would have readily predicted that the implementation of such techniques would successfully result in an enhanced neural network-implemented system.
	An invention would have been obvious to one of ordinary skill in the art if some teaching in the prior art would have led that person to combine prior art reference teachings to arrive at the claimed invention. Prior to the time of invention, said practitioner would have used intermediate feature injection, as taught by Ramachandram, to enhance the neural network-implemented system of Xiong. Said practitioner would have readily predicted that the neural network-implemented system implementing said intermediate feature injection would successfully operate in the manner documented.
	An invention would have been obvious to one of ordinary skill in the art if some teaching in the prior art would have led that person to combine prior art reference teachings to arrive at the claimed invention. Prior to the time of invention, said practitioner would have used the metadata features and somatic/germline likelihood calculation taught by Torracinta to enhance the neural network-implemented system of Xiong. They would have implemented said metadata features by means of intermediate feature injection, as taught by Ramachandram. Said practitioner would have readily predicted that the neural network-implemented system implementing said features and calculation would successfully operate in the manner documented.
	In this way the disclosure of Xiong, in view of Patterson, Ramachandram and Torracinta, makes obvious the limitations of claims 1-27. Thus, these inventions are prima facie obvious.
Conclusion
	Claims 1-27 are rejected.
No claim is allowed. 	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Theodore C. Striegel whose telephone number is (571) 272-1860. The examiner can normally be reached Mondays and Wednesdays 7:30am-5:30pm ET, Tuesdays, Thursdays and every other Friday 8:00am-4:30pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karl Skowronek, can be reached at (571) 270-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/T.C.S./Examiner, Art Unit 1671
/Soren Harward/Primary Examiner, Art Unit 1671