Detailed Action
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Herein, “the previous office action” refers to the non-final rejection of 01/07/2022.

Claim Status
Claims 9 and 24 have been cancelled.
Claims 30-35 are newly added.
Claims 1-8, 10-23, and 25-35 are currently pending.


Withdrawn Rejections
The rejection of claim 29 under 35 USC §112(d) is hereby withdrawn in view of Applicant’s amendments, which changed the dependency of the claim.
All rejections of claims 9 and 24 are hereby withdrawn in view of Applicant’s amendments; cancellation of these claims moots the rejections.
The rejections under 35 USC §103 of claims other than claims 4-8, 10-11, 13, 15, 20-23, 26-27, and 29 are hereby withdrawn in view of Applicant’s amendments to claims 1 and 16. More specifically, the amended limitation of using a “confidence score” as a threshold when the decision trees make a decision (last two paragraphs of claim 1) is not disclosed by Kermani. Claims 3, 4, 18, 19, and 29 have dependency changes. The rejections of claims other than claims 4-8, 10-11, 13, 15, 20-23, 26-27, and 29 under 35 USC §103 are rewritten and reinstated in this office action in view of the amendments.



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8 and 23 recite the limitation "a neural network, Bayesian classifier, logistic regression, decision tree, gradient-boosted tree, multilayer perceptron, one-vs-rest, and Naive Bayes" in “wherein the machine learning classifier is selected from the group consisting of”. The independent claims (from which claims 8 and 23 depend) are limited to decision trees, and it is not clear how these other types of classifiers can be made into decision trees.
Claim 32 is rejected for lack of antecedent basis. Claim 32 should depend from claim 31 rather than claim 30, because the claim 32 limitation “alignment characteristics” has an antecedent in claim 31, not in claim 30. Appropriate correction is required.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-8, 10-23 and 25-35 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. This rejection is maintained and updated herein in response to the claim amendments.
The instant rejection reflects the Guidance published in the Federal Register notice titled 2019 Revised Patent Subject Matter Eligibility Guidance (Vol. 84, No. 4, Monday January 7, 2019 at 50) and the October 2019 Updated Subject Matter Eligibility Guidance (hereinafter both referred to as the "Guidance"), as outlined in the MPEP at 2106.04.

Framework with which to Evaluate Subject Matter Eligibility: 
Step 1: Are the claims directed to a process, machine, manufacture, or composition of matter; 
Step 2A, Prong One: Do the claims recite a judicially recognized exception, i.e. a law of nature, a natural phenomenon, or an abstract idea;  
Step 2A, Prong Two: If the claims recite a judicial exception under Prong One, then is the judicial exception integrated into a practical application (Prong Two); and 
Step 2B: If the claims do not integrate the judicial exception, do the claims provide an inventive concept.

Framework Analysis as Pertains to the Instant Claims:
With respect to Step 1: yes, the claims are directed to methods for identifying somatic mutations from a patient sample [Step 1: YES; See MPEP § 2106.03]. 
With respect to Step 2A, Prong One, the claims recite abstract ideas. The MPEP at 2106.04(a)(2) further explains that abstract ideas are defined as: 
•	mathematical concepts (mathematical formulas or equations, mathematical relationships and mathematical calculations);
•	certain methods of organizing human activity (fundamental economic practices or principles, managing personal behavior or relationships or interactions between people); and/or
•	mental processes (procedures for observing, evaluating, analyzing/ judging and organizing information).

With respect to the instant claims, under the Step 2A, Prong One evaluation, the claims are found herein to recite abstract ideas that fall into the grouping of mental processes (in particular procedures for observing, analyzing and organizing information) and mathematical concepts (in particular mathematical relationships and formulas).
 
The claim steps map to the abstract ideas of mental processes and mathematical concepts as follows:

Mental processes recited in the claims include:
“mapping the pair of paired end reads to the reference, wherein when the pair of paired-end reads exhibit a discordant mapping to the reference, the fragment includes the boundary” (claim 14);
“mapping the tags to the reference, and determining tag densities of mapped tags along portions of the reference, wherein when a portion of the reference exhibits an anomalous tag density a large indel is detected in a corresponding portion of the nucleic acid from the subject, wherein an end of the indel corresponds to the boundary of the structural alteration” (claim 14);
	“analyze whole exome sequencing data obtained from the sequencing of amplified nucleic acid from the sample to identify a candidate variant” (claim 16);

Mathematical concepts recited in the claims include:
“evaluating, using a computer having a machine learning classifier, the candidate variant against a plurality of decision trees trained to detect somatic mutations in the candidate variant” (claim 1);
“generating, using the computer, a confidence score for the candidate variant based on a proportion of the plurality of decision trees classifying the candidate variant as somatic” (claim 1);
“classifying, using the computer, the candidate variant as a somatic mutation based on the confidence score, thereby identifying somatic mutations” (claim 1);
“evaluating the candidate variant using a random forest classifier” (claim 2);
“training the machine learning classifier using a training data set of sequences that include known mutations or structural alterations” (claim 5);
“optimizing parameters of the machine learning classifier until the machine learning classifier produces output describing the known mutations and/or structural alterations” (claim 6);
	“selecting a plurality of feature categories” (claim 7);
	“detecting at least one SNV in the nucleic acid” (claim 13);
“validating the detected SNV as present in the nucleic acid using the classification model” (claim 13);
“evaluate the candidate variant against the plurality of decision trees trained to detect somatic mutations in the candidate variant” (claim 16);
“generate a confidence score for the candidate variant based on a proportion of the plurality of decision trees classifying the candidate variant as somatic” (claim 16);
“classify the candidate variant as a somatic mutation based on the confidence score” (claim 16); 
“to train the machine learning classifier using a training data set of sequences that include known mutations or structural alterations” (claim 20);
	“to optimize parameters of the machine learning classifier until the machine learning classifier produces output describing the known mutations and/or structural alterations” (claim 21);
“select a plurality of feature categories” (claim 22);

The abstract ideas recited in the claims are evaluated under the Broadest Reasonable Interpretation (BRI) and determined herein to each cover performance either in the mind and/or performance by mathematical operation because the steps involve nothing more than instructions for a user to manually manipulate data using mathematical concepts such as analyzing, evaluating, generating, classifying, training, optimizing, mapping, and determining. There are no specifics as to the methodology involved in  “analyzing”, “evaluating”, “generating”, “classifying”, “training”, “optimizing”, “mapping”, and “determining” and thus, under the BRI, one could simply, for example, make a list of the genetic changes between samples and analyze said data, for example, with pen and paper, and make decisions or correlations based on those results in one's mind. Therefore, claims 1-8, 10-23 and 25-35 recite an abstract idea [Step 2A, Prong 1: YES; See MPEP § 2106.04]. 
Because the claims do recite judicial exceptions, direction under Step 2A, Prong Two, provides that the claims must be examined further to determine whether they integrate the abstract ideas into a practical application (MPEP 2106.04(d)). A claim can be said to integrate a judicial exception into a practical application when it applies, relies on, or uses the judicial exception in a manner that imposes a meaningful limit on the judicial exception. This is performed by analyzing the additional elements of the claim to determine if the abstract idea is integrated into a practical application (MPEP 2106.04(d).I.; MPEP 2106.05(a-h)). If the claim contains no additional elements beyond the abstract idea, the claim is said to fail to integrate the abstract idea into a practical application (MPEP 2106.04(d).III).
  
With respect to the instant application, the claims recite the following additional elements:
Claims 1 and 14 recite elements other than the abstract idea: “analyzing the patient sample by amplifying nucleic acid from the sample and performing whole exome sequencing analysis on the nucleic acid from the sample to identify a candidate variant” (claim 1); "(a) sequencing a fragment of the nucleic acid by paired-end sequencing to obtain a pair of paired-end reads" (claim 14); and "and (b) sequencing the nucleic acid to determine a plurality of sequence tags" (claim 14). These additional elements are all necessary to the abstract idea and are insignificant extra-solution activity.
MPEP § 2106.05(g) states that, in determining whether an additional element is insignificant extra-solution activity, examiners may consider the following:
(1) 	Whether the extra-solution limitation is well known. 
(2) 	Whether the limitation is significant (i.e. it imposes meaningful limits on the claim such that it is not nominally or tangentially related to the invention). 
(3) 	Whether the limitation amounts to necessary data gathering and outputting, (i.e., all uses of the recited judicial exception require such data gathering or data output). 
The additional elements from claims 1 and 14 meet all three considerations: (1) they are well-known library preparation and sequencing; (2) they are insignificant because they are usually performed through a commercial service; and (3) they are necessary data gathering because otherwise there would be no sequence to analyze.
Claims 10, 13, and 34-35 recite elements other than the abstract idea: claim 10 recites “providing a report that describes the candidate variant as  including the mutation or structural alteration”; claim 13 recites “providing a report that describes the nucleic as including the SNV”; claim 34 recites “displaying on a user interface connected to an output device, a report that describes the candidate variant as including the mutation or structural alteration”; claim 35 recites “to display a report that describes the candidate variant as including the mutation or structural alteration on the output device”. These are necessary data outputting steps. Here the data outputting is an insignificant extra-solution activity (MPEP § 2106.05(g)(3)).
Claims 16 and 25-28 recite another group of elements other than the abstract idea: claim 16 recites “A system for identifying somatic mutations from a patient sample”; claim 25 recites “the sample is a biological sample”; claim 26 recites “the sample is selected from the group consisting of plasma, blood, serum, saliva, sputum, stool, a tumor, cell free DNA, circulating tumor cell, and other biological sample”; claim 27 recites “the sample is from a subject having or at risk of having cancer”; claim 28 recites “the cancer is selected from lung, bladder, colon, gastric, head and neck, breast, prostate, non-small cell lung adenocarcinoma, non-small cell lung squamous cell carcinoma, bladder urothelial carcinoma, colorectal, brain or pancreatic cancer”. These are field-of-use and technological-environment limitations. This type of limitation merely confines the use of the abstract idea to a particular field of use (samples from a population at risk of cancer) and thus fails to add an inventive concept to the claims (838 F.3d at 1259, 120 USPQ2d at 1204), because the claim limitation detects naturally occurring mutations in cancer samples without changing anything. Limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not amount to significantly more than the exception itself, and cannot integrate a judicial exception into a practical application (MPEP §2106.05(h)).
Claim 16 recites a system comprising a computer with a processor and memory that can execute the methods discussed above. Claim 16 recites nothing more than a generic computer, and the memory is not clearly non-transitory. Claim 16 and its dependent claims 17-23 contain mere instructions to apply, using a computer, the abstract idea outlined in claims 1-8 and 10-15. Claim 16 recites nothing more than a generic computer capable of applying the abstract idea outlined in claims 1-8, 10-15, and 25-29. Therefore claim 16 does not integrate the abstract idea into a practical application (see MPEP § 2106.04(d)(1); and MPEP 2106.05(f)).
Thus, none of the claims recite additional elements which would integrate a judicial exception into a practical application, and the claims are directed to an abstract idea [Step 2A, Prong 2: NO; See MPEP § 2106.04(d)]. 

As such, the claims are lastly evaluated using the Step 2B analysis, wherein it is determined that because the claims recite abstract ideas which are not integrated into a practical application, the claims also lack a specific inventive concept. Applicant is reminded that the judicial exception alone cannot provide the inventive concept or the practical application and that the identification of whether the additional elements amount to such an inventive concept requires considering the additional elements individually and in combination to determine if they provide significantly more than the judicial exception. (MPEP 2106.05.A i-vi).
The dependent claims of claim 16 and claim 1, recite no additional non-abstract elements but are directed to further aspects of the information being analyzed, the manner in which that analysis is performed, or the mathematical operations performed on the information. 
Because the claims recite an abstract idea, and do not integrate the judicial exceptions into a practical application, the claims as a whole are directed to abstract ideas. Claims that are directed to abstract ideas must be examined further to determine whether the additional elements amount to significantly more than the exception itself. Claims that are directed to abstract ideas and that raise a concern of preemption of those abstract ideas must be examined to determine what elements, if any, they recite besides the abstract idea, and whether these additional elements constitute inventive concepts that are sufficient to render the claims significantly more than the abstract idea (MPEP 2106.05).
As explained above, the mere instructions to implement the abstract idea using a computer are, when considered individually, insufficient to constitute an inventive concept that would render the claims significantly more than an abstract idea (see MPEP 2106.05(f)). 
As explained above, the data-gathering steps in claims 1 and 14; the data outputting of claims 10, 13, 34-35; the field of use and technological environment of claims 16, 25-28, and the later implementing of the methods in a generic computer framework (claim 16) constitute insignificant extra solution activities, and when considered individually, or as a whole, are insufficient to constitute inventive concepts that would render the claims significantly more than an abstract idea (see MPEP 2106.05(g)). (Step 2B: No) 

When the claims are considered as a whole, they do not integrate the abstract idea into a practical application; they do not confine the use of the abstract idea to a particular technology; they do not solve a problem rooted in or arising from the use of a particular technology; they do not improve a technology by allowing the technology to perform a function that it previously was not capable of performing; and they do not provide any limitations beyond generally linking the use of the abstract idea to a broad technological environment (i.e. computerized analysis of biological data). See MPEP 2106.05(a) and 2106.05(h). 
For these reasons, the claims, when the limitations are considered individually and as a whole, are directed to an abstract idea and lack an inventive concept. Hence, the claimed invention does not constitute significantly more than the abstract idea, so the claims are rejected under 35 USC § 101 as being directed to non-statutory subject matter.


Response to Arguments - Rejections Under 35 USC § 101

In the reply filed 31 Mar 2022, Applicant asserts that "the combination of elements in claims 1 and 16 that integrate the exception into a practical application” (page 13, para 3). As discussed above, the additional elements recited in claims 1 and 16 are extra-solution activities that are either well-known, necessary data gathering or data outputting, which do not integrate the abstract idea into a practical application. Simply applying the abstract idea using a computer system is not an improvement to computer technology. Rather, the invention merely invokes computers as a tool to perform the abstract idea, which does not integrate the abstract idea into a practical application. Overall, the argument that "the combination of elements in claims 1 and 16 that integrate the exception into a practical application” is therefore unpersuasive, so the rejection is maintained.
Additionally, Applicant asserts that the claims "provide a technical improvement in the field of somatic mutations identification” (page 13, para 3). Identification of somatic mutations is not a technical field; it is an observation of a naturally occurring phenomenon.
Hence, the claimed invention does not represent an improvement to a technological field. The arguments are therefore unpersuasive, so the rejection is maintained.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-2, 4-8, 10-11, 13, 15-23, 25-31, and 34-35 are rejected under 35 U.S.C. 103 as being unpatentable over Kermani (“Machine Learning for Somatic Single Nucleotide Variant Detection in Cell-free Tumor Nucleic acid Sequencing Applications”, US 20170061072 A1, Date Published: 2017-03-02), in view of Marquard (“TumorTracer: a method to identify the tissue of origin from the somatic mutations of a tumor specimen”. BMC Med Genomics 8, 58 (2015)).
Claim 1 is directed to a method for identifying somatic mutations from a patient sample, comprising: 
analyzing the patient sample by amplifying nucleic acid from the sample and performing whole exome sequencing analysis on the nucleic acid from the sample to identify a candidate variant; 
evaluating, using a computer having a machine learning classifier, the candidate variant against a plurality of decision trees trained to detect somatic mutations in the candidate variant, wherein each decision tree classifies the candidate variant as at least one of somatic or not somatic; 
generating, using the computer, a confidence score for the candidate variant based on a proportion of the plurality of decision trees classifying the candidate variant as somatic; 
classifying, using the computer, the candidate variant as a somatic mutation based on the confidence score, thereby identifying somatic mutations.
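For illustration only, the last two limitations recited above (a confidence score based on the proportion of decision trees classifying the candidate variant as somatic, followed by classification based on that score) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from the cited references: the function names, the representation of each tree's output as a boolean vote, and the default threshold of 0.5 are all hypothetical.

```python
from typing import Sequence


def confidence_score(tree_votes: Sequence[bool]) -> float:
    """Return the proportion of trees that voted 'somatic' (True)."""
    if not tree_votes:
        raise ValueError("at least one tree vote is required")
    return sum(tree_votes) / len(tree_votes)


def classify_variant(tree_votes: Sequence[bool], threshold: float = 0.5) -> bool:
    """Classify as somatic when the confidence score meets the threshold.

    The threshold value is an illustrative assumption, not a claimed value.
    """
    return confidence_score(tree_votes) >= threshold


# Hypothetical example: 8 of 10 trees classify the candidate variant as somatic.
votes = [True] * 8 + [False] * 2
score = confidence_score(votes)
is_somatic = classify_variant(votes)
```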
With respect to claim 1, Kermani discloses “Systems and methods are disclosed to detect single-nucleotide variations (SNVs) from somatic sources in a cell-free biological sample of a subject” comprising:
Analyzing patient samples by sequencing cell-free DNA ([0048]), and identifying SNVs ([0006]). 
“Generating training data with class labels; forming a machine learning unit having one output for each of adenine (A), cytosine (C), guanine (G), and thymine (T) base calls, respectively; training the machine learning unit with a training set of biological samples; and applying the machine learning unit to detect the SNVs from somatic sources in the cell-free biological sample, wherein the cell-free biological sample may comprise a mixture of nucleic acid molecules (e.g., deoxyribonucleic acid (DNA)) from somatic and germline sources, e.g., cells comprising somatic mutations and germline DNA.” ([0006]). 
However, Kermani does not teach using decision trees or a random forest as classifiers, nor does Kermani mention “confidence scores”. Marquard discloses a method for tracing the tissue of origin of a tumor specimen based on its somatic mutations that uses the Random Forest (RF) classification algorithm (page 1 of 13, para 2 line 1-2 under section “Methods”). A random forest (RF) classifier is composed of many decision trees. Marquard further teaches generating a confidence score and using the confidence score for classifications (“Importantly, a derived confidence score could distinguish tumors that could be identified with 95 % accuracy (32 %/75 % of tumors with/without copy numbers) from those that were less certain”.  page 1 of 13, para 3 line 3-5 under section “Results”).

Regarding claim 2, Kermani does not teach evaluating the candidate variant (training data) using the random forest classifier. Marquard teaches evaluating candidates using the random forest classifier (page 3 of 13, section “Machine Learning”).
Regarding claim 4, Kermani is silent about a set of multiple decision trees for each mutation. Marquard teaches multiple decision trees for each site and mutation set (“Site-specific random forest classifiers”, meaning site-specific sets of multiple trees. Fig 1,  page 5 of 13). 
Regarding claim 5, Kermani further teaches training the machine learning system using a training data set with known mutations or structural alterations ([0014] and fig 4).
Regarding claim 6, Kermani teaches adjusting the connection weights (“During the training of a network the same set of data is processed many times as the connection weights are continually refined”) using the Delta Rule ([0060-0061]), in order to achieve high accuracy in classification of variants.
Regarding claim 7, Kermani teaches selection of feature categories ([0023-0026]). 
Regarding claim 8, Kermani teaches the neural network classifier (claim 21).
Regarding claim 10, Kermani teaches a machine learning unit having one output for each of adenine (A), cytosine (C), guanine (G), and thymine (T) base calls respectively ([0006]).
Regarding claim 11, Kermani further teaches an example in which the output in terms of base calls at each base position is known: “if it is known that at a particular base position, sample a1 has A (which may be the base in the reference genome) and sample a2 has C, the expected output of sequencing the mixture 99% a1×1% a2 is, for 1,000 reads, for A-T-G-C, [990-0-0-10]” ([0022]), which suggests the claim limitation “comparing the sequence reads to a reference to detect an indicia of the structural alteration; and validating the structural alteration as present in the nucleic acid using the classification model”. 
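The arithmetic behind the expected output quoted from Kermani ([0022]) can be reproduced with a short sketch. It assumes the base order A, T, G, C and integer mixture percentages; the function and parameter names are hypothetical and are not drawn from Kermani.

```python
from typing import Dict, List, Sequence


def expected_base_counts(
    mixture_percent: Dict[str, int],
    sample_base: Dict[str, str],
    total_reads: int,
    base_order: Sequence[str] = ("A", "T", "G", "C"),
) -> List[int]:
    """Expected read count per base at one position for a mixed sample.

    mixture_percent maps each sample to its integer percentage of the mixture;
    sample_base maps each sample to the base it carries at this position.
    """
    counts = {base: 0 for base in base_order}
    for sample, percent in mixture_percent.items():
        # Integer arithmetic avoids floating-point rounding of the counts.
        counts[sample_base[sample]] += percent * total_reads // 100
    return [counts[base] for base in base_order]


# Mixture of 99% sample a1 (base A) and 1% sample a2 (base C), 1,000 reads.
result = expected_base_counts({"a1": 99, "a2": 1}, {"a1": "A", "a2": "C"}, 1000)
print(result)  # [990, 0, 0, 10]
```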
Regarding claim 13, Kermani teaches a training dataset with many known SNVs (Fig. 4, and [0014, 0066]). Kermani further teaches “ A method, comprising: (a) providing a training data set comprising, for each mixture in a plurality of mixtures, wherein each mixture in the plurality comprises polynucleotides from a plurality of different subjects, values indicating: (i) a quantitative measure of each of a plurality of bases at each of a plurality of genomic base positions from sequence reads of a plurality of polynucleotides in the mixture, and (ii) a plurality of class labels, each class label classifying the mixture as having one or more particular bases at a particular genomic base position; and (b) training a machine learning unit on the training data set to generate one or more classification models for detecting a presence of a base at each of a plurality of genomic base positions in a test sample” (Kermani, claim 39), which suggests the claim limitation “the training data set comprises a plurality of known single-nucleotide variants (SNVs), the method comprising: detecting at least one SNV in the nucleic acid; validating the detected SNV as present in the nucleic acid using the classification model; and providing a report that describes the nucleic as including the SNV”.  
Regarding claim 15, Kermani teaches “Machine learning methods can be used to generate models that call the presence of a base at a genomic base position in a sample comprising mixed DNA (e.g., germline DNA and somatic DNA)” ([0017]), which suggests using the machine learning system to differentiate variants as germline (vs. somatic). 
Regarding claims 16-18, claims 16-18 are similar to claims 1-3 respectively, but in the framework of a computing system (while claims 1-3 are in the framework of a method). Claims 16-18 are rejected as discussed above regarding claims 1-3 respectively.
Regarding claim 19, Kermani is silent about a set of multiple decision trees for each mutation. Marquard teaches multiple decision trees for each site and mutation set (“Site-specific random forest classifiers”, meaning site-specific sets of multiple trees. Fig 1,  page 5 of 13).  
Regarding claims 20-23, claims 20-23 are similar to claims 5-8 respectively, but in the framework of a computing system (while claims 5-8 are in the framework of a method). Claims 20-23 are rejected as discussed above regarding claims 5-8 respectively.
Regarding claim 25, Kermani teaches using a biological sample (section “Abstract”, [0005-0006]).
Regarding claims 26-27, Kermani teaches obtaining blood samples from patients with cancer and other diseases ([0064]). It is also well known in the cancer research community to acquire samples from non-invasive sources such as saliva, sputum, stool, and “circulating tumor cell”.
Regarding claim 28, Kermani is silent about obtaining samples from particular cancer types. Marquard teaches obtaining samples from multiple cancer types, including ovary cancer, breast cancer, kidney cancer, and lung cancer (Table 2,  page 9 of 13). 

Regarding claim 29, Kermani teaches “Machine learning methods can be used to generate models that call the presence of a base at a genomic base position in a sample comprising mixed DNA (e.g., germline DNA and somatic DNA)” ([0017]), which suggests using the machine learning system to differentiate variants as germline (vs. somatic). 
Regarding claim 30, Kermani is silent about having each decision tree evaluate a unique combination of information. Marquard teaches each decision tree evaluating a unique combination of mutations/substitutions/copy number variations (“Site-specific random forest classifiers”, Fig 1,  page 5 of 13).  
Regarding claim 31, Kermani teaches that variation detection needs mapping or alignment information ([0003]), and that SNV detection depends on the NGS sequencing error rate ([0004]).

Regarding claim 34, Kermani teaches a machine learning unit having one output for each of adenine (A), cytosine (C), guanine (G), and thymine (T) base calls respectively ([0006]).
Regarding claim 35, Kermani teaches a machine learning unit having one output for each of adenine (A), cytosine (C), guanine (G), and thymine (T) base calls respectively ([0006]).

It would have been prima facie obvious, under the rationale “(G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention” (MPEP § 2143 I.G.), to one of ordinary skill in the art at the time of the invention to modify Kermani’s machine learning pipeline, which uses the neural network classifier, with Marquard’s random forest of decision trees classifier and the confidence score-based classifications, because the random forest classifier enables easy selection of data features based on their contribution to the variant classification and the confidence score allows easy control of the prediction accuracy. Marquard and Kermani both concern classification of unknown biological samples based on somatic mutations using machine learning techniques, and both succeeded.  

Claims 3 and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Kermani and Marquard, as applied above over claims 1-2, and further in view of Yan (“Mapping the distributions of C3 and C4 grasses in the mixed-grass prairies of southwest Oklahoma using the Random Forest classification algorithm”, International Journal of Applied Earth Observation and Geoinformation, Volume 47, May 2016, Pages 125-138).
Regarding claim 3, Kermani is silent in “at least one thousand decision trees”. Marquard teaches a random forest with 500 decision trees (page 3 of 13, col 2, para 1 line 3 under section “Machine Learning”); Yan teaches a random forest with 1000 decision trees (page 128, para 2 line 4-12 under section “3.3.1. Configurating RF classifications in R”)
Regarding claim 33, Kermani is silent as to “a confidence score”. Marquard teaches generating a confidence score and using the confidence score for classifications (“Importantly, a derived confidence score could distinguish tumors that could be identified with 95 % accuracy (32 %/75 % of tumors with/without copy numbers) from those that were less certain”, page 1 of 13, para 3, lines 3-5 under section “Results”). However, because Marquard’s decision trees perform multi-class classification, a simple count of how many trees voted for “variant as somatic” is not available. Yan teaches generating a confidence score and using the confidence score for classifications (“We provide a confidence score for each RF prediction by retrieving the fraction of tree votes casted for each target land cover category. Specifically, we determined a prediction supported by the majority vote that accounts for no less than two thirds of the total votes (e.g., at least 667 votes from a Random Forest with 1000 classification trees) as a high confidence prediction. The two-thirds threshold would ensure the majority vote is at least two times as many as the second highest fraction of tree votes”, page 128, para 2, lines 4-12 under section “3.3.1. Configurating RF classifications in R”). Yan’s confidence score threshold is ≥ 0.67 and the maximum value of the confidence score is 1.0, so Yan’s confidence score range of 0.67-1.0 overlaps the claimed confidence score range of 0.75-1.0. “PRIOR ART WHICH TEACHES A RANGE OVERLAPPING OR TOUCHING THE CLAIMED RANGE ANTICIPATES IF THE PRIOR ART RANGE DISCLOSES THE CLAIMED RANGE WITH "SUFFICIENT SPECIFICITY”” (MPEP §2131.03.II).
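For illustration only, the vote-fraction confidence score quoted from Yan and the range comparison made above can be sketched as follows. This is a minimal sketch and not code from Yan, Marquard, or the application; the function names are hypothetical, and the two-thirds threshold and the 0.67-1.0 and 0.75-1.0 ranges are taken from the discussion above.

```python
from fractions import Fraction

# Hypothetical helper names; none of them come from the cited references.

def vote_confidence(votes_for_class: int, n_trees: int) -> Fraction:
    """Fraction of trees in the forest that voted for the predicted class."""
    if n_trees <= 0 or not 0 <= votes_for_class <= n_trees:
        raise ValueError("votes must lie between 0 and the number of trees")
    return Fraction(votes_for_class, n_trees)


def is_high_confidence(votes_for_class: int, n_trees: int,
                       threshold: Fraction = Fraction(2, 3)) -> bool:
    """Apply the two-thirds rule quoted from Yan (threshold is adjustable)."""
    return vote_confidence(votes_for_class, n_trees) >= threshold


def ranges_overlap(lo_a: float, hi_a: float, lo_b: float, hi_b: float) -> bool:
    """True when the closed ranges [lo_a, hi_a] and [lo_b, hi_b] share a value."""
    return max(lo_a, lo_b) <= min(hi_a, hi_b)


if __name__ == "__main__":
    # 667 of 1000 votes is the example given in the quoted passage.
    print(is_high_confidence(667, 1000))
    # Prior-art range 0.67-1.0 versus the claimed range 0.75-1.0.
    print(ranges_overlap(0.67, 1.0, 0.75, 1.0))
```

Exact fractions are used so that the comparison against two thirds is not affected by floating-point rounding.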
It would have been prima facie obvious under the “teaching, suggestion, or motivation” rationale ((G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. (MPEP § 2143 I.G.)), to one of ordinary skill in the art at the time of the invention, to modify Kermani’s machine learning pipeline, which uses a neural network classifier, with Marquard’s random forest of decision trees classifier and confidence score-based classification, together with Yan’s teaching to add more decision trees while simplifying the confidence score, to achieve the claimed limitations, because the random forest classifier enables easy selection of data features based on their contribution to variant classification and the confidence score allows easy control of prediction accuracy. Further, Marquard’s random forest classifier is a multi-class (multiple cancer origins) classifier and its confidence score calculation is complicated, whereas Yan’s decision tree classifier is a binary classifier, which is more similar to the claimed binary classification (mutation or no mutation), and Yan’s confidence score, obtained by summing the decision tree votes, is much simpler than Marquard’s confidence score and makes it easy to adjust the threshold for the desired accuracy. Marquard and Kermani both concern the classification of unknown biological samples based on somatic mutations using machine learning techniques, Yan is technically strong in fine-tuning the decision trees, and all three succeeded.
Claims 12 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Kermani and Marquard, as applied above to claims 1-2, and further in view of Spinella (“SNooPer: a machine learning-based method for somatic variant identification from low-pass next-generation sequencing”, BMC Genomics volume 17, Article number: 912 (2016)).
Regarding claim 12, Kermani and Marquard do not teach setting up parameters for the decision trees classifier. Spinella teaches “SNooPer expects both normal and tumor files in SAMtools mpileup format” (page 3 of 11, left column, Section “Somatic testing and feature extraction”, lines 1-2), which suggests the claim limitation “sample type”. Spinella further teaches “Using the default parameters of quality filters, the algorithm only considers positions presenting at least one read (mapping quality value - MQV ≥10) supporting the alternative allele (base quality value – BQV ≥20), and requires a minimum coverage of 8X in both the tumor sample and its normal counterpart” (page 3 of 11, left column, Section “Somatic testing and feature extraction”, lines 12-17), which suggests the claim limitations “FASTQ quality score; alignment score; read coverage”. The claim limitation “and an estimated probability of error” is unclear: it is not clear whether this “error” refers to error in the input sequence, which is already embedded in the FASTQ quality score, or to the allowed prediction error.
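For illustration only, the default quality filters quoted from Spinella (MQV ≥ 10, BQV ≥ 20, minimum coverage of 8X in both tumor and normal) can be sketched as follows. This is not SNooPer code: the `Read` type and function names are hypothetical, coverage is assumed to be the number of reads overlapping the position, and the supporting read is assumed to be sought in the tumor sample. The last function applies the standard Phred relation Q = -10·log10(P), which is how a base-call error probability is embedded in a FASTQ quality score.

```python
from dataclasses import dataclass
from typing import Sequence

# Thresholds as quoted from Spinella's default quality filters.
MIN_MQV = 10       # minimum mapping quality value
MIN_BQV = 20       # minimum base quality value
MIN_COVERAGE = 8   # minimum coverage in tumor and in normal


@dataclass
class Read:
    """Hypothetical record for one read overlapping a candidate position."""
    base: str
    mapping_quality: int
    base_quality: int


def passes_default_filters(tumor_reads: Sequence[Read],
                           normal_reads: Sequence[Read],
                           alt_allele: str) -> bool:
    """Keep a position only if both samples are covered and at least one
    sufficiently good tumor read supports the alternative allele."""
    if len(tumor_reads) < MIN_COVERAGE or len(normal_reads) < MIN_COVERAGE:
        return False
    return any(
        r.base == alt_allele
        and r.mapping_quality >= MIN_MQV
        and r.base_quality >= MIN_BQV
        for r in tumor_reads
    )


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)
```

Under the Phred relation, the BQV threshold of 20 corresponds to a base-call error probability of 1 in 100.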
Regarding claim 32, Kermani and Marquard do not teach alignment characteristics. Spinella teaches alignment characteristics including sequencing quality, mapping quality, coverage, and variant allele frequency (“The complete list of features and their descriptions are presented in Additional file 1: Table S1. These features are divided into five main groups: i) quality bias of alternative bases (related to base and mapping phred quality values), ii) coverage and VAF, iii) location along the read, iv) strand bias, and v) others. When appropriate, features are evaluated with respect to reference bases at the same position (vs_ref).” page 2 of 11, col 2, para 2).

It would have been prima facie obvious under the “teaching, suggestion, or motivation” rationale ((G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. (MPEP § 2143 I.G.)), to one of ordinary skill in the art at the time of the invention, to modify Kermani’s machine learning pipeline, which uses a neural network classifier, with Marquard’s random forest of decision trees classifier and confidence score-based classification, and with Spinella’s parameter setting of five main groups, to achieve the claim limitation, because the random forest classifier enables easy selection of data features based on their contribution to variant classification, the confidence score allows easy control of prediction accuracy, and Spinella’s decision tree parameter setting of five main groups can be adopted in Marquard’s random forest of decision trees, allowing Marquard’s random forest classifier to make a better choice of the data features that contribute to classification power.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kermani and Marquard as applied to claims 1-2 above, and further in view of Nazarenko (“IDENTIFYING REARRANGEMENTS IN A SEQUENCED GENOME” US 20120197533 A1, 2012).
Regarding claim 14, Kermani teaches prediction of SNVs by machine learning as discussed above. Neither Kermani nor Marquard teaches prediction of structural alterations by machine learning. Nazarenko teaches identifying structural variants such as rearrangements by performing paired-end sequencing on a candidate sequence, mapping the sequence to a reference sequence, and determining if the sequence from the candidate is discordant from the reference sequence ([0007]). Nazarenko further teaches that, according to another embodiment, a method is provided for determining whether a clinically significant junction exists between a sample genome and a reference genome. Results of paired-end sequencing of a plurality of fragments from the biological sample are received. The results include mate pairs of fragments and mappings of the mate pairs to the reference genome. A plurality of discordant mate pairs are determined. A plurality of potential junctions are determined based on the discordant mate pairs. A list of junctions that have appeared in other sample genomes is obtained ([0009]). Nazarenko also teaches that a "junction" (also called a discontinuity) is the location (a single point or a short region) on the sample genome where the sequences to the left of the junction and to the right of the junction are at different distance, order, or orientation from each other compared to their relationship to one another on a reference genome. This divergence can occur at a single boundary location (e.g., at or between a single base pair) where two distant sequences in the reference genome are joined ([0036]).
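For illustration only, the notion of a discordant mate pair described above (mates that map at a different distance, order, or orientation than expected on the reference genome) can be sketched as follows. This is not Nazarenko’s implementation: the `MatePair` type, the function name, the expected strand orientation, and the distance window are all hypothetical assumptions that would depend on the sequencing library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MatePair:
    """Hypothetical record of where the two mates map on the reference."""
    chrom_left: str
    pos_left: int
    strand_left: str    # "+" or "-"
    chrom_right: str
    pos_right: int
    strand_right: str   # "+" or "-"


def is_discordant(pair: MatePair, min_dist: int, max_dist: int,
                  expected_strands: Tuple[str, str] = ("+", "-")) -> bool:
    """Flag a mate pair whose mapping departs from the expected layout.

    min_dist, max_dist, and expected_strands are assumed library parameters.
    """
    # Mates on different chromosomes cannot be at the expected distance.
    if pair.chrom_left != pair.chrom_right:
        return True
    # Unexpected orientation of the two mates.
    if (pair.strand_left, pair.strand_right) != expected_strands:
        return True
    # A negative distance means the mates map in inverted order.
    distance = pair.pos_right - pair.pos_left
    return not (min_dist <= distance <= max_dist)
```

Pairs flagged this way would be the input from which potential junctions are then determined.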
It would have been prima facie obvious under the “teaching, suggestion, or motivation” rationale ((G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. (MPEP § 2143 I.G.)), to one of ordinary skill in the art at the time of the invention, to modify Kermani’s machine learning pipeline, which uses a neural network classifier, with Marquard’s random forest of decision trees classifier and confidence score-based classification, and with Nazarenko’s teaching to identify structural variants such as rearrangement boundaries, to achieve the claim limitation, because the random forest classifier enables easy selection of data features based on their contribution to variant classification and the confidence score allows easy control of prediction accuracy. Nazarenko’s teaching to identify structural variants such as rearrangement boundaries through mapping analysis is an enhancement to the sample classification performed by the combined machine learning model of Marquard and Kermani, and all three succeeded.

Conclusion
No claim is allowable.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GUOZHEN LIU whose telephone number is (571)272-0224. The examiner can normally be reached Monday-Friday 8-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz R Skowronek can be reached on (571)272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Soren Harward/Primary Examiner, Art Unit 1631
GUOZHEN LIU
Examiner
Art Unit 1631