DETAILED ACTION
Applicant’s response, filed 28 April 2021, has been fully considered. The following rejections and/or objections are either reiterated or newly applied. They constitute the complete set presently being applied to the instant application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claim 11 is cancelled.
Claim 21 is newly added.
Claims 1-10 and 12-21 are pending.
Claims 1-10 and 12-21 are rejected.

Claim Objections
The objection to claims 1 and 12 in the Office action mailed 09 Feb. 2021 has been withdrawn in view of claim amendments and/or cancellations received 28 April 2021.

Claim Interpretation
Claims 1 and 12 recite the term “event” in line 6 of claims 1 and 11 and line 7 of claim 12. The term “event” is defined in paragraph [0020] of the specification to describe any type of genetic feature, e.g. a mutation.
Claims 1 and 12 recite “determining/determine a distribution of events across the plurality of sets in each window” in line 6 of claims 1 and 11 and lines 6-7 of claim 12. Paragraph [0020] of the specification discloses “each of ten different sets may have different numbers of events in a given 

Claim Interpretation - 35 USC § 112(f)
The interpretation of the gene sequence module, sampling module, tensor module, training module, diagnosis module, and treatment module recited in claims 12 and 17 under 35 U.S.C. 112(f) in the Office action mailed 09 Feb. 2021 has been withdrawn in view of the claim amendments received 28 April 2021.

Claim Rejections - 35 USC § 112(a)
The rejection of claims 12-20 under 35 U.S.C. 112(a) regarding the limitations “a training module configured train a neural network based on the tensors” and “a diagnosis module configured to diagnose one or more phenotypes from an input genome using the classifier” and their interpretation under 35 U.S.C. 112(f) in the Office action mailed 09 Feb. 2021 has been withdrawn in view of claim amendments received 28 April 2021.
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-10 and 12-21 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the 
Independent claims 1, 10, and 12, and claims dependent therefrom, recite “training/train a neural network classifier, based on the tensors, to recognize phenotypes”. “When examining computer-implemented functional claims, examiners should determine whether the specification discloses the computer and the algorithm (e.g. the necessary steps and/or flowcharts) that perform the claimed function in sufficient detail such that one of ordinary skill in the art can reasonably conclude that the inventor possessed the claimed subject matter at the time of filing. It is not enough that one skilled in the art could write a program to achieve the claimed function because the specification must explain how the inventor intends to achieve the claimed function to satisfy the written description requirement” (MPEP 2161.01 I.). The instant claims recite that the limitation is carried out by a processor executing instructions from a computer readable storage medium or in communication with memory. The disclosure must describe the algorithm in sufficient detail to explain how the algorithm achieves the claimed function.
Applicant’s specification at FIG. 2, [0005], [0009], and [0018]-[0022] discloses that a classifier is trained based on the tensors, and that the sets of genomes are split into two groups, one of which is a training group and the other, a test group. Applicant’s specification at para. [0041] further discloses a training module which makes use of a set of training data, and that a training module generates a classifier that identifies whether a given input genome corresponds to a phenotype in question. Applicant’s specification at para. [015] further discloses that tensors are used as input to a machine learning process that creates a classifier, and, at para. [0022], broadly states the types of machine learning analyses that are trained include neural networks, support vector machines, linear discriminant analysis, random forest, and Bayesian processes (e.g. supervised-learning algorithms). Applicant’s specification at para. [0027] explains that the statistical distribution of events characterizes the relationship between the window in question and the phenotype, with the distribution of events playing a role in how the phenotype manifests across a 
However, the specification does not provide any detail as to how the classifier uses the tensors during the training process to recognize phenotypes. IBM (Supervised Learning, 2020, pg. 1-6; newly cited) discloses supervised learning processes, and explains that supervised learning algorithms take training datasets which include inputs and correct outputs, and the algorithm measures accuracy through a loss function, adjusting its weights until the error has been sufficiently minimized (pg. 1: How supervised machine learning works). IBM further discloses that neural networks, a type of supervised learning algorithm, comprise layers of nodes, with each node comprising inputs, weights, a bias/threshold, and an output, and if the output exceeds a given threshold, the node passes data to the next layer in the network (pg. 2, para. 3); IBM further explains that neural networks learn the mapping function for passing data to each layer in the network through supervised learning, which involves adjusting the mapping function based on a loss function. Accordingly, “training a neural network based on the tensors” could refer to using the tensors as the labeled training data input into the neural network, or to using the tensors in the learning process to adjust the mapping function or as part of the loss function, each of which would produce a different neural network depending on how the neural network was specifically trained. While the specification at para. [0021]-[0022] discloses that sets are split into two groups, one as a training group and one as a test group, and at para. [0041] that the training module makes use of training data, the specification does not explain how the training data is used nor how the tensors are used during the training process.
For example, Applicant’s specification does not disclose whether the training data is used to generate tensors, which are then used as the training input into the classifier, or whether the tensors are instead used to adjust weights and/or as part of a loss function during the training process. If the training data is used to generate tensors, which are then used as the input into the classifier, the specification further does not disclose how these tensors are labeled such that a classifier trained on the tensors could recognize phenotypes. For example, the claims recite that the plurality of genomes are randomly sampled into the plurality of sets, which are then used to generate the tensors for the windows; however, the specification does not describe how each of the tensors is labeled, such that a classifier trained on these tensors would be capable of recognizing phenotypes. Given that the trained model is capable of recognizing phenotypes, the input data for training the classifier would also require at least two labels (e.g. a normal vs. diseased phenotype); however, Applicant’s specification does not disclose how the tensors, which are formed from random subsets of genomes, are labeled to allow for recognizing phenotypes. Furthermore, Applicant’s specification does not disclose how the tensors may be used to either adjust weights in the model during the training process or as part of the loss function to help train the model. Furthermore, given that the neural network is presumably novel, it is self-evident that training such a neural network based on the generated tensors cannot be well-known in the art.
Therefore, while one of ordinary skill in the art could write a program which trains a neural network based on the claimed tensors in some way, the specification must explain how the inventor intends to train a neural network based on tensors of statistical properties of distributions generated from randomly sampled genomes.
For the reasons discussed above, the specification does not provide a sufficient disclosure of the limitation of “training a neural network classifier, based on the tensors, to recognize phenotypes” recited in claims 1, 10, and 12, and claims dependent therefrom, to demonstrate to one of ordinary skill in the art that the inventor possessed the invention at the time the application was filed. For more information regarding the written description requirement, see MPEP §2161.01- §2163.07(b).
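For illustration only, the supervised learning process described by IBM above (labeled inputs, a loss function, and weights adjusted until the error is reduced) can be sketched for a single node as follows. This sketch is not taken from Applicant’s specification or from IBM; the function names, toy feature vectors, labels, learning rate, and number of epochs are all hypothetical assumptions chosen to make the example runnable.

```python
import math


def train_single_node(inputs, labels, learning_rate=0.5, epochs=2000):
    """Fit one sigmoid node by gradient descent on a cross-entropy loss."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1.0 / (1.0 + math.exp(-z))
            # For a sigmoid output with cross-entropy loss, the gradient of
            # the loss with respect to z is (prediction - label).
            error = prediction - y
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    """Return class 1 if the node's weighted sum is positive, else class 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


# Hypothetical labeled training data: each input is a feature vector and
# each label is one of two classes (e.g. two phenotype classes).
inputs = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]
weights, bias = train_single_node(inputs, labels)
```

In this sketch every training input carries its own label; it is this pairing of inputs with labels that the specification does not describe for the claimed tensors.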

Claim 17 recites “wherein the computer program code further causes the hardware processor to automatically administer a treatment to an individual based on the diagnosis”. Applicant’s specification at para. [0042] discloses a treatment module that directly administers drugs to an individual that may include or be in communication with a hardware device configured to administer such a treatment. Applicant’s specification at para. [0042] further discloses that a treatment module can provide recommended treatment information to a human medical professional. However, Applicant’s specification does not disclose a processor capable of administering a treatment to an individual.
For the reasons discussed above, the specification does not provide a sufficient disclosure of the limitation of “wherein the computer program code further causes the hardware processor to automatically administer a treatment to an individual based on the diagnosis” recited in claim 17 to demonstrate to one of ordinary skill in the art that the inventor possessed the invention at the time the application was filed. THIS IS A NEW MATTER REJECTION. For more information regarding the written description requirement, see MPEP §2161.01- §2163.07(b).

Response to Arguments
Applicant's arguments filed 28 April 2021 regarding 35 U.S.C. 112(a) have been fully considered but they are not persuasive. 
Applicant remarks that the present amendment removes the “diagnosing” step, and thereby removes the conflict between the type of data used to train the model and the type of data used to diagnose one or more phenotypes (Applicant’s remarks at pg. 9, para. 3).
This argument is not persuasive. While the claims no longer recite that a single input genome is input into the classifier for diagnosis, the claims still recite “training a neural network classifier, based on the tensors, to recognize phenotypes”. As discussed in the previous Office action at para. [028], the claims were rejected under 35 U.S.C. 112(a) based on recitation of “training/train a neural network classifier based on the tensors”. However, the specification does not disclose how the classifier is specifically trained to recognize phenotypes, for the reasons discussed above.

Applicant remarks that the “training” step itself is well supported in the present specification, such as in para. [0022], which states “Block 210 then trains a classifier based on the tensors. ... Types of machine learning analyses include, e.g., neural networks, support vector machine processes, linear discriminant analysis processes, random forest processes, and Bayesian processes. Any one of these types of machine learning, or any other variety, may be used to form the classifiers.” Applicant further remarks that the step of training a neural network classifier is described in the present specification, at least in para. [0022], and if the rejection was intended to be based on the training step alone, rather than on the training and diagnosing steps in tandem, then further clarification on the basis of the rejection is requested (Applicant’s remarks at pg. 9, para. 4 to pg. 10, para. 1).
This argument is not persuasive. As discussed above, “[i]t is not enough that one skilled in the art could write a program to achieve the claimed function because the specification must explain how the inventor intends to achieve the claimed function to satisfy the written description requirement” (MPEP 2161.01 I.). In this case, the cited paragraph discloses that the classifier is trained based on the tensors, that the training is performed by splitting the sets S into two groups, a training group and a test group, and then further generally describes supervised learning algorithms in which the learning process determines a model that recognizes correspondences between the input genotypes and known phenotypes and uses disagreements between predictions and known results to correct the model and improve accuracy. However, as discussed in the above rejection, Applicant’s specification does not disclose any detail as to how the training is performed based on tensors constructed from statistical properties of randomly sampled genomes; for example, para. [0022] of Applicant’s specification does not disclose what data from the training sets are input into the classifier (e.g. are tensors constructed from the training sets?), nor how these sets are labeled to allow a machine learning model to “recognize phenotypes”, nor whether the tensors are used as the training dataset or used as part of the learning process to adjust the model. Therefore, while one of ordinary skill in the art may be able to write a program capable of training a model based on tensors, Applicant’s specification does not disclose an algorithm that explains how the inventor intends to train a model based on the tensors.

Applicant remarks that the present amendment makes it clear that the present sampling of genotypes into sets may be performed randomly, as described in at least paragraph 19, and as described in paragraph 20, the distribution of "events" in each set is determined, for example by identifying a genetic feature such as a particular mutation, and because the different sets may have different distributions of such a feature, the classifier can make correlations between the presence of the feature and any phenotypes that may be associated with the genotypes in the set (Applicant’s remarks at pg. 10, para. 2). Applicant further remarks that paragraph 22 states, "Machine learning then uses the testing group to test the generated classifier(s), with the genotypes of the testing group being analyzed and used to predict the known phenotypes of that group." (Applicant’s remarks at pg. 10, para. 2).
This argument is not persuasive. Regarding Applicant’s remarks that the different sets may have different distributions of a feature, such as a mutation, and the classifier can make correlations between the presence of that mutation and any phenotypes associated with the genotypes in the set, the claims do not recite that a distribution of events is generated for each set; instead, the claims recite “determining a distribution of events across the plurality of sets in each window”, such that each distribution represents a window (and not a set), and comprises numbers of events determined from each of the plurality of sets. Accordingly, each tensor generated for each window is based on all of the randomly sampled sets of genomes, and each of the tensors represents the same phenotype (i.e. if the plurality of genomes share a single phenotype) or mixture of phenotypes (i.e. if the plurality of genomes have various phenotypes). Training a supervised learning algorithm using these tensors as input data would require training an algorithm to “recognize phenotypes” based on training data with a single label; however, no description is provided in the Specification as to how this would be performed or whether the tensors are even used as the training data set. Alternatively, if the sets of genomes are used as input into the classifier, rather than the tensors, “training a neural network classifier, based on the tensors,” instead would require using the tensors as part of the learning process (rather than as input data); however, Applicant’s specification does not provide any detail as to how this would be performed. Further regarding Applicant’s remarks that the machine learning model then uses the testing group to test the generated classifier, the same issues regarding the type of inputs/labels into the classifier (e.g. from the test set, rather than the training set) and how the tensors are related to the classifier arise.
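For illustration only, the claim interpretation applied above, in which each distribution (and hence each tensor) corresponds to a window and draws on every sampled set, can be sketched as follows. The event counts, the function name, and the choice of a 4-tuple of mean, variance, skewness, and kurtosis are hypothetical assumptions for this sketch; returning zeros for the higher moments when the variance is zero is likewise an assumption, since those moments are undefined in that case.

```python
import statistics


def window_tensor(counts_per_set):
    """Return (mean, variance, skewness, kurtosis) of one window's counts."""
    n = len(counts_per_set)
    mean = statistics.fmean(counts_per_set)
    variance = statistics.pvariance(counts_per_set, mu=mean)
    if variance == 0:
        # Assumption: report 0.0 for the undefined standardized moments.
        return (mean, 0.0, 0.0, 0.0)
    std = variance ** 0.5
    skewness = sum(((c - mean) / std) ** 3 for c in counts_per_set) / n
    kurtosis = sum(((c - mean) / std) ** 4 for c in counts_per_set) / n
    return (mean, variance, skewness, kurtosis)


# Hypothetical event counts: each row is one randomly sampled set of
# genomes, each column is one window.
event_counts = [
    [2, 0, 5],
    [4, 0, 5],
    [6, 0, 5],
]
num_windows = len(event_counts[0])

# One tensor per window, computed across ALL sets for that window.
tensors = [
    window_tensor([set_counts[w] for set_counts in event_counts])
    for w in range(num_windows)
]
```

Under this reading, no tensor is tied to a particular set of genomes, so no tensor inherits a phenotype label from a particular set.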

Claim Rejections - 35 USC § 112(b)
The rejection of claims 12-20 under 35 U.S.C. 112(b) in the Office action mailed 09 Feb. 2021 has been withdrawn in view of claim amendments received 28 April 2021.
Applicant’s arguments at pg. 10, para. 4 to pg. 11, para. 2 regarding 35 U.S.C. 112(b), filed 28 April 2021, have been fully considered but they do not pertain to the new ground of rejection under 35 U.S.C. 112(b) set forth below.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 1-10 and 12-21 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. This rejection is newly recited and necessitated by claim amendment.
Claims 1, 10, and 12, and claims dependent therefrom, are indefinite for recitation of “training/train a neural network classifier, based on the tensors, to recognize phenotypes”. Prior to this limitation, claims 1, 10, and 12 recite “splitting a plurality of genomes…; randomly sampling the plurality of genomes…; determining a tensor for each window [based on the randomly sampled genomes]”. It is unclear in what the neural network classifier is trained to recognize phenotypes. For example, it is unclear if the neural network classifier is trained to recognize phenotypes in a single genome, or if the neural network classifier is trained to recognize phenotypes in a plurality of genomes. Applicant’s specification at para. [0003]-[0005] discloses that one or more phenotypes is diagnosed from an input genome, which suggests the neural network is trained to recognize phenotypes in a single genome; however, the claims recite that the neural network is trained based on tensors generated from a plurality of genomes, and the specification at para. [0022] discloses that the classifier is trained on sets of genomes, which suggests the neural network is trained to recognize phenotypes in a plurality of genomes. Therefore, even when read in light of the specification, the metes and bounds of the claims are unclear. For purpose of examination, the limitation is interpreted to mean that the classifier is trained to recognize phenotypes in any type of input.
Claims 6 and 17 are indefinite for recitation of “… automatically administer/administering a treatment to an individual based on the diagnosis”. There is insufficient antecedent basis for this limitation in the claims because independent claims 1 and 12, from which claims 6 and 17 depend, do not recite “a diagnosis”.
Claim 17 is indefinite for recitation of “…wherein the computer program code further causes the processor to automatically administer a treatment to an individual based on the diagnosis”. However, a processor is not capable of administering a treatment to an individual. This adds a process step beyond what a processor can perform, resulting in a claim that is both a product and a process, and rendering the claim indefinite. See MPEP 2173.05(p). For purpose of examination, the limitation is interpreted to mean a treatment recommendation is provided, as supported by Applicant’s specification at para. [0042].
Claims 8 and 19 are indefinite for recitation of “…having a contribution to the one or more phenotypes…”. Independent claims 1 and 12, from which claims 8 and 19 depend, recite “training a neural network classifier, based on the tensors, to recognize phenotypes”, which requires that the classifier can recognize multiple phenotypes (i.e. more than one phenotype). Therefore, it is unclear if claims 8 and 19 are intended to require that generating the tensors comprises selecting windows with a contribution to one or more of the phenotypes that is above a threshold, or if claims 8 and 19 intend to require selecting windows with a contribution to the phenotypes that is above a threshold. As such, the metes and bounds of the claim are unclear. For purpose of examination, “the one or more phenotypes” is interpreted to mean “the phenotypes” or “one or more of the phenotypes”.

Claim Rejections - 35 USC § 112(d)
The rejection of claims 14-16 under 35 U.S.C. 112(d) in the Office action mailed 09 Feb. 2021 has been withdrawn in view of claim amendments received 28 April 2021.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3-5, 7, 9-10, 12, 14-16, 18, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Araya et al. (US 2018/0365372 A1, effectively filed 19 June 2017 based on priority to provisional application No. 62/521,759; previously cited) in view of Angermueller et al. (Deep learning for computational biology, 2016, Molecular Systems Biology, 12:878, pg. 1-16; previously cited). Any newly recited portion herein is necessitated by claim amendment.
Regarding claims 1, 10, and 12, Araya et al. shows a method for determining phenotypes of molecular variants from a biological sample (Abstract) comprising the following steps:
Araya et al. shows determining multi-locus molecular measurements for a plurality of single cells (i.e. a plurality of genomes) ([0053]-[0054]).
Araya et al. shows subsampling (i.e. randomly sampling) the molecular measurements from single cells (i.e. the plurality of genomes) harboring the same molecular variant ([0066]; FIG. 9).
Araya et al. shows using the subsampling (i.e. the random sampling) to determine disjoint estimates of summary statistics relating to the tendency, dispersion, shape, range, covariation, or error of the molecular measurements associated with the variant ([0066]; [0069]; [0070]; FIG. 9, e.g. distributions generated for each molecular measurement for variant 1 by sampling layer), which shows determining a distribution of molecular measurements for each variant.
Araya et al. shows determining a molecular signals matrix, wherein each column of the matrix is a vector for a molecular signal (i.e. a single-order tensor for the multi-locus molecular measurement) based on the summary statistics (FIG. 10; [0063], e.g. summary statistics used to construct molecular signals matrix 1012).
Araya et al. shows training a neural network using a training set of the matrix of molecular signals (i.e. the tensors) to predict phenotypic impacts (i.e. recognize phenotypes) (FIG. 13; claims 102-103).
Further regarding claim 10, Araya et al. shows the method can be implemented in a tangible computer readable medium (para. [0138]).
Further regarding claim 12, Araya et al. shows the method can be implemented by a system comprising a processor and memory ([0130]; FIG. 19, processor #1904 and memory #1908).
Regarding claims 3, 12, and 14, Araya et al. shows the molecular signals can correspond to a multi-locus count of mutation of the molecules within single cells ([0053]-[0054]), such that determining the distribution of molecular signals for each locus necessarily comprises counting the events (i.e. mutations) for that locus.
Regarding claims 4-5, 12, and 15-16, Araya et al. shows the molecular signals matrix comprises columns (i.e. vectors) corresponding to each molecular measurement (e.g. each locus) (Fig. 9), and that the summary statistics for the matrix include a mean, variance, skewness, and kurtosis ([0057]), which shows that determining the tensor comprises forming an n-tuple comprising a mean, a variance, a skewness, and a kurtosis.
Regarding claims 7 and 18, Araya et al. shows performing principal component analysis to determine the top 100 molecular scores (i.e. ranking the multi-locus molecular measurements) ([0065]; [0089]; FIG. 11).
Regarding claim 21, Araya et al. shows the molecular signals matrix includes a tendency, dispersion, shape, range, or error of molecular measurement scores ([0057]; FIG. 9), which shows generating a matrix with only a tendency (i.e. consisting of a 1-tuple).

Araya et al. does not show the following limitations:
Regarding claims 1, 10, and 12, Araya et al. does not explicitly show splitting the plurality of genomes into respective groups of non-overlapping windows. 
Regarding claims 9 and 20, Araya et al. does not show splitting a plurality of genomes into respective groups of non-overlapping windows comprises splitting a corresponding region of each genome into a fixed number of windows.
However, as discussed above, Araya et al. shows each of the molecular measurements corresponds to a multi-locus measurement ([0053]-[0054]). Furthermore, this limitation was known in the art, before the effective filing date of the claimed invention, as shown by Angermueller et al.
Regarding claims 1, 9-10, and 12, Angermueller et al. overviews deep learning in genomics (Abstract), which includes splitting a genome sequence into windows around a trait of interest, and then using the windows in a neural network (pg. 2, col. 2, para. 3; Fig. 2). Angermueller et al. further shows the windows can be fixed-size DNA windows (pg. 6, col. 1, para. 4), which corresponds to a fixed number of windows for a genome sequence. Angermueller et al. further shows that considering sequence windows centered on a trait of interest is a widely used approach that increases the number of input-output pairs from a single individual (pg. 9, col. 1, para. 4).
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the method shown by Araya et al. to have split the genome sequences into a fixed number of non-overlapping windows, as shown by Angermueller et al. (pg. 2, col. 2, para. 3; pg. 6, col. 1, para. 4; Fig. 2). The motivation would have been to increase the number of input-output pairs from a single individual, as shown by Angermueller et al. (pg. 9, col. 1, para. 4). This modification would have had a reasonable expectation of success because Araya et al. shows the molecular measurements include sequencing data (claim 27) and further shows the molecular measurements can be multi-locus measurements, such that Araya et al. can incorporate measurements from a genomic window. Therefore, the invention is prima facie obvious.

Claims 2 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Araya et al. in view of Angermueller et al., as applied to claims 1 and 12 above, and further in view of Kirk (WO 02/099452 A1; Pub. Date: 12 December 2002; previously cited). This rejection is previously recited.
Regarding claims 2 and 13, Araya et al. does not show that sampling the plurality of genomes comprises a sampling with repetition allowed, such that any set may include a given genome more than once. However, this limitation was known in the art before the effective filing date of the claimed invention, as shown by Kirk et al.
Regarding claims 2 and 13, Kirk et al. shows a method for analyzing biological data (Abstract), which includes determining subsets of a set of samples by sampling with replacement and then calculating statistical properties for each subset (pg. 27, lines 32-36 to pg. 38, line 3; pg. 46 lines 1-9).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the subsampling shown by Araya et al. to have used sampling with replacement to determine each subset, as shown by Kirk et al. (pg. 27, lines 32-36 to pg. 38, line 3; pg. 46 lines 1-9). The motivation would have been the simple substitution of one known element (i.e. the subsampling of Araya et al.) for another (i.e. the sampling with repetition shown by Kirk et al.) to have yielded the predictable result of obtaining the estimates of the summary statistics of Araya ([0066]) by sampling with repetition. Therefore, the invention is prima facie obvious.
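For illustration only, sampling with repetition allowed, followed by calculation of a statistical property for each subset, can be sketched as follows. This sketch is not taken from Araya et al. or Kirk et al.; the function name, the measurement values, the number of sets, and the fixed seed are hypothetical assumptions.

```python
import random
import statistics


def sample_sets_with_replacement(items, num_sets, set_size, seed=0):
    """Draw num_sets subsets; any item may appear more than once in a subset."""
    rng = random.Random(seed)
    # random.Random.choices samples with replacement.
    return [rng.choices(items, k=set_size) for _ in range(num_sets)]


# Hypothetical measurements, one per sample.
measurements = [1.0, 2.0, 3.0, 4.0, 5.0]
subsets = sample_sets_with_replacement(measurements, num_sets=4, set_size=5)

# One summary statistic (here, the mean) per subset.
subset_means = [statistics.fmean(s) for s in subsets]
```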

Claims 6 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Araya et al. in view of Angermueller et al., as applied to claims 1 and 12 above, and further in view of Vural et al. (Classification of breast cancer patients using somatic mutation profiles and machine learning approaches, 2016, 10(Suppl 3):62, pg. 263-380; previously cited). This rejection is previously recited.
Regarding claims 6 and 17, Araya et al. in view of Angermueller et al., as applied to claims 1 and 12 above, does not show automatically administering a treatment to an individual based on the diagnosis. However, this limitation was known in the art before the effective filing date of the claimed invention, as shown by Vural et al.
Regarding claims 6 and 17, Vural et al. shows a method for classifying breast cancer patients using mutation profiles and machine learning (Abstract), and discloses that the effectiveness of specific treatments varies among cancer patients, and that classification of cancer based on a mutation profile could be useful for cancer diagnosis and treatment (pg. 264, col. 1, para. 1 and 3). Vural et al. further shows that classification of cancer patients is an important step towards making accurate treatment decisions (pg. 270, col. 2, para. 3).
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the method and system of Araya et al. in view of Angermueller et al., as applied to claims 1 and 12 above, to have administered a treatment to an individual based on the diagnosis because Vural et al. shows that the effectiveness of specific treatments varies among cancer patients, and classification of cancer is useful for making accurate treatment decisions (pg. 264, col. 1, para. 1 and 3; pg. 270, col. 2, para. 3). This modification would have had a reasonable expectation of success because Araya et al. shows that the classifications can include risks of developing cancer ([0078]) and relate to variations in response to therapeutic treatment ([0048]). Therefore, the invention is prima facie obvious.

Claims 8 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Araya et al. in view of Angermueller et al., as applied to claims 1 and 12 above, and further in view of Williams (Everything you did and didn’t know about PCA, 2016, Its Neuronal, pg. 1-9; previously cited). This rejection is previously recited.
Regarding claims 8 and 19, Araya et al. in view of Angermueller et al., as applied to claims 1 and 12 above, does not show that generating the tensor comprises selecting only those windows having a contribution to the one or more phenotypes that is above a threshold value. However, this limitation was known in the art before the effective filing date of the claimed invention, as shown by Williams.
Regarding claims 8 and 19, Williams provides an overview of principal component analysis (PCA) and shows that PCA reduces the dimensionality of the data being worked with by keeping only the top k components, i.e. by discarding all components whose singular values (i.e. contribution) are below a specified threshold, so that only components with singular values above the threshold are kept (pg. 7, para. 5-7).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to have modified the PCA for selecting molecular features of Araya et al., which would correspond to windows of Angermueller et al., to have selected only those principal components whose singular values (i.e. contribution) are above a threshold, as shown by Williams (pg. 7, para. 5-7). The motivation would have been to reduce the dimensionality of the data being used, as shown by Williams (pg. 7, para. 5). This modification would have had a reasonable expectation of success because Araya et al. shows using PCA for dimensionality reduction ([0065]; [0089]). Therefore, the invention is prima facie obvious.
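For illustration only, the thresholding attributed to Williams above (keeping the principal components whose singular values exceed a specified threshold) can be sketched as follows. This is a minimal sketch and is not taken from Williams, Araya et al., or the instant application; the function name, the synthetic data, and the threshold value are all assumptions chosen for the example.

```python
import numpy as np


def keep_components_above_threshold(data, threshold):
    """Project data onto the principal components whose singular values exceed threshold.

    Hypothetical helper for illustration; rows are samples, columns are features.
    """
    # PCA operates on mean-centered data.
    centered = data - data.mean(axis=0)
    # Thin SVD: centered = U @ diag(s) @ Vt, with s sorted in descending order.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the components whose singular value is above the threshold.
    keep = s > threshold
    return centered @ vt[keep].T, s[keep]


# Assumed synthetic data: 50 samples, 3 features, the third nearly constant.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3)) * np.array([5.0, 2.0, 0.01])
reduced, kept_singular_values = keep_components_above_threshold(data, threshold=1.0)
print(reduced.shape, kept_singular_values)
```

In this sketch the nearly constant third feature yields a singular value far below the assumed threshold, so the data are reduced from three dimensions to two.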

Response to Arguments
Applicant's arguments filed 28 April 2021 regarding 35 U.S.C. 103 have been fully considered but they are not persuasive. 
Applicant remarks that in FIG. 9 of Araya, different cells, c1-c9 are grouped into variants v1-v3, and this is used to assess the phenotypic impact of variants across a wide array of variant types; therefore, Araya fails to show a random sampling of genomes into sets, and it would be unreasonable to modify Araya to do so as such a modification would change a fundamental principle of operation for the invention (Applicant’s remarks at pg. 12, para. 4 to pg. 13, para. 1).
This argument is not persuasive. While Araya et al. at FIG. 9 does show grouping cells into groups that share a common variant (e.g. v1, v2, and v3), Araya et al. additionally shows subsampling molecular measurements from single-cells harboring the same molecular variant (e.g. the single-cells with v1) ([0066]), and further shows the molecular measurements correspond to multi-locus measurements ([0054]). Therefore, within a single group of cells harboring a variant (e.g. v1), Araya et al. shows randomly sampling the molecular measurements at a particular multi-locus within that group of cells (e.g. subsampling of genomes/single cells) to generate estimates of summary statistics for that molecular measurement ([0054]; [0066]; FIG. 9, e.g. #908, e.g. a distribution for each molecular measurement for v1). Accordingly, while Araya et al. does show grouping the single-cells into groups v1, v2, v3, this feature is not used to show the random sampling of genomes into sets in the above rejection, given Araya et al. also shows randomly sampling the single-cells/genomes within each group v1, v2, and v3, to generate the distribution of events for each molecular measurement (e.g. each multi-locus).

Applicant remarks dependent claims 2-9 depend from claim 1 and claims 13-20 depend directly from claim 12, and therefore claims 2-9 and 13-20 are non-obvious over the cited references due to their dependencies from claims 1 and 12, respectively (Applicant’s remarks at pg. 13, para. 3).
This argument is not persuasive for the same reasons discussed above for claims 1 and 12.

Applicant remarks that claim 21 depends from claim 1, includes all of the elements of its parent claim, and is therefore in condition for allowance for the reasons described above (Applicant’s remarks at para. 5).
This argument is not persuasive for the same reasons discussed above for claims 1 and 12.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-2, 6, 10, 12-13 and 17 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 8 and 15 of copending Application No. 15/900,048 (reference application) in view of Wu et al. (U.S. Patent No. 5,845,049; 1 Dec. 1998; cited in IDS filed 07 Dec. 2017; previously cited). This rejection is previously recited.
Although the claims at issue are not identical, they are not patentably distinct from each other because:
Instant Claim | Instant Limitation | Reference Limitation | Reference Claim
12 | A system… comprising: a hardware processor; and a memory that… implements: | A processing system… comprising: a processor in communication with one or more types of memory, the processor configured to perform a method… | 15
1 | A genetic diagnosis method, comprising | performing a method comprising: | 8, 15
1, 12 | splitting a plurality of genomes into respective groups of non-overlapping windows; | dividing genomic information of N subjects…. into windows with or without repetition | 8, 15
1, 12 | randomly sampling the plurality of genomes into a plurality of sets, each set comprising a plurality of selected genomes; | generating a set of training data for each phenotype by selecting with or without repetition genomic information of N subjects into windows….; | 8, 15
1, 12 | determining a distribution of events across the plurality of sets in each window; | computing a distribution of events in windows for each phenotype of the set of test data; | 8, 15
1, 12 | determining a tensor for each window based on statistical properties of the distribution of events for the window; | extracting, for each window, a tensor that represents a distribution of genomic events in windows for each phenotype of the set of test data; | 8, 15
1, 12 | training a neural network classifier based on the tensors to recognize phenotypes; and | classifying each phenotype of the set of test data with a classifier; and | 8, 15
2, 13 | The method of claim 1, wherein sampling the plurality of genomes comprises a sampling with repetition allowed, such that any set may include a given genome more than once. | …dividing genomic information of N subjects selected with or without repetition into windows… | 8, 15
6, 17 | The method of claim 1, further comprising automatically administering a treatment to an individual based on the diagnosis. | implementing a treatment plan to the patient based on the phenotype assigned to the patient. | 8, 15
10 | A non-transitory computer readable storage medium comprising a computer readable program for genetic diagnosis, wherein the computer readable program when executed on a computer causes the computer to perform the steps of claim 1. | a computer readable storage medium readable by a processing circuit and storing program instructions for execution by the processing circuit for performing a method comprising:…. | 8


Regarding instant claims 1, 10, and 12, reference claims 8-9 and 15-16 do not show the classifier is a neural network. However, this limitation was known in the art before the effective filing date of the claimed invention, as shown by Wu et al.
Regarding instant claims 1, 10, and 12, Wu et al. discloses a method for protein family identification using an n-gram term weighting algorithm for extracting motif patterns and neural networks for combining global and motif sequence information (Abstract), which includes splitting a full length sequence into motifs (Col. 4, lines 6-15) and determining a sum of occurrences for all n-grams (i.e. events) for each motif (i.e. window) in the motif training sequence training set (i.e. samples/genomes) (Col. 7, lines 23-40), and then training a neural network to predict membership using the training sequences (col. 4, lines 6-15).
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the classifier of reference claims 8-9 and 15-16 to have used a neural network classifier, as shown by Wu et al. (Abstract; col. 4, lines 6-15). The motivation would have been the substitution of one known element (i.e. the classifier of reference claims 8-9 and 15-16) for another (i.e. the neural network, shown by Wu et al.) to have yielded the predictable result of a neural network classifier trained on sequence data.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
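For illustration only, the windowing, sampling, and distribution steps listed in the instant limitations of the claim comparison above can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the implementation of either application: the function name, the representation of each genome as a list of event positions, and the use of a (mean, variance) pair as a stand-in for the per-window tensor are all hypothetical.

```python
import random
from statistics import mean, pvariance


def window_event_stats(event_positions, genome_length, window_size,
                       n_sets, set_size, seed=0):
    """Return a (mean, variance) pair of event counts per window across sampled sets.

    Hypothetical helper: event_positions holds one list of event positions per genome.
    """
    rng = random.Random(seed)
    n_windows = genome_length // window_size

    # Split each genome into non-overlapping windows and count events per window.
    counts = []
    for positions in event_positions:
        row = [0] * n_windows
        for p in positions:
            w = p // window_size
            if w < n_windows:
                row[w] += 1
        counts.append(row)

    # Randomly sample genomes into sets, with repetition allowed, so a set
    # may include a given genome more than once.
    sets = [rng.choices(range(len(counts)), k=set_size) for _ in range(n_sets)]

    # For each window, summarize the distribution of event counts across the sets.
    stats = []
    for w in range(n_windows):
        per_set = [sum(counts[g][w] for g in members) for members in sets]
        stats.append((mean(per_set), pvariance(per_set)))
    return stats


# Assumed toy input: three genomes of length 20 with a window size of 5.
genomes = [[1, 5, 12], [3, 14], [7, 8, 9, 18]]
stats = window_event_stats(genomes, genome_length=20, window_size=5,
                           n_sets=10, set_size=4)
print(stats)
```

The per-window pairs produced here are the kind of statistical summary that could then be supplied to a classifier; the choice of mean and variance is an assumption made for the example, not a statement of what either set of claims requires.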

Response to Arguments
Applicant’s remarks regarding the provisional double patenting rejection of claims 1-2, 6, 10, 12-13, and 17 (Applicant’s remarks at pg. 7, para. 5 to pg. 8, para. 2) have been considered but they do not include arguments pertaining to the double patenting rejection. 

Conclusion
No claims are allowed.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KAITLYN L MINCHELLA whose telephone number is (571)272-6485.  The examiner can normally be reached on 7:00 - 4:00 M-Th.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek can be reached on (571) 272-9047.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/K.L.M./Examiner, Art Unit 1631
/OLIVIA M. WISE/Primary Examiner, Art Unit 1631