DETAILED ACTION
Applicant’s response, 12 April 2022, has been fully considered. The following rejections and/or objections are either reiterated or newly applied. They constitute the complete set presently being applied to the instant application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 3, 13-15, and 19 are cancelled.
Claims 1-2, 4-12, 16-18, and 20-22 are pending.
Claims 2, 10-11, and 18 are withdrawn from further consideration pursuant to 37 CFR 1.142(b) as being drawn to nonelected species, there being no allowable generic or linking claim. Election was made without traverse in the reply filed on 14 July 2020.
Claims 1, 4-9, 12, 16-17, and 20-22 are rejected.

Claim Objections
The objection to claims 1, 5, 17, and 20-21 in the Office action mailed 16 Dec. 2021 has been withdrawn in view of claim amendments received 12 April 2022.

Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1, 4-9, 12, 16-17, and 20-22 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor at the time the application was filed, had possession of the claimed invention. This rejection is newly recited and necessitated by claim amendment.
Independent claims 1 and 20-21, and claims dependent therefrom, recite “…obtaining initial variant occurrence frequencies for an initial set of locations different from the at least some of the plurality of locations; and identifying…the at least some of the plurality of locations from among the initial set of locations”. Thus the claims involve identifying the at least some of the plurality of locations based on variant occurrence frequencies from any subject/individual and/or group of subjects/individuals.
Under certain circumstances, omission of a limitation can raise an issue regarding whether the inventor had possession of a broader, more generic invention. See, e.g., Gentry Gallery, Inc. v. Berkline Corp., 134 F.3d 1473, 45 USPQ2d 1498 (Fed. Cir. 1998). In this case, Applicant’s specification discloses at para. [0092] that some of the multiple allele sites may be non-informative for distinguishing among subpopulations because the frequency of variant occurrence at non-informative sites may not differ significantly among subpopulations, and thus in some embodiments, a subset of the multiple allele sites is selected by identifying sites having a difference in the frequency of variant occurrence between different subpopulations. Accordingly, Applicant’s specification provides support for selecting locations based on variant occurrence frequencies of loci across populations, which are then used in a model for identifying a subpopulation to which an individual belongs. However, Applicant’s specification does not provide support for the broader limitation of identifying the at least some of the plurality of locations, which are informative in a model for distinguishing between subpopulations, from a set of locations based on any variant occurrence frequencies (e.g. of a single subject/individual or population), as encompassed by the broadest reasonable interpretation of the claims.
For the reasons discussed above, the specification does not provide a sufficient disclosure of the limitation of “…obtaining initial variant occurrence frequencies for an initial set of locations different from the at least some of the plurality of locations; and identifying…the at least some of the plurality of locations from among the initial set of locations” recited in claims 1 and 20-21 to demonstrate to one of ordinary skill in the art that the inventor possessed the invention at the time the application was filed. THIS IS A NEW MATTER REJECTION. For more information regarding the written description requirement, see MPEP § 2161.01 to § 2163.07(b).

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.



Claims 1, 4-9, 12, 16-17, and 20-22 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. This rejection is newly recited and necessitated by claim amendment.
Independent claims 1 and 20-21, and claims dependent therefrom, are indefinite for recitation of “…obtaining initial variant occurrence frequencies for an initial set of locations different from the at least some of the plurality of locations; and identifying…the at least some of the plurality of locations from among the initial set of locations”. It is unclear in what way the at least some of the plurality of locations can be identified from among the initial set of locations if the initial set of locations is different from the at least some of the plurality of locations (i.e. the initial set of locations does not include the at least some of the plurality of locations). It is therefore unclear whether Applicant intended for the initial set of locations to include the at least some of the plurality of locations or to be different from the at least some of the plurality of locations. As such, the metes and bounds of the claims are unclear. For purpose of examination, the initial set of locations is interpreted to comprise/include the at least some of the plurality of locations, such that the at least some of the plurality of locations can be identified among the initial set of locations.

Claim Rejections - 35 USC § 103
The rejection of claims 1, 4-5, 7-8, 12, 17, and 20-22 under 35 U.S.C. 103 as being unpatentable over Yuan et al. (One Size Doesn't Fit All - RefEditor: Building Personalized Diploid Reference Genome to Improve Read Mapping and Genotype Calling in Next Generation Sequencing Studies, 2015, PLOS Computational Biology, 11(8), pg. 1-20; previously cited) in view of Pardo-Seco et al. (Evaluating the accuracy of AIM panels at quantifying genome ancestry, 2014, BMC Genomics, 15:543, pg. 1-12; previously cited), Liu et al. (MaCH-Admix: Genotype Imputation for Admixed Populations, 2013, Genet Epidemiol., 37(1), pg. 25-37; previously cited), and Siren et al. (Indexing Graphs for Path Queries with Applications in Genome Research, 2014, IEEE/ACM Transactions on computational biology and bioinformatics, 11(2), pg. 375-388; previously cited) in the Office action mailed 16 Dec. 2021 has been withdrawn in view of claim amendments received 12 April 2022.
The rejection of claim 9 under 35 U.S.C. 103 as being unpatentable over Yuan et al. in view of Pardo-Seco et al., Liu et al., and Siren et al., as applied to claim 8 above, and further in view of Torkamaneh et al. (Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies, 2016, PLOS one, 11(8), pg. 1-14) in the Office action mailed 16 Dec. 2021 has been withdrawn in view of claim amendments received 12 April 2022.
The rejection of claim 16 under 35 U.S.C. 103 as being unpatentable over Yuan et al. in view of Pardo-Seco et al., Liu et al., and Siren et al., as applied to claim 1 above, and further in view of Sampson et al. (Selecting SNPs to Identify Ancestry, 2011, Ann Hum Genet., 75(4), pg. 539-553) in the Office action mailed 16 Dec. 2021 has been withdrawn in view of claim amendments received 12 April 2022.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 4-8, 12, 17, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al. (One Size Doesn't Fit All - RefEditor: Building Personalized Diploid Reference Genome to Improve Read Mapping and Genotype Calling in Next Generation Sequencing Studies, 2015, PLOS Computational Biology, 11(8), pg. 1-20; previously cited) in view of Pardo-Seco et al. (Evaluating the accuracy of AIM panels at quantifying genome ancestry, 2014, BMC Genomics, 15:543, pg. 1-12; previously cited), Liu et al. (MaCH-Admix: Genotype Imputation for Admixed Populations, 2013, Genet Epidemiol., 37(1), pg. 25-37; previously cited), Pritchard et al. (Documentation for structure software: Version 2.2, 2007, pg. 1-36; previously cited), and Siren et al. (Indexing Graphs for Path Queries with Applications in Genome Research, 2014, IEEE/ACM Transactions on computational biology and bioinformatics, 11(2), pg. 375-388; previously cited). This rejection is newly recited and necessitated by claim amendment.
Regarding claims 1 and 20-21, Yuan et al. shows a method for constructing a personalized reference genome comprising the following steps (Abstract):
Yuan et al. shows receiving sequencing reads using a genotyping array platform for an individual (i.e. obtaining a plurality of sequence reads for an individual) (pg. 3, para. 2; pg. 4, para. 3; pg. 12, para. 3; Figure 2).
Yuan et al. shows obtaining information identifying 6 million SNPs (i.e. obtaining information identifying a plurality of locations) (pg. 5, para. 3).
Yuan et al. shows determining genotypes from the genotyping array to obtain a set of alternative alleles (i.e. a first set of variants) for the 6 million SNP locations (i.e. the plurality of locations) in the array (Figure 1-2; pg. 5, para. 3).
Yuan et al. shows imputing genotypes for the individual to identify genetic variants that are not found on the genotyping array (i.e. identifying a second set of variants associated with the first set of variants) (pg. 2, para. 3 to pg. 3, para. 1). 
Yuan et al. shows generating a customized reference genome (i.e. a personalized reference sequence construct) by adding the genetic variants identified by genotype imputation (i.e. the second set of variants) to a universal human reference genome (i.e. an initial reference sequence construct including a human reference genome) (Figure 1 B; pg. 2, para. 3 to pg. 3, para. 3). Given that Yuan et al. shows including all assayed genotypes in the customized reference genome, rather than only the genotypes specifically used for imputation, this necessarily shows including variants at non-informative locations in the reference, in addition to the imputed genotypes (i.e. the second set of variants).
Yuan et al. shows mapping (i.e. aligning) the reads to the customized reference sequence construct (Figure 1; pg. 15, para. 4).
Further regarding claims 1 and 20, Yuan et al. shows the method is implemented in software and is run on a single core processor (pg. 3, para. 2; pg. 13, para. 6).
Regarding claim 4, Yuan et al. shows the population-specific reference panels consist of African and European haplotypes (pg. 5, para. 4), such that the genotype imputation identifies variants associated with both a first and second subpopulation (i.e. a third set of variants associated with a second subpopulation), and then adding the imputed genotypes (which include the third set of variants) to the reference genome to generate the customized reference construct (Figure 1; pg. 14, para. 5).
Regarding claim 7, Yuan et al. shows genotyping the sequence reads uses the universal reference genome (Figure 2), which is a different reference construct than the customized reference sequence construct.
Regarding claim 8, Yuan et al. shows the universal reference genome is a linear reference sequence (Figure 2).

Yuan et al. does not show the following limitations:
Regarding claims 1 and 20, Yuan et al. does not show at least one non-transitory computer-readable storage medium for performing the method by the processor. However, Yuan et al. shows the method is implemented in software (pg. 3, para. 2), as discussed above, which necessarily requires a suitably programmed computer with a non-transitory computer-readable storage medium storing instructions for the claimed method. Furthermore, broadly providing an automatic or mechanical means to replace a manual activity which accomplished the same result is not sufficient to distinguish over the prior art. See MPEP 2144.04 III.

Further regarding claims 1, 17, and 20-21, Yuan et al. does not show the plurality of locations consist of a number of locations between 100,000 and 5 million, as recited in claims 1 and 20-21. Yuan et al. further does not show obtaining the information identifying the plurality of informative locations comprises identifying locations at which frequencies of variant occurrence vary among at least some subpopulations in a plurality of subpopulations, as recited in claim 17. However, these limitations were known in the art, before the effective filing date of the claimed invention, as shown by Pardo-Seco et al. and Liu et al.
Regarding claims 1, 17, and 20-21, Pardo-Seco et al. discloses methods for evaluating the accuracy of ancestry-informative markers for predicting ancestry (Abstract), which includes selecting SNPs that exhibit differences in allele frequencies between populations (i.e. a plurality of locations at which frequencies of variant occurrence differ between subpopulations) and genotyping the ancestry-informative markers (i.e. identifying the first set of variants) (pg. 2, col. 1, para. 1 and col. 2, para. 2), including using a set of SNPs consisting of 1,440,616 SNPs (i.e. between 100,000 and 5 million locations) (Table 1). Pardo-Seco et al. further shows that the more closely related the populations under study are, the larger the number of SNPs needed to infer ancestry, and that using the panel consisting of 1,440,616 SNPs more clearly separates two closely related populations compared to using SNP panels of 100,000, 50,000, 1,000, and 500 SNPs (pg. 7, col. 1, para. 2).
Further regarding claims 1, 17, and 20-21, Liu et al. shows it is imperative to incorporate underlying ancestry information when selecting an appropriate reference panel to appropriately impute genotypes (pg. 2, para. 2).
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention to have modified the method of Yuan et al., to have obtained information identifying a plurality of locations consisting of between 100,000 and 5 million locations by identifying locations with varying variant occurrences between subpopulations (e.g. selecting the 1,440,616 ancestry informative markers) and genotyping the locations, as shown by Pardo-Seco et al. (pg. 2, col. 1, para. 1 and col. 2, para. 2; Table 1). One of ordinary skill in the art would have been motivated to modify the locations of Yuan et al. with the locations identified by Pardo-Seco et al. in order to infer ancestry between closely related populations, as shown by Pardo-Seco et al. (pg. 7, col. 1, para. 2), and then use the inferred ancestry to select an appropriate reference panel for imputing genotypes, as shown by Liu et al. (pg. 2, para. 2). This modification would have had a reasonable expectation of success because Yuan et al. shows genotyping an individual and then imputing genotypes from the assayed genotypes, such that the selected reference panel of Pardo-Seco would be applicable to the method of Yuan et al. (Figure 1 B; pg. 5, para. 4 to pg. 6, para. 1).
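For illustration only, the kind of selection discussed above (keeping locations at which variant occurrence frequencies differ between subpopulations) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the procedure of Pardo-Seco et al. or of Applicant's specification: the function name `select_informative_sites`, the `min_delta` threshold, the population labels, and the site identifiers are all hypothetical.

```python
def select_informative_sites(freqs_by_pop, min_delta):
    """Return sites whose variant frequency differs by at least min_delta
    between the most and least frequent subpopulation.

    freqs_by_pop maps a subpopulation label to a dict of site -> frequency.
    Only sites present in every subpopulation are considered.
    """
    pops = list(freqs_by_pop)
    shared_sites = set.intersection(*(set(freqs_by_pop[p]) for p in pops))
    informative = []
    for site in sorted(shared_sites):
        values = [freqs_by_pop[p][site] for p in pops]
        if max(values) - min(values) >= min_delta:
            informative.append(site)
    return informative


# Hypothetical frequencies for three sites in two subpopulations.
example_freqs = {
    "pop_A": {"chr1:1000": 0.10, "chr1:2000": 0.50, "chr1:3000": 0.90},
    "pop_B": {"chr1:1000": 0.80, "chr1:2000": 0.52, "chr1:3000": 0.15},
}
selected = select_informative_sites(example_freqs, min_delta=0.3)
```

In this hypothetical input, the middle site has nearly the same frequency in both subpopulations and is therefore dropped as non-informative, while the other two sites are retained.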

Regarding claims 1, 5, and 20-21, Yuan et al. does not show the identified second set of variants are associated with the first set of variants at the plurality of informative locations, as recited in claims 1 and 20-21. Yuan et al. further does not show accessing a model for identifying one or more subpopulations in a plurality of subpopulations to which the individual likely belongs, identifying, using the first set of variants and the model, a first and second subpopulation in the plurality of subpopulations to which the individual likely belongs, and identifying the second set of variants and a third set of variants associated with the first and second subpopulation, respectively. However, these limitations were known in the art, before the effective filing date of the claimed invention, as shown by Liu et al.
Regarding claims 1, 5, and 20-21, Liu et al. shows an ancestry-weighted approach for genotype imputation (Abstract), which includes estimating the ancestry proportions (i.e. a first and second subpopulation) for a target individual under investigation using an imputation-based approach (i.e. a model) to infer ancestry proportions from the constructed haplotypes of the target individual (i.e. the first set of variants) (pg. 6, para. 2). Liu et al. further shows performing the genotype imputation (i.e. identifying the second set of variants and a third set of variants) using a set of haplotype references from a pool of reference haplotypes according to the ancestry weights (i.e. associated with the first and second subpopulation) (pg. 6, para. 3). Liu et al. further shows it is imperative to incorporate underlying ancestry information when selecting an appropriate reference panel to appropriately impute genotypes (pg. 2, para. 2).
It would have been further prima facie obvious to have modified the method of Yuan et al. to have identified the second set of variants and a third set of variants based on the identified first set of variants for the plurality of locations by accessing a model for identifying subpopulations to which the individual belongs, using the model and the first set of variants to identify a first and second subpopulation to which the individual likely belongs, and identifying a second and third set of variants associated with the first and second subpopulations, respectively, as shown by Liu et al. (pg. 6, para. 2-3). One of ordinary skill in the art would have been motivated to combine the methods of Yuan et al. and Liu et al. to incorporate underlying ancestry information when determining an appropriate reference panel for genotype imputation, as shown by Liu et al. (pg. 2, para. 2). This modification would have had a reasonable expectation of success because Yuan et al. also shows using population-specific reference panels for the genotype imputation (pg. 5, para. 4), which shows the imputed genotypes are associated with specific populations.

Regarding claims 1, 6, and 20-21, Yuan et al., Pardo-Seco et al., and Liu et al. do not show the model comprises information indicating subpopulation-specific variant occurrence frequencies for at least some of the plurality of locations, wherein the at least some of the plurality of locations were identified by: obtaining initial variant occurrence frequencies for an initial set of locations different from the at least some of the plurality of locations; and identifying, based on the initial variant occurrence frequencies, the at least some of the plurality of locations from among the initial set of locations; and wherein identifying the first subpopulation is performed by comparing the subpopulation-specific variant occurrence frequencies with the first set of variants, as recited in claim 6. However, these limitations were known in the art, before the effective filing date of the claimed invention, as shown by Pritchard et al.
Regarding claims 1, 6, and 20-21, Pritchard et al. shows a method for inferring population structure (i.e. identifying subpopulations) (pg. 3, para. 1-3), which includes classifying the ancestry of individuals of unknown origin (pg. 10, para. 4) using a model comprising subpopulation-specific allele frequencies of a plurality of loci (i.e. some of the plurality of locations) of samples with known ancestry (pg. 3, para. 2; pg. 10, para. 8), wherein the model compares the allele frequencies (i.e. the first set of variants) of an individual with unknown origin to the allele frequencies of samples with labeled ancestry (i.e. the subpopulation-specific variant occurrence frequencies) (pg. 10, para. 4-7).
Regarding the process in which the at least some of the plurality of locations were identified, even though product-by-process claims are limited by and defined by the process, determination of patentability is based on the product itself. The patentability of a product does not depend on its method of production. If the product in the product-by-process claim is the same as or obvious from a product of the prior art, the claim is unpatentable even though the prior product was made by a different process. See MPEP 2113 I. In this case, the plurality of loci of Pritchard et al. is the same as the product of the claims (i.e. also a plurality of loci), and the method of production of the at least some of the plurality of locations in the claim does not impart any distinctive characteristics to the locations that would distinguish the locations of the claim from those disclosed by Pritchard et al. Therefore, Pritchard et al. discloses the at least some of the plurality of loci recited in the claims.
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the model made obvious by Yuan et al., Pardo-Seco et al., and Liu et al. to have used a model comprising subpopulation-specific allele frequencies to compare the subpopulation-specific variant occurrence frequencies with the first set of variants to identify the first subpopulation, as shown by Pritchard et al. (pg. 3, para. 1-3; pg. 10, para. 4-8). The motivation would have been the simple substitution of one known element (i.e. the model shown by Liu et al.) for another (i.e. the model shown by Pritchard et al.) to obtain the predictable result of estimating the ancestry of the individual, given that Liu et al. shows they used the structure software package to confirm their internal ancestry inference (pg. 6, para. 2). Furthermore, one of ordinary skill in the art would have found the results of the substitution to be predictable, given both models predict the ancestry of an individual.
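For illustration only, the comparison of an individual's variants with subpopulation-specific allele frequencies discussed above can be sketched as a simple per-subpopulation likelihood. This is a simplified sketch and is not the algorithm of the structure software described by Pritchard et al.: it assumes independent biallelic sites in Hardy-Weinberg proportions, genotypes coded as the count (0, 1, or 2) of variant alleles, and hypothetical names, site identifiers, and frequencies throughout.

```python
import math


def log_likelihood(genotypes, freqs, eps=1e-6):
    """Log-probability of the genotypes given one subpopulation's frequencies.

    genotypes maps site -> variant allele count (0, 1, or 2).
    freqs maps site -> variant allele frequency in the subpopulation.
    Frequencies are clamped to (eps, 1 - eps) to avoid log(0).
    """
    total = 0.0
    for site, count in genotypes.items():
        p = min(max(freqs[site], eps), 1.0 - eps)
        total += (
            math.log(math.comb(2, count))
            + count * math.log(p)
            + (2 - count) * math.log(1.0 - p)
        )
    return total


def most_likely_subpopulation(genotypes, freqs_by_pop):
    """Return the best-scoring subpopulation label and all scores."""
    scores = {pop: log_likelihood(genotypes, f) for pop, f in freqs_by_pop.items()}
    return max(scores, key=scores.get), scores


# Hypothetical model: subpopulation-specific frequencies at two sites.
model_freqs = {
    "pop_A": {"site1": 0.10, "site2": 0.90},
    "pop_B": {"site1": 0.80, "site2": 0.15},
}
# Hypothetical individual: homozygous variant at site1, homozygous reference at site2.
individual = {"site1": 2, "site2": 0}
best_pop, pop_scores = most_likely_subpopulation(individual, model_freqs)
```

The individual's genotypes are far more probable under the hypothetical `pop_B` frequencies, so that label receives the higher score.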

Further regarding claims 1, 12, and 20-22, Yuan et al., Pardo-Seco et al., Liu et al., and Pritchard et al. do not show generating the personalized graph reference construct by adding a second set of nodes and edges to the initial reference sequence construct, the second set of nodes and edges representing the second set of variants, as recited in claims 1 and 20-21. Pardo-Seco et al., Liu et al., and Pritchard et al. further do not show the personalized graph reference sequence construct and the initial reference sequence construct comprise a directed acyclic graph through which there are multiple paths, as recited in claims 12 and 22, respectively. However, these limitations were known in the art, before the effective filing date of the claimed invention, as shown by Siren et al.
Further regarding claims 1, 12, and 20-21, Siren et al. discloses a method for representing a linear representation of genomes with graph representations built on a single reference sequence and a set of variants of interest (Abstract; pg. 375, col. 1, para. 2) to generate a directed acyclic graph comprising nodes and edges reflecting the set of variations of interest (pg. 376, col. 2, para. 5; FIG. 5).
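For illustration only, a directed acyclic graph built on a single linear reference with added nodes and edges for variants, as discussed above, can be sketched as follows. This is a minimal sketch and is not the indexing method of Siren et al.: the class name `GraphReference`, the one-base-per-node layout, and the restriction to single-base substitutions at interior positions are all assumptions made for brevity.

```python
from collections import defaultdict


class GraphReference:
    """Toy graph reference: one node per base, edges point left to right."""

    def __init__(self, reference):
        self.labels = {}               # node id -> base
        self.edges = defaultdict(set)  # node id -> successor node ids
        self.ref_ids = []              # node ids along the linear reference
        for base in reference:
            node = self._add_node(base)
            if self.ref_ids:
                self.edges[self.ref_ids[-1]].add(node)
            self.ref_ids.append(node)

    def _add_node(self, label):
        node = len(self.labels)
        self.labels[node] = label
        return node

    def add_snv(self, pos, alt):
        """Add a substitution at 0-based position pos as an alternative branch."""
        if not 0 < pos < len(self.ref_ids) - 1:
            raise ValueError("this sketch only handles interior positions")
        node = self._add_node(alt)
        self.edges[self.ref_ids[pos - 1]].add(node)
        self.edges[node].add(self.ref_ids[pos + 1])
        return node

    def paths(self):
        """List every sequence spelled by a path from first to last node."""
        sink = self.ref_ids[-1]

        def walk(node):
            if node == sink:
                return [self.labels[node]]
            return [
                self.labels[node] + rest
                for nxt in sorted(self.edges[node])
                for rest in walk(nxt)
            ]

        return walk(self.ref_ids[0])


graph = GraphReference("ACGTA")
graph.add_snv(1, "T")  # hypothetical variant C>T at position 1
graph.add_snv(3, "G")  # hypothetical variant T>G at position 3
all_paths = graph.paths()
```

Because every edge points from an earlier reference position to a later one, the graph stays acyclic, and the two added variant nodes yield four distinct paths through it.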
It would have been further prima facie obvious to have modified the initial reference sequence construct of Yuan et al., to have been a directed acyclic graph comprising nodes and edges with multiple paths, as shown by Siren et al. (Abstract; pg. 375, col. 1, para. 2; pg. 376, col. 2, para. 5; FIG. 5), such that the resulting reference sequence construct, which includes the initial reference sequence construct, also comprises a directed acyclic graph with multiple paths. The motivation would have been applying a known technique (i.e. representing a reference genome and known variants as a directed acyclic graph) to the known reference genome, shown by Yuan et al. (Figure 1B) to have obtained the predictable result of an initial reference sequence construct comprising an acyclic directed graph with nodes and edges reflecting the linear reference sequence and the set of variants, thereby resulting in an improved initial reference sequence construct, and thus an improved personalized reference sequence construct, that allows for increased alignment accuracy, as shown by Siren et al. (Abstract). 
Therefore, the invention is prima facie obvious.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al. in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 8 above, and further in view of Torkamaneh et al. (Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies, 2016, PLOS one, 11(8), pg. 1-14; previously cited). This rejection is newly recited and necessitated by claim amendment.
Regarding claim 9, Yuan et al. in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 8 above, does not show the genotyping comprises identifying a set of locations in the linear reference sequence and aligning the plurality of sequence reads to locations in the linear reference sequence that are not in the identified set of locations. However, this limitation was known in the art before the effective filing date of the claimed invention, as shown by Torkamaneh et al.
Regarding claim 9, Torkamaneh et al. shows a method for genotyping using sequence data (Abstract), which includes identifying repetitive regions in the genome for which reads with multiple alignment positions may be aligned, and mapping the reads against a masked reference genome to estimate the SNPs originating from the repetitive regions (e.g. the repetitive regions are masked) (Figure 2; pg. 7, para. 2 to pg. 8, para. 1). Torkamaneh et al. further shows that aligning reads to the masked genome allows for the estimation of inaccurate SNPs originating from repetitive regions (pg. 7, para. 2).
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system made obvious by Yuan et al. in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 8 above, to have identified a set of locations in the linear reference sequence and aligned the sequence reads to locations in the linear reference sequence that are not in the identified set of locations, as shown by Torkamaneh et al. (Figure 2; pg. 7, para. 2 to pg. 8, para. 1). One of ordinary skill in the art would have been motivated to combine the system made obvious by Yuan et al. in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 8 above, with the method of Torkamaneh et al. to allow for the estimation of inaccurate SNPs originating from repetitive regions, as shown by Torkamaneh et al. (pg. 7, para. 2), given Yuan et al. discloses SNPs can be used to generate the custom reference sequence (Abstract; pg. 2, para. 3). This modification would have had a reasonable expectation of success because Yuan et al. shows genotype information can be obtained from sequencing experiments (pg. 4, para. 2; pg. 14, para. 3).
Therefore, the invention is prima facie obvious.

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al. in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 1 above, and further in view of Sampson et al. (Selecting SNPs to Identify Ancestry, 2011, Ann Hum Genet., 75(4), pg. 539-553; previously cited). This rejection is newly recited and necessitated by claim amendment.
Regarding claim 16, Yuan et al., in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 1 above, does not show identifying the plurality of subpopulations by applying one or more statistical techniques to genomic data, wherein the plurality of subpopulations consist of a number of subpopulations between 10 and 100. However, this limitation was known in the art, before the effective filing date of the claimed invention, as shown by Sampson et al. and Liu et al.
Regarding claim 16, Sampson et al. shows a method for selecting SNPs to identify ancestry (Abstract), which includes obtaining genotype frequencies at N SNPs for a training set of subjects from a heterogeneous population containing distinct ancestries (i.e. genomic data for a plurality of allele sites for the plurality of subpopulations) (pg. 2, para. 6 to pg. 3, para. 3), wherein the number of subpopulations can consist of 24 (pg. 12, para. 1; Figure 7), and estimating an error rate (i.e. applying statistical techniques) for the set of SNPs (i.e. the genomic data) at correctly assigning ancestry and selecting the subset of SNPs with the lowest error rates (pg. 6, section 2.1.6), wherein the subset of SNPs have differences in the allele frequency between the populations (pg. 2, para. 2; pg. 3, para. 3-4). Sampson et al. further shows using the subset of SNPs to estimate the ancestry in a test subject (pg. 3, section 2.1.4). Accordingly, Sampson et al. shows training a model to identify 24 subpopulations using statistical techniques on genomic data, as recited in claim 16, and further shows the training process involves selecting the subset of allele sites with different allele frequencies between subpopulations, which are then used to predict ancestry (i.e. a first subpopulation) in an individual, as recited in claim 21.
Further regarding claim 16, Liu et al. shows an ancestry-weighted approach for genotype imputation (Abstract), which includes estimating the ancestry for a target individual under investigation (pg. 6, para. 2), and further shows it is imperative to incorporate underlying ancestry information when selecting an appropriate reference panel to appropriately impute genotypes (pg. 2, para. 2).
It would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system made obvious by Yuan et al. in view of Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al., as applied to claim 1 above, to have identified the plurality of subpopulations by applying one or more statistical techniques to genomic data, wherein the plurality of subpopulations consists of a number of subpopulations between 10 and 100, as shown by Sampson et al. (Abstract; pg. 2, para. 6 to pg. 3, para. 3; pg. 3, sections 2.1.3 and 2.1.4). One of ordinary skill in the art would have been motivated to combine the method of Yuan et al., Pardo-Seco et al., Liu et al., Pritchard et al., and Siren et al. with the method of Sampson et al. in order to train a model (e.g. identify the plurality of subpopulations) and use the model to provide ancestry information, as shown by Sampson et al. (pg. 3, sections 2.1.3 and 2.1.4), which is important to selecting reference panels for genotype imputation, as shown by Liu et al. (pg. 2, para. 2), given genotype imputation is used in Yuan et al. This modification would have had a reasonable expectation of success because Yuan et al. shows performing genotype imputation using population-specific reference panels (pg. 5, para. 4 to pg. 6, para. 1), such that the method of Sampson et al. would be applicable to the method of Yuan et al.
Therefore, the invention is prima facie obvious.

Response to Arguments
Applicant's arguments filed 12 April 2022 regarding 35 U.S.C. 103 have been fully considered but they are not persuasive. 
Applicant remarks that Pritchard fails to describe “wherein the at least some of the plurality of locations were identified by: obtaining initial variant occurrence frequencies for an initial set of locations different from the at least some of the plurality of locations; and identifying, based on the initial variant occurrence frequencies, the at least some of the plurality of locations from among the initial set of locations”, and that Pritchard does not describe any process for identifying the loci of the set of allele frequencies at each of multiple loci in Pritchard; therefore, like the other references, Pritchard fails to describe this language of amended claim 1 (Applicant’s remarks at pg. 10, para. 1-3).
This argument is not persuasive. Claim 1 recites “…obtaining information identifying a plurality of locations…:accessing a model…, wherein the model comprises information indicating subpopulation-specific variant occurrence frequencies for at least some of the plurality of locations, wherein the at least some of the plurality of locations were identified by:…”. Accordingly, claim 1 only requires obtaining information identifying a plurality of locations and then using at least some of these locations in the model; however, claim 1 does not require a step of identifying the at least some of the plurality of locations. Accordingly, as discussed in the above rejection, the wherein clause describing the process by which the at least some of the plurality of locations were identified is interpreted to be a product-by-process limitation in which the limitation serves to define the process by which the at least some of the plurality of locations were previously identified, but a step of identifying the at least some of the plurality of locations is not required within the metes and bounds of the claim. "If the product in the product-by-process claim is the same as or obvious from a product of the prior art, the claim is unpatentable even though the prior product was made by a different process." In re Thorpe, 777 F.2d 695, 698, 227 USPQ 964, 966 (Fed. Cir. 1985). See MPEP 2113 I. While the structure implied by the process steps should be considered when assessing the patentability of product-by-process claims over the prior art, in this case, the claimed method for identifying the at least some of the plurality of locations does not imply any structure to the identified locations (e.g. the locations are not limited to locations with specific variant occurrence frequencies). Therefore, the locations in Pritchard et al. are the same as the locations recited in the claims (e.g. both are positions on a DNA sequence).
Furthermore, as discussed in the above rejection, Pritchard et al. discloses a method for inferring population structure (i.e. identifying subpopulations) (pg. 3, para. 1-3), which includes classifying the ancestry of individuals of unknown origin (pg. 10, para. 4) using a model comprising subpopulation-specific allele frequencies of a plurality of loci (i.e. some of the plurality of locations) of samples with known ancestry (pg. 3, para. 2; pg. 10, para. 8; Table 1). Accordingly, Pritchard et al. discloses a model comprising information indicating subpopulation-specific variant occurrence frequencies for at least some of the plurality of locations, as claimed.

Applicant remarks that amended claims 20 and 21 recite the limitations “wherein the at least some of the plurality of locations were identified by: obtaining initial variant occurrence frequencies for an initial set of locations different from the at least some of the plurality of locations; and identifying, based on the initial variant occurrence frequencies, the at least some of the plurality of locations from among the initial set of locations”, and thus, as is clear from the foregoing, the references also fail to describe the quoted language of amended independent claims 20 and 21 (Applicant’s remarks at pg. 11, para. 1-2).
This argument is not persuasive for the same reasons discussed above regarding claim 1.
Applicant remarks that because each of the dependent claims depends from a base claim, the dependent claims are allowable for the same reasons discussed for the independent claims (Applicant’s remarks at pg. 11, para. 3).
This argument is not persuasive for the same reasons discussed above for the independent claims.

Conclusion
No claims are allowed.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KAITLYN L MINCHELLA whose telephone number is (571)272-6485.  The examiner can normally be reached on 7:00 - 4:00 M-Th.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek can be reached on (571) 272-9047.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/K.L.M./Examiner, Art Unit 1631
/OLIVIA M. WISE/Primary Examiner, Art Unit 1631