DETAILED ACTION
	Applicant’s response, filed 7/11/2022, has been fully considered. Rejections and/or objections not reiterated from previous Office Actions are hereby withdrawn. The following rejections and/or objections are either reiterated or newly applied. They constitute the complete set presently being applied to the instant application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Election/Restrictions
	In view of the amendments to withdrawn claims 20-22 submitted herein, the previous restriction requirement for Group III claims 20-22 is withdrawn.  

Claim Status
Claims 1-22 are pending.
Claims 1-2, 6, 12-14, 17 and 20-22 are amended.
Claims 1-22 are rejected.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 7/11/2022 and 7/28/2022 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner. Signed copies of the IDSs are included with this Office action.

Claim Objections
The outstanding objections to the claims are withdrawn in view of the amendments submitted herein.
The claims are objected to because of the following informalities. The instant objection is newly stated and is necessitated by claim amendment.
Claim 17 recites “constructing a self-consistent k-mers database comprising k-mers of the sample genomes wherein…”, which should be amended to recite “constructing a self-consistent k-mers database comprising k-mers of the sample genomes, wherein…”.

Claim Interpretation
In claim 1 and any claims dependent therefrom, under the broadest reasonable interpretation (BRI), the recited “self-consistent taxonomy” and “self-consistent k-mers database” read on any taxonomy and k-mer database obtained by using a genetic distance matrix, as discussed in Applicant’s remarks at page 10, paragraph 4, regarding the 35 USC 112(a) rejection in the previous Office Action and pertaining to the definitions of a taxonomy being “ ‘self-consistent’ as a result of being obtained by calculation using a genetic distance matrix rather than being based on curated metadata that does not involve a genetic distance matrix (e.g., observed information such as phenotype attributes, isolate attributes, host attributes, and general comments) as found in the NCBI Genbank.”   

Claim Rejections - 35 USC § 112
35 U.S.C. 112(a)
The outstanding rejections to the claims are withdrawn in view of the remarks submitted herein.
35 U.S.C. 112(b)
The outstanding rejections to the claims are withdrawn in view of the amendments submitted herein.
35 U.S.C. 112(d)
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claim 20 is rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. Claim 20 recites “A self-consistent k-mer database prepared by the method of claim 1, wherein the self-consistent k-mer database is stored on a computer readable medium”, which fails to further limit the recitation of “the self-consistent k-mer database is stored on a computer readable medium” of claim 1, starting on page 3, line 1 of the amended claims submitted 7/11/2022. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The instant rejection is newly stated and is necessitated by claim amendment.
The instant rejection reflects the Guidance published in the Federal Register notice titled 2019 Revised Patent Subject Matter Eligibility Guidance (Vol. 84, No. 4, Monday, January 7, 2019, at 50) and the October 2019 Updated Subject Matter Eligibility Guidance (hereinafter both referred to as the “Guidance”), as outlined in the MPEP at 2106.04.
Framework with which to Evaluate Subject Matter Eligibility:
Step 1: Are the claims directed to a process, machine, manufacture, or composition of matter;
Step 2A, Prong One: Do the claims recite a judicially recognized exception, i.e. a law of nature, a natural phenomenon, or an abstract idea;
Step 2A, Prong Two: If the claims recite a judicial exception under Prong One, then is the judicial exception integrated into a practical application (Prong Two); and
Step 2B: If the claims do not integrate the judicial exception, do the claims provide an inventive concept.
Framework Analysis as Pertains to the Instant Claims:
Step 1
With respect to Step 1: yes, the claims are directed to a method and a system [Step 1: YES; See MPEP § 2106.03].
Step 2A, Prong One
With respect to Step 2A, Prong One, the claims recite abstract ideas. The MPEP at 2106.04(a)(2) further explains that abstract ideas are defined as: 
mathematical concepts (mathematical formulas or equations, mathematical relationships and mathematical calculations);
certain methods of organizing human activity (fundamental economic practices or principles, managing personal behavior or relationships or interactions between people); and/or
mental processes (procedures for observing, evaluating, analyzing/ judging and organizing information).
With respect to the instant claims, under the Step 2A, Prong One evaluation, the claims are found herein to recite abstract ideas that fall into the grouping of mental processes (in particular procedures for observing, analyzing and organizing information) and mathematical concepts (in particular mathematical relationships and formulas).
The claim steps directed to abstract ideas of mental processes and mathematical concepts are as follows:
Independent claim 1: calculating a self-consistent taxonomy using the distance matrix; constructing a self-consistent k-mer database comprising k-mers of the sample genomes, wherein the k-mers of the sample genomes are assigned to nodes of the self-consistent taxonomy based on genetic distance, the nodes of the self-consistent taxonomy assigned respective unique self-consistent IDs, and each of the k-mers of the sample genomes is linked to a respective one of the self-consistent IDs; mapping the reference k-mers and associated reference IDs to the self-consistent k-mer database, thereby linking reference IDs to self-consistent IDs, wherein each of the self-consistent IDs is linked to 1 or more of the mapped reference IDs; storing the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database; and calculating respective weights and/or respective probabilities of the mapped reference IDs based on the number of nodes of the self-consistent taxonomy linked to each of the mapped reference IDs, wherein each of the mapped reference IDs of a given node of the self-consistent taxonomy is assigned a calculated weight and/or a calculated probability.	
Dependent claims 2-4 and 6-16 recite additional steps that further limit the judicial exceptions in independent claim 1 and, as such, are further directed to abstract ideas. For example, claim 2 further limits the k-mers of the reference database; claim 3 further limits the sample genomes; claims 4 and 7-8 further limit the nodes of the self-consistent taxonomy; claim 6 further limits the basis of the self-consistent taxonomy; claims 9-14 further limit the genetic distances; and claims 15-16 further limit assigning the k-mers to nodes.
Independent claim 17: calculate a self-consistent taxonomy using the distance matrix; constructing a self-consistent k-mer database comprising k-mers of the sample genomes, wherein the k-mers of the sample genomes are assigned to nodes of the self-consistent taxonomy based on genetic distance, the nodes of the self-consistent taxonomy assigned respective unique self-consistent IDs, and each of the k-mers of the sample genomes is linked to a respective one of the self-consistent IDs; map the reference k-mers and associated reference IDs to the self-consistent k-mer database, thereby linking reference IDs to self-consistent IDs, wherein each of the self-consistent IDs is linked to 1 or more of the mapped reference IDs; store the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database; and calculate respective weights and/or respective probabilities of the mapped reference IDs based on the number of nodes of the self-consistent taxonomy linked to each of the mapped reference IDs, wherein each of the mapped reference IDs of a given node of the self-consistent taxonomy is assigned a calculated weight and/or a calculated probability.
Dependent claims 18-19 recite additional steps that further limit the judicial exceptions in independent claim 17 and, as such, are further directed to abstract ideas. For example, claims 18-19 further limit the system to generating a report.
The abstract ideas recited in the claims are evaluated under a Broadest Reasonable Interpretation (BRI) and determined herein to each cover performance either in the mind and/or performance by mathematical operation because the steps involve nothing more than instructions for a user to manually construct a self-consistent k-mer database. There are no specifics as to the methodology involved in “calculating”, “constructing”, or “mapping”, and thus, under the BRI, one could simply, for example, use a pen and paper to calculate genetic distances and a self-consistent taxonomy, construct a k-mer database from the taxonomy, map reference IDs between databases, and calculate weights and/or probabilities. The steps for calculating genetic distances, a self-consistent taxonomy, weights, and probabilities are performed using mathematical techniques, which is supported by the Specification at least at [0063-0069] and [0077].
Therefore, claims 1 and 17 and those claims dependent therefrom recite an abstract idea [Step 2A, Prong 1: YES; See MPEP § 2106.04].
Step 2A, Prong Two
Because the claims do recite judicial exceptions, analysis under Step 2A, Prong Two, provides that the claims must be examined further to determine whether they integrate the abstract ideas into a practical application (MPEP 2106.04(d)). A claim can be said to integrate a judicial exception into a practical application when it applies, relies on, or uses the judicial exception in a manner that imposes a meaningful limit on the judicial exception. This is performed by analyzing the additional elements of the claim to determine if the abstract idea is integrated into a practical application (MPEP 2106.04(d).I.; MPEP 2106.05(a-h)). If the claim contains no additional elements beyond the abstract idea, the claim is said to fail to integrate the abstract idea into a practical application (MPEP 2106.04(d).III).
With respect to the instant recitations, the claims recite the following additional elements: 
Claim 1: “providing a reference database comprising reference k-mers, the reference k-mers derived from sequenced nucleic acids of one or more organisms, wherein the reference k-mers are classified to nodes of a reference taxonomy, the reference taxonomy not based on genetic distances, the nodes of the reference taxonomy representing genome classifications, the nodes of the reference taxonomy having unique reference IDs, wherein IDs means identifications; providing a sample database comprising sample genomes that includes genomes of the one or more organisms”.
Dependent claims 5 and 21 recite additional steps that further limit the additional elements in independent claim 1. For example, claim 5 further limits the organisms to prokaryotes, and claim 21 further limits the sample genomes to being from a microbiome.
Claim 17: “access a reference database comprising reference k-mers, the reference k-mers derived from sequenced nucleic acids of one or more organisms, wherein the reference k-mers are classified to nodes of a reference taxonomy, the reference taxonomy not based on genetic distances, the nodes of the reference taxonomy representing genome classifications, the nodes of the reference taxonomy having unique reference IDs, wherein IDs means identifications; access a sample database comprising sample genomes that includes genomes of the one or more organisms”.
Independent claims 1 and 17 include storing the self-consistent k-mer database on a computer readable medium and that the self-consistent k-mer database is capable of being queried for taxonomic profiling of sequenced nucleic acids when electronically linked to a computer system. Independent claim 17 additionally includes a system comprising one or more computer processor circuits configured and arranged to perform the method. Dependent claim 20 includes storing the self-consistent k-mer database on a computer readable medium. Dependent claim 22 includes that the self-consistent k-mer database is located on a cloud platform of a computer network.
With respect to Step 2A, Prong Two, the additional elements of the claims do not integrate the judicial exceptions into a practical application for the following reasons. Those steps directed to data gathering, such as “providing” a database and sample genomes, perform functions of collecting the data needed to carry out the abstract idea. Data gathering does not impose any meaningful limitation on the abstract idea, or on how the abstract idea is performed. Data gathering steps are not sufficient to integrate an abstract idea into a practical application (MPEP 2106.05(g)). 
Further steps herein directed to additional non-abstract elements of “a computer readable medium”, “a computer system”, “a system comprising one or more computer processor circuits configured and arranged to” perform the method, and “a cloud platform of a computer network” do not describe any specific computational steps by which the “computer parts” perform or carry out the abstract idea, nor do they provide any details of how specific structures of the computer, such as the computer-readable recording media, are used to implement these functions. The claims state nothing more than a generic computer which performs the functions that constitute the abstract idea. Hence, these are mere instructions to apply the abstract idea using a computer, and therefore the claim does not integrate that abstract idea into a practical application. The courts have weighed in and consistently maintained that when, for example, a memory, display, processor, machine, etc. are recited so generically (i.e., no details are provided) that they represent no more than mere instructions to apply the judicial exception on a computer, these limitations may be viewed as nothing more than generally linking the use of the judicial exception to the technological environment of a computer (MPEP 2106.05(f)).
Thus, none of the claims recite additional elements which would integrate a judicial exception into a practical application, and the claims are directed to an abstract idea [Step 2A, Prong 2: NO; See MPEP § 2106.04(d)].
Step 2B
As such, the claims are lastly evaluated using the Step 2B analysis, wherein it is determined that because the claims recite abstract ideas which are not integrated into a practical application, the claims also lack a specific inventive concept. Applicant is reminded that the judicial exception alone cannot provide the inventive concept or the practical application and that the identification of whether the additional elements amount to such an inventive concept requires considering the additional elements individually and in combination to determine if they provide significantly more than the judicial exception. (MPEP 2106.05.A i-vi).
With respect to the instant claims, the additional elements of data gathering described above do not rise to the level of significantly more than the judicial exception. As directed in the Berkheimer memorandum of 19 April 2018 and set forth in the MPEP, determinations of whether or not additional elements (or a combination of additional elements) may provide significantly more and/or an inventive concept rest on whether or not the additional elements (or combination of elements) represent well-understood, routine, conventional activity. Said assessment is made by a factual determination stemming from a conclusion that an element (or combination of elements) is widely prevalent or in common use in the relevant industry, which is determined by either a citation to an express statement in the specification or to a statement made by an applicant during prosecution that demonstrates a well-understood, routine or conventional nature of the additional element(s); a citation to one or more of the court decisions as discussed in MPEP 2106.05(d)(II) as noting the well-understood, routine, conventional nature of the additional element(s); a citation to a publication that demonstrates the well-understood, routine, conventional nature of the additional element(s); and/or a statement that the examiner is taking official notice with respect to the well-understood, routine, conventional nature of the additional element(s).
With respect to the instant claims, the prior art to Wood (Genome Biology, 2014, 15, p. 1-12, IDS reference #21) discloses that providing a database and microbial sample genomes from a microbiome is a data gathering element that is routine, well-understood and conventional in the art. See, for example, p. 5, col. 2, par. 3 and p. 8, col. 2, par. 2-3. As such, activities such as data gathering do not improve the functioning of a computer, or comprise an improvement to any other technical field; they do not require or set forth a particular machine; they do not effect a transformation of matter; nor do they provide a nonconventional or unconventional step. Rather, the data gathering steps as recited in the instant claims constitute a general link to a technological environment which is insufficient to constitute an inventive concept which would render the claims significantly more than the judicial exception (MPEP 2106.05(g) and (h)).
With respect to claims 1 and 17 and those claims dependent therefrom, the computer-related elements or the general purpose computer do not rise to the level of significantly more than the judicial exception. Further exemplified prior art to, for example, Layer et al. (US 2016/0132640) teaches that computing elements are routine, well-understood and conventional in the art. The specification also notes that computer processors and systems, for example, are commercially available or widely used at [0007] and [0043-0059]. The additional elements are set forth at such a high level of generality that they can be met by a general purpose computer. Therefore, the computer components constitute no more than a general link to a technological environment, which is insufficient to constitute an inventive concept that would render the claims significantly more than an abstract idea (see MPEP 2106.05(b)I-III).
Taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception(s). Even when viewed as a combination, the additional elements fail to transform the exception into a patent-eligible application of that exception. Thus, the claims as a whole do not amount to significantly more than the exception itself [Step 2B: NO; See MPEP § 2106.05].
Therefore, the instantly rejected claims are not drawn to eligible subject matter as they are directed to an abstract idea without significantly more. For additional guidance, applicant is directed generally to the MPEP § 2106.
It is noted that the recited limitation “the self-consistent k-mer database is capable of being queried for taxonomic profiling of a taxonomically unclassified k-mer of a sequenced nucleic acid when electronically linked to a computer system, wherein the taxonomically unclassified k-mer is classified to a node of the self-consistent k-mer database and given a self-consistent ID and a reference ID” is interpreted as an intended use limitation and does not affect the 35 USC 101 analysis.

Response to Applicant Arguments
1.	Applicant submits that the practical application of the invention is a self-consistent k-mer database based on genetic distance and linked back to a reference taxonomy that is not based on genetic distance, which improves the ability to identify and correct misclassifications and to accurately assign classifications.
	It is respectfully submitted that this is not persuasive. Applicant alleges that “storing the self-consistent IDs with the mapped reference IDs to the nodes of the self-consistent k-mer database” represents a “practical application”. However, steps directed to “storing” in the instant claims are steps that are, themselves, the judicial exceptions and cannot therefore be a practical application of the judicial exception. The courts have made clear that a judicial exception is not eligible subject matter (Bilski, 561 U.S. at 601, 95 USPQ2d at 1005-06 (quoting Chakrabarty, 447 U.S. at 309, 206 USPQ at 197 (1980))) if there are no additional claim elements besides the judicial exception, or if the additional claim elements merely recite another judicial exception that is insufficient to integrate the judicial exception into a practical application. See, e.g., RecogniCorp, LLC v. Nintendo Co., 855 F.3d 1322, 1327, 122 USPQ2d 1377 (Fed. Cir. 2017) ("Adding one abstract idea (math) to another abstract idea (encoding and decoding) does not render the claim non-abstract"); Genetic Techs. v. Merial LLC, 818 F.3d 1369, 1376, 118 USPQ2d 1541, 1546 (Fed. Cir. 2016) (eligibility "cannot be furnished by the unpatentable law of nature (or natural phenomenon or abstract idea) itself."). For a claim reciting a judicial exception to be eligible, it is the additional elements (if any) in the claim that must "transform the nature of the claim" into a patent-eligible application of the judicial exception, Alice Corp., 573 U.S. at 217, 110 USPQ2d at 1981, either at Prong Two or in Step 2B. If there are no additional elements in the claim, then it cannot be eligible. It is submitted here that the instant claims do not include any additional elements that provide for a practical application. Rather, the “additional elements” in the instant claims (see exemplary claim 1) include only the steps of “providing a reference database” and “providing a sample database”.
As set forth above, said steps operate in the claim as data gathering steps and do not integrate any of the recited judicial exceptions into a practical application, nor do the claims as a whole include any inventive concept beyond well-understood, routine and conventional steps.
2.	Applicant submits that the self-consistent k-mer database, reference database, and sample database are not abstract ideas.
It is respectfully submitted that this is not persuasive. As put forth in the above rejection and reiterated here, the acts of calculating, constructing, mapping, and storing are identified as judicial exceptions. These acts are judicial exceptions, whether they are performed to produce data or databases, both of which occupy space on storage media, comprise bits, and can be viewed on a digital screen or handled as a hardcopy print, as indicated by applicant.  
3.	Applicant submits that profiling a taxonomically unclassified k-mer is not an abstract idea, and that no restriction is placed on how the end user receives the profiling information.
It is respectfully submitted that this is not persuasive. The act of “profiling a taxonomically unclassified k-mer” is not claimed as the limitation recites “the self-consistent k-mer database is capable of being queried for taxonomic profiling…”. This limitation is interpreted as an intended use of the constructed self-consistent k-mer database which is not required to be performed. It is further noted that the act of profiling a taxonomically unclassified k-mer would be considered an abstract idea.
4.	Applicant submits that, because skilled artisans are familiar with calculating genetic distances and calculating a self-consistent taxonomy based on a distance matrix, as well as with sequencing nucleic acids and mapping k-mers, the essential limitations of calculate/calculating are sufficiently claimed. Applicant submits that additional details of the calculation would be unduly limiting and points to Thales Visionix, Inc. v. United States.
It is respectfully submitted that this is not persuasive. It is agreed that the essential limitations of the acts of “calculate/calculating” are recited for one of ordinary skill in the art to understand how to perform a calculation of genetic distances and mapping of k-mers. However, the acts of calculating genetic distances and mapping k-mers are considered judicial exceptions, namely a mathematical technique and an abstract idea, respectively. The court determined in Thales Visionix, Inc. v. United States, 850 F.3d 1343, 1348-49, 121 USPQ2d 1898, 1902 (Fed. Cir. 2017) that claims to a particular configuration of inertial sensors and a particular method of using the raw data from the sensors in order to more accurately calculate the position and orientation of an object on a moving platform did not merely recite “the abstract idea of using ‘mathematical equations for determining the relative position of a moving object to a moving reference frame’”, and were therefore found to be merely based on or involve a mathematical concept described in the specification. However, a review of the instant Specification provides support, at [0011], for calculating genetic distances using mathematical techniques as the only embodiments. Therefore, the claimed calculations are not merely based on, and do not merely involve, a mathematical concept, and, as such, recite a judicial exception. Further, the act of mapping, as an abstract idea, is not identified as a mathematical concept, and the act of sequencing nucleic acids, which is not explicitly recited in the claims, would be identified as an additional element. Therefore, mapping and sequencing are not pertinent to the discussion of Thales Visionix, Inc. v. United States. It is noted that sequencing is routinely practiced in the art, as provided by Applicant, and recitation of such a general act would not provide a practical application of the judicial exceptions.
5.	Applicant submits that because the invention allows for accurate prophylaxis, there is no longer a recitation of an abstract idea.
It is respectfully submitted that this is not persuasive. The claims do not recite or cover any form of prophylaxis or treatment. As such, there is no particular treatment step recited which could integrate the judicial exceptions.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

A.	Claims 1-10, 13-14, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Wood et al. (Genome Biology, 2014, 15, p. 1-12, IDS 9/30/2018 reference #21) in view of Ondov et al. (Genome Biology, 2016, 17, p. 1-14, IDS 9/30/2018 reference #14), Parks et al. (bioRxiv, 1/31/2018, https://doi.org/10.1101/256800, IDS 7/28/2022 reference #14), and Nasko et al. (bioRxiv, 4/10/2018, p. 1-21, IDS 9/30/2018 reference #12). The instant rejection is newly stated and is necessitated by claim amendment.
With regard to the instantly claimed elements taught in the prior art, teachings from Wood are described in italics after each claimed step herein for claim 1. Instantly claimed elements which are considered to be equivalent to the prior art teachings are described in bold for all claims. Wood discloses Kraken, an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences (abstract). Wood teaches the claimed elements as follows.
Claim 1 discloses a method, comprising:
providing a reference database comprising reference k-mers, the reference k-mers derived from sequenced nucleic acids of one or more organisms, wherein the reference k-mers are classified to nodes of a reference taxonomy, the reference taxonomy not based on genetic distances, the nodes of the reference taxonomy representing genome classifications, the nodes of the reference taxonomy having unique reference IDs, wherein IDs means identifications (Wood teaches that Kraken is a database that contains records consisting of a k-mer and the LCA (lowest common ancestor) of all organisms whose genomes contain that k-mer and allows quick lookup of the most specific node in the taxonomic tree that is associated with a given k-mer (p. 2, col. 1, par. 3); Kraken is created by finding every distinct 31-mer (i.e., k-mer) of completed microbial genomes in the NCBI RefSeq database (i.e., derived from sequencing nucleic acids of one or more organisms) and storing the taxonomic ID numbers of the k-mers’ LCA values, where taxon information is obtained from the NCBI taxonomy database (p. 8, col. 2, par. 2-3); Wood does not teach creating a Kraken database based on genetic distances);
providing a sample database comprising sample genomes that includes genomes of the one or more microorganisms (Wood teaches selecting a library of genomic sequences, with the default library based on completed microbial genomes in the NCBI RefSeq database (p. 8, col. 2, par. 2-3));
calculating genetic distances of the sample genomes, thereby forming a distance matrix;
calculating a self-consistent taxonomy using the distance matrix;
constructing a self-consistent k-mer database comprising k-mers of the sample genomes, wherein the k-mers of the sample genomes are assigned to nodes of the self-consistent taxonomy based on genetic distance, the nodes of the self-consistent taxonomy assigned respective unique self-consistent IDs, and each of the k-mers of the sample genomes linked to a respective one of the self-consistent IDs (Wood teaches creating a Kraken database as described above (p. 8, col. 2, par. 2-3); see below for a discussion of constructing a k-mer database with k-mers assigned to nodes of the self-consistent taxonomy);
mapping the reference k-mers and associated reference IDs to the self-consistent k-mer database, thereby linking reference IDs to self-consistent IDs, wherein each of the self-consistent IDs is linked to 1 or more of the mapped reference IDs (Wood teaches that efficient implementation of Kraken’s classification algorithm requires that the mapping of k-mers to taxa is performed by querying a pre-computed database (p. 8, col. 2, par. 2); for each sequence, the taxon associated with it is used to set the stored LCA values of all k-mers in the sequence, where taxon information is obtained from the NCBI taxonomy database (p. 8, col. 2, par. 3); Wood teaches querying (i.e., mapping) k-mers against the database (Figure 5; p. 2, col. 2, par. 4 through p. 6, col. 2; p. 8, col. 2, par. 5); see below for a discussion of linking reference IDs to self-consistent IDs); 
storing the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database; and
calculating respective weights and/or respective probabilities of the mapped reference IDs based on the number of nodes of the self-consistent taxonomy linked to each of the mapped reference IDs, wherein each of the mapped reference IDs of a given node of the self-consistent taxonomy is assigned a calculated weight and/or a calculated probability (Wood teaches that each node in the classification tree is weighted with the number of k-mers in the K(S), or all k-mers within a DNA sequence S, that mapped to the taxon associated with that node (p. 8, col. 1, par. 3));
wherein the self-consistent k-mer database is stored on a computer readable medium (Wood teaches storing the Kraken database in the operating system cache, which is stored in physical memory), and
the self-consistent k-mer database is capable of being queried for taxonomic profiling of a taxonomically unclassified k-mer of a sequenced nucleic acid when electronically linked to a computer system, wherein the taxonomically unclassified k-mer is classified to a node of the self-consistent k-mer database and given a self-consistent ID and reference ID (Wood teaches retrieving the sequences from the NCBI Sequence Read Archive for classification with a Kraken database (p. 11, col. 2, par. 2) to evaluate the taxonomic distribution of the sample microbiomes (i.e., taxonomically unclassified k-mer of a sequenced nucleic acid) (Figure 4) by querying k-mers against the Kraken database (Figure 5); Wood teaches running Kraken on a computer (p. 4, col. 1, par. 1)).
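For illustration only, the k-mer-to-LCA database and the node weighting attributed to Wood above can be sketched as follows. This is a minimal sketch under stated assumptions, not Kraken's actual implementation: the toy taxonomy, the genome strings, the function names, and the short k value are all hypothetical, whereas Wood describes 31-mers drawn from RefSeq genomes.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (the root's parent is None).
PARENT = {"root": None, "genusA": "root", "sp1": "genusA", "sp2": "genusA"}

def lineage(node):
    """Return the path from a node up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def lca(a, b):
    """Lowest common ancestor of two taxonomy nodes."""
    ancestors = set(lineage(a))
    for node in lineage(b):
        if node in ancestors:
            return node
    raise ValueError("nodes share no ancestor")

def kmers(seq, k):
    """All overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_db(genomes, k):
    """Map each distinct k-mer to the LCA of all taxa whose genome contains it."""
    db = {}
    for taxon, seq in genomes.items():
        for km in set(kmers(seq, k)):
            db[km] = lca(db[km], taxon) if km in db else taxon
    return db

def node_weights(read, db, k):
    """Weight each taxonomy node by the number of read k-mers mapped to it."""
    return Counter(db[km] for km in kmers(read, k) if km in db)

# Hypothetical example data (k=4 chosen only to keep the example small).
genomes = {"sp1": "ACGTAC", "sp2": "CGTACG"}
db = build_db(genomes, k=4)
weights = node_weights("ACGTACG", db, k=4)
```

In this toy example, k-mers shared by both species resolve to their common genus node, while k-mers unique to one species remain assigned to that species.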
Wood does not teach the limitations regarding calculating genetic distances of the sample genomes, thereby forming a distance matrix, calculating a self-consistent taxonomy using the distance matrix, or storing the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database.
However, the prior art to Ondov discloses Mash, an algorithm which extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test (abstract). Ondov teaches that Mash compares MinHash sketches of collections of sequences to provide the Mash distances (i.e., genetic distances) (p. 2, col. 1, par. 2). Ondov teaches applying their method to the entire NCBI RefSeq genome database, searching assembled and unassembled genomes against the sketched RefSeq database, and computing a distance between metagenomic samples (i.e., k-mers of the sample genomes) (p. 2, col. 1, par. 2). Ondov teaches that species clusters can be generated from the all-pairs distance matrix by graph clustering methods or simple thresholding of the Mash distance to create connected components (p. 3, col. 2, par. 2). Ondov teaches that beyond simple clustering, the Mash distance can also be used to approximate phylogenies using hierarchical clustering (p. 4, col. 1, par. 2). Ondov teaches that simply considering the connected components yields a partitioning that largely agrees with the current NCBI bacterial species taxonomy (i.e., a self-consistent taxonomy based on genetic distance) (p. 4, col. 1, par. 1). Ondov teaches computing Mash distances for multiple datasets compared against the RefSeq sketch database (i.e., the self-consistent k-mer database is capable of being queried for taxonomic profiling of a taxonomically unclassified k-mer) (p. 4, col. 2, par. 2 through p. 6, col. 1, par. 1).
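For illustration only, the sketch-based distance and the thresholding into connected components attributed to Ondov above can be sketched as follows. This is a simplified sketch and not the Mash program itself: it hashes with SHA-1 rather than the hash function Mash uses, ignores reverse complements, and the sequences, sketch size, k value, and threshold are hypothetical. The distance formula D = -(1/k) ln(2j/(1+j)), where j is the estimated Jaccard index, is the Mash distance.

```python
import hashlib
import math
from itertools import combinations

def sketch(seq, k, s):
    """Bottom-s MinHash sketch: the s smallest hash values over all k-mers."""
    kmer_set = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    hashes = {int.from_bytes(hashlib.sha1(km.encode()).digest()[:8], "big")
              for km in kmer_set}
    return sorted(hashes)[:s]

def jaccard_estimate(a, b, s):
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(a) | set(b))[:s]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)

def mash_distance(j, k):
    """Mash distance from a Jaccard estimate j for k-mer size k."""
    if j == 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)

def components(names, dist, threshold):
    """Connected components linking every pair with distance <= threshold."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), d in dist.items():
        if d <= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

# Hypothetical genomes: g1 and g2 are identical, g3 shares no 5-mers with them.
K, S = 5, 8
genomes = {"g1": "ACGTACGTTGCAAGCT", "g2": "ACGTACGTTGCAAGCT",
           "g3": "GGGGGGCCCCCCGGGGGG"}
sketches = {name: sketch(seq, K, S) for name, seq in genomes.items()}
# All-pairs distance matrix, stored as a dict keyed by genome pair.
dist = {(a, b): mash_distance(jaccard_estimate(sketches[a], sketches[b], S), K)
        for a, b in combinations(sorted(genomes), 2)}
clusters = components(sorted(genomes), dist, threshold=0.05)
```

Here the two identical genomes fall into one component and the unrelated genome forms its own, which mirrors the species-level clustering by distance threshold described above.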
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood for creating a k-mer taxonomic database with the method of Ondov for clustering sequences based on a calculated Mash distance because Kraken can be used to create a database by selecting different libraries, as taught by Wood (p. 8, col. 2, par. 2). As Wood teaches associating k-mers with taxonomic ID numbers, it would have also been obvious to create labels, or self-consistent IDs, for the nodes while creating the k-mer database using the methods of Wood from the data of Ondov. The motivation to select the clustered genetic library of Ondov for input to Kraken would be to use Mash distances which combine the high specificity of matching-based approaches with the dimensionality reduction of statistical approaches that enables accurate all-pairs comparisons between many large genomes and metagenomes, as taught by Ondov (p. 2, col. 1, par. 1). One could have combined the elements as claimed by known methods, and that in combination, each element merely would have performed the same function as it did separately. Furthermore, one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Neither Wood nor Ondov specifically teach mapping the reference k-mers and associated reference IDs to the self-consistent k-mer database, thereby linking reference IDs to self-consistent IDs, and storing the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database.
However, the prior art to Parks discloses a standardized bacterial taxonomy (abstract). Parks teaches obtaining datasets from RefSeq/GenBank and the Sequence Read Archive, inferring a bacterial genome tree from the datasets, and annotating the phylogeny with group names using the NCBI taxonomy for the public genomes standardized to seven ranks (p. 3, line 77 through p. 4, line 96).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood in view of Ondov with the method of Parks for annotating a database with standard group names because such a method is motivated by Nasko, who teaches that because contamination in public databases represents a confounding factor for k-mer based lowest common ancestor taxonomic classification methods, a classification database that derives its own hierarchical structure directly from the genomic data, based off of a consistent measurement such as marker gene similarity or average nucleotide identity, rather than taxonomy, and that then maps back the internally derived hierarchy to widely used taxonomic names can avoid errors due to inconsistencies in classification databases (abstract, p. 7, col. 2, par. 3 through p. 8, col. 1, par. 1). As both Wood (Figure 5; p. 2, col. 2, par. 4 through p. 6, col. 2; p. 8, col. 2, par. 5) and Ondov (p. 4, col. 2, par. 2 through p. 6, col. 1, par. 1) teach methods for mapping k-mers to other k-mers when mapping k-mers from unclassified samples to their databases, it would have been obvious to modify these mapping methods to compare the reference database as taught by Wood and the self-consistent k-mer database as taught by Ondov and to store any combination of reference IDs in either database because storing identification information in a database is a common method, as evidenced by both Wood (p. 8, col. 2, par. 2-3) and Ondov (p. 11, col. 2, par. 1).
Regarding claim 2, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood teaches that because some sequences in draft genome samples with large amounts of human sequences were misclassified, they removed some k-mers from the draft contigs to improve classification (i.e., at least one of the k-mers of the reference database is misclassified in the reference taxonomy) (p. 11, col. 1, par. 4).
Regarding claim 3, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not explicitly teach that at least one sample genome is misclassified in the reference taxonomy.
However, the prior art to Nasko discloses examining the influence of the database over time on k-mer-based lowest common ancestor taxonomic classification (abstract). Nasko teaches that the increased sequencing of genomes has led to increased incidences of contamination and misclassification, where species and strains can be misclassified when genomes are assigned a taxonomic ID that is inconsistent with their similarity to other genomes in the database (i.e., at least one of the sample genomes is misclassified in the reference taxonomy) (p. 9, lines 185-200). Nasko teaches examining the NCBI bacterial RefSeq database (p. 10, lines 213-224).
Regarding claim 4, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. As Wood teaches using the LCA (lowest common ancestor) taxa associated with a k-mer to determine an appropriate label by performing a lookup of the most specific node in the taxonomic tree (p. 2, col. 1, par. 3), it is considered that no two nodes in a final Kraken database share a common reference ID.
Regarding claim 5, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood teaches selecting a library of genomic sequences, with the default library based on completed microbial genomes (i.e., the one or more organisms are prokaryotes) in the NCBI RefSeq database (p. 8, col. 2, par. 2-3).
Regarding claim 6, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach a self-consistent taxonomy based exclusively on calculated genetic distances.
However, Ondov teaches sketching and clustering the entire NCBI RefSeq genome database based on the calculated Mash distance, as described above (p. 2, col. 1, par. 2).
Regarding claim 7, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood teaches that a Kraken database contains records consisting of a k-mer and the LCA (lowest common ancestor) of all organisms whose genomes contain that k-mer, which allows a quick lookup of the most specific node in the taxonomic tree that is associated with a given k-mer (p. 2, col. 1, par. 3). As Wood teaches that each k-mer is associated with only one node to allow for unambiguous lookup, it is considered that Wood fairly teaches the limitation of the claim regarding no two nodes of the taxonomy being linked to an identical k-mer.
Regarding claim 8, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood teaches a taxonomy tree which is composed of leaf nodes (i.e., child nodes), root nodes (p. 8, col. 1, par. 3), and intervening nodes (i.e., parent nodes) (Figure 1). As Wood teaches using the LCA (lowest common ancestor) taxa associated with a k-mer to determine an appropriate label (p. 2, col. 1, par. 3), it is considered that Wood fairly teaches the limitation regarding no two child nodes of a common parent node being linked to an identical reference ID.
Regarding claim 9, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach calculating genetic distances.
However, Ondov teaches computing distances between metagenomic samples (p. 2, col. 1, par. 2) and different genomes (i.e., genome-genome distances) (p. 4, col. 2, par. 2). As Ondov teaches that Mash supports arbitrary alphabets of either nucleotides or amino acids (p. 8, col. 2, par. 1), it is considered that the method of Ondov is capable of calculating gene-gene distances, protein domain-protein domain distances, and protein-protein distances.
Regarding claim 10, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach calculating genetic distances.
However, Ondov teaches computing distances between metagenomic samples (p. 2, col. 1, par. 2) and different genomes (i.e., genome-genome distances) (p. 4, col. 2, par. 2). Ondov teaches that Mash utilizes the MinHash technique to reduce large sequences to compressed sketch representations (p. 1, col. 1, par. 1).
Regarding claim 13, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach calculating genetic distances.
However, Ondov teaches computing distances between metagenomic samples (p. 2, col. 1, par. 2) and different genomes (i.e., genetic distances) (p. 4, col. 2, par. 2). As Ondov teaches that Mash supports arbitrary alphabets of either nucleotides or amino acids (p. 8, col. 2, par. 1), it is considered that the method of Ondov is capable of calculating gene-gene distances, protein domain-protein domain distances, and protein-protein distances. Ondov teaches using pairwise distances for microbial genomes (p. 4, col. 1, par. 1).
Neither Wood nor Ondov teaches using pairwise distance comparisons for gene-gene distances.
Regarding claim 14, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach calculating genetic distances.
However, Ondov teaches computing distances between metagenomic samples (p. 2, col. 1, par. 2) and different genomes (i.e., genetic distances) (p. 4, col. 2, par. 2). As Ondov teaches that Mash supports arbitrary alphabets of either nucleotides or amino acids (p. 8, col. 2, par. 1), it is considered that the method of Ondov is capable of calculating protein domain-protein domain distances.
Regarding claim 20, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood teaches that Kraken is written in C++ and Perl and is available for download (i.e., stored on a computer readable medium) (p. 11, col. 2, par. 4). 
Regarding claim 21, Wood in view of Ondov, Parks, and Nasko teach the method of claim 20 as described above. Wood teaches classifying the Human Microbiome Project data (i.e., sample genomes) using a Kraken database made from RefSeq genomes (p. 11, col. 2, par. 2).
Regarding claim 22, Wood in view of Ondov, Parks, and Nasko teach the method of claim 20 as described above. Wood teaches that Kraken is available for download (i.e., located on a cloud platform of a computer network) (p. 11, col. 2, par. 4).
Regarding claims 2-10, 13-14, and 20-22, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood, Ondov, Parks, and Nasko because all references disclose methods for taxonomic classification of sequencing reads. Specifically for claim 3, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood in view of Ondov and Parks with the method of Nasko because all references teach analyzing data from the same source, NCBI. As NCBI is the source of all cases of analyzed data, and as Nasko teaches that the data in NCBI is at times misclassified, it is considered that Wood has used misclassified sample genomes in their analysis. One could have combined the elements as claimed by known methods, and that in combination, each element merely would have performed the same function as it did separately. Furthermore, one of ordinary skill in the art would have recognized that the results of the combination were predictable. Specifically for claim 4, as Wood teaches a final k-mer database where a reference ID is associated with only one node, it would have been obvious to collapse nodes in the self-consistent taxonomy based on shared reference IDs to produce a similarly structured database. Specifically for claim 6, it would have been obvious to use the taxonomic output of Mash as taught by Ondov for input to the Kraken algorithm as taught by Wood because Wood suggests the selection of various libraries (p. 8, col. 2, par. 2). 
Specifically for claim 9, it would have been obvious to use Mash as taught by Ondov to calculate any of the recited genetic distances because Ondov teaches that Mash supports arbitrary alphabets of either nucleotides or amino acids (p. 8, col. 2, par. 1), as described above. Specifically for claim 10, it would have been obvious to use the MinHash algorithm to calculate genome-genome distances as taught by Ondov because this is the example that Ondov explicitly teaches in their main algorithm (p. 1, col. 1, par. 1). Specifically for claim 13, it would have been obvious to apply the pairwise distance calculation for genomes, as taught by Ondov, to gene-gene distances. Ondov teaches that Mash uses large sequences or sequence sets as input (p. 1, col. 1, par. 1), and is considered as being capable of calculating gene-gene distances as described above. Specifically for claim 14, it would have been obvious to calculate protein domain-protein domain distances using the method of Ondov because Ondov teaches that Mash supports arbitrary alphabets of either nucleotides or amino acids (p. 8, col. 2, par. 1), which could include lists of protein domains. One could have combined the elements as claimed by known methods, and that in combination, each element merely would have performed the same function as it did separately. Furthermore, one of ordinary skill in the art would have recognized that the results of the combination were predictable.
B.	Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Ondov, Parks, and Nasko, as applied to claim 1 above, and in further view of Meier-Kolthoff et al. (BMC Bioinformatics, 2013, 14, p. 1-14, IDS 9/30/2018 reference #8). Instantly claimed elements which are considered to be equivalent to the prior art teachings are described in bold. The instant rejection is newly stated and is necessitated by claim amendment.
Regarding claim 11, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach calculating genetic distances.
However, Ondov teaches computing distances between metagenomic samples (p. 2, col. 1, par. 2) and different genomes (i.e., genome-genome distances) (p. 4, col. 2, par. 2). 
Wood, Ondov, Parks, and Nasko do not teach using the Meier-Kolthoff algorithm.
However, the prior art to Meier-Kolthoff discloses determining the best-performing methods and the most influential parameters for conducting genome blast distance phylogeny (abstract). Meier-Kolthoff teaches correlation and statistical methods for inferring genome-to-genome distances between pairs of genomes (abstract).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to substitute, in the course of routine experimentation and with a reasonable expectation of success, the method of Ondov with the method of Meier-Kolthoff for calculating genetic distances between genomes because both methods are capable of calculating distances between genomes and would have yielded similar types of outputs for downstream analysis. The motivation to use the method taught by Meier-Kolthoff would have been to establish methods for consistent and truly genome sequence-based classification of microorganisms, as taught by Meier-Kolthoff (abstract). The substitution of the method taught by Ondov with the method taught by Meier-Kolthoff thus is no more than the simple substitution of one known element for another.
C.	Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Ondov, Parks, and Nasko, as applied to claim 1 above, and in further view of Nei et al. (The American Naturalist, 1972, 106, p. 283-292, IDS 9/30/2018 reference #13). Instantly claimed elements which are considered to be equivalent to the prior art teachings are described in bold. The instant rejection is newly stated and is necessitated by claim amendment.
Regarding claim 12, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach calculating genetic distances.
However, Ondov teaches computing distances between metagenomic samples (p. 2, col. 1, par. 2) and different genomes (i.e., genetic distances) (p. 4, col. 2, par. 2). As Ondov teaches that Mash uses large sequences or sequence sets as input (p. 1, col. 1, par. 1) and that Mash supports arbitrary alphabets of either nucleotides or amino acids (p. 8, col. 2, par. 1), it is considered that the method of Ondov is capable of calculating gene-gene distances.
Wood, Ondov, Parks, and Nasko do not teach using Nei’s standard genetic distance.
However, the prior art to Nei discloses a measure of genetic distance based on the identity of genes between populations (abstract).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Ondov with the method of Nei for calculating genetic distances between genomes because both methods are capable of calculating distances between genes and would have yielded similar types of outputs for downstream analysis. The motivation to use the method taught by Nei would have been to use a method for calculating gene-gene distance that is capable of use between any pair of organisms without regard to ploidy or mating scheme, as taught by Nei (abstract). The substitution of the method taught by Ondov with the method taught by Nei thus is no more than the simple substitution of one known element for another.
D.	Claims 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Wood in view of Ondov, Parks, and Nasko, as applied to claim 1 above, and in further view of Sibson et al. (The Computer Journal, 1973, 16, p. 93-95, IDS 9/30/2018 reference #19). Instantly claimed elements which are considered to be equivalent to the prior art teachings are described in bold. The instant rejection is newly stated and is necessitated by claim amendment.
Regarding claim 15, Wood in view of Ondov, Parks, and Nasko teach the method of claim 1 as described above. Wood does not teach using an agglomerative hierarchical algorithm.
However, Ondov teaches that beyond simple clustering, the Mash distance can also be used to approximate phylogenies using hierarchical clustering (p. 4, col. 1, par. 2).
Wood, Ondov, Parks, and Nasko do not explicitly teach using an agglomerative hierarchical algorithm.
However, the prior art to Sibson discloses the SLINK algorithm for carrying out single-link nearest neighbor cluster analysis (abstract). The interpretation of SLINK as an agglomerative hierarchical algorithm is supported in the Specification at [0064].
Regarding claim 16, Wood in view of Ondov, Parks, and Nasko teach the method of claim 15 as described above. Wood does not teach using an agglomerative hierarchical algorithm.
However, the prior art to Sibson discloses the SLINK algorithm for carrying out single-link nearest neighbor cluster analysis (abstract).
Regarding claims 15-16, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to substitute, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood in view of Ondov, Parks, and Nasko with the method of Sibson for performing agglomerative hierarchical clustering because both methods are capable of clustering data. The motivation to use the method taught by Sibson would have been to use an algorithm for carrying out the single-link method which achieves the theoretical order of magnitude bounds on speed and compactness while also being superior to other general-purpose single-link algorithms, as taught by Sibson (p. 30, col. 1, par. 1). The substitution of the method taught by Ondov with the method taught by Sibson thus is no more than the simple substitution of one known element for another.
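For illustration only, single-link agglomerative hierarchical clustering of a distance matrix can be sketched as follows. This is a naive implementation of the single-link criterion, in which the two clusters with the smallest minimum pairwise distance are merged at each step. It is not the SLINK algorithm of Sibson, which reaches the same hierarchy more efficiently, and the function name and the distance values are hypothetical.

```python
def single_link(dist):
    """Naive single-link agglomerative clustering.

    dist is a symmetric square matrix (list of lists) of pairwise distances.
    Returns the merges in order as (members_a, members_b, merge_height).
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single-link distance: the closest pair across two clusters.
                d = min(dist[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), d))
        merged = clusters[x] | clusters[y]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)]
        clusters.append(merged)
    return merges

# Hypothetical distance matrix for four genomes.
D = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.2],
    [0.7, 0.8, 0.2, 0.0],
]
merges = single_link(D)
heights = [height for _, _, height in merges]
```

In this example genomes 0 and 1 merge first, genomes 2 and 3 merge second, and the two resulting clusters merge last at the smallest distance between any of their members.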
E.	Claims 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Wood et al. (Genome Biology, 2014, 15, p. 1-12, IDS 9/30/2018 reference #21) in view of Ondov et al. (Genome Biology, 2016, 17, p. 1-14, IDS 9/30/2018 reference #14), Parks et al. (bioRxiv, 1/31/2018, https://doi.org/10.1101/256800, IDS 7/28/2022 reference #14), Nasko et al. (bioRxiv, 4/10/2018, p. 1-21, IDS 9/30/2018 reference #12), and Layer et al. (US 2016/0132640). The instant rejection is newly stated and is necessitated by claim amendment.
With regard to the instant claimed elements taught in the prior art, teachings from Wood are described in italics, after each claimed step herein for claim 17. Instantly claimed elements which are considered to be equivalent to the prior art teachings are described in bold for all claims. Wood discloses Kraken, an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences (abstract). Wood teaches the claimed elements as follows.
Claim 17 discloses a system comprising one or more computer processor circuits configured and arranged to:
access a reference database comprising reference k-mers derived from sequenced nucleic acids of one or more organisms, wherein the reference k-mers are classified to nodes of a reference taxonomy, the reference taxonomy not based on genetic distances, the nodes of the reference taxonomy representing genome classifications, the nodes of the reference taxonomy having unique reference IDs, wherein IDs means identifications (Wood teaches that Kraken is a database that contains records consisting of a k-mer and the LCA (lowest common ancestor) of all organisms whose genomes contain that k-mer and allows quick lookup of the most specific node in the taxonomic tree that is associated with a given k-mer (p. 2, col. 1, par. 3); Kraken is created by finding every distinct 31-mer (i.e., k-mer) of completed microbial genomes in the NCBI RefSeq database (i.e., derived from sequencing nucleic acids of one or more organisms) and storing the taxonomic ID numbers of the k-mers’ LCA values, where taxon information is obtained from the NCBI taxonomy database (p. 8, col. 2, par. 2-3); Wood does not teach creating a Kraken database based on genetic distances);
access a sample database comprising sample genomes that includes genomes of the one or more organisms (Wood teaches selecting a library of genomic sequences, with the default library based on completed microbial genomes in the NCBI RefSeq database (p. 8, col. 2, par. 2-3)); 
calculate genetic distances of the sample genomes, thereby forming a distance matrix; 
calculate a self-consistent taxonomy using the distance matrix;
construct a self-consistent k-mer database comprising k-mers of the sample genomes, wherein the k-mers of the sample genomes are assigned to nodes of the self-consistent taxonomy based on genetic distance, the nodes of the self-consistent taxonomy assigned respective unique self-consistent IDs, and each of the k-mers of the sample genomes linked to a respective one of the self-consistent IDs (Wood teaches creating a Kraken database as described above (p. 8, col. 2, par. 2-3); see below for a discussion of constructing a k-mer database with k-mers assigned to nodes of the self-consistent taxonomy);
map the reference k-mers to the self-consistent k-mer database, thereby mapping reference IDs to self-consistent IDs, wherein each of the self-consistent IDs is linked to 1 or more of the mapped reference IDs (Wood teaches that efficient implementation of Kraken’s classification algorithm requires that the mapping of k-mers to taxa is performed by querying a pre-computed database (p. 8, col. 2, par. 2); for each sequence, the taxon associated with it is used to set the stored LCA values of all k-mers in the sequence, where taxon information is obtained from the NCBI taxonomy database (p. 8, col. 2, par. 3); Wood teaches querying (i.e., mapping) k-mers against the database (Figure 5; p. 2, col. 2, par. 4 through p. 6, col. 2; p. 8, col. 2, par. 5); see below for a discussion of mapping reference IDs to self-consistent IDs);
store the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database; and
calculate respective weights and/or respective probabilities of the mapped reference IDs based on number of nodes of the self-consistent taxonomy linked to each of the mapped reference IDs, wherein each of the mapped reference IDs of a given node of the self-consistent taxonomy is assigned a calculated weight and/or a calculated probability (Wood teaches that each node in the classification tree is weighted with the number of k-mers in the K(S), or all k-mers within a DNA sequence S, that mapped to the taxon associated with that node (p. 8, col. 1, par. 3));
wherein the self-consistent k-mer database is stored on a computer readable medium (Wood teaches storing the Kraken database in the operating system cache, which is stored in physical memory), and
the self-consistent k-mer database is capable of being queried for taxonomic profiling of a taxonomically unclassified k-mer of a sequenced nucleic acid when electronically linked to a computer system, wherein the taxonomically unclassified k-mer is classified to a node of the self-consistent k-mer database and given a self-consistent ID and reference ID (Wood teaches retrieving the sequences from the NCBI Sequence Read Archive for classification with a Kraken database (p. 11, col. 2, par. 2) to evaluate the taxonomic distribution of the sample microbiomes (i.e., taxonomically unclassified k-mer of a sequenced nucleic acid) (Figure 4) by querying k-mers against the Kraken database (Figure 5); Wood teaches running Kraken on a computer (p. 4, col. 1, par. 1)).
Wood does not teach a system comprising one or more computer processor circuits.
However, the prior art to Layer discloses a method and system for identifying an unknown DNA sample based on probabilistic data structures and machine learning techniques (abstract) by constructing distinct k-mer profiles from genomes of known species [0008]. Layer teaches computer systems or one or more hardware processors can be configured by software as a circuit that operates to perform their method [0043-0059].
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood for creating a k-mer taxonomic database with the method of Layer for performing the method using a computer system and software in order to automate the method. Thus, it would be obvious to combine the system providing an automatic means, as taught by Layer, to replace the manual activity, as taught by Wood, which accomplishes the same result (See MPEP 2144.04(III)).
Neither Wood nor Layer teaches the limitations regarding calculating genetic distances of the sample genomes, thereby forming a distance matrix, calculating a self-consistent taxonomy using the distance matrix, or storing the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database.
However, the prior art to Ondov discloses Mash, an algorithm which extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test (abstract). Ondov teaches that Mash compares MinHash sketches of collections of sequences to provide the Mash distances (i.e., genetic distances) (p. 2, col. 1, par. 2). Ondov teaches applying their method to the entire NCBI RefSeq genome database, searching assembled and unassembled genomes against the sketched RefSeq database, and computing a distance between metagenomic samples (i.e., k-mers of the sample genomes) (p. 2, col. 1, par. 2). Ondov teaches that species clusters can be generated from the all-pairs distance matrix by graph clustering methods or simple thresholding of the Mash distance to create connected components (p. 3, col. 2, par. 2). Ondov teaches that beyond simple clustering, the Mash distance can also be used to approximate phylogenies using hierarchical clustering (p. 4, col. 1, par. 2). Ondov teaches that simply considering the connected components yields a partitioning that largely agrees with the current NCBI bacterial species taxonomy (i.e., a self-consistent taxonomy based on genetic distance) (p. 4, col. 1, par. 1). Ondov teaches computing Mash distances for multiple datasets compared against the RefSeq sketch database (i.e., the self-consistent k-mer database is capable of being queried for taxonomic profiling of a taxonomically unclassified k-mer) (p. 4, col. 2, par. 2 through p. 6, col. 1, par. 1).
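For illustration only, the sketch-based distance and threshold clustering attributed to Ondov above can be approximated as follows. This is a minimal sketch under stated assumptions and not the Mash software: the function names, the use of SHA-1 as the hash, the toy sequences, and the 0.05 threshold are hypothetical choices, and the distance formula D = -(1/k) ln(2j/(1+j)) is the commonly described conversion from a Jaccard estimate j to a Mash distance.

```python
import hashlib
import math
from itertools import combinations


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k, s):
    """Bottom-s MinHash sketch: the s smallest k-mer hash values."""
    hashes = {int(hashlib.sha1(km.encode()).hexdigest(), 16) for km in kmers(seq, k)}
    return sorted(hashes)[:s]


def mash_distance(sk_a, sk_b, k, s):
    """Estimate the Jaccard index from two sketches and convert it to a distance."""
    merged = sorted(set(sk_a) | set(sk_b))[:s]
    shared = len(set(merged) & set(sk_a) & set(sk_b))
    j = shared / len(merged) if merged else 0.0
    if j == 0.0:
        return 1.0  # no shared hashes: report the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


def cluster(names, dist, threshold):
    """Connected components of the graph linking pairs with distance <= threshold."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist[(a, b)] <= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy genomes (hypothetical): g1 and g2 are identical, g3 shares no 4-mers with them.
seqs = {"g1": "ACGTACGTTAGC" * 3, "g2": "ACGTACGTTAGC" * 3, "g3": "GGGGCCCC" * 4}
names = sorted(seqs)
sketches = {n: sketch(seqs[n], 4, 8) for n in names}
dist = {(a, b): mash_distance(sketches[a], sketches[b], 4, 8)
        for a, b in combinations(names, 2)}
print(cluster(names, dist, 0.05))
```

Each resulting component could then be given its own label, which is the sense in which the clusters are treated above as nodes of a taxonomy derived from genetic distance.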
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the automated method of Wood in view of Layer for creating a k-mer taxonomic database with the method of Ondov for clustering sequences based on a calculated Mash distance because Kraken can be used to create a database by selecting different libraries, as taught by Wood (p. 8, col. 2, par. 2). As Wood teaches associating k-mers with taxonomic ID numbers, it would have also been obvious to create labels, or self-consistent IDs, for the nodes while creating the k-mer database using the methods of Wood from the data of Ondov. The motivation to select the clustered genetic library of Ondov for input to Kraken would be to use Mash distances, which combine the high specificity of matching-based approaches with the dimensionality reduction of statistical approaches and thereby enable accurate all-pairs comparisons between many large genomes and metagenomes, as taught by Ondov (p. 2, col. 1, par. 1). One could have combined the elements as claimed by known methods, and, in combination, each element would merely have performed the same function as it did separately. Furthermore, one of ordinary skill in the art would have recognized that the results of the combination were predictable.
None of Wood, Layer, or Ondov specifically teaches mapping the reference k-mers and associated reference IDs to the self-consistent k-mer database, thereby linking reference IDs to self-consistent IDs, and storing the self-consistent IDs and respective mapped reference IDs to respective nodes of the self-consistent k-mer database.
However, the prior art to Parks discloses a standardized bacterial taxonomy (abstract). Parks teaches obtaining datasets from RefSeq/GenBank and the Sequence Read Archive, inferring a bacterial genome tree from the datasets, and annotating the phylogeny with group names using the NCBI taxonomy for the public genomes standardized to seven ranks (p. 3, line 77 through p. 4, line 96).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, the method of Wood in view of Layer and Ondov with the method of Parks for annotating a database with standard group names because such a method is motivated by Nasko. Nasko teaches that, because contamination in public databases represents a confounding factor for k-mer based lowest common ancestor taxonomic classification methods, a classification database that derives its own hierarchical structure directly from the genomic data, based on a consistent measurement such as marker gene similarity or average nucleotide identity rather than taxonomy, and then maps the internally derived hierarchy back to widely used taxonomic names, can avoid errors due to inconsistencies in classification databases (abstract, p. 7, col. 2, par. 3 through p. 8, col. 1, par. 1). As both Wood (Figure 5; p. 2, col. 2, par. 4 through p. 6, col. 2; p. 8, col. 2, par. 5) and Ondov (p. 4, col. 2, par. 2 through p. 6, col. 1, par. 1) teach methods for mapping k-mers to other k-mers when mapping k-mers from unclassified samples to their databases, it would have been obvious to modify these mapping methods to compare the reference database as taught by Wood and the self-consistent k-mer database as taught by Ondov and to store any combination of reference IDs in either database because storing identification information in a database is a common method, as evidenced by both Wood (p. 8, col. 2, par. 2-3) and Ondov (p. 11, col. 2, par. 1).
Regarding claim 18, Wood in view of Layer, Ondov, Parks, and Nasko teaches the method of claim 17 as described above. Wood teaches mapping the reference IDs to the k-mer database (p. 8, col. 2, par. 2-3) as described above. Wood teaches storing every distinct k-mer with the taxonomic ID numbers in a database (i.e., a report) (p. 8, col. 2, par. 3).
Regarding claim 19, Wood in view of Layer, Ondov, Parks, and Nasko teaches the method of claim 17 as described above. Wood teaches calculating weights (p. 8, col. 1, par. 3) as described above.
Regarding claims 18-19, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine, in the course of routine experimentation and with a reasonable expectation of success, features of the method of Wood in view of Layer, Ondov, Parks, and Nasko. As explained above, it would have been obvious to create labels, or self-consistent IDs, for the nodes while creating the k-mer database using the methods of Wood from the data of Ondov. The creation of such a database is considered equivalent to generating a report containing associated reference IDs and self-consistent IDs with their respective weights.

Response to Applicant Arguments
With respect to Applicant’s arguments under 35 USC 103, the arguments have been fully considered but are moot in view of the new grounds of rejection set forth above as necessitated by claim amendment herein.

Conclusion
No claims are allowed.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Inquiries
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JANNA NICOLE SCHULTZHAUS whose telephone number is (571)272-0812.  The examiner can normally be reached on Monday - Friday 8-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz Skowronek, can be reached at (571)272-9047.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/JANNA NICOLE SCHULTZHAUS/Examiner, Art Unit 1631         
         
/Lori A. Clow/Primary Examiner, Art Unit 1631