Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Specification
The disclosure is objected to because it contains multiple embedded hyperlinks ([0059]-[0060]) and/or other forms of browser-executable code. Applicant is required to delete the embedded hyperlinks and/or other forms of browser-executable code; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code. See MPEP § 608.01.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-8, 17-18, 20-21, 48-49, 52, 73-75, 82-84 are rejected under 35 U.S.C. § 101 because the claimed inventions are directed to non-statutory subject matter.  

“claims directed to nothing more than abstract ideas (such as a mathematical formula or equation), natural phenomena, and laws of nature are not eligible for patent protection”. (MPEP 2106.04 § 1).  Abstract ideas include mathematical concepts, certain methods of organizing human activity, and mental processes (MPEP 

Mathematical concepts recited in the claims include:
“computing a variant-to-variant distance between the pair of consecutive SNVs” (claim 1);
“computing a reduced distance” (claim 1);
“computing, for each position of the matrix, an average and a standard deviation for each matrix in the set of matrices from which the reference matrix is derived” (claim 18);
“transforming the matrix by computing a Z-score for each value in the matrix, wherein the Z-score is the value, minus the average, divided by the standard deviation” (claim 18);
“computing a column average for each column in the matrix” (claim 21); 
“computing a column standard deviation for each column in the matrix” (claim 21); 
“subtracting the column average and dividing by the column standard deviation” (claim 21);
[Application No. 16/306,706; Docket No.: 32158/50772]
“computing a row average for each row in the matrix” (claim 21); 
“computing a row standard deviation for each row in the matrix” (claim 21); 

“computing the reduced distance may comprise one or more of the following: scaling linearly, scaling using a nonlinear function, or binning” (claim 49);
“for each consecutive pair of SNV locations: computing a distance between the respective locations of the pair of SNVs; computing a reduced distance; and incrementing a counting value corresponding to the reduced distance” (claim 73);
“computing the first reduced distance comprises finding the remainder after division of the respective distance value by a first vector length, n1” (claim 82); 
“incrementing a counting value according to at least the first reduced distance” (claim 82);
“computing the second reduced distance comprises finding the remainder after division of the respective distance value by a second vector length, n2” (claim 82); 
“incrementing a counting value according to at least the second reduced distance” (claim 82); 
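
For illustration only, the remainder-based reduced distance and counting steps quoted above from claims 73 and 82 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the use of sorted integer positions, and the normalization (dividing each count by the total) are hypothetical choices that the quoted claim language does not specify.

```python
def reduced_distance_counts(positions, n):
    """Count reduced distances between consecutive positions.

    The reduced distance is the remainder after dividing the distance
    between two consecutive positions by the vector length n.
    """
    counts = [0] * n
    ordered = sorted(positions)
    for left, right in zip(ordered, ordered[1:]):
        distance = right - left
        counts[distance % n] += 1  # increment the counting value
    return counts


def normalize(counts):
    """Assumed normalization: scale the counts so they sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def joined_representation(positions, n1, n2):
    """Join two normalized reduced representations of lengths n1 and n2."""
    first = normalize(reduced_distance_counts(positions, n1))
    second = normalize(reduced_distance_counts(positions, n2))
    return first + second
```

Each step is an arithmetic operation on a list of integers (subtraction, remainder, increment, division), which is the character of the limitations identified above as mathematical concepts.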

Mental processes recited in the claims include:
"identifying for each single nucleotide variant (SNV) observed in a portion of the genome" (claim 1);
“joining the reference allele and the variant allele together to form a SNV key for each single nucleotide variant” (claim 1);
“creating a pair key” (claim 1);
“incrementing a counting value corresponding to both the pair key and the reduced distance” (claim 1);
“creating a matrix comprising one column for each pair key and one row for each reduced distance” (claim 2);
“creating a matrix comprising one row for each pair key and one column for each reduced distance” (claim 2);
“representing the genome as a matrix, and normalizing the matrix relative to a reference matrix derived from a set of genomes” (claim 17);
“representing each genome of the set of genomes as a corresponding matrix” (claim 18);
“representing the genome as a matrix; and normalizing the matrix internally” (claim 20);
“filtering comprises filtering the SNVs to consider variant quality” (claim 52);
“identifying, for each single nucleotide variant (SNV) observed in a portion of the genome, a location of the SNV” (claim 73); 
“choosing a mask for each pair key, wherein the mask assigns a class value to each counting value corresponding to both the pair key and the reduced distance” (claim 74);

“normalizing the first and second reduced representations of the portion of the genome to create, respectively, first and second normalized reduced representations” (claim 82);
“joining the first and second normalized reduced representations of the portion of the genome to create the representation of the portion of the genome” (claim 82).
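
For illustration only, the key-forming, counting and matrix-building steps quoted above from claims 1 and 2 can be sketched as follows. This is a minimal sketch under stated assumptions: the pair key is assumed to be the concatenation of the two SNV keys, the reduced distance is assumed to be a remainder modulo a hypothetical length `n`, and all function names are invented for the example. None of these choices is fixed by the quoted claim language.

```python
from collections import defaultdict


def snv_key(ref, alt):
    """Join the reference allele and the variant allele into an SNV key."""
    return ref + alt


def count_pairs(snvs, n):
    """Count (pair key, reduced distance) occurrences.

    snvs is a list of (position, reference allele, variant allele) tuples.
    """
    counts = defaultdict(int)
    ordered = sorted(snvs)
    for (p1, r1, a1), (p2, r2, a2) in zip(ordered, ordered[1:]):
        pair_key = snv_key(r1, a1) + snv_key(r2, a2)  # assumed pair-key form
        reduced = (p2 - p1) % n                       # assumed reduction
        counts[(pair_key, reduced)] += 1
    return dict(counts)


def to_matrix(counts, n):
    """Build a matrix with one column per pair key and one row per reduced distance."""
    pair_keys = sorted({key for key, _ in counts})
    matrix = [[counts.get((key, d), 0) for key in pair_keys] for d in range(n)]
    return pair_keys, matrix
```

Transposing the returned matrix gives the alternative layout of claim 2, with one row per pair key and one column per reduced distance.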

Hence, the claims explicitly recite numerous elements that, individually and in combination, constitute abstract ideas. The claims must therefore be examined further to determine whether they integrate those abstract ideas into a practical application (MPEP 2106.04(d)). (Step 2A Prong One: Yes).

Claims 1-8, 17-18, 20-21, 48-49, 52, 73-75, 82 all begin with “A computer-implemented method”, which recites a computer as an element in addition to the abstract ideas. However, the computer that implements these methods is nothing more than a generic computer. Hence, the invention merely applies the abstract ideas outlined above using a computer. “claims that amount to nothing more than an instruction to apply the abstract idea using a generic computer do not render an abstract idea eligible” (Alice Corp., 573 U.S. at 223, 110 USPQ2d at 1983), and therefore claims 1-8, 17-18, 20-21, 48-49, 52, 73-75, 82 do not integrate the abstract idea into a practical application (see MPEP 2106.04(d) § 1; and MPEP 2106.05(f)). (Step 2A Prong Two: No).

None of the dependent claims recites any additional non-abstract elements; they are all directed to further aspects of the information being analyzed, the manner in which that analysis is performed, or the mathematical operations performed on the information. 
Because the claims recite an abstract idea, and do not integrate that abstract idea into a practical application, the claims are directed to that abstract idea. Claims that are directed to abstract ideas must be examined further to determine whether the additional elements amount to significantly more than the judicial exception. Claims that are directed to abstract ideas and that raise a concern of preemption of those abstract ideas must be examined to determine what elements, if any, they recite besides the abstract idea, and whether these additional elements constitute inventive concepts that are sufficient to render the claims significantly more than the abstract idea (MPEP 2106.05).
As explained above, the mere implementation of the abstract idea using a computer is, when considered individually, insufficient to constitute an inventive concept that would render the claims significantly more than an abstract idea (see MPEP 2106.05(f)). (Step 2B: No).

When the claims are considered as a whole, they do not integrate the abstract idea into a practical application; they do not confine the use of the abstract idea to a particular technology; they do not solve a problem rooted in or arising from the use of a particular technology; they do not improve a technology by allowing the technology to perform a function that it previously was not capable of performing; and they do not 
For these reasons, the claims, when the limitations are considered individually and as a whole, are directed to an abstract idea and lack an inventive concept. Hence, the claimed invention does not constitute significantly more than the abstract idea, so the claims are rejected under 35 USC § 101 as being directed to non-statutory subject matter.


Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-7, 17, 48-49, 52, 73, 82-84 are rejected under 35 U.S.C. 103 as being unpatentable over Shendure (“A framework for determining the relative effect of genetic variants”, WO 2015042496 A1,  Published 2015-03-26), Drmanac (“Processing And Analysis Of Complex Nucleic Acid Sequence Data” US 20150379192 A9, Published on Dec 31, 2015), and further in view of Brodzik (“Rapid Genomic Sequence Homology Assessment Scheme Based On Combinatorial-Analytic .

Regarding Claim 1, Shendure discloses a computer-implemented method to represent a genome (“A method performed by a computing system for determining the relative effect of a genetic variant comprising: applying a machine learning model to a dataset, wherein the dataset comprises one or more genetic variants, each of which is associated with values or states of each of a set of annotations” (claim 1); “The computer-readable storage medium of claim 15, wherein the set of reference deleteriousness scores are derived from a set of reference variants, a reference gene or genome, or the dataset” (claim 21)). However, Shendure is silent on joining the reference allele and the variant allele together to form a SNV key for each single nucleotide variant in the portion of the genome; and for each pair of consecutive SNVs: computing a variant-to-variant distance between the pair of consecutive SNVs; computing a reduced distance; creating a pair key; and incrementing a counting value corresponding to both the pair key and the reduced distance.
Drmanac discloses a graphically reduced genome representation showing pairwise analysis of nearby heterozygous SNPs ([0022] and Fig. 4), through computing a variant-to-variant distance between the pair of consecutive SNVs. The graphic presentation is a graph with nodes corresponding to the heterozygous SNPs and directional connections corresponding to the orientation and the strength of the best hypothesis for the relationship between those SNPs ([0039]). The orientation is binary. FIG. 21 depicts a flipped and un-flipped relationship between heterozygous SNP pairs, 
Brodzik teaches use of combinatorial analytic concepts for rapid assessment of genomic sequences including SNPs (“The present invention relates to methods and apparatus for rapid assessment of genomic sequences using the difference set model. The invention provides methods to determine the presence and identity of similarities and differences in genomic sequences. In particular, the invention provides methods and apparatus to assess homology, the presence and identity of insertion and deletion segments and the presence and identity of single nucleotide polymorphisms in genomic sequences” (Abstract), [0014], [0019-0020]) and a distance modulo ([0098], [0119], [0150]) to compute reduced distance.
Both Drmanac and Brodzik are silent on explicitly reciting “incrementing a counting value corresponding to both the pair key and reduced distance”. However, Shendure teaches incrementing a counting value for each SNP set record (in a row) in the table (Fig. 33).
Regarding Claim 2, Shendure further teaches creating a matrix comprising one column for each SNV (“In certain embodiments, the dataset includes a set of one or more genetic variants organized in rows of a table and a set of one or more annotations organized in columns of the table. In other embodiments, the dataset includes a set 
Since distance modulo is a vector for reduced distance, it would have been obvious to one of ordinary skill in the art that the matrix of SNV columns according to Shendure could include reduced distance information computed according to distance modulo of Brodzik in rows arranged with respect to SNV pairs according to Drmanac arranged in columns, to create a simplified but comprehensive genome representation.

Regarding Claim 3, Shendure further teaches the method further comprising creating a matrix comprising one row for each SNV (“In certain embodiments, the dataset includes a set of one or more genetic variants organized in rows of a table and 

Regarding Claim 4, Shendure further teaches the portion of the genome is the whole genome (“FIG. 29 shows the ranking of pathogenic ClinVar missense variants 

Regarding Claim 5, Shendure further teaches the portion of the genome is a chromosome (“These parameters were applied to simulate single nucleotide variants (SNV, also referred to as single nucleotide polymorphisms, or SNPs) and insertion/deletion (also referred to herein as "indel") variants based on the human reference sequence (GRCh37). Variants were simulated by iterating through all bases of the human reference autosomes and the X chromosome and picking sites for mutation with probabilities corresponding to the genome-wide substitution rate matrix” [0093]). The autosomes are the chromosomes other than the sex chromosomes (X, Y chromosomes).

Regarding Claim 6, Shendure further teaches the portion of the genome is an exome, a transcriptome, or other set of the genome selected in a targeted way (“C-scores for these noncoding, disease-causal variants (scaled scores between 23.2 and 24.5) rank them higher than 99.5% of all possible human SNVs, higher than 97% of 

Regarding Claim 7, Shendure further teaches the portion of the genome is a set of single nucleotide polymorphisms (SNPs) (“FIG. 35 shows that C-scores for GWAS SNPs are higher than for nearby control SNPs and are dependent on study sample size according to one embodiment. The average scaled C-score (y axis) is plotted for each category of SNPs, as indicated by color, relative to the sample size of the association study in which the SNP was identified (x axis). Sample size bins are log2 scaled and mutually exclusive; for example, the bin labeled 1,024 represents all SNPs from studies with between 512 and 1,024 samples” [0043]).

Regarding claim 17, Shendure further teaches representing the genome as a matrix (“FIG. 11 is a graph representing an exemplar hyperplane and margins for a support vector machine (SVM) trained with samples from two classes according to one embodiment. The training matrix rows correspond to the variants of the training set or dataset (29.4M variants; y=0 for proxy benign, y=1 for proxy deleterious); and the columns correspond to annotations (947), wherein X1, ..., Xn -> 63 annotations” ([0019], Fig 11)); and normalizing the matrix relative to a reference matrix (“Since the raw scores do have relative meaning, a specific group of variants may be identified, and the rank for each variant may be defined within that group. The ranked value may then be used as a "normalized" or "scaled" integrated deleteriousness score, which is an externally comparable unit of analysis. In the embodiments described in Example 1 . The CADD score is the integrated deleteriousness score, or C-score (section “Abstract”).
Further, Shendure teaches that the reference matrix is derived from a set of genomes (“In one embodiment, an annotation matrix may be generated using a set of genetic variants derived from the following sources: the Ensembl Variant Effect Predictor (McClaren et al. 2010) (VEP), data from the ENCODE Project (ENCODE Project Consortium et al. 2012) and information from UCSC Genome Browser tracks (Meyer et al. 2013 (FIG. 1 ). Annotations spanned a range of data types, including conservation metrics such as GERP (Cooper et al. 2005), phastCons (Siepel et al. 2005) and phyloP (Pollard et al. 2010); regulatory information (ENCODE Project Consortium et al. 2012) such as genomic regions of DNase I hypersensitivity (Boyle et al. 2008) and transcription factor binding (Johnson et al. 2007); transcript information such as distance to exon-intron boundaries or expression levels in commonly studied cell lines (ENCODE Project Consortium et al. 2012); and protein-level scores such as those generated with Grantham (Grantham 1974), SIFT (Ng & Henikoff 2003) and PolyPhen (Adzhubei et al. 2010). The resulting variant-by-annotation matrix contained 29.4 million variants (half fixed or nearly fixed human-derived alleles ('observed') and 
Regarding Claim 48, Drmanac further teaches each of the single nucleotide variants is a heterozygous variant (para [0022] - 'FIG. 4 shows pairwise analysis of nearby heterozygous SNPs'). 

Regarding Claim 49, Brodzik further teaches the computing the reduced distance may comprise one or more of the following: scaling linearly, scaling using a nonlinear function, or binning (“The invention further provides methods for identifying specific differences between similar sequences on both a coarse grain and fine grain scale” [0061]; “In embodiments of the invention, the analysis is performed using a larger difference set to identify large-scale differences, followed by the analysis using a smaller difference set for higher granularity” [0079]; “It is convenient to define difference sets using the language of group theory. Suppose G is the additive group of integers modulo v and D is a k -subset of G” [0098]).


Regarding Claim 52, Shendure further teaches the filtering comprises filtering the SNVs to consider variant quality (“These parameters were applied to simulate single nucleotide variants (SNV, also referred to as single nucleotide polymorphisms, or SNPs) and insertion/deletion (also referred to herein as "indel") variants based on the human reference sequence (GRCh37). Variants were simulated by iterating through all bases of the human reference autosomes and the X chromosome and picking sites for mutation with probabilities corresponding to the genome-wide substitution rate matrix. The Y chromosome and additional contigs were not included in this embodiment to exclude effects due to variation in sequence quality” [0093]).

Regarding Claim 73, Shendure teaches a computer-implemented method of generating a representation of a genome (“A method performed by a computing system for determining the relative effect of a genetic variant comprising: applying a machine 
Drmanac teaches identifying in a portion of the genome heterozygous SNV sites within the portion of the genome; and computing a variant-to-variant distance between the pair of SNVs for generating a graphical reduced genome representation (“FIG. 4 shows pairwise analysis of nearby heterozygous SNPs” ([0022] , Fig 4); “Graph generation: An undirected graph is made, with nodes corresponding to the 
Brodzik teaches use of combinatorial analytic concepts for rapid assessment of genomic sequences including SNPs and a distance modulo (“The present invention relates to methods and apparatus for rapid assessment of genomic sequences using the difference set model. The invention provides methods to determine the presence and identity of similarities and differences in genomic sequences. In particular, the invention provides methods and apparatus to assess homology, the presence and identity of insertion and deletion segments and the presence and identity of single nucleotide polymorphisms in genomic sequences” (section “Abstract”); “Accordingly, in the methods of the present invention, the genomic sequence or sequences are constructed as compact representations in the difference set space, discussed in more detail below. An alignment of these representations permits computationally efficient identification of differences between the sequences ... As discussed above, the query sub-sequence can be a binary representation of the query sequence, wherein one of the symbols in the sequence is designated as '1' and the remaining symbols are 

Regarding Claim 82, Shendure in view of Drmanac, Brodzik teaches a computer-implemented method of generating a representation of a genome (“A method performed by a computing system for determining the relative effect of a genetic variant comprising: applying a machine learning model to a dataset, wherein the dataset comprises one or more genetic variants, each of which is associated with values or states of each of a set of annotations” (claim 1); “Fig. 17 shows the relationship between scaled C-scores and genetic variation according to one embodiment, (a) Mean DAF by scaled C-score for variants listed by the 1000 Genomes Project (1000 Genomes Project Consortium et al. 2012) or ESP (Fu et al. 2013)....(b) Under-representation of polymorphic sites in 1000 Genomes Project data” ([0025], Fig 17); “FIG. 26 is a visual representation of the separation of the curated pathogenic mutations in the NIH ClinVar database (red, n=8174) and matched apparently benign (derived allele frequency of at least 5%) mutations in ESP with the same consequence values (blue, n=8174) for 
Shendure does not state joining the reference allele and the variant allele together to form a SNV key for each single nucleotide variant in the portion of the genome; and for each pair of consecutive SNVs: computing a variant-to-variant distance between the pair of consecutive SNVs; computing a reduced distance; creating a pair key; and incrementing a counting value corresponding to both the pair key and the reduced distance.
Drmanac teaches computing a variant-to-variant distance between the pair of consecutive SNVs for generating a graphical reduced genome representation (“Fig. 4 shows pairwise analysis of nearby heterozygous SNPs” ([0022]); “Graph generation: An undirected graph is made, with nodes corresponding to the heterozygous SNPs and the connections corresponding to the orientation and the strength of the best hypothesis for the relationship between those SNPs. (As used herein, a "node" is a datum [data item or data object] that can have one or more values representing a base call or other sequence variant (e.g., a het or indel) in a polynucleotide sequence.) The orientation is binary. FIG. 21 depicts a flipped and unflipped relationship between heterozygous SNP pairs, respectively. The strength is defined by employing fuzzy logic operations on the elements of the shared aliquot matrix. (d) Graph optimization: The graph is optimized via a minimum spanning tree operation” ([0039], Fig 21)).


Regarding Claim 83, Drmanac further teaches each of the distance values corresponds to the distance between a set of consecutive SNVs observed in the portion of the genome (“Fig. 4 shows pairwise analysis of nearby heterozygous SNPs” ([0022], Fig 4); “Graph generation: An undirected graph is made, with nodes corresponding to the heterozygous SNPs and the connections corresponding to the orientation and the strength of the best hypothesis for the relationship between those SNPs. (As used herein, a "node" is a datum [data item or data object] that can have one or more values representing a base call or other sequence variant (e.g., a het or indel) in a polynucleotide sequence.) The orientation is binary. Fig. 21 depicts a flipped and unflipped relationship between heterozygous SNP pairs, respectively. The strength is defined by employing fuzzy logic operations on the elements of the shared aliquot matrix. (d) Graph optimization: The graph is optimized via a minimum spanning tree operation” ([0039], Fig 21)).
 
Regarding Claim 84, Drmanac further teaches each of the distance values corresponds to the distance between consecutive locations exhibiting heterozygosity (“Fig. 4 shows pairwise analysis of nearby heterozygous SNPs” ([0022], Fig 4); “Graph generation: An undirected graph is made, with nodes corresponding to the heterozygous SNPs and the connections corresponding to the orientation and the strength of the best hypothesis for the relationship between those SNPs. (As used herein, a "node" is a datum [data item or data object] that can have one or more values representing a base call or other sequence variant (e.g., a het or indel) in a 


It would have been prima facie obvious under the “teaching-to-modifying” rationale (some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention (MPEP § 2143 I.G.)) to one of ordinary skill in the art at the time of the invention to modify Brodzik’s teaching on combinatorial analytic concepts for rapid assessment of genomic sequences, including SNPs and a distance modulo to compute reduced distance, in view of Drmanac’s method of pairwise analysis of nearby heterozygous SNPs, and to combine it with Shendure’s computer-implemented method to represent a genome, to achieve the claimed limitations with an expectation of success, because all three inventions (Shendure, Drmanac and Brodzik) deal with SNP genotyping information processing. An ordinary artisan would expect the combination of the three to be successful because each individual invention succeeds.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Shendure, Drmanac and Brodzik as applied to claims 1-7 above, and further in view of Wong .
Regarding claim 8, Shendure in view of Drmanac and Brodzik teaches the computer-implemented method of claims 1-7. Shendure, Drmanac and Brodzik are all silent on the set of SNPs being determined by a SNP chip analysis. Wong teaches SNP chip analysis ([0028-0106]). It would have been obvious under the “teaching-to-modifying” rationale to one of ordinary skill in the art to modify Wong’s teaching of an advanced and systematic workflow for identification of SNPs from chip analysis and combine it with the method of Shendure in view of Drmanac and Brodzik to achieve the claimed limitations with an expectation of success, since Shendure, Drmanac, Brodzik and Wong all deal with SNP genotyping information and Wong succeeds in advanced and systematic analysis of SNP genotyping.

Claims 18, 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Shendure, Drmanac, and further in view of Brodzik, and Abdi (“Z-scores”, In: Neil Salkind (Ed.) (2007). Encyclopedia of Measurement and Statistics).

Regarding claim 18, Shendure in view of Drmanac and Brodzik disclose everything as set forth above in claim 1. Shendure in view of Drmanac and Brodzik are all silent on z-score calculation and matrix data normalization. However, “computing, for 

Regarding claim 20 and its dependent claim 21, as discussed above regarding claim 18, once a person of ordinary skill can normalize the matrix data relative to a reference or to the whole data set, it would have been obvious for that person, having ordinary mathematical skill in the art, to normalize the matrix data internally by column and by row using Z-scores, because sampling by rows, by columns, or by the whole matrix all fit the Z-score definition (Abdi, page 2, first paragraph under “Definition of Z-scores”).
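
For illustration only, internal normalization of a matrix by column and by row using Z-scores (each value, minus the average, divided by the standard deviation) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the population standard deviation is used, and a column or row with zero standard deviation is mapped to 0.0. Neither the claims as quoted nor the cited Abdi definition is asserted here to require these particular choices.

```python
from statistics import mean, pstdev


def zscore(values):
    """Return the Z-score of each value relative to the given sample."""
    average = mean(values)
    deviation = pstdev(values)  # assumption: population standard deviation
    if deviation == 0:
        return [0.0 for _ in values]  # assumption: constant sample maps to 0.0
    return [(v - average) / deviation for v in values]


def zscore_rows(matrix):
    """Normalize each row of the matrix by its own average and deviation."""
    return [zscore(row) for row in matrix]


def zscore_columns(matrix):
    """Normalize each column of the matrix by its own average and deviation."""
    normalized_columns = [zscore(column) for column in zip(*matrix)]
    # Transpose back so the result has the same shape as the input.
    return [list(row) for row in zip(*normalized_columns)]
```

The same `zscore` helper is applied whether the sample is a row, a column, or any other set of values, which is the point made above about the Z-score definition.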

It would have been prima facie obvious under the “teaching-to-modifying” rationale (some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention (MPEP § 2143 I.G.)) to one of ordinary skill in the art at the time of the invention to modify Abdi’s calculation of Z-scores and combine it with Shendure in view of Drmanac’s method and Brodzik’s method, to achieve the claimed limitations with an expectation of success, because all three inventions (Shendure, Drmanac and Brodzik) deal with SNP genotyping information processing (while Abdi is about general statistical analysis). An ordinary artisan would .

Claims 74-75 are rejected under 35 U.S.C. 103 as being unpatentable over Shendure, Drmanac and Brodzik as applied to claim 73 above, and further in view of Bassett (“Methods And Systems For Identification Of Causal Genomic Variants”, US 20140359422 A1, Published 2014-12-04).

Regarding Claim 74, Shendure in view of Drmanac and Brodzik teaches the computer-implemented method of claim 73, yet is silent on choosing a mask for each pair key, wherein the mask assigns a class value to each counting value corresponding to both the pair key and the reduced distance. Bassett teaches using a mask for SNVs (“In some embodiments the filtering masks variants not associated with the biological information. In some embodiments the filtering masks variants associated with biological information” [0048]; “In some embodiments the genetic first data set has been previously filtered and wherein a subset of the data points in the first data set have been masked by the previous filter” [0040]; “In some embodiments the biological context filter is configured to accept a mask from another filter previously performed on the same data set” [0009]).

Regarding Claim 75, Shendure in view of Drmanac, Brodzik and Bassett teaches the computer-implemented method of claim 74, and Brodzik further teaches the class value is one of the following values: 0 or 1 (“Accordingly, in the methods of the present 
It would have been prima facie obvious under the “teaching-to-modifying” rationale (some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention (MPEP § 2143 I.G.)) to one of ordinary skill in the art at the time of the invention to modify Bassett’s method by choosing a mask that assigns a class value according to Brodzik to each counting value corresponding to both the pair key and the reduced distance, and thereby further simplify the reduced genome representation according to Shendure in view of Drmanac, with an expectation of success, because all four inventions deal with identifying SNV keys efficiently at a genome level, and those SNV keys and the distances between them are crucial for genome representation in the claimed invention. An ordinary artisan would expect that masking and assigning a class value to the “pair key”/“reduced distance” (per Bassett and Brodzik) would successfully further simplify the reduced genome representation when combined with the methods of Shendure in view of Drmanac.



Conclusions

No claims are allowed.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GUOZHEN LIU whose telephone number is (571)272-0224. The examiner can normally be reached Monday-Friday 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz R Skowronek can be reached on (571)272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For 

/GUOZHEN LIU/Patent Examiner, Art Unit 1631                                                                                                                                                                                                        
/Soren Harward/Primary Examiner, Art Unit 1631