DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
	Applicant’s amendment and response, filed 8/5/2022, have been entered and considered, but are not completely persuasive.
	The IDS filed 8/5/22 has been entered and considered.
	Claims 1-50 are under examination. Claims 48-50 are newly added.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-50 remain rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention. Applicant’s arguments will be addressed below.
	The examiner has given the claims the broadest reasonable interpretation of the claims as set forth in MPEP 2173.  In this interpretation, words of the claim are given their plain meaning, unless such meaning is inconsistent with the specification. However, limitations from the specification cannot be read into the claims.  
“The plain meaning of a term means the ordinary and customary meaning given to the term by those of ordinary skill in the art at the time of the invention. The ordinary and customary meaning of a term may be evidenced by a variety of sources, including the words of the claims themselves, the specification, drawings, and prior art. However, the best source for determining the meaning of a claim term is the specification - the greatest clarity is obtained when the specification serves as a glossary for the claim terms. The presumption that a term is given its ordinary and customary meaning may be rebutted by the applicant by clearly setting forth a different definition of the term in the specification. In re Morris, 127 F.3d 1048, 1054, 44 USPQ2d 1023, 1028 (Fed. Cir. 1997) (the USPTO looks to the ordinary use of the claim terms taking into account definitions or other "enlightenment" contained in the written description); But c.f. In re Am. Acad. of Sci. Tech. Ctr., 367 F.3d 1359, 1369, 70 USPQ2d 1827, 1834 (Fed. Cir. 2004) ("We have cautioned against reading limitations into a claim from the preferred embodiment described in the specification, even if it is the only embodiment described, absent clear disclaimer in the specification."); In re Bigio, 381 F.3d 1320, 1325, 72 USPQ2d 1209, 1211 (Fed. Cir. 2004) (The claims at issue were drawn to a "hair brush." The court upheld the Board’s refusal to import from the specification a limitation that would apply the term only to hairbrushes for the scalp. "[T]his court counsels the PTO to avoid the temptation to limit broad claim terms solely on the basis of specification passages.").”

	The metes and bounds of claim 1 are unclear.  In claim 1, the nature of the “crop sequence information” remains indefinite.  It is unclear if a full genomic sequence, a haplotype, a genotype, or some other set of sequence information is required.  Even if the Examiner assumes that “crop sequence” refers to a DNA or RNA sequence of the crop, it could also logically encompass a sequence of crop rotation or a sequence of actions applied to a crop (e.g., plant, water, weed, grow, harvest).  This is a reasonable interpretation when applied to production of food crops, as illustrated by Sarangi (2015), who apply LDA generative topic model processes to the activities of farmers in the cultivation of a crop. 
“In this paper, we use the contextual information of agriculture to accurately classify the manual agricultural activities like weeding, bedmaking, digging, sowing etc. We rely on the fact that the agricultural activities are not independent of each other as they follow a particular sequence of occurrence. Cultivation of any crop is associated with a particular pattern of farming activities which is also unique for a given geography and a cultivation technique. This pattern of agricultural activities is called Crop Protocol or Package of Practices (PoP). The knowledge of this a priori information of crop protocol when used in conjunction with the observed set of activities can be used to recognize the major activity.”  
A garment or shirt is created to classify farming activities performed in the field, and provides a likelihood probability that a particular activity is performed on a particular day on a particular plot of land.  
When considering “crop sequence information” to relate to genetic data, including DNA, RNA, protein, etc., the specification sets forth at paragraph [0010] that “in particular, various embodiments of the solution described herein include the accessing of genetic sequence information representative of a first set of organisms that include a known state of an organism feature.”  This is not a limited definition, and it does not define “crop sequence information.”  [0013] further identifies certain aspects of “genetic sequence information”; however, this is not a definition of “crop sequence information.”  [0015] sets forth possible “organisms” falling within the scope of the invention, crop plants included.  [0016] sets forth a list of possible “organism features” which can be associated with the generative interpolation model.  [0017] first refers to the term “crop sequence information” and identifies “The sequence information associated with a set of candidate crops can be accessed from local memory, from communicatively coupled databases, or from one or more client device associated with the requesting entity.”  This is not a clear, limited definition of “crop sequence information.”  [0018] sets forth that the “crop sequence information” may be “genotypes”; however, if this is the provided genetic information, it is unclear how genotypes, as opposed to sequence reads or genomic sequence information, are to be transformed into k-mers.  Genotypes are identifications of alleles at certain loci in a genome of an organism, and do not provide a full genomic sequence or full genome of that organism.  As defined by [0040], “a k-mer is a subsequence of k contiguous characters within a second sequence where k is any real integer. k is at least 1 character.”  Generating such k-mers appears to be impossible from the genotypes of [0018], which are not representations of contiguous sequence data, but individual genetic loci.  
[0020] goes on to define “microbe sequence information” which is distinct from “crop sequence information.”  [0046] provides a definition of “sequence” as “a character string representation of nucleotides…or amino acids…[that] may represent all of the genetic or proteomic information of a single organism…” but it is unclear how the sequence information can be both genotypes and full sequences, and this recitation appears to be directed to “all microorganisms in a culture collection…” and not specifically to crop plants.  The examiner has provided multiple interpretations of “crop sequence information” which are not interchangeable, and does not find a clear limited definition of the term in the specification.  As such, the term remains indefinite.
Further in claim 1, in the training of the topic model, the metes and bounds of the “k-mer” remain indefinite with respect to the new limitation for the new unspecified transformation. No definition of a particular length or other characteristic of the k-mer is set forth in the claim such that one of skill could generate and apply these k-mers to any generative topic model.  As cited previously, [0040] sets forth a definition of a k-mer; however, this is not a clear, definite recitation of the transformation, nor an identification of a specific length or other aspect of the k-mer. 
“[0040] A “k-mer” or “kmer” is a subsequence of k contiguous characters within a second sequence, where k is any real integer. k is at least 1 character. In some embodiments, k is greater than 2 and less than 10 characters. For example, a subsequence that is 6 nucleotides in length may be referred to as a 6-mer and a subsequence that is 20 nucleotides in length may be referred to as a 20-mer. k-mer size is chosen to 1) maximize the ability to predict / make recommendations when identifying target candidates, while 2) minimizing computational load as the number of nucleotide k-mers scales as 4^k. In practice, the smallest k-mer size that accurately represents phylogeny in the training data is chosen.”
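For illustration only, the definition quoted from [0040] can be sketched as follows. The sketch assumes a contiguous nucleotide string as input; the function names and the example sequence are hypothetical and are not taken from the specification or the claims. It shows that k-mers are overlapping substrings of a contiguous sequence, and that the number of possible nucleotide k-mers grows as 4^k, which is why the choice of k matters.

```python
from collections import Counter


def kmers(sequence: str, k: int) -> list[str]:
    """Return every contiguous substring of length k, in order of occurrence."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count how often each k-mer occurs in the sequence."""
    return Counter(kmers(sequence, k))


def possible_kmers(k: int, alphabet_size: int = 4) -> int:
    """Number of distinct k-mers over an alphabet (4 for nucleotides)."""
    return alphabet_size ** k


if __name__ == "__main__":
    example = "ACGTAC"  # hypothetical contiguous nucleotide sequence
    print(kmers(example, 3))
    print(kmer_counts("AAAA", 2))
    print(possible_kmers(6), possible_kmers(10))
```

A sequence shorter than k yields no k-mers, and isolated allele calls at separate loci supply no contiguous string from which such substrings could be drawn.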

Claim 1 does not provide the necessary data to choose the k-mer size as set forth in [0040] as it does not set forth any particular target candidate (for which to make recommendations or predictions) nor any limitation to minimizing computational load.  Claim 1 does not set forth any particular organisms from which a phylogeny can be identified or represented by the k-mer.
Further in claim 1, the metes and bounds of the “generative topic model” remain unclear.  The model is to be trained on a set of “training crop sequence information” which is no more definite than any other recitation of the term.  Further, the training limitation states that the model is “configured to convert crop sequence information into a latent space representation of k-mers within the crop sequence information.”  It is unclear if this is a part of the training process, or a step taken after the training process.  The origin of the training crop sequence information is not clearly set forth, and while it can be assumed to be the same crop as the “first crop sequence information” of the next limitation, that is not an actual limitation of the claim.  The metes and bounds of the term “convert…into a latent space representation of k-mers” in the training limitation are unclear, as it is unclear if the “conversion” is the same as the “transforming…into k-mers of a predetermined size” step for the first crop sequence information; no latent space representation is provided in that step.  Utilizing two separate terms suggests two separate mathematical operations taken on the listed “sequence information.”  It is further unclear in this limitation what the topics of the “generative topic model” are intended to reflect.  No labels or characteristics are clearly identified as applying to crops or crop features.
[0065] discusses topics in an LDA approach for a GTM, and notes that:
 “a sequence can be analogized to a document, which is constructed of topics. The topics can include words and by extension to a sequence, k-mers. These topics are shared across all sequences. The proportion of each topic in a strain’s marker gene sequence is a natural metric for comparison. By examining topics, one or more embodiments calculate distances between strains based on correlated k-mer abundances. This can be undertaken rather than use the reads themselves. These distances can be used as relatedness metrics or scores. When used with a generative model, these distances can be used to identify target candidates that are related to organisms having known features of interest…[0066] In one embodiment, a sequence is a collection of topic proportions thereof. Further, in one embodiment a topic is a collection of k-mer frequencies. These collections are probability distributions. Unsupervised machine learning and these probability distributions and subsets thereof can be used with a generative model to output candidate targets. Such candidate targets are predicted to have a given feature of interest…[0067] In one embodiment, a sequence comprises topics, and these topics are comprised of patterns of k-mers and correlations of k-mers. The topics are available for use across all sequences, though an individual sequences may lack one or more of the topics. The number of topics is chosen as the smallest number to completely represent the sequence data…[0068] The number of non-null topics found will be less than or equal to the total number of topics that is set. The number of topics necessary to properly represent the sequence data can be found by varying the total number of topics, for example, by increasing the number of topics used until the number of non-null topics stabilizes. The proportion of each topic in an organism’s sequence is a natural metric for comparison. 
By examining topics, the distances between strains based on correlated k-mer abundances, rather than the raw abundances themselves, were calculated. Some embodiments of the present disclosure provide a method for using sequences to create a representation of a phylogeny such as for example a phylogenetic latent space representation or model.”
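For illustration only, the comparison described in [0065]-[0068] can be sketched as a distance between per-strain topic proportions. The sketch assumes that each strain has already been reduced to a probability vector over topics; the strain names and values are hypothetical, and the Hellinger distance is an assumed choice, as the quoted passages do not identify a particular distance measure.

```python
import math


def hellinger(p: list[float], q: list[float]) -> float:
    """Hellinger distance between two probability vectors (0 = identical, 1 = disjoint)."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of topics")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


def nearest(name: str, table: dict[str, list[float]]) -> str:
    """Return the other strain whose topic proportions are closest to `name`."""
    others = (other for other in table if other != name)
    return min(others, key=lambda other: hellinger(table[name], table[other]))


# Hypothetical topic proportions; each row sums to 1.
topic_proportions = {
    "strain_A": [0.7, 0.2, 0.1],
    "strain_B": [0.6, 0.3, 0.1],
    "strain_C": [0.1, 0.1, 0.8],
}

if __name__ == "__main__":
    print(nearest("strain_A", topic_proportions))
```

Nothing in such a distance indicates what any individual topic represents or which feature it relates to; it only ranks strains by similarity of their topic mixtures.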

Without some direction as to the type of topics which are to be utilized in the GTM the claim fails to particularly point out and distinctly claim how the GTM is specifically to be trained, and its output used in any further step of claim 1.  No tables or lists of abundances, frequencies, proportions, patterns, or distances between strains appear to be provided by the specification nor does there appear to be an art recognized linkage between any given topic as defined above, and any given possible feature, or set of features.
The metes and bounds of the “accessing first crop sequence information of a first set of crops that include a crop feature” are unclear.  There is no clear provision of a database or source providing crop sequence information for a multitude of crops that also includes crop features in this limitation such that a “first crop” could be identified out of a set of multiple crops.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of crop sequence information as required.  Without a specific set of desired features and the link to a sequence or topic, it is unclear how the next model, the generative interpolation model (GIM), can then calculate a likelihood that the target crop includes such a feature.  The topic model (GTM) does not provide any linkage between sequence and features, as only sequence information is used to train the initial topic model.  Even taking the generic definition of features in [0016] or [0036] into account, without the specific selection of one or more desired features, it is entirely unclear how any classification or likelihood calculations can be made by the GIM as to whether any of the “first crop sequence information” informs as to whether the crop has a given feature.  
	Further in claim 1, the metes and bounds of the second training step, for training the generative interpolation model (GIM) are unclear.  This step requires the information from the previous GTM output to perform a classification step and calculate a likelihood that a given First Crop would have any particular feature.  Without knowledge of the data to be applied from the previous step, one of skill would not be apprised as to the classification and determinations to be applied in training the GIM.  [0037] notes that “one or more generative models can be generated using observed k-mer frequency to understand and model how sequences differ” however claim 1 is not limited in such a way.  
	Further in claim 1, the metes and bounds of the second crop information and related steps are unclear.  There is no clear provision of a database or source providing crop sequence information for a multitude of crops that also includes crop features in this limitation such that a “second crop” could be identified out of a set of multiple crops.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of crop sequence information as required. If the GIM is intended to be iterated for all known genomes of a crop or organism to generate such a database, the claim does not recite such limitations.  If the requesting entity is to provide “crop sequence information” for the GIM to then transform, classify and determine a likelihood, the steps of claim 1 do not reflect this particular set of steps.  No set of second crops having a crop feature is provided by the requestor, nor is one clearly provided by the specification or a publicly available database or system.  Further in claim 1, the requestor asks for “a crop that includes the feature” but the steps of identifying and modifying refer to “each crop of the subset of crops,” which is unclear. Claim 47, drawn to the computer program product, is indefinite in the same places.
	The metes and bounds of claims 3 and 4 are unclear.  These claims assert that either the feature is “crop performance” or that the client device is modified to display a “measure of expected crop performance.”  It is entirely unclear what sequence information is linked to any aspect of crop performance.  It is further unclear how to manipulate the feature of “crop performance” to “a measure of expected crop performance” as they appear to be different concepts.
	Independent claim 11 is drawn to a separate method and computer program product with the same limitations.  The metes and bounds of claim 11 are unclear.  The nature of the “first genetic sequence information of a first set of organisms that include an organism feature” is not clearly set forth in the claim. The specification sets forth at paragraph [0010] that “in particular, various embodiments of the solution described herein include the accessing of genetic sequence information representative of a first set of organisms that include a known state of an organism feature.”  This is not a limited definition and no source of such information is clearly provided for all the possible organism features encompassed.  [0016] sets forth a list of possible “organism features” which can be associated with the generative interpolation model but not what sequences provide those features.  [0020] goes on to define “microbe sequence information” which is distinct from “crop sequence information” but does not provide what sequences are linked to or are known to provide a given feature.  
Further in claim 11, in the training of the topic model, the metes and bounds of the “k-mer” remain indefinite with respect to the new limitation for the new unspecified transformation. No definition of a particular length or other characteristic of the k-mer is set forth in the claim such that one of skill could generate and apply these k-mers to any generative topic model.  As cited previously, [0040] sets forth a definition of a k-mer; however, this is not a clear, definite recitation of the particular transformation, nor an identification of a specific length or other aspect of the k-mer. 
“[0040] A “k-mer” or “kmer” is a subsequence of k contiguous characters within a second sequence, where k is any real integer. k is at least 1 character. In some embodiments, k is greater than 2 and less than 10 characters. For example, a subsequence that is 6 nucleotides in length may be referred to as a 6-mer and a subsequence that is 20 nucleotides in length may be referred to as a 20-mer. k-mer size is chosen to 1) maximize the ability to predict / make recommendations when identifying target candidates, while 2) minimizing computational load as the number of nucleotide k-mers scales as 4^k. In practice, the smallest k-mer size that accurately represents phylogeny in the training data is chosen.”

Claim 11 does not provide the necessary data to choose the k-mer size as set forth in [0040] as it does not set forth any particular target candidate (for which to make recommendations or predictions) nor any limitation to minimizing computational load.  Claim 11 does not set forth any particular organisms from which a phylogeny can be identified or represented by the k-mer.
	Further in claim 11, the metes and bounds of the training step for training the GIM are unclear.  This step requires the information from the previous step, the “latent space representation of the k-mers,” to perform a classification step and perform a determination.  Without specific knowledge of the k-mers, sequences and desired features to be applied from the previous step, one of skill would not be apprised as to the classifications and determinations to be made or interpolated by the GIM to determine the presence of the desired features.  
	Further in claim 11, the metes and bounds of the second training step, for training the generative interpolation model (GIM), are unclear.  This step requires the information from the previous GTM output to perform a classification step and calculate a likelihood that a given organism of the first set would have any particular feature.  Without knowledge of the data to be applied from the previous step, one of skill would not be apprised as to the classification and determinations to be applied in training the GIM.  [0037] notes that “one or more generative models can be generated using observed k-mer frequency to understand and model how sequences differ”; however, claim 11 is not limited in such a way.  
	Further in claim 11, the metes and bounds of the second genetic sequence information of a second set of organisms and related steps are unclear.  There is no clear provision of a database or source providing sequence information for a multitude of organisms that also includes features in this limitation such that a “second organism that includes the feature” could be identified out of a set of multiple organisms.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of sequence information as required. If the GIM is intended to be iterated for all known genomes of a set of organisms to generate such a database, the claim does not recite such limitations.  If the requesting entity is to provide “sequence information” for the GIM to then transform, classify and determine a likelihood, the steps of claim 11 do not reflect this particular set of steps.  No set of second organisms having a feature is provided by the requestor, nor is one clearly provided by the specification or a publicly available database or system.  
	The metes and bounds of claim 38 are unclear.  The claim lists a number of features; however, these features are not clearly and distinctly set forth in the claim such that one of skill would be apprised as to which features applicant intends to claim and apply.  The metes and bounds of each term are unclear, and not clearly and specifically defined by the specification.  The term “a genomic composition” fails to set forth what it specifically comprises, such that it could be used to discriminate between two populations. A search of the term “a genomic composition” provided no hits in a broad unlimited Google search.  Google suggested the term “a genetic composition” which is defined as a genotype (Google definition search genomic composition, downloaded 10/19/2022; see “combined printouts” reference for every instance related to this claim). It is entirely unclear whether the intended definition of “a genomic composition” is to be “genotype” or some other aspect of a genome.  With respect to “morphology,” the plain meaning is “the study of the size, shape and structure of animals, plants and microorganisms and the relationships of their constituent parts.” (Britannica definition).  It is entirely unclear what aspects of this definition are intended to be a feature to be used in the method.  With respect to “a lifestyle,” the plain meaning is: “noun. the habits, attitudes, tastes, moral standards, economic level, etc., that together constitute the mode of living of an individual or group. adjective. pertaining to or catering to a certain lifestyle: unhealthy lifestyle choices; lifestyle advertising; a luxury lifestyle hotel.” (Dictionary.com lifestyle)  It is entirely unclear how to determine a lifestyle feature of plants and microbes based upon this definition, nor is there any art accepted correlation between sequences and lifestyle for plants or microbes.   
With respect to “suitability for manufacturing or harvesting,” no particular definition was provided by a broad Google search.  Individually, no definition is provided for “suitable for manufacturing” and none is found for “a suitability for harvesting.” (Google searches).  The term “suitable” is a term of judgment or a relative term which wholly relies upon the item in question.  One must know the identity and the genomic data of the organisms, crops, or plants in order to know when or if they should be harvested. One must know the identity of the product to determine whether it is suitable for manufacturing.  No known correlations exist between any given sequence and a feature of “suitability for manufacturing or harvesting.” The claim fails to provide any such information, and no such information is provided in the specification.  The term “a compatibility with commercial practices” fails to generate a specific definition in a broad Google search.  It is entirely unclear what this feature is intended to capture or represent with respect to sequences from an organism, crop or plant.  Compatibility can only be determined between identified members.  No particular “commercial practices” are provided by the claim.  The term “compatibility” is a subjective determination based on a particular goal, and not a specific description of a particular feature.  There are no art recognized correlations between genomic information and “compatibility with commercial practices” as recited.  Similarly, the term “a compatibility with select formulations and chemical preparations” is entirely unclear.  The term, both as written and individually for “formulations” and “chemical preparations,” has no definition in Google.  
There is no art recognized correlation between any genomic information or sequences and “compatibility with select formulations and chemical preparations.”  To judge such compatibility, the organism must be known, or the “select formulation” must be known, or the “chemical preparation” desired must be known.  
With respect to “a chemical diversity production” this term is entirely unclear.  It is unclear what must display “chemical diversity” or what must produce “chemical diversity.”  A definition of “chemical diversity” on its own includes: “A US contract research organization headquartered in San Diego…” (Wikipedia). A more relevant definition by Gibbs et al. 2006: 
“The following chapter, though not exhaustive, describes the quantitative characterization of chemical diversity, a means to measure the extent of differential features and properties within a compound collection. The measurement of molecular diversity requires the definition of a chemical space. This N-dimensional chemical space is represented by a group of selected molecular descriptors. Each compound in a collection can be assigned coordinates based on the measurement of its descriptor values. Increasing distance, within the dimensions of the assigned coordinate space, should correlate with increasing diversity (or decreasing similarity) between compounds. The dimensionality of chemical structure space exceeds that of known biological functional space. The dimensionality of biological functional space has increased in recent years due to the discovery of a multitude of genes, largely from the Human Genome Project. This chapter, however, will focus on chemical diversity rather than functional diversity. Quantification of chemical diversity involves two areas: first, the predefinition of a chemical space, accomplished by selection of a diversity metric and a compound representation (i.e., molecular descriptors); and second, a rational subset selection, or classification, method dependent on efficient dimensionality reduction.”  

The specification provides no information as to how any particular genome information is to be linked to, or describe chemical diversity production.  There does not appear to be an art recognized correlation between biological sequence information, and production of chemical diversity.
	With respect to “a metabolic production” the metes and bounds of this term are entirely unclear.  Similarly to the idea of producing “chemical diversity” the particular metabolite needs to be identified before one can determine whether a given organism can produce it.  While some metabolites may be linked to certain metabolic processes in an organism, not all have been linked to a particular genomic sequence, genotype or other type of genetic data.  
It is entirely unclear how to associate these listed features from claim 38 with any particular k-mer as required in claim 11.  It is not clear that a genetic element is required for each feature listed.
	Applicant’s arguments:
	Applicant’s arguments have been considered but are not completely persuasive.  The Examiner has provided a rational basis for a judgment of indefiniteness for each rejected term, and sought to identify clear, specific definitions of those terms in the specification and the prior art.  
	As set forth in MPEP 2173: “The primary purpose of this requirement of definiteness of claim language is to ensure that the scope of the claims is clear so the public is informed of the boundaries of what constitutes infringement of the patent. A secondary purpose is to provide a clear measure of what the inventor or a joint inventor regards as the invention so that it can be determined whether the claimed invention meets all the criteria for patentability and whether the specification meets the criteria of 35 U.S.C. 112(a)  or pre-AIA  35 U.S.C. 112, first paragraph with respect to the claimed invention.”  
	In consideration of Applicant’s ability to define their invention, the MPEP 2173 states: “A fundamental principle contained in 35 U.S.C. 112(b)  or pre-AIA  35 U.S.C. 112, second paragraph is that applicants are their own lexicographers. They can define in the claims what the inventor or a joint inventor regards as the invention essentially in whatever terms they choose so long as any special meaning assigned to a term is clearly set forth in the specification. See MPEP § 2111.01. Applicant may use functional language, alternative expressions, negative limitations, or any style of expression or format of claim which makes clear the boundaries of the subject matter for which protection is sought. As noted by the court in In re Swinehart, 439 F.2d 210, 160 USPQ 226 (CCPA 1971), a claim may not be rejected solely because of the type of language used to define the subject matter for which patent protection is sought.”
	The examiner was unable to identify any special meaning attached to the rejected limitations.
	MPEP 2173: “During examination, after applying the broadest reasonable interpretation to the claim, if the metes and bounds of the claimed invention are not clear, the claim is indefinite and should be rejected. Packard, 751 F.3d at 1310 ("[W]hen the USPTO has initially issued a well-grounded rejection that identifies ways in which language in a claim is ambiguous, vague, incoherent, opaque, or otherwise unclear in describing and defining the claimed invention, and thereafter the applicant fails to provide a satisfactory response, the USPTO can properly reject the claim as failing to meet the statutory requirements of § 112(b)."); Zletz, 893 F.2d at 322, 13 USPQ2d at 1322. For example, if the language of a claim, given its broadest reasonable interpretation, is such that a person of ordinary skill in the relevant art would read it with more than one reasonable interpretation, then a rejection under 35 U.S.C. 112(b)  or pre-AIA  35 U.S.C. 112, second paragraph is appropriate.”
	The examiner has provided instances of more than one reasonable interpretation, as well as evidence of plain meanings and definitions of certain limitations. The terms were considered alone, and within the claim as a whole.  As such this rejection is maintained.
New Grounds of Rejection
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1-4, 8-12, 14, 20, 22, 24, 26-30, 32-34, 38, 40, 41-44, 47 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Peiffer et al. (2015).
Peiffer, J.A. et al. (2015) The Genetic Architecture of Maize Height. Genetics, vol 196, p1337-1356.
	Claim 1 is directed to a method of predicting or determining whether a particular crop will include a particular feature, through the use of generative topic models, using known features, and generative interpolation models to make the prediction in response to a user request.
	Peiffer is directed to understanding the genetic basis for the feature of height in a crop plant, Maize.  Peiffer utilizes genetic data, and known height loci to train a generative model on reference Maize strains, and then utilizes an interpolation model to make predictions of height.  The computer process overall is named GBLUP. These methods are all computer-implemented, and the “user device” is interpreted as a display of a general purpose computer.
	With respect to claim 1, Peiffer accesses genetic data of Maize after linkage mapping and genotyping (results section, p1339). The genetic sequence information includes the crop feature of “height,” and the plants are genotyped using 6 molecular marker sets comprising a large number of known SNPs and IBD information. Missing values for some sets were imputed using fastPHASE v1.3, or nearest neighbor algorithms such as TASSEL v3.0.  Distance-based weighted genotype averages are calculated.  This meets the crop sequence information of a first set of crops that include a crop feature in claim 1.  
	The genetic data is transformed into k-mers by overlapping windows and used to generate a latent space representation using a generative model (p1340). Generative models such as linear mixed models were utilized to partition trait variation into components (topics) of genetic variance and environmental variance.  Trait variation within each of the environments was fit in separate models. Terms retained in these models were chosen and nested within the environmental model.  The environmental models were separately fit for each environment, including a fixed effect and random effects (p1340). The combined topic model is used to predict whether a given strain will have a particular height, using an interpolation or predictive model. Other traits measured were ear height, days to pollen shed, and node counts. 
Peiffer validates genomic prediction across maize families using the rrBLUP package v2.12.0, which when applied to IBD matrices is given the name GBLUP (p1342-43).  A subset of Maize genotypes was selected and applied to the model to predict the likelihood that a particular strain has a particular feature.  When the process is completed, the likelihood is displayed on a device being used by the user.  Figure 1 is an example of displayed distributions of plant height values within and between families. GBLUP performed better than QTL models (Fig 6A) in prediction accuracy within RIL families.  Between families, PHT prediction accuracy was better using GBLUP than bagging QTL models. Within the diversity panel NCRPIS subset, GBLUP revealed significant prediction accuracies for plant height (Fig 6E, see also, Discussion section).  As such, claim 1 is anticipated.
	With respect to claim 2, the general purpose computers utilized by Peiffer receive the request to make a prediction, and display the results via an interface element (the computer monitor, throughout).  
	With respect to claim 3, the features investigated by Peiffer include ear height, plant height, pollen shed, and node count, all of which are related to crop performance (p1340-1344).
	With respect to claim 4, the predictions of plant height of Peiffer are displayed on the monitor as shown in Fig 1, 6, and in the discussion.  
	With respect to claim 8, matrices are created wherein kmers are an element of one column and features, another. (IBD matrices, p1340)
	With respect to claim 9, the general purpose computer, having an interface such as a display and keyboard/ mouse is used to select Maize strains, families, inbred lines and RIL progeny. (p1339-1340.)
	With respect to claim 10, thresholds are applied to the prediction model and only crops with a predicted feature value over the threshold are selected for display. (results, discussion).
	With respect to independent claim 11, Peiffer discloses accessing genetic sequence information of a first set of Maize families (the organism is the plant) which include features such as plant height, ear height, days to pollen shed, and node count (p1340-1344).
	The interpolative, predictive GBLUP models are trained on the transformed genetic data latent space representation. (Results section)
	A second set of genetic sequence information of a second set of maize families is obtained and applied to the model to predict the presence of the feature. (6 separate sets of maize family data were used in varying combinations for prediction or validation- Results). 
	With respect to claim 12, Peiffer uses generative topic models (results).
	With respect to claim 14, genetic training information is provided (results).
	With respect to claim 20, the IBD matrices of Peiffer are the matrices.  Each IBD segment is a given length, and present in columns of the matrix, and the maize strain in rows. (p1340).
	With respect to claim 22, DNA sequence information, marker sequence information, genome information, SNP information, QTL information are all employed.
	With respect to claim 24, the second (or third or additional) dataset is converted, and likelihood analysis is calculated using average weighted genotypes between latent space representations (p1341-1342).
	With respect to claim 26, the likelihood calculations reflect the prediction of the presence of the feature. (results). 
	With respect to claim 27, thresholds are applied to the prediction models (results, discussion). 
	With respect to claims 28-29, the organisms are plants.
	With respect to claim 30, maize plants are monocots.
	With respect to claims 32-33, the plant is corn (maize) and the plants comprise different genotypes of the same type of crop (maize). 
	With respect to claim 34, the 6 sets of maize family data are at least 10 different organisms each.
	With respect to claim 38, the features disclosed by Peiffer include genomic composition, morphology, environmental niche, days to pollen drop which is related to spore formation, gene frequency, etc.  
	With respect to claim 40, any number of additional families or strains can be tested for prediction of the features.
	With respect to claims 41-43, the general purpose computer display can be modified to display any desired information, including likelihood of the presence of the feature, data on additional subsets, etc.  Storage elements of general purpose computers can provide the genetic information for each subset.
	With respect to claim 44, the second set can be one requested by the user.
	With respect to claim 47, this computer program product claim is disclosed by Peiffer at the instances cited for claims 1 and 11.  

Claim(s) 11-12, 14-26, 28, 34, 38-44, 48-50 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kazeman (2011).
Kazeman et al. (2011) Improved accuracy of supervised CRM discovery with interpolative Markov models and cross species comparison. Nucleic Acids Research vol 39 no 22, p9463-9472.
Claim 11 is directed to methods of utilizing generative interpolative models to predict whether an organism is likely to comprise a given feature.  Genetic data of an organism is obtained, transformed to k-mers, used to generate a latent space representation, and used to train the model.  A second set of genetic sequence information for a second set of organisms is obtained and applied to the trained model to predict the presence of the feature.
	Kazeman is directed to utilizing trained generative interpolative models to predict the presence of transcriptional cis-regulatory modules (CRMs). With respect to claim 11, Kazeman accesses a first set of genetic sequence information for a first set of organisms. Gene expression data and genome data are utilized. Organisms analyzed by Kazeman include Drosophila and human (p9464).  CRMs are known to be relatively short stretches of DNA, within a certain distance from a gene, as well as in certain introns.  Unlike genes, CRMs were not known to have distinguishing properties such as open reading frames or codon usage bias (introduction).  Some CRMs are known for Drosophila and mouse.  The genetic sequence information is processed or transformed into words (k-mers) of variable or predetermined length, and transformed to a latent space representation.  This representation is used to train the IMM, which utilizes a likelihood ratio to demonstrate the presence or absence of a CRM.  These trained models are then tested using a second set of genetic data from a second set of organisms.  Kazeman utilizes Drosophila melanogaster and orthologs from other Drosophila species. 
	“Here, we advance the state-of-the-art in solving this problem by developing a new statistical approach that improves significantly upon previous methods. This new score is based on ‘interpolated Markov models’ (IMM) and the use of multi-species comparison. An IMM can be thought of as a combination of Markov chains of varying orders, and considers the frequencies of words of variable length in learning a generative probabilistic model from the training CRMs. We train an IMM on the training set of CRMs from D. melanogaster as well as their orthologs from other Drosophila species, and use the likelihood ratio of this model and a suitable null model as the score of a candidate CRM. The new method is shown to be superior to existing techniques in terms of predictive accuracy, which is assessed using a new statistical test that extends the widely used Hypergeometric test to correct for an important bias present in this test as applied to our setting.”
The computer program is named scrm-2 and can be accessed through the university website, or through a Drosophila genome viewer or browser interface (supervised CRM discovery).
As such, claim 11 is anticipated.
With respect to claim 12, the model is a generative model which generates the variable length words or kmers. P9466.  
With respect to claims 14-15, training information is provided (p9466) and sequences known to contain the feature, and those that do not were utilized (p9465-66.)  “We train two 5th order IMMs, one on the training set (‘positive’ model) and one on suitably chosen non-CRM or ‘background’ sequences (‘negative’ model), and score every candidate sequence by calculating the (log)-likelihood ratio between two models. A positive score means that the candidate sequence is more likely to have been generated by (i.e. is more similar to) the positive model. This score is henceforth called the IMM score.”
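For illustration only, the positive/negative model scoring quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not the scrm-2 program: a fixed-order Markov chain with add-one smoothing stands in for the interpolated Markov model, and the function names (train_markov, log_likelihood, llr_score), the order of 2, and the example sequences are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes


def train_markov(seqs, order=2):
    """Count (context -> next base) transitions over the training sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    return counts


def log_likelihood(seq, counts, order=2):
    """Log-probability of seq under the model, with add-one smoothing."""
    total = 0.0
    for i in range(order, len(seq)):
        ctx = counts.get(seq[i - order:i], {})
        num = ctx.get(seq[i], 0) + 1
        den = sum(ctx.values()) + len(ALPHABET)
        total += math.log(num / den)
    return total


def llr_score(seq, pos, neg, order=2):
    """Log-likelihood ratio: positive means seq resembles the positive model."""
    return log_likelihood(seq, pos, order) - log_likelihood(seq, neg, order)


# Hypothetical toy training sets standing in for CRM and background sequences.
pos_model = train_markov(["ACACACACACAC", "CACACACACA"])
neg_model = train_markov(["GTGTGTGTGTGT", "TGTGTGTGTG"])
print(llr_score("ACACACAC", pos_model, neg_model) > 0)
```

A candidate that resembles the positive training set receives a score above zero, and one that resembles the background set receives a score below zero, which mirrors the sign convention described in the quoted passage.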
With respect to claims 16-19, updating, regeneration or responding to new data is exemplified in the regeneration of the initial IMM to utilize training CRMs from D. melanogaster as well as their orthologs from 10 other Drosophila genomes. P9466.
With respect to claim 20, matrices of the kmers are generated, where rows comprise organism information and columns represent kmer information. P9466.
With respect to claim 21, the latent space representation comprises a sparse representation of kmers. P9466.  
With respect to claim 22, DNA sequence information, genome sequence information, and gene expression data, which is based on RNA expression information, are all utilized (throughout). 
With respect to claim 23, the locus length aware hypergeometric test (LLHT) is a distance based measure. P9465.
With respect to claim 24, the sequence information of the second set of organisms is converted to a second latent space representation, and a likelihood is determined based on the covariance of weighted averages of the variables in each representation (results, p9466): “score every candidate sequence by calculating the (log)-likelihood ratio between two models. A positive score means that the candidate sequence is more likely to have been generated by (i.e. is more similar to) the positive model. This score is henceforth called the IMM score.”
With respect to claim 25, the IMM is a Gaussian process model.
With respect to claim 26, the IMM score designates whether the organism comprises the feature.  
With respect to claim 28, Drosophila, mouse and human sequence information has been explored.
With respect to claim 34, at least 10 types of Drosophila species were studied (results).
With respect to claim 38, the feature of the CRM is related to gene expression, genomic composition, frequency of CRM kmers, frequency of genes, etc.
With respect to claim 39, Kazeman validates the predictions by in vitro testing to determine if a CRM is present.  P9469-9470.
With respect to claims 40-43, the general purpose computers of Kazeman or the browser implementation of the program accept requests or recommendations to obtain additional sets of organism data, to display information about the second set of organisms, and to display the likelihood that the second set of organisms comprise the feature, and the additional datasets can be provided by accessing the web, or an internal storage device. 
With respect to claim 44, the requestor selects the second set.
With respect to claim 48, kmer frequency is determined for all the variable kmers, and used to generate the model.  P9466
With respect to claim 49, kmer length, while variable, is chosen to reduce or minimize load.  P9466
With respect to claim 50, at least some of the kmers are 6mers. P9466.

Claim(s) 11-24, 26-28, 34, 38, 40-44, 47-49 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yeh (2010).
Yeh, J-H. et al. (2010) Protein remote homology detection based on Latent Topic Vector Model. 2010 International conference on Networking and information technology, p456-460.
Yeh is directed to “feature extraction and efficient representation of protein vectors for SVM protein classification. The experiment uses protein database from Structural Classification of Proteins version(SCOP) 1.53 with latent topic extraction technique (Latent Dirichlet Allocation model) which is an efficient feature extraction technique from natural language processing. The basic building blocks of our model are word documents generated from protein sequence by N-gram segmentation and filtered by TF-IDF method. Then the LDA phase applies on these documents for latent topic extraction while the SVM method acts as a classifier of latent topic. In our experiment, the LDA-SVM model outperforms than LSA-SVM model in the previous research.” (abstract)
With respect to claim 11, Yeh discloses obtaining protein sequence information of a first set of proteins that include a feature of homology or “belongingness” from SCOP (abstract). The protein sequences are transformed to kmers or words of a predefined length (materials and methods, Building block: biological word).  These kmers are used to generate a latent space representation, to train a generative model to classify whether a given protein sequence word would belong to a specific protein.  (Fig 1). Test sequences are applied to the trained model to classify, and determine the performance of the classification. 
“the first part is the generation of building blocks of protein sequences. In this part, the a simple block generation method frequently used in computational linguistics domain is used, then a filtering method to reduce the noises contained in the generated building blocks is adopted. The second part is to create the latent topic model and generate topic belongingness for each protein sequence. In this step, a latent topic model of training data is created and the model similarity of each training protein sequence is calculated. The third part is to train the classifier based on these model similarities with binary class labels. Finally the test protein sequences are classified based on these steps to find the performance of classification.” (Introduction, Fig 4).
With respect to claims 12-13, Yeh utilizes LDA, latent Dirichlet allocation topic models (throughout, materials and methods, fig 1). 
With respect to claims 14-15, training data is applied to create the LDA. (p457.) The data comprises “terms” which are valid and those that are not. 
“The core component of our proposed method is a latent topic model based on Latent Dirichlet Allocation (LDA)[9]. In our previous research, the latent topic model has been proven to be a good method for movie recommendation[10] and image searching[11]. In this paper, the training data (protein sequences) are first processed by N-grams and TF-IDF. After this step, each protein sequence is converted into a "biological document" with filtered N-gram blocks. These generated training documents are prepared to create the latent topic model using LDA. After creation of the latent topic model, the training documents are then calculated the model similarities with the model. Since LDA computes the co-occurrence of the "terms" contained in the documents (N-gram block in our scenario), the model similarity is represented by a M-topic vector which shows the topic belongingness of a document. Fig. 1 shows the topic vectors of documents.” P457
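For illustration only, the TF-IDF filtering step referenced in the quoted passage can be sketched as follows. This is a minimal sketch and not Yeh's implementation: it assumes one common TF-IDF variant (term frequency normalized by document length, inverse document frequency as log(N/df)), and the function names (tfidf, filter_tokens), the threshold, and the example documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs: list of token lists. Returns one {token: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def filter_tokens(docs, threshold):
    """Keep only tokens whose TF-IDF weight exceeds a (hypothetical) threshold."""
    return [[t for t in doc if w[t] > threshold] for doc, w in zip(docs, tfidf(docs))]


# Hypothetical "biological documents" made of 3-gram blocks.
docs = [["MKV", "KVL", "VLA"], ["MKV", "AAA", "AAG"]]
print(filter_tokens(docs, 0.0))
```

Under this variant, a block that occurs in every document has an inverse document frequency of zero and is filtered out, which is one way such a filter can reduce uninformative blocks before topic extraction.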
With respect to claims 16-19, updating and regenerating the model can take place when new sequences are added to SCOP for either the training or test sets (p457).
With respect to claim 20, matrices of the kmers are generated with row and column information. “the N-gram method is the simplest one among the three methods, but it is an efficient pattern creation method widely used in computational linguistics domain. This method enumerates all blocks in a sequence with length N, which consists of consecutive amino acids in that sequence. The maximum possible block patterns are 20^N since there are 20 different amino acid in total.” P457
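For illustration only, the N-gram enumeration described in the quoted passage can be sketched as follows. This is a minimal sketch and not Yeh's code: the function names (ngrams, max_patterns) and the example sequence are hypothetical, and overlapping windows with a step of one residue are assumed.

```python
def ngrams(sequence, n):
    """Enumerate all overlapping blocks of n consecutive residues."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def max_patterns(n, alphabet_size=20):
    """Upper bound on distinct blocks: alphabet_size ** n (20^N for amino acids)."""
    return alphabet_size ** n


print(ngrams("MKVLA", 3))
print(max_patterns(3))
```

For a five-residue sequence and N = 3, the sketch yields three blocks, and the upper bound on distinct 3-residue blocks over 20 amino acids is 8,000.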
With respect to claim 21, the training data is sparse.
With respect to claim 22, amino acid sequence information is obtained.
With respect to claim 23, distance metrics can be calculated. (Fig 4 and p458).
With respect to claims 24 and 26, the test sequences are transformed to a latent space representation, and a likelihood determination is performed to predict whether the protein has the desired feature/ topic/ word. Fig 4. 
With respect to claim 27, each term or word is associated with an above threshold likelihood that the term belongs to the “document” which is the full protein sequence. (Results). 
With respect to claim 28, all types of organisms contribute protein sequences to SCOP.
With respect to claim 34, the sets of organisms are 54 protein families, and 4,352 sequences.
With respect to claim 38, genomic composition, frequency of term, morphology, etc. are possible topics which are related to the “belongingness” of the kmer to the full protein sequence.
With respect to claims 40-44, the general purpose computer displays results, accepts requests, obtains data from the web, or from a storage device, and allows selection of the sets of families or organisms by the user.
With respect to claim 47, the computer program product is anticipated at the same places as for claim 11. 
With respect to claim 48, kmer frequencies are calculated and used to generate a topic model (Fig 4).
With respect to claim 49, the program uses trigrams (three amino acid words) to minimize computational load.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 45-46 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kazeman in view of Heydari (2010).
Kazeman et al. (2011) Improved accuracy of supervised CRM discovery with interpolative Markov models and cross species comparison. Nucleic Acids Research vol 39 no 22, p9463-9472.
Heydari et al. (2010) A review on biological control of fungal plant pathogens using microbial antagonists. Journal of Biological Sciences vol 10 (4) 273-290.
Claim 11 is directed to methods of utilizing generative interpolative models to predict whether an organism is likely to comprise a given feature.  Genetic data of an organism is obtained, transformed to k-mers, used to generate a latent space representation, and used to train the model.  A second set of genetic sequence information for a second set of organisms is obtained and applied to the trained model to predict the presence of the feature.
Claims 45-46 depend from claim 11, and are directed to using identified microbes or communities of microbes as a crop treatment.
Kazeman discloses the method of claim 11 as set forth above, and is directed to utilizing trained generative interpolative models to predict the presence of transcriptional cis-regulatory modules (CRMs). With respect to claim 11, Kazeman accesses a first set of genetic sequence information for a first set of organisms. Gene expression data and genome data are utilized. Organisms analyzed by Kazeman include Drosophila and human (p9464).  CRMs are known to be relatively short stretches of DNA, within a certain distance from a gene, as well as in certain introns.  Unlike genes, CRMs were not known to have distinguishing properties such as open reading frames or codon usage bias (introduction).  Some CRMs are known for Drosophila and mouse.  The genetic sequence information is processed or transformed into words (k-mers) of variable or predetermined length, and transformed to a latent space representation.  This representation is used to train the IMM, which utilizes a likelihood ratio to demonstrate the presence or absence of a CRM.  These trained models are then tested using a second set of genetic data from a second set of organisms.  Kazeman utilizes Drosophila melanogaster and orthologs from other Drosophila species. 
“Here, we advance the state-of-the-art in solving this problem by developing a new statistical approach that improves significantly upon previous methods. This new score is based on ‘interpolated Markov models’ (IMM) and the use of multi-species comparison. An IMM can be thought of as a combination of Markov chains of varying orders, and considers the frequencies of words of variable length in learning a generative probabilistic model from the training CRMs. We train an IMM on the training set of CRMs from D. melanogaster as well as their orthologs from other Drosophila species, and use the likelihood ratio of this model and a suitable null model as the score of a candidate CRM. The new method is shown to be superior to existing techniques in terms of predictive accuracy, which is assessed using a new statistical test that extends the widely used Hypergeometric test to correct for an important bias present in this test as applied to our setting.”
The computer program is named scrm-2 and can be accessed through the university website, or through a Drosophila genome viewer or browser interface (supervised CRM discovery).
	Kazeman does not address using identified organisms as a crop treatment. 
	Heydari et al. is a review addressing the application of microbial compositions to plants, to treat or prevent fungal plant pathogen growth.  The review details how various microbes or microbial communities, having a particular feature or activity, are being used in the treatment of crop plants.  This technology has the benefit of not requiring harmful chemical pesticides or antibiotic treatments.  The terminology of “biological control” is defined by Heydari as: 
“Biological control is the inhibition of growth, infection or reproduction of one organism using another organism. Biocontrol is environmentally safe and in some cases is the only option available to protect plants against pathogens. Biological control employs natural enemies of pests or pathogens to eradicate or control their population. This can involve the introduction of exotic species, or it can be a matter of harnessing whatever form of biological control exists naturally in the ecosystem. The induction of plant resistance using non-pathogenic or incompatible microorganisms is also a form of biological control.” 

The treatment of a crop requires some level of direct or indirect contact with the plant.  “[I]n order to interact, organisms must have some form of direct or indirect contact. The types of interactions between plants and microorganisms have been referred to as mutualism, protocooperation, commensalisms, neutralism, competition, amensalism, parasitism and predation.” (introduction, p275).
	In consideration of crop plants, and the protection of crop production Heydari notes that: 
“Fungal plant pathogens are among the most important factors that cause serious losses to agricultural products every year. Biological control of plant diseases including fungal pathogens has been considered a viable alternative method to chemical control. In plant pathology, the term biocontrol applies to the use of microbial antagonists to suppress diseases. Throughout their lifecycle, plants and pathogens interact with a wide variety of organisms. These interactions can significantly affect plant health in various ways. Different mode of actions of biocontrol-active microorganisms in controlling fungal plant diseases include hyperparasitism, predation, antibiosis, cross protection, competition for site and nutrient and induced resistance. Successful application of biological control strategies requires more knowledge-intensive management. Various methods for application of biocontrol agents include: application directly to the infection court at a high population level to swamp the pathogen, application at one place in which biocontrol microorganisms are applied at one place (each crop year) but at lower populations which then multiply and spread to other plant parts and give protection against pathogens and one time or occasional application that maintain pathogen populations below threshold levels.” (abstract)

	Differing microbes have differing characteristics which have differing effects on certain crop plants.  
“[B]iocontrol products are applied against seed borne and soil borne fungal pathogens, including the causal agents of seed rot, damping-off and root rot diseases. These products are mostly used as seed treatment and have been effective in protecting several major crops such as wheat, rice, corn, sugar beet and cotton against fungal pathogens. However, in some cases, biocontrol microorganisms have also been tested as spray application on foliar diseases, including powdery mildew, downy mildew, blights and leaf spots. A few post-harvest fungal diseases have also been controlled by the use of antagonistic fungi and bacteria.” (abstract)

	Through intensive study and experimentation, some bacteria, or microbes or communities of microbes have been found to have beneficial effects when applied to the crop at some stage of the growth process.  At pages 276-279 Heydari discusses various types of biological control processes that have been discovered, such as the application of a virus which infects the fungal cause of Chestnut Blight, and reduces the pathogenicity of the fungus.  Heydari notes that an example of hyperparasitism is utilizing several microbes that parasitize powdery mildew pathogens.  
“In some cases, a single fungal pathogen can be attacked by multiple hyperparasites. For example, Acremonium alternatum, Acrodontium crateriforme, Ampelomyces quisqualis, Cladosporium oxysporum and Gliocladium virens are just a few of the fungi that have the capacity to parasitize powdery mildew pathogens.” (p277) 
An example of anti-biosis is how some microbes produce antibiotic compounds which are effective anti-fungals:  “It has been shown that some antibiotics produced by microorganisms are particularly effective against plant pathogens and the diseases they cause... In all cases, the antibiotics have been shown to be particularly effective at suppressing growth of the target pathogen in vitro and/or in situ conditions. An effective antibiotic must be produced in sufficient quantities (dose) near the pathogen. In situ production of antibiotics by several different biocontrol agents has been studied.” 
Some microbes produce metabolites which affect the fungal pathogen.  “Many biocontrol active microorganisms produce other metabolites that can interfere with pathogen growth and activities. Lytic enzymes are among these metabolites that can break down polymeric compounds, including chitin, proteins, cellulose, hemicellulose and DNA ... Studies have shown that some of these metabolites can sometimes directly result in the suppression of plant pathogens. For example, control of Sclerotium rolfsii by Serratia marcescens appeared to be mediated by chitinase expression.” 
	Heydari notes that the genetic basis for all of the characteristics of a microbe which can suppress a pathogen of a crop was not known at the time (2010).  Heydari specifically notes that advances in computing, molecular biology, analytical chemistry and genetics will advance the understanding of these relationships between microbes, plants and pathogens.  Identifying new strains and microbes which produce antibiotic or metabolic compounds that affect pathogens is a major goal in this field.  
“Since fungal plant pathogens are very diverse and their pathogenicity is different on host plants, it is therefore very important to look for new and novel biocontrol microorganisms with different mechanisms. In this regard, the following criteria need further investigation: The use of previously uncharacterized microbes as biological control agents; 
Study on the roles of other genes and gene products which are involved in pathogen suppression; 
The efficacy of using novel strain combinations in comparison with individual agents; and
Study on the signal molecules of plant and microbial origin which regulate the expression of biocontrol traits by different agents.” P284. 

Heydari states in the conclusion that: “Advanced molecular techniques are now being used to characterize the diversity, abundance and activities of microbes that live in and around plants, including those that significantly impact plant health. Still, much remains to be learned about the microbial ecology of both plant pathogens and their microbial antagonists in different agricultural systems. Fundamental work remains to be done on characterizing the different mechanisms by which organic amendments reduce plant disease, including those caused by fungal pathogens.” P285.
In KSR Int'l v. Teleflex, the Supreme Court, in rejecting the rigid application of the teaching, suggestion, and motivation test by the Federal Circuit, indicated that “The principles underlying [earlier] cases are instructive when the question is whether a patent claiming the combination of elements of prior art is obvious. When a work is available in one field of endeavor, design incentives and other market forces can prompt variations of it, either in the same field or a different one. If a person of ordinary skill can implement a predictable variation, § 103 likely bars its patentability.” KSR Int'l v. Teleflex Inc., 127 S. Ct. 1727, 1740 (2007).
	Applying the KSR standard of obviousness to Kazeman and Heydari, the Examiner concludes that known work in one field of endeavor may prompt predictable variations.  The problem addressed by applicant - the need to identify microbes having a particular genetic characteristic or feature suitable for the application to a crop, as recognized by Heydari - was closely analogous to the task of identifying genetic elements which control gene expression in a multiplicity of systems such as humans, Drosophila and mice, as in Kazeman. Thus, an artisan in biocontrol technology, bioinformatics, and agricultural research would have recognized the similar class of problem and the known solutions of the prior art (Kazeman), and it would have been well within the ordinary skill level to implement the system in the different environment.  The application of similar computer-based machine learning processes to genetic sequence information is the same, regardless of the source of the genetic data.  Once the desired feature has been identified, suitable training data obtained, and desired plants to be treated are identified, carrying out the programs of Kazeman would have been well within the skill of one in the art.  Kazeman made the programs available, and discussed how to treat the genetic data to be useful in these methods.  One would have been motivated to look to such processes, as in the field of agriculture and disease management of plants there is a well-known desire to reduce the use of chemicals during the growth of crops.  Chemical pesticides are harmful to the environment, and to pollinators which are utilized in crop management.  
The use of biological control is a “promising alternative strategy and has been successfully applied to control some known disease on different plants and crops.” (Heydari, conclusion). Heydari specifically points out that understanding the genes and genetic elements responsible for the desired features or anti-pathogen activities is a major goal, and may be achievable due to the improvements in computer technology, genetics, and statistical analysis.  In KSR, the Court held that "[t]he gap between the prior art and respondent's system is simply not so great as to render the system nonobvious to one reasonably skilled in the art."
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARY K ZEMAN whose telephone number is 571-272-0723.  The examiner can normally be reached on 8am-2pm M-F.  Email may be sent to mary.zeman@uspto.gov if the appropriate permissions have been filed.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karl Skowronek, can be reached at 571-272-9047.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

	/MARY K ZEMAN/            Primary Examiner, Art Unit 1672