DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
	Claims 1-47 are pending in this application.
	Applicant has filed four IDS statements: one on 3/12/18 and three on 7/30/2020.  Each IDS has been entered and considered. Applicant is reminded not to file a multitude of Patent and Non-Patent Literature citations that are not specifically related to the claims and inventions under examination.
	The CRF, sequence listings and Statements have been entered.
	The Drawings as filed are suitable for examination.
	This application is a CON under 35 USC 111 of PCT/US18/57679, and claims priority to two US provisional applications.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-47 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

	The metes and bounds of the “accessing first crop sequence information representative of each of a first set of crops that include a crop feature” are unclear.  There is no clear provision of a database or source providing crop sequence information for a multitude of crops that also includes crop features in this limitation such that a “first crop” could be identified out of a set of multiple crops.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of crop sequence information as required.  The metes and bounds of “representative” in this limitation are unclear.  The claim fails to clearly point out and distinctly claim how the information “represents” the selected information.  It is unclear if sequence identity, similarity, or certain alignments are used to determine when one set of information is “representative” of the set of crops selected.  
	The metes and bounds of the second training step, for training the generative interpolation model are unclear.  This step requires the information from the previous step: “crop sequence information representative of a target crop” to perform a classification step and 
	Further in claim 1, the metes and bounds of the second crop information and related steps are unclear.  There is no clear provision of a database or source providing crop sequence information for a multitude of crops that also includes crop features in this limitation such that a “second crop” could be identified out of a set of multiple crops.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of crop sequence information as required.  The metes and bounds of “representative” in this limitation are unclear.  The claim fails to clearly point out and distinctly claim how the information “represents” the selected information.  It is unclear if sequence identity, similarity, or certain alignments are used to determine when one set of information is “representative” of the set of crops selected.  
	The metes and bounds of claim 3 are unclear.  Claim 3 labels a feature as comprising “an above threshold expected crop performance”.  This limitation does not actually identify the feature, but a characteristic or description – “expected crop performance” does not set forth what the crop performs or produces, and does not set forth how the threshold is identified or retrieved in order to accurately use it as a label or a feature in the model.
	The metes and bounds of claim 4 are unclear.  Claim 4 provides differing “measures” based on the “probability that the identified crop includes the crop feature”.  However, how those measures are determined for any feature is not clearly set forth or particularly claimed.  Stating that a measure is “based on” a probability does not provide a step-by-step set of instructions or an algorithm which would produce the “measure of expected crop performance.”  

	The metes and bounds of claim 6 are unclear, as the claim fails to set forth the necessary information required. Claim 6 depends from claim 5.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of crop sequence information as required.  Further, it is unclear how the re-training affects the remainder of claims 6 and 1, including the results of the request.
	The metes and bounds of claim 10 are unclear, as it is not clearly set forth how to determine an “above threshold probability that the crop includes the crop feature” as required.  It is unclear what information from the first or second crop information provides this probability.  
	The metes and bounds of claim 11 are unclear.  Claim 47, the computer program product comprising instructions for the same limitations is indefinite in the same places. In claim 11, the nature of the “organism sequence information” is not clearly set forth.  It is unclear if a full genomic sequence, a haplotype, a genotype, or some other set of sequence information is required.  The Examiner assumes that “organism sequence” refers to a DNA or RNA sequence of the crop.  Further in the training of the topic model, it is unclear what the attributes of the k-mer are intended to be.  No definition of a particular length or other characteristic is set forth such that one of skill could generate and apply these k-mers to the generative interpolation model.  It 
	The metes and bounds of the “accessing first genetic sequence information representative of each of a first [or second] set of organisms that include an organism feature” are unclear.  There is no clear provision of a database or source providing organism sequence information for a multitude of organisms that also includes features in this limitation such that a “first organism”  (or second) could be identified out of a set of multiple organisms.  The nature of the feature is not clearly pointed out or distinctly claimed, such that one of skill could utilize the presence or absence of a feature to select a set of organism sequence information as required.  The metes and bounds of “representative” in this limitation are unclear.  The claim fails to clearly point out and distinctly claim how the information “represents” the selected information.  It is unclear if sequence identity, similarity, or certain alignments are used to determine when one set of information is “representative” of the set of organisms selected.  
	The metes and bounds of the second training step, for training the generative interpolation model are unclear.  This step requires the information from the previous step: “organism sequence information representative of a target organism” to perform a classification step and perform a determination.  Without knowledge of the data to be applied from the previous step, one of skill would not be apprised as to the classification and determinations to be applied.  
	Further in claim 11, the metes and bounds of the second organism information and related steps are unclear.  There is no clear provision of a database or source providing organism sequence information for a multitude of organisms that also includes features in this limitation such that a “second organism” could be identified out of a set of multiple organisms.  The nature 
	The metes and bounds of claim 17 are unclear.  The claim sets forth retraining in response to feedback indicating predictiveness of one or more k-mers for the organism feature.  Claim 17 ultimately depends from claim 11.  The method of claim 11 does not clearly provide a structure or step for accepting feedback, nor does claim 17 clearly set forth how the application of the feedback affects the remainder of the process of claim 11.  
	The metes and bounds of the “triggering event” in claim 18 are unclear.  It is unclear how and where to apply the additional input variable to the latent space representation, and the nature of the event itself is not clearly set forth and distinctly claimed.
	The metes and bounds of claim 23 are unclear, as it refers to determining a distance between input variables, and it is not clear that more than one variable has been provided.  
	The metes and bounds of claim 27 are unclear, as it is not clearly set forth how to determine an “above threshold likelihood that the organism includes the organism feature” as required.  It is unclear what information from the first or second crop information provides this probability.  
	The metes and bounds of claim 38 are unclear.  The claim lists a number of features, however these features are not clearly and distinctly set forth in the claim such that one of skill would be apprised as to which features applicant intends to claim and apply.  Terms such as “a 
	The recitations of “representative” of various characteristics in claims 41-42 are unclear for the same reasons set forth above.  The metes and bounds of the term cannot be determined for the required display absent some type of description or limitation or other identifying term.
	The metes and bounds of claim 46 are unclear in their entirety.  In claim 46, dependent on claim 45 and ultimately claim 1, no types of “treatment” or other curative or palliative features of any organism are clearly set forth.  The steps of claim 46, where claim 11 is modified to select a microbe or community of microbes as the second set of organisms, completely lack any other type of feature information which would lead to the recommendation of the application of a microbe or community of microbes as a crop treatment.  There is no linkage between the first organism set, the second organism set and any given crop.  No crop information is required to be applied or provided to claim 11.  No disease or treatment qualifications or actions are clearly and distinctly set forth such that one of skill would be apprised of how to make such recommendations.
Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-10 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Frichot et al. (2013).
Frichot et al. (2013) Testing for Associations between loci and environmental gradients using latent factor mixed models. Mol. Biol. Evol. 30:7 1687-1699.
	Claim 1 is drawn to a method, wherein a generative model is trained on sequence data of a crop, where the model configures the sequence information into a latent space representation of kmers, accesses data of a feature, generating latent space representation of kmers representative of the first set of crops, training an interpolative model to classify crop sequence information representative of a crop to determine a likelihood the crop has the feature.  The claim then provides information to a client as to a recommendation of a crop with the feature, a subset of a second set of crops with the feature, and providing the probabilities to the client.
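	For illustration only, the k-mer representation summarized above can be sketched as follows. The sketch assumes a k-mer length of 3 and raw count vectors; neither is recited in the claim, and the names kmer_counts and kmer_matrix are hypothetical. It shows only how sequence strings might be turned into a matrix with one column per k-mer, and is not the claimed latent space model or any method of the cited art.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k (k=3 is an assumption)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_matrix(sequences, k=3):
    """Build a samples-by-k-mers count matrix with a shared, sorted vocabulary."""
    counts = [kmer_counts(s, k) for s in sequences]
    vocab = sorted(set().union(*counts))  # union of all observed k-mers
    rows = [[c[kmer] for kmer in vocab] for c in counts]
    return vocab, rows


if __name__ == "__main__":
    vocab, rows = kmer_matrix(["ACGTAC", "ACGACG"], k=3)
    print(vocab)
    for row in rows:
        print(row)
```

	A matrix of this kind is the sort of input a topic model could consume; the choice of k changes the vocabulary and therefore the resulting representation.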
	Frichot provides latent factor mixed models to provide a population structure introduced with unobserved variables.  Frichot applies the models to plants and humans.

	With respect to claims 2 and 9, the requests are entered into the program as desired. All computer software and source codes are available online, meeting the limitations of all steps of receiving requests and providing results to users. (p1967)
	With respect to claims 3-4 and 10, insofar as “crop performance” is defined, the association of an SNP related to any of the stress factors set forth for loblolly pine appears to meet the limitations of the claims as a measure of crop performance (continued growth).
	With respect to claims 5-6, additional information can be uploaded and used in training, and identifying desired plants. (results, discussion).

	With respect to claim 8, Frichot provides a matrix which has a column corresponding to the presence of one or more kmers such as an SNP allele.

Claim(s) 1-29, 38 and 47 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bicego et al. (2012).
Bicego et al. (2012) Investigating topic models capabilities in Expression Microarray data classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9:8 1831-1836.
Claim 1 is directed to generative topic modeling as applied to crop organisms (plants) and crop features such as sequence information or crop performance.  Multiple individual crops, or multiple types of crops are iteratively processed.
Bicego applies generative topic modeling to gene expression data of various organisms including plants (grape plants, Introduction).  Bicego discloses using generative topic models to derive feature vectors for the classification of microarray gene expression data from two species of a crop, a grape plant.  Gene expression sequence data is converted to feature vectors in a latent space representation (section 3.1, step 2).  In Bicego, the generative embedding supposed that every topic may be approximately associated with one or more biological processes (genes, activities, etc) which involve certain particular genes and are active in certain samples.  Expression sequence data is linked to features such as gene name.  The data can be classified or categorized as desired.  Bicego provides two generative models for analysis of microarray data (sections 2.1-2.2). The latent process decomposition is a topic model designed for microarray data.  This model can handle multiple samples, multiple organisms or multiple individuals. 
	Bicego notes that using feature vectors of the topics, instead of the number of genes or sequences provides a more compact and easy-to-manage representation (p1833).  
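	For illustration only, the idea of replacing a long per-gene count vector with a short per-topic feature vector can be sketched with a pLSA-style "fold-in" step. The sketch assumes that the topic-to-gene probabilities have already been learned and are held fixed; the function name fold_in, the toy probabilities and the iteration count are hypothetical and are not taken from Bicego.

```python
def fold_in(counts, topic_word, n_iter=50):
    """Estimate P(topic | sample) by EM with fixed P(word | topic).

    counts     -- list of per-gene counts for one sample
    topic_word -- topic_word[z][w] = P(word w | topic z), assumed pre-trained
    """
    n_topics = len(topic_word)
    theta = [1.0 / n_topics] * n_topics  # uniform start
    for _ in range(n_iter):
        new_theta = [0.0] * n_topics
        for w, n_w in enumerate(counts):
            if n_w == 0:
                continue
            # E-step: responsibility of each topic for word w
            joint = [theta[z] * topic_word[z][w] for z in range(n_topics)]
            norm = sum(joint)
            if norm == 0.0:
                continue
            for z in range(n_topics):
                new_theta[z] += n_w * joint[z] / norm
        total = sum(new_theta)
        if total == 0.0:
            return theta
        # M-step: renormalise expected topic counts
        theta = [t / total for t in new_theta]
    return theta


if __name__ == "__main__":
    topics = [[0.45, 0.45, 0.05, 0.05],
              [0.05, 0.05, 0.45, 0.45]]
    print(fold_in([10, 8, 0, 1], topics))
```

	Here a four-gene count vector is reduced to a two-element topic vector, which could then be passed to any discriminative classifier.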
	Bicego applies their modeling and classification scheme to 48 samples of microarray expression data from two grapevine plant species.  One of these species is vulnerable to Plasmopara viticola infection, leading to destructive disease, and the other is resistant. The experimental design treated each species either with the pathogen, or with water as a control.  The pLSA model was trained on the gene expression data, and the model with the highest likelihood was retained. Subsequently, feature extraction identified topics such as sequences related to: stress response, hormone stimulus response, transport, transcription factor activity, cellular, lipid and carbohydrate processes, as well as photosynthesis and generation of energy. Analysis of the results of the modeling process indicated that different aspects of the dataset can be reflected in a topic, including effects of treatment overall, at different time points, plant processes during the course of infection, etc.  Disease resistance was a key feature, and the V. riparia microarray data provided a variety of genes related to resistance to this infection.  (Fig 1, section 3). 
Bicego extended their model to extract features from the models of a more complex feature: Free Energy Score, expressing how well each data point (microarray data experiment) fits different parts of the trained generative model. (experimental evaluation, section 4).  Further retraining and experimentation with the data and model compared classification, between simple Bayesian (one model per class, and classification with Bayes rule), and the supervised topic models approach, taking into account the labels from the training process.  Bicego further applies their 
With respect to claims 2, 4, 6, as well as 9-10, 39-44 and 46, the computer program and interfaces of Bicego provide the desired selectability and identification, related to crop selection and crop performance. Bicego provides computer programs and interfaces to select elements such as plant type or crop type, as well as likelihoods that the crop comprises the desired feature. (Bicego section 3).
	With respect to claim 3, in Bicego, at least one feature is related to the grape crop performance (Fig 1). 
	With respect to claim 5, the processes are repeated for the second crop, or second species of crop (with or without resistance to pathogen) in Bicego. (section 3 and 4)
	With respect to claim 7, the user is not limited, and does not exclude the listed types of users.
	With respect to claim 8, Bicego discusses matrices within the discussion of generative topic modeling in sections 2 and 3.
	With respect to claims 12-14, Bicego provides generative topic models, comprising Dirichlet modeling, and the training is on genetic sequence information (gene expression data, names, functionality). (sections 2 and 3).
	With respect to claim 15, Bicego provides genetic sequence information of plants which do and do not include the feature of resistance to a particular pathogen.
	With respect to claims 16-20, Bicego provides periodic updating of the modeling, retraining in response to feedback, regeneration of the latent space representation, triggering events such as infection or passage of time, etc. (sections 3 and 4).

	With respect to claim 22, DNA and RNA sequence information is provided, as well as gene expression data (related to amino acid sequence information and RNA sequence information) by Bicego.
	With respect to claim 23, Bicego provides determining distances between variables. (both in the descriptions of the modeling processes)
	With respect to claim 24, further latent space representations and likelihood calculations are provided by Bicego (sections 2 and 3).
	With respect to claims 25-26, Gaussian processes and likelihood calculations are discussed by Bicego in sections 2-4.
	With respect to claim 27, in Bicego, each organism in a subset has a likelihood that the organism comprises the feature (each strain of the grape plant).
	With respect to claims 28-29, Bicego discloses two plant hosts for the pathogens. (Section 3 and 4).
	With respect to claim 38, the features of Bicego comprise features such as genomic composition, gene clusters, taxonomic categorization, resistance to pathogens, viability over time, gene expression frequency, and crop performance.
	
Claim(s) 11-28, 34-35, 38-45, 47 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gerber et al. (2012).
Gerber et al. (2012) Inferring Dynamic Signatures of microbes in complex host ecosystems. PLOS Computational Biology 8:8 e1002624, 14 pages, and supplemental information.

	Gerber provides: accessing genetic sequence information (sequencing reads) of at least one organism present in a microbial community, with a trait of antibiotic resistance, as well as reference operational taxonomic units (refOTU) as a prototype signature (abstract, Dataset Summary, p3).  Gerber first provides a generative model (Fig 1) that generates a probability that the refOTU is a signature of an organism, which includes clustering of refOTU into groups sharing similar dynamics.  Fig 1D shows the generation of observed and false data through a noise model. Gerber generates latent space representations of the organism data (supplemental text, p8 “5.2, Step 2 sampling prototype signature shape variables”), and applies them to the model, MC-TIMME.  MC-TIMME is a nonparametric hierarchical generative probability model (Fig 1). It learns three levels of Signature Diversity in the organism data: SD1: intra-signature dimensionality, SD2: intra-ecosystem, a measure of the diversity of prototype signatures (representing organisms) in the sample, and SD3: inter-ecosystem, a measure of the prototype signatures across samples from multiple host ecosystems. SD1 is adaptive learning, SD2 uses Dirichlet process infinite mixture models, and SD3 maps experiments from different ecosystems to the same time-scale for continuous dynamics modeling (MC-TIMME overview, p3). Posterior distributions of MC-TIMME are generated using MCMC methods.  See also supplemental text 
	With respect to claims 12-13, the latent space representation is a generative topic model, and the type is a Dirichlet allocation model (MC-TIMME overview p3).
	With respect to claim 14, the training data is the refOTU data, and the generated noise data.
	With respect to claim 15, some of the organisms sampled have the feature of antibiotic resistance, and some do not, in the refOTU.
	With respect to claim 16, the model can be periodically updated, as in the Dynamic model for antibiotic pulses, where data from differing time points in treatment are obtained, and Automated Experimental Design, wherein the model is updated or re-trained at a later date. (p3-4, 9-10).
	With respect to claim 17, the k-mers are the sequencing reads and feedback can be used to retrain the model (Automated experimental design p9-10).
	With respect to claims 18-19, there is a triggering event, application of antibiotic, or a passage of time which is an input variable, and parameters may be adjusted or trimmed in the retraining/regeneration (automated experimental design p9-10).  
	With respect to claim 20, the matrix is described in the supplemental text at p8.
	With respect to claim 21, the latent space representation comprises the sequencing read data, as the kmers.
	With respect to claim 22, the sequence read information is DNA or RNA. (throughout).

	With respect to claim 25, Gaussian process models are discussed in the supplemental text protocol information.
	With respect to claims 26-27, likelihood calculations related to whether the organism comprises the feature are set forth in MC-TIMME overview, and supplemental protocol.
	With respect to claim 28, the organisms are bacteria and the host was Animalia (human) (Dataset summary).
	With respect to claims 34-35, a multiplicity of organisms across multiple host communities were sampled (Dataset summary).
	With respect to claim 38, sequencing reads (genomic composition), taxonomic categorization (OTU), stability over a range of conditions (antibiotic resistance and recovery), etc. are possible features (MC-TIMME, Dataset summary, etc.)
	With respect to claims 39-44, aspects of the computer program, input, selection of organisms, features, databases, etc. are provided by the MC-TIMME program and associated files.
	With respect to claim 45, both sets of organisms are microbes or communities of microbes (dataset summary).

Claim(s) 11-17, 20-24, 26-28, 34-35, 38-45, 47 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Holmes et al. (2012).
Holmes, I. et al. 2012 Dirichlet multinomial mixtures: generative models for microbial metagenomics. PLoS ONE 7:2, e30126, 15 pages.
	Holmes provides generative models for determining the presence and taxonomic classification of microbes in a human gut environment.  The human host environments included obese twins, presence of IBD, and presence of ileal Crohn’s disease. 
	With respect to claim 11, Holmes provides DMM (Dirichlet multinomial mixture models) for both clustering and classification of microbial communities.  As set forth in the Introduction, Holmes provides: “The natural prior for the parameters of the multinomial distribution is the Dirichlet. This is a probability distribution over probability vectors. In the context of microbial communities we can view it as describing a metacommunity from which communities can be sampled. Its parameters then describe both the mean expected community and the variance in the communities. As we will show, one of the major advantages of the Dirichlet prior is that the community parameter vectors which are unobserved can be integrated out or marginalized to give an analytic solution to the evidence: the probability that the data was generated by the model. By extending the Dirichlet prior to a mixture of Dirichlets, so that the data set is generated not by a single metacommunity but a mixture of multiple metacommunities, we obtain both a more flexible model for our data and a means to cluster communities. To perform the clustering, we simply impute for each sample the component which is most likely to have generated it. This separates samples into groups according to the metacommunity it has the highest probability of deriving from. The advantage of this approach over simple k-means type strategies is twofold: (1) the clusters can be of different sizes depending on the variability of the metacommunity, and more importantly (2) because we now have an explicit probabilistic model that is appropriate to the data, then we can use the evidence together with methods to penalise 
	Sequence information elements are vectorized to a latent space representation, and matrices are generated where the sequence element, the abundance of taxa in the community over the number of samples are all included (Materials and methods, Multinomial sampling).  A likelihood is generated for each sample that it is taken from a given community (a feature).
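	For illustration only, the per-sample likelihood and the "impute the most likely component" step described above can be sketched using the standard Dirichlet-multinomial marginal likelihood. This is the textbook closed form, written here under the assumption that it corresponds in substance to the evidence referred to in Holmes; it is not a transcription of Holmes' equations, and the function names, mixture weights and toy parameters are hypothetical.

```python
from math import lgamma, log, exp


def dm_log_likelihood(counts, alpha):
    """Log probability of a taxa-count vector under a Dirichlet-multinomial.

    counts -- observed counts per taxon for one sample
    alpha  -- Dirichlet parameters of one metacommunity (all > 0)
    """
    n = sum(counts)
    a = sum(alpha)
    # log multinomial coefficient: n! / prod(x_i!)
    log_coeff = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # log Gamma(A) / Gamma(A + n)
    log_ratio = lgamma(a) - lgamma(a + n)
    # sum of log Gamma(alpha_i + x_i) / Gamma(alpha_i)
    log_terms = sum(lgamma(al + x) - lgamma(al) for al, x in zip(alpha, counts))
    return log_coeff + log_ratio + log_terms


def most_likely_component(counts, components, weights):
    """Return the index of the mixture component with the highest posterior score."""
    scores = [log(w) + dm_log_likelihood(counts, alpha)
              for w, alpha in zip(weights, components)]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    comps = [[9.0, 1.0], [1.0, 9.0]]
    print(most_likely_component([9, 1], comps, [0.5, 0.5]))
    print(exp(dm_log_likelihood([1, 1], [1.0, 1.0])))
```

	With a flat Dirichlet (all parameters equal to 1) and two taxa, each of the three possible count vectors summing to 2 receives the same probability, which is a convenient sanity check on the formula.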
	The generative modeling steps of the DMM are disclosed in the Materials and Methods (Dirichlet mixture priors and equations 1-8).  “The Dirichlet is a conjugate prior for the multinomial: for a single Dirichlet the posterior is itself a Dirichlet with parameters obtained by summing the observed counts and the Dirichlet parameters, For the Dirichlet mixture this conjugacy is maintained and Equation 7 can also be written as a Dirichlet mixture…Maximising 
With respect to claim 20, the matrix is described in the Materials and Methods.
	With respect to claim 21, the sparse latent space representation comprises the 16S rRNA sequencing read data, as the kmers.
	With respect to claim 22, the sequence read information is RNA. (throughout).
	With respect to claims 23-24, multiple distance calculations are used in the training, and the second organism information is processed as in claim 11. (materials and methods, generative modeling, generative classifiers.)
	With respect to claims 26-27, likelihood calculations related to whether the organism comprises the feature are set forth in the general discussion of the DMM process, and the generative classifier for twins data.

	With respect to claims 34-35, a multiplicity of organisms across multiple host communities were sampled (table 1).
	With respect to claim 38, sequencing reads (genomic composition) and taxonomic categorization (OTU) are possible features (generative classifier for twin data.)
	With respect to claim 45, the organisms are microbes, or communities of microbes.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date 
Claims 1-10, 29-33, 36-37 and 46 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gerber (2012) as applied to claims 11-28, 34-35, 38-45, 47 in view of Bicego as applied to claims 1-29, 38 and 47 above.
Gerber et al. (2012) Inferring Dynamic Signatures of microbes in complex host ecosystems. PLOS Computational Biology 8:8 e1002624, 14 pages, and supplemental information.
Bicego et al. (2012) Investigating topic models capabilities in Expression Microarray data classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9:8 1831-1836.
	Rejected claims 1-10, 29-33, 36-37 and 46 are related to the application of generative models to plant/crop sequence information.
	As set forth above, Gerber provides: accessing genetic sequence information (sequencing reads) of at least one organism present in a microbial community, with a trait of antibiotic resistance, as well as reference operational taxonomic units (refOTU) as a prototype signature (abstract, Dataset Summary, p3).  Gerber first provides a generative model (Fig 1) that generates a probability that the refOTU is a signature of an organism, which includes clustering of refOTU into groups sharing similar dynamics.  Fig 1D shows the generation of observed and false data through a noise model. Gerber generates latent space representations of the organism data (supplemental text, p8 “5.2, Step 2 sampling prototype signature shape variables”), and applies them to the model, MC-TIMME.  MC-TIMME is a nonparametric hierarchical generative probability model (Fig 1). It learns three levels of Signature Diversity in the organism data: SD1: intra-signature dimensionality, SD2: intra-ecosystem, a measure of the diversity of prototype 
	In the Discussion section, pages 11-12, Gerber specifically sets forth that the MC-TIMME modeling process can be applied to plant sequence data in a variety of hosts or other communities. The Discussion sets forth the benefits of the automated experimental design for generation of appropriate data and models for a variety of organisms and hosts and features or traits.  Sequence data is specifically referred to as being desirable, and increasingly available from various sources.  Key elements of the model are identified as the infinite mixture model for prototype signatures and the generative noise model for counts data.  Additional subprocesses include sparse priors, observational studies, differing relaxation time modeling processes, and 
	In the same field of endeavor, Bicego applies generative topic modeling to gene expression data of various organisms including plants (grape plants, Introduction).  Bicego discloses using topic models to derive feature vectors for the classification of microarray gene expression data.  Gene expression sequence data is converted to feature vectors in a latent space representation (section 3.1, step 2).  In Bicego, the generative embedding supposes that every topic may be approximately associated with one or more biological processes (genes, activities, etc.) which involve certain particular genes and are active in certain samples.  Expression sequence data is linked to features such as gene name.  The data can be classified or categorized as desired.  Bicego provides two generative models for analysis of microarray data (sections 2.1-2.2). The latent process decomposition is a topic model designed for microarray data.  This model can handle multiple samples, multiple organisms or multiple individuals. Section 3 sets forth the modeling in depth, including generative model training, generative embedding, and discriminative classification. The discriminative classification step uses information from the generative process as discriminative features of a discriminative classifier.
	Bicego notes that using feature vectors of the topics, instead of the number of genes or sequences, provides a more compact and easy-to-manage representation (p. 1833).  
	Bicego applies their modeling and classification scheme to 48 samples of microarray expression data from two grapevine plant species.  One of these species is vulnerable to Plasmapara vinifera infection, leading to destructive disease, and the other is resistant. The experimental design treated each species either with the pathogen, or with water as a control.  The pLSA model was trained on the gene expression data, and the highest-likelihood model for the V. riparia microarray data provided a variety of genes related to resistance to this infection (Fig. 1, section 3). 
Bicego extended their model to extract a more complex feature from the trained models: the Free Energy Score, which expresses how well each data point (microarray data experiment) fits different parts of the trained generative model (experimental evaluation, section 4).  Further retraining and experimentation with the data and model compared classification between a simple Bayesian approach (one model per class, with classification by Bayes rule) and the supervised topic models approach, which takes into account the labels from the training process.  Bicego further applies their processes to tumor and brain datasets (Table 1). As such, Bicego meets the limitations of claims 1, 11 and 47.
In KSR Int'l v. Teleflex, the Supreme Court, in rejecting the rigid application of the teaching, suggestion, and motivation test by the Federal Circuit, indicated that “The principles underlying [earlier] cases are instructive when the question is whether a patent claiming the combination of elements of prior art is obvious. When a work is available in one field of endeavor, design incentives and other market forces can prompt variations of it, either in the same field or a different one. If a person of ordinary skill can implement a predictable variation, § 103 likely bars its patentability.” KSR Int'l v. Teleflex Inc., 127 S. Ct. 1727, 1740 (2007).


	With respect to claim 1, Bicego provides generative topic modeling meeting the limitations of claim 1 as set forth above, including computer program interfaces (sections 2 and 3).
	With respect to claims 2, 4, and 6, the computer program and interfaces of Bicego provide the desired selectability and identification, related to crop selection and crop performance.
	With respect to claim 3, in Bicego, at least one feature is related to the grape crop performance (Fig 1). 
	With respect to claim 5, the processes are repeated for the second crop, or second species of crop (with or without resistance to pathogen) in Bicego.
	With respect to claim 7, the user is not limited, and does not exclude the listed types of users.
	With respect to claim 8, both Gerber and Bicego discuss matrices.
	With respect to claims 9-10, 39-44, and 46, Bicego provides computer programs and interfaces to select elements such as plant type or crop type, as well as likelihoods that the crop comprises the desired feature (Bicego section 3).
	With respect to claims 12-14, Bicego provides generative topic models, comprising Dirichlet modeling, and the training is on genetic sequence information (gene expression data, names, functionality) (sections 2 and 3).
	With respect to claim 15, Bicego provides genetic sequence information of plants which do and do not include the feature of resistance to a particular pathogen.

	With respect to claim 21, sparse modeling is directly suggested.
	With respect to claim 22, DNA and RNA sequence information is provided, as well as gene expression data (related to amino acid sequence information and RNA sequence information) by both Gerber and Bicego.
	With respect to claim 23, Bicego and Gerber both provide determining distances between variables (both in the descriptions of the modeling processes).
	With respect to claim 24, further latent space representations and likelihood calculations are provided by Bicego (sections 2 and 3).
	With respect to claims 25-26, Gaussian processes and likelihood calculations are discussed by Bicego in sections 2-4.
	With respect to claim 27, in Bicego, each organism in a subset has a likelihood that the organism comprises the feature (each strain of the grape plant).
	With respect to claims 28-29, Bicego discloses two plant hosts for the pathogens (sections 3 and 4).
	With respect to claims 30-33, Bicego suggests applying their processes to additional types of plants, crops or genotypes of the same crop. (Results and Discussion).
	With respect to claim 34, multiple organisms and multiple communities are discussed by Bicego in the discussion section. 
	With respect to claims 34-37, the features are also provided by Gerber as set forth above.

	With respect to claim 45, the organisms are microbes or microbial communities in Gerber as set forth above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
The following references appear to be equally applicable under 35 USC 102(a)(1) for at least claim 1, claim 11, or claim 47, depending on the specific interpretation of a crop, an organism, a microbial community, a sequence feature, a crop performance feature, etc. Applicant is encouraged to consider these references if amendments are made to the claims.
Kim et al. (2015) Deciphering the human microbiome using next generation sequencing data and bioinformatic processes. Methods 79-80, p52-59.
Anesi et al. (2015) Towards a scientific interpretation of the terroir concept: plasticity of the grape berry metabolome. BMC plant biology 15: 191. 17 pages.
Hill, S. T. (2016) The pursuit of hoppiness: propelling hop into the genomic era. Thesis, Oregon State University, 80 pages.
Li et al. (March, 2017) Persistent homology and the branching topologies of plants. American Journal of Botany. 104:3 349-353.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARY K ZEMAN whose telephone number is 571-272-0723.  
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karl Skowronek, can be reached at 571-272-9047.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

	/MARY K ZEMAN/            Primary Examiner, Art Unit 1631