DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims Status
Claims 1-32 are under examination.
Claim Objections
Claims 4, 7, 20, and 23 are objected to because of the following informalities:
•	In claim 4, line 5, "met exceeded" should read "met or exceeded" for clarity.
•	In claim 7, line 5, "met exceeded" should read "met or exceeded" for clarity.
•	In claim 20, line 5, "met exceeded" should read "met or exceeded" for clarity.
•	In claim 23, line 5, "met exceeded" should read "met or exceeded" for clarity.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-7 and 17-23 are rejected under 35 U.S.C. 103 as being unpatentable over Anders et al. (Nat Protoc 8, 1765–1786, published 2013) in view of Kornobis et al. (Evolutionary Bioinformatics 11, 97–104, published 2015).
Regarding claims 1 and 17, Anders teaches that the RNA-seq platform addresses a multitude of applications, including relative expression analyses, alternative splicing, discovery of novel transcripts and isoforms, RNA editing, allele-specific expression and the exploration of non-model-organism transcriptomes (Page 1765, Column Right, Paragraph 2). An initial and fundamental analysis goal is to identify genes whose expression level changes between conditions. In the simplest case, the aim is to compare expression levels between two conditions, e.g., stimulated versus unstimulated or wild type versus mutant (Page 1765, Column Right, Paragraph 2). More complicated experimental designs can include additional experimental factors, potentially with multiple levels (e.g., multiple mutants, doses of a drug or time points) or may need to account for additional covariates (e.g. experimental batch or sex) or the pairing of samples (e.g., paired tumor and normal tissues from individuals) (Page 1765, Column Right, Paragraph 2). For RNA-seq data, the strategy taken is to count the number of reads that fall into annotated genes and to perform statistical analysis on the table of counts to discover quantitative changes in expression levels between experimental groups (Page 1765, Column Left, Paragraph 3).

Anders further teaches the overall sequence of steps, from read sequences to feature counting to the discovery of differentially expressed genes, with a concerted emphasis on quality checks throughout. After initial checks on sequence quality, reads are mapped to a reference genome with a splice-aware aligner; up to this point, this protocol is identical to many other pipelines (e.g., TopHat and Cufflinks) (Page 1765, Column Right, Paragraph 2, Page 1766, Figure 1). From the set of mapped reads and either an annotation catalog or an assembled transcriptome, features, typically genes or transcripts, are counted and assembled into a table (rows for features and columns for samples) (Page 1765, Column Right, Paragraph 2). The statistical methods, which are integral to the differential expression discovery task, operate on a feature count table. Before the statistical modeling, further quality checks are encouraged to ensure that the biological question can be addressed. For example, a plot of sample relations can reveal possible batch effects and can be used to understand the similarity of replicates and the overall relationships between samples (Page 1765, Column Right, Paragraph 2). After the statistical analysis of differential expression is carried out, a set of genes deemed to be differentially expressed or the corresponding statistics can be used in downstream interpretive analyses to confirm or generate further hypotheses (Page 1765, Column Right, Paragraph 2).
Anders discusses that it is possible to do all computational steps in R and Bioconductor; however, for a few of the steps, the most mature and widely used tools are outside Bioconductor (Page 1768, Column Right, Paragraph 3). Here R and Bioconductor are adopted to tie together the workflow and provide data structures, and their unique strengths in workflow components are leveraged, including statistical algorithms, visualization and computation with annotation databases (Page 1769, Column Left, Paragraph 1). An advantage of the R-based system, in terms of achieving best practices in genomic data analysis, is the opportunity for an interactive analysis whereby spot checks are made throughout the pipeline to guide the analyst (Page 1769, Column Left, Paragraph 1). In addition, a wealth of tools is available for exploring, visualizing and cross-referencing genomic data (Page 1769, Column Left, Paragraph 1). Additional features of Bioconductor are readily available that will often be important for scientific projects that involve an RNA-seq analysis, including access to many different file formats, range-based computations, annotation resources, manipulation of sequence data and visualization (Page 1769, Column Left, Paragraph 1).
Anders also teaches a method to perform an automated differential expression analysis of RNA-Sequencing (RNA-Seq) data workflow. The method taught by Anders identifies a plurality of RNA-Seq reads from genomic samples (Page 1766, Figure 1) subjected to at least one experimental condition (Page 1765, Column Right, Paragraph 2). The method aligns the plurality of RNA-Seq reads to a transcriptome for the genomic samples, quantifies the gene expression for the plurality of RNA-Seq reads as feature counts, and quantifies differential gene expression in the plurality of RNA-Seq reads between a combination pair of the experimental conditions (Page 1765, Column Left, Paragraph 2, Page 1766, Figure 1).
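For illustration only, the feature-count table and the comparison between a combination pair of experimental conditions described above can be sketched as follows. The gene identifiers, sample names, counts, and the pseudocount are hypothetical and are not taken from Anders; a log2 fold change of mean counts stands in for the statistical testing that Anders performs with edgeR or DESeq.

```python
from itertools import combinations
from math import log2

# Hypothetical feature-count table: rows are genes, columns are samples.
counts = {
    "geneA": {"ctrl_1": 100, "ctrl_2": 120, "treat_1": 400, "treat_2": 480},
    "geneB": {"ctrl_1": 50, "ctrl_2": 50, "treat_1": 50, "treat_2": 50},
}

# Hypothetical mapping from each experimental condition to its samples.
conditions = {"ctrl": ["ctrl_1", "ctrl_2"], "treat": ["treat_1", "treat_2"]}


def mean_count(gene, samples):
    """Mean read count of one gene over the given samples."""
    return sum(counts[gene][s] for s in samples) / len(samples)


def pairwise_log2_fold_changes(pseudocount=1.0):
    """Log2 fold change per gene for every pair of conditions."""
    results = {}
    for cond_a, cond_b in combinations(sorted(conditions), 2):
        for gene in counts:
            a = mean_count(gene, conditions[cond_a]) + pseudocount
            b = mean_count(gene, conditions[cond_b]) + pseudocount
            results[(cond_a, cond_b, gene)] = log2(b / a)
    return results


fold_changes = pairwise_log2_fold_changes()
```

With more than two conditions, `combinations` enumerates every pair once, which is one plausible reading of "each combination pair" and is offered only as an assumption.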
Anders does not teach using at least one data processing unit comprising at least one processor and memory coupled to the at least one processor, the memory operative to store instructions that, when executed by the processor, cause the processor to perform an automated differential expression analysis of RNA-Sequencing (RNA-Seq) data workflow. 

Kornobis teaches that a variety of software programs have been developed to perform different steps of the RNA-seq analysis, but most of them are computationally intensive (Page 97, Column Right, Paragraph 1). The vast majority of these programs run solely with command lines (Page 97, Column Right, Paragraph 1). Processing the data to connect one step to the next in RNA-seq pipelines can be cumbersome in many instances, mainly due to the variety of output formats produced and the postprocessing needed to accept them further as input (Page 97, Column Right, Paragraph 1). Moreover, as soon as a large computing effort is required, interactive execution is usually not feasible and an interface with the underlying batch systems used in clusters or supercomputers is needed (Page 98, Column Left, Paragraph 1). In order to provide users with a bioinformatics tool that solves the above-mentioned problems, Kornobis et al. developed TRUFA (TRanscriptomes User-Friendly Analysis), an informatics platform for RNA-seq data analysis, which runs on the ALTAMIRA supercomputer at the Instituto de Fisica de Cantabria (IFCA), Spain. The platform is highly parallelized both at the pipeline and program level. It can access up to 256 cores per execution instance for certain components of the pipeline. On top of allowing the user to obtain results in a relatively short time thanks to HPC (high-performance computing) resources, TRUFA is an integrative and graphical web tool for performing the main and most computationally demanding steps of a de novo RNA-seq analysis (Page 98, Column Left, Paragraph 1).
Kornobis further teaches that the first step of a de novo RNA-seq analysis consists in assessing data quality and cleaning raw reads (Page 98, Column Left, Paragraph 2). The output of a next-generation sequencing (NGS) reaction contains traces of polymerase chain reaction (PCR) primers and sequencing adapters as well as poor-quality bases/reads (Page 98, Column Left, Paragraph 2). Hence, it is advised to perform read trimming, which has been shown to have a positive effect on the rest of the RNA-seq analysis, although parameter values for such trimming have to be optimized (Page 98, Column Right, Paragraph 1). Once reads have been cleaned, they are assembled into transcripts, which are subsequently categorized into functional classes in order to understand their biological meaning. Finally, it is possible to perform expression quantification analyses by estimating the amount of reads sequenced per assembled transcript and taking into account that the number of reads sequenced theoretically correlates with the number of copies of the corresponding mRNA in vivo (Page 98, Column Right, Paragraph 2). All the above-mentioned steps in the RNA-seq analysis pipeline are included in TRUFA and correspond to distinct sections in the web-based user interface (see Figs. 1 and 2) (Page 98, Column Right, Paragraph 1). 
Kornobis also teaches TRUFA (TRanscriptomes User-Friendly Analysis), an informatics platform for RNA-seq data analysis, which runs on the ALTAMIRA supercomputer at the Instituto de Fisica de Cantabria (IFCA), Spain (Page 98, Column Left, Paragraph 1). It is well known in the art that a supercomputer comprises at least one data processing unit, one or more processors, and memory to store instructions. When executed by the processor, the instructions cause the processor to perform an automated differential expression analysis of RNA-Sequencing (RNA-Seq) data workflow.
Thus, it would have been obvious to perform an automated differential expression analysis of RNA-Sequencing (RNA-Seq) data workflow with a data processing unit of a supercomputer with at least one processor and memory coupled to the at least one processor, the memory operative to store instructions that, when executed by the processor, cause the processor to perform an automated differential expression analysis of RNA-Sequencing (RNA-Seq) data workflow as taught by Kornobis (Page 98, Column Left, Paragraph 1). It would have also been obvious to compare expression levels between two conditions by identifying a plurality of RNA-Seq reads from genomic samples having been subjected to at least one experimental condition, aligning the plurality of RNA-Seq reads to a transcriptome for the genomic samples, quantifying gene expression for the plurality of RNA-Seq reads, and quantifying differential gene expression in the plurality of RNA-Seq reads between a combination pair of the experimental conditions as taught by Anders (Page 1765, Column Left, Paragraph 2, Page 1766, Figure 1). Therefore, it would have been obvious to implement the software platform of Anders on the supercomputer of Kornobis to achieve the purpose of performing an automated differential expression analysis of RNA-Sequencing (RNA-Seq) data workflow.
Thus, claims 1 and 17 are clearly rendered obvious by the combined teachings of Anders and Kornobis above and are rejected for the reasons above.
With respect to claims 2 and 18, Kornobis as discussed above teaches a supercomputer comprising a data processing unit that is capable of being coupled to a display (Page 98, Column Left, Paragraph 1). When the software platform taught by Anders is implemented on the supercomputer, a graphical representation of the differential gene expression between each combination pair of experimental conditions of the plurality of RNA-Seq reads on the display is achieved (Page 1768, Column Right, Paragraph 3, Page 1769, Column Left, Paragraph 1, Page 1778, Figure 4, Page 1780, Figure 6). Thus, it would have been obvious to implement the software platform of Anders on the supercomputer of Kornobis to display the visual and graphical representation of the differential gene expression between each combination pair of experimental conditions to arrive at predictable results.
With respect to claims 3 and 19, Kornobis as discussed above teaches pre-processing as the first step of a de novo RNA-seq analysis, which involves assessing data quality and cleaning raw reads by removing PCR adaptors, removing duplicates, and trimming low-quality bases/reads (Page 98, Column Left to Column Right, Paragraph 2, Figure 1). These steps occur before the plurality of RNA-Seq reads is aligned to the transcriptome. Preprocessing is therefore an obvious step of RNA-seq analysis.
With respect to claims 4 and 20, it is noted that the expression threshold value refers to the P value. Anders teaches that DESeq and edgeR differ slightly in the format of results outputted, but each contains columns for log-fold change (log), counts per million (or mean by condition), likelihood ratio statistic (for GLM-based analyses), as well as raw and adjusted P values. By default, P values are adjusted for multiple testing using the Benjamini-Hochberg procedure (Page 1784, Paragraph 6 to Page 1785, Paragraph 1). Because the P values are adjusted by default, the threshold value is assigned prior to quantifying the gene expression of the plurality of RNA-Seq reads, and the method proceeds with quantifying the differential gene expression of the plurality of RNA-Seq reads when the P value is met or exceeded.
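As a minimal sketch of the Benjamini-Hochberg adjustment named above, the following standalone Python function computes adjusted P values from raw P values. It is not the edgeR or DESeq implementation; the input values and the 0.05 cutoff are hypothetical and are used only to show how adjusted P values can be compared against a threshold.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values never decrease as the raw P values increase.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]          # hypothetical raw P values
adjusted = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adjusted]  # hypothetical cutoff
```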
With respect to claims 5 and 21, Kornobis teaches that once RNA-Seq reads have been cleaned, they are assembled into transcripts via Trinity (Page 98, Paragraph 2, Figure 1). Trinity is software responsible for assembling the RNA-Seq reads and mapping or aligning them to the reference genome (Page 100, Table 1). Subsequently, RNA-Seq reads are categorized or sorted into functional classes prior to quantifying gene expression for the plurality of RNA-Seq reads for the purpose of understanding their biological meaning (Page 98, Paragraph 2, Figure 1).
With respect to claims 6 and 22, Anders teaches normalizing the quantified gene expression referred to as feature counting for the plurality of RNA-Seq reads before quantifying differential gene expression between each combination pair of the plurality of RNA-Seq reads as depicted in Figure 1, step 14 (Page 1766, Figure 1, Step 14).
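To make the idea of normalizing feature counts before the differential comparison concrete, the sketch below uses median-of-ratios size factors, a scheme commonly associated with DESeq. It is offered as an assumption for illustration; it is not asserted that Step 14 of Anders uses exactly this scheme, and the count table is hypothetical.

```python
from math import exp, log
from statistics import median


def size_factors(count_table):
    """Median-of-ratios size factor for each sample column.

    count_table maps a gene name to a list of counts, one per sample,
    with samples in the same order for every gene.
    """
    n_samples = len(next(iter(count_table.values())))
    # Log geometric mean per gene, using only genes with no zero counts.
    log_geo_means = {
        gene: sum(log(c) for c in row) / n_samples
        for gene, row in count_table.items()
        if all(c > 0 for c in row)
    }
    factors = []
    for j in range(n_samples):
        log_ratios = [
            log(count_table[gene][j]) - lgm
            for gene, lgm in log_geo_means.items()
        ]
        factors.append(exp(median(log_ratios)))
    return factors


def normalize(count_table):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(count_table)
    return {
        gene: [c / f for c, f in zip(row, factors)]
        for gene, row in count_table.items()
    }


# Hypothetical table in which sample 2 was sequenced twice as deeply.
table = {"geneA": [10, 20], "geneB": [100, 200], "geneC": [30, 60]}
factors = size_factors(table)
normalized = normalize(table)
```

In this example the second sample's size factor is twice the first, so the normalized counts of each gene agree across the two samples.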
With respect to claims 7 and 23, it is noted that the normalized threshold value refers to the P value. Anders teaches that DESeq and edgeR differ slightly in the format of results outputted, but each contains columns for log-fold change (log), counts per million (or mean by condition), likelihood ratio statistic (for GLM-based analyses), as well as raw and adjusted P values. By default, P values are adjusted for multiple testing using the Benjamini-Hochberg procedure (Page 1784, Paragraph 6 to Page 1785, Paragraph 1). Because the P values are adjusted by default, the threshold value is assigned prior to quantifying the gene expression of the plurality of RNA-Seq reads, and the method proceeds with quantifying the differential gene expression of the plurality of RNA-Seq reads when the P value is met or exceeded.

Claims 8 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Anders et al. (Nat Protoc 8, 1765–1786, published 2013) in view of Kornobis et al. (Evolutionary Bioinformatics 11, 97–104, published 2015), as applied to claims 1-7 and 17-23 above, and further in view of Johnson et al. (BMC Bioinformatics 17, 66, published 2016).
All discussions of the combined teachings of Anders and Kornobis above are incorporated here.
They do not teach experimental condition replicates.
With respect to claims 8 and 24, Johnson teaches in Figure 1 that the plurality of RNA-Seq reads comprises at least two replicates subjected to the same experimental condition, stated as “Define Experimental Condition Replicates” (Page 2, Figure 1).
With respect to claims 9 and 25, Anders teaches that the fundamental analysis goal of the RNA-seq platform is to identify genes whose expression level changes between conditions, such as stimulated versus unstimulated or wild type versus mutant (Page 1765, Column Left, Paragraph 2). Anders further mentions that complicated experimental designs can include additional experimental factors, potentially with multiple levels (e.g., multiple mutants, doses of a drug or time points) (Page 1765, Column Left, Paragraph 2). Therefore, the plurality of RNA-Seq reads were subjected to different experimental conditions as taught by Anders.
With respect to claims 10 and 26, Anders teaches that users can use an alternative aligner or a different strategy (or software package) to count features (Page 1767, Column Right, Paragraph 4). The protocol taught by Anders includes starting with either a set of sequence alignment map (SAM)/binary alignment map (BAM) files from an alternative alignment algorithm or a table of counts (Page 1767, Column Right, Paragraph 4). This provides at least two gene expression option tools for quantifying gene expression for the plurality of RNA-Seq reads.
With respect to claims 11 and 27, Anders teaches two differential gene expression option tools for quantifying differential gene expression between each combination pair of experimental conditions of the plurality of RNA-Seq reads (Page 1766, Figure 1). The tools are the 2-group differential comparison and the GLM-based differential comparisons depicted in Figure 1 (Page 1766, Figure 1).
With respect to claims 12 and 28, Johnson teaches that the SPARTA platform maintains analytic flexibility by allowing the user to tailor the analysis through option specification but is capable of proceeding with default values (Page 4, Column Right, Paragraph 2). Thus, user-specified instructions for defining parameters of the workflow could be provided to the platform. Therefore, it would have been obvious to provide user-specified instructions in the method rendered obvious here, for the purpose taught by Johnson, to arrive at predictable results.
With respect to claims 13 and 29, Johnson teaches that the SPARTA workflow (Fig. 1) is implemented utilizing Python for file input/output management and tool execution, combining several open-source computational tools (Page 2, Column Left, Paragraph 2). The SPARTA platform allows one or more user files identifying the plurality of RNA-Seq reads from the genomic samples to be located within the Input Folder (Page 2, Figure 1).

Claims 14 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Anders et al. (Nat Protoc 8, 1765–1786, published 2013) in view of Kornobis et al. (Evolutionary Bioinformatics 11, 97–104, published 2015), as applied to claims 1-7 and 17-23 above, further in view of Johnson et al. (BMC Bioinformatics 17, 66, published 2016) as applied to claims 8, 12, 13, 24, 28 and 29 above, and further in view of Castro et al. (BMC Bioinformatics 6, 87, published 2005).
All discussions above as to why claims 14 and 30 are rendered obvious by the combined teachings of Anders, Kornobis, and Johnson are incorporated here.
They do not teach an automatically generated dependency graph.
Regarding claims 14 and 30, Castro teaches that workflow management systems (WFMS) are basically systems that control the sequence of activities in a given process (Page 2, Column Left, Paragraph 2). Castro et al. developed GPIPE, a flexible workflow generator for PISE (Page 7, Column Left, Paragraph 3). Castro further teaches that workflows automate business procedures in which information or tasks are passed between conforming entities according to a defined set of rules; some of these business rules are defined by the user, and the implementations are managed via GPIPE. In Castro, the conforming entities are analytical methods (Clustal, Protpars, etc.) (Page 7, Column Right, Paragraph 1). Syntactic rules drive the interaction between these entities (e.g., to ensure syntactic coherence between heterogeneous file formats) (Page 7, Column Right, Paragraph 1). GPIPE also assures the execution of the workflow, and makes it possible to distribute different jobs over a grid of servers (Page 7, Column Right, Paragraph 1). Thus, Castro teaches that the workflow of the workflow management system, which could also be referred to as the dependency graph, could be automatically generated with GPIPE (Page 7, Column Left, Paragraph 3). The workflow executed by GPIPE could be parallelized to perform at least two operations according to the generated workflow (Page 7, Column Right, Paragraph 1).
It would have been obvious to incorporate the teachings of Castro to add a workflow management system generator to the RNA-Seq analysis platform to automatically generate dependency graphs for the purpose of guiding the platform to perform parallelized operations of the workflow. The incorporation of Castro's teachings would predictably facilitate analysis of RNA-Seq reads for the advantages and purposes supra.
Thus, the combined teachings above clearly render these instant claims obvious. 
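For illustration only, a dependency graph that drives parallel execution of workflow operations can be sketched with the Python standard library. This is not GPIPE or its interface; the step names, the placeholder `run_step` function, and the worker count are hypothetical, and the graph is written by hand here rather than generated automatically.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9 or later

# Hypothetical workflow: each step maps to the steps it depends on.
dependencies = {
    "quality_check": set(),
    "align_sample_1": {"quality_check"},
    "align_sample_2": {"quality_check"},
    "count_features": {"align_sample_1", "align_sample_2"},
    "differential_expression": {"count_features"},
}


def run_step(name):
    """Placeholder for the real work of a step; returns the step name."""
    return name


def run_workflow(graph, max_workers=2):
    """Run steps in dependency order, batching steps that can run together."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every step whose dependencies are complete is ready now.
            ready = sorted(sorter.get_ready())
            list(pool.map(run_step, ready))  # run the batch in parallel
            sorter.done(*ready)
            batches.append(ready)
    return batches


batches = run_workflow(dependencies)
```

In this sketch the two alignment steps share a batch because neither depends on the other, which is the sense in which at least two operations are performed in parallel according to the graph.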
With respect to claims 15 and 31, Kornobis teaches that the TRUFA platform is highly parallelized both at the pipeline and program level (Page 98, Column Left, Paragraph 1). Therefore, it would have been obvious for the operations of the workflow to be parallelized across the plurality of data processing units as taught by Kornobis.
With respect to claims 16 and 32, Kornobis teaches receiving user input defining the automated differential expression analysis of RNA-Sequencing data workflow comprising one or more user specified directives depicted in Figure 2 (Page 99, Figure 2). According to Figure 2, the type of input and the type of tools that are utilized for different steps of the RNA-Seq analysis workflow, such as preprocessing and mapping, could be chosen and specified by the user. Thus, it would have been obvious to receive user input for the purpose of defining user specified directives for the automation of differential expression analysis.

Conclusion
No claim is allowed.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KETTIP KRIANGCHAIVECH whose telephone number is (571)272-1735. The examiner can normally be reached 8:30am-5:00pm EDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karlheinz R. Skowronek, can be reached at 571-272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/K.K./
Examiner, Art Unit 1672
/Karlheinz R. Skowronek/
Supervisory Patent Examiner, Art Unit 1671