DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Examined herein: 1–20

Priority
Applicant’s claim under 35 USC § 119(e) for the benefit of prior-filed Provisional Application No. 16/278611 is acknowledged.
In this action, all claims are examined as though they had an effective filing date of 17 Feb 2018.  In future actions, the effective filing date of one or more claims may change, due to amendments to the claims, or further analysis of the disclosure of the priority application.

Claim Objections
Claims 1, 4, 5 and 16 are objected to because of the following informalities.
Claims 1, 4 and 5 contain periods other than at the end of the claim; "periods may not be used elsewhere in the claims except for abbreviations" (MPEP 608.01(m)).
Claim 16 depends on succeeding claim 17.  35 USC § 112(d) requires that "a claim in dependent form shall contain a reference to a claim previously set forth".
Appropriate correction is required.


Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 16 and 18 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claim 16 recites the limitation "the selected allele".  There is insufficient antecedent basis for this limitation in the claim.
Hereinafter, claim 16 will be interpreted as depending on claim 15.
Claim 18 recites "a high probability to positive real polypeptide-MHC-I interaction data, a low probability to the positive simulated polypeptide-MHC-I interaction data, and a low probability to the negative real polypeptide-MHC-I interaction data".  The claim does not particularly point out the subject matter of the invention because it does not define the event to which the probability refers: it could be probability of membership to a particular class, probability of being included in a training data set, probability of being classified correctly by the CNN, or some other event.  Because the examiner cannot infer Applicant's intended claim scope without considerable speculation, this claim will not be examined with respect to the prior art (MPEP 2173.06 § II).

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1–20 are rejected under 35 USC § 101 because the claimed inventions are directed to non-statutory subject matter.

Mathematical concepts recited in the claims include "generating, by a GAN generator, increasingly accurate positive simulated data"; "a GAN discriminator classifies the positive simulated data as positive"; "presenting the positive simulated data, positive real data, and negative real data to a convolutional neural network"; "the CNN classifies each type of data as positive or negative"; "presenting the positive real data and the negative real data to the CNN"; "generate prediction scores".
Steps of evaluating, analyzing or organizing information recited in the claims include "determining, based on the prediction scores, whether the GAN is trained or not trained".
Hence, the claims explicitly recite numerous elements that, individually and in combination, constitute abstract ideas.  The claims must therefore be examined further to determine whether they integrate that abstract idea into a practical application (MPEP 2106.04(d)).
Claim 1 does not recite any additional elements.  It is wholly directed to an abstract mathematical procedure.
Likewise, none of claims 2–6, 8–12 and 14–19 recite any additional non-abstract elements; they are all directed to further aspects of the information being analyzed, the manner in which that analysis is performed, or the mathematical operations performed on the information.  These claims are therefore also wholly directed to abstract ideas.

Claim 13 recites an additional element that is not an abstract idea: "synthesizing the polypeptide".  The claims do not describe any specific synthetic procedure, nor do they even specify what polypeptide is being synthesized.  This claim element is nothing more than mere instructions to apply the abstract idea using a generic synthesis procedure.  The claim therefore does not integrate that abstract idea into a practical application (see MPEP 2106.04(d) § I; and MPEP 2106.05(f)).
Because the claims recite an abstract idea, and do not integrate that abstract idea into a practical application, the claims are directed to that abstract idea.  Claims that are directed to abstract ideas must be examined further to determine whether the additional elements besides the abstract idea render the claims significantly more than the abstract idea.  Claims that are directed to abstract ideas and that raise a concern of preemption of those abstract ideas must be examined to determine what elements, if any, they recite besides the abstract idea, and whether these additional elements constitute inventive concepts that are sufficient to render the claims significantly more than the abstract idea (MPEP 2106.05).
As explained above, the mere instructions to synthesize a polypeptide are, when considered individually, insufficient to constitute an inventive concept that would render the claims significantly more than an abstract idea (see MPEP 2106.05(f)).
As also explained above, the generic steps of outputting the GAN and CNN resulting from the abstract idea constitute insignificant extrasolution activity, and when considered individually, are insufficient to constitute inventive concepts that would render the claims significantly more than an abstract idea (see MPEP 2106.05(g)).
i.e. polypeptide synthesis).  See MPEP 2106.05(a) and 2106.05(h).
For these reasons, claims 7, 13 and 20, when the limitations are considered individually and as a whole, are directed to an abstract idea and lack an inventive concept.  Hence, the claimed invention does not constitute significantly more than the abstract idea, so the claims are rejected under 35 USC § 101 as being directed to non-statutory subject matter.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the 
Claims 1–12 and 15–20 are rejected under 35 U.S.C. 103 as being unpatentable over Calimeri, et al. (in Artificial Neural Networks and Machine Learning – ICANN 2017); Somasundaram, et al. (in 2nd International Conference on Information Technology Research 2017; ref. C on IDS of 6 Nov 2019); and Vang, et al. (Bioinformatics 2017; ref. A on IDS of 17 May 2019).
With respect to claim 1, Calimeri teaches
(a)	"our approach uses a GANN to automatically generate MRI slices of the brain" (p. 628 § 2.3); "the Generator should learn how to generate images that look more and more similar to the samples from the training set, in order to fool the Discriminator and make it believe that they are real" (top of p. 628)
(b)	the generated images can be presented to a "Validator" convolutional neural network for evaluation (p. 628 § 2.3)
(c)	—
(d)	—
Calimeri teaches that the artificial images generated by the GAN are evaluated for their similarity to real MRI images (p. 630 § 4.4), and that the GAN was determined to have acceptable performance when the synthetic images were indistinguishable from the real ones.  Calimeri teaches that "a possible way to overcome limited availability [of training data for a classification task], in some domains, is to artificially create new data" (top of p. 627).
cf. Calimeri) and "after the remarkable success of GAN, it’s widely used in many industries to generate things. GAN used to generate images, text, music and many more things" (p. 7 § IX).
Vang "propose a deep convolutional neural network architecture, name HLA-CNN, for the task of HLA class I-peptide binding prediction" (Abstract), comprising:
(a)	—
(b)	"the input into HLA-CNN network is the character string of the peptide, a 9-mer peptide in this example" (p. 2660 § 2.3); "indicators of binding were given as either binary values of or ic50 (half maximal inhibitory concentration) measurements. Binary indicators were used directly while values given in ic50 measurements were denoted as binding if ic50 < 500nM" (p. 2659 § 2.1); the "binary indicators" constitute "positive real data, and negative real data"; the CNN is trained until a loss criterion is reached (p. 2661, bot. of col. 1)
(c)	presenting an evaluation set of real positive and negative examples to the CNN (p. 2662 § 3.2)
(d)	—
Vang teaches that "the lack of training data is a well-known weakness of deep neural networks as the model may not converge to a solution or worst yet, may overfit to the small training set" (p. 2659, bot. of col. 1).
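The labeling rule quoted from Vang in item (b) above (binary indicators used directly; IC50 measurements denoted as binding if below 500 nM) can be sketched as follows. This is only an illustration of the quoted rule, not code from the reference: the function name, the record layout, and the placeholder peptides and values are all assumptions.

```python
from typing import List, Tuple, Union

# Threshold quoted from Vang (p. 2659 § 2.1): binding if ic50 < 500nM.
IC50_BINDING_THRESHOLD_NM = 500.0


def label_binding(measurement: Union[int, float], is_ic50: bool) -> int:
    """Return 1 (positive real data) or 0 (negative real data).

    Hypothetical helper: `is_ic50` says whether `measurement` is an IC50
    value in nM (thresholded) or already a binary indicator (used directly).
    """
    if is_ic50:
        return 1 if measurement < IC50_BINDING_THRESHOLD_NM else 0
    return int(measurement)


# Placeholder records (peptide, measurement, is_ic50); values are illustrative only.
records: List[Tuple[str, float, bool]] = [
    ("AAAAAAAAA", 32.0, True),      # IC50 below threshold
    ("CCCCCCCCC", 12000.0, True),   # IC50 above threshold
    ("DDDDDDDDD", 1, False),        # binary indicator, used as given
]
labels = [(peptide, label_binding(value, flag)) for peptide, value, flag in records]
```

Under this sketch, a measurement of exactly 500 nM is labeled as non-binding, because the quoted rule uses a strict inequality.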

With respect to claim 2, Calimeri, Somasundaram and Vang all teach that the GAN and CNN operate on biological data.
With respect to claim 3, Vang teaches that "the focus of this article is on HLA class I proteins" (p. 2658, bot. of col. 2) and "we apply machine learning techniques from the natural language processing (NLP) domain to tackle the task of MHC-peptide binding prediction" (p. 2659, mid. of col. 1).
With respect to claim 4, Somasundaram teaches that "in GAN the training data will be in 2 parts. One is the real data pdata(x) and another one is the generated data distribution pg(x)" (p. 4, mid. of col. 1).  GAN training includes adjusting the parameters of the Generator and the decision boundary of the Discriminator (p. 4, col. 2).  As explained above, in the combination of Calimeri, Somasundaram and Vang, the training data are polypeptide-HLA interaction data.
With respect to claim 5, Vang teaches that "the input into HLA-CNN network is the character string of the peptide" (p. 2660 § 2.3), the peptide being one that does or does not bind to HLA class I.  Since the combination of Calimeri and Somasundaram teaches using a GAN to generate synthetic positive training examples, synthetic training examples for the HLA-CNN of Vang must be peptide 
With respect to claim 6, Vang teaches that the HLA-CNN outputs a binary prediction of whether the peptide binds to HLA class I (p. 2661, col. 1; Fig. 1).
With respect to claims 7 and 8, Vang teaches evaluating the classification accuracy of the CNN (p. 2661, col. 1).  Calimeri teaches that the synthetic training examples generated by the GAN should be indistinguishable from real training examples (p. 630 § 4.4).  If a classifier is trained with synthetic training examples that are distinguishable from real training examples, then the classifier will have poor performance.  Hence, poor performance of a classifier (e.g. the HLA-CNN of Vang) indicates that the GAN is insufficiently trained; conversely, good performance of the classifier indicates that the GAN is sufficiently trained.  Calimeri (p. 630 § 4.2) and Vang (p. 2662 § 3.4) both teach computerized training of their respective models, which necessitates that the models themselves were outputted in some form.
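The reasoning above (classifier performance on real data serving as an indicator of whether the GAN is sufficiently trained) can be illustrated with a minimal sketch. Neither the claims nor the cited references specify a cutoff or a minimum accuracy; the 0.5 score cutoff, the 0.9 accuracy threshold, and both function names are assumptions made only for illustration.

```python
from typing import Sequence


def classification_accuracy(scores: Sequence[float],
                            labels: Sequence[int],
                            cutoff: float = 0.5) -> float:
    """Fraction of examples whose thresholded prediction score matches its label.

    `cutoff` is a hypothetical decision threshold on the prediction score.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = sum(1 for s, y in zip(scores, labels) if (s >= cutoff) == bool(y))
    return correct / len(labels)


def gan_is_trained(scores: Sequence[float],
                   labels: Sequence[int],
                   min_accuracy: float = 0.9) -> bool:
    """Treat the GAN as trained when the classifier is accurate enough on real data.

    `min_accuracy` is an assumed threshold, not a value from the record.
    """
    return classification_accuracy(scores, labels) >= min_accuracy


# Illustrative prediction scores on real positive (1) and negative (0) examples.
scores = [0.9, 0.8, 0.2, 0.4]
labels = [1, 1, 0, 1]
accuracy = classification_accuracy(scores, labels)  # 3 of 4 correct
trained = gan_is_trained(scores, labels)            # False at the assumed 0.9 threshold
```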
With respect to claim 9, Vang teaches that the input to the HLA-CNN is a peptide 9-mer (p. 2660 § 2.3).  Hence, the GAN must generate 9-mer peptide sequences; i.e. "allele length".  Calimeri also teaches that the GAN architecture includes parameters of layer size (top of p. 629), which is a model complexity parameter.
With respect to claim 10, Vang teaches that the HLA-CNN model predicts HLA class I binding.  HLA-A, HLA-B and HLA-C are HLA class I proteins.
With respect to claims 11 and 12, Vang teaches "a 9-mer peptide" (p. 2660 § 2.3).
With respect to claims 15 and 16, Vang teaches training HLA-CNN models with specific HLA alleles, including A*02:01, A*02:03, B*27:03 and B*27:05 (p. 2663, Table 2).

With respect to claim 19, Somasundaram teaches that "the optimization of GAN can be formulated as a minimax problem" (p. 4, mid. of col. 2); i.e. an evaluation of an MSE function.  Vang teaches that "the loss function used is the binary cross entropy function" (p. 2661, mid. of col. 1), which is the equivalent of MSE for binary outputs.  Vang further teaches that the HLA-CNN model is evaluated using AUC (p. 2661, top of col. 2).
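For reference, the three quantities named above (binary cross entropy, MSE and AUC) can be computed for a set of binary labels and predicted probabilities as in the following sketch. It is not taken from any cited reference; the function names and the sample values are assumptions, and the AUC is computed by the standard pairwise-ranking definition.

```python
import math
from typing import Sequence


def binary_cross_entropy(labels: Sequence[int], probs: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Mean binary cross entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def mean_squared_error(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between labels and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(labels, probs)) / len(labels)


def roc_auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.3]
bce = binary_cross_entropy(labels, probs)
mse = mean_squared_error(labels, probs)
auc = roc_auc(labels, probs)
```

In this sketch both losses decrease as the predicted probabilities approach the labels, while AUC depends only on how the positive and negative examples are ranked.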
With respect to claim 20, Calimeri (p. 630 § 4.2) and Vang (p. 2662 § 3.4) both teach computerized training of their respective models, which necessitates that the models themselves were outputted in some form.
An invention would have been obvious to one of ordinary skill in the art if some motivation in the prior art would have led that person to modify prior art reference teachings to arrive at the claimed invention.  Prior to the time of invention, said practitioner would have been motivated to modify the HLA classification method of Vang to include synthetic training data generated by a GAN, because Calimeri teaches that GANs can successfully generate synthetic training data for a classifier, overcoming a problem noted by Vang.  Given that Somasundaram teaches that GANs can be used to generate any kind of biomedical data, including images — as in Calimeri — and sequences — as in Vang — said practitioner would have readily predicted that the modification would successfully result in a method of generating a classifier for HLA-binding sequences, trained on a combination of real HLA binding data and synthetic training data generated by a GAN.  The invention is therefore prima facie obvious.

Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Calimeri, Somasundaram and Vang as applied to claims 1 and 3 above, and further in view of Carr, et al. (WO 2017/184590).
The combination of Calimeri, Somasundaram and Vang teaches a method of predicting HLA binding for a peptide sequence, but does not teach "synthesizing the polypeptide from the candidate polypeptide-MHC-I interaction classified as a positive polypeptide-MHC-I interaction".
Carr teaches "methods for improved prediction of HLA-peptide binding, datasets for predicting HLA-peptide binding and selection of HLA-binding peptides and compositions comprising HLA-binding peptides obtained by these methods" (0004).  Carr teaches "HLA-peptides sequenced by mass spectrometry along with a set of random decoys were used to build binary classifiers (one classifier per HLA allele) to predict whether a given peptide will bind to a specific HLA allele" (00471); classifiers can include "generative models" and "deep convolutional neural networks" (00114).  Carr further teaches that "a subset of [predicted] peptides were synthesized … and tested for binding to HLA molecules" (00470); the peptides synthesized for experimental validation were those predicted to bind to at least one HLA (00483).
With respect to claim 14, Vang teaches training HLA-CNN models to generate peptides that bind to specific HLA alleles, including A*02:01, A*02:03, B*27:03 and B*27:05 (p. 2663, Table 2).
An invention would have been obvious to one of ordinary skill in the art if some teaching in the prior art would have led that person to combine prior art reference teachings to arrive at the claimed invention.  Prior to the time of invention, said practitioner would have followed the teachings of Carr (synthesizing peptides that are predicted to bind to HLA, to experimentally validate the prediction) and combined this experimental validation step with the method of Calimeri, Somasundaram and Vang.  Given that both Carr and the combination of Calimeri, Somasundaram and Vang are directed to generating peptide sequences predicted to bind to HLAs, and that peptides of any sequence can be readily synthesized using customary techniques, said practitioner would have readily predicted that the combination would successfully result in a method of generating predicted HLA-binding peptides.  The invention is therefore prima facie obvious.

Conclusion
No claim is allowable.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Soren Harward whose telephone number is (571)270-1324. The examiner can normally be reached M-Th 8am-5pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Karl Skowronek, can be reached at 571-272-9047. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Soren Harward/Primary Examiner, Art Unit 1631