DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This application was filed on 07/30/2018, and claims benefit of provisional Application No. 62/538,627 (filed on 07/28/2017).
This action is in response to amendments and remarks filed on 02/03/2022. In the current amendments, claims 1-6 are amended and claims 7-15 are added. Claims 1-15 are pending and have been examined.
In response to amendments and remarks filed on 02/03/2022, the 35 U.S.C. 112(b) rejection of claims 1 and 3-5 made in the previous Office Action has been withdrawn.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2, 6, 7, 10, and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Claim 6 recites the limitation "the mass spectra" in lines 2-3. This limitation lacks clarity because it is unclear whether the limitation refers to “the mass spectra of the unknown protein sample” (claim 6) or “a mass spectra from a known protein sample” (claim 3). For examination purposes, "the mass spectra" has been interpreted as "a mass spectra".
Claim 15 recites “theoretical amino acid sequences” (emphasis added), which lacks clarity because it is unclear what constitutes a “theoretical” amino acid sequence and whether such a sequence exists. For examination purposes, “theoretical amino acid sequences” has been interpreted as “a plurality of amino acid sequences”.
Dependent claim 7 is rejected based on the same rationale as claim 6.
Dependent claim 10 is rejected based on the same rationale as claim 2.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 3, 4, 6-9, and 11-14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tran et al. (“De novo peptide sequencing by deep learning”).
Each limitation that recites “or” has been interpreted as requiring only one of the alternatives, not all of the alternatives.
Regarding Claim 1,
Tran et al. teaches A method of identifying features in mass spectral data, comprising (pg. 8251 second paragraph: “DeepNovo integrates CNNs and LSTM networks to learn features of tandem mass spectra, fragment ions, and sequence patterns for predicting peptides” teaches learning features of mass spectra data):
training a convolutional neural network by inputting a first mass spectrum matched to an amino acid sequence into the convolutional neural network to produce a trained convolutional neural network (Fig. 1 and pg. 8248 first full paragraph: “The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data” teach training a CNN by inputting spectra data (including a first mass spectrum) into the CNN to produce a trained CNN; pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” and pg. 8249 second full paragraph: “To measure the accuracy of de novo sequencing results, we compared the real peptide sequence and the de novo peptide sequence of each spectrum. A de novo amino acid is considered “matched” with a real amino acid if their masses are different by less than 0.1 Da” teach input spectrum is matched to amino acid sequence);
obtaining from a mass spectrometer a second mass spectrum of a protein sample having an unknown amino acid sequence (pg. 8248 second to last paragraph: “We evaluated the performance of DeepNovo compared with current state of the art de novo peptide sequencing tools...For performance evaluation, we used two sets of data, low resolution and high resolution, from previous publications. The low-resolution set includes seven datasets (41–47) (Table S1). The first five datasets were acquired from the Thermo Scientific LTQ Orbitrap with the collision-induced dissociation (CID) technique. The other two were acquired from the Thermo Scientific Orbitrap Fusion with the higher-energy collisional dissociation (HCD) technique. The high-resolution set includes nine datasets acquired from the Thermo Scientific Q-Exactive with the HCD technique (48–56) (Table S2)” teaches obtaining mass spectrum data from different types of mass spectrometers (including Thermo Scientific LTQ Orbitrap, Thermo Scientific Orbitrap Fusion, and Thermo Scientific Q-Exactive) to test the performance of the DeepNovo model; pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” and pg. 8249 second full paragraph: “To measure the accuracy of de novo sequencing results, we compared the real peptide sequence and the de novo peptide sequence of each spectrum” teach the input data are peptide samples having sequences that need to be predicted “de novo,” thus rendering the input to be unknown amino acid sequences to the model; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample));
discretizing the second mass spectrum into a weighted vector; inputting the weighted vector into the trained convolutional neural network (Fig. 1 and pg. 8248 fourth full paragraph: “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” teaches discretizing the input spectrum (second mass spectrum) into intensity vectors (correspond to weighted vectors), which are inputted into the trained CNN; pg. 8247 fourth full paragraph: “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering intensity vectors to correspond to weighted vectors); and
determining, by an output of the trained convolutional neural network, a predicted amino acid sequence corresponding to the second mass spectrum (Fig. 1 and pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teaches using the output from the trained CNN to produce a predicted amino acid sequence corresponding to input mass spectrum).
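For illustration of the "discretizing ... into a weighted vector" limitation as mapped above, the following is a minimal sketch of how a spectrum might be binned into a fixed-length intensity vector. It is not taken from Tran et al. or from the application; the function name `discretize_spectrum`, the bin width, the mass range, and the choice to keep the tallest peak per segment are all assumptions made only for this example.

```python
from typing import List, Tuple


def discretize_spectrum(
    peaks: List[Tuple[float, float]],
    max_mz: float = 3000.0,   # assumed upper mass limit (hypothetical)
    bin_width: float = 1.0,   # assumed segment width in m/z units (hypothetical)
) -> List[float]:
    """Bin (m/z, intensity) peaks into a fixed-length intensity vector.

    Each element holds the tallest peak height seen in that m/z segment,
    or 0.0 when the segment contains no peak.
    """
    n_bins = int(round(max_mz / bin_width))
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz < 0.0 or mz >= max_mz:
            continue  # ignore peaks outside the assumed mass range
        idx = min(int(mz / bin_width), n_bins - 1)
        vector[idx] = max(vector[idx], intensity)
    return vector


if __name__ == "__main__":
    example_peaks = [(1.2, 5.0), (1.7, 8.0), (4.5, 3.0), (12.0, 9.0)]
    print(discretize_spectrum(example_peaks, max_mz=6.0, bin_width=1.0))
```

Under these assumptions, each vector element is a peak height for one segment of the spectrum, which is the sense in which the intensity vector is read as a "weighted vector" in the mapping above.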
Regarding Claim 3,
Tran et al. teaches A method of identifying features in mass spectral data, comprising (pg. 8251 second paragraph: “DeepNovo integrates CNNs and LSTM networks to learn features of tandem mass spectra, fragment ions, and sequence patterns for predicting peptides” teaches learning features of mass spectra data):
training a convolutional neural network by inputting a mass spectra from a known protein sample and a corresponding known amino acid sequence into the convolutional neural network to produce a trained convolutional neural network (Fig. 1 and pg. 8248 first full paragraph: “The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data” and pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teach training a CNN by inputting annotated spectra data (correspond to known amino acid sequence) into the CNN to produce a trained CNN; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample));
obtaining a mass spectra of an unknown protein sample; inputting the mass spectra of the unknown protein sample into the trained convolutional neural network (pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” and pg. 8249 second full paragraph: “To measure the accuracy of de novo sequencing results, we compared the real peptide sequence and the de novo peptide sequence of each spectrum” teach the input data is being inputted to the trained convolutional neural network (see Fig. 1) wherein the input data are mass spectra data of peptide samples having sequences that need to be predicted “de novo,” thus rendering the input to be unknown protein sample to the model; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample));
and determining, by a first output of the trained convolutional neural network, a presence or absence of an amino acid in the unknown protein sample (pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teaches the output of the trained convolutional neural network predicts an amino acid at each iteration, thus indicating the presence of the amino acid in the sample; also see pg. 8251 first full paragraph for unknown protein sample).
Regarding Claim 4,
Tran et al. teaches the method of claim 3.
Tran et al. further teaches further comprising determining, by a second output of the trained convolutional neural network, a length of a peptide sequence of the unknown protein sample (pg. 8251 first full paragraph: “we present a key downstream application of DeepNovo for complete de novo sequencing of mAbs. We trained the DeepNovo model with an in-house antibody database and used it to perform de novo peptide sequencing on two antibody datasets, the WIgG1 light and heavy chains of mouse (21). Note that the two testing datasets were not included in the training database. De novo peptides from DeepNovo were then used by the assembler ALPS (21) to automatically reconstruct the complete sequences of the antibodies (Figs. S5 and S6). For the light chain (length of 219 aa), we were able to reconstruct a single full-length contig that covered 100% of the target with 99.5% accuracy (218/219). For the heavy chain (length of 441 aa), we obtained three contigs together covering 97.5% of the target (430/441) with 97.2% accuracy (418/430)” teaches training DeepNovo model (including the convolutional neural network) to determine an output indicating the full-length predicted/sequenced peptide wherein the testing data is not included in the training database, thus rendering the testing sample to be “unknown”; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample)).
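The coverage and accuracy percentages in the passage quoted above follow directly from the residue counts given in the same passage. The short arithmetic check below is offered only as a sketch; the helper name `percent` is hypothetical, and the counts are exactly those quoted from pg. 8251.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


if __name__ == "__main__":
    # Light chain: 218 of 219 residues correct.
    print("light chain accuracy:", percent(218, 219))
    # Heavy chain: 430 of 441 residues covered, 418 of those 430 correct.
    print("heavy chain coverage:", percent(430, 441))
    print("heavy chain accuracy:", percent(418, 430))
```

The three values round to 99.5, 97.5, and 97.2, matching the figures in the quotation.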
Regarding Claim 6,
Tran et al. teaches the method of claim 3.
Tran et al. further teaches further comprising discretizing the mass spectra of the unknown protein sample into a one-dimensional vector prior to inputting the mass spectra into the trained convolutional neural network (Fig. 1 and pg. 8248 fourth full paragraph: “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” teaches discretizing the input spectrum (second mass spectrum) into intensity vectors (correspond to weighted vectors), prior to inputting input spectra into the trained CNN, wherein the intensity vectors are at least one-dimensional; pg. 8247  fourth full paragraph: “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering intensity vectors to correspond to weighted vectors).
Regarding Claim 7,
Tran et al. teaches the method of claim 6.
Tran et al. further teaches wherein the one-dimensional vector corresponds to a presence or absence of a peak in each segment of the mass spectra of the unknown protein sample (Fig. 1 and pg. 8248 fourth full paragraph: “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” teaches discretizing the input spectrum (second mass spectrum) into intensity vectors (correspond to weighted vectors), prior to inputting input spectra into the trained CNN, wherein the intensity vectors are at least one-dimensional; pg. 8247  fourth full paragraph: “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering intensity vectors to correspond to weighted vectors; peak heights correspond to presence of peak in a segment; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample); also see pg. 8251 first full paragraph for unknown protein sample).
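To illustrate the reasoning that peak heights in a segment also indicate the presence or absence of a peak in that segment, the sketch below derives a binary vector from a one-dimensional intensity vector. It is an assumption-based example only, not a method disclosed by Tran et al. or the application; the function name `presence_vector` and the `threshold` parameter are hypothetical.

```python
from typing import List


def presence_vector(intensity_vector: List[float], threshold: float = 0.0) -> List[int]:
    """Map each segment's peak height to 1 (peak present) or 0 (peak absent).

    A segment counts as containing a peak when its height exceeds `threshold`
    (an assumed cutoff; 0.0 treats any nonzero height as a peak).
    """
    return [1 if height > threshold else 0 for height in intensity_vector]


if __name__ == "__main__":
    heights = [0.0, 8.0, 0.0, 0.0, 3.0, 0.0]
    print(presence_vector(heights))
```

A nonzero height in a segment implies a peak is present there, so the binary vector carries strictly less information than the intensity vector it is derived from.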
Regarding Claim 8,
Tran et al. teaches the method of claim 3.
Tran et al. further teaches further comprising, prior to inputting the mass spectra from the known protein sample into the convolutional neural network, discretizing the mass spectra of the known protein sample into a one-dimensional vector, wherein the one-dimensional vector corresponds to a presence or absence of a peak in each segment of the mass spectra of the known protein sample (Fig. 1 and pg. 8248 fourth full paragraph: “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” teaches discretizing the input spectrum into intensity vectors (correspond to weighted vectors), prior to inputting input spectra into the trained CNN, wherein the intensity vectors are at least one-dimensional; pg. 8247 fourth full paragraph: “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering intensity vectors to correspond to weighted vectors; peak heights correspond to presence of peak in a segment; Fig. 1 and pg. 8248 first full paragraph: “The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data” and pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teach training a CNN by inputting annotated spectra data (correspond to known amino acid sequence) into the CNN to produce a trained CNN; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample)).
Regarding Claim 9,
Tran et al. teaches the method of claim 3.
Tran et al. further teaches further comprising, prior to inputting the mass spectra from the known protein sample into the convolutional neural network, discretizing the mass spectra of the known protein sample into a weighted vector, wherein the weighted vector corresponds to a peak height in each segment of the mass spectra of the known protein sample (Fig. 1 and pg. 8248 fourth full paragraph: “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” teaches discretizing the input spectrum into intensity vectors (correspond to weighted vectors), prior to inputting input spectra into the trained CNN, wherein the intensity vectors are at least one-dimensional; pg. 8247 fourth full paragraph: “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering intensity vectors to correspond to weighted vectors; Fig. 1 and pg. 8248 first full paragraph: “The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data” and pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teach training a CNN by inputting annotated spectra data (correspond to known amino acid sequence) into the CNN to produce a trained CNN; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample)).
Regarding Claim 11,
Tran et al. teaches the method of claim 1.
Tran et al. further teaches further comprising, prior to inputting the first mass spectrum into the convolutional neural network, discretizing the first mass spectrum into a first weighted vector, wherein the first weighted vector corresponds to a peak height in segments of the first mass spectrum (Fig. 1 and pg. 8248 fourth full paragraph: “For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network (Fig. 1A and SI Text)” teaches prior to inputting the spectrum data into the convolutional neural network, the spectrum data is discretized into intensity vectors (correspond to weighted vectors); pg. 8247 fourth paragraph: “Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vector representing intensity values correspond to peak heights in the segment of spectrum).
Regarding Claim 12,
Tran et al. teaches the method of claim 1.
Tran et al. further teaches wherein the weighted vector corresponds to a peak height in segments of the second mass spectrum (Fig. 1 and pg. 8248 fourth full paragraph: “For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network (Fig. 1A and SI Text)” teaches prior to inputting the spectrum data into the convolutional neural network, the spectrum data is discretized into intensity vectors (correspond to weighted vectors); pg. 8247 fourth paragraph: “Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vector representing intensity values correspond to peak heights in the segment of spectrum).
Regarding Claim 13,
Tran et al. teaches A method of identifying features in mass spectral data, comprising (pg. 8251 second paragraph: “DeepNovo integrates CNNs and LSTM networks to learn features of tandem mass spectra, fragment ions, and sequence patterns for predicting peptides” teaches learning features of mass spectra data):
obtaining a first mass spectra matched to a first amino acid sequence (pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” and pg. 8249 second full paragraph: “To measure the accuracy of de novo sequencing results, we compared the real peptide sequence and the de novo peptide sequence of each spectrum. A de novo amino acid is considered “matched” with a real amino acid if their masses are different by less than 0.1 Da” teach input spectra matched to amino acid sequence);
discretizing the first mass spectra by assigning a first weighted vector, wherein the first weighted vector corresponds to a peak height in segments of the first mass spectra (Fig. 1 and pg. 8248 fourth full paragraph: “For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network (Fig. 1A and SI Text)” teaches the spectra data is discretized into intensity vectors (correspond to weighted vectors); pg. 8247 fourth paragraph: “Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vector representing intensity values correspond to peak heights in the segment of spectra data);
training a convolutional neural network by inputting the first weighted vector and first amino acid sequence into the convolutional neural network to produce a trained convolutional neural network (Fig. 1 and pg. 8248 first full paragraph: “The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data” teach training a CNN by inputting spectra data (including a first mass spectrum) into the CNN to produce a trained CNN; Fig. 1 further teaches the intensity vectors (weighted vector) are inputted to train the CNN);
obtaining from a mass spectrometer a second mass spectra from a protein sample having an unknown amino acid sequence (pg. 8248 second to last paragraph: “We evaluated the performance of DeepNovo compared with current state of the art de novo peptide sequencing tools...For performance evaluation, we used two sets of data, low resolution and high resolution, from previous publications. The low-resolution set includes seven datasets (41–47) (Table S1). The first five datasets were acquired from the Thermo Scientific LTQ Orbitrap with the collision-induced dissociation (CID) technique. The other two were acquired from the Thermo Scientific Orbitrap Fusion with the higher-energy collisional dissociation (HCD) technique. The high-resolution set includes nine datasets acquired from the Thermo Scientific Q-Exactive with the HCD technique (48–56) (Table S2)” teaches obtaining mass spectra data from different types of mass spectrometers (including Thermo Scientific LTQ Orbitrap, Thermo Scientific Orbitrap Fusion, and Thermo Scientific Q-Exactive) to test the performance of the DeepNovo model; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample); also see pg. 8251 first full paragraph for unknown protein sample);
discretizing the second mass spectra by assigning a second weighted vector, wherein the second weighted vectors corresponds to a peak height in segments of the second mass spectra (Fig. 1 and pg. 8248 fourth full paragraph: “For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network (Fig. 1A and SI Text)” teaches the spectra data is discretized into intensity vectors (correspond to weighted vectors); pg. 8247 fourth paragraph: “Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vector representing intensity values correspond to peak heights in the segment of spectra data);
inputting the second weighted vector into the trained convolutional neural network  (Fig. 1 and pg. 8248 fourth full paragraph: “DeepNovo also encodes the input spectrum and uses it to initialize the cell state of the LSTM network (36, 37). For that purpose, the spectrum is discretized into an intensity vector that subsequently flows through another CNN, called spectrum-CNN, before being fed to the LSTM network” teaches discretizing the input spectrum (second mass spectrum) into intensity vectors (correspond to weighted vectors), which are inputted into the trained CNN; pg. 8247  fourth full paragraph: “peptide fragmentation generates multiple types of ions, including a, b, c, x, y, z, internal cleavage, and immonium ions (38). Depending on the fragmentation methods, different types of ions may have quite different intensity values (peak heights)” teaches intensity vectors represent the intensity values/peak heights (weights) associated with ions, thus rendering intensity vectors to correspond to weighted vectors); and 
determining, by an output of the trained convolutional neural network, a predicted amino acid sequence for the protein sample (Fig. 1 and pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teaches using the output from the trained CNN to produce a predicted amino acid sequence corresponding to input mass spectrum; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample)).
Regarding Claim 14,
Tran et al. teaches the method of claim 13.
Tran et al. further teaches further comprising determining, by the output of the trained convolutional neural network, a length of a peptide sequence of the protein sample (pg. 8251 first full paragraph: “we present a key downstream application of DeepNovo for complete de novo sequencing of mAbs. We trained the DeepNovo model with an in-house antibody database and used it to perform de novo peptide sequencing on two antibody datasets, the WIgG1 light and heavy chains of mouse (21). Note that the two testing datasets were not included in the training database. De novo peptides from DeepNovo were then used by the assembler ALPS (21) to automatically reconstruct the complete sequences of the antibodies (Figs. S5 and S6). For the light chain (length of 219 aa), we were able to reconstruct a single full-length contig that covered 100% of the target with 99.5% accuracy (218/219). For the heavy chain (length of 441 aa), we obtained three contigs together covering 97.5% of the target (430/441) with 97.2% accuracy (418/430)” teaches training DeepNovo model (including the convolutional neural network) to determine an output indicating the full-length predicted/sequenced peptide wherein the testing data is not included in the training database, thus rendering the testing sample to be “unknown”; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample)).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 5, and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Tran et al. (“De novo peptide sequencing by deep learning”) in view of Torng et al. (“3D deep convolutional neural networks for amino acid environment similarity analysis”).

Regarding Claim 2,
Each limitation that recites “or” has been interpreted as requiring only one of the alternatives, not all of the alternatives.
Tran et al. teaches the method of claim 1.
Tran et al. further teaches the amino acid sequence from the first mass spectrum (Fig. 1 teaches amino acid sequence from a first mass spectrum)...identifying, by the trained convolutional neural network, the presence or absence of each subsequence in the second mass spectrum of the protein sample; and determining, by an output of the trained convolutional neural network, the predicted amino acid sequence based on the presence or absence of each subsequence (pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” teaches predicting amino acid sequence based on the output of the trained convolutional neural network, which indicates the presence of specific subsequence of amino acids in the input mass protein sample; pg. 8249 second full paragraph: “Most importantly, all sequencing tools report confidence scores for their predictions. The confidence scores reflect the quality of predicted amino acids and are valuable for downstream analysis [e.g., reconstructing the entire protein sequence from its peptides” teaches that peptides are part of protein sequence (protein sample)).
Tran et al. does not appear to explicitly teach pooling the amino acid sequence...into a plurality of groups of sequential amino acids; classifying each amino acid as aliphatic or aromatic and assigning a first feature to each amino acid based on the classification as aliphatic or aromatic; classifying each amino acid as hydrophobic or hydrophilic and assigning a second feature to each amino acid based on the classification as hydrophobic or hydrophilic; classifying each amino acid as positively charged or negatively charged and assigning a third feature to each amino acid based on the classification as positively charged or negatively charged; producing a subsequence for each of the groups of sequential amino acids based on the first feature, the second feature, and the third feature for each amino acid; training the convolutional neural network by inputting into the convolutional neural network the subsequence for each of the groups of sequential amino acids.
However, Torng et al. teaches pooling the amino acid sequence...into a plurality of groups of sequential amino acids (Fig. 6 caption: “3DCNN-Training. Amino acid groupings discovered by our 3DCNN generally agree with known amino acid similarities. Six clusters were discovered by our network. The first cluster includes phenylalanine, tryptophan, and tyrosine. These are the three amino acids known to be hydrophobic and aromatic. The second and third clusters comprises valine, isoleucine and leucine, methionine respectively, which are all non-polar and aliphatic. The polar amino acids form the fourth cluster. Amino acids with known distinct properties, glycine and cysteine do not form local blocks with the other amino acids” teaches clustering (pooling) amino acid sequence into groups);
classifying each amino acid as aliphatic or aromatic and assigning a first feature to each amino acid based on the classification as aliphatic or aromatic; classifying each amino acid as hydrophobic or hydrophilic and assigning a second feature to each amino acid based on the classification as hydrophobic or hydrophilic (Fig. 6 caption: “Amino acid groupings discovered by our 3DCNN generally agree with known amino acid similarities. Six clusters were discovered by our network. The first cluster includes phenylalanine, tryptophan, and tyrosine. These are the three amino acids known to be hydrophobic and aromatic. The second and third clusters comprises valine, isoleucine and leucine, methionine respectively, which are all non-polar and aliphatic” teaches discovering (classifying) amino acid as aliphatic or aromatic (corresponds to assigning first feature) and discovering (classifying) amino acid as hydrophobic (corresponds to assigning second feature));
classifying each amino acid as positively charged or negatively charged and assigning a third feature to each amino acid based on the classification as positively charged or negatively charged (pg. 20 first paragraph: “We present four examples of local amino acid microenvironments, including those of charged, polar, and non-polar amino acids” teaches determining (classifying) amino acid as charged, which includes “POS_CHARGE” (positively charged) and “NEG_CHARGE” (negatively charged), see Table 1);
producing a subsequence for each of the groups of sequential amino acids based on the first feature, the second feature, and the third feature for each amino acid (Fig. 6 caption: “Amino acid groupings discovered by our 3DCNN generally agree with known amino acid similarities. Six clusters were discovered by our network. The first cluster includes phenylalanine, tryptophan, and tyrosine. These are the three amino acids known to be hydrophobic and aromatic. The second and third clusters comprises valine, isoleucine and leucine, methionine respectively, which are all non-polar and aliphatic” teaches producing training subsequence for clusters (groups) based on features such as an amino acid as aliphatic or aromatic (corresponds to first feature) and an amino acid as hydrophobic (corresponds to second feature); pg. 20 first paragraph: “We present four examples of local amino acid microenvironments, including those of charged, polar, and non-polar amino acids” teaches an amino acid as charged, which includes “POS_CHARGE” (positively charged) and “NEG_CHARGE” (negatively charged), which corresponds to third feature; see Table 1);
training the convolutional neural network by inputting into the convolutional neural network the subsequence for each of the groups of sequential amino acids (pg. 4 first paragraph: “For each structure in the training and test structure sets, we placed a 3D grid with 10 Å spacing to sample positions in the protein for local box extraction” and pg. 5 first paragraph: “Different amino acids have strikingly different frequencies of occurrence within natural proteins. To ensure useful features can be extracted from all the 20 amino acid microenvironment types, we construct balanced training and test datasets by applying the following procedure to the training and test dataset: (1) The least abundant amino acid microenvironment in the original dataset is first identified. (2) All examples of the identified amino acid microenvironment type are included in the balanced dataset. (3) The number of examples for the least abundant amino acid microenvironment is used to randomly sample an equal amount of examples from all the other 19 amino acid microenvironment types” teach the training data to the convolutional neural network are protein structures formed by subsequences of sequential amino acids; also see pg. 3 first paragraph: “(1)To study how the 20 amino acids interact with their neighboring microenvironment, we train our network to predict the amino acids most compatible with a specific location within a protein structure”).
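For illustration only, the three-step balancing procedure quoted above from Torng et al. (pg. 5 first paragraph) can be sketched as follows. This is a minimal sketch and not code from the reference: the function name `balance_by_class`, the `(label, item)` input format, and the fixed random seed are assumptions made for this example.

```python
import random
from collections import defaultdict


def balance_by_class(examples, seed=0):
    """Return a class-balanced subset of (label, item) pairs.

    Hypothetical helper: follows the quoted steps (1)-(3), but the
    data layout and seeding are assumptions for illustration.
    """
    by_label = defaultdict(list)
    for label, item in examples:
        by_label[label].append(item)

    # (1) Identify the size of the least abundant class.
    n_min = min(len(items) for items in by_label.values())

    rng = random.Random(seed)
    balanced = []
    for label, items in by_label.items():
        # (2) Keep every example of a least abundant class;
        # (3) randomly sample the same number from every other class.
        chosen = items if len(items) == n_min else rng.sample(items, n_min)
        balanced.extend((label, item) for item in chosen)
    return balanced


# Toy input: three microenvironment types with 5, 2, and 3 examples.
toy = ([("A", i) for i in range(5)]
       + [("W", i) for i in range(2)]
       + [("G", i) for i in range(3)])
balanced_toy = balance_by_class(toy)
```

In the toy input, the least abundant type has two examples, so the balanced set contains two examples of each of the three types.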
Tran et al. and Torng et al. are analogous art to the claimed invention because they are directed to amino acid sequence analysis using convolutional neural networks.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the above limitation(s) as taught by Torng et al. into the disclosed invention of Tran et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “To study how the 20 amino acids interact with their neighboring microenvironment, we train our network to predict the amino acids most compatible with a specific location within a protein structure...and show that out 3DCNN achieved superior performances over models using conventional features” (Torng et al. pg. 3 first paragraph).
Regarding Claim 5,
Tran et al. teaches the method of claim 3.
Tran et al. does not appear to explicitly teach further comprising determining, by a third output of the trained convolutional neural network, a frequency of the amino acid in the peptide sequence of the unknown protein sample.
However, Torng et al. teaches further comprising determining, by a third output of the trained convolutional neural network, a frequency of the amino acid in the peptide sequence of the unknown protein sample (Fig. 5 and caption: “Fig. 5 Confusion matrices for predictions of the 20 amino acid microenvironments. Predictions on the training and test datasets using 3DCNN and FEATURE Softmax Classifier are summarized into confusion matrices to inspect the propensity of each microenvironment type to be predicted as one another. The 20 amino acids are arranged according to knowledge-based amino acid groups, where amino acids known to be biochemically similar are adjacent. The ith, jth element of the matrices shows the probability of examples of true label i being predicted as label j. The probability is represented in heat map colors. a 3DCNN-Train. b 3DCNN-Test. Local block structures in the confusion matrices for 3DCNN demonstrate similarities and differences between amino acid microenvironments. For example, phenylalanine (F), tryptophan (W), and tyrosine (Y) form a hydrophobic and aromatic block” teaches determining, by analyzing the output of the trained convolutional neural network, the propensity of each amino acid to be predicted as another, which would involve determining the number of times (frequency) the specific amino acid appears in the sequence).
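For illustration only, the matrix entries described in the quoted Fig. 5 caption (“the probability of examples of true label i being predicted as label j”) can be computed by counting predictions per true label and dividing each row by its total. The sketch below is an assumption-labeled example and not code from Torng et al.; the function name `confusion_probabilities` and the toy labels are hypothetical.

```python
def confusion_probabilities(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix as nested lists.

    Entry [i][j] is the fraction of examples with true label
    classes[i] that were predicted as classes[j]. Hypothetical
    helper written for this example.
    """
    index = {c: k for k, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1

    probs = []
    for row in counts:
        total = sum(row)  # number of examples with this true label
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Toy labels for the aromatic residues named in the caption.
classes = ["F", "W", "Y"]
true = ["F", "F", "W", "W", "Y", "Y"]
pred = ["F", "W", "W", "W", "Y", "F"]
matrix = confusion_probabilities(true, pred, classes)
```

The row totals computed in the sketch are the per-label example counts; each row of the resulting matrix sums to 1 whenever that label occurs at least once.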
Tran et al. and Torng et al. are analogous art to the claimed invention because they are directed to amino acid sequence analysis using convolutional neural networks.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the above limitation(s) as taught by Torng et al. into the disclosed invention of Tran et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “To study how the 20 amino acids interact with their neighboring microenvironment, we train our network to predict the amino acids most compatible with a specific location within a protein structure...and show that out 3DCNN achieved superior performances over models using conventional features” (Torng et al. pg. 3 first paragraph).

Regarding Claim 10,
Tran et al. in view of Torng et al. teaches the method of claim 2.
Torng et al. further teaches further comprising determining, by the output of the trained convolutional neural network, the predicted amino acid sequence based on a frequency of each subsequence (Fig. 5 and caption: “Fig. 5 Confusion matrices for predictions of the 20 amino acid microenvironments. Predictions on the training and test datasets using 3DCNN and FEATURE Softmax Classifier are summarized into confusion matrices to inspect the propensity of each microenvironment type to be predicted as one another. The 20 amino acids are arranged according to knowledge-based amino acid groups, where amino acids known to be biochemically similar are adjacent. The ith, jth element of the matrices shows the probability of examples of true label i being predicted as label j. The probability is represented in heat map colors. a 3DCNN-Train. b 3DCNN-Test. Local block structures in the confusion matrices for 3DCNN demonstrate similarities and differences between amino acid microenvironments. For example, phenylalanine (F), tryptophan (W), and tyrosine (Y) form a hydrophobic and aromatic block” teaches determining, by analyzing the output of the trained convolutional neural network, the propensity of each amino acid to be predicted as another, which would involve determining the number of times (frequency) the specific amino acid appears in the sequence; pg. 3 first full paragraph: “(1)To study how the 20 amino acids interact with their neighboring microenvironment, we train our network to predict the amino acids most compatible with a specific location within a protein structure” teaches the convolutional neural network is trained to predict amino acids).
Tran et al. and Torng et al. are analogous art to the claimed invention because they are directed to amino acid sequence analysis using convolutional neural networks.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the above limitation(s) as taught by Torng et al. into the disclosed invention of Tran et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “To study how the 20 amino acids interact with their neighboring microenvironment, we train our network to predict the amino acids most compatible with a specific location within a protein structure...and show that out 3DCNN achieved superior performances over models using conventional features” (Torng et al. pg. 3 first paragraph).

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Tran et al. (“De novo peptide sequencing by deep learning”) in view of Craig et al. (“Using Annotated Peptide Mass Spectrum Libraries for Protein Identification”).
Regarding Claim 15,
Tran et al. teaches the method of claim 13.
Tran et al. further teaches further comprising training the convolutional neural network by inputting a set of...mass spectra data and corresponding theoretical amino acid sequences into the convolutional neural network (Fig. 1 and pg. 8248 first full paragraph: “The CNN and LSTM networks of DeepNovo can be jointly trained from scratch given a set of annotated spectra obtained from spectral libraries or database search tools. This architecture allows us to train both general and specific models to adapt to any sources of data” teach training a CNN by inputting spectra data into the CNN to produce a trained CNN; pg. 8248 second full paragraph: “The DeepNovo model is briefly illustrated in Fig. 1. The model takes a spectrum as input and tries to sequence the peptide by predicting one amino acid at each iteration” and pg. 8249 second full paragraph: “To measure the accuracy of de novo sequencing results, we compared the real peptide sequence and the de novo peptide sequence of each spectrum. A de novo amino acid is considered “matched” with a real amino acid if their masses are different by less than 0.1 Da” teach input spectra is matched to amino acid sequence).
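For illustration only, the matching criterion quoted above from Tran et al. (pg. 8249: masses “different by less than 0.1 Da”) can be sketched as a position-by-position comparison of residue masses. This is a simplified sketch and not code from the reference: the function name `count_matched` is hypothetical, the position-wise alignment of the two sequences is an assumption, and the table lists standard monoisotopic residue masses for only a few amino acids.

```python
# Standard monoisotopic residue masses in Da (subset, for the example).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "L": 113.08406,
    "I": 113.08406,
    "Q": 128.05858,
    "K": 128.09496,
    "F": 147.06841,
}


def count_matched(real, de_novo, tol=0.1):
    """Count positions whose residue masses differ by less than tol Da.

    Assumption: the two peptides are compared position by position;
    only the quoted mass-difference test is implemented here.
    """
    return sum(
        1
        for r, d in zip(real, de_novo)
        if abs(RESIDUE_MASS[r] - RESIDUE_MASS[d]) < tol
    )


all_match = count_matched("GALKF", "GAIQF")  # L/I and K/Q fall within 0.1 Da
one_miss = count_matched("GALKF", "GAGKF")   # L vs G differs by about 56 Da
```

Under this criterion, leucine and isoleucine (identical mass) and lysine and glutamine (about 0.036 Da apart) count as matched, so the first comparison matches at all five positions and the second at four.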
Tran et al. does not appear to explicitly teach inputting a set of synthetic mass spectra data.
However, Craig et al. teaches inputting a set of synthetic mass spectra data (pg. 1845 fourth full paragraph: “The algorithm for obtaining composite spectra for inclusion in a library was a straightforward, pairwise averaging process. The steps required were as follows. 1. Obtain all available spectra for a particular peptide sequence, parent ion charge state, and residue modification combination from the spectrum cluster database (e.g., obtain all of the spectra for the sequence “YHFMTWK”, where the parent ion charge is +2 and the methionine residue has been oxidized). 2. Order the resulting list of spectra, from most to least confidently assigned (lowest to highest expectation value). 3. Delete duplicate spectra from the list. 4. Start with the most confident assignment. Select the next most confident assignment and identify sets of shared ions between the two spectra: a set of ions have m/z ratios within the allowed fragment ion mass tolerance. 5. Create a new m/z value for each set, by calculating a centroid of the m/z-value and intensities of the peaks in the set. Sum together the intensities of the peaks in the set and create a new spectrum made up of the summed intensities and m/z centroid pairs. 6. Take the new composite spectrum and apply the same steps to it and the next most confident spectrum, creating a new composite. 7. Continue this process until all spectra have been included into the composite” teaches creating new composite spectra data (corresponds to synthetic spectra data) as input).
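For illustration only, the pairwise averaging steps quoted above from Craig et al. (pg. 1845) can be sketched as follows. This is one reading of the quoted steps and not code from the reference. The function names `merge_pair` and `composite_spectrum`, the `(expectation_value, peaks)` input format, the greedy nearest-peak matching, the 0.5 tolerance, and the choice to keep only shared peaks in the composite are all assumptions made for this example.

```python
def merge_pair(a, b, tol=0.5):
    """Merge two peak lists of (mz, intensity) into a composite.

    Steps 4-5 as read here: peaks within tol are treated as shared;
    each shared set becomes one peak at the intensity-weighted m/z
    centroid with the summed intensity. Intensities are assumed > 0.
    """
    merged, used = [], set()
    for mz_a, int_a in a:
        best = None
        for j, (mz_b, _) in enumerate(b):
            if j in used or abs(mz_a - mz_b) > tol:
                continue
            if best is None or abs(mz_a - mz_b) < abs(mz_a - b[best][0]):
                best = j
        if best is not None:
            used.add(best)
            mz_b, int_b = b[best]
            total = int_a + int_b
            merged.append(((mz_a * int_a + mz_b * int_b) / total, total))
    return sorted(merged)


def composite_spectrum(spectra, tol=0.5):
    """spectra: non-empty list of (expectation_value, peaks)."""
    # Step 2: order from most to least confident (lowest value first).
    ordered = sorted(spectra, key=lambda s: s[0])
    # Step 3: delete duplicate spectra.
    unique = []
    for _, peaks in ordered:
        if peaks not in unique:
            unique.append(peaks)
    # Steps 4-7: fold each remaining spectrum into the composite.
    composite = unique[0]
    for peaks in unique[1:]:
        composite = merge_pair(composite, peaks, tol)
    return composite


spec_a = [(100.0, 10.0), (200.0, 5.0)]
spec_b = [(100.2, 30.0), (300.0, 7.0)]
result = composite_spectrum([(1e-3, spec_b), (1e-5, spec_a), (1e-4, spec_a)])
```

In the toy input, the duplicate copy of `spec_a` is dropped, and the only shared ion (100.0 and 100.2) yields a single composite peak near m/z 100.15 with summed intensity 40.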
Tran et al. and Craig et al. are analogous art to the claimed invention because they are directed to analysis of spectra data.
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the above limitation(s) as taught by Craig et al. into the disclosed invention of Tran et al.

Response to Arguments
Applicant’s arguments with respect to the 35 U.S.C. 112(b) rejection to claims 2 and 6 have been considered but are not persuasive because the new ground of rejection is necessitated by amendments, and the arguments did not address the ground(s) of rejection presented in this Office Action.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: FAN et al. (US 2017/0329892 A1) teaches classifying and predicting protein side chain conformations using convolutional neural network, which is relevant to Fig. 3 of the present application.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YING YU CHEN whose telephone number is (571)270-1484. The examiner can normally be reached Monday-Friday 7:30 am-5:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YING YU CHEN/               Examiner, Art Unit 2125