Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner’s Note
Applicant is strongly requested to provide, in the Remarks, supporting paragraph(s) with a clear explanation for each limitation of any amended or new claim(s), so that the Examiner can interpret the claims clearly and definitely.

Priority
Acknowledgment is made of applicant's claim for the benefit of the provisional application filed on 11/22/2017.

Response to Arguments
Applicant's arguments filed on 08/15/2022 have been fully considered but they are not persuasive.
In Remarks, p. 9, Applicant contends: 
Figures 8, 10, and 11 are photographs of cell cultures and Figure 9 is a photograph of an electrophoresis gel, each of which is expressly permitted by 37 C.F.R. § 1.84(b)(1). Accordingly, Applicant submits that Figures 8-11 are permitted by the Office and corrected drawing sheets are not required. Withdrawal of the objections of Figures 8-11 is respectfully requested.

Examiner’s response:
To have the drawings accepted, the Applicant needs to file a petition, meet the requirements of 37 CFR 1.84, and pay the required fee. However, it does not appear that the Applicant has filed the required petition, paid the fee, or met the other requirements of 37 CFR 1.84. For more details, please refer to 37 CFR 1.84.

Therefore, the applicant’s arguments are not convincing, and the objections to figs 8-11 are maintained.

In Remarks, pp. 10-13, Applicant contends: 
With respect to Hou, the Office has alleged that it would have been obvious to a skilled artisan "to have modified the collagen property prediction system of Concu, Ramshaw, Iwazawa, and Chang with the solubility of Hou". The Office Action at page 42. Applicant respectfully disagrees. The Office has not provided any reasoned explanation as to why or how a skilled artisan would have modified the stability-predicting algorithm of Concu with the solubility of Hou to arrive at the claimed invention, aside from a mere allegation that "[d]oing so would lead to providing a genetic recombinant human collagen with excellent water solubility, high expression quantity and high purity". There is nothing in Hou that would motivate or lead a skilled artisan to replace the stability-predicting algorithm of Concu with the solubility of Hou, let alone any disclosure in any of the references on how a skilled artisan would go about doing so.

Examiner’s response:
Concu teaches predicting that the frequencies of collagen sequences (cf. [secs 2-3] “a stable series of 16 sequences and an unstable series of 86”) are associated with their stability (cf. [secs 2-3] “stable” and “unstable”), and then determining which collagen sequences are stable and which collagen sequences are unstable based on machine learning (cf. [secs 2-3] “classified 14 out of 16 (87.7%) sequences in the stable series and 71 out of 86 (82.55%) sequences in the unstable series” and “Sequence” of table 3.). 
On the other hand, Hou teaches artificial total gene synthesis to provide a genetic recombinant human collagen with excellent water solubility, high expression quantity and high purity (cf. [sec Disclosure of the Invention, pp. 3-4]). In other words, Hou teaches establishing the relationship between collagen and water solubility.
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu with the water solubility of Hou, since the relationship between collagen and water solubility taught by Hou may be used in the machine learning framework of Concu in place of the relationship between collagen and stability.
Therefore, the applicant’s arguments are not convincing.

Drawings
The drawings are objected to because it appears that the drawings in figs 8-11 are color drawings or photographs and are not black and white line drawings. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 7, 16, 46-49, 52-53, 58-59 are rejected under 35 U.S.C. 103 as being unpatentable over Concu et al. (Review of Computer-Aided Models for Predicting Collagen Stability), in view of Ramshaw et al. (Gly-X-Y Tripeptide Frequencies in Collagen: A Context for Host–Guest Triple-Helical Peptides), further in view of Hou et al. (AU 2016101562 A4), further in view of IWAZAWA et al. (US 2013/0084638 A1), further in view of Chang et al. (US 2009/0143568 A1).

(Note: Hereinafter, if a limitation has brackets (i.e. [ ]) around claim language, the bracketed claim language has not yet been taught by the current prior art reference but will be taught by another prior art reference afterwards.)

Regarding claim 1
Concu teaches
A method of engineering one or more collagen molecules with at least one physical or chemical property, the method comprising: 
(a) generating, using a machine learning model implemented on a computer system comprising one or more processors and system memory, a prediction that indicates that a set of target data comprising [relative] frequencies of amino acid residues in one or more target collagen sequences is associated with the at least one physical or chemical property, wherein the at least one physical or chemical property is selected from the group consisting of: [stiffness, elasticity, oxygen release rate, clarity, turbidity, ultraviolet blockage or absorption, viscosity, solubility, water content or hydration, resistance to protease, and ability to associate into fibrils], and wherein the machine learning model was trained by: 
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec Abstract] “The stability of the collagen triple helix is strictly related to its amino acid sequence, especially the main Gly-X-Y motif. … We used the literature to assemble a set of 102 peptides and their relative melting temperatures were determined experimentally, indicating a great variance with the main motif of the collagen.”; [sec 1] “The stability of the helix requires a glycine every three residues i.e. a repeated motif of Gly-X-Y [13, 14], where the X and Y position can be any amino acid that can generate hydrogen bonds to stabilize the structure.”; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model. … Table 3 shows all the peptide IDs, their Tms, and their amino acid sequence using standard one letter abbreviations. The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation. … ANN models were constructed using Statistica 6.0 [89] and all variables used were normalized to the same scale.”; [sec 3] “The best model with a Tm of 38°C was obtained with the MLP algorithm using a stable series of 16 sequences and an unstable series of 86. 
The model correctly classified 14 out of 16 (87.7%) sequences in the stable series and 71 out of 86 (82.55%) sequences in the unstable series. The MCC value for this model was 0.58 and the surface of the ROC curve was 0.86. Table 4 shows all the statistics for the training and validation series. Fig. (9) shows the topology of the network and the ROC curve”; Table 3 reads on “frequencies of amino acid residues in one or more target collagen sequences”. In addition, “stability” reads on “at least one physical or chemical property”. Furthermore, “The best model with a Tm of 38°C was obtained with the MLP algorithm” and “The model correctly classified 14 out of 16 (87.7%) sequences in the stable series and 71 out of 86 (82.55%) sequences in the unstable series” read on “prediction that indicates that a set of target data comprising [relative] frequencies of amino acid residues in one or more target collagen sequences is associated with the at least one physical or chemical property”. Note that “ANN models were constructed using Statistica 6.0” reads on “computer system comprising one or more processors and system memory” since Statistica 6.0 is a data analysis software system which runs on a computer system.)

(i) receiving a set of training data comprising [relative] frequencies of amino acid residues in a plurality of training collagen sequences and physical or chemical property data of the at least one physical or chemical property associated with the plurality of training collagen sequences, wherein a length of each of the plurality of training collagen sequences is at least [100] amino acid residues; and
(Concu, [tables 2-3] “Length”: “20” - “36”; [figs 5-6 and 8-9]; [secs Abstract and 1-2] as cited above; table 3 reads on “frequencies of amino acid residues in a plurality of training collagen sequences”. In addition, “stability” reads on “physical or chemical property data of the at least one physical or chemical property associated with the plurality of training collagen sequences”.)

(ii) training the machine learning model by fitting the machine learning model to the set of training data thereby generating a trained machine learning model, wherein the trained machine learning model is configured to receive as input [relative] amino acid frequency data of a test collagen sequence and predict at least one value of the at least one physical or chemical property associated with the test collagen sequence 
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec Abstract] as cited above, and “This dataset was then split in two classes, stable and unstable, according to their melting temperatures and the dataset was then used to build artificial neural network (ANN) models to predict collagen stability. We built models to predict stability at temperatures of 38°C, 35°C, 30°C, and 25°C degrees, and all models had an accuracy between 82% and 92%.”; [sec 1] as cited above; [sec 2] as cited above, and “All ANNs went through one-step testing (one training period) and later two-step testing (two training periods) of the training algorithms. In the two-step training, different algorithms were combined, including, back-propagation, Levenberg-Marquardt, quick propagation, quasi-Newton, and conjugated gradient descent. Combinations of two different methods were tested using a different number of epochs to train the ANN (ranging from 10 to 100,000 epochs). To obtain the ROC curve [93] using the ANN models we built linear neural networks (LNN) and selected the one that was most similar to our LDA final model.”);

(b) subsequent to generating the prediction that the set of target data is associated with the at least one physical or chemical property, determining, by the computer system, one or more collagen sequences corresponding to the set of target data by identifying the one or more collagen sequences based at least in part on the [relative] frequencies of amino acid residues in the target data 
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [secs Abstract, 1-2] as cited above; [sec 3] “The best model with a Tm of 38°C was obtained with the MLP algorithm using a stable series of 16 sequences and an unstable series of 86. The model correctly classified 14 out of 16 (87.7%) sequences in the stable series and 71 out of 86 (82.55%) sequences in the unstable series. The MCC value for this model was 0.58 and the surface of the ROC curve was 0.86. Table 4 shows all the statistics for the training and validation series. Fig. (9) shows the topology of the network and the ROC curve”);

(c) [producing one or more polynucleotides encoding] the one or more collagen sequences 
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [secs Abstract, 1-2] as cited above; [sec 3] “The best model with a Tm of 38°C was obtained with the MLP algorithm using a stable series of 16 sequences and an unstable series of 86. The model correctly classified 14 out of 16 (87.7%) sequences in the stable series and 71 out of 86 (82.55%) sequences in the unstable series. The MCC value for this model was 0.58 and the surface of the ROC curve was 0.86. Table 4 shows all the statistics for the training and validation series. Fig. (9) shows the topology of the network and the ROC curve”); and 

(d) [expressing, on a protein production platform, the one or more polynucleotides to produce] one or more collagen molecules comprising the one or more collagen sequences 
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec 1] “The March-Inside (Markovian Chemicals In Silico Design) methodology was developed by our research group to generate molecular descriptors based on Markov Chain (MC) theory.”; [sec 2] “The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation.”).

However, Concu does not teach
(a) generating, using a machine learning model implemented on a computer system comprising one or more processors and system memory, a prediction that indicates that a set of target data comprising [relative] frequencies of amino acid residues in one or more target collagen sequences is associated with the at least one physical or chemical property, wherein the at least one physical or chemical property is selected from the group consisting of: [stiffness, elasticity, oxygen release rate, clarity, turbidity, ultraviolet blockage or absorption, viscosity, solubility, water content or hydration, resistance to protease, and ability to associate into fibrils], and wherein the machine learning model was trained by: 
(i) receiving a set of training data comprising relative frequencies of amino acid residues in a plurality of training collagen sequences and physical or chemical property data of the at least one physical or chemical property associated with the plurality of training collagen sequences, wherein a length of each of the plurality of training collagen sequences is at least 100 amino acid residues; and
(ii) training the machine learning model by fitting the machine learning model to the set of training data thereby generating a trained machine learning model, wherein the trained machine learning model is configured to receive as input relative amino acid frequency data of a test collagen sequence and predict at least one value of the at least one physical or chemical property associated with the test collagen sequence;
(b) subsequent to generating the prediction that the set of target data is associated with the at least one physical or chemical property, determining, by the computer system, one or more collagen sequences corresponding to the set of target data by identifying the one or more collagen sequences based at least in part on the relative frequencies of amino acid residues in the target data;
(c) producing one or more polynucleotides encoding the one or more collagen sequences; and 
(d) expressing, on a protein production platform, the one or more polynucleotides to produce one or more collagen molecules comprising the one or more collagen sequences.

(Note: Hereinafter, if a limitation has one or more underlines, the underlined claim language has not been taught yet, while the non-underlined claim language has been taught already.)
	
Ramshaw teaches
(a) generating, using a machine learning model implemented on a computer system comprising one or more processors and system memory, a prediction that indicates that a set of target data comprising relative frequencies of amino acid residues in one or more target collagen sequences is associated with the at least one physical or chemical property, wherein the at least one physical or chemical property is selected from the group consisting of: [stiffness, elasticity, oxygen release rate, clarity, turbidity, ultraviolet blockage or absorption, viscosity, solubility, water content or hydration, resistance to protease, and ability to associate into fibrils], and wherein the machine learning model was trained by: 
(i) receiving a set of training data comprising relative frequencies of amino acid residues in a plurality of training collagen sequences and physical or chemical property data of the at least one physical or chemical property associated with the plurality of training collagen sequences, wherein a length of each of the plurality of training collagen sequences is at least [100] amino acid residues;
(ii) training the machine learning model by fitting the machine learning model to the set of training data thereby generating a trained machine learning model, wherein the trained machine learning model is configured to receive as input relative amino acid frequency data of a test collagen sequence and predict at least one value of the at least one physical or chemical property associated with the test collagen sequence;
(b) subsequent to generating the prediction that the set of target data is associated with the at least one physical or chemical property, determining, by the computer system, one or more collagen sequences corresponding to the set of target data by identifying the one or more collagen sequences based at least in part on the relative frequencies of amino acid residues in the target data 
(Ramshaw, [figs 1-3]; [sec Abstract] “All formed stable triple-helices, with their melting temperature depending on the identity of the guest triplet. While including less than 10% of all possible triplets, the data set covers 50–60% of collagen sequences and provides a starting point for establishing a stability scale to predict the relative stability of important collagen regions, such as the matrix metalloproteinase cleavage site or binding sites”; [sec “Gly-X-Y TRIPLET DISTRIBUTION IN COLLAGEN”] “Their studies are extended here by analyzing triplet frequencies for a set of human sequences containing both fibril forming and nonfibrillar collagens (total of 4040 triplets). The frequency of occurrence (shown in percentages) of GlyX-Y triplets is presented in Fig. 1, where residues in the X and Y position are shown on the vertical and horizontal scales, respectively”; see also [sec “HOST–GUEST TRIPLE-HELIX PEPTIDE DESIGN AND STABILITY”]; Fig 1 reads on “relative frequencies of amino acid residues”. 
Note that Concu teaches “(a) generating, using a machine learning model implemented on a computer system comprising one or more processors and system memory, a prediction that indicates that a set of target data comprising [relative] frequencies of amino acid residues in one or more target collagen sequences is associated with the at least one physical or chemical property, wherein the at least one physical or chemical property is selected from the group consisting of: [stiffness, elasticity, oxygen release rate, clarity, turbidity, ultraviolet blockage or absorption, viscosity, solubility, water content or hydration, resistance to protease, and ability to associate into fibrils], wherein the machine learning model was trained by: (i) receiving a set of training data comprising [relative] frequencies of amino acid residues in a plurality of training collagen sequences and physical or chemical property data of the at least one physical or chemical property associated with the plurality of training collagen sequences, wherein a length of each of the plurality of training collagen sequences is at least [100] amino acid residues; (ii) training the machine learning model by fitting the machine learning model to the set of training data thereby generating a trained machine learning model, wherein the trained machine learning model is configured to receive as input [relative] amino acid frequency data of a test collagen sequence and predict at least one value of the at least one physical or chemical property associated with the test collagen sequence; (b) subsequent to generating the prediction that the set of target data is associated with the at least one physical or chemical property, determining, by the computer system, one or more collagen sequences corresponding to the set of target data by identifying the one or more collagen sequences based at least in part on the [relative] frequencies of amino acid residues in the target data”.);

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu with the relative frequencies of amino acid residues of Ramshaw. 
Doing so would lead to effectively establishing a stability scale to predict the relative stability of different collagen sequences.
(Ramshaw, [secs Abstract and “HOST–GUEST TRIPLE-HELIX PEPTIDE DESIGN AND STABILITY”] “While including less than 10% of all possible triplets, the data set covers 50–60% of collagen sequences and provides a starting point for establishing a stability scale to predict the relative stability of important collagen regions, such as the matrix metalloproteinase cleavage site or binding sites”)
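(Note: For illustration only, the notion of “relative frequencies” of Gly-X-Y triplets, as presented in percentages in Fig. 1 of Ramshaw, can be sketched as follows. This is a minimal sketch and is not taken from Ramshaw or Concu; the function name, the non-overlapping triplet reading frame, and the example sequence are assumptions made solely for this illustration.)

```python
from collections import Counter


def triplet_relative_frequencies(sequence):
    """Return {triplet: fraction of all triplets} for one sequence.

    Hypothetical helper: the sequence is read as consecutive,
    non-overlapping triplets starting at the first residue; any
    trailing residues that do not fill a triplet are ignored.
    """
    usable = len(sequence) - len(sequence) % 3
    triplets = [sequence[i:i + 3] for i in range(0, usable, 3)]
    if not triplets:
        return {}
    counts = Counter(triplets)
    total = len(triplets)
    # Relative frequency = occurrences of a triplet / total triplets.
    return {triplet: count / total for triplet, count in counts.items()}


# Hypothetical one-letter sequence (O = hydroxyproline); not from any reference.
example = "GPOGPOGPAGEK"
freqs = triplet_relative_frequencies(example)
```

(Under these assumptions, a vector of such fractions, rather than raw counts, is one possible form of the “relative frequencies” input that the combination would supply to the classifier of Concu.)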

However, the combination of Concu and Ramshaw does not appear to distinctly disclose:
the at least one physical or chemical property is selected from the group consisting of: [stiffness, elasticity, oxygen release rate, clarity, turbidity, ultraviolet blockage or absorption, viscosity, solubility, water content or hydration, resistance to protease, and ability to associate into fibrils], 
wherein a length of each of the plurality of training collagen sequences is at least 100 amino acid residues;
(c) producing one or more polynucleotides encoding the one or more collagen sequences; and 
(d) expressing, on a protein production platform, the one or more polynucleotides to produce one or more collagen molecules comprising the one or more collagen sequences.

Hou teaches
the at least one physical or chemical property is selected from the group consisting of: stiffness, elasticity, oxygen release rate, clarity, turbidity, ultraviolet blockage or absorption, viscosity, solubility, water content or hydration, resistance to protease, and ability to associate into fibrils, 
(Hou, [fig 4]; [sec Disclosure of the Invention, pp. 3-4] “The present invention provides a technical solution with respect to the deficiencies in the prior art: existing collagens, which are structurally heterologous, are usually a mixture of peptides with different lengths; since the water solubility of the peptides are different, it is difficult to obtain pure isolated peptides. For the aforementioned reasons, the existing collagens likely exhibit rejection reactions clinically, which greatly influences the application of the collagens in clinical. In particular, one technical problem to be solved by the present invention is to provide a genetic recombinant human collagen with excellent water solubility, high expression quantity and high purity.”; [sec Specific Mode for Carrying out the Present Invention, pp. 16-17] “The human collagen according to the present application has the characters of high expression quantity, excellent water solubility and high purity, which lay the foundation for the application of the genetic recombinant human-source collagen.”)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu and Ramshaw with the solubility of Hou. 
Doing so would lead to providing a genetic recombinant human collagen with excellent water solubility, high expression quantity and high purity.
(Hou, [sec Specific Mode for Carrying out the Present Invention, pp. 16-17] “The human collagen according to the present application has the characters of high expression quantity, excellent water solubility and high purity, which lay the foundation for the application of the genetic recombinant human-source collagen.”).

However, the combination of Concu, Ramshaw, and Hou does not appear to distinctly disclose:
wherein a length of each of the plurality of training collagen sequences is at least 100 amino acid residues;
(c) producing one or more polynucleotides encoding the one or more collagen sequences; and 
(d) expressing, on a protein production platform, the one or more polynucleotides to produce one or more collagen molecules comprising the one or more collagen sequences.

IWAZAWA teaches
wherein a length of each of the plurality of training collagen sequences is at least 100 amino acid residues;
(IWAZAWA, [pars 73-94] “The recombinant gelatin preferably has a repeating sequence represented by Gly-X-Y as the amino acid sequence derived from a partial amino acid sequence of collagen. This repeating sequence is a sequence characteristic to collagen. … From the standpoint of biocompatibility, the recombinant gelatin preferably contains a cell adhesion signal, and more preferably contains two or more cell adhesion signals in one molecule. Examples of the cell adhesion signal include RGD, LDV, REDV, YIGSR, PDSGR, RYVVLPR, LGTIPG, RNIAEIIKDI, IKVAV, LRE, DGEA and HAV sequences, and preferred examples include RGD, YIGSR, PDSGR, LGTIPG, IKVAV and HAV sequences. The cell adhesion signal is particularly preferably an RGD sequence. ERGD sequence is still more preferred among RGD sequences. The arrangement of RGD sequences in the recombinant gelatin may be such that the number of amino acid residues between RGDs is preferably from 0 to 100, and more preferably from 25 to 60. The RGD sequences are preferably arranged unevenly with the number of amino acid residues therebetween within such a range.”; Note that Concu teaches “a length of each of the plurality of training collagen sequences is at least [100] amino acid residues”.)

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu, Ramshaw, and Hou with the at least 100 amino acid residues of IWAZAWA. 
Doing so would lead to effectively producing recombinant collagen material having various preferred properties.
(IWAZAWA, [pars 73-94], “Examples of preferred properties that the recombinant gelatin may have include the followings: (1) not being deaminated; (2) not containing procollagen; (3) not containing telopeptide; and (4) being a substantially pure material for collagen which is prepared by nucleic acid encoding a natural collagen. The recombinant gelatin may have one of these preferred properties (1) to (4), or two or more thereof in combination.”)

However, the combination of Concu, Ramshaw, Hou, and IWAZAWA does not appear to distinctly disclose:
(c) producing one or more polynucleotides encoding the one or more collagen sequences; and 
(d) expressing, on a protein production platform, the one or more polynucleotides to produce one or more collagen molecules comprising the one or more collagen sequences.

Chang teaches
(c) producing one or more polynucleotides encoding the one or more collagen sequences; and
(Chang, [pars 3-11] “Gelatin is a derivative of collagen, a principal structural and connective protein in animals. Gelatin is derived from denaturation of collagen and contains polypeptide sequences having Gly-X-Y repeats, where X and Y are most often proline and hydroxyproline residues. These sequences contribute to triple helical structure and affect the gelling ability of gelatin polypeptides.”; [pars 18-26] “In specific embodiments, the recombinant gelatin of the present invention comprises an amino acid sequence selected from the group consisting of SEQ ID NOS: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 31, and 33. Polynucleotides encoding these amino acid sequences are also provided, as are expression vectors and host cells containing the polynucleotides. In certain aspects, the host cells of the present invention are prokaryotic or eukaryotic. In one embodiment, a eukaryotic host cell is selected from the group consisting of a yeast cell, an animal cell, an insect cell, a plant cell, and a fungal cell. The present invention further provides transgenic animals and transgenic plants comprising the polynucleotides. Recombinant gelatins comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 26, 27, 28, and 29 are also provided.”; see also [pars 52-102] for definitions; “amino acid sequences” reads on “collagen sequences”. In addition, “Polynucleotides encoding these amino acid sequences are also provided” reads on “producing one or more polynucleotides encoding the one or more collagen sequences”. Note that Concu also teaches “collagen sequences”.)

(d) expressing, on a protein production platform, the one or more polynucleotides to produce one or more collagen molecules comprising the one or more collagen sequences 
(Chang, [pars 18-26] as cited above, and “In a further aspect, the recombinant collagen is produced by co-expressing at least one polynucleotide encoding a collagen or procollagen and at least one polynucleotide encoding a collagen post-translational enzyme or subunit thereof. In a certain embodiment, the post-translational enzyme is prolyl hydroxylase.”; see also [pars 52-102] for definitions and [pars 170-195] for “Expression”; Note that Concu teaches “one or more collagen molecules comprising the one or more collagen sequences”.).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu, Ramshaw, Hou, IWAZAWA with the expression of polynucleotides of Chang. 
Doing so would provide a universal replacement material, obtained recombinantly, appropriate for use in the extraordinarily diverse spectrum of applications in which gelatin is currently used.
(Chang, pars 17-26, “The present invention solves these and other needs by providing a universal replacement material, obtained recombinantly, appropriate for use in the extraordinarily diverse spectrum of applications in which gelatin is currently used. The present materials can be designed to possess the properties and characteristics desired for particular applications, and can thus provide new properties and uses previously unavailable.”)

Regarding claim 2, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the [relative] frequencies of amino acid residues indicates intra-sequence variation of amino acid trimers in the plurality of training collagen sequences ([tables 2-3]; [figs 5-6 and 8-9]; [sec 2] “The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”).

Ramshaw further teaches 
the relative frequencies of amino acid residues indicates intra-sequence variation of amino acid trimers in the plurality of training collagen sequences ([figs 1-3]; [sec Abstract] “All formed stable triple-helices, with their melting temperature depending on the identity of the guest triplet. While including less than 10% of all possible triplets, the data set covers 50–60% of collagen sequences and provides a starting point for establishing a stability scale to predict the relative stability of important collagen regions, such as the matrix metalloproteinase cleavage site or binding sites”; [sec “Gly-X-Y TRIPLET DISTRIBUTION IN COLLAGEN”] “Their studies are extended here by analyzing triplet frequencies for a set of human sequences containing both fibril forming and nonfibrillar collagens (total of 4040 triplets). The frequency of occurrence (shown in percentages) of GlyX-Y triplets is presented in Fig. 1, where residues in the X and Y position are shown on the vertical and horizontal scales, respectively”; see also [sec “HOST–GUEST TRIPLE-HELIX PEPTIDE DESIGN AND STABILITY”]; Fig 1 reads on “relative frequencies of amino acid residues”. Note that Concu teaches “the … frequencies of amino acid residues indicates intra-sequence variation of amino acid trimers in the plurality of training collagen sequences”.);

Concu, Hou, IWAZAWA, Chang are combinable with Ramshaw for the same rationale as set forth above with respect to claim 1.

Regarding claim 3, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 2.

Concu further teaches 
the relative frequencies of amino acid residues comprise: 
(a) a frequency for each of a plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, and 
(b) a frequency for each of a plurality of different amino acids as residues at Y positions of the X-Y-Gly trimers in each training collagen sequence 
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec 2] “The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”; Note that Ramshaw teaches “relative”.).
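
For illustration only, the per-position frequencies recited in limitations (a) and (b) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Concu or Ramshaw: the function name `xy_position_frequencies` is hypothetical, and the input is assumed to be a one-letter string written in X-Y-Gly register whose length is a multiple of three.

```python
from collections import Counter


def xy_position_frequencies(sequence):
    """Return relative frequencies of residues at X and Y positions.

    Assumption (hypothetical input format): `sequence` is a one-letter
    string in X-Y-Gly register, e.g. "POGPOGPAG", where O denotes
    hydroxyproline.
    """
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    # Split the sequence into consecutive X-Y-Gly trimers.
    trimers = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    if any(t[2] != "G" for t in trimers):
        raise ValueError("every trimer must end in Gly (G)")
    n = len(trimers)
    # Count residues at the X (first) and Y (second) position of each trimer.
    x_counts = Counter(t[0] for t in trimers)
    y_counts = Counter(t[1] for t in trimers)
    # Normalize counts by the number of trimers to obtain relative frequencies.
    x_freq = {aa: c / n for aa, c in x_counts.items()}
    y_freq = {aa: c / n for aa, c in y_counts.items()}
    return x_freq, y_freq
```

Multiplying each returned value by 100 gives the frequency as a percentage, the form in which Ramshaw reports triplet occurrence in Fig. 1.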

Regarding claim 7, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the set of training data is generated using a main collagen domain with an uninterrupted (X-Y-Gly)n repeating sequence ([tables 2-3] 
; [figs 5-6 and 8-9]; [sec Abstract] “The stability of the collagen triple helix is strictly related to its amino acid sequence, especially the main Gly-X-Y motif. … We used the literature to assemble a set of 102 peptides and their relative melting temperatures were determined experimentally, indicating a great variance with the main motif of the collagen.”; [sec 1] “The stability of the helix requires a glycine every three residues i.e. a repeated motif of Gly-X-Y [13, 14], where the X and Y position can be any amino acid that can generate hydrogen bonds to stabilize the structure.”; [sec 2] “The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”; Each GXY reads on “a main collagen domain”. In addition, for example, “GPO-GPO-GPO-GPO-GPO-GPO-GPO-GPO” reads on “an uninterrupted (X-Y-Gly)n repeating sequence”.).
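
For illustration only, a check for an uninterrupted repeating sequence can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Concu: the function name `is_uninterrupted_repeat` and the `gly_index` parameter are hypothetical, and "uninterrupted" is assumed to mean that glycine occupies the same position in every consecutive trimer and that the length is a multiple of three.

```python
def is_uninterrupted_repeat(sequence, gly_index=2):
    """Return True if Gly (G) occupies `gly_index` in every trimer.

    Assumptions (hypothetical): `sequence` is a one-letter string;
    gly_index=2 corresponds to the X-Y-Gly register of the claim, and
    gly_index=0 corresponds to the Gly-X-Y register used by Concu.
    """
    if gly_index not in (0, 1, 2):
        raise ValueError("gly_index must be 0, 1, or 2")
    # An empty or partial final trimer is treated as an interruption.
    if not sequence or len(sequence) % 3 != 0:
        return False
    return all(
        sequence[i + gly_index] == "G" for i in range(0, len(sequence), 3)
    )
```

Under these assumptions, the example "GPO-GPO-GPO-GPO-GPO-GPO-GPO-GPO" cited above passes the check when written without hyphens and read in the Gly-X-Y register (`gly_index=0`).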

Regarding claim 16, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the machine learning model comprises a random forest model, a neural network model, or a general linear model ([tables 2-3]; [figs 5-6 and 8-9]; [sec Abstract] as cited above, and “This dataset was then split in two classes, stable and unstable, according to their melting temperatures and the dataset was then used to build artificial neural network (ANN) models to predict collagen stability. We built models to predict stability at temperatures of 38°C, 35°C, 30°C, and 25°C degrees, and all models had an accuracy between 82% and 92%.”; [sec 2] as cited above, and “All ANNs went through one-step testing (one training period) and later two-step testing (two training periods) of the training algorithms. In the two-step training, different algorithms were combined, including, back-propagation, Levenberg-Marquardt, quick propagation, quasi-Newton, and conjugated gradient descent. Combinations of two different methods were tested using a different number of epochs to train the ANN (ranging from 10 to 100,000 epochs). To obtain the ROC curve [93] using the ANN models we built linear neural networks (LNN) and selected the one that was most similar to our LDA final model.”; Note that “a neural network model” is elected for examination.);
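
For illustration only, the split of a peptide dataset into stable and unstable classes at each of the four temperatures quoted from Concu can be sketched as follows. This is a minimal sketch under stated assumptions, not Concu's implementation: the names `label_by_threshold` and `label_all` are hypothetical, and treating a melting temperature equal to the threshold as "stable" is an assumption, since the quoted passage does not state how ties are handled. The sketch covers only the labeling step and does not train a neural network.

```python
# Temperatures (in degrees C) at which Concu reports building models.
THRESHOLDS_C = (38.0, 35.0, 30.0, 25.0)


def label_by_threshold(tm_by_peptide, threshold_c):
    """Label each peptide 'stable' or 'unstable' at one temperature.

    Assumption (hypothetical): `tm_by_peptide` maps a peptide ID to its
    melting temperature in degrees C, and Tm >= threshold means stable.
    """
    return {
        peptide_id: ("stable" if tm >= threshold_c else "unstable")
        for peptide_id, tm in tm_by_peptide.items()
    }


def label_all(tm_by_peptide, thresholds=THRESHOLDS_C):
    """Produce one stable/unstable labeling per threshold temperature."""
    return {t: label_by_threshold(tm_by_peptide, t) for t in thresholds}
```

Each resulting labeling would then serve as the target classes for one model, consistent with the statement that the dataset was divided into a stable and unstable series for each model.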

Regarding claim 46, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the set of target data further comprises relative position data of the amino acid residues in the one or more target collagen sequences, and wherein (b) comprises identifying the one or more collagen sequences based at least in part on the relative position data ([tables 2-3] “length: length of the peptide”; [figs 5-6 and 8-9]; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model. … Table 3 shows all the peptide IDs, their Tms, and their amino acid sequence using standard one letter abbreviations. The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation. … ANN models were constructed using Statistica 6.0 [89] and all variables used were normalized to the same scale.”; e.g., “58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY” reads on “relative position”.).

Regarding claim 47, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the set of target data further comprises length data of the one or more target collagen sequences, and wherein (b) comprises identifying the one or more collagen sequences based at least in part on the length data ([tables 2-3] “length: length of the peptide”; [figs 5-6 and 8-9]; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model. … Table 3 shows all the peptide IDs, their Tms, and their amino acid sequence using standard one letter abbreviations. The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation. … ANN models were constructed using Statistica 6.0 [89] and all variables used were normalized to the same scale.”; e.g., “length: length of the peptide” reads on “length”.). 

Regarding claim 48, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the plurality of training collagen sequences comprises naturally occurring and synthetic collagen sequences ([tables 2-3]; [figs 5-6 and 8-9]; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”).

Regarding claim 49, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Ramshaw further teaches 
the relative frequencies of amino acid residues of the training collagen sequences comprise percentages of an amino acid relative to all possible amino acids at a particular position of the training collagen sequences ([figs 1-3]; [sec Abstract] “All formed stable triple-helices, with their melting temperature depending on the identity of the guest triplet. While including less than 10% of all possible triplets, the data set covers 50–60% of collagen sequences and provides a starting point for establishing a stability scale to predict the relative stability of important collagen regions, such as the matrix metalloproteinase cleavage site or binding sites”; [sec “Gly-X-Y TRIPLET DISTRIBUTION IN COLLAGEN”] “Their studies are extended here by analyzing triplet frequencies for a set of human sequences containing both fibril forming and nonfibrillar collagens (total of 4040 triplets). The frequency of occurrence (shown in percentages) of GlyX-Y triplets is presented in Fig. 1, where residues in the X and Y position are shown on the vertical and horizontal scales, respectively”; see also [sec “HOST–GUEST TRIPLE-HELIX PEPTIDE DESIGN AND STABILITY”]; Fig 1 reads on “relative frequencies of amino acid residues”. Note that Concu teaches “frequencies of amino acid residues of the training collagen sequences” and “training collagen sequences”.);

Concu, Hou, IWAZAWA, Chang are combinable with Ramshaw for the same rationale as set forth above with respect to claim 1.

Regarding claim 52, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the plurality of training collagen sequences comprises lengths of the plurality of training collagen sequences or lengths of fragments of the training collagen sequences ([tables 2-3] “length: length of the peptide”; [figs 5-6 and 8-9]; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model. … Table 3 shows all the peptide IDs, their Tms, and their amino acid sequence using standard one letter abbreviations. The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation. … ANN models were constructed using Statistica 6.0 [89] and all variables used were normalized to the same scale.”; e.g., “length: length of the peptide” reads on “lengths of the plurality of training collagen sequences”.).

Regarding claim 53, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

Concu further teaches 
the plurality of training collagen sequences comprises [gelatin] sequences ([tables 2-3]; [figs 5-6 and 8-9]; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”),

Chang further teaches 
the plurality of training collagen sequences comprises gelatin sequences ([pars 3-11] “Gelatin is a derivative of collagen, a principal structural and connective protein in animals. Gelatin is derived from denaturation of collagen and contains polypeptide sequences having Gly-X-Y repeats, where X and Y are most often proline and hydroxyproline residues. These sequences contribute to triple helical structure and affect the gelling ability of gelatin polypeptides.”; [pars 18-26] “In specific embodiments, the recombinant gelatin of the present invention comprises an amino acid sequence selected from the group consisting of SEQID NOS:15, 16, 17. 18, 19, 20, 21, 22, 23, 24, 25, 30, 31, and 33. Polynucleotides encoding these amino acid sequences are also provided, as are expression vectors and host cells containing the polynucleotides. In certain aspects, the host cells of the present invention are prokaryotic or eukaryotic. In one embodiment, a eukaryotic host cell is selected from the group consisting of a yeast cell, an animal cell, an insect cell, a plant cell, and a fungal cell. The present invention further provides transgenic animals and transgenic plants comprising the polynucleotides. Recombinant gelatins comprising an amino acid sequence selected from the group consisting of SEQID NOs: 26, 27, 28, and 29 are also provided.”; see also [pars 52-102] for definitions; Note that Concu teaches “the plurality of training collagen sequences comprises [gelatin] sequences”.);

Concu, Ramshaw, Hou, IWAZAWA are combinable with Chang for the same rationale as set forth above with respect to claim 1.

Regarding claim 58, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

The combination of Concu, Ramshaw teaches
the relative frequencies of amino acid residues comprise: (see the rejections of claim 1)

Concu further teaches 
[relative] frequencies of amino acid residues in two or more regions of each training collagen sequence.
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec Abstract] “The stability of the collagen triple helix is strictly related to its amino acid sequence, especially the main Gly-X-Y motif. … We used the literature to assemble a set of 102 peptides and their relative melting temperatures were determined experimentally, indicating a great variance with the main motif of the collagen.”; [sec 1] “The stability of the helix requires a glycine every three residues i.e. a repeated motif of Gly-X-Y [13, 14], where the X and Y position can be any amino acid that can generate hydrogen bonds to stabilize the structure.”; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model. … Table 3 shows all the peptide IDs, their Tms, and their amino acid sequence using standard one letter abbreviations. The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation. … ANN models were constructed using Statistica 6.0 [89] and all variables used were normalized to the same scale.”; Table 3 reads on “frequencies of amino acid residues in two or more regions of each training collagen sequence”.)

Ramshaw further teaches 
relative frequencies of amino acid residues in two or more regions of each training collagen sequence.
(Ramshaw, [figs 1-3]; [sec Abstract] “All formed stable triple-helices, with their melting temperature depending on the identity of the guest triplet. While including less than 10% of all possible triplets, the data set covers 50–60% of collagen sequences and provides a starting point for establishing a stability scale to predict the relative stability of important collagen regions, such as the matrix metalloproteinase cleavage site or binding sites”; [sec “Gly-X-Y TRIPLET DISTRIBUTION IN COLLAGEN”] “Their studies are extended here by analyzing triplet frequencies for a set of human sequences containing both fibril forming and nonfibrillar collagens (total of 4040 triplets). The frequency of occurrence (shown in percentages) of GlyX-Y triplets is presented in Fig. 1, where residues in the X and Y position are shown on the vertical and horizontal scales, respectively”; see also [sec “HOST–GUEST TRIPLE-HELIX PEPTIDE DESIGN AND STABILITY”]; Fig 1 reads on “relative frequencies of amino acid residues”. Note that Concu teaches “[relative] frequencies of amino acid residues in two or more regions of each training collagen sequence”.);

Concu is combinable with Ramshaw for the same rationale as set forth above with respect to claim 1.

Regarding claim 59, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 1.

The combination of Concu, Ramshaw teaches
the relative frequencies of amino acid residues comprise: (see the rejections of claim 1)

Concu further teaches 
(a) a frequency for each of a plurality of different amino acids at X positions of X-Y-Gly trimers in a first region of each training collagen sequence, 
(b) a frequency for each of a plurality of different amino acids at Y positions of X-Y-Gly trimers in the first region of each training collagen sequence, 
(c) a frequency for each of a plurality of different amino acids at X positions of X-Y-Gly trimers in a second region of each training collagen sequence, and 
(d) a frequency for each of a plurality of different amino acids at Y positions of X-Y-Gly trimers in the second region of each training collagen sequence.
(Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec Abstract] “The stability of the collagen triple helix is strictly related to its amino acid sequence, especially the main Gly-X-Y motif. … We used the literature to assemble a set of 102 peptides and their relative melting temperatures were determined experimentally, indicating a great variance with the main motif of the collagen.”; [sec 1] “The stability of the helix requires a glycine every three residues i.e. a repeated motif of Gly-X-Y [13, 14], where the X and Y position can be any amino acid that can generate hydrogen bonds to stabilize the structure.”; [sec 2] “A set of collagen peptides was retrieved from the literature [65, 66, 88], with a total of 102 sequences and a Tm range from 4°C to 47.8°C. The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model. … Table 3 shows all the peptide IDs, their Tms, and their amino acid sequence using standard one letter abbreviations. The backbone of the peptides was built using the “draw mode” of the MARCH-INSIDE® program and we performed calculations of the molecule descriptors for each peptide. We only considered covalent interactions (peptide bond) and hydrogen bonding interactions, so the –OH group of the hydroxyproline was included in the calculation. … ANN models were constructed using Statistica 6.0 [89] and all variables used were normalized to the same scale.”; Table 3 reads on “a frequency for each of a plurality of different amino acids at X positions of X-Y-Gly trimers in a first region of each training collagen sequence” and “Y positions” and “second region”.)

In the alternative, Ramshaw can also be interpreted to teach the following limitation:
Ramshaw further teaches 
(a) a frequency for each of a plurality of different amino acids at X positions of X-Y-Gly trimers in a first region of each training collagen sequence, 
(b) a frequency for each of a plurality of different amino acids at Y positions of X-Y-Gly trimers in the first region of each training collagen sequence, 
(c) a frequency for each of a plurality of different amino acids at X positions of X-Y-Gly trimers in a second region of each training collagen sequence, and 
(d) a frequency for each of a plurality of different amino acids at Y positions of X-Y-Gly trimers in the second region of each training collagen sequence.
 (Ramshaw, [figs 1-3]; [sec Abstract] “All formed stable triple-helices, with their melting temperature depending on the identity of the guest triplet. While including less than 10% of all possible triplets, the data set covers 50–60% of collagen sequences and provides a starting point for establishing a stability scale to predict the relative stability of important collagen regions, such as the matrix metalloproteinase cleavage site or binding sites”; [sec “Gly-X-Y TRIPLET DISTRIBUTION IN COLLAGEN”] “Their studies are extended here by analyzing triplet frequencies for a set of human sequences containing both fibril forming and nonfibrillar collagens (total of 4040 triplets). The frequency of occurrence (shown in percentages) of GlyX-Y triplets is presented in Fig. 1, where residues in the X and Y position are shown on the vertical and horizontal scales, respectively”; see also [sec “HOST–GUEST TRIPLE-HELIX PEPTIDE DESIGN AND STABILITY”]; Fig 1 reads on “frequency for each of a plurality of different amino acids”.);

Concu is combinable with Ramshaw for the same rationale as set forth above with respect to claim 1.
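
For illustration only, the four region-specific frequency sets recited in limitations (a) through (d) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Concu or Ramshaw: the names `region_xy_frequencies` and `split_trimer` are hypothetical, the input is assumed to be a one-letter string in X-Y-Gly register, and dividing the sequence into two contiguous regions at a chosen trimer index is an assumption about how the "first region" and "second region" are defined.

```python
from collections import Counter


def _position_freq(trimers, index):
    """Relative frequency of each residue at one position of the trimers."""
    counts = Counter(t[index] for t in trimers)
    total = len(trimers)
    return {aa: c / total for aa, c in counts.items()}


def region_xy_frequencies(sequence, split_trimer):
    """Return X- and Y-position frequencies for two contiguous regions.

    Assumptions (hypothetical): the first region is trimers
    [0, split_trimer) and the second region is the remainder.
    """
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    trimers = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    if not 0 < split_trimer < len(trimers):
        raise ValueError("split_trimer must leave both regions non-empty")
    first, second = trimers[:split_trimer], trimers[split_trimer:]
    return {
        "first_X": _position_freq(first, 0),    # limitation (a)
        "first_Y": _position_freq(first, 1),    # limitation (b)
        "second_X": _position_freq(second, 0),  # limitation (c)
        "second_Y": _position_freq(second, 1),  # limitation (d)
    }
```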

Claims 4-5 and 51 are rejected under 35 U.S.C. 103 as being unpatentable over Concu et al. (Review of Computer-Aided Models for Predicting Collagen Stability), in view of Ramshaw et al. (Gly-X-Y Tripeptide Frequencies in Collagen: A Context for Host–Guest Triple-Helical Peptides), further in view of Hou et al. (AU 2016101562 A4), further in view of IWAZAWA et al. (US 2013/0084638 A1), further in view of Chang et al. (US 2009/0143568 A1), and further in view of Harun et al. (WO 2017/079272 A2).

Regarding claim 4, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 3.

Concu further teaches 
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, comprises (see the rejections of claim 3)

However, the combination of Concu, Ramshaw, Hou, IWAZAWA, Chang does not appear to distinctly disclose
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, comprises 20 standard amino acids naturally occurring in organisms.

Harun teaches
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, comprises 20 standard amino acids naturally occurring in organisms 
(Harun [pars 37-98] “The term "amino acid" refers to naturally occurring and non-naturally occurring amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally encoded amino acids are the 20 common amino acids (alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine) and pyrolysine and selenocysteine.”).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu, Ramshaw, Hou, IWAZAWA, Chang with the 20 common amino acids of Harun. Doing so would enable modification of each amino acid after it has been incorporated into a polypeptide chain (Harun, pars 37-98).

In the alternative, Concu can also be interpreted to teach this limitation:
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, comprises 20 standard amino acids naturally occurring in organisms 
(Concu [tables 2-3]; [figs 5-6 and 8-9]; [sec 1] “In the classical structure, the X and Y positions are occupied by proline (Pro-P), although the Pro in the Y position can be hydroxylated to hydroxyproline (Hyp-O) which is not one of the 20 essential amino acids.” [sec 2] “The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”; For example, Hong et al. (Amino Acids as Precursors of Trihalomethane and Haloacetic Acid Formation During Chlorination) teaches “20 essential amino acids” in [sec Materials and Methods], and Anne teaches “20 Essential Amino Acids” in the human body.).

Regarding claim 5, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang, Harun teaches claim 4.

Concu further teaches 
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, comprises (see the rejections of claim 3)

Harun further teaches 
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, comprises one or more post-translational modifications of the 20 standard amino acids 
([pars 37-98] “The term "amino acid" refers to naturally occurring and non-naturally occurring amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally encoded amino acids are the 20 common amino acids (alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine) and pyrolysine and selenocysteine. … The term "non-naturally encoded amino acid" also includes, but is not limited to, amino acids that occur by modification (e.g. post-translational modifications) of a naturally encoded amino acid (including but not limited to, the 20 common amino acids or pyrolysine and selenocysteine).”).

Regarding claim 51, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, Chang teaches claim 3.

Concu further teaches 
the plurality of different amino acids as residues at X positions of the X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of the X-Y-Gly trimers in each training collagen sequence, or both, comprises (see the rejections of claim 3)

However, the combination of Concu, Ramshaw, Hou, IWAZAWA, and Chang does not appear to distinctly disclose
the plurality of different amino acids as residues at X positions of the X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of the X-Y-Gly trimers in each training collagen sequence, or both, comprises a subset of 20 standard amino acids naturally occurring in organisms.

Harun teaches
the plurality of different amino acids as residues at X positions of the X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of the X-Y-Gly trimers in each training collagen sequence, or both, comprises a subset of 20 standard amino acids naturally occurring in organisms.
(Harun, [pars 37-98] “The term "amino acid" refers to naturally occurring and non-naturally occurring amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally encoded amino acids are the 20 common amino acids (alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine) and pyrolysine and selenocysteine.”).

Concu, Ramshaw, Hou, IWAZAWA, and Chang are combinable with Harun for the same rationale as set forth above with respect to claim 4.

In the alternative, Concu can also be interpreted to teach this limitation:
the plurality of different amino acids as residues at X positions of the X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of the X-Y-Gly trimers in each training collagen sequence, or both, comprises a subset of 20 standard amino acids naturally occurring in organisms.
 (Concu, [tables 2-3]; [figs 5-6 and 8-9]; [sec 1] “In the classical structure, the X and Y positions are occupied by proline (Pro-P), although the Pro in the Y position can be hydroxylated to hydroxyproline (Hyp-O) which is not one of the 20 essential amino acids.” [sec 2] “The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”; For example, Hong et al. (Amino Acids as Precursors of Trihalomethane and Haloacetic Acid Formation During Chlorination) teaches “20 essential amino acids” in [sec Materials and Methods], and Anne teaches “20 Essential Amino Acids” in the human body.).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Concu et al. (Review of Computer-Aided Models for Predicting Collagen Stability), in view of Ramshaw et al. (Gly-X-Y Tripeptide Frequencies in Collagen: A Context for Host–Guest Triple-Helical Peptides), further in view of Hou et al. (AU 2016101562 A4), further in view of IWAZAWA et al. (US 2013/0084638A1), further in view of Chang et al. (US 2009/0143568 A1), and further in view of Chopra et al. (WO 2017/180902 A1).
	
Regarding claim 6, 
The combination of Concu, Ramshaw, Hou, IWAZAWA, and Chang teaches claim 3.

Concu further teaches 
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, [consists of] a subset of 20 standard amino acids [and one or more post-translationally modified amino acids] 
([tables 2-3]; [figs 5-6 and 8-9]; [sec 1] “In the classical structure, the X and Y positions are occupied by proline (Pro-P), although the Pro in the Y position can be hydroxylated to hydroxyproline (Hyp-O) which is not one of the 20 essential amino acids.” [sec 2] “The sequence diversity was very high and 58 peptides varied only in the middle of the peptide where the normal sequence GPO was substituted for random structures GXY, where X and Y were essential amino acids. The remaining 48 peptides varied throughout the chain, where the normal sequence GPO was substituted for a random structure GXY. In order to construct ANN models, the dataset was divided into a stable and unstable series for each model.”; For example, Hong et al. (Amino Acids as Precursors of Trihalomethane and Haloacetic Acid Formation During Chlorination) teaches “20 essential amino acids” in [sec Materials and Methods], and Anne teaches “20 Essential Amino Acids” in the human body.).

However, the combination of Concu, Ramshaw, Hou, IWAZAWA, and Chang does not appear to distinctly disclose
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, consists of a subset of 20 standard amino acids and one or more post-translationally modified amino acids.
 
Chopra teaches
the plurality of different amino acids as residues at X positions of X-Y-Gly trimers in each training collagen sequence, the plurality of different amino acids as residues at Y positions of X-Y-Gly trimers in each training collagen sequence, or both, consists of a subset of 20 standard amino acids and one or more post-translationally modified amino acids 
(Chopra, [pars 93-109] “The present invention, in many aspects, relies on the synthesis of peptides and polypeptides in cyto, via transcription and translation of appropriate polynucleotides. These peptides and polypeptides will include the twenty "natural" amino acids, and post-translational modifications thereof.”; “twenty ‘natural’ amino acids” reads on “20 standard amino acids”.).

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the collagen property prediction system of Concu, Ramshaw, Hou, IWAZAWA, and Chang with the 20 natural amino acids and post-translational modifications of Chopra. Doing so would enable the synthesis of peptides and polypeptides via translation of appropriate polynucleotides (Chopra, pars 93-109).

Claims 44 and 54 are rejected under 35 U.S.C. 103 as being unpatentable over Concu et al. (Review of Computer-Aided Models for Predicting Collagen Stability), in view of Ramshaw et al. (Gly-X-Y Tripeptide Frequencies in Collagen: A Context for Host–Guest Triple-Helical Peptides), further in view of Hou et al. (AU 2016101562 A4), and further in view of IWAZAWA et al. (US 2013/0084638A1).

Regarding claim 44,
Claim 44 is a system claim corresponding to method claim 1, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 1.
Note that Concu teaches a processor and memory (“ANN models were constructed using Statistica 6.0” reads on “one or more processors; system memory; and one or more computer-readable storage media” since Statistica 6.0 is a data analysis software system which runs on a computer system.).

Regarding claim 54, 
The combination of Concu, Ramshaw, Hou, and IWAZAWA teaches claim 44.

Claim 54 is a system claim corresponding to method claim 49, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejection of claim 49.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Jackson et al. (Amino-acid site variability among natural and designed proteins) teaches amino-acid frequencies in designed and natural proteins.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409.  The examiner can normally be reached on Mon - Thu 7:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley, can be reached at (303) 297-4307.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S.K./Examiner, Art Unit 2129
9/2/2022



/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129