DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application is being examined under the pre-AIA  first to invent provisions. 
Claim Status
	Claims 1-29 are pending and examined in the following Office action. 

Claim Rejections - 35 USC § 112
Indefiniteness
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-29 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The claims are directed to introducing an ectodomain derived from a virus of Coronaviridae. The claims are rendered indefinite because it is unclear how much of the ectodomain need be derived from Coronaviridae. For example, is a single nucleotide or single amino acid sufficient for derivation? Does the claim require that the VLP is a recognized portion of a coronavirus? Given the competing interpretations, the metes and bounds of the claimed invention cannot be determined as claimed. All dependent claims are included in this rejection. 

Claim 11 is directed to the nucleotide sequence defined in SEQ ID NO: 41. SEQ ID NO: 41 is not a nucleotide sequence. Does Applicant intend to refer to a different sequence? Given the discrepancy, the metes and bounds of the claimed invention cannot be determined as claimed. 



The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.


Written Description
Claims 1, 3-5, 8-10, 12-30 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA  the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Nicotiana benthamiana whole plants were placed upside down in the infiltration medium and subjected to vacuum pressure at 20-40 Torr for 2 minutes. Plants were allowed to incubate over the course of 2-6 days until harvest and protein extraction (pp. 49-51). 
The Applicants do not describe the full scope of transformation methods, nor the full scope of regulatory regions active in a plant that lead to chimeric VLP formation. As can be appreciated by one of ordinary skill in the art, “regulatory region” is a broad class of sequences, including but not limited to promoters, enhancers, and UTRs. 
The Federal Circuit has clarified the application of the written description requirement to inventions in the field of biotechnology.  The court stated that, “A description of a genus of cDNAs may be achieved by means of a recitation of a representative number of cDNAs, defined by nucleotide sequence, falling within the scope of the genus or of a recitation of structural features common to the members of the genus, which features constitute a substantial portion of the genus.” University of California v. Eli Lilly and Co., 119 F.3d 1559; 43 USPQ2d 1398, 1406 (Fed. Cir. 1997).
The essential feature is the production of chimeric VLPs via expression of a nucleic acid. 
For example, Spitsin et al (Vaccine, 27, pp. 1289-1292, 2009) teach that attempts at producing the H5 protein or its HA1 domain through transformation of N. benthamiana were reported unsuccessful because of poor yields (see Abstract). Spitsin teaches general methods of generating transgenic plants using standard Agrobacterium-mediated stable transformation of tobacco plants (see pp. 1289-1290; Materials and Methods). Spitsin also teaches that initial attempts to express the entire H5 protein or its HA1 domain in plants yielded minor quantities of the recombinant material by stable and/or transient transformation procedures. It is further noted that Spitsin used the MagnICON vector for transient production of influenza hemagglutinin. At the time, the MagnICON vector system was believed to be among the highest-expressing expression systems known. Furthermore, the teachings of Spitsin provide further evidence that a particular combination of expression construct, including its regulatory elements, is required in order to achieve the production of VLPs in plants. 
In addition, Cardineau et al (US20050048074) teach that full length hemagglutinin expressed in a plant recombinant vector did not result in particle accumulation. Cardineau teaches that general transformation methods were used via Agrobacterium (see Fig. 2; see paragraph 0166). 
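Both Spitsin and Cardineau teach a methodology to broadly express influenza hemagglutinin in plants. However, neither Spitsin nor Cardineau observed the formation of a VLP upon recombinant expression of a full length hemagglutinin gene, let alone a chimeric hemagglutinin coding sequence. 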
D'Aoust et al (Plant Biotechnology Journal, 2008, 6: 930-940) teach that only non-enveloped VLPs had been previously produced with the exception of hepatitis B surface antigen (pg. 931, left column). D’Aoust also states that previous efforts at producing influenza HA in plants or plant cells have been reported but in none of these studies did the assembly of a VLP occur (paragraph bridging pp. 934-935).
D’Aoust also generates constructs containing either H1 or H5, as well as the plastocyanin promoter, plastocyanin 5’-UTR, and plastocyanin 3’-UTR. See Figure 1. However, D’Aoust modifies the construct encoding H1 to include a PDI signal peptide for VLP formation. Thus, it is clear that the formation of VLPs is complex: prior success with a given regulatory element does not reasonably predict success with that same regulatory element. Furthermore, Applicant has not described a genus of regulatory elements in a way that would allow one of skill in the art to distinguish the regulatory elements that will lead to VLP formation from those that will not. That is to say, Applicant has not described any specific requirement (e.g. required expression levels for VLP formation; required structural elements in a regulatory region for expression; etc.), let alone features common to the members of the broadly claimed genus of regulatory elements, to adequately describe the scope currently claimed. 
Eli Lilly.  Furthermore, given the lack of description of the necessary elements essential for chimeric VLP formation from said transformation methods and regulatory elements, it remains unclear what features identify regulatory elements and transformation methods capable of such activity.  Since the genus of regulatory elements has not been described by specific structural features, the specification fails to provide an adequate written description to support the breadth of the claims.  
(See Written Description guidelines published in 2008 online at http://www.uspto.gov/web/menu/written.pdf).



Scope of Enablement
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a)  IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same,  and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 3-5, 8-10, 12-30 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, because the specification, while being enabling for agroinfiltration based expression of an expression construct containing the 2x35S promoter, the PDI signal peptide, CPMV-HT, and H3 or H5 from influenza hemagglutinin, does not reasonably provide enablement for stable expression of any expression construct minimally comprising any regulatory region to make a VLP.  The specification does not enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the invention commensurate in scope with these claims. 
The claimed invention is not supported by an enabling disclosure taking into account the Wands factors.  In re Wands, 858 F.2d 731, 8 USPQ2d 1400 (Fed. Cir. 1988).  In re Wands lists a number of factors for determining whether or not undue experimentation would be required by one skilled in the art to make and/or use the invention.  These factors are:  the quantity of experimentation necessary, the amount of direction or guidance presented, the presence or absence of working examples of the invention, the nature of the invention, the state of the prior art, the relative skill of those in the art, the predictability or unpredictability of the art, and the breadth of the claim.
The claims are broadly drawn to a method of producing a chimeric VLP comprising stably introducing a nucleic acid construct encoding said chimeric VLP.
Applicants teach several constructs expressing chimeric VLPs where the ectodomain is from HIV, Rabies, SARS, Varicella Zoster Virus (VZV), and Ebola. All constructs contained the alfalfa PDI signal peptide fused to the ectodomain cloned into a 2X35S-CPMV-HT expression system. The constructs were transfected into Agrobacterium and grown on media. Nicotiana benthamiana whole plants were placed upside down in the infiltration medium and subjected to vacuum pressure at 20-40 Torr for 2 minutes. Plants were allowed to incubate over the course of 2-6 days until harvest and protein extraction (pp. 49-51). Applicants teach chimeric VLPs generated with only H3 and H5 (Examples 1 and 3). Applicants teach that only utilization of a non-native signal peptide led to the formation of chimeric VLPs (Examples 1 and 3; constructs 997 and 999). 
The Applicants do not teach the full scope of transformation methods and regulatory regions leading to VLP formation.
For example, Spitsin et al (Vaccine, 27, pp. 1289-1292, 2009) teach that attempts at producing the H5 protein or its HA1 domain through transformation of N. benthamiana were reported unsuccessful because of poor yields (see Abstract). Spitsin teaches general methods of generating transgenic plants using standard Agrobacterium-mediated stable transformation of tobacco plants (see pp. 1289-1290; Materials and Methods). Spitsin also teaches that initial attempts to express the entire H5 protein or its HA1 domain in plants yielded minor quantities of the recombinant material by stable and/or transient transformation procedures. It is further noted that Spitsin used the MagnICON vector for transient production of influenza hemagglutinin. At the time, the MagnICON vector system was believed to be among the highest-expressing expression systems known. Furthermore, the teachings of Spitsin provide further evidence that a particular combination of expression construct, including its regulatory elements, is required in order to achieve the production of VLPs in plants. 

Both Spitsin and Cardineau teach a methodology to broadly express influenza hemagglutinin in plants. However, neither Spitsin nor Cardineau observed the formation of a VLP upon recombinant expression of a full length hemagglutinin gene, let alone a chimeric hemagglutinin coding sequence. Spitsin recovered several hemagglutinin polypeptides expressed full-size mature hemagglutinin without viral leader peptide (see Abstract; see pg. 1289; see Fig. 1). Cardineau teaches the formation of immunoprotective particles using the HN protein of Newcastle Disease Virus (see Abstract). However, Cardineau is silent regarding the formation of VLPs in influenza hemagglutinin. 
D'Aoust et al (Plant Biotechnology Journal, 2008, 6: 930-940) teach that only non-enveloped VLPs had been previously produced with the exception of hepatitis B surface antigen (pg. 931, left column). D’Aoust also states that previous efforts at producing influenza HA in plants or plant cells have been reported but in none of these studies did the assembly of a VLP occur (paragraph bridging pp. 934-935).
D’Aoust also generates constructs containing either H1 or H5, as well as the plastocyanin promoter, plastocyanin 5’-UTR, and plastocyanin 3’-UTR. See Figure 1. Critically, D’Aoust required the use of a modified signal peptide to achieve proper expression of H1. See Figure 1, “SP PDI.” Thus, it is clear that the formation of VLPs is complex: prior success with a given regulatory element does not reasonably predict success with that same regulatory element. Other regulatory elements may need to be present for proper formation of 
Given the lack of guidance in the instant specification, undue trial and error experimentation would be required for one of skill in the art to generate VLPs. As of Applicant’s filing date, the only example of an influenza VLP forming was via transient agroinfiltration using a specific expression construct containing the 2x35S promoter, spPDI, CPMV-HT, and H3 or H5 influenza hemagglutinin. No evidence has been provided such that one of skill would have been able to conclude that Applicant was enabled for the broadly claimed genus.
Therefore, given the breadth of the claims; the lack of guidance and working examples; the unpredictability in the art; and the state-of-the-art as discussed above, undue experimentation would be required to make and use the claimed invention, and therefore, the invention is not enabled throughout the broad scope of the claims.


Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under pre-AIA  35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 7-23, and 26-29 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over D’Aoust et al (Plant Biotechnology Journal, 2008, 6(9): 930-940) in view of Smith et al (WO 2008005777 A2); GenBank Accession No. ABP51969.1 (published online May 2008; appended to the action) and UniProt HEMA_I59A0 (published online November 2008; appended to the action).
The claims are drawn to a VLP and a method of producing a VLP comprising the S protein from SARS virus and the transmembrane domain and cytoplasmic tail (TM/CT) from influenza hemagglutinin H5, the method comprising transiently expressing a nucleic acid encoding a chimeric protein comprising the S protein from SARS virus fused to the TM/CT domain of influenza H5 in a plant, and purifying the VLP from the plant. The claims also require that no matrix or core protein be concurrently expressed with the chimeric nucleic acid; that the VLP contain plant-specific N-glycans; and that the composition comprise a pharmaceutically acceptable carrier. The claims also require that the composition comprise an effective dose of the VLP for inducing an immune response. 
D’Aoust teaches a method of producing a VLP in a plant comprising recombinantly expressing a nucleic acid encoding influenza hemagglutinin (Fig. 1) from H5N1 (A/Indonesia/05/2005) and H1N1 (A/New Caledonia/20/99) by agroinfiltration, a transient type of expression (pg. 936, left column). D’Aoust teaches that VLPs were derived from the plant plasma membrane and thus containing plant derived lipids (Abstract). D’Aoust teaches three 
D’Aoust teaches harvesting of H5-producing leaves of N. benthamiana and purifying VLPs using an affinity column (pg. 938, VLP purification). 
D’Aoust teaches a composition comprising PBS and H5 VLPs and administering said composition to mice wherein the administered VLP was pelleted and re-suspended in 100 mM PBS (pg. 938). PBS is a pharmaceutically acceptable carrier. When administered a lethal dose of influenza, mice immunized with the plant made influenza VLP survived. Thus, D'Aoust teaches administering a composition comprising a pharmaceutically acceptable carrier and an effective dose of the VLP. 
D'Aoust teaches that M1 reduced the overall accumulation in leaf tissue. D’Aoust also shows that H5 and H1 alone had improved VLP expression relative to H1 and M1 alone. See page 932, Assessment of VLP formation. See also Figure 3. D’Aoust also states that previous work established that the M1 protein was dispensable for VLP formation and that the coexpression of HA and M1 in plants led to a substantial decrease in HA accumulation which led to a decrease in VLP accumulation. See page 935, paragraph bridging left and right columns. 
D’Aoust teaches the inclusion of the PDI signal peptide. See Figure 1. 
D’Aoust teaches that inclusion of M1 occurred because it was believed that M1 is a prerequisite for the formation of influenza VLPs in insect cells. D’Aoust also teaches that co-expression of M1 with H1 reduced the overall accumulation of H1 in the leaf tissue. 

D’Aoust does not teach a chimeric H5 influenza VLP containing the S protein from SARS, wherein the ectodomain or fragment thereof is derived from an env protein. D'Aoust does not teach a chimeric H5 influenza VLP of the invention eliciting an immune response. D’Aoust does not teach the location of the influenza ectodomain or transmembrane domain and cytoplasmic tail within the H5 protein.
Smith teaches their method for increasing glycoprotein incorporation on the surface of VLPs comprising expressing a nucleic acid encoding a chimeric glycoprotein that comprises the transmembrane domain and carboxyl terminal tail of influenza hemagglutinin and a viral glycoprotein. See page 2, lines 25-31. Smith teaches that the viral glycoproteins may comprise S protein from a coronavirus such as SARS. See pages 7-8. 
Smith teaches a chimeric viral glycoprotein comprising a glycoprotein as well as the TM/CT domain from H3 influenza hemagglutinin. See Figure 1. 
Smith teaches their chimeric VLP composition comprising a pharmaceutically acceptable carrier. See claim 23. 
UniProt HEMA_I59A0 teaches the location of the transmembrane domain and cytoplasmic tail (see accession provided below). 
At the time the invention was made, it would have been prima facie obvious and within the scope of one of ordinary skill in the art to substitute the chimeric VLP as taught by Smith with the method of producing an influenza HA VLP in transgenic plants via transient agroinfiltration as taught by D’Aoust. One of ordinary skill in the art would have been motivated 
Given that Smith utilizes M1 for VLP formation and combined with the knowledge provided by D’Aoust that M1 is not required for VLP formation in plants, it would have been prima facie obvious to generate a chimeric influenza VLP without M1 in plants. Furthermore, because Smith teaches that the TM/CT region from influenza HA associates with M1 as well as other viral cores such as HIV Gag, one of ordinary skill in the art would have been motivated to omit core proteins from the plant produced chimeric VLP, especially in view of D’Aoust. The ordinary artisan would have understood that the influenza HA TM/CT domain would associate with the corresponding matrix protein from influenza (M1), and despite Smith showing that HIV Gag functions as a matrix protein with influenza HA TM/CT, one of ordinary skill in the art would not have been motivated to utilize any matrix protein in light of D’Aoust showing that the native matrix protein for the production of plant produced VLPs is not required. 
One of ordinary skill in the art would have been motivated to use the plant expression system of D’Aoust because D'Aoust teaches that none of the recombinant platforms currently available compare with the plant-based transient expression in terms of speed and cost and also 
Given that the S protein in SARS is a glycoprotein, and that plants such as Nicotiana benthamiana produce glycoproteins (pg. 932, right column), one of ordinary skill in the art would have recognized that the VLP must necessarily contain plant-specific N-glycans. 

Claims 5 and 24 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over D’Aoust et al (Plant Biotechnology Journal, 2008, 6(9): 930-940) in view of Smith et al (WO 2008005777 A2); GenBank Accession No. ABP51969.1 (published online May 2008; appended to the action) and UniProt HEMA_I59A0 (published online November 2008; appended to the action) as applied to claims 1-4, 7-23, and 26-29 above, and further in view of GenBank P59594.1 (published online June 2008; appended below).
The teachings of D’Aoust, Smith, GenBank, and UniProt have been discussed above. 
D’Aoust, Smith, GenBank, and UniProt do not teach a sequence having at least 70% identity to SEQ ID NO: 31. 
P59594.1 has 97% identity to SEQ ID NO: 31. P59594.1 teaches that the TMD/CT domain corresponds to 1196-1255. See accession below. 
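As a rough illustration of how a percent-identity figure such as the one above can be compared against a claimed threshold of at least 70%, the following is a minimal sketch. It assumes the two sequences have already been aligned to equal length, with gaps written as "-". The function names, the toy sequences, and the convention of counting identity over all aligned columns are assumptions made for illustration only; they are not the alignment method used in this action, and the toy sequences are not SEQ ID NO: 31 or P59594.1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' marks a gap).
    A column with a gap in either sequence is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the percent identity is at or above the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences differing at one of ten positions.
query = "MFIFLLFLTL"
subject = "MFIFLLFLSL"
print(percent_identity(query, subject))
print(meets_threshold(query, subject))
```

Other conventions exist (for example, dividing by the length of the shorter sequence or excluding gapped columns), and the reported percentage can differ depending on which is chosen.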

Claims 6 and 25 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over D’Aoust et al (Plant Biotechnology Journal, 2008, 6(9): 930-940) in view of Smith et al (WO 2008005777 A2); GenBank Accession No. ABP51969.1 (published online May 2008; appended to the action) and UniProt HEMA_I59A0 (published online November 2008; appended to the action) as applied to claims 1-4, 7-23, and 26-29 above, and further in view of GenBank AY278741.1 (published online October 2005; appended below).
The teachings of D’Aoust and Smith have been discussed above. 
D’Aoust and Smith do not teach a sequence having at least 70% identity to SEQ ID NO: 28. 
GenBank AY278741.1 teaches a sequence having 100% identity to the instant SEQ ID NO: 28. See alignment below. 
At the time the invention was made, it would have been obvious and within the scope of one of ordinary skill in the art to utilize the sequence disclosed in SEQ ID NO: 28 to generate a chimeric VLP. The ordinary artisan would have been motivated to do so because it is well known that a nucleic acid is necessary to express a protein in a plant cell and because Smith 

Conclusion
	No claim is allowed. 

Examiner’s Contact Information
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEPHEN G UYENO whose telephone number is (571)272-3041.  The examiner can normally be reached on 10:00-4:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Joe Zhou, can be reached on (571)272-0724.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/STEPHEN UYENO/
Primary Examiner, Art Unit 1662


	



	
Sequence Listing
LOCUS       SPIKE_CVHSA             1255 aa            linear   VRL 10-JUN-2008
DEFINITION  Spike glycoprotein precursor (S glycoprotein) (Peplomer protein)
            (E2) [Contains: Spike protein S1; Spike protein S2].
ACCESSION   P59594
VERSION     P59594.1
DBSOURCE    UniProtKB: locus SPIKE_CVHSA, accession P59594;
            class: standard.
            extra accessions:Q6QU82,Q7T696,Q7TA19,Q7TFA2,Q7TFB1,Q80BV6
            created: Apr 23, 2003.
            sequence updated: Apr 23, 2003.
            annotation updated: Jun 10, 2008.
            xrefs: AY278741.1, AAP13441.1, AY274119.3, AAP41037.1, AY282752.2,
            AAP30713.1, AY278554.2, AAP13567.1, AY278491.2, AY304495.1,
            AY304492.1, AY278487.3, AY278488.2, AAP30030.1, AY278490.3,
            AY279354.2, AY278489.2, AAP51227.1, AY283794.1, AY283795.1,
            AY283796.1, AY283797.1, AY283798.2, AY291451.1, AAP37017.1,
            AY310120.1, AAP50485.1, AY291315.1, AAP33697.1, AY304486.1,
            AY321118.1, AY323976.1, AAP73417.1, AY322207.1, AAP82968.1,
            AY338174.1, AAQ01597.1, AY338175.1, AAQ01609.1, AY348314.1,
            AAP97882.1, AP006557.1, BAC81348.1, AP006558.1, BAC81362.1,
            AP006559.1, BAC81376.1, AP006560.1, BAC81390.1, AP006561.1,
            BAC81404.1, AY323977.2, AAP72986.1, AY362698.1, AY362699.1,
            AY427439.1, AAQ94060.1, AY463059.1, AAR86788.1, AY525636.1,
            AAS10463.1, 1Q4Z_A, 1T7G_A, 1T7G_C, 1T7G_E, 1T7G_B, 1T7G_D, 1T7G_F,
            1U4K_D, 1WNC_A, 1WNC_B, 1WNC_C, 1WNC_D, 1WNC_E, 1WNC_F, 1WYY_A,
            1WYY_B, 1XJP_A, 1ZV7_B, 1ZV8_A, 1ZV8_C, 1ZV8_E, 1ZV8_G, 1ZV8_I,
            1ZV8_K, 1ZV8_B, 1ZV8_D, 1ZV8_F, 1ZV8_H, 1ZV8_J, 1ZV8_L, 1ZVB_A,
            1ZVB_B, 1ZVB_C, 2AJF_E, 2AJF_F, 2BEQ_A, 2BEQ_B, 2BEQ_C, 2BEQ_D,
            2BEQ_E, 2BEQ_F, 2BEZ_C, 2BEZ_F, 2DD8_S, 2FXP_A, 2FXP_B, 2FXP_C,
            2GHV_C, 2GHV_E, 2GHW_A, 2GHW_C
            xrefs (non-sequence databases): PDBsum:1Q4Z, PDBsum:1T7G,
            PDBsum:1U4K, PDBsum:1WNC, PDBsum:1WYY, PDBsum:1XJP, PDBsum:1ZV7,
            PDBsum:1ZV8, PDBsum:1ZVB, PDBsum:2AJF, PDBsum:2BEQ, PDBsum:2BEZ,
            PDBsum:2DD8, PDBsum:2FXP, PDBsum:2GHV, PDBsum:2GHW, DIP:29105N,
            GO:0019031, GO:0005198, GO:0009405, GO:0019089, InterPro:IPR002552,
            Pfam:PF01601
KEYWORDS    3D-structure; Coiled coil; Complete proteome; Envelope protein;
            Glycoprotein; Host-virus interaction; Lipoprotein; Membrane;
            Palmitate; Signal; Transmembrane; Virion; Virulence.
SOURCE      SARS coronavirus
  ORGANISM  SARS coronavirus
            Viruses; ssRNA positive-strand viruses, no DNA stage; Nidovirales;
            Coronaviridae; Coronavirus; Coronavirus group 2; Coronavirus group
            2b.
REFERENCE   1  (residues 1 to 1255)
  AUTHORS   Rota,P.A., Oberste,M.S., Monroe,S.S., Nix,W.A., Campagnoli,R.,
            Icenogle,J.P., Penaranda,S., Bankamp,B., Maher,K., Chen,M.H.,
            Tong,S., Tamin,A., Lowe,L., Frace,M., DeRisi,J.L., Chen,Q.,
            Wang,D., Erdman,D.D., Peret,T.C., Burns,C., Ksiazek,T.G.,
            Rollin,P.E., Sanchez,A., Liffick,S., Holloway,B., Limor,J.,
            McCaustland,K., Olsen-Rasmussen,M., Fouchier,R., Gunther,S.,
            Osterhaus,A.D., Drosten,C., Pallansch,M.A., Anderson,L.J. and
            Bellini,W.J.
  TITLE     Characterization of a novel coronavirus associated with severe
            acute respiratory syndrome
  JOURNAL   Science 300 (5624), 1394-1399 (2003)

  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Urbani
REFERENCE   2  (residues 1 to 1255)
  AUTHORS   Marra,M.A., Jones,S.J., Astell,C.R., Holt,R.A., Brooks-Wilson,A.,
            Butterfield,Y.S., Khattra,J., Asano,J.K., Barber,S.A., Chan,S.Y.,
            Cloutier,A., Coughlin,S.M., Freeman,D., Girn,N., Griffith,O.L.,
            Leach,S.R., Mayo,M., McDonald,H., Montgomery,S.B., Pandoh,P.K.,
            Petrescu,A.S., Robertson,A.G., Schein,J.E., Siddiqui,A.,
            Smailus,D.E., Stott,J.M., Yang,G.S., Plummer,F., Andonov,A.,
            Artsob,H., Bastien,N., Bernard,K., Booth,T.F., Bowness,D., Czub,M.,
            Drebot,M., Fernando,L., Flick,R., Garbutt,M., Gray,M., Grolla,A.,
            Jones,S., Feldmann,H., Meyers,A., Kabani,A., Li,Y., Normand,S.,
            Stroher,U., Tipples,G.A., Tyler,S., Vogrig,R., Ward,D., Watson,B.,
            Brunham,R.C., Krajden,M., Petric,M., Skowronski,D.M., Upton,C. and
            Roper,R.L.
  TITLE     The Genome sequence of the SARS-associated coronavirus
  JOURNAL   Science 300 (5624), 1399-1404 (2003)
   PUBMED   12730501
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Tor2
REFERENCE   3  (residues 1 to 1255)
  AUTHORS   Tsui,S.K., Chim,S.S. and Lo,Y.M.
  CONSRTM   Chinese University of Hong Kong Molecular SARS Research Group
  TITLE     Coronavirus genomic-sequence variations and the epidemiology of the
            severe acute respiratory syndrome
  JOURNAL   N. Engl. J. Med. 349 (2), 187-188 (2003)
   PUBMED   12853594
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate CUHK-Su10, and Isolate CUHK-W1
REFERENCE   4  (residues 1 to 1255)
  AUTHORS   Zeng,F.Y., Chan,C.W., Chan,M.N., Chen,J.D., Chow,K.Y., Hon,C.C.,
            Hui,K.H., Li,J., Li,V.Y., Wang,C.Y., Wang,P.Y., Guan,Y., Zheng,B.,
            Poon,L.L., Chan,K.H., Yuen,K.Y., Peiris,J.S. and Leung,F.C.
  TITLE     The complete genome sequence of severe acute respiratory syndrome
            coronavirus strain HKU-39849 (HK-39)
  JOURNAL   Exp. Biol. Med. (Maywood) 228 (7), 866-873 (2003)
   PUBMED   12876307
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate HKU-39849
REFERENCE   5  (residues 1 to 1255)
  AUTHORS   Guan,Y., Zheng,B.J., He,Y.Q., Liu,X.L., Zhuang,Z.X., Cheung,C.L.,
            Luo,S.W., Li,P.H., Zhang,L.J., Guan,Y.J., Butt,K.M., Wong,K.L.,
            Chan,K.W., Lim,W., Shortridge,K.F., Yuen,K.Y., Peiris,J.S. and
            Poon,L.L.
  TITLE     Isolation and characterization of viruses related to the SARS
            coronavirus from animals in southern China
  JOURNAL   Science 302 (5643), 276-278 (2003)
   PUBMED   12958366
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate GZ50, and Isolate HKU-36871
REFERENCE   6  (residues 1 to 1255)
  AUTHORS   Qin,E., Zhu,Q., Yu,M., Fan,B., Chang,G., Si,B., Yang,B., Peng,W.,
            Jiang,T., Liu,B., Deng,Y., Liu,H., Zhang,Y., Wang,C., Li,Y.,
            Gan,Y., Li,X., Lu,F., Tan,G., Yang,R., Cao,W.S., Wang,J., Chen,W.,
            Cong,L., Deng,Y., Dong,W., Han,Y., Hu,W., Lei,M., Li,C., Li,G.,
            Li,G., Li,H., Li,S., Li,S., Li,W., Li,W., Lin,W., Liu,J., Liu,Z.,
            Lu,H., Ni,P., Qi,Q., Sun,Y., Tang,L., Tong,Z., Wang,J., Wang,X.,

            Zhang,J., Zhang,X., Zhou,J. and Yang,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-APR-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate BJ01, Isolate BJ02, Isolate BJ03, Isolate BJ04, and
            Isolate GD01
REFERENCE   7  (residues 1 to 1255)
  AUTHORS   Ruan,Y.J., Wei,C.L., Ee,A.L., Vega,V.B., Thoreau,H., Su,S.T.,
            Chia,J.M., Ng,P., Chiu,K.P., Lim,L., Zhang,T., Peng,C.K., Lin,E.O.,
            Lee,N.M., Yee,S.L., Ng,L.F., Chee,R.E., Stanton,L.W., Long,P.M. and
            Liu,E.T.
  TITLE     Comparative full-length genome sequence analysis of 14 SARS
            coronavirus isolates and common mutations associated with putative
            origins of infection
  JOURNAL   Lancet 361 (9371), 1779-1785 (2003)
   PUBMED   12781537
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Sin2500, Isolate Sin2677, Isolate Sin2679, Isolate
            Sin2748, and Isolate sin2774
            Erratum:[Lancet. 2003 May 24;361(9371):1832]
REFERENCE   8  (residues 1 to 1255)
  AUTHORS   Ruan,Y., Wei,C.L., Ling,A.E., Vega,V.B., Thoreau,H., Se Thoe,S.Y.,
            Chia,J.-M., Ng,P., Chiu,K.P., Lim,L., Zhang,T., Chan,K.P.,
            Oon,L.E.L., Ng,M.L., Leo,S.Y., Ng,L.F.P., Ren,E.C., Stanton,L.W.,
            Long,P.M. and Liu,E.T.
  JOURNAL   Lancet 361, 1832-1832 (2003)
  REMARK    ERRATUM.
REFERENCE   9  (residues 1 to 1255)
  AUTHORS   Yeh,S.-H., Kao,C.-L., Tsai,C.-Y., Liu,C.-J., Chen,D.-S. and
            Chen,P.-J.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-MAY-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate TW1
REFERENCE   10 (residues 1 to 1255)
  AUTHORS   Eickmann,M., Becker,S., Klenk,H.-D., Doerr,H.W., Stadler,K.,
            Censini,S., Guidotti,S., Masignani,V., Scarselli,M., Mora,M.,
            Donati,C., Han,J., Song,H.C., Abrignani,S., Covacci,A. and
            Rappuoli,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-MAY-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate FRA
REFERENCE   11 (residues 1 to 1255)
  AUTHORS   Thiel,V., Ivanov,K.A., Putics,A., Hertzig,T., Schelle,B., Bayer,S.,
            Weissbrich,B., Snijder,E.J., Rabenau,H., Doerr,H.W.,
            Gorbalenya,A.E. and Ziebuhr,J.
  TITLE     Mechanisms and enzymes involved in SARS coronavirus genome
            expression
  JOURNAL   J. Gen. Virol. 84 (PT 9), 2305-2315 (2003)
   PUBMED   12917450
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Frankfurt 1
REFERENCE   12 (residues 1 to 1255)
  AUTHORS   Yang,J.-Y., Lin,J.-H., Chiu,S.-C., Wang,S.-F., Lee,S.C., Lin,Y.-C.,
            Hsu,C.-K., Chen,H.-Y., Chang,J.G., Chen,P.-J. and Su,I.-J.
  TITLE     Direct Submission

  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate TWC
REFERENCE   13 (residues 1 to 1255)
  AUTHORS   Cong,L.-M., Ding,G.-Q., Lu,Y.-Y., Weng,J.-Q., Yan,J.-Y., Hu,N.-P.,
            Wo,J.-E., Chen,S.-Y., Zhang,Y.-J., Mei,L.-L., Wang,Z.-G., Yao,J.,
            Zhu,H.-P., Lu,Q.-Y., Li,M.-H., Gong,L.-M., Shi,W. and Li,L.-J.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-JUN-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [MRNA].
            STRAIN=Isolate ZJ01
REFERENCE   14 (residues 1 to 1255)
  AUTHORS   Yuan,Z., Zhang,X., Hu,Y., Lan,S., Wang,H., Zhou,Z. and Wen,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-JUN-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Shanghai LY
REFERENCE   15 (residues 1 to 1255)
  AUTHORS   Chang,J.-G.C., Lin,T.-H., Chen,C.-M., Lin,C.-S., Chan,W.-L. and
            Shih,M.-C.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-JUL-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Taiwan TC1, Isolate Taiwan TC2, and Isolate Taiwan
            TC3
REFERENCE   16 (residues 1 to 1255)
  AUTHORS   Shu,H.Y., Wu,K.M. and Tsai,S.F.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-JUL-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate TWH, Isolate TWJ, Isolate TWK, Isolate TWS, and
            Isolate TWY
REFERENCE   17 (residues 1 to 1255)
  AUTHORS   Canducci,F., Clementi,M., Poli,G. and Vicenzi,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-JUL-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate HSR 1
REFERENCE   18 (residues 1 to 1255)
  AUTHORS   Yang,J.-Y., Lin,J.-H., Chiu,S.-C., Wang,S.-F., Lee,H.-C.,
            Lin,Y.-C., Hsu,C.-K., Chen,H.-Y., Chen,P.-J. and Su,I.-J.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-AUG-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate TWC2, and Isolate TWC3
REFERENCE   19 (residues 1 to 1255)
  AUTHORS   Balotta,C., Corvasce,S., Violin,M., Galli,M., Moroni,M.,
            Vigevani,G.M., Ruan,Y.J. and Salemi,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-OCT-2003) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate AS
REFERENCE   20 (residues 1 to 1255)
  AUTHORS   Yuan,Z., Zhang,X., Hu,Y., Lan,S., Wang,H., Zhou,Z. and Wen,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (??-JAN-2004) to the EMBL/GenBank/DDBJ databases
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate Shanghai QXC1
REFERENCE   21 (residues 1 to 1255)
  AUTHORS   Song,H.D., Tu,C.C., Zhang,G.W., Wang,S.Y., Zheng,K., Lei,L.C.,
            Chen,Q.X., Gao,Y.W., Zhou,H.Q., Xiang,H., Zheng,H.J., Chern,S.W.,
            Cheng,F., Pan,C.M., Xuan,H., Chen,S.J., Luo,H.M., Zhou,D.H.,
            Liu,Y.F., He,J.F., Qin,P.Z., Li,L.H., Ren,Y.Q., Liang,W.J.,
            Yu,Y.D., Anderson,L., Wang,M., Xu,R.H., Wu,X.W., Zheng,H.Y.,
            Chen,J.D., Liang,G., Gao,Y., Liao,M., Fang,L., Jiang,L.Y., Li,H.,
            Chen,F., Di,B., He,L.J., Lin,J.Y., Tong,S., Kong,X., Du,L., Hao,P.,
            Tang,H., Bernini,A., Yu,X.J., Spiga,O., Guo,Z.M., Pan,H.Y.,
            He,W.Z., Manuguerra,J.C., Fontanet,A., Danchin,A., Niccolai,N.,
            Li,Y.X., Wu,C.I. and Zhao,G.P.
  TITLE     Cross-host evolution of severe acute respiratory syndrome
            coronavirus in palm civet and human
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 102 (7), 2430-2435 (2005)
   PUBMED   15695582
  REMARK    NUCLEOTIDE SEQUENCE [GENOMIC RNA].
            STRAIN=Isolate GD03
REFERENCE   22 (residues 1 to 1255)
  AUTHORS   Li,W., Moore,M.J., Vasilieva,N., Sui,J., Wong,S.K., Berne,M.A.,
            Somasundaran,M., Sullivan,J.L., Luzuriaga,K., Greenough,T.C.,
            Choe,H. and Farzan,M.
  TITLE     Angiotensin-converting enzyme 2 is a functional receptor for the
            SARS coronavirus
  JOURNAL   Nature 426 (6965), 450-454 (2003)
   PUBMED   14647384
  REMARK    INTERACTION WITH HUMAN ACE2, AND CHARACTERIZATION OF CELLULAR
            RECEPTOR.
REFERENCE   23 (residues 1 to 1255)
  AUTHORS   Wong,S.K., Li,W., Moore,M.J., Choe,H. and Farzan,M.
  TITLE     A 193-amino acid fragment of the SARS coronavirus S protein
            efficiently binds angiotensin-converting enzyme 2
  JOURNAL   J. Biol. Chem. 279 (5), 3197-3201 (2004)
   PUBMED   14670965
  REMARK    INTERACTION WITH HUMAN ACE2.
REFERENCE   24 (residues 1 to 1255)
  AUTHORS   Xu,Y., Zhu,J., Liu,Y., Lou,Z., Yuan,F., Liu,Y., Cole,D.K., Ni,L.,
            Su,N., Qin,L., Li,X., Bai,Z., Bell,J.I., Pang,H., Tien,P., Gao,G.F.
            and Rao,Z.
  TITLE     Characterization of the heptad repeat regions, HR1 and HR2, and
            design of a fusion core structure model of the spike protein from
            severe acute respiratory syndrome (SARS) coronavirus
  JOURNAL   Biochemistry 43 (44), 14064-14071 (2004)
   PUBMED   15518555
  REMARK    CHARACTERIZATION OF HEPTAD REPEAT REGIONS.
REFERENCE   25 (residues 1 to 1255)
  AUTHORS   Wu,X.D., Shang,B., Yang,R.F., Yu,H., Ma,Z.H., Shen,X., Ji,Y.Y.,
            Lin,Y., Wu,Y.D., Lin,G.M., Tian,L., Gan,X.Q., Yang,S., Jiang,W.H.,
            Dai,E.H., Wang,X.Y., Jiang,H.L., Xie,Y.H., Zhu,X.L., Pei,G., Li,L.,
            Wu,J.R. and Sun,B.
  TITLE     The spike protein of severe acute respiratory syndrome (SARS) is
            cleaved in virus infected Vero-E6 cells
  JOURNAL   Cell Res. 14 (5), 400-406 (2004)
   PUBMED   15450134
  REMARK    CLEAVAGE.
REFERENCE   26 (residues 1 to 1255)
  AUTHORS   Jeffers,S.A., Tusell,S.M., Gillim-Ross,L., Hemmila,E.M.,
            Achenbach,J.E., Babcock,G.J., Thomas,W.D. Jr., Thackray,L.B.,
            Young,M.D., Mason,R.J., Ambrosino,D.M., Wentworth,D.E.,

  TITLE     CD209L (L-SIGN) is a receptor for severe acute respiratory syndrome
            coronavirus
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 101 (44), 15748-15753 (2004)
   PUBMED   15496474
  REMARK    INTERACTION WITH HUMAN CLEC4M/DC-SIGNR, AND MUTAGENESIS OF CYS-323;
            CYS-348; GLU-452; ASP-454; ASP-463; CYS-467; CYS-474 AND ASP-480.
REFERENCE   27 (residues 1 to 1255)
  AUTHORS   Xiao,X., Feng,Y., Chakraborti,S. and Dimitrov,D.S.
  TITLE     Oligomerization of the SARS-CoV S glycoprotein: dimerization of the
            N-terminus and trimerization of the ectodomain
  JOURNAL   Biochem. Biophys. Res. Commun. 322 (1), 93-99 (2004)
   PUBMED   15313178
  REMARK    HOMOTRIMERIZATION.
REFERENCE   28 (residues 1 to 1255)
  AUTHORS   Sainz,B. Jr., Rausch,J.M., Gallaher,W.R., Garry,R.F. and
            Wimley,W.C.
  TITLE     Identification and characterization of the putative fusion peptide
            of the severe acute respiratory syndrome-associated coronavirus
            spike protein
  JOURNAL   J. Virol. 79 (11), 7195-7206 (2005)
   PUBMED   15890958
  REMARK    CHARACTERIZATION OF FUSION PEPTIDE.
REFERENCE   29 (residues 1 to 1255)
  AUTHORS   Nal,B., Chan,C., Kien,F., Siu,L., Tse,J., Chu,K., Kam,J.,
            Staropoli,I., Crescenzo-Chaigne,B., Escriou,N., van der Werf,S.,
            Yuen,K.Y. and Altmeyer,R.
  TITLE     Differential maturation and subcellular localization of severe
            acute respiratory syndrome coronavirus surface proteins S, M and E
  JOURNAL   J. Gen. Virol. 86 (PT 5), 1423-1434 (2005)
   PUBMED   15831954
  REMARK    SUBCELLULAR LOCATION.
REFERENCE   30 (residues 1 to 1255)
  AUTHORS   Li,W., Zhang,C., Sui,J., Kuhn,J.H., Moore,M.J., Luo,S., Wong,S.K.,
            Huang,I.C., Xu,K., Vasilieva,N., Murakami,A., He,Y., Marasco,W.A.,
            Guan,Y., Choe,H. and Farzan,M.
  TITLE     Receptor and viral determinants of SARS-coronavirus adaptation to
            human ACE2
  JOURNAL   EMBO J. 24 (8), 1634-1643 (2005)
   PUBMED   15791205
  REMARK    CHARACTERIZATION OF VARIANTS ARG-344; SER-360; LYS-479 AND SER-487.
REFERENCE   31 (residues 1 to 1255)
  AUTHORS   Simmons,G., Gosalia,D.N., Rennekamp,A.J., Reeves,J.D., Diamond,S.L.
            and Bates,P.
  TITLE     Inhibitors of cathepsin L prevent severe acute respiratory syndrome
            coronavirus entry
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 102 (33), 11876-11881 (2005)
   PUBMED   16081529
  REMARK    PROTEOLYSIS BY HUMAN CTSL.
REFERENCE   32 (residues 1 to 1255)
  AUTHORS   Tan,Y.J., Teng,E., Shen,S., Tan,T.H., Goh,P.Y., Fielding,B.C.,
            Ooi,E.E., Tan,H.C., Lim,S.G. and Hong,W.
  TITLE     A novel severe acute respiratory syndrome coronavirus protein,
            U274, is transported to the cell surface and undergoes endocytosis
  JOURNAL   J. Virol. 78 (13), 6723-6734 (2004)
   PUBMED   15194747
  REMARK    INTERACTION WITH ACCESSORY PROTEIN 3A.
REFERENCE   33 (residues 1 to 1255)

  TITLE     Severe acute respiratory syndrome coronavirus 7a accessory protein
            is a viral structural protein
  JOURNAL   J. Virol. 80 (15), 7287-7294 (2006)
   PUBMED   16840309
  REMARK    INTERACTION WITH ACCESSORY PROTEIN 7A.
REFERENCE   34 (residues 1 to 1255)
  AUTHORS   Follis,K.E., York,J. and Nunberg,J.H.
  TITLE     Furin cleavage of the SARS coronavirus spike glycoprotein enhances
            cell-cell fusion but does not affect virion entry
  JOURNAL   Virology 350 (2), 358-369 (2006)
   PUBMED   16519916
  REMARK    MUTAGENESIS OF ARG-667 AND LYS-672.
REFERENCE   35 (residues 1 to 1255)
  AUTHORS   Petit,C.M., Chouljenko,V.N., Iyer,A., Colgrove,R., Farzan,M.,
            Knipe,D.M. and Kousoulas,K.G.
  TITLE     Palmitoylation of the cysteine-rich endodomain of the
            SARS-coronavirus spike glycoprotein is important for spike-mediated
            cell fusion
  JOURNAL   Virology 360 (2), 264-274 (2007)
   PUBMED   17134730
  REMARK    PALMITOYLATION.
REFERENCE   36 (residues 1 to 1255)
  AUTHORS   McBride,C.E., Li,J. and Machamer,C.E.
  TITLE     The cytoplasmic tail of the severe acute respiratory syndrome
            coronavirus spike protein contains a novel endoplasmic reticulum
            retrieval signal that binds COPI and promotes interaction with
            membrane protein
  JOURNAL   J. Virol. 81 (5), 2418-2428 (2007)
   PUBMED   17166901
  REMARK    ENDOPLASMIC RETICULUM RETENTION MOTIF, AND MUTAGENESIS OF LYS-1251
            AND HIS-1253.
REFERENCE   37 (residues 1 to 1255)
  AUTHORS   Xu,Y., Lou,Z., Liu,Y., Pang,H., Tien,P., Gao,G.F. and Rao,Z.
  TITLE     Crystal structure of severe acute respiratory syndrome coronavirus
            spike protein fusion core
  JOURNAL   J. Biol. Chem. 279 (47), 49414-49419 (2004)
   PUBMED   15345712
  REMARK    X-RAY CRYSTALLOGRAPHY (2.8 ANGSTROMS) OF 900-948.
REFERENCE   38 (residues 1 to 1255)
  AUTHORS   Supekar,V.M., Bruckmann,C., Ingallinella,P., Bianchi,E., Pessi,A.
            and Carfi,A.
  TITLE     Structure of a proteolytically resistant core from the severe acute
            respiratory syndrome coronavirus S2 fusion protein
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 101 (52), 17958-17963 (2004)
   PUBMED   15604146
  REMARK    X-RAY CRYSTALLOGRAPHY (1.6 ANGSTROMS) OF 895-972 AND 1142-1180.
REFERENCE   39 (residues 1 to 1255)
  AUTHORS   Spiga,O., Bernini,A., Ciutti,A., Chiellini,S., Menciassi,N.,
            Finetti,F., Causarono,V., Anselmi,F., Prischi,F. and Niccolai,N.
  TITLE     Molecular modelling of S1 and S2 subunits of SARS coronavirus spike
            glycoprotein
  JOURNAL   Biochem. Biophys. Res. Commun. 310 (1), 78-83 (2003)
   PUBMED   14511651
  REMARK    3D-STRUCTURE MODELING OF 17-680.
REFERENCE   40 (residues 1 to 1255)
  AUTHORS   Li,F., Li,W., Farzan,M. and Harrison,S.C.
  TITLE     Structure of SARS coronavirus spike receptor-binding domain
            complexed with receptor
  JOURNAL   Science 309 (5742), 1864-1868 (2005)
   PUBMED   16166518
  REMARK    X-RAY CRYSTALLOGRAPHY (2.9 ANGSTROMS) OF 323-502 IN COMPLEX WITH
            HUMAN ACE2.
REFERENCE   41 (residues 1 to 1255)
  AUTHORS   Deng,Y., Liu,J., Zheng,Q., Yong,W. and Lu,M.
  TITLE     Structures and polymorphic interactions of two heptad-repeat
            regions of the SARS virus S2 protein
  JOURNAL   Structure 14 (5), 889-899 (2006)
   PUBMED   16698550
  REMARK    X-RAY CRYSTALLOGRAPHY (1.7 ANGSTROMS) OF 1150-1193.
COMMENT     [FUNCTION] S1 attaches the virion to the cell membrane by
            interacting with human ACE2 and CLEC4M/DC-SIGNR, initiating the
            infection. Binding to the receptor and internalization of the virus
            into the endosomes of the host cell probably induce conformational
            changes in the S glycoprotein. Proteolysis by cathepsin CTSL may
            unmask the fusion peptide of S2 and activate membrane fusion
            within endosomes.
            [FUNCTION] S2 is a class I viral fusion protein. Under the current
            model, the protein has at least three conformational states:
            pre-fusion native state, pre-hairpin intermediate state, and
            post-fusion hairpin state. During viral and target cell membrane
            fusion, the coiled coil regions (heptad repeats) assume a
            trimer-of-hairpins structure, positioning the fusion peptide in
            close proximity to the C-terminal region of the ectodomain. The
            formation of this structure appears to drive apposition and
            subsequent fusion of viral and target cell membranes.
            [SUBUNIT] Homotrimer. Binds to human and palm civet ACE2 and human
            CLEC4M/DC-SIGNR. Interacts with the accessory proteins 3a and 7a.
            [SUBCELLULAR LOCATION] Virion membrane; Single-pass type I membrane
            protein. Endoplasmic reticulum-Golgi intermediate compartment
            membrane; Single-pass type I membrane protein (By similarity). Cell
            membrane; Single-pass type I membrane protein. Note=Accumulates in
            the endoplasmic reticulum-Golgi intermediate compartment, where it
            participates in virus particle assembly (By similarity). Some S
            oligomers are transported to the plasma membrane, where they may
            mediate cell-cell fusion.
            [DOMAIN] The KxHxx motif seems to function as an ER retrieval and
            binds COPI in vitro.
            [PTM] The cytoplasmic Cys-rich domain is palmitoylated. Spike
            glycoprotein is digested by cathepsin CTSL within endosomes.
            [MISCELLANEOUS] Tor2 is the prototype of the virus isolated during
            the severe SARS outbreak in 2002-2003. GD03 has been isolated from
            the second mild SARS outbreak in winter 2003-2004. SZ3 has been
            isolated from palm civet, the presumed animal reservoir. The spike
            proteins from those three isolates display a strong affinity for
            palm civet ACE2 receptor, whereas only the Tor2 spike protein
            efficiently binds human ACE2. This may explain the high
            pathogenicity of Tor2 virus, whose spike is highly adapted to the
            human host. Therefore, the lack of severity of disease during the
            2003-2004 outbreak could be due to the incomplete adaptation of
            GD03 virus to bind human ACE2. Mutation Asn-479 and Thr-487 in palm
            civet coronavirus seems necessary and sufficient for the virus to
            acquire the ability to efficiently infect humans.
            [SIMILARITY] Belongs to the coronaviruses spike protein family.
            [CAUTION] Cleavage into S1 and S2 remains controversial, since
            biochemical evidence for this proteolytic cleavage is largely

FEATURES             Location/Qualifiers
     source          1..1255
                     /organism="SARS coronavirus"
                     /host="Homo sapiens (Human)"
                     /host="Paguma larvata (Masked palm civet)"
                     /db_xref="taxon:227859"
     gene            1..1255
                     /gene="S"
                     /locus_tag="2"
     Protein         1..1255
                     /product="Spike glycoprotein precursor"
     Region          1..13
                     /region_name="Signal"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Potential."
     Region          14..1255
                     /region_name="Mature chain"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Spike glycoprotein. /FTId=PRO_0000037208."
     Region          14..1195
                     /region_name="Topological domain"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Extracellular (Potential)."
     Region          14..667
                     /region_name="Mature chain"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Spike protein S1 (Potential).
                     /FTId=PRO_0000037209."
     Site            29
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          49
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="S -> L (in strain: Isolate GZ50)."
     Site            65
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Site            73
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          77
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="G -> D (in strain: Isolate BJ01, Isolate BJ02,

                     HKU-36871, Isolate GD01, Isolate GD03 and Isolate SZ3)."
     Region          78
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="N -> D (in strain: Isolate GD03)."
     Site            109
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          118
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="N -> S (in strain: Isolate Shanghai LY)."
     Site            118
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Site            119
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          139
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="A -> V (in strain: Isolate GD03)."
     Region          144
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="M -> L (in strain: Isolate BJ03)."
     Region          147
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Q -> R (in strain: Isolate GD03)."
     Site            158
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          193
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="F -> S (in strain: Isolate Shanghai LY)."
     Region          227
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="N -> K (in strain: Isolate SZ3)."
     Site            227
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          239
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="S -> L (in strain: Isolate GD01 and Isolate SZ3)."
     Region          244
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="I -> T (in strain: Isolate BJ01, Isolate BJ02,
                     Isolate BJ03, Isolate BJ04, Isolate GZ50, Isolate CUHK-W1,
                     Isolate HKU-36871, Isolate GD01, Isolate GD03 and Isolate
                     SZ3)."
     Region          261
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="T -> K (in strain: Isolate SZ3)."
     Site            269
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          306..527
                     /region_name="Region of interest in the sequence"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Receptor-binding domain."
     Region          311
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="G -> R (in strain: Isolate GD01 and Isolate BJ02)."
     Site            318
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Bond            bond(323,348)
                     /bond_type="disulfide"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            323
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="C->A: No effect on human ACE2 binding in vitro."
     Region          326..329
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            330
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          341..345
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          344
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="K -> R (in strain: Isolate GD01, Isolate GD03 and
                     Isolate SZ3; no effect on affinity with either human or
                     palm civet ACE2)."
     Region          347..349
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            348
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="C->A: Complete loss of human ACE2 binding in
                     vitro."
     Region          353..356
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            357
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          360
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="F -> S (in strain: Isolate GD03 and Isolate SZ3; no
                     effect on affinity with either human or palm civet ACE2)."
     Region          362..369
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Bond            bond(366,419)
                     /bond_type="disulfide"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          371..373
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          374..376
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          378..390
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"

                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          404..408
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          418..424
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          424..494
                     /region_name="Region of interest in the sequence"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Receptor-binding motif; binding to human ACE2."
     Region          426..429
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          426
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="R -> G (in strain: Isolate Shanghai LY)."
     Region          431..433
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          437
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="N -> D (in strain: Isolate Shanghai LY)."
     Region          439..441
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            452
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="E->A: 90% loss of human ACE2 binding in vitro."
     Site            454
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="D->A: Complete loss of human ACE2 binding in
                     vitro."
     Site            463
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="D->A: Partial loss of human ACE2 binding in vitro."
     Bond            bond(467,474)
                     /bond_type="disulfide"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            467
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="C->A: Complete loss of human ACE2 binding in
                     vitro."
     Region          472
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="L -> P (in strain: Isolate GD03)."
     Site            474
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="C->A: Complete loss of human ACE2 binding in
                     vitro."
     Region          477..480
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          479
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="N -> K (in strain: Isolate SZ3; 20fold decrease of
                     affinity with human ACE2; no effect on affinity with palm
                     civet ACE2)."
     Region          480
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="D -> G (in strain: Isolate GD03)."
     Site            480
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="D->A: No effect on human ACE2 binding in vitro."
     Region          483..487
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          487
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="T -> S (in strain: Isolate GD03 and Isolate SZ3;
                     20fold decrease of affinity with human ACE2; decrease of
                     affinity with palm civet ACE2)."
     Region          489..491
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          492..502
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"

                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="F -> Y (in strain: Isolate GD01)."
     Region          509..511
                     /region_name="Beta-strand region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          577
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="S -> A (in strain: Isolate Tor2 and Isolate
                     Shanghai QXC1)."
     Site            589
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Site            602
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          605
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="D -> N (in strain: Isolate Shanghai QXC1)."
     Region          607
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="S -> P (in strain: Isolate SZ3)."
     Region          608
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="T -> A (in strain: Isolate Shanghai QXC1)."
     Region          609
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="A -> L (in strain: Isolate GD03)."
     Region          613
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="D -> E (in strain: Isolate GD03)."
     Region          665
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="L -> S (in strain: Isolate GD03 and Isolate SZ3)."
     Site            667..668
                     /site_type="cleavage"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Cleavage (Potential)."
     Site            667
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="R->S: 40% loss of cell-cell fusion."
     Region          668..1255
                     /region_name="Mature chain"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Spike protein S2 (Potential).
                     /FTId=PRO_0000037210."
     Site            672
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="K->S: No effect on cell-cell fusion."
     Site            691
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Site            699
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          701
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="S -> L (in strain: Isolate SZ3)."
     Region          743
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="T -> A (in strain: Isolate SZ3)."
     Region          743
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="T -> R (in strain: Isolate GD03)."
     Region          754
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="A -> V (in strain: Isolate SZ3)."
     Region          765
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="A -> V (in strain: Isolate GD03)."
     Region          770..788
                     /region_name="Region of interest in the sequence"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Fusion peptide (Potential)."

                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Y -> D (in strain: Isolate GD01, Isolate GZ50,
                     Isolate GD03 and Isolate SZ3)."
     Site            783
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          794
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="P -> S (in strain: Isolate GD01)."
     Region          804
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="L -> P (in strain: Isolate Shanghai LY)."
     Region          860..861
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="VS -> LR (in strain: Isolate BJ03)."
     Region          894..972
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          894
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="T -> A (in strain: Isolate SZ3)."
     Region          902..952
                     /region_name="Region of interest in the sequence"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Heptad repeat 1."
     Region          931..975
                     /region_name="Coiled-coil region"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Potential."
     Region          999
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="E -> G (in strain: Isolate Shanghai LY)."
     Region          1001
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="R -> M (in strain: Isolate BJ04)."
     Region          1004..1016
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Region          1017..1020
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            1056
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Site            1080
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Site            1116
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          1132
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="E -> G (in strain: Isolate Shanghai QXC1)."
     Site            1140
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          1145..1184
                     /region_name="Region of interest in the sequence"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Heptad repeat 2."
     Region          1148
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="L -> F (in strain: Isolate Frankfurt 1 and Isolate
                     FRA)."
     Region          1154..1184
                     /region_name="Helical region"
                     /experiment="experimental evidence, no additional details
                     recorded"
     Site            1155
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          1157..1185
                     /region_name="Coiled-coil region"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Potential."
     Region          1163
                     /region_name="Variant"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="K -> E (in strain: Isolate GD03 and Isolate SZ3)."
     Site            1176
                     /site_type="glycosylation"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="N-linked (GlcNAc...) (Potential)."
     Region          1196..1216
                     /region_name="Transmembrane region"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Potential."
     Region          1217..1255
                     /region_name="Topological domain"
                     /inference="non-experimental evidence, no additional
                     details recorded"
                     /note="Cytoplasmic (Potential)."
     Region          1217..1236
                     /region_name="Compositionally biased region"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="Cys-rich."
     Region          1251..1255
                     /region_name="Short sequence motif of biological interest"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="KxHxx."
     Site            1251
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="K->A: Decrease in Golgi localization, and complete
                     loss of COPI binding; when associated to A-1253."
     Site            1253
                     /site_type="mutagenized"
                     /experiment="experimental evidence, no additional details
                     recorded"
                     /note="H->A: Decrease in Golgi localization, and complete
                     loss of COPI binding; when associated to A-1251."
ORIGIN      
        1 mfifllfltl tsgsdldrct tfddvqapny tqhtssmrgv yypdeifrsd tlyltqdlfl
       61 pfysnvtgfh tinhtfgnpv ipfkdgiyfa ateksnvvrg wvfgstmnnk sqsviiinns
      121 tnvviracnf elcdnpffav skpmgtqtht mifdnafnct feyisdafsl dvseksgnfk
      181 hlrefvfknk dgflyvykgy qpidvvrdlp sgfntlkpif klplginitn frailtafsp
      241 aqdiwgtsaa ayfvgylkpt tfmlkydeng titdavdcsq nplaelkcsv ksfeidkgiy
      301 qtsnfrvvps gdvvrfpnit nlcpfgevfn atkfpsvyaw erkkisncva dysvlynstf
      361 fstfkcygvs atklndlcfs nvyadsfvvk gddvrqiapg qtgviadyny klpddfmgcv
      421 lawntrnida tstgnynyky rylrhgklrp ferdisnvpf spdgkpctpp alncywplnd
      481 ygfytttgig yqpyrvvvls fellnapatv cgpklstdli knqcvnfnfn gltgtgvltp
      541 sskrfqpfqq fgrdvsdftd svrdpktsei ldispcsfgg vsvitpgtna ssevavlyqd
      601 vnctdvstai hadqltpawr iystgnnvfq tqagcligae hvdtsyecdi pigagicasy
      661 htvsllrsts qksivaytms lgadssiays nntiaiptnf sisittevmp vsmaktsvdc
      721 nmyicgdste canlllqygs fctqlnrals giaaeqdrnt revfaqvkqm yktptlkyfg
      781 gfnfsqilpd plkptkrsfi edllfnkvtl adagfmkqyg eclgdinard licaqkfngl
      841 tvlpplltdd miaaytaalv sgtatagwtf gagaalqipf amqmayrfng igvtqnvlye
      901 nqkqianqfn kaisqiqesl tttstalgkl qdvvnqnaqa lntlvkqlss nfgaissvln
      961 dilsrldkve aevqidrlit grlqslqtyv tqqliraaei rasanlaatk msecvlgqsk
     1021 rvdfcgkgyh lmsfpqaaph gvvflhvtyv psqernftta paichegkay fpregvfvfn

     1141 htspdvdlgd isginasvvn iqkeidrlne vaknlnesli dlqelgkyeq yikwpwyvwl
     1201 gfiagliaiv mvtillccmt sccsclkgac scgscckfde ddsepvlkgv klhyt

LOCUS       AY278741               29727 bp    RNA     linear   VRL 04-OCT-2005
DEFINITION  SARS coronavirus Urbani, complete genome.
ACCESSION   AY278741
VERSION     AY278741.1
KEYWORDS    .
SOURCE      SARS coronavirus Urbani
  ORGANISM  SARS coronavirus Urbani
            Viruses; ssRNA positive-strand viruses, no DNA stage; Nidovirales;
            Coronaviridae; Coronavirinae; Betacoronavirus.
REFERENCE   1  (bases 1 to 29727)
  AUTHORS   Rota,P.A., Oberste,M.S., Monroe,S.S., Nix,W.A., Campagnoli,R.,
            Icenogle,J.P., Penaranda,S., Bankamp,B., Maher,K., Chen,M.H.,
            Tong,S., Tamin,A., Lowe,L., Frace,M., DeRisi,J.L., Chen,Q.,
            Wang,D., Erdman,D.D., Peret,T.C., Burns,C., Ksiazek,T.G.,
            Rollin,P.E., Sanchez,A., Liffick,S., Holloway,B., Limor,J.,
            McCaustland,K., Olsen-Rasmussen,M., Fouchier,R., Gunther,S.,
            Osterhaus,A.D., Drosten,C., Pallansch,M.A., Anderson,L.J. and
            Bellini,W.J.
  TITLE     Characterization of a novel coronavirus associated with severe
            acute respiratory syndrome
  JOURNAL   Science 300 (5624), 1394-1399 (2003)
   PUBMED   12730500
REFERENCE   2  (bases 1 to 29727)
  AUTHORS   Monroe,S.S.
  CONSRTM   CDC SARS Coronavirus Sequencing Team
  TITLE     SARS coronavirus (SARS-CoV), Urbani strain
  JOURNAL   Unpublished
REFERENCE   3  (bases 1 to 29727)
  AUTHORS   Bellini,W.J., Campagnoli,R.P., Icenogle,J.P., Monroe,S.S.,
            Nix,W.A., Oberste,M.S., Pallansch,M.A. and Rota,P.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (17-APR-2003) Division of Viral and Rickettsial Diseases,
            Centers for Disease Control and Prevention, 1600 Clifton RD, NE,
            Atlanta, GA 30333, USA
FEATURES             Location/Qualifiers
     source          1..29727
                     /organism="SARS coronavirus Urbani"
                     /mol_type="genomic RNA"
                     /strain="Urbani"
                     /db_xref="taxon:228330"
                     /cell_line="Vero"
     5'UTR           1..264
     CDS             join(265..13398,13398..21485)
                     /ribosomal_slippage
                     /note="ORF 1ab"
                     /codon_start=1
                     /product="nonstructural polyprotein pp1ab"
                     /protein_id="AAP13442.1"
                     /translation="MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEARE
                     HLKNGTCGLVELEKGVLPQLEQPYVFIKRSDALSTNHGHKVVELVAEMDGIQYGRSGI
                     TLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELGTDPIEDYEQN
                     WNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDCIKDFLARAGKSMCTLSEQ
                     LDYIESKRGVYCCRDHEHEIAWFTERSDKSYEHQTPFEIKSAKKFDTFKGECPKFVFP
                     LNSKVKVIQPRVEKKKTEGFMGRIRSVYPVASPQECNNMHLSTLMKCNHCDEVSWQTC
                     DFLKATCEHCGTENLVIEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSN
                     IETRLRKGGRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDL
                     LEILSRERVNINIVGDFHLNEEVAIILASFSASTSAFIDTIKSLDYKSFKTIVESCGN
                     YKVTKGKPVKGAWNIGQQRSVLTPLCGFPSQAAGVIRSIFARTLDAANHSIPDLQRAA
                     VTILDGISEQSLRLVDAMVYTSDLLTNSVIIMAYVTGGLVQQTSQWLSNLLGTTVEKL
                     RPIFEWIEAKLSAGVEFLKDAWEILKFLITGVFDIVKGQIQVASDNIKDCVKCFIDVV
                     NKALEMCIDQVTIAGAKLRSLNLGEVFIAQSKGLYRQCIRGKEQLQLLMPLKAPKEVT
                     FLEGDSHDTVLTSEEVVLKNGELEALETPVDSFTNGAIVGTPVCVNGLMLLEIKDKEQ
                     YCALSPGLLATNNVFRLKGGAPIKGVTFGEDTVWEVQGYKNVRITFELDERVDKVLNE
                     KCSVYTVESGTEVTEFACVVAEAVVKTLQPVSDLLTNMGIDLDEWSVATFYLFDDAGE
                     ENFSSRMYCSFYPPDEEEEDDAECEEEEIDETCEHEYGTEDDYQGLPLEFGASAETVR
                     VEEEEEEDWLDDTTEQSEIEPEPEPTPEEPVNQFTGYLKLTDNVAIKCVDIVKEAQSA
                     NPMVIVNAANIHLKHGGGVAGALNKATNGAMQKESDDYIKLNGPLTVGGSCLLSGHNL
                     AKKCLHVVGPNLNAGEDIQLLKAAYENFNSQDILLAPLLSAGIFGAKPLQSLQVCVQT
                     VRTQVYIAVNDKALYEQVVMDYLDNLKPRVEAPKQEEPPNTEDSKTEEKSVVQKPVDV
                     KPKIKACIDEVTTTLEETKFLTNKLLLFADINGKLYHDSQNMLRGEDMSFLEKDAPYM
                     VGDVITSGDITCVVIPSKKAGGTTEMLSRALKKVPVDEYITTYPGQGCAGYTLEEAKT
                     ALKKCKSAFYVLPSEAPNAKEEILGTVSWNLREMLAHAEETRKLMPICMDVRAIMATI
                     QRKYKGIKIQEGIVDYGVRFFFYTSKEPVASIITKLNSLNEPLVTMPIGYVTHGFNLE
                     EAARCMRSLKAPAVVSVSSPDAVTTYNGYLTSSSKTSEEHFVETVSLAGSYRDWSYSG
                     QRTELGVEFLKRGDKIVYHTLESPVEFHLDGEVLSLDKLKSLLSLREVKTIKVFTTVD
                     NTNLHTQLVDMSMTYGQQFGPTYLDGADVTKIKPHVNHEGKTFFVLPSDDTLRSEAFE
                     YYHTLDESFLGRYMSALNHTKKWKFPQVGGLTSIKWADNNCYLSSVLLALQQLEVKFN
                     APALQEAYYRARAGDAANFCALILAYSNKTVGELGDVRETMTHLLQHANLESAKRVLN
                     VVCKHCGQKTTTLTGVEAVMYMGTLSYDNLKTGVSIPCVCGRDATQYLVQQESSFVMM
                     SAPPAEYKLQQGTFLCANEYTGNYQCGHYTHITAKETLYRIDGAHLTKMSEYKGPVTD
                     VFYKETSYTTTIKPVSYKLDGVTYTEIEPKLDGYYKKDNAYYTEQPIDLVPTQPLPNA
                     SFDNFKLTCSNTKFADDLNQMTGFTKPASRELSVTFFPDLNGDVVAIDYRHYSASFKK
                     GAKLLHKPIVWHINQATTKTTFKPNTWCLRCLWSTKPVDTSNSFEVLAVEDTQGMDNL
                     ACESQQPTSEEVVENPTIQKEVIECDVKTTEVVGNVILKPSDEGVKVTQELGHEDLMA
                     AYVENTSITIKKPNELSLALGLKTIATHGIAAINSVPWSKILAYVKPFLGQAAITTSN
                     CAKRLAQRVFNNYMPYVFTLLFQLCTFTKSTNSRIRASLPTTIAKNSVKSVAKLCLDA
                     GINYVKSPKFSKLFTIAMWLLLLSICLGSLICVTAAFGVLLSNFGAPSYCNGVRELYL
                     NSSNVTTMDFCEGSFPCSICLSGLDSLDSYPALETIQVTISSYKLDLTILGLAAEWVL
                     AYMLFTKFFYLLGLSAIMQVFFGYFASHFISNSWLMWFIISIVQMAPVSAMVRMYIFF
                     ASFYYIWKSYVHIMDGCTSSTCMMCYKRNRATRVECTTIVNGMKRSFYVYANGGRGFC
                     KTHNWNCLNCDTFCTGSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVAVKNGALHL
                     YFDKAGQKTYERHPLSHFVNLDNLRANNTKGSLPINVIVFDGKSKCDESASKSASVYY
                     SQLMCQPILLLDQVLVSDVGDSTEVSVKMFDAYVDTFSATFSVPMEKLKALVATAHSE
                     LAKGVALDGVLSTFVSAARQGVVDTDVDTKDVIECLKLSHHSDLEVTGDSCNNFMLTY
                     NKVENMTPRDLGACIDCNARHINAQVAKSHNVSLIWNVKDYMSLSEQLRKQIRSAAKK
                     NNIPFRLTCATTRQVVNVITTKISLKGGKIVSTCFKLMLKATLLCVLAALVCYIVMPV
                     HTLSIHDGYTNEIIGYKAIQDGVTRDIISTDDCFANKHAGFDAWFSQRGGSYKNDKSC
                     PVVAAIITREIGFIVPGLPGTVLRAINGDFLHFLPRVFSAVGNICYTPSKLIEYSDFA
                     TSACVLAAECTIFKDAMGKPVPYCYDTNLLEGSISYSELRPDTRYVLMDGSIIQFPNT
                     YLEGSVRVVTTFDAEYCRHGTCERSEVGICLSTSGRWVLNNEHYRALSGVFCGVDAMN
                     LIANIFTPLVQPVGALDVSASVVAGGIIAILVTCAAYYFMKFRRVFGEYNHVVAANAL
                     LFLMSFTILCLVPAYSFLPGVYSVFYLYLTFYFTNDVSFLAHLQWFAMFSPIVPFWIT
                     AIYVFCISLKHCHWFFNNYLRKRVMFNGVTFSTFEEAALCTFLLNKEMYLKLRSETLL
                     PLTQYNRYLALYNKYKYFSGALDTTSYREAACCHLAKALNDFSNSGADVLYQPPQTSI
                     TSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDTVYCPRHVICTAEDMLNP
                     NYEDLLIRKSNHSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSNPKTPKYKFVRIQPGQ
                     TFSVLACYNGSPSGVYQCAMRPNHTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELP
                     TGVHAGTDLEGKFYGPFVDRQTAQAAGTDTTITLNVLAWLYAAVINGDRWFLNRFTTT
                     LNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCAALKELLQNGMNGRTILGS
                     TILEDEFTPFDVVRQCSGVTFQGKFKKIVKGTHHWMLLTFLTSLLILVQSTQWSLFFF
                     VYENAFLPFTLGIMAIAACAMLLVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRI
                     MTWLELADTSLSGYRLKDCVMYASALVLLILMTARTVYDDAARRVWTLMNVITLVYKV
                     YYGNALDQAISMWALVISVTSNYSGVVTTIMFLARAIVFVCVEYYPLLFITGNTLQCI
                     MLVYCFLGYCCCCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRYMNSQGLLPPKSSIDA
                     FKLNIKLLGIGGKPCIKVATVQSKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLH
                     NDILLAKDTTEAFEKMVSLLSVLLSMQGAVDINRLCEEMLDNRATLQAIASEFSSLPS
                     YAAYATAQEAYEQAVANGDSEVVLKKLKKSLNVAKSEFDRDAAMQRKLEKMADQAMTQ
                     MYKQARSEDKRAKVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNIIPLTTAAK
                     LMVVVPDYGTYKNTCDGNTFTYASALWEIQQVVDADSKIVQLSEINMDNSPNLAWPLI
                     VTALRANSAVKLQNNELSPVALRQMSCAAGTTQTACTDDNALAYYNNSKGGRFVLALL
                     SDHQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRGMVL
                     GSLAATVRLQAGNATEVPANSTVLSFCAFAVDPAKAYKDYLASGGQPITNCVKMLCTH
                     TGTGQAITVTPEANMDQESFGGASCCLYCRCHIDHPNPKGFCDLKGKYVQIPTTCAND
                     PVGFTLRNTVCTVCGMWKGYGCSCDQLREPLMQSADASTFLNRVCGVSAARLTPCGTG
                     TSTDVVYRAFDIYNEKVAGFAKFLKTNCCRFQEKDEEGNLLDSYFVVKRHTMSNYQHE
                     ETIYNLVKDCPAVAVHDFFKFRVDGDMVPHISRQRLTKYTMADLVYALRHFDEGNCDT
                     LKEILVTYNCCDDDYFNKKDWYDFVENPDILRVYANLGERVRQSLLKTVQFCDAMRDA
                     GIVGVLTLDNQDLNGNWYDFGDFVQVAPGCGVPIVDSYYSLLMPILTLTRALAAESHM
                     DADLAKPLIKWDLLKYDFTEERLCLFDRYFKYWDQTYHPNCINCLDDRCILHCANFNV
                     LFSTVFPPTSFGPLVRKIFVDGVPFVVSTGYHFRELGVVHNQDVNLHSSRLSFKELLV
                     YAADPAMHAASGNLLLDKRTTCFSVAALTNNVAFQTVKPGNFNKDFYDFAVSKGFFKE
                     GSSVELKHFFFAQDGNAAISDYDYYRYNLPTMCDIRQLLFVVEVVDKYFDCYDGGCIN
                     ANQVIVNNLDKSAGFPFNKWGKARLYYDSMSYEDQDALFAYTKRNVIPTITQMNLKYA
                     ISAKNRARTVAGVSICSTMTNRQFHQKLLKSIAATRGATVVIGTSKFYGGWHNMLKTV
                     YSDVETPHLMGWDYPKCDRAMPNMLRIMASLVLARKHNTCCNLSHRFYRLANECAQVL
                     SEMVMCGGSLYVKPGGTSSGDATTAYANSVFNICQAVTANVNALLSTDGNKIADKYVR
                     NLQHRLYECLYRNRDVDHEFVDEFYAYLRKHFSMMILSDDAVVCYNSNYAAQGLVASI
                     KNFKAVLYYQNNVFMSEAKCWTETDLTKGPHEFCSQHTMLVKQGDDYVYLPYPDPSRI
                     LGAGCFVDDIVKTDGTLMIERFVSLAIDAYPLTKHPNQEYADVFHLYLQYIRKLHDEL
                     TGHMLDMYSVMLTNDNTSRYWEPEFYEAMYTPHTVLQAVGACVLCNSQTSLRCGACIR
                     RPFLCCKCCYDHVISTSHKLVLSVNPYVCNAPGCDVTDVTQLYLGGMSYYCKSHKPPI
                     SFPLCANGQVFGLYKNTCVGSDNVTDFNAIATCDWTNAGDYILANTCTERLKLFAAET
                     LKATEETFKLSYGIATVREVLSDRELHLSWEVGKPRPPLNRNYVFTGYRVTKNSKVQI
                     GEYTFEKGDYGDAVVYRGTTTYKLNVGDYFVLTSHTVMPLSAPTLVPQEHYVRITGLY
                     PTLNISDEFSSNVANYQKVGMQKYSTLQGPPGTGKSHFAIGLALYYPSARIVYTACSH
                     AAVDALCEKALKYLPIDKCSRIIPARARVECFDKFKVNSTLEQYVFCTVNALPETTAD
                     IVVFDEISMATNYDLSVVNARLRAKHYVYIGDPAQLPAPRTLLTKGTLEPEYFNSVCR
                     LMKTIGPDMFLGTCRRCPAEIVDTVSALVYDNKLKAHKDKSAQCFKMFYKGVITHDVS
                     SAINRPQIGVVREFLTRNPAWRKAVFISPYNSQNAVASKILGLPTQTVDSSQGSEYDY
                     VIFTQTTETAHSCNVNRFNVAITRAKIGILCIMSDRDLYDKLQFTSLEIPRRNVATLQ
                     AENVTGLFKDCSKIITGLHPTQAPTHLSVDIKFKTEGLCVDIPGIPKDMTYRRLISMM
                     GFKMNYQVNGYPNMFITREEAIRHVRAWIGFDVEGCHATRDAVGTNLPLQLGFSTGVN
                     LVAVPTGYVDTENNTEFTRVNAKPPPGDQFKHLIPLMYKGLPWNVVRIKIVQMLSDTL
                     KGLSDRVVFVLWAHGFELTSMKYFVKIGPERTCCLCDKRATCFSTSSDTYACWNHSVG
                     FDYVYNPFMIDVQQWGFTGNLQSNHDQHCQVHGNAHVASCDAIMTRCLAVHECFVKRV
                     DWSVEYPIIGDELRVNSACRKVQHMVVKSALLADKFPVLHDIGNPKAIKCVPQAEVEW
                     KFYDAQPCSDKAYKIEELFYSYATHHDKFTDGVCLFWNCNVDRYPANAIVCRFDTRVL
                     SNLNLPGCDGGSLYVNKHAFHTPAFDKSAFTNLKQLPFFYYSDSPCESHGKQVVSDID
                     YVPLKSATCITRCNLGGAVCRHHANEYRQYLDAYNMMISAGFSLWIYKQFDTYNLWNT
                     FTRLQSLENVAYNVVNKGHFDGHAGEAPVSIINNAVYTKVDGIDVEIFENKTTLPVNV
                     AFELWAKRNIKPVPEIKILNNLGVDIAANTVIWDYKREAPAHVSTIGVCTMTDIAKKP
                     TESACSSLTVLFDGRVEGQVDLFRNARNGVLITEGSVKGLTPSKGPAQASVNGVTLIG
                     ESVKTQFNYFKKVDGIIQQLPETYFTQSRDLEDFKPRSQMETDFLELAMDEFIQRYKL
                     EGYAFEHIVYGDFSHGQLGGLHLMIGLAKRSQDSPLKLEDFIPMDSTVKNYFITDAQT
                     GSSKCVCSVIDLLLDDFVEIIKSQDLSVISKVVKVTIDYAEISFMLWCKDGHVETFYP
                     KLQASQAWQPGVAMPNLYKMQRMLLEKCDLQNYGENAVIPKGIMMNVAKYTQLCQYLN
                     TLTLAVPYNMRVIHFGAGSDKGVAPGTAVLRQWLPTGTLLVDSDLNDFVSDADSTLIG
                     DCATVHTANKWDLIISDMYDPRTKHVTKENDSKEGFFTYLCGFIKQKLALGGSIAVKI
                     TEHSWNADLYKLMGHFSWWTAFVTNVNASSSEAFLIGANYLGKPKEQIDGYTMHANYI
                     FWRNTNPIQLSSYSLFDMSKFPLKLRGTAVMSLKENQINDMIYSLLEKGRLIIRENNR
                     VVVSSDILVNN"
     CDS             265..13413
                     /note="ORF 1a"
                     /codon_start=1
                     /product="nonstructural polyprotein pp1a"
                     /protein_id="AAP13439.1"
                     /translation="MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEARE
                     HLKNGTCGLVELEKGVLPQLEQPYVFIKRSDALSTNHGHKVVELVAEMDGIQYGRSGI
                     TLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELGTDPIEDYEQN
                     WNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDCIKDFLARAGKSMCTLSEQ
                     LDYIESKRGVYCCRDHEHEIAWFTERSDKSYEHQTPFEIKSAKKFDTFKGECPKFVFP
                     LNSKVKVIQPRVEKKKTEGFMGRIRSVYPVASPQECNNMHLSTLMKCNHCDEVSWQTC
                     DFLKATCEHCGTENLVIEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSN
                     IETRLRKGGRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDL
                     LEILSRERVNINIVGDFHLNEEVAIILASFSASTSAFIDTIKSLDYKSFKTIVESCGN
                     YKVTKGKPVKGAWNIGQQRSVLTPLCGFPSQAAGVIRSIFARTLDAANHSIPDLQRAA
                     VTILDGISEQSLRLVDAMVYTSDLLTNSVIIMAYVTGGLVQQTSQWLSNLLGTTVEKL
                     RPIFEWIEAKLSAGVEFLKDAWEILKFLITGVFDIVKGQIQVASDNIKDCVKCFIDVV
                     NKALEMCIDQVTIAGAKLRSLNLGEVFIAQSKGLYRQCIRGKEQLQLLMPLKAPKEVT
                     FLEGDSHDTVLTSEEVVLKNGELEALETPVDSFTNGAIVGTPVCVNGLMLLEIKDKEQ
                     YCALSPGLLATNNVFRLKGGAPIKGVTFGEDTVWEVQGYKNVRITFELDERVDKVLNE
                     KCSVYTVESGTEVTEFACVVAEAVVKTLQPVSDLLTNMGIDLDEWSVATFYLFDDAGE
                     ENFSSRMYCSFYPPDEEEEDDAECEEEEIDETCEHEYGTEDDYQGLPLEFGASAETVR
                     VEEEEEEDWLDDTTEQSEIEPEPEPTPEEPVNQFTGYLKLTDNVAIKCVDIVKEAQSA
                     NPMVIVNAANIHLKHGGGVAGALNKATNGAMQKESDDYIKLNGPLTVGGSCLLSGHNL
                     AKKCLHVVGPNLNAGEDIQLLKAAYENFNSQDILLAPLLSAGIFGAKPLQSLQVCVQT
                     VRTQVYIAVNDKALYEQVVMDYLDNLKPRVEAPKQEEPPNTEDSKTEEKSVVQKPVDV
                     KPKIKACIDEVTTTLEETKFLTNKLLLFADINGKLYHDSQNMLRGEDMSFLEKDAPYM
                     VGDVITSGDITCVVIPSKKAGGTTEMLSRALKKVPVDEYITTYPGQGCAGYTLEEAKT
                     ALKKCKSAFYVLPSEAPNAKEEILGTVSWNLREMLAHAEETRKLMPICMDVRAIMATI
                     QRKYKGIKIQEGIVDYGVRFFFYTSKEPVASIITKLNSLNEPLVTMPIGYVTHGFNLE
                     EAARCMRSLKAPAVVSVSSPDAVTTYNGYLTSSSKTSEEHFVETVSLAGSYRDWSYSG
                     QRTELGVEFLKRGDKIVYHTLESPVEFHLDGEVLSLDKLKSLLSLREVKTIKVFTTVD
                     NTNLHTQLVDMSMTYGQQFGPTYLDGADVTKIKPHVNHEGKTFFVLPSDDTLRSEAFE
                     YYHTLDESFLGRYMSALNHTKKWKFPQVGGLTSIKWADNNCYLSSVLLALQQLEVKFN
                     APALQEAYYRARAGDAANFCALILAYSNKTVGELGDVRETMTHLLQHANLESAKRVLN
                     VVCKHCGQKTTTLTGVEAVMYMGTLSYDNLKTGVSIPCVCGRDATQYLVQQESSFVMM
                     SAPPAEYKLQQGTFLCANEYTGNYQCGHYTHITAKETLYRIDGAHLTKMSEYKGPVTD
                     VFYKETSYTTTIKPVSYKLDGVTYTEIEPKLDGYYKKDNAYYTEQPIDLVPTQPLPNA
                     SFDNFKLTCSNTKFADDLNQMTGFTKPASRELSVTFFPDLNGDVVAIDYRHYSASFKK
                     GAKLLHKPIVWHINQATTKTTFKPNTWCLRCLWSTKPVDTSNSFEVLAVEDTQGMDNL
                     ACESQQPTSEEVVENPTIQKEVIECDVKTTEVVGNVILKPSDEGVKVTQELGHEDLMA
                     AYVENTSITIKKPNELSLALGLKTIATHGIAAINSVPWSKILAYVKPFLGQAAITTSN
                     CAKRLAQRVFNNYMPYVFTLLFQLCTFTKSTNSRIRASLPTTIAKNSVKSVAKLCLDA
                     GINYVKSPKFSKLFTIAMWLLLLSICLGSLICVTAAFGVLLSNFGAPSYCNGVRELYL
                     NSSNVTTMDFCEGSFPCSICLSGLDSLDSYPALETIQVTISSYKLDLTILGLAAEWVL
                     AYMLFTKFFYLLGLSAIMQVFFGYFASHFISNSWLMWFIISIVQMAPVSAMVRMYIFF
                     ASFYYIWKSYVHIMDGCTSSTCMMCYKRNRATRVECTTIVNGMKRSFYVYANGGRGFC
                     KTHNWNCLNCDTFCTGSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVAVKNGALHL
                     YFDKAGQKTYERHPLSHFVNLDNLRANNTKGSLPINVIVFDGKSKCDESASKSASVYY
                     SQLMCQPILLLDQVLVSDVGDSTEVSVKMFDAYVDTFSATFSVPMEKLKALVATAHSE
                     LAKGVALDGVLSTFVSAARQGVVDTDVDTKDVIECLKLSHHSDLEVTGDSCNNFMLTY
                     NKVENMTPRDLGACIDCNARHINAQVAKSHNVSLIWNVKDYMSLSEQLRKQIRSAAKK
                     NNIPFRLTCATTRQVVNVITTKISLKGGKIVSTCFKLMLKATLLCVLAALVCYIVMPV
                     HTLSIHDGYTNEIIGYKAIQDGVTRDIISTDDCFANKHAGFDAWFSQRGGSYKNDKSC
                     PVVAAIITREIGFIVPGLPGTVLRAINGDFLHFLPRVFSAVGNICYTPSKLIEYSDFA
                     TSACVLAAECTIFKDAMGKPVPYCYDTNLLEGSISYSELRPDTRYVLMDGSIIQFPNT
                     YLEGSVRVVTTFDAEYCRHGTCERSEVGICLSTSGRWVLNNEHYRALSGVFCGVDAMN
                     LIANIFTPLVQPVGALDVSASVVAGGIIAILVTCAAYYFMKFRRVFGEYNHVVAANAL
                     LFLMSFTILCLVPAYSFLPGVYSVFYLYLTFYFTNDVSFLAHLQWFAMFSPIVPFWIT
                     AIYVFCISLKHCHWFFNNYLRKRVMFNGVTFSTFEEAALCTFLLNKEMYLKLRSETLL
                     PLTQYNRYLALYNKYKYFSGALDTTSYREAACCHLAKALNDFSNSGADVLYQPPQTSI
                     TSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDTVYCPRHVICTAEDMLNP
                     NYEDLLIRKSNHSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSNPKTPKYKFVRIQPGQ
                     TFSVLACYNGSPSGVYQCAMRPNHTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELP
                     TGVHAGTDLEGKFYGPFVDRQTAQAAGTDTTITLNVLAWLYAAVINGDRWFLNRFTTT
                     LNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCAALKELLQNGMNGRTILGS
                     TILEDEFTPFDVVRQCSGVTFQGKFKKIVKGTHHWMLLTFLTSLLILVQSTQWSLFFF
                     VYENAFLPFTLGIMAIAACAMLLVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRI
                     MTWLELADTSLSGYRLKDCVMYASALVLLILMTARTVYDDAARRVWTLMNVITLVYKV
                     YYGNALDQAISMWALVISVTSNYSGVVTTIMFLARAIVFVCVEYYPLLFITGNTLQCI
                     MLVYCFLGYCCCCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRYMNSQGLLPPKSSIDA
                     FKLNIKLLGIGGKPCIKVATVQSKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLH
                     NDILLAKDTTEAFEKMVSLLSVLLSMQGAVDINRLCEEMLDNRATLQAIASEFSSLPS
                     YAAYATAQEAYEQAVANGDSEVVLKKLKKSLNVAKSEFDRDAAMQRKLEKMADQAMTQ
                     MYKQARSEDKRAKVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNIIPLTTAAK
                     LMVVVPDYGTYKNTCDGNTFTYASALWEIQQVVDADSKIVQLSEINMDNSPNLAWPLI
                     VTALRANSAVKLQNNELSPVALRQMSCAAGTTQTACTDDNALAYYNNSKGGRFVLALL
                     SDHQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRGMVL
                     GSLAATVRLQAGNATEVPANSTVLSFCAFAVDPAKAYKDYLASGGQPITNCVKMLCTH
                     TGTGQAITVTPEANMDQESFGGASCCLYCRCHIDHPNPKGFCDLKGKYVQIPTTCAND
                     PVGFTLRNTVCTVCGMWKGYGCSCDQLREPLMQSADASTFLNGFAV"
     CDS             <13398..21485
                     /note="ORF 1b; expressed via predicted -1 ribosomal
                     frameshift"
                     /codon_start=1
                     /product="nonstructural polyprotein"
                     /protein_id="AAP13440.1"
                     /translation="RVCGVSAARLTPCGTGTSTDVVYRAFDIYNEKVAGFAKFLKTNC
                     CRFQEKDEEGNLLDSYFVVKRHTMSNYQHEETIYNLVKDCPAVAVHDFFKFRVDGDMV
                     PHISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYNCCDDDYFNKKDWYDFVENP
                     DILRVYANLGERVRQSLLKTVQFCDAMRDAGIVGVLTLDNQDLNGNWYDFGDFVQVAP
                     GCGVPIVDSYYSLLMPILTLTRALAAESHMDADLAKPLIKWDLLKYDFTEERLCLFDR
                     YFKYWDQTYHPNCINCLDDRCILHCANFNVLFSTVFPPTSFGPLVRKIFVDGVPFVVS
                     TGYHFRELGVVHNQDVNLHSSRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAAL
                     TNNVAFQTVKPGNFNKDFYDFAVSKGFFKEGSSVELKHFFFAQDGNAAISDYDYYRYN
                     LPTMCDIRQLLFVVEVVDKYFDCYDGGCINANQVIVNNLDKSAGFPFNKWGKARLYYD
                     SMSYEDQDALFAYTKRNVIPTITQMNLKYAISAKNRARTVAGVSICSTMTNRQFHQKL
                     LKSIAATRGATVVIGTSKFYGGWHNMLKTVYSDVETPHLMGWDYPKCDRAMPNMLRIM
                     ASLVLARKHNTCCNLSHRFYRLANECAQVLSEMVMCGGSLYVKPGGTSSGDATTAYAN
                     SVFNICQAVTANVNALLSTDGNKIADKYVRNLQHRLYECLYRNRDVDHEFVDEFYAYL
                     RKHFSMMILSDDAVVCYNSNYAAQGLVASIKNFKAVLYYQNNVFMSEAKCWTETDLTK
                     GPHEFCSQHTMLVKQGDDYVYLPYPDPSRILGAGCFVDDIVKTDGTLMIERFVSLAID
                     AYPLTKHPNQEYADVFHLYLQYIRKLHDELTGHMLDMYSVMLTNDNTSRYWEPEFYEA
                     MYTPHTVLQAVGACVLCNSQTSLRCGACIRRPFLCCKCCYDHVISTSHKLVLSVNPYV
                     CNAPGCDVTDVTQLYLGGMSYYCKSHKPPISFPLCANGQVFGLYKNTCVGSDNVTDFN
                     AIATCDWTNAGDYILANTCTERLKLFAAETLKATEETFKLSYGIATVREVLSDRELHL
                     SWEVGKPRPPLNRNYVFTGYRVTKNSKVQIGEYTFEKGDYGDAVVYRGTTTYKLNVGD
                     YFVLTSHTVMPLSAPTLVPQEHYVRITGLYPTLNISDEFSSNVANYQKVGMQKYSTLQ
                     GPPGTGKSHFAIGLALYYPSARIVYTACSHAAVDALCEKALKYLPIDKCSRIIPARAR
                     VECFDKFKVNSTLEQYVFCTVNALPETTADIVVFDEISMATNYDLSVVNARLRAKHYV
                     YIGDPAQLPAPRTLLTKGTLEPEYFNSVCRLMKTIGPDMFLGTCRRCPAEIVDTVSAL
                     VYDNKLKAHKDKSAQCFKMFYKGVITHDVSSAINRPQIGVVREFLTRNPAWRKAVFIS
                     PYNSQNAVASKILGLPTQTVDSSQGSEYDYVIFTQTTETAHSCNVNRFNVAITRAKIG
                     ILCIMSDRDLYDKLQFTSLEIPRRNVATLQAENVTGLFKDCSKIITGLHPTQAPTHLS
                     VDIKFKTEGLCVDIPGIPKDMTYRRLISMMGFKMNYQVNGYPNMFITREEAIRHVRAW
                     IGFDVEGCHATRDAVGTNLPLQLGFSTGVNLVAVPTGYVDTENNTEFTRVNAKPPPGD
                     QFKHLIPLMYKGLPWNVVRIKIVQMLSDTLKGLSDRVVFVLWAHGFELTSMKYFVKIG
                     PERTCCLCDKRATCFSTSSDTYACWNHSVGFDYVYNPFMIDVQQWGFTGNLQSNHDQH
                     CQVHGNAHVASCDAIMTRCLAVHECFVKRVDWSVEYPIIGDELRVNSACRKVQHMVVK
                     SALLADKFPVLHDIGNPKAIKCVPQAEVEWKFYDAQPCSDKAYKIEELFYSYATHHDK
                     FTDGVCLFWNCNVDRYPANAIVCRFDTRVLSNLNLPGCDGGSLYVNKHAFHTPAFDKS
                     AFTNLKQLPFFYYSDSPCESHGKQVVSDIDYVPLKSATCITRCNLGGAVCRHHANEYR
                     QYLDAYNMMISAGFSLWIYKQFDTYNLWNTFTRLQSLENVAYNVVNKGHFDGHAGEAP
                     VSIINNAVYTKVDGIDVEIFENKTTLPVNVAFELWAKRNIKPVPEIKILNNLGVDIAA
                     NTVIWDYKREAPAHVSTIGVCTMTDIAKKPTESACSSLTVLFDGRVEGQVDLFRNARN
                     GVLITEGSVKGLTPSKGPAQASVNGVTLIGESVKTQFNYFKKVDGIIQQLPETYFTQS
                     RDLEDFKPRSQMETDFLELAMDEFIQRYKLEGYAFEHIVYGDFSHGQLGGLHLMIGLA
                     KRSQDSPLKLEDFIPMDSTVKNYFITDAQTGSSKCVCSVIDLLLDDFVEIIKSQDLSV
                     ISKVVKVTIDYAEISFMLWCKDGHVETFYPKLQASQAWQPGVAMPNLYKMQRMLLEKC
                     DLQNYGENAVIPKGIMMNVAKYTQLCQYLNTLTLAVPYNMRVIHFGAGSDKGVAPGTA
                     VLRQWLPTGTLLVDSDLNDFVSDADSTLIGDCATVHTANKWDLIISDMYDPRTKHVTK
                     ENDSKEGFFTYLCGFIKQKLALGGSIAVKITEHSWNADLYKLMGHFSWWTAFVTNVNA
                     SSSEAFLIGANYLGKPKEQIDGYTMHANYIFWRNTNPIQLSSYSLFDMSKFPLKLRGT
                     AVMSLKENQINDMIYSLLEKGRLIIRENNRVVVSSDILVNN"
     CDS             21492..25259
                     /note="surface spike glycoprotein"
                     /codon_start=1
                     /product="S protein"
                     /protein_id="AAP13441.1"
                     /translation="MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPD
                     EIFRSDTLYLTQDLFLPFYSNVTGFHTINHTFGNPVIPFKDGIYFAATEKSNVVRGWV
                     FGSTMNNKSQSVIIINNSTNVVIRACNFELCDNPFFAVSKPMGTQTHTMIFDNAFNCT
                     FEYISDAFSLDVSEKSGNFKHLREFVFKNKDGFLYVYKGYQPIDVVRDLPSGFNTLKP
                     IFKLPLGINITNFRAILTAFSPAQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
                     DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKF
                     PSVYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGD
                     DVRQIAPGQTGVIADYNYKLPDDFMGCVLAWNTRNIDATSTGNYNYKYRYLRHGKLRP
                     FERDISNVPFSPDGKPCTPPALNCYWPLNDYGFYTTTGIGYQPYRVVVLSFELLNAPA
                     TVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQFGRDVSDFTDSVRDPK
                     TSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAIHADQLTPAWRIYST
                     GNNVFQTQAGCLIGAEHVDTSYECDIPIGAGICASYHTVSLLRSTSQKSIVAYTMSLG
                     ADSSIAYSNNTIAIPTNFSISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGS
                     FCTQLNRALSGIAAEQDRNTREVFAQVKQMYKTPTLKYFGGFNFSQILPDPLKPTKRS
                     FIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGLTVLPPLLTDDMIAAYT
                     AALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQIANQFNKAIS
                     QIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAE
                     VQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYH
                     LMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQ
                     RNFFSPQIITTDNTFVSGNCDVVIGIINNTVYDPLQPELDSFKEELDKYFKNHTSPDV
                     DLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIA
                     GLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT"
     CDS             25268..26092
                     /note="potential product, c-terminal similarity to porin"
                     /codon_start=1
                     /product="protein X1"
                     /protein_id="AAP13446.1"
                     /translation="MDLFMRFFTLGSITAQPVKIDNASPASTVHATATIPLQASLPFG
                     WLVIGVAFLAVFQSATKIIALNKRWQLALYKGFQFICNLLLLFVTIYSHLLLVAAGME
                     AQFLYLYALIYFLQCINACRIIMRCWLCWKCKSKNPLLYDANYFVCWHTHNYDYCIPY
                     NSVTDTIVVTEGDGISTPKLKEDYQIGGYSEDRHSGVKDYVVVHGYFTEVYYQLESTQ
                     ITTDTGIENATFFIFNKLVKDPPNVQIHTIDGSSGVANPAMDPIYDEPTTTTSVPL"
     CDS             25689..26153
                     /note="potential product"
                     /codon_start=1
                     /product="protein X2"
                     /protein_id="AAP13447.1"
                     /translation="MMPTTLFAGTHITMTTVYHITVSQIQLSLLKVTAFQHQNSKKTT
                     KLVVILRIGTQVLKTMSLYMAISPKFTTSLSLHKLLQTLVLKMLHSSSLTSLLKTHRM
                     CKYTQSTALQELLIQQWIQFMMSRRRLLACLCKHKKVSTNLCTHSFRKKQVR"
     CDS             26117..26347
                     /note="envelope protein"
                     /codon_start=1
                     /product="E protein"
                     /protein_id="AAP13443.1"
                     /translation="MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCC
                     NIVNVSLVKPTVYVYSRVKNLNSSEGVPDLLV"
     CDS             26398..27063
                     /note="small membrane protein"
                     /codon_start=1
                     /product="M protein"
                     /protein_id="AAP13444.1"
                     /translation="MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYSNRNRF
                     LYIIKLVFLWLLWPVTLACFVLAAVYRINWVTGGIAIAMACIVGLMWLSYFVASFRLF
                     ARTRSMWSFNPETNILLNVPLRGTIVTRPLMESELVIGAVIIRGHLRMAGHPLGRCDI
                     KDLPKEITVATSRTLSYYKLGASQRVGTDSGFAAYNRYRIGNYKLNTDHAGSNDNIAL
                     LVQ"
     CDS             27074..27265
                     /note="potential product"
                     /codon_start=1
                     /product="protein X3"
                     /protein_id="AAP13448.1"
                     /translation="MFHLVDFQVTIAEILIIIMRTFRIAIWNLDVIISSIVRQLFKPL
                     TKKNYSELDDEEPMELDYP"
     CDS             27273..27641
                     /note="potential product"
                     /codon_start=1
                     /product="protein X4"
                     /protein_id="AAP13449.1"
                     /translation="MKIILFLTLIVFTSCELYHYQECVRGTTVLLKEPCPSGTYEGNS
                     PFHPLADNKFALTCTSTHFAFACADGTRHTYQLRARSVSPKLFIRQEEVQQELYSPLF
                     LIVAALVFLILCFTIKRKTE"
     CDS             27864..28118
                     /note="potential product"
                     /codon_start=1
                     /product="protein X5"
                     /protein_id="AAP13450.1"
                     /translation="MCLKILVRYNTRGNTYSTAWLCALGKVLPFHRWHTMVQTCTPNV
                     TINCQDPAGGALIARCWYLHEGHQTAAFRDVLVVLNKRTN"
     CDS             28120..29388
                     /note="nucleocapsid protein"
                     /codon_start=1
                     /product="N protein"
                     /protein_id="AAP13445.1"
                     /translation="MSDNGPQSNQRSAPRITFGGPTDSTDNNQNGGRNGARPKQRRPQ
                     GLPNNTASWFTALTQHGKEELRFPRGQGVPINTNSGPDDQIGYYRRATRRVRGGDGKM
                     KELSPRWYFYYLGTGPEASLPYGANKEGIVWVATEGALNTPKDHIGTRNPNNNAATVL
                     QLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRGNSRNSTPGSSRGNSPARMASGGGET
                     ALALLLLDRLNQLESKVSGKGQQQQGQTVTKKSAAEASKKPRQKRTATKQYNVTQAFG
                     RRGPEQTQGNFGDQDLIRQGTDYKHWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTY
                     HGAIKLDDKDPQFKDNVILLNKHIDAYKTFPPTEPKKDKKKKTDEAQPLPQRQKKQPT
                     VTLLPAADMDDFSRQLQNSMSGASADSTQA"
     polyA_site      29727
ORIGIN      
        1 atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt
       61 ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac
      121 gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct
      181 tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc
      241 gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca
      301 cacgtccaac tcagtttgcc tgtccttcag gttagagacg tgctagtgcg tggcttcggg
      361 gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt
      421 ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa
      481 cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg
      541 gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc
      601 gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt
      661 ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat
      721 cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa
      781 ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc
      841 ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg
      901 tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt
      961 gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag
     1021 acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag
     1081 tttgtgtttc ctcttaactc aaaagtcaaa gtcattcaac cacgtgttga aaagaaaaag
     1141 actgagggtt tcatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt
     1201 aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag
     1261 acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa
     1321 ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc
     1381 tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac
     1441 attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc
     1501 tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc
     1561 tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag
     1621 atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag
     1681 gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag
     1741 agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc
     1801 aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca
     1861 ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt
     1921 gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt
     1981 atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc
     2041 aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg
     2101 ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag
     2161 gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc
     2221 attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag
     2281 gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa
     2341 gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa
     2401 agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcct
     2461 cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtacttacc
     2521 tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac tcgagacgcc cgttgatagc
     2581 ttcacaaatg gagctatcgt tggcacacca gtctgtgtaa atggcctcat gctcttagag
     2641 attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc
     2701 tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg
     2761 gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg tgttgacaaa
     2821 gtgcttaatg aaaagtgctc tgtctacact gttgaatccg gtaccgaagt tactgagttt
     2881 gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc
     2941 aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct
     3001 ggtgaagaaa acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa
     3061 gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt
     3121 acagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga aacagttcga
     3181 gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag
     3241 ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt
     3301 actgacaatg ttgccattaa atgtgttgac atcgttaagg aggcacaaag tgctaatcct
     3361 atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca
     3421 ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat
     3481 ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt
     3541 ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca tccagcttct taaggcagca
     3601 tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt
     3661 ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat
     3721 attgcagtca atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg
     3781 aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact
     3841 gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa ggcctgcatt
     3901 gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt
     3961 gctgatatca atggtaagct ttaccatgat tctcagaaca tgcttagagg tgaagatatg
     4021 tctttccttg agaaggatgc accttacatg gtaggtgatg ttatcactag tggtgatatc
     4081 acttgtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct
     4141 ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt
     4201 tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta
     4261 ccttcagaag cacctaatgc taaggaagag attctaggaa ctgtatcctg gaatttgaga
     4321 gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga
     4381 gccataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt
     4441 gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg
     4501 aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt
     4561 tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc cgtagtgtca
     4621 gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca
     4681 tctgaggagc actttgtaga aacagtttct ttggctggct cttacagaga ttggtcctat
     4741 tcaggacagc gtacagagtt aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac
     4801 cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa
     4861 ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac
     4921 aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt
     4981 ccaacatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt
     5041 aagactttct ttgtactacc tagtgatgac acactacgta gtgaagcttt cgagtactac
     5101 catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca cacaaagaaa
     5161 tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat
     5221 ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt
     5281 caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc actcatactc
     5341 gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt
     5401 ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt
     5461 ggtcagaaaa ctactacctt aacgggtgta gaagctgtga tgtatatggg tactctatct
     5521 tatgataatc ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa
     5581 tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa
     5641 ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca gtgtggtcat
     5701 tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag
     5761 atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca
     5821 accatcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa
     5881 ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta
     5941 ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca
     6001 aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc acgagagcta
     6061 tctgtcacat tcttcccaga cttgaatggc gatgtagtgg ctattgacta tagacactat
     6121 tcagcgagtt tcaagaaagg tgctaaatta ctgcataagc caattgtttg gcacattaac
     6181 caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt
     6241 acaaagccag tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga
     6301 atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct
     6361 accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc
     6421 atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt
     6481 atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta
     6541 gccttaggtt taaaaacaat tgccactcat ggtattgctg caattaatag tgttccttgg
     6601 agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat
     6661 tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta
     6721 ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct
     6781 acaactattg ctaaaaatag tgttaagagt gttgctaaat tatgtttgga tgccggcatt
     6841 aattatgtga agtcacccaa attttctaaa ttgttcacaa tcgctatgtg gctattgttg
     6901 ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct
     6961 aattttggtg ctccttctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac
     7021 gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta
     7081 gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc atcgtacaag
     7141 ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca
     7201 aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct
     7261 agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca
     7321 cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag
     7381 agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc
     7441 aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag atctttctat
     7501 gtctatgcaa atggaggccg tggcttctgc aagactcaca attggaattg tctcaattgt
     7561 gacacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc
     7621 cagtttaaaa gaccaatcaa ccctactgac cagtcatcgt atattgttga tagtgttgct
     7681 gtgaaaaatg gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga
     7741 catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca
     7801 ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc tgcttctaag
     7861 tctgcttctg tgtactacag tcagctgatg tgccaaccta ttctgttgct tgaccaagtt
     7921 cttgtatcag acgttggaga tagtactgaa gtttccgtta agatgtttga tgcttatgtc
     7981 gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaca
     8041 gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca
     8101 gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc
     8161 aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc
     8221 acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgtaat
     8281 gcaaggcata tcaatgccca agtagcaaaa agtcacaatg tttcactcat ctggaatgta
     8341 aaagactaca tgtctttatc tgaacagctg cgtaaacaaa ttcgtagtgc tgccaagaag
     8401 aacaacatac cttttagact aacttgtgct acaactagac aggttgtcaa tgtcataact
     8461 actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag
     8521 gccacattat tgtgcgttct tgctgcattg gtttgttata tcgttatgcc agtacataca
     8581 ttgtcaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt
     8641 gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac
     8701 gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct
     8761 gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga
     8821 gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt
     8881 tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt
     8941 gctgctgagt gtacaatttt taaggatgct atgggcaaac ctgtgccata ttgttatgac
     9001 actaatttgc tagagggttc tatttcttat agtgagcttc gtccagacac tcgttatgtg
     9061 cttatggatg gttccatcat acagtttcct aacacttacc tggagggttc tgttagagta
     9121 gtaacaactt ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt
     9181 atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca
     9241 ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac tcctcttgtg
     9301 caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg ctggtggtat tattgccata
     9361 ttggtgactt gtgctgccta ctactttatg aaattcagac gtgtttttgg tgagtacaac
     9421 catgttgttg ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta
     9481 ccagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat
     9541 ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt
     9601 gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca ctgccattgg
     9661 ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc
     9721 gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc
     9781 gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag
     9841 tatttcagtg gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca
     9901 aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca
     9961 tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa
    10021 gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg
    10081 gatgacacag tatactgtcc aagacatgtc atttgcacag cagaagacat gcttaatcct
    10141 aactatgaag atctgctcat tcgcaaatcc aaccatagct ttcttgttca ggctggcaat
    10201 gttcaacttc gtgttattgg ccattctatg caaaattgtc tgcttaggct taaagttgat
    10261 acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt
    10321 tcagttctag catgctacaa tggttcacca tctggtgttt atcagtgtgc catgagacct
    10381 aatcatacca ttaaaggttc tttccttaat ggatcatgtg gtagtgttgg ttttaacatt
    10441 gattatgatt gcgtgtcttt ctgctatatg catcatatgg agcttccaac aggagtacac
    10501 gctggtactg acttagaagg taaattctat ggtccatttg ttgacagaca aactgcacag
    10561 gctgcaggta cagacacaac cataacatta aatgttttgg catggctgta tgctgctgtt
    10621 atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt
    10681 gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat attgggacct
    10741 ctttctgctc aaacaggaat tgccgtctta gatatgtgtg ctgctttgaa agagctgctg
    10801 cagaatggta tgaatggtcg tactatcctt ggtagcacta ttttagaaga tgagtttaca
    10861 ccatttgatg ttgttagaca atgctctggt gttaccttcc aaggtaagtt caagaaaatt
    10921 gttaagggca ctcatcattg gatgctttta actttcttga catcactatt gattcttgtt
    10981 caaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact
    11041 cttggtatta tggcaattgc tgcatgtgct atgctgcttg ttaagcataa gcacgcattc
    11101 ttgtgcttgt ttctgttacc ttctcttgca acagttgctt actttaatat ggtctacatg
    11161 cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct
    11221 ggttataggc ttaaggattg tgttatgtat gcttcagctt tagttttgct tattctcatg
    11281 acagctcgca ctgtttatga tgatgctgct agacgtgttt ggacactgat gaatgtcatt
    11341 acacttgttt acaaagtcta ctatggtaat gctttagatc aagctatttc catgtgggcc
    11401 ttagttattt ctgtaacctc taactattct ggtgtcgtta cgactatcat gtttttagct
    11461 agagctatag tgtttgtgtg tgttgagtat tacccattgt tatttattac tggcaacacc
    11521 ttacagtgta tcatgcttgt ttattgtttc ttaggctatt gttgctgctg ctactttggc
    11581 cttttctgtt tactcaaccg ttacttcagg cttactcttg gtgtttatga ctacttggtc
    11641 tctacacaag aatttaggta tatgaactcc caggggcttt tgcctcctaa gagtagtatt
    11701 gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt
    11761 gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact gctctcggtt
    11821 cttcaacaac ttagagtaga gtcatcttct aaattgtggg cacaatgtgt acaactccac
    11881 aatgatattc ttcttgcaaa agacacaact gaagctttcg agaagatggt ttctcttttg
    11941 tctgttttgc tatccatgca gggtgctgta gacattaata ggttgtgcga ggaaatgctc
    12001 gataaccgtg ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc
    12061 gcttatgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc
    12121 gttctcaaaa agttaaagaa atctttgaat gtggctaaat ctgagtttga ccgtgatgct
    12181 gccatgcaac gcaagttgga aaagatggca gatcaggcta tgacccaaat gtacaaacag
    12241 gcaagatctg aggacaagag ggcaaaagta actagtgcta tgcaaacaat gctcttcact
    12301 atgcttagga agcttgataa tgatgcactt aacaacatta tcaacaatgc gcgtgatggt
    12361 tgtgttccac tcaacatcat accattgact acagcagcca aactcatggt tgttgtccct
    12421 gattatggta cctacaagaa cacttgtgat ggtaacacct ttacatatgc atctgcactc
    12481 tgggaaatcc agcaagttgt tgatgcggat agcaagattg ttcaacttag tgaaattaac
    12541 atggacaatt caccaaattt ggcttggcct cttattgtta cagctctaag agccaactca
    12601 gctgttaaac tacagaataa tgaactgagt ccagtagcac tacgacagat gtcctgtgcg
    12661 gctggtacca cacaaacagc ttgtactgat gacaatgcac ttgcctacta taacaattcg
    12721 aagggaggta ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga
    12781 ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaccacc ttgtaggttt
    12841 gttacagaca caccaaaagg gcctaaagtg aaatacttgt acttcatcaa aggcttaaac
    12901 aacctaaata gaggtatggt gctgggcagt ttagctgcta cagtacgtct tcaggctgga
    12961 aatgctacag aagtacctgc caattcaact gtgctttcct tctgtgcttt tgcagtagac
    13021 cctgctaaag catataagga ttacctagca agtggaggac aaccaatcac caactgtgtg
    13081 aagatgttgt gtacacacac tggtacagga caggcaatta ctgtaacacc agaagctaac
    13141 atggaccaag agtcctttgg tggtgcttca tgttgtctgt attgtagatg ccacattgac
    13201 catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat acctaccact
    13261 tgtgctaatg acccagtggg ttttacactt agaaacacag tctgtaccgt ctgcggaatg
    13321 tggaaaggtt atggctgtag ttgtgaccaa ctccgcgaac ccttgatgca gtctgcggat
    13381 gcatcaacgt ttttaaacgg gtttgcggtg taagtgcagc ccgtcttaca ccgtgcggca
    13441 caggcactag tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaagttgctg
    13501 gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca
    13561 atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac caacatgaag
    13621 agactattta taacttggtt aaagattgtc cagcggttgc tgtccatgac tttttcaagt
    13681 ttagagtaga tggtgacatg gtaccacata tatcacgtca gcgtctaact aaatacacaa
    13741 tggctgattt agtctatgct ctacgtcatt ttgatgaggg taattgtgat acattaaaag
    13801 aaatactcgt cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg
    13861 acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc
    13921 aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc attgtaggcg
    13981 tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac
    14041 aagtagcacc aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca
    14101 tcctcacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac
    14161 cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg
    14221 accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg
    14281 ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta
    14341 caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa
    14401 ctggatacca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct
    14461 cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt
    14521 ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca
    14581 atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg
    14641 tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc
    14701 aggatggcaa cgctgctatc agtgattatg actattatcg ttataatctg ccaacaatgt
    14761 gtgatatcag acaactccta ttcgtagttg aagttgttga taaatacttt gattgttacg
    14821 atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt
    14881 tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc
    14941 aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc
    15001 ttaagtatgc cattagtgca aagaatagag ctcgcaccgt agctggtgtc tctatctgta
    15061 gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccactagag
    15121 gagctactgt ggtaattgga acaagcaagt tttacggtgg ctggcataat atgttaaaaa
    15181 ctgtttacag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca
    15241 gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaca
    15301 cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa
    15361 gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg
    15421 atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg
    15481 taaatgcact tctttcaact gatggtaata agatagctga caagtatgtc cgcaatctac
    15541 aacacaggct ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg
    15601 agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg
    15661 tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag aactttaagg
    15721 cagttcttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg
    15781 accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt aaacaaggag
    15841 atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg
    15901 tcgatgatat tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta
    15961 ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt
    16021 atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt
    16081 ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta
    16141 tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga
    16201 cttcacttcg ttgcggtgcc tgtattagga gaccattcct atgttgcaag tgctgctatg
    16261 accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg
    16321 ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt
    16381 gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt
    16441 tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat
    16501 gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaagc
    16561 ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg
    16621 ctactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac
    16681 ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta
    16741 aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca
    16801 gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg
    16861 taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct
    16921 tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg
    16981 tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg
    17041 ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg
    17101 cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta
    17161 gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac
    17221 tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgtag
    17281 tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc
    17341 gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc
    17401 tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa
    17461 taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg
    17521 tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct
    17581 tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc
    17641 aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta
    17701 tctcacctta taattcacag aacgctgtag cttcaaaaat cttaggattg cctacgcaga
    17761 ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa
    17821 cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca
    17881 ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa
    17941 taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact
    18001 gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata
    18061 taaagttcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct
    18121 accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta
    18181 atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg
    18241 tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat
    18301 tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca
    18361 cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttatac
    18421 cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca
    18481 gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg
    18541 agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg
    18601 acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg
    18661 tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg
    18721 gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggcta
    18781 gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg
    18841 attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa
    18901 aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg
    18961 acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct
    19021 acgatgctca gccatgtagt gacaaagctt acaaaataga ggagctcttc tattcttatg
    19081 ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc
    19141 gttacccagc caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact
    19201 taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt
    19261 tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc
    19321 cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg
    19381 ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt
    19441 accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt
    19501 acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa
    19561 atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg
    19621 tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg
    19681 aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta
    19741 aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg
    19801 taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa
    19861 tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg
    19921 atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa
    19981 cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg
    20041 gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg
    20101 gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta
    20161 agcccagatc acaaatggaa actgactttc tcgagctcgc tatggatgaa ttcatacagc
    20221 gatataagct cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac
    20281 aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta
    20341 aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata acagatgcgc
    20401 aaacaggttc atcaaaatgt gtgtgttctg tgattgatct tttacttgat gactttgtcg
    20461 agataataaa gtcacaagat ttgtcagtga tttcaaaagt ggtcaaggtt acaattgact
    20521 atgctgaaat ttcattcatg ctttggtgta aggatggaca tgttgaaacc ttctacccaa
    20581 aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc
    20641 aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa
    20701 aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta aatacactta
    20761 ctttagctgt accctacaac atgagagtta ttcactttgg tgctggctct gataaaggag
    20821 ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt
    20881 cagatcttaa tgacttcgtc tccgacgcag attctacttt aattggagac tgtgcaacag
    20941 tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac
    21001 atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa
    21061 agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg
    21121 ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa
    21181 atgcatcatc atcggaagca tttttaattg gggctaacta tcttggcaag ccgaaggaac
    21241 aaattgatgg ctataccatg catgctaact acattttctg gaggaacaca aatcctatcc
    21301 agttgtcttc ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg
    21361 ctgtaatgtc tcttaaggag aatcaaatca atgatatgat ttattctctt ctggaaaaag
    21421 gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca
    21481 actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg
    21541 accggtgcac cacttttgat gatgttcaag ctcctaatta cactcaacat acttcatcta
    21601 tgaggggggt ttactatcct gatgaaattt ttagatcaga cactctttat ttaactcagg
    21661 atttatttct tccattttat tctaatgtta cagggtttca tactattaat catacgtttg
    21721 gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg
    21781 ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg gtgattatta
    21841 ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt
    21901 tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat
    21961 ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag
    22021 gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt
    22081 ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga
    22141 aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc attcttacag
    22201 ccttttcacc tgctcaagac atttggggca cgtcagctgc agcctatttt gttggctatt
    22261 taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaca gatgctgttg
    22321 attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca
    22381 aaggaattta ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc
    22441 ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg
    22501 tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct gtgctctaca
    22561 actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttgaatgatc
    22621 tttgcttctc caatgtctat gcagattctt ttgtagtcaa gggagatgat gtaagacaaa
    22681 tagcgccagg acaaactggt gttattgctg attataatta taaattgcca gatgatttca
    22741 tgggttgtgt ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata
    22801 attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta
    22861 atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat tgttattggc
    22921 cattaaatga ttatggtttt tacaccacta ctggcattgg ctaccaacct tacagagttg
    22981 tagtactttc ttttgaactt ttaaatgcac cggccacggt ttgtggacca aaattatcca
    23041 ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg
    23101 tgttaactcc ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg
    23161 atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgct
    23221 cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgctgttc
    23281 tatatcaaga tgttaactgc actgatgttt ctacagcaat tcatgcagat caactcacac
    23341 cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta
    23401 taggagctga gcatgtcgac acttcttatg agtgcgacat tcctattgga gctggcattt
    23461 gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt
    23521 atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac
    23581 ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg gctaaaacct
    23641 ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc
    23701 aatatggtag cttttgcaca caactaaatc gtgcactctc aggtattgct gctgaacagg
    23761 atcgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga
    23821 aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga
    23881 ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga
    23941 agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt
    24001 tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg
    24061 ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc
    24121 aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg
    24181 ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc
    24241 aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga
    24301 atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa
    24361 gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca
    24421 ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg
    24481 ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg
    24541 gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag
    24601 cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact
    24661 tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt gaaggtgttt
    24721 ttgtgtttaa tggcacttct tggtttatta cacagaggaa cttcttttct ccacaaataa
    24781 ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca
    24841 acacagttta tgatcctctg caacctgagc tcgactcatt caaagaagag ctggacaagt
    24901 acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt
    24961 ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg
    25021 aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt
    25081 atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt
    25141 gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca
    25201 agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa
    25261 cgaacttatg gatttgttta tgagattttt tactcttgga tcaattactg cacagccagt
    25321 aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca
    25381 agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag
    25441 cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca
    25501 gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc
    25561 tgcaggtatg gaggcgcaat ttttgtacct ctatgccttg atatattttc tacaatgcat
    25621 caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc
    25681 attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat
    25741 accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc
    25801 aaaactcaaa gaagactacc aaattggtgg ttattctgag gataggcact caggtgttaa
    25861 agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca
    25921 aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaca agcttgttaa
    25981 agacccaccg aatgtgcaaa tacacacaat cgacggctct tcaggagttg ctaatccagc
    26041 aatggatcca atttatgatg agccgacgac gactactagc gtgcctttgt aagcacaaga
    26101 aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa
    26161 tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag ccatccttac
    26221 tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac
    26281 ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct
    26341 ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg
    26401 gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta
    26461 gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg
    26521 aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt
    26581 gcttgttttg tgcttgctgc tgtctacaga attaattggg tgactggcgg gattgcgatt
    26641 gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg
    26701 tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg
    26761 cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct
    26821 gtgatcattc gtggtcactt gcgaatggcc ggacaccccc tagggcgctg tgacattaag
    26881 gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga
    26941 gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga
    27001 aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag
    27061 taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat
    27121 tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat
    27181 agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga
    27241 acctatggag ttagattatc cataaaacga acatgaaaat tattctcttc ctgacattga
    27301 ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac
    27361 tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg
    27421 ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg
    27481 gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac
    27541 aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat
    27601 ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga
    27661 cttctatttg tgctttttag cctttctgct attccttgtt ttaataatgc ttattatatt
    27721 ttggttttca ctcgaaatcc aggatctaga agaaccttgt accaaagtct aaacgaacat
    27781 gaaacttctc attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca
    27841 gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg
    27901 gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct tttcatagat
    27961 ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg
    28021 gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta
    28081 gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa
    28141 tcaaaccaac gtagtgcccc ccgcattaca tttggtggac ccacagattc aactgacaat
    28201 aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc
    28261 aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc
    28321 cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac
    28381 taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc
    28441 agatggtact tctattacct aggaactggc ccagaagctt cacttcccta cggcgctaac
    28501 aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt
    28561 ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca
    28621 ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc
    28681 tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct
    28741 cctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga
    28801 ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc
    28861 actaagaaat ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa
    28921 cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc
    28981 ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa
    29041 tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct
    29101 tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc
    29161 aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca
    29221 gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa
    29281 aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa
    29341 cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg
    29401 accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc
    29461 tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta
    29521 atctcacata gcaatcttta atcaatgtgt aacattaggg aggacttgaa agagccacca
    29581 cattttcatc gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag
    29641 ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg
    29701 attttaatag cttcttagga gaatgac