CA2443740A1

CA2443740A1 - Constructs and methods for expression of recombinant hcv envelope proteins

Info

Publication number: CA2443740A1
Application number: CA002443740A
Authority: CA
Inventors: Erwin Sablon; Annie Van Broekhoven; Alfons Bosman; Erik Depla; Geert Deschamps
Original assignee: Individual
Current assignee: GenImmune NV
Priority date: 2001-04-24
Filing date: 2002-04-24
Publication date: 2002-10-31
Also published as: WO2002085932A3; OA13092A; YU84103A; WO2002086101A2; CN1636050A; US20030211597A1; US7048930B2; BR0209034A; AU2002252856A1; EP1414942A2; US20030152940A1; CA2443781A1; KR100950104B1; RU2003130955A; WO2002086101A3; AR035868A1; US7238356B2; JP2004536052A; ZA200308274B; ZA200308272B

Abstract

The current invention relates to vectors and methods for efficient expression of HCV envelope proteins in eukaryotic cells. More particularly said vectors comprise the coding sequence for an avian lysozyme signal peptide or a functional equivalent thereof joined to a HCV envelope protein or a part thereof. Said avian lysozyme signal peptide is efficiently removed when the protein comprising said avian lysozyme signal peptide joined to a HCV envelope protein or a part thereof is expressed in a eukaryotic cell. Suitable eukaryotic cells include yeast cells such as Saccharomyces or Hansenula cells.

Description

CONSTRUCTS AND METHODS FOR EXPRESSION OF RECOMBINANT HCV
ENVELOPE PROTEINS
FIELD OF THE INVENTION
The present invention relates to the general field of recombinant protein expression.
More particularly, the present invention relates to the expression of hepatitis C virus envelope proteins in a eukaryote such as yeast. Constructs and methods are disclosed for the expression of core-glycosylated viral envelope proteins in yeast.
BACKGROUND OF THE INVENTION
Hepatitis C virus (HCV) infection is a major health problem in both developed and developing countries. It is estimated that about 1 to 5 % of the world population is affected by the virus. HCV infection appears to be the most important cause of transfusion-associated hepatitis and frequently progresses to chronic liver damage. Moreover, evidence exists implicating HCV in induction of hepatocellular carcinoma. Consequently, the demand for reliable diagnostic methods and effective therapeutic agents is high. Also sensitive and specific screening methods of HCV-contaminated blood-products and improved methods to culture HCV are needed.
HCV is a positive-stranded RNA virus of approximately 9,600 bases which encode a single polyprotein precursor of about 3000 amino acids. Proteolytic cleavage of the precursor coupled to co- and posttranslational modifications has been shown to result in at least three structural and six non-structural proteins. Based on sequence homology, the structural proteins have been functionally assigned as one single core protein and two envelope glycoproteins: E1 and E2. The E1 protein consists of 192 amino acids and contains 5 to 6 N-glycosylation sites, depending on the HCV genotype. The E2 protein consists of 363 to 370 amino acids and contains 9 to 11 N-glycosylation sites, depending on the HCV genotype (for reviews see: Major, M. E. and Feinstone, S. M. 1997, Maertens, G. and Stuyver, L. 1997).
The E1 protein contains various variable domains (Maertens, G. and Stuyver, L. 1997). The E2 protein contains three hypervariable domains, of which the major domain is located at the N-terminus of the protein (Maertens, G. and Stuyver, L. 1997). The HCV glycoproteins localize predominantly in the ER where they are modified and assembled into oligomeric complexes.
In eukaryotes, sugar residues are commonly linked to four different amino acid residues. These amino acid residues are classified as O-linked (serine, threonine, and hydroxylysine) and N-linked (asparagine). The O-linked sugars are synthesized in the Golgi or rough Endoplasmic Reticulum (ER) from nucleotide sugars. The N-linked sugars are synthesized from a common precursor, and subsequently processed. It is believed that HCV envelope proteins are N-glycosylated. It is known in the art that addition of N-linked carbohydrate chains is important for stabilization of folding intermediates and thus for efficient folding, prevention of malfolding and degradation in the endoplasmic reticulum, oligomerization, biological activity, and transport of glycoproteins (see reviews by Rose, J. K. and Doms, R. W. 1988, Doms, R. W. et al. 1993, Helenius, A. 1994). The tripeptide sequences Asn-X-Ser and Asn-X-Thr (in which X can be any amino acid) on polypeptides are the consensus sites for binding N-linked oligosaccharides. After addition of the N-linked oligosaccharide to the polypeptide, the oligosaccharide is further processed into the complex type (containing N-acetylglucosamine, mannose, fucose, galactose and sialic acid) or the high-mannose type (containing N-acetylglucosamine and mannose). HCV envelope proteins are believed to be of the high-mannose type. N-linked oligosaccharide biosynthesis in yeast is very different from the biosynthesis in mammalian cells. In yeast the oligosaccharide chains are elongated in the Golgi through stepwise addition of mannose, leading to elaborate high-mannose structures, referred to as hyperglycosylation. In contrast therewith, proteins expressed in prokaryotes are never glycosylated.
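By way of illustration only, the consensus tripeptides Asn-X-Ser and Asn-X-Thr described above can be located in a one-letter amino acid sequence with a short scan. The following is a minimal sketch and not part of the disclosed methods: the function name `find_sequons`, the `exclude_proline` option and the toy sequence are hypothetical. The default follows the definition given above (X is any amino acid); the option reflects the common observation that proline at position X generally prevents glycosylation.

```python
import re

def find_sequons(protein: str, exclude_proline: bool = False) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, tripeptide) for each Asn-X-Ser/Thr match.

    `protein` is a one-letter amino acid string. By default X may be any
    residue, as in the text; set exclude_proline=True to skip Asn-Pro-Ser/Thr.
    """
    x = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping tripeptides are all reported.
    pattern = re.compile(rf"(?=(N{x}[ST]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]

# Hypothetical toy sequence for illustration; it is NOT an HCV sequence.
toy = "MNGSAANPTQNNTS"
print(find_sequons(toy))
print(find_sequons(toy, exclude_proline=True))
```

A match only marks a potential site; whether a given site is actually occupied in a given host cell has to be determined experimentally.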
To date, vaccination against disease has been proven to be the most cost effective and efficient method for controlling diseases. Despite promising results, efforts to develop an efficacious HCV vaccine, however, have been plagued with difficulties. A conditio sine qua non for vaccines is the induction of an immune response in patients. Consequently, HCV antigenic determinants should be identified, and administered to patients in a proper setting. Antigenic determinants can be divided in at least two forms, i.e. linear and conformational epitopes. Conformational epitopes result from the folding of a molecule in a three-dimensional space, including co- and posttranslational modifications, such as glycosylation. In general, it is believed that conformational epitopes will realize the most efficacious vaccines, since they represent epitopes which resemble native-like HCV epitopes, and which may be better conserved than the actual linear amino acid sequence.
Hence, the eventual degree of glycosylation of the HCV envelope proteins is of the utmost importance for generating native-like HCV antigenic determinants. However, there are seemingly insurmountable problems with culturing HCV, that result in only minute amounts of virions.
In addition, there are vast problems with the expression and purification of recombinant proteins, that result in either low amounts of proteins, hyperglycosylated proteins, or proteins that are not glycosylated.
In order to obtain glycosylation of an expressed protein, said protein needs to be targeted to the endoplasmic reticulum (ER). This process requires the presence of a pre-pro- or pre-sequence, the latter also known as signal peptide or leader peptide, at the amino-terminal end of the expressed protein. Upon translocation of the protein into the lumen of the ER, the pre-sequence is removed by means of a signal peptidase complex. A large number of pre-pro- and pre-sequences is currently known in the art. These include the S. cerevisiae α-mating factor leader (pre-pro; αMF or MFα), the Carcinus maenas hyperglycemic hormone leader sequence (pre; CHH), the S. occidentalis amylase leader sequence (pre; Amyl), the S. occidentalis glucoamylase Gam1 leader sequence (pre; Gam1), the fungal phytase leader sequence (pre; PhyS), the Pichia pastoris acid phosphatase leader sequence (pre; pho1), the yeast aspartic protease 3 signal peptide (pre; YAP3), the mouse salivary amylase signal peptide (pre) and the chicken lysozyme leader sequence (pre; CL).
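For orientation, the leader sequences enumerated above can be held in a small lookup structure that separates pre-pro-sequences from pre-sequences. This is merely a sketch for bookkeeping purposes: the class name `Leader`, the field names and the helper `by_kind` are hypothetical, and the entries only restate the abbreviations and source proteins named in the preceding paragraph.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leader:
    abbreviation: str  # abbreviation as used in the text
    source: str        # protein the leader is derived from
    kind: str          # "pre" or "pre-pro"

LEADERS = [
    Leader("αMF", "S. cerevisiae α-mating factor", "pre-pro"),
    Leader("CHH", "Carcinus maenas hyperglycemic hormone", "pre"),
    Leader("Amyl", "S. occidentalis amylase", "pre"),
    Leader("Gam1", "S. occidentalis glucoamylase", "pre"),
    Leader("PhyS", "fungal phytase", "pre"),
    Leader("pho1", "Pichia pastoris acid phosphatase", "pre"),
    Leader("YAP3", "yeast aspartic protease 3", "pre"),
    # No abbreviation is given in the text for this leader.
    Leader("(none given)", "mouse salivary amylase", "pre"),
    Leader("CL", "chicken lysozyme", "pre"),
]

def by_kind(kind: str) -> list[str]:
    """Return the abbreviations of all leaders of the given kind."""
    return [leader.abbreviation for leader in LEADERS if leader.kind == kind]

print("pre-pro:", by_kind("pre-pro"))
print("pre:", by_kind("pre"))
```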
The CHH leader has been coupled with hirudin and G-CSF (granulocyte colony stimulating factor) and expression of the CHH-hirudin and CHH-G-CSF proteins in Hansenula polymorpha results in correct removal of the leader sequence (Weydemann, U. et al. 1995, Fischer et al. in WO00/40727). The chicken lysozyme leader sequence has been fused to human interferon α2b (IFNα2b), human serum albumin and human lysozyme or 1,4-β-N-acetylmuramidase and expressed in S. cerevisiae (Rape in GenBank accession number AF405538, Okabayashi, K. et al. 1991, de Baetselier et al. in EP0362183, Oberto and Davison in EP0184575). Mustilli and coworkers (Mustilli, A. C. et al. 1999) have utilized the Kluyveromyces lactis killer toxin leader peptide for expression of HCV E2 in S. cerevisiae and K. lactis.
The HCV envelope proteins have been produced by recombinant techniques in Escherichia coli, insect cells, yeast cells and mammalian cells. However, expression in higher eukaryotes has been characterised by the difficulty of obtaining large amounts of antigens for eventual vaccine production. Expression in prokaryotes, such as E. coli, results in HCV envelope proteins that are not glycosylated. Expression of HCV envelope proteins in yeast resulted in hyperglycosylation. As already demonstrated in WO 96/04385, the expression of HCV envelope protein E2 in Saccharomyces cerevisiae leads to proteins which are heavily glycosylated. This hyperglycosylation leads to shielding of protein epitopes.
Although Mustilli and co-workers (Mustilli, A. C. et al. 1999) claim that expression of HCV E2 in S. cerevisiae results in core-glycosylation, the analysis of the intracellularly expressed material demonstrates that at least part of it is hyperglycosylated, while the correct processing of the remainder of this material has not been shown. The need for HCV envelope proteins derived from an intracellular source is well accepted (WO 96/04385 to Maertens et al. and Heile, J. M. et al. 2000). This need is further exemplified by the poor reactivity of the secreted yeast-derived E2 with sera of chimpanzees immunized with mammalian cell culture-derived E2 proteins, as evidenced in Figure 5 of Mustilli and coworkers (Mustilli, A. C. et al. 1999). This is further documented by Rosa and colleagues (Rosa, D. et al. 1996) who show that immunization with yeast-derived HCV envelope proteins fails to protect from challenge.
Consequently, there is a need for efficient expression systems resulting in large and cost-effective amounts of proteins and, in particular, such systems are needed for production of HCV envelope proteins. If a pre- or pre-pro-sequence is used to direct the protein of interest to the ER, then efficiency of the expression system is, amongst others, dependent on the efficiency and fidelity with which the pre- or pre-pro-sequences are removed from the protein of interest.
SUMMARY OF THE INVENTION
A first aspect of the present invention relates to recombinant nucleic acids comprising a nucleotide sequence encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof. More specifically said protein is characterized by the structure
CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is an HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
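The structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f] can be read as an ordered list of elements in which each of the indices a to f switches one optional element on (1) or off (0). The following minimal sketch illustrates that reading only; the function name `construct_layout` is hypothetical and the output is a schematic element order, not a sequence.

```python
from itertools import product

def construct_layout(a=0, b=0, c=0, d=0, e=0, f=0) -> str:
    """Return the element order of the protein for the given 0/1 indices."""
    flags = {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f}
    for name, value in flags.items():
        if value not in (0, 1):
            raise ValueError(f"index {name} must be 0 or 1, got {value!r}")
    n_terminal = [("A1", a), ("PS1", b), ("A2", c)]  # between CL and HCVENV
    c_terminal = [("A3", d), ("PS2", e), ("A4", f)]  # after HCVENV
    parts = ["CL"]
    parts += [element for element, present in n_terminal if present]
    parts.append("HCVENV")
    parts += [element for element, present in c_terminal if present]
    return "-".join(parts)

print(construct_layout())            # all indices 0: leader joined directly
print(construct_layout(a=1, b=1))    # adaptor and processing site before HCVENV
# Six binary indices give 2**6 = 64 possible layouts.
print(len({construct_layout(*bits) for bits in product((0, 1), repeat=6)}))
```

With all indices set to 0 the leader peptide is joined directly to the envelope protein; with a and b set to 1 an adaptor peptide and a processing site are placed between the leader and the envelope protein.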
The recombinant nucleic acids according to the invention may further comprise regulatory elements allowing expression of said protein in a eukaryotic host cell.
Another aspect of the invention relates to a recombinant nucleic acid according to the invention which is comprised in a vector. Said vector may be an expression vector and/or an autonomously replicating vector or an integrative vector.
A further aspect of the invention relates to a host cell harboring a recombinant nucleic acid according to the invention or a vector according to the invention. More particularly, said host cell is capable of expressing the protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof. More specifically, said protein is characterized by the structure
CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is an HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
The host cell according to the invention may be capable of removing the avian lysozyme leader peptide with high efficiency and fidelity and may be capable of processing the processing sites PS1 and/or PS2 in said protein translocated to the endoplasmic reticulum. Said host cell may further be capable of N-glycosylating said protein translocated to the endoplasmic reticulum or said protein translocated to the endoplasmic reticulum and processed at said sites PS1 and/or PS2. The host cell may be a eukaryotic cell such as a yeast cell.
A next aspect of the invention relates to a method for producing an HCV envelope protein or part thereof in a host cell, said method comprising transforming said host cell with a recombinant nucleic acid according to the invention or with a vector according to the invention, and wherein said host cell is capable of expressing a protein comprising the avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof. More particularly, said protein is characterized by the structure
CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is an HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
The method according to the invention may further comprise cultivation of said host cells in a suitable medium to obtain expression of said protein, isolation of the expressed protein from a culture of said host cells, or from said host cells. Said isolation may include one or more of (i) lysis of said host cells in the presence of a chaotropic agent, (ii) chemical modification of the cysteine thiol-groups in the isolated proteins wherein said chemical modification may be reversible or irreversible and (iii) heparin affinity chromatography.
FIGURE LEGENDS
Figure 1. Schematic map of the vector pGEMT-E1sH6RB which has the sequence as defined in SEQ ID NO:6.
Figure 2. Schematic map of the vector pCHH-Hir which has the sequence as defined in SEQ ID NO:9.
Figure 3. Schematic map of the vector pFPMT121 which has the sequence as defined in SEQ ID NO:12.
Figure 4. Schematic map of the vector pFPMT-CHH-E1-H6 which has the sequence as defined in SEQ ID NO:13.
Figure 5. Schematic map of the vector pFPMT-MFa-E1-H6 which has the sequence as defined in SEQ ID NO:16.
Figure 6. Schematic map of the vector pUC18-FMD-MFa-E1-H6 which has the sequence as defined in SEQ ID NO:17.
Figure 7. Schematic map of the vector pUC18-FMD-CL-E1-H6 which has the sequence as defined in SEQ ID NO:20.
Figure 8. Schematic map of the vector pFPMT-CL-E1-H6 which has the sequence as defined in SEQ ID NO:21.
Figure 9. Schematic map of the vector pSP72E2H6 which has the sequence as defined in SEQ ID NO:22.
Figure 10. Schematic map of the vector pMPT121 which has the sequence as defined in SEQ ID NO:23.
Figure 11. Schematic map of the vector pFPMT-MFa-E2-H6 which has the sequence as defined in SEQ ID NO:24.
Figure 12. Schematic map of the vector pMPT-MFa-E2-H6 which has the sequence as defined in SEQ ID NO:25.
Figure 13. Schematic map of the vector pMF30 which has the sequence as defined in SEQ ID NO:28.

Figure 14. Schematic map of the vector pFPMT-CL-E2-H6 which has the sequence as defined in SEQ ID NO:32.
Figure 15. Schematic map of the vector pUC18-FMD-CL-E1 which has the sequence as defined in SEQ ID NO:35.
Figure 16. Schematic map of the vector pFPMT-CL-E1 which has the sequence as defined in SEQ ID NO:36.
Figure 17. Schematic map of the vector pUC18-FMD-CL-H6-E1-K-H6 which has the sequence as defined in SEQ ID NO:39.
Figure 18. Schematic map of the vector pFPMT-CL-H6-K-E1 which has the sequence as defined in SEQ ID NO:40.
Figure 19. Schematic map of the vector pYIG5 which has the sequence as defined in SEQ ID NO:41.
Figure 20. Schematic map of the vector pYIG5E1H6 which has the sequence as defined in SEQ ID NO:42.
Figure 21. Schematic map of the vector pSY1 which has the sequence as defined in SEQ ID NO:43.
Figure 22. Schematic map of the vector pSY1aMFE1sH6a which has the sequence as defined in SEQ ID NO:44.
Figure 23. Schematic map of the vector pBSK-E2sH6 which has the sequence as defined in SEQ ID NO:45.
Figure 24. Schematic map of the vector pYIG5HCCL-22aH6 which has the sequence as defined in SEQ ID NO:46.

Figure 25. Schematic map of the vector pYYIG5E2H6 which has the sequence as defined in SEQ ID NO:47.
Figure 26. Schematic map of the vector pYIG7 which has the sequence as defined in SEQ ID NO:48.
Figure 27. Schematic map of the vector pYIG7E1 which has the sequence as defined in SEQ ID NO:49.
Figure 28. Schematic map of the vector pSY1YIG7E1s which has the sequence as defined in SEQ ID NO:50.
Figure 29. Schematic map of the vector pPICZalphaA which has the sequence as defined in SEQ ID NO:51.
Figure 30. Schematic map of the vector pPICZalphaD' which has the sequence as defined in SEQ ID NO:52.
Figure 31. Schematic map of the vector pPICZalphaE' which has the sequence as defined in SEQ ID NO:53.
Figure 32. Schematic map of the vector pPICZalphaD'E1sH6 which has the sequence as defined in SEQ ID NO:58.
Figure 33. Schematic map of the vector pPICZalphaE'E1sH6 which has the sequence as defined in SEQ ID NO:59.
Figure 34. Schematic map of the vector pPICZalphaD'E2sH6 which has the sequence as defined in SEQ ID NO:60.
Figure 35. Schematic map of the vector pPICZalphaE'E2sH6 which has the sequence as defined in SEQ ID NO:61.

Figure 36. Schematic map of the vector pUC18MFa which has the sequence as defined in SEQ ID NO:62.
Figure 37. Elution profile of size exclusion chromatography of IMAC-purified E2-H6 protein expressed from the MFa-E2-H6-expressing Hansenula polymorpha (see Example 15). The X-axis indicates the elution volume (in mL). The vertical lines through the elution profile indicate the fractions collected. "P1" = pooled fractions 4 to 9, "P2" = pooled fractions 30 to 35, and "P3" = pooled fractions 37 to 44. The Y-axis indicates absorbance given in mAU (milli absorbance units).
Figure 38. The different pools and fractions collected after size exclusion chromatography (see Figure 37) were analyzed by non-reducing SDS-PAGE followed by silver staining of the polyacrylamide gel. The analyzed pools ("P1", "P2", and "P3") and fractions (16 to 26) are indicated on top of the picture of the silver-stained gel. At the left (lane "M") are indicated the sizes of the molecular mass markers.
Figure 39. Fractions 17 to 23 of the size exclusion chromatographic step as shown in Figure 37 were pooled and alkylated. Thereafter, the protein material was subjected to Endo H treatment for deglycosylation. Untreated material and Endo H-treated material were separated on an SDS-PAGE gel and blotted to a PVDF membrane. The blot was stained with amido black.
Lane 1: Alkylated E2-H6 before Endo H-treatment.
Lane 2: Alkylated E2-H6 after Endo H-treatment.
Figure 40. Western-blot analysis of cell lysates of E1 expressed in Saccharomyces cerevisiae. The Western-blot was developed using the E1-specific monoclonal antibody IGH 201.
Lanes 1-4: expression product after 2, 3, 5 or 7 days expression, respectively, in a Saccharomyces clone transformed with pSY1YIG7E1s (SEQ ID NO:50, Figure 28) comprising the nucleotide sequence encoding the chicken lysozyme leader peptide joined to E1-H6.
Lanes 5-7: expression product after 2, 3 or 5 days expression, respectively, in a Saccharomyces clone transformed with pSY1aMFE1sH6aYIG1 (SEQ ID NO:44, Figure 22) comprising the nucleotide sequence encoding the α-mating factor leader peptide joined to E1-H6.

Lane 8: molecular weight markers with sizes as indicated.
Lane 9: purified E1s produced by HCV-recombinant vaccinia virus-infected mammalian cells.
Figure 41. Analysis of the immobilized metal ion affinity chromatography (IMAC)-purified E2-H6 protein expressed by and processed from CL-E2-H6 to E2-H6 by H. polymorpha (see Example 17). Proteins in different wash fractions (lanes 2 to 4) and elution fractions (lanes 5 to 7) were analyzed by reducing SDS-PAGE followed by silver staining of the gel (A, top picture) or by western blot using a specific monoclonal antibody directed against E2 (B, bottom picture). The sizes of the molecular mass markers are indicated at the left.
Figure 42. Elution profile of the first IMAC chromatography step on a Ni-IDA column (Chelating Sepharose FF loaded with Ni2+, Pharmacia) for the purification of the sulfonated H6-K-E1 protein produced by H. polymorpha (see Example 18). The column was equilibrated with buffer A (50 mM phosphate, 6 M GuHCl, 1 % Empigen BB (v/v), pH 7.2) supplemented with 20 mM imidazole. After sample application, the column was washed sequentially with buffer A containing 20 mM and 50 mM imidazole, respectively (as indicated on chromatogram). A further washing and elution step of the His-tagged products was performed by the sequential application of buffer B (PBS, 1 % Empigen BB, pH 7.2) supplemented with 50 mM imidazole and 200 mM imidazole, respectively (as indicated on chromatogram). The following fractions were pooled: wash pool 1 (fractions 8 to 11, wash with 50 mM imidazole). The eluted material was collected as separate fractions 63 to 72, or an elution pool (fractions 63 to 69) was made. The Y-axis indicates absorbance given in mAU (milli absorbance units). The X-axis indicates the elution volume in mL.
Figure 43. Analysis of the IMAC-purified H6-K-E1 protein (see Figure 42) expressed by and processed from CL-H6-K-E1 to H6-K-E1 by H. polymorpha. Proteins in the wash pool 1 (lane 12) and elution fractions 63 to 72 (lanes 2 to 11) were analyzed by reducing SDS-PAGE followed by silver staining of the gel (A, top picture). Proteins present in the sample before IMAC (lane 2), in the flow-through pool (lane 4), in wash pool 1 (lane 5) and in the elution pool (lane 6) were analyzed by western blot using a specific monoclonal antibody directed against E1 (IGH201) (B, bottom picture; no sample was loaded in lane 3). The sizes of the molecular mass markers (lanes M) are indicated at the left.

Figure 44. Elution profile of the second IMAC chromatography step on a Ni-IDA column (Chelating Sepharose FF loaded with Ni2+, Pharmacia) for the purification of E1 resulting from the in vitro processing of H6-K-E1 (purification: see Figure 42) with Endo Lys-C. The flow through was collected in different fractions (1 to 40) that were screened for the presence of E1s-products. The fractions (7 to 28) containing intact E1 processed from H6-K-E1 were pooled. The Y-axis indicates absorbance given in mAU (milli absorbance units). The X-axis indicates the elution volume in mL.
Figure 45. Western-blot analysis indicating specific E1s protein bands reacting with biotinylated heparin (see also Example 19). E1s preparations purified from HCV-recombinant vaccinia virus-infected mammalian cell culture or expressed by H. polymorpha were analyzed. The panel to the right of the vertical line shows a Western-blot developed with the biotinylated E1-specific monoclonal IGH 200. The panel to the left of the vertical line shows a Western-blot developed with biotinylated heparin. From these results it is concluded that mainly the lower-glycosylated E1s has high affinity for heparin.
Lanes M: molecular weight marker (molecular weights indicated at the left).
Lanes 1: E1s from mammalian cells and alkylated during isolation.
Lanes 2: E1s-H6 expressed by H. polymorpha and sulphonated during isolation.
Lanes 3: E1s-H6 expressed by H. polymorpha and alkylated during isolation.
Lanes 4: same material as loaded in lane 2 but treated with dithiothreitol to convert the sulphonated Cys-thiol groups to Cys-thiol.
Figure 46. Size exclusion chromatography (SEC) profile of the purified H. polymorpha-expressed E2-H6 in its sulphonated form, submitted to a run in PBS, 3% betain to force virus-like particle formation by exchange of Empigen BB for betain. The pooled fractions containing the VLPs used for further study are indicated by "H". The Y-axis indicates absorbance given in mAU (milli absorbance units). The X-axis indicates the elution volume in mL. See also Example 20.
Figure 47. Size exclusion chromatography (SEC) profile of the purified H. polymorpha-expressed E2-H6 in its alkylated form, submitted to a run in PBS, 3% betain to force virus-like particle formation by exchange of Empigen BB for betain. The pooled fractions containing the VLPs are indicated by "H". The Y-axis indicates absorbance given in mAU (milli absorbance units). The X-axis indicates the elution volume in mL. See also Example 20.
Figure 48. Size exclusion chromatography (SEC) profile of the purified H. polymorpha-expressed E1 in its sulphonated form, submitted to a run in PBS, 3% betain to force virus-like particle formation by exchange of Empigen BB for betain. The pooled fractions containing the VLPs are indicated by "H". The Y-axis indicates absorbance given in mAU (milli absorbance units). The X-axis indicates the elution volume in mL. See also Example 20.
Figure 49. Size exclusion chromatography (SEC) profile of the purified H. polymorpha-expressed E1 in its alkylated form, submitted to a run in PBS, 3% betain to force virus-like particle formation by exchange of Empigen BB for betain. The pooled fractions containing the VLPs are indicated by "H". The Y-axis indicates absorbance given in mAU (milli absorbance units). The X-axis indicates the elution volume in mL. See also Example 20.
Figure 50. SDS-PAGE (under reducing conditions) and western blot analysis of VLPs as isolated after size exclusion chromatography (SEC) as described in Figures 48 and 49. Left panel: silver-stained SDS-PAGE gel. Right panel: western blot using a specific monoclonal antibody directed against E1 (IGH201). Lanes 1: molecular weight markers (molecular weights indicated at the left); lanes 2: pool of VLPs containing sulphonated E1 (cfr. Figure 48); lanes 3: pool of VLPs containing alkylated E1 (cfr. Figure 49). See also Example 20.
Figure 51. E1 produced in mammalian cells ("M") or Hansenula-produced E1 ("H") were coated on an ELISA solid support to determine the end point titer of antibodies present in sera after vaccination of mice with E1 produced in mammalian cells (top panel), or after vaccination of mice with Hansenula-produced E1 (bottom panel). The horizontal bar represents the mean antibody titer. The end-point titers (fold-dilution) are indicated on the Y-axis. See also Example 22.
Figure 52. Hansenula-produced E1 was alkylated ("A") or sulphonated ("S") and coated on an ELISA solid support to determine the end point titer of antibodies present in sera after vaccination of mice with Hansenula-produced E1 that was alkylated (top panel), or after vaccination of mice with Hansenula-produced E1 that was sulphonated (bottom panel). The horizontal bar represents the mean antibody titer. The end-point titers (fold-dilution) are indicated on the Y-axis. See also Example 23.
Figure 53. HCV E1 produced by HCV-recombinant vaccinia virus-infected mammalian cells and HCV E1 produced by H. polymorpha were coated directly to ELISA plates. End point titers of antibodies were determined in sera of chimpanzees vaccinated with E1 produced by mammalian cells (top panel) and of murine monoclonal antibodies raised against E1 produced by mammalian cells (bottom panel). Chimpanzees Yoran and Marti were prophylactically vaccinated. Chimpanzees Ton, Phil, Marcel, Peggy and Femora were therapeutically vaccinated. Black filled bars: ELISA plate coated with E1 produced by mammalian cells. Open bars: ELISA plate coated with E1 produced by Hansenula. The end-point titers (fold-dilution) are indicated on the Y-axis. See also Example 24.
Figure 54. Fluorophore-assisted carbohydrate gel electrophoresis of oligosaccharides released from E1 produced by recombinant vaccinia virus-infected mammalian cells and from E1-H6 protein produced by Hansenula.
Lane 1: Glucose ladder standard with indication at the left of the number of monosaccharides (3 to 10, indicated by G3 to G10).
Lane 2: 25 µg N-linked oligosaccharides released from (alkylated) E1 produced by mammalian cells.
Lane 3: 25 µg N-linked oligosaccharides released from (alkylated) E1-H6 produced by Hansenula.
Lane 4: 100 pmoles maltotetraose.
See also Example 25.

DETAILED DESCRIPTION OF THE INVENTION
In work leading to the present invention, it was observed that expression of HCV envelope proteins as αMF-HCVENV (α-mating factor-HCV envelope protein) pre-proproteins in Saccharomyces cerevisiae, Pichia pastoris and Hansenula polymorpha was possible but that the extent of removal of the pre-pro- or pre-sequences was unacceptably low and that removal of pre-pro- or pre-sequences very often does not occur with high fidelity. As a result, many different HCV envelope proteins are produced in these yeasts which do not have a natural amino-terminus (see Example 15). The majority of the HCV envelope proteins expressed in these yeast species were glycosylated (see Examples 6, 10, 13 and 25). More specifically the S. cerevisiae (glycosylation deficient mutant)- and H. polymorpha-expressed HCV envelope proteins were glycosylated in a manner resembling core-glycosylation. The HCV envelope proteins expressed in Pichia pastoris were hyperglycosylated despite earlier reports that proteins expressed in this yeast are normally not hyperglycosylated (Gellissen, G. 2000, Sugrue, R. J. et al. 1997).
Constructs were made for expression of the HCV envelope proteins as pre-pro- or pre-proteins wherein these pre-pro- or pre-sequences were either the Carcinus maenas hyperglycemic hormone leader sequence (pre; CHH), the S. occidentalis amylase leader sequence (pre; Amyl), the S. occidentalis glucoamylase Gam1 leader sequence (pre; Gam1), the fungal phytase leader sequence (pre; PhyS), the Pichia pastoris acid phosphatase leader sequence (pre; pho1), the yeast aspartic protease 3 signal peptide (pre; YAP3), the mouse salivary amylase signal peptide (pre) and the chicken lysozyme leader sequence (pre; CL). Only for one of these pre-pro-HCVENV or pre-HCVENV proteins, removal of the pre-pro- or pre-sequence with high frequency and high fidelity was observed. This was surprisingly found for the chicken lysozyme leader sequence (CL) and was confirmed both in S. cerevisiae and H. polymorpha (see Example 16). The CL signal peptide is thus performing very well for expression of glycosylated HCV envelope proteins in eukaryotic cells. This unexpected finding is reflected in the different aspects and embodiments of the present invention as presented below.
A first aspect of the current invention relates to a recombinant nucleic acid comprising a nucleotide sequence encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof.

In one embodiment thereto, the recombinant nucleic acid comprises a nucleotide sequence encoding a protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
In a further embodiment, the recombinant nucleic acids according to the invention further comprise regulatory elements allowing expression in a eukaryotic host cell of said protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof, or of said protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f].
The terms "polynucleotide", "polynucleic acid", "nucleic acid sequence", "nucleotide sequence", "nucleic acid molecule", "oligonucleotide", "probe" or "primer", when used herein, refer to nucleotides, either ribonucleotides, deoxyribonucleotides, peptide nucleotides or locked nucleotides, or a combination thereof, in a polymeric form of any length or any shape (e.g. branched DNA). Said terms furthermore include double-stranded (ds) and single-stranded (ss) polynucleotides as well as triple-stranded polynucleotides. Said terms also include known nucleotide modifications such as methylation, cyclization and 'caps' and substitution of one or more of the naturally occurring nucleotides with an analog such as inosine or with non-amplifiable monomers such as HEG (hexaethylene glycol).
Ribonucleotides are denoted as NTPs, deoxyribonucleotides as dNTPs and dideoxyribonucleotides as ddNTPs.
Nucleotides can generally be labeled radioactively, chemiluminescently, fluorescently, phosphorescently or with infrared dyes or with a surface-enhanced Raman label or plasmon resonant particle (PRP). Said terms "polynucleotide", "polynucleic acid", "nucleic acid sequence", "nucleotide sequence", "nucleic acid molecule", "oligonucleotide", "probe" or "primer" also encompass peptide nucleic acids (PNAs), a DNA analogue in which the backbone is a pseudopeptide consisting of N-(2-aminoethyl)-glycine units rather than a sugar. PNAs mimic the behavior of DNA and bind complementary nucleic acid strands. The neutral backbone of PNA results in stronger binding and greater specificity than normally achieved. In addition, the unique chemical, physical and biological properties of PNA have been exploited to produce powerful biomolecular tools, antisense and antigene agents, molecular probes and biosensors. PNA probes can generally be shorter than DNA probes and are generally from 6 to 20 bases in length and more optimally from 12 to 18 bases in length (Nielsen, P. E. 2001).
Said terms further encompass locked nucleic acids (LNAs), which are RNA derivatives in which the ribose ring is constrained by a methylene linkage between the 2'-oxygen and the 4'-carbon. LNAs display unprecedented binding affinity towards DNA or RNA target sequences. LNA nucleotides can be oligomerized and can be incorporated in chimeric or mix-meric LNA/DNA or LNA/RNA molecules. LNAs seem to be nontoxic for cultured cells (Orum, H. and Wengel, J. 2001, Wahlestedt, C. et al. 2000). In general, chimeras or mix-mers of any of DNA, RNA, PNA and LNA are considered as well as any of these wherein thymine is replaced by uracil.
The term "protein" refers to a polymer of amino acids and does not refer to a specific length of the product; thus, peptides, oligopeptides, and polypeptides are included within the definition of protein. This term also does not refer to or exclude post-expression modifications of the protein, for example, glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example, polypeptides containing one or more analogues of an amino acid (including, for example, unnatural amino acids, PNA, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.
With "pre-pro-protein" or "pre-protein" is, when used herein, meant a protein comprising a pre-pro-sequence joined to a protein of interest or a protein comprising a pre-sequence joined to a protein of interest, respectively. As alternatives for "pre-sequence", the terms "signal sequence", "signal peptide", "leader peptide", or "leader sequence" are used; all refer to an amino acid sequence that targets a pre-protein to the rough endoplasmic reticulum (ER), which is a prerequisite for (N-)glycosylation. The "signal sequence", "signal peptide", "leader peptide", or "leader sequence" is cleaved off, i.e. "removed" from the protein comprising the signal sequence joined to a protein of interest, on the luminal side of this ER by host-specific proteases referred to as signal peptidases. Likewise, a pre-pro-protein is converted to a pro-protein upon translocation to the lumen of the ER. Depending on the nature of the "pro" amino acid sequence, it may or may not be removed by the host cell expressing the pre-pro-protein. A well-known pre-pro-amino acid sequence is the α mating factor pre-pro-sequence of the S. cerevisiae α mating factor.
With "recombinant nucleic acid" is intended a nucleic acid of natural or synthetic origin which has been subjected to at least one recombinant DNA technical manipulation such as restriction enzyme digestion, PCR, ligation, dephosphorylation, phosphorylation, mutagenesis, adaptation of codons for expression in a heterologous cell etc. In general, a recombinant nucleic acid is a fragment of a naturally occurring nucleic acid or comprises at least two nucleic acid fragments not naturally associated or is a fully synthetic nucleic acid.
With "an avian leader peptide or a functional equivalent thereof joined to a HCV envelope protein or a part thereof" is meant that the C-terminal amino acid of said leader peptide is covalently linked via a peptide bond to the N-terminal amino acid of said HCV envelope protein or part thereof. Alternatively, the C-terminal amino acid of said leader peptide is separated from the N-terminal amino acid of said HCV envelope protein or part thereof by a peptide or protein. Said peptide or protein may have the structure -[(A1)a - (PS1)b - (A2)c] as defined above.
The derivation of the HCV envelope protein of interest from the protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof, or of the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f], can be performed in vivo by the proteolytic machinery of the cells in which the pre-protein is expressed. More specifically, the step consisting of removal of the avian leader peptide is preferably performed in vivo by the proteolytic machinery of the cells in which the pre-protein is expressed. Derivation may, however, also be performed solely in vitro after and/or during isolation and/or purification of the pre-protein and/or protein from the cells expressing the pre-protein and/or from the culture fluid in which the cells expressing the pre-protein are grown. Alternatively, said in vivo derivation is performed in combination with said in vitro derivation.
Derivation of the HCV protein of interest from a recombinantly expressed pre-protein can further comprise the use of (a) proteolytic enzyme(s) in a polishing step wherein all or most of the contaminating proteins co-present with the protein of interest are degraded and wherein the protein of interest is resistant to the polishing proteolytic enzyme(s). Derivation and polishing are not mutually exclusive processes and may be obtained by using the same single proteolytic enzyme. As an example is given here the HCV E1s protein of HCV genotype 1b (SEQ ID NO:2), which is devoid of Lys-residues. By digesting a protein extract containing said HCV E1 proteins with the Endoproteinase Lys-C (endo-lys C), the E1 proteins will not be degraded whereas contaminating proteins containing one or more Lys-residues are degraded. Such a process may significantly simplify or enhance isolation and/or purification of the HCV E1 proteins. Furthermore, by including in a pre-protein an additional Lys-residue, e.g. between a leader peptide and a HCV E1 protein, the additional advantageous possibility of correct in vitro separation of the leader peptide from the HCV E1 pre-protein is obtainable. Other HCV E1 proteins may comprise a Lys-residue at one or more of the positions 4, 40, 42, 44, 61, 65 or 179 (wherein position 1 is the first, N-terminal natural amino acid of the E1 protein, i.e. position 192 in the HCV polyprotein). In order to enable the use of endo-lys C as described above, said Lys-residues may be mutated into another amino acid residue, preferably into an Arg-residue.
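The endo-lys C polishing idea described above can be illustrated with a minimal Python sketch. It assumes an idealized enzyme that cuts after every Lys (K) and nowhere else; the function names and the toy sequences are hypothetical and are not the sequences of SEQ ID NO:2.

```python
def lysc_fragments(seq: str) -> list:
    """Split a one-letter protein sequence C-terminal of every Lys (K).

    Idealized model: every K is cut, nothing else is.
    """
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "K":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def is_lysc_resistant(seq: str) -> bool:
    """A sequence without any Lys is not cut in this idealized model."""
    return "K" not in seq


def mutate_lys_to_arg(seq: str) -> str:
    """Replace every Lys by Arg, the preferred substitution named in the text."""
    return seq.replace("K", "R")


# Toy sequences (hypothetical placeholders, not real leader or E1 sequences).
leader_plus_lys = "MLLK"     # leader peptide followed by an added Lys
lys_free_protein = "YEVRN"   # stands in for a Lys-free E1 protein
print(lysc_fragments(leader_plus_lys + lys_free_protein))  # ['MLLK', 'YEVRN']
```

In this model a Lys placed between the leader and a Lys-free protein yields exactly two fragments, which mirrors the in vitro separation mentioned above.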
With a "correctly removed" leader peptide is meant that said leader peptide is removed from the protein comprising the signal sequence joined to a protein of interest with high efficiency, i.e. a large number of pre-(pro-)proteins is converted to pro-proteins or proteins, and with high fidelity, i.e. only the pre-amino acid sequence is removed and not any amino acids of the protein of interest joined to said pre-amino acid sequence. With "removal of a leader peptide with high efficiency" is meant that at least about 40%, but more preferentially about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or even 99% of the pre-proteins is converted to the protein from which the pre-sequence is removed. Alternatively, if a substantial part of the expressed pre-proteins is not converted to the protein from which the pre-sequence is removed, these pre-proteins may still be purified.
With "functional equivalent of the avian lysozyme (CL) leader peptide" is meant a CL leader peptide wherein one or more amino acids have been substituted for another amino acid and whereby said substitution is a conservative amino acid substitution. With "conservative amino acid substitution" is meant a substitution of an amino acid belonging to a group of conserved amino acids with another amino acid belonging to the same group of conserved amino acids. As groups of conserved amino acids are considered: the group consisting of Met, Ile, Leu and Val; the group consisting of Arg, Lys and His; the group consisting of Phe, Trp and Tyr; the group consisting of Asp and Glu; the group consisting of Asn and Gln; the group consisting of Cys, Ser and Thr; and the group consisting of Ala and Gly. An exemplary conservative amino acid substitution in the CL leader peptide is the natural variation at position 6, the amino acid at this position being either Val or Ile; another variation occurs at position 17, the amino acid at this position being, amongst others, Leu or Pro (see SEQ ID NO:1). The resulting CL leader peptides are thus to be considered as functional equivalents. Other functional equivalents of the CL leader peptides include those leader peptides reproducing the same technical aspects as the CL leader peptides as described throughout the current invention, including deletion variants and insertion variants.
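The groups of conserved amino acids listed above lend themselves to a small check. The following Python sketch, offered only as an illustration, tests whether a candidate peptide differs from a reference solely by substitutions within one group; the function names and the toy peptides are hypothetical, and deletion or insertion variants are not covered.

```python
# Groups of conserved amino acids as listed in the text (one-letter codes).
CONSERVED_GROUPS = [
    set("MILV"),  # Met, Ile, Leu, Val
    set("RKH"),   # Arg, Lys, His
    set("FWY"),   # Phe, Trp, Tyr
    set("DE"),    # Asp, Glu
    set("NQ"),    # Asn, Gln
    set("CST"),   # Cys, Ser, Thr
    set("AG"),    # Ala, Gly
]


def same_group(x: str, y: str) -> bool:
    """True if x and y are identical or belong to the same conserved group."""
    return x == y or any(x in g and y in g for g in CONSERVED_GROUPS)


def is_conservative_variant(reference: str, candidate: str) -> bool:
    """True if candidate has the same length as reference and every
    position is either unchanged or conservatively substituted."""
    if len(reference) != len(candidate):
        return False
    return all(same_group(r, c) for r, c in zip(reference, candidate))


# Toy peptides (hypothetical, not SEQ ID NO:1).
print(is_conservative_variant("MRVL", "MKIL"))  # True: R->K and V->I
print(same_group("L", "P"))                     # False: Pro is in no group
```

Under these groups the Val/Ile variation is conservative, whereas a Leu/Pro exchange is not, so the latter would have to qualify as a functional equivalent on other grounds.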
With "A" or "adaptor peptide" is meant a peptide (e.g. 1 to 30 amino acids) or a protein which may serve as a linker between e.g. a leader peptide and a processing site (PS), a leader peptide and a protein of interest, a PS and a protein of interest, and/or a protein of interest and a PS; and/or may serve as a linker N- or C-terminal of e.g. a leader peptide, a PS or a protein of interest. The adaptor peptide "A" may have a certain three-dimensional structure, e.g. an α-helical or β-sheet structure or a combination thereof. Alternatively the three-dimensional structure of A is not well defined, e.g. a coiled-coil structure. The adaptor A may be part of e.g. a pre-sequence, a pro-sequence, a protein of interest sequence or a processing site. The adaptor A may serve as a tag enhancing or enabling detection and/or purification and/or processing of the protein of which A is a part. One example of an A peptide is the his-tag peptide Hn (HHHHHH; SEQ ID NO:63), wherein n usually is six, but may be 7, 8, 9, 10, 11, or 12. Other examples of A-peptides include the peptides EEGEPK (Kjeldsen et al. in WO98/28429; SEQ ID NO:64) or EEAEPK (Kjeldsen et al. in WO97/22706; SEQ ID NO:65) which, when present at the N-terminus of a protein of interest, were reported to increase fermentation yield but also to protect the N-terminus of the protein of interest against processing by dipeptidyl aminopeptidase, thus resulting in a homogenous N-terminus of the polypeptide. At the same time, in vitro maturation of the protein of interest, i.e. removal of said peptides EEGEPK (SEQ ID NO:64) and EEAEPK (SEQ ID NO:65) from the protein of interest, can be achieved by using e.g. endo-lys C, which cleaves C-terminal of the Lys-residue in said peptides. Said peptides thus serve the function of adaptor peptide (A) as well as processing site (PS) (see below). Adaptor peptides are given in SEQ ID NOs:63-65, 70-72 and 74-82. Another example of an adaptor peptide is the G4S immunosilent linker. Other examples of adaptor peptides or adaptor proteins are listed in Table 2 of Stevens (Stevens et al. 2000).
With "PS" or "processing site" is meant a specific protein processing or processable site. Said processing may occur enzymatically or chemically. Examples of processing sites prone to specific enzymatic processing include IEGR↓X (SEQ ID NO:66), IDGR↓X (SEQ ID NO:67) and AEGR↓X (SEQ ID NO:68), all recognized by the bovine factor Xa protease and cleaved between the Arg and Xaa (any amino acid) residues as indicated by the "↓" (Nagai, K. and Thogersen, H. C. 1984). Another example of a PS site is a dibasic site, e.g. Arg-Arg, Lys-Lys, Arg-Lys or Lys-Arg, which is cleavable by the yeast Kex2 protease (Julius, D. et al. 1984). The PS site may also be a monobasic Lys-site. Said monobasic Lys-PS-site may also be included at the C-terminus of an A peptide. Examples of A adaptor peptides comprising a C-terminal monobasic Lys-PS-site are given by SEQ ID NOs:64-65 and 74-76.
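As an illustration of the processing sites named above, the following Python sketch locates factor Xa sites (IEGR, IDGR or AEGR followed by any residue) and dibasic sites in a one-letter sequence using the standard `re` module. The function names and toy sequences are hypothetical; real protease specificity depends on context that this pattern matching ignores.

```python
import re

# Factor Xa motifs from the text, required to be followed by one residue (X).
FACTOR_XA = re.compile(r"(?:IEGR|IDGR|AEGR)(?=[A-Z])")
# Dibasic pairs (Arg-Arg, Lys-Lys, Arg-Lys, Lys-Arg); the lookahead wrapper
# lets overlapping pairs be reported as well.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def factor_xa_cut_positions(seq: str) -> list:
    """Return indices i such that the cut falls between seq[i-1] (Arg) and seq[i]."""
    return [m.end() for m in FACTOR_XA.finditer(seq)]


def dibasic_sites(seq: str) -> list:
    """Return (start index, pair) for every dibasic pair, overlaps included."""
    return [(m.start(), m.group(1)) for m in DIBASIC.finditer(seq)]


tagged = "HHHHHHIEGRYEVRN"   # toy: His-tag, factor Xa site, placeholder protein
cut = factor_xa_cut_positions(tagged)[0]
print(tagged[:cut], tagged[cut:])  # HHHHHHIEGR YEVRN
```

A motif at the very end of a sequence is not reported, since the definition requires a residue X after the Arg.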
Exoproteolytic removal of a His-tag (HHHHHH; SEQ ID NO:63) is possible by using the dipeptidyl aminopeptidase I (DAPase) alone or in combination with glutamine cyclotransferase (Qcyclase) and pyroglutamic aminopeptidase (pGAPase) (Pedersen, J. et al. 1999). Said exopeptidases comprising a recombinant His-tag (allowing removal of the peptidase from the reaction mixture by immobilized metal-affinity chromatography, IMAC) are commercially available, e.g. as the TAGZyme System of Unizyme Laboratories (Horsholm, DK). With "processing" is thus generally meant any method or procedure whereby a protein is specifically cleaved or cleavable at at least one processing site when said processing site is present in said protein. A PS may be prone to endoproteolytic cleavage or may be prone to exoproteolytic cleavage; in any case the cleavage is specific, i.e. does not extend to sites other than the sites recognized by the processing proteolytic enzyme. A number of PS sites are given in SEQ ID NOs:66-68 and 83-84.
The versatility of the [(A1/3)a/d - (PS1/2)b/e - (A2/4)c/f] structure as outlined above is demonstrated by means of some examples. In a first example, said structure is present at the C-terminal end of a protein of interest comprised in a pre-protein and wherein A3 is the "VIEGR" peptide (SEQ ID NO:69) which is overlapping with the factor Xa "IEGRX" PS site (SEQ ID NO:66) and wherein X=A4 is the histidine-tag (SEQ ID NO:63) (d, e and f thus are all 1 in this case). The HCV protein of interest can (optionally) be purified by IMAC. After processing with factor Xa, the (optionally purified) HCV protein of interest will carry at its C-terminus a processed PS site which is "IEGR" (SEQ ID NO:70). Variant processed factor Xa processing sites can be IDGR (SEQ ID NO:71) or AEGR (SEQ ID NO:72). In a further example, the [(A1/3)a/d - (PS1/2)b/e - (A2/4)c/f] structure is present at the N-terminus of the HCV protein of interest. Furthermore, A1 is the histidine-tag (SEQ ID NO:63), PS is the factor Xa recognition site (any of SEQ ID NOs:66-68) wherein X is the protein of interest, and wherein a=b=1 and c=0. Upon correct removal of a leader peptide, e.g. by the host cell, the resulting HCV protein of interest can be purified by IMAC (optional). After processing with factor Xa, the protein of interest will be devoid of the [(A1)a - (PS1)b - (A2)c] structure.
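The second example above (A1 is the histidine-tag, PS1 is a factor Xa site, a=b=1 and c=0) can be written out as a small Python sketch of the CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f] layout. The class name, the field names and the placeholder leader and envelope sequences are hypothetical; only HHHHHH and IEGR are taken from the text, and overlapping A/PS elements are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PreProtein:
    """Optional elements are None when the corresponding index (a..f) is 0."""
    cl: str
    hcvenv: str
    a1: Optional[str] = None
    ps1: Optional[str] = None
    a2: Optional[str] = None
    a3: Optional[str] = None
    ps2: Optional[str] = None
    a4: Optional[str] = None

    def indices(self) -> Dict[str, int]:
        """Return the values of a, b, c, d, e and f (each 0 or 1)."""
        parts = (self.a1, self.ps1, self.a2, self.a3, self.ps2, self.a4)
        return {k: int(p is not None) for k, p in zip("abcdef", parts)}

    def sequence(self) -> str:
        """Concatenate the elements that are present, N- to C-terminus."""
        order = (self.cl, self.a1, self.ps1, self.a2,
                 self.hcvenv, self.a3, self.ps2, self.a4)
        return "".join(p for p in order if p is not None)


# Placeholder sequences (hypothetical, not the patent's SEQ IDs).
n_tagged = PreProtein(cl="MKLLV", hcvenv="YEVRN", a1="HHHHHH", ps1="IEGR")
print(n_tagged.indices())   # {'a': 1, 'b': 1, 'c': 0, 'd': 0, 'e': 0, 'f': 0}
print(n_tagged.sequence())  # MKLLVHHHHHHIEGRYEVRN
```

A repeat such as A1-A1 would simply be passed as one longer string for `a1`, which keeps the index at 1.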

It will furthermore be clear that any of A1, A2, A3, A4, PS1 and PS2, when present, may be present in a repeat structure. Such a repeat structure, when present, is in this context still counted as 1, i.e. a, b, c, d, e, or f are 1 even if e.g. A1 is occurring as e.g. 2 repeats (A1-A1).
With "HCV envelope protein" is meant a HCV E1 or HCV E2 envelope protein or a part thereof whereby said proteins may be derived from a HCV strain of any genotype. More specifically, HCVENV is chosen from the group of amino acid sequences consisting of SEQ ID NOs:85 to 98, amino acid sequences which are at least 90% identical to SEQ ID NOs:85 to 98, and fragments of any thereof. As "identical" amino acids are considered the groups of conserved amino acids as described above, i.e. the group consisting of Met, Ile, Leu and Val; the group consisting of Arg, Lys and His; the group consisting of Phe, Trp and Tyr; the group consisting of Asp and Glu; the group consisting of Asn and Gln; the group consisting of Cys, Ser and Thr; and the group consisting of Ala and Gly.
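Because members of one conserved group count as "identical" here, the 90% criterion differs from a strict residue-by-residue identity. The Python sketch below shows one possible reading, assuming two already aligned, gap-free sequences of equal length; the alignment step, the function names and the toy sequences are assumptions and are not specified in the text.

```python
# Map each residue to a group label; residues outside the listed groups
# (e.g. Pro) only match themselves.
GROUPS = ["MILV", "RKH", "FWY", "DE", "NQ", "CST", "AG"]
GROUP_OF = {aa: label for label, members in enumerate(GROUPS) for aa in members}


def residues_match(x: str, y: str) -> bool:
    """Identical residues, or residues from the same conserved group."""
    return x == y or (x in GROUP_OF and GROUP_OF.get(x) == GROUP_OF.get(y))


def group_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that match under the group rule."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    matches = sum(residues_match(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True if the group-based identity is at least the threshold (90% in the text)."""
    return group_identity(seq_a, seq_b) >= threshold


# Toy sequences (hypothetical): every position differs, yet all pairs share a group.
print(group_identity("MKVLA", "LRIIG"))  # 100.0
```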
More specifically, the term "HCV envelope proteins" relates to a polypeptide or an analogue thereof (e.g. mimotopes) comprising an amino acid sequence (and/or amino acid analogues) defining at least one HCV epitope of either the E1 or the E2 region, in addition to a glycosylation site. These envelope proteins may be monomeric, hetero-oligomeric or homo-oligomeric forms of recombinantly expressed envelope proteins. Typically, the sequences defining the epitope correspond to the amino acid sequences of either the E1 or the E2 region of HCV (either identically or via substitutions of analogues of the native amino acid residue that do not destroy the epitope).
It will be understood that the HCV epitope may co-locate with the glycosylation site. In general, the epitope-defining sequence will be 3 or 4 amino acids in length, more typically 5, 6, or 7 amino acids in length, more typically 8 or 9 amino acids in length, and even more typically 10 or more amino acids in length. With respect to conformational epitopes, the length of the epitope-defining sequence can be subject to wide variations, since it is believed that these epitopes are formed by the three-dimensional shape of the antigen (e.g. folding). Thus, the amino acids defining the epitope can be relatively few in number, but widely dispersed along the length of the molecule, being brought into the correct epitope conformation via folding. The portions of the antigen between the residues defining the epitope may not be critical to the conformational structure of the epitope. For example, deletion or substitution of these intervening sequences may not affect the conformational epitope provided sequences critical to epitope conformation are maintained (e.g. cysteines involved in disulfide bonding, glycosylation sites, etc.). A conformational epitope may also be formed by 2 or more essential regions of subunits of a homo-oligomer or hetero-oligomer.
As used herein, an epitope of a designated polypeptide denotes epitopes with the same amino acid sequence as the epitope in the designated polypeptide, and immunologic equivalents thereof. Such equivalents also include strain, subtype (=genotype), or type(group)-specific variants, e.g. of the currently known sequences or strains belonging to genotypes 1a, 1b, 1c, 1d, 1e, 1f, 2a, 2b, 2c, 2d, 2e, 2f, 2g, 2h, 2i, 3a, 3b, 3c, 3d, 3e, 3f, 3g, 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 4k, 4l, 5a, 5b, 6a, 6b, 6c, 7a, 7b, 7c, 8a, 8b, 9a, 9b, 10a, 11 (and subtypes thereof), 12 (and subtypes thereof) or 13 (and subtypes thereof) or any other newly defined HCV (sub)type. It is to be understood that the amino acids constituting the epitope need not be part of a linear sequence, but may be interspersed by any number of amino acids, thus forming a conformational epitope.
The HCV antigens of the present invention comprise conformational epitopes from the E1 and/or E2 (envelope) domains of HCV. The E1 domain, which is believed to correspond to the viral envelope protein, is currently estimated to span amino acids 192-383 of the HCV polyprotein (Hijikata, M. et al. 1991). Upon expression in a mammalian system (glycosylated), it is believed to have an approximate molecular weight of 35 kDa as determined via SDS-PAGE. The E2 protein, previously called NS1, is believed to span amino acids 384-809 or 384-746 (Grakoui, A. et al. 1993) of the HCV polyprotein and also to be an envelope protein. Upon expression in a vaccinia system (glycosylated), it is believed to have an apparent gel molecular weight of about 72 kDa. It is understood that these protein endpoints are approximations (e.g. the carboxy terminal end of E2 could lie somewhere in the 730-820 amino acid region, e.g. ending at amino acid 730, 735, 740, 742, 744, 745, preferably 746, 747, 748, 750, 760, 770, 780, 790, 800, 809, 810, 820). The E2 protein may also be expressed together with E1, and/or core (aa 1-191), and/or P7 (aa 747-809), and/or NS2 (aa 810-1026), and/or NS3 (aa 1027-1657), and/or NS4A (aa 1658-1711), and/or NS4B (aa 1712-1972), and/or NS5A (aa 1973-2420), and/or NS5B (aa 2421-3011), and/or any part of any of these HCV proteins different from E2. Likewise, the E1 protein may also be expressed together with E2, and/or core (aa 1-191), and/or P7 (aa 747-809), and/or NS2 (aa 810-1026), and/or NS3 (aa 1027-1657), and/or NS4A (aa 1658-1711), and/or NS4B (aa 1712-1972), and/or NS5A (aa 1973-2420), and/or NS5B (aa 2421-3011), and/or any part of any of these HCV proteins different from E1. Expression together with these other HCV proteins may be important for obtaining the correct protein folding.

The term "E1" as used herein also includes analogs and truncated forms that are immunologically cross-reactive with natural E1, and includes E1 proteins of genotypes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 or any other newly identified HCV type or subtype. The term "E2" as used herein also includes analogs and truncated forms that are immunologically cross-reactive with natural E2, and includes E2 proteins of genotypes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 or any other newly identified HCV type or subtype. For example, insertions of multiple codons between codon 383 and 384, as well as deletions of amino acids 384-387, have been reported (Kato, N. et al. 1992). It is thus also understood that the isolates used in the examples section of the present invention were not intended to limit the scope of the invention and that any HCV isolate from type 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 or any other new genotype of HCV is a suitable source of E1 and/or E2 sequence for the practice of the present invention.
Similarly, as described above, the HCV proteins that are co-expressed with the HCV envelope proteins of the present invention can be derived from any HCV type, thus also from the same type as the HCV envelope proteins of the present invention.
"E1/E2" as used herein refers to an oligomeric form of envelope proteins containing at least one E1 component and at least one E2 component.
The term "specific oligomeric" E1 and/or E2 and/or E1/E2 envelope proteins refers to all possible oligomeric forms of recombinantly expressed E1 and/or E2 envelope proteins which are not aggregates. E1 and/or E2 specific oligomeric envelope proteins are also referred to as homo-oligomeric E1 or E2 envelope proteins (see below). The term "single or specific oligomeric" E1 and/or E2 and/or E1/E2 envelope proteins refers to single monomeric E1 or E2 proteins (single in the strict sense of the word) as well as specific oligomeric E1 and/or E2 and/or E1/E2 recombinantly expressed proteins. These single or specific oligomeric envelope proteins according to the present invention can be further defined by the following formula (E1)x(E2)y wherein x can be a number between 0 and 100, and y can be a number between 0 and 100, provided that x and y are not both 0. With x=1 and y=0 said envelope proteins include monomeric E1.
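The (E1)x(E2)y formula and the surrounding definitions can be summarized in a short Python sketch. It assumes x and y are integers and treats the stated bounds as inclusive; the function name and the returned labels are hypothetical wording chosen for illustration.

```python
def classify_envelope_form(x: int, y: int) -> str:
    """Classify (E1)x(E2)y according to the definitions in the text.

    x and y must each lie between 0 and 100 and must not both be 0.
    """
    if not (0 <= x <= 100 and 0 <= y <= 100):
        raise ValueError("x and y must each be between 0 and 100")
    if x == 0 and y == 0:
        raise ValueError("x and y must not both be 0")
    if x + y == 1:
        return "monomeric E1" if x == 1 else "monomeric E2"
    if x > 0 and y > 0:
        return "E1/E2 oligomer"   # at least one E1 and at least one E2
    return "homo-oligomeric E1" if x > 0 else "homo-oligomeric E2"


print(classify_envelope_form(1, 0))  # monomeric E1
print(classify_envelope_form(2, 0))  # homo-oligomeric E1
print(classify_envelope_form(1, 1))  # E1/E2 oligomer
```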
The term "homo-oligomer" as used herein refers to a complex of E1 or E2 containing more than one E1 or E2 monomer, e.g. E1/E1 dimers, E1/E1/E1 trimers or E1/E1/E1/E1 tetramers and E2/E2 dimers, E2/E2/E2 trimers or E2/E2/E2/E2 tetramers; E1 pentamers and hexamers, E2 pentamers and hexamers or any higher-order homo-oligomers of E1 or E2 are all "homo-oligomers" within the scope of this definition. The oligomers may contain one, two, or several different monomers of E1 or E2 obtained from different types or subtypes of hepatitis C virus including for example those described by Maertens et al. in WO 94/25601 and WO 96/13590, both by the present applicants. Such mixed oligomers are still homo-oligomers within the scope of this invention, and may allow more universal diagnosis, prophylaxis or treatment of HCV.
The E1 and E2 antigens used in the present invention may be full-length viral proteins, substantially full-length versions thereof, or functional fragments thereof (e.g. fragments comprising at least one epitope and/or glycosylation site). Furthermore, the HCV antigens of the present invention can also include other sequences that do not block or prevent the formation of the conformational epitope of interest. The presence or absence of a conformational epitope can be readily determined through screening the antigen of interest with an antibody (polyclonal serum or monoclonal to the conformational epitope) and comparing its reactivity to that of a denatured version of the antigen which retains only linear epitopes (if any). In such screening using polyclonal antibodies, it may be advantageous to adsorb the polyclonal serum first with the denatured antigen and see if it retains antibodies to the antigen of interest.
The HCV proteins of the present invention may be glycosylated. Glycosylated proteins intend proteins that contain one or more carbohydrate groups, in particular sugar groups. In general, all eukaryotic cells are able to glycosylate proteins. After alignment of the different envelope protein sequences of HCV genotypes, it may be inferred that not all 6 glycosylation sites on the HCV E1 protein are required for proper folding and reactivity. For instance, HCV subtype 1b E1 protein contains 6 glycosylation sites, but some of these glycosylation sites are absent in certain other (sub)types. The fourth carbohydrate motif (on Asn250), present in types 1b, 6a, 7, 8, and 9, is absent in all other types known today. This sugar-addition motif may be mutated to yield a type 1b E1 protein with improved reactivity. Also, the type 2b sequences show an extra glycosylation site in the V5 region (on Asn299). The isolate 583, belonging to genotype 2c, even lacks the first carbohydrate motif in the V1 region (on Asn), while it is present on all other isolates (Stuyver, L. et al. 1994). However, even among the completely conserved sugar-addition motifs, the presence of the carbohydrate may not be required for folding, but may have a role in evasion of immune surveillance. Thus, the identification of the role of glycosylation can be further tested by mutagenesis of the glycosylation motifs.
Mutagenesis of a glycosylation motif (NXS or NXT sequences) can be achieved by either mutating the codons for N, S, or T, in such a way that these codons encode amino acids different from N in the case of N, and/or amino acids different from S or T in the case of S and in the case of T. Alternatively, the X position may be mutated into P, since it is known that NPS or NPT are not frequently modified with carbohydrates. After establishing which carbohydrate-addition motifs are required for folding and/or reactivity and which are not, combinations of such mutations may be made. Such experiments have been described extensively by Maertens et al. in WO 96/04385 (Example 8), which is included herein specifically by reference.
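The motif mutagenesis described above can be sketched on the protein level in Python. The sketch scans a one-letter sequence for NXS/NXT motifs and applies either of the two mutation routes from the text. As a simplification it treats NPS and NPT as non-motifs, although the text only says they are not frequently modified; the function names, the Asn-to-Gln default and the toy sequence are assumptions.

```python
import re

# N, then any residue except P, then S or T; the lookahead allows overlaps.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq: str) -> list:
    """Return 0-based positions of the Asn in every NXS/NXT motif (X != P)."""
    return [m.start() for m in SEQUON.finditer(seq)]


def mutate_asn(seq: str, n_pos: int, new: str = "Q") -> str:
    """Replace the motif's Asn by another residue (Gln is an assumed default)."""
    if seq[n_pos] != "N":
        raise ValueError("position does not hold Asn")
    return seq[:n_pos] + new + seq[n_pos + 1:]


def mutate_x_to_pro(seq: str, n_pos: int) -> str:
    """Replace the X residue following the Asn by Pro."""
    return seq[:n_pos + 1] + "P" + seq[n_pos + 2:]


toy = "ANGTANPSNASN"                          # hypothetical sequence, not an E1 sequence
print(find_sequons(toy))                      # [1, 8]
print(find_sequons(mutate_x_to_pro(toy, 1)))  # [8]
```

Combinations of mutations, as mentioned in the text, correspond to applying these functions in sequence and rescanning the result.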
The term glycosylation as used in the present invention refers to N-glycosylation unless otherwise specified.
In particular, the present invention relates to HCV envelope proteins, or parts thereof, that are core-glycosylated. In this respect, the term "core-glycosylation" refers to a structure "similar" to the structure as depicted in the boxed structure in Figure 3 of Herscovics and Orlean (Herscovics, A. and Orlean, P. 1993). Thus, the carbohydrate structure referred to contains 10 or 11 mono-saccharides. Notably, said disclosure is herein incorporated by reference. The term "similar" intends that not more than about 4 additional mono-saccharides have been added to the structure or that not more than about 3 mono-saccharides have been removed from the structure. Consequently, a carbohydrate structure consists most preferentially of 10 mono-saccharides, but minimally of 7, and more preferentially of 8 or 9 mono-saccharides, and maximally of 15 mono-saccharides, and more preferentially of 14, 13, 12, or 11 mono-saccharides. The mono-saccharides connoted are preferentially glucose, mannose or N-acetyl glucosamine.
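The stated bounds follow from simple arithmetic: a reference structure of 10 or 11 mono-saccharides, minus at most 3 or plus at most 4, gives a range of 7 to 15. The Python sketch below encodes only that count-based reading; the constant and function names are hypothetical, and the "about" in the text is treated as exact.

```python
REFERENCE_SIZES = (10, 11)  # mono-saccharides in the reference structure
MAX_REMOVED = 3             # "not more than about 3" removed
MAX_ADDED = 4               # "not more than about 4" added

MIN_SIZE = min(REFERENCE_SIZES) - MAX_REMOVED  # 10 - 3 = 7
MAX_SIZE = max(REFERENCE_SIZES) + MAX_ADDED    # 11 + 4 = 15


def is_core_like(monosaccharide_count: int) -> bool:
    """True if the count falls within the range given for core-glycosylation."""
    return MIN_SIZE <= monosaccharide_count <= MAX_SIZE


print(MIN_SIZE, MAX_SIZE)  # 7 15
```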
Another aspect of the present invention covers vectors comprising a polynucleic acid, or a part thereof, of the invention. Such vectors comprise universal cloning vectors such as the pUC-series or pEMBL-series vectors and furthermore include other cloning vectors such as cloning vectors requiring a DNA topoisomerase reaction for cloning, TA-cloning vectors and recombination-based cloning vectors such as those used in the Gateway system (InVitrogen). Vectors comprise plasmids, phagemids, cosmids, bacmids (baculovirus vectors) or may be viral or retroviral vectors. A vector can merely function as a cloning tool and/or -vehicle or may additionally comprise regulatory sequences such as promoters, enhancers and terminators or polyadenylation signals. Said regulatory sequences may enable expression of the information contained within the DNA fragment of interest cloned into a vector comprising said regulatory sequences. Expression may be the production of RNA molecules or mRNA molecules and, optionally, the production of protein molecules thereof. Expression may be the production of an RNA molecule by means of a viral polymerase promoter (e.g. SP6, T7 or T3 promoter) introduced to the 5'- or 3'-end of the DNA of interest. Expression may furthermore be transient expression or stable expression or, alternatively, controllable expression. Controllable expression comprises inducible expression, e.g. using a tetracycline-regulatable promoter, a stress-inducible promoter (e.g. the human hsp70 gene promoter), a metallothionein promoter, a glucocorticoid promoter or a progesterone promoter. Expression vectors are known in the art that mediate expression in bacteria (e.g. Escherichia coli, Streptomyces species), insect cells (Spodoptera frugiperda cells, Sf9 cells), plant cells (e.g. potato virus X-based expression vectors, see e.g. Vance et al. 1998 in WO98/44097) and mammalian cells (e.g. CHO or COS cells, Vero cells, cells from the HeLa cell line).
This aspect of the invention thus specifically relates to a vector comprising the recombinant nucleic acids according to the invention encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof, or a protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f].
Embodied in the present invention are also said vectors further comprising regulatory sequences allowing expression of said protein.
In a specific embodiment, said vector according to the invention is an expression vector.
In another specific embodiment, said vector according to the invention is an autonomously replicating vector or an integrative vector.
In yet another specific embodiment, said vector according to the invention is chosen from any of SEQ ID NOs: 20, 21, 32, 35, 36, 39, 40.
Suitable vectors or expression vectors of the invention are yeast vectors.
A yeast vector may comprise a DNA sequence enabling the vector to replicate autonomously.
Examples of such sequences are the yeast plasmid 2µ replication genes REP 1-3 and origin of replication. Other vectors integrate partially or completely into the yeast genome. Such integrative vectors are either targeted to specific genomic loci or integrate randomly. In P. pastoris, foreign DNA is targeted to the AOX1 and the HIS4 genes (Cregg, J. M. 1999); in P. methanolica to the AUG1 gene (Raymond, C. K. 1999). In most recombinant H. polymorpha strains, foreign DNA can be randomly integrated using HARS-sequence-harboring circular plasmids for transformation (Hollenberg, C. P. and Gellissen, G. 1997). Targeted integration can be achieved by homologous recombination using the MOX/TRP3 locus for disruption/integration (Agaphonov, M. O. et al. 1995, Sohn, J. H. et al. 1999), the LEU2 gene (Agaphonov, M. O. et al. 1999) or the rDNA cluster (Cox, H. et al. 2000). Transformations in H. polymorpha typically result in a variety of individual, mitotically stable strains containing single to multiple copies of the expression cassette in a head-to-tail arrangement. Strains with up to 100 copies have been identified (Hollenberg, C. P. and Gellissen, G. 1997). Random multiple-copy integration can be forced in the uracil-auxotroph H. polymorpha strain RB11 by a sequence of passages under selective conditions if a H. polymorpha or S. cerevisiae-derived URA3 gene is present. A HARS sequence can be excluded (Gatzke, R. et al. 1995) or can be present (Hollenberg, C. P. and Gellissen, G. 1997). This passaging furthermore leads to mitotically stable strains. The vector may also comprise a selectable marker, e.g. the Schizosaccharomyces pombe TPI gene as described by Russell (Russell, P. R. 1985), or the yeast URA3 gene. Other marker genes so far used for transformation of Saccharomyces, for example TRP5, LEU2, ADE1, ADE2, HIS3, HIS4, LYS2, may be obtained from e.g. Hansenula, Pichia or Schwanniomyces.
"Regulatory elements (or sequences) allowing expression of a protein in a eukaryotic host" are to be understood to comprise at least a genetic element displaying promoter activity and a genetic element displaying terminator activity whereby said regulatory elements are operably linked to the open reading frame encoding the protein to be expressed.
The term "promoter" refers to a nucleotide sequence which is comprised of consensus sequences which allow the binding of RNA polymerase to the DNA template in a manner such that mRNA production initiates at the normal transcription initiation site for the adjacent structural gene.
The term "operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence "operably linked" to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.
An "open reading frame" (ORF) is a region of a polynucleotide sequence which encodes a polypeptide and does not contain stop codons; this region may represent a portion of a coding sequence or a total coding sequence.
A "coding sequence" is a polynucleotide sequence which is transcribed into mRNA
and/or translated into a polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus. A coding sequence can include but is not limited to mRNA, DNA (including cDNA), and recombinant polynucleotide sequences.
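As a rough illustration of the definitions above, the following Python sketch locates the first coding sequence in a DNA string, i.e. a stretch that begins at a translation start codon, continues in frame without stop codons, and ends at a translation stop codon. The function name `find_coding_sequence` and the example sequence are hypothetical and not taken from the invention; the sketch assumes the standard start codon ATG and the three standard stop codons, and it scans one strand only.

```python
# Assumption: standard genetic code (ATG start; TAA, TAG, TGA stops).
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_coding_sequence(dna):
    """Return (start, end, sequence) of the first start-to-stop coding sequence.

    Positions are 0-based and end-exclusive; the returned sequence includes
    the stop codon. Returns None if no complete coding sequence is found.
    """
    dna = dna.upper()
    start = dna.find(START_CODON)
    while start != -1:
        # Read codon by codon, in frame with the start codon.
        for pos in range(start, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3, dna[start:pos + 3]
        # No in-frame stop codon was found: try the next start codon.
        start = dna.find(START_CODON, start + 1)
    return None


# Hypothetical example sequence: ATG AAA TTT TAA flanked by two bases.
print(find_coding_sequence("ccATGAAATTTTAAgg"))
```

The open reading frame in the sense defined above is the part of the returned sequence that precedes the stop codon.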
Many regulatory elements are known in the art. Examples of suitable yeast promoters are the Saccharomyces cerevisiae MFα1, TPI, ADH I, ADH II or PGK promoters, or corresponding promoters from other yeast species, e.g. Schizosaccharomyces pombe. Examples of suitable promoters are described by, for instance, (Alber, T. and Kawasaki, G. 1982, Ammerer, G. 1983, Ballou, L. et al. 1991, Hitzeman, R. A. et al. 1980, Kawasaki, G. and Fraenkel, D. G. 1982, Russell, D. W. et al. 1983, Russell, P. R. 1983, Russell, P. R. and Hall, B. D. 1983). A suitable yeast terminator is, e.g. the TPI terminator (Alber, T. and Kawasaki, G. 1982), or the yeast CYC1 terminator. For methylotrophic or facultative methylotrophic yeast species, the strong and regulatable promoters of the enzymes involved in the methanol utilization pathway are good candidate promoters and include the promoters of the alcohol oxidase genes (AOX1 of Pichia pastoris, AUG1 of P. methanolica, AOD1 of Candida boidinii, and MOX of Hansenula polymorpha), the formaldehyde dehydrogenase promoter (FLD1 of P. pastoris), the dihydroxyacetone synthase promoter (DAS1 of C. boidinii) and the formate dehydrogenase promoter (FMD of H. polymorpha). Other promoters include the GAP1 promoter of P. pastoris or H. polymorpha and the PMA1 and TPS1 promoter of H. polymorpha ((Gellissen, G. 2000), and references cited therein). The terminator elements derived from any of these genes are examples of suitable terminator elements; more specifically, suitable terminator elements include the AOD1, AOX1 and MOX terminator elements.
A further aspect of the current invention covers host cells comprising a recombinant nucleic acid or a vector according to the invention.
In a specific embodiment thereto, said host cells comprising a recombinant nucleic acid or a vector according to the invention are capable of expressing the protein according to the invention comprising the avian lysozyme leader peptide or a functional variant thereof joined to an HCV envelope protein or a part thereof.
In an alternative embodiment, said host cells are capable of expressing the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
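Because each of the indices a to f is either 0 or 1, the structure above describes a finite family of element orders. The following Python sketch enumerates them. It is only an illustration: the names `build_layout` and `all_layouts` are hypothetical, each element is treated as an opaque label rather than a sequence, and the optional case in which an adaptor peptide is part of a processing site is not modelled.

```python
from itertools import product

# Optional elements in the order in which they appear in the structure.
# CL and HCVENV are always present.
N_TERMINAL = ("A1", "PS1", "A2")
C_TERMINAL = ("A3", "PS2", "A4")


def build_layout(a, b, c, d, e, f):
    """Return the element order of one construct, e.g. 'CL-PS1-HCVENV'."""
    before = [name for name, flag in zip(N_TERMINAL, (a, b, c)) if flag]
    after = [name for name, flag in zip(C_TERMINAL, (d, e, f)) if flag]
    return "-".join(["CL", *before, "HCVENV", *after])


def all_layouts():
    """Enumerate the layouts for every combination of a..f in {0, 1}."""
    return [build_layout(*flags) for flags in product((0, 1), repeat=6)]


layouts = all_layouts()
print(len(layouts))   # number of index combinations
print(layouts[0])     # all indices 0: leader joined directly to HCVENV
print(layouts[-1])    # all indices 1: every optional element present
```

With all indices set to 0 the leader peptide is joined directly to the envelope protein, which corresponds to the simplest construct described above.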
In a further specific embodiment thereto, said host cells comprising a recombinant nucleic acid or a vector according to the invention are capable of translocating the protein comprising the avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof to the endoplasmic reticulum upon removal of the avian lysozyme leader peptide.
In a further specific embodiment thereto, said host cells comprising a recombinant nucleic acid or a vector according to the invention are capable of translocating the protein [(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f] to the endoplasmic reticulum upon removal of the CL peptide wherein said protein and said CL peptide are derived from the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
Also embodied are host cells comprising a recombinant nucleic acid or a vector according to the invention which are capable of processing the processing sites PS1 and/or PS2 in said protein translocated to the endoplasmic reticulum.
Also embodied are host cells comprising a recombinant nucleic acid or a vector according to the invention which are capable of N-glycosylating said protein translocated to the endoplasmic reticulum.
Also embodied are host cells comprising a recombinant nucleic acid or a vector according to the invention which are capable of N-glycosylating said protein translocated to the endoplasmic reticulum and processed at said sites PS1 and/or PS2.
More specifically, the host cells comprising a recombinant nucleic acid or a vector according to the invention are eukaryotic cells and, more particularly, yeast cells such as cells of strains of Saccharomyces, such as Saccharomyces cerevisiae, Saccharomyces kluyveri, or Saccharomyces uvarum, Schizosaccharomyces, such as Schizosaccharomyces pombe, Kluyveromyces, such as Kluyveromyces lactis, Yarrowia, such as Yarrowia lipolytica, Hansenula, such as Hansenula polymorpha, Pichia, such as Pichia pastoris, Aspergillus species, Neurospora, such as Neurospora crassa, or Schwanniomyces, such as Schwanniomyces occidentalis, or mutant cells derived from any thereof.
The term "eukaryotic cells" includes lower eukaryotic cells as well as higher eukaryotic cells. Lower eukaryotic cells are cells such as yeast cells, fungal cells and the like. Particularly suited host cells in the context of the present invention are yeast cells or mutant cells derived from any thereof as described above. Mutant cells include yeast glycosylation minus strains, such as Saccharomyces glycosylation minus strains as used in the present invention. Glycosylation minus strains are defined as strains carrying a mutation, in which the nature of the mutation is not necessarily known, but resulting in a glycosylation of glycoproteins comparable to the core-glycosylation. In particular, it is contemplated that Saccharomyces glycosylation minus strains carry a mutation resulting in a significant shift in mobility on PAGE of the invertase protein. Invertase is a protein which is normally present in Saccharomyces in a hyperglycosylated form only (Ballou, L. et al. 1991). Glycosylation minus strains include mnn2 and/or och1 and/or mnn9 deficient strains. The mutant host cells of the invention do not include cells which, due to the mutation, have lost their capability to remove the avian lysozyme leader peptide from a protein comprising said leader peptide joined to a protein of interest.
Higher eukaryotic cells include host cells derived from higher animals, such as mammals, reptiles, insects, and the like. Presently preferred higher eukaryote host cells are derived from Chinese hamster (e.g. CHO), monkey (e.g. COS and Vero cells), baby hamster kidney (BHK), pig kidney (PK15), rabbit kidney 13 cells (RK13), the human osteosarcoma cell line 143 B, the human cell line HeLa and human hepatoma cell lines like Hep G2, and insect cell lines (e.g. Spodoptera frugiperda). The host cells may be provided in suspension or flask cultures, tissue cultures, organ cultures and the like. Alternatively the host cells may also be transgenic animals or transgenic plants.
Introduction of a vector, or an expression vector, into a host cell may be effectuated by any available transformation or transfection technique applicable to said host cell as known in the art. Such transformation or transfection techniques comprise heat-shock mediated transformation (e.g. of E. coli), conjugative DNA transfer, electroporation, PEG-mediated DNA uptake, liposome-mediated DNA uptake, lipofection, calcium-phosphate DNA coprecipitation, DEAE-dextran mediated transfection, direct introduction by e.g. microinjection or particle bombardment, or introduction by means of a virus, virion or viral particle.
Yet another aspect of the invention relates to methods for producing a HCV envelope protein or part thereof in a host cell, said method comprising transforming said host cell with the recombinant nucleic acid according to the invention or with the vector according to the invention, and wherein said host cell is capable of expressing a protein comprising the avian lysozyme leader peptide or a functional equivalent thereof joined to a HCV envelope protein or a part thereof.
In a specific embodiment thereto, said method for producing a HCV envelope protein or part thereof in a host cell comprises the step of transforming said host cell with the recombinant nucleic acid according to the invention or with the vector according to the invention, and wherein said host cell is capable of expressing the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
In another specific embodiment thereto, the host cell in said method is capable of translocating the protein CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f] to the endoplasmic reticulum upon removal of the CL peptide wherein said protein and said CL peptide are derived from the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
Also embodied is the method for producing a HCV envelope protein or part thereof wherein said host cell is capable of N-glycosylating said protein translocated to the endoplasmic reticulum.
Further embodied is the method for producing a HCV envelope protein or part thereof wherein said host cell is capable of N-glycosylating said protein translocated to the endoplasmic reticulum and processed at said sites PS1 and/or PS2.
More specifically, the host cell in any of said methods for producing a HCV envelope protein or part thereof is a eukaryotic cell and, more particularly, a yeast cell such as a cell of strains of Saccharomyces, such as Saccharomyces cerevisiae, Saccharomyces kluyveri, or Saccharomyces uvarum, Schizosaccharomyces, such as Schizosaccharomyces pombe, Kluyveromyces, such as Kluyveromyces lactis, Yarrowia, such as Yarrowia lipolytica, Hansenula, such as Hansenula polymorpha, Pichia, such as Pichia pastoris, Aspergillus species, Neurospora, such as Neurospora crassa, or Schwanniomyces, such as Schwanniomyces occidentalis, or mutant cells derived from any thereof.
Any of the methods according to the invention for producing a HCV envelope protein or part thereof may further comprise cultivation of the host cells comprising a recombinant nucleic acid or a vector according to the invention in a suitable medium to obtain expression of said protein.
A further embodiment thereto comprises isolation of the produced HCV envelope protein or part thereof from a culture of said host cells, or, alternatively, from said host cells.
Said isolation step may include one or more of (i) lysis of said host cells in the presence of a chaotropic agent, (ii) chemical and/or enzymatic modification of the cysteine thiol-groups in the isolated proteins wherein said modification may be reversible or irreversible, and (iii) heparin affinity chromatography.
Exemplary "chaotropic agents" are guanidinium chloride and urea. In general, a chaotropic agent is a chemical that can disrupt the hydrogen bonding structure of water. In concentrated solutions chaotropic agents can denature proteins because they reduce the hydrophobic effect.
In the HCV envelope proteins or parts thereof as described herein comprising at least one cysteine residue, but preferably 2 or more cysteine residues, the cysteine thiol-groups can be irreversibly protected by chemical or enzymatic means. In particular, "irreversible protection" or "irreversible blocking" by chemical means refers to alkylation, preferably alkylation of the HCV envelope proteins by means of alkylating agents, such as, for example, active halogens, ethylenimine or N-(iodoethyl)trifluoroacetamide. In this respect, it is to be understood that alkylation of cysteine thiol-groups refers to the replacement of the thiol-hydrogen by (CH2)nR, in which n is 0, 1, 2, 3 or 4 and R = H, COOH, NH2, CONH2, phenyl, or any derivative thereof. Alkylation can be performed by any method known in the art, such as, for example, active halogens X(CH2)nR in which X is a halogen such as I, Br, Cl or F. Examples of active halogens are methyliodide, iodoacetic acid, iodoacetamide, and 2-bromoethylamine. Other methods of alkylation include the use of NEM (N-ethylmaleimide) or Biotin-NEM, a mixture thereof, or ethylenimine or N-(iodoethyl)trifluoroacetamide both resulting in substitution of -H by -CH2-CH2-NH2 (Hermanson, G. T. 1996). The term "alkylating agents" as used herein refers to compounds which are able to perform alkylation as described herein. Such alkylations finally result in a modified cysteine, which can mimic other amino acids. Alkylation by an ethylenimine results in a structure resembling lysine, in such a way that new cleavage sites for trypsin are introduced (Hermanson, G. T. 1996). Similarly, the usage of methyliodide results in an amino acid resembling methionine, while the usage of iodoacetate and iodoacetamide results in amino acids resembling glutamic acid and glutamine, respectively. In analogy, these amino acids are preferably used in direct mutation of cysteine. Therefore, the present invention pertains to HCV envelope proteins as described herein, wherein at least one cysteine residue of the HCV envelope protein as described herein is mutated to a natural amino acid, preferentially to methionine, glutamic acid, glutamine or lysine. The term "mutated" refers to site-directed mutagenesis of nucleic acids encoding these amino acids, i.e. to the well known methods in the art, such as, for example, site-directed mutagenesis by means of PCR or via oligonucleotide-mediated mutagenesis as described in (Sambrook, J. et al. 1989). It should be understood that for the Examples section of the present invention, alkylation refers to the use of iodoacetamide as an alkylating agent unless otherwise specified.
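The correspondence described above between alkylating agents and the amino acids that the alkylated cysteine resembles can be summarised as a small lookup table. The Python sketch below is an illustration only: the function name `mutate_cysteines` and the example peptide are hypothetical, and the function works on a one-letter protein string, whereas actual mutation would be carried out at the nucleic acid level as described above.

```python
# Alkylating agent -> one-letter code of the amino acid that the alkylated
# cysteine resembles, following the description.
CYSTEINE_MIMICS = {
    "ethylenimine": "K",    # lysine
    "methyliodide": "M",    # methionine
    "iodoacetate": "E",     # glutamic acid
    "iodoacetamide": "Q",   # glutamine
}


def mutate_cysteines(protein, agent, positions=None):
    """Replace cysteines by the residue mimicked by agent-alkylated cysteine.

    positions is an optional iterable of 0-based indices; by default all
    cysteines are replaced. Raises ValueError if a position is not a cysteine.
    """
    replacement = CYSTEINE_MIMICS[agent]
    residues = list(protein)
    if positions is None:
        positions = [i for i, aa in enumerate(residues) if aa == "C"]
    for i in positions:
        if residues[i] != "C":
            raise ValueError(f"position {i} is {residues[i]!r}, not cysteine")
        residues[i] = replacement
    return "".join(residues)


# Hypothetical four-residue peptide, used only to show the call.
print(mutate_cysteines("ACDC", "iodoacetamide"))
print(mutate_cysteines("ACDC", "ethylenimine", positions=[1]))
```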
It is further understood that in the purification procedure, the cysteine thiol-groups of the HCV proteins or the parts thereof of the present invention can be reversibly protected. The purpose of reversible protection is to stabilize the HCV protein or part thereof. In particular, after reversible protection the sulfur-containing functional groups (e.g. thiols and disulfides) are retained in a non-reactive condition. The sulfur-containing functional groups are thus unable to react with other compounds, e.g. they have lost their tendency to form or exchange disulfide bonds, such as, for example
R1-SH + R2-SH ---X---> R1-S-S-R2 ;
R1-S-S-R2 + R3-SH ---X---> R1-S-S-R3 + R2-SH ;
R1-S-S-R2 + R3-S-S-R4 ---X---> R1-S-S-R3 + R2-S-S-R4 .
The described reactions between thiols and/or disulphide residues are not limited to intermolecular processes, but may also occur intramolecularly.
The term "reversible protection" or "reversible blocking" as used herein contemplates covalent binding of modification agents to the cysteine thiol-groups, as well as manipulating the environment of the HCV protein such that the redox state of the cysteine thiol-groups remains unaffected throughout subsequent steps of the purification procedure (shielding). Reversible protection of the cysteine thiol-groups can be carried out chemically or enzymatically.
The term "reversible protection by enzymatical means" as used herein contemplates reversible protection mediated by enzymes, such as for example acyl-transferases, e.g. acyl-transferases that are involved in catalysing thio-esterification, such as palmitoyl acyltransferase (see below).
The term "reversible protection by chemical means" as used herein contemplates reversible protection:
1. by modification agents that reversibly modify cysteinyls such as for example by sulphonation and thio-esterification;
Sulphonation is a reaction where thiol or cysteines involved in disulfide bridges are modified to S-sulfonate: RSH → RS-SO3- (Darbre, A. 1986) or RS-SR → 2 RS-SO3- (sulfitolysis; (Kumar, N. et al. 1986)). Reagents for sulfonation are e.g. Na2SO3, or sodium tetrathionate. The latter reagents for sulfonation are used in a concentration of 10-200 mM, and more preferentially in a concentration of 50-200 mM. Optionally sulfonation can be performed in the presence of a catalyst such as, for example Cu2+ (100 µM-1 mM) or cysteine (1-10 mM).
The reaction can be performed under protein denaturing as well as native conditions (Kumar, N. et al. 1985, Kumar, N. et al. 1986).
Thioester bond formation, or thio-esterification is characterised by:
RSH + R'CO-X → RS-COR'
in which X is preferentially a halogenide in the compound R'CO-X.

2. by modification agents that reversibly modify the cysteinyls of the present invention such as, for example, by heavy metals, in particular Zn2+, Cd2+, mono-, dithio- and disulfide-compounds (e.g. aryl- and alkylmethanethiosulfonate, dithiopyridine, dithiomorpholine, dihydrolipoamide, Ellman's reagent, aldrothiol™ (Aldrich) (Rein, A. et al. 1996), dithiocarbamates), or thiolation agents (e.g. glutathione, N-acetyl cysteine, cysteamine). Dithiocarbamates comprise a broad class of molecules possessing an R1R2NC(S)SR3 functional group, which gives them the ability to react with sulphydryl groups. Thiol containing compounds are preferentially used in a concentration of 0.1-50 mM, more preferentially in a concentration of 1-50 mM, and even more preferentially in a concentration of 10-50 mM;
3. by the presence of modification agents that preserve the thiol status (stabilise), in particular antioxidants, such as for example DTT, dihydroascorbate, vitamins and derivatives, mannitol, amino acids, peptides and derivatives (e.g. histidine, ergothioneine, carnosine, methionine), gallates, hydroxyanisole, hydroxytoluene, hydroquinone, hydroxymethylphenol and their derivatives in a concentration range of 10 µM-10 mM, more preferentially in a concentration of 1-10 mM;
4. by thiol stabilising conditions such as, for example, (i) cofactors such as metal ions (Zn2+, Mg2+), ATP, (ii) pH control (e.g. for proteins in most cases pH ~5 or pH is preferentially thiol pKa -2; e.g. for peptides purified by Reversed Phase Chromatography at pH ~2).
Combinations of reversible protection as described in (1), (2), (3) and (4) may result in similarly pure and refolded HCV proteins. In effect, combination compounds can be used, such as, for example Z103 (Zn carnosine), preferentially in a concentration of 1-10 mM. It should be clear that reversible protection also refers to, besides the modification groups or shielding described above, any cysteinyl protection method which may be reversed enzymatically or chemically, without disrupting the peptide backbone. In this respect, the present invention specifically refers to peptides prepared by classical chemical synthesis (see above), in which, for example, thioester bonds are cleaved by thioesterase, basic buffer conditions (Beekman, N. J. et al. 1997) or by hydroxylamine treatment (Vingerhoeds, M. H. et al. 1996).
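The nested concentration windows given above for the reversible protection agents can be collected in a small lookup structure. The Python sketch below is only an illustration: the dictionary keys and the function name `preference_level` are hypothetical labels, all values are expressed in mM, and the antioxidant window is taken as 10 µM to 10 mM (0.01 to 10 mM).

```python
# Concentration windows in mM, ordered from the broadest to the most
# preferred, as stated in the description. Keys are illustrative labels.
RANGES_MM = {
    "sulfonation reagent": [(10, 200), (50, 200)],
    "thiol-containing compound": [(0.1, 50), (1, 50), (10, 50)],
    "antioxidant": [(0.01, 10), (1, 10)],
}


def preference_level(agent_class, conc_mm):
    """Return how many of the nested windows contain conc_mm.

    0 means the concentration lies outside every stated window; the highest
    value for a class means it lies inside the most preferred window.
    """
    level = 0
    for low, high in RANGES_MM[agent_class]:
        if low <= conc_mm <= high:
            level += 1
    return level


print(preference_level("thiol-containing compound", 20))
print(preference_level("sulfonation reagent", 5))
```

Because the windows are nested, counting the windows that contain a value is equivalent to finding the most preferred window it falls into.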
Thiol containing HCV proteins can be purified, for example, on affinity chromatography resins which contain (1) a cleavable connector arm containing a disulfide bond (e.g. immobilised 5,5' dithiobis(2-nitrobenzoic acid) (Jayabaskaran, C. et al. 1987) and covalent chromatography on activated thiol-Sepharose 4B (Pharmacia)) or (2) an aminohexanoyl-4-aminophenylarsine as immobilised ligand. The latter affinity matrix has been used for the purification of proteins, which are subject to redox regulation and dithiol proteins that are targets for oxidative stress (Kalef, E. et al. 1993).
Reversible protection may also be used to increase the solubilisation and extraction of peptides (Pomroy, N. C. and Deber, C. M. 1998).
The reversible protection and thiol stabilizing compounds may be presented under a monomeric, polymeric or liposomic form.
The removal of the reversible protection state of the cysteine residues can be accomplished chemically or enzymatically by e.g.:
- a reductant, in particular DTT, DTE, 2-mercaptoethanol, dithionite, SnCl2, sodium borohydride, hydroxylamine, TCEP, in particular in a concentration of 1-200 mM, more preferentially in a concentration of 50-200 mM;
- removal of the thiol stabilising conditions or agents by e.g. pH increase;
- enzymes, in particular thioesterases, glutaredoxin, thioredoxin, in particular in a concentration of 0.01-5 µM, even more particularly in a concentration range of 0.1-5 µM;
- combinations of the above described chemical and/or enzymatical conditions.
The removal of the reversible protection state of the cysteine residues can be carried out in vitro or in vivo, e.g. in a cell or in an individual.
It will be appreciated that in the purification procedure, the cysteine residues may or may not be irreversibly blocked, or replaced by any reversible modification agent, as listed above.
A reductant according to the present invention is any agent which achieves reduction of the sulfur in cysteine residues, e.g. "S-S" disulfide bridges, desulphonation of the cysteine residue (RS-SO3- → RSH). An antioxidant is any reagent which preserves the thiol status or minimises "S-S" formation and/or exchanges. Reduction of the "S-S" disulfide bridges is a chemical reaction whereby the disulfides are reduced to thiol (-SH). The disulfide bridge breaking agents and methods disclosed by Maertens et al. in WO 96/04385 are hereby incorporated by reference in the present description. "S-S" reduction can be obtained by (1) enzymatic cascade pathways or by (2) reducing compounds. Enzymes like thioredoxin and glutaredoxin are known to be involved in the in vivo reduction of disulfides and have also been shown to be effective in reducing "S-S" bridges in vitro. Disulfide bonds are rapidly cleaved by reduced thioredoxin at pH 7.0, with an apparent second order rate constant that is around 10^4 times larger than the corresponding rate constant for the reaction with DTT. The reduction kinetics can be dramatically increased by preincubating the protein solution with 1 mM DTT or dihydrolipoamide (Holmgren, A. 1979). Thiol compounds able to reduce protein disulfide bridges are for instance dithiothreitol (DTT), dithioerythritol (DTE), β-mercaptoethanol, thiocarbamates, bis(2-mercaptoethyl) sulfone and N,N'-bis(mercaptoacetyl)hydrazine, and sodium dithionite. Reducing agents without thiol groups like ascorbate or stannous chloride (SnCl2), which have been shown to be very useful in the reduction of disulfide bridges in monoclonal antibodies (Thakur, M. L. et al. 1991), may also be used for the reduction of HCV proteins. In addition, changes in pH values may influence the redox status of HCV proteins. Sodium borohydride treatment has been shown to be effective for the reduction of disulfide bridges in peptides (Gailit, J. 1993). Tris(2-carboxyethyl)phosphine (TCEP) is able to reduce disulfides at low pH (Burns, J. et al. 1991). Selenol catalyses the reduction of disulfide to thiols when DTT or sodium borohydride is used as reductant. Selenocysteamine, a commercially available diselenide, was used as precursor of the catalyst (Singh, R. and Kats, L. 1995).
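The rate comparison between thioredoxin and DTT given above can be turned into a rough back-of-the-envelope calculation. The Python sketch below assumes, as a simplification that is not stated in the description, that the reductant is in large excess so that the reduction follows pseudo-first-order kinetics with an observed rate equal to the second order rate constant multiplied by the reductant concentration. Under that assumption, equal observed rates require a DTT concentration about 10^4 times the thioredoxin concentration. The function name `equivalent_dtt_mm` is hypothetical.

```python
# Ratio of the apparent second order rate constants, k(thioredoxin) / k(DTT),
# taken as "around 10^4" from the description.
RATE_RATIO = 1e4


def equivalent_dtt_mm(thioredoxin_um):
    """DTT concentration (mM) giving the same observed rate k * [reductant].

    Assumption: pseudo-first-order kinetics with the reductant in excess.
    """
    dtt_um = thioredoxin_um * RATE_RATIO
    return dtt_um / 1000.0  # convert µM to mM


for conc in (0.1, 5):
    print(conc, "µM thioredoxin ~", equivalent_dtt_mm(conc), "mM DTT")
```

On this simplified reading, the 0.1-5 µM range stated above for enzymes corresponds to roughly 1-50 mM DTT, which lies within the 1-200 mM range stated above for reductants.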
Heparin is known to bind to several viruses and consequently binding to the HCV envelope has already been suggested (Garson, J. A. et al. 1999). In this respect, in order to analyze potential binding of HCV envelope proteins to heparin, heparin can be biotinylated and subsequently the interaction of heparin with HCV envelope proteins can be analyzed, e.g. on microtiterplates coated with HCV envelope proteins. In this way different expression systems can be scrutinized. For example, a strong binding is observed with part of the HCV E1 expressed in Hansenula, while binding with HCV E1 from mammalian cell culture is absent. In this respect, the term "heparin affinity chromatography" relates to an immobilized heparin, which is able to specifically bind to HCV envelope proteins. Proteins of the high-mannose type bind agglutinins such as Lens culinaris, Galanthus nivalis, Narcissus pseudonarcissus, Pisum sativum or Allium ursinum. Moreover, N-acetylglucosamine can be bound by lectins, such as WGA (wheat germ agglutinin) and its equivalents. Therefore, one may employ lectins bound to a solid phase to separate the HCV envelope proteins of the present invention from cell culture supernatants, cell lysates and other fluids, e.g. for purification during the production of antigens for vaccine or immunoassay use.
With "HCV-recombinant vaccinia virus" is meant a vaccinia virus comprising a nucleic acid sequence encoding a HCV protein or part thereof.
A further aspect of the invention relates to an isolated HCV envelope protein or part thereof resulting from the method of production as described herein. In particular, the invention relates to an isolated HCV envelope protein or part thereof resulting from the expression in a eukaryotic cell of a recombinant nucleic acid comprising a nucleotide sequence encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to said HCV envelope protein or a part thereof. More specifically, said recombinant nucleic acid is encoding a protein which is characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
In a specific embodiment, the isolated HCV envelope protein or part thereof is derived from said protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to said HCV envelope protein or a part thereof. In another specific embodiment, the isolated HCV envelope protein or part thereof is derived from said protein which is characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
HCVENV is a HCV envelope protein or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.

Another aspect of the current invention relates to the use of the avian lysozyme leader peptide to direct a recombinantly expressed protein to the endoplasmic reticulum of Hansenula polymorpha or any mutant thereof.
Thus, all aspects and embodiments of the current invention as described above and relating to a HCV envelope protein can, specifically for H. polymorpha or any mutant thereof as host cell, be read as relating to a protein instead of relating to a HCV envelope protein.
More specifically, the current invention also relates to a recombinant nucleic acid comprising a nucleotide sequence encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to a protein of interest or a part thereof.
In one embodiment thereto, the recombinant nucleic acid comprises a nucleotide sequence encoding a protein characterized by the structure
CL-[(A1)a - (PS1)b - (A2)c]-PROT-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
PROT is a protein of interest or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
In a further embodiment, the recombinant nucleic acids according to the invention further comprise regulatory elements allowing expression in a H. polymorpha cell or any mutant thereof of said protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to a protein of interest or a part thereof, or of said protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-PROT-[(A3)d - (PS2)e - (A4)f].
Further included are vectors comprising said recombinant nucleic acids, host cells comprising said recombinant nucleic acids or said vectors, said host cells expressing the protein comprising an avian lysozyme leader peptide or a functional variant thereof joined to a protein of interest and methods for producing said protein of interest in said host cells.
A further aspect of the invention relates to an isolated protein of interest or part thereof resulting from the expression in a Hansenula cell of a recombinant nucleic acid comprising a nucleotide sequence encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to said protein of interest or a part thereof.

More specifically, said recombinant nucleic acid encodes a protein which is characterized by the structure
CL-[(A1)a - (PS1)b - (A2)c]-PROT-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
PROT is a protein of interest or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
In a specific embodiment, the isolated protein of interest or part thereof is derived from said protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to said protein of interest or a part thereof. In another specific embodiment, the isolated protein of interest or part thereof is derived from said protein which is characterized by the structure
CL-[(A1)a - (PS1)b - (A2)c]-PROT-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof,
A1, A2, A3 and A4 are adaptor peptides which can be different or the same,
PS1 and PS2 are processing sites which can be different or the same,
PROT is a protein of interest or a part thereof,
a, b, c, d, e and f are 0 or 1, and
wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
In a specific embodiment of the invention, said protein of interest or fragment thereof can e.g. be a viral envelope protein or a fragment thereof such as a HCV envelope protein or HBV (hepatitis B) envelope protein, or fragments thereof. In general, said protein of interest or fragment thereof can be any protein needing the N-glycosylation characteristics of the current invention. Other exemplary viral envelope proteins include the HIV (human immunodeficiency virus) envelope protein gp120 and viral envelope proteins of a virus belonging to the Flaviviridae.

The terms "HCV virus-like particle formed of a HCV envelope protein" or "oligomeric particles formed of HCV envelope proteins" are herein defined as structures of a specific nature and shape containing several basic units of the HCV E1 and/or E2 envelope proteins, which on their own are thought to consist of one or two E1 and/or E2 monomers, respectively.
It should be clear that the particles of the present invention are defined to be devoid of infectious HCV RNA genomes. The particles of the present invention can be higher-order particles of spherical nature which can be empty, consisting of a shell of envelope proteins in which lipids, detergents, the HCV core protein, or adjuvant molecules can be incorporated.
The latter particles can also be encapsulated by liposomes or apolipoproteins, such as, for example, apolipoprotein B or low density lipoproteins, or by any other means of targeting said particles to a specific organ or tissue. In this case, such empty spherical particles are often referred to as "virus-like particles" or VLPs. Alternatively, the higher-order particles can be solid spherical structures, in which the complete sphere consists of HCV E1 or E2 envelope protein oligomers, in which lipids, detergents, the HCV core protein, or adjuvant molecules can be additionally incorporated, or which in turn may be themselves encapsulated by liposomes or apolipoproteins, such as, for example, apolipoprotein B, low density lipoproteins, or by any other means of targeting said particles to a specific organ or tissue, e.g. asialoglycoproteins. The particles can also consist of smaller structures (compared to the empty or solid spherical structures indicated above) which are usually round-shaped (see further) and which usually do not contain more than a single layer of HCV envelope proteins.
A typical example of such smaller particles are rosette-like structures which consist of a lower number of HCV envelope proteins, usually between 4 and 16. A specific example of the latter includes the smaller particles obtained with E1s in 0.2% CHAPS as exemplified herein which apparently contain 8-10 monomers of E1s. Such rosette-like structures are usually organized in a plane and are round-shaped, e.g. in the form of a wheel. Again lipids, detergents, the HCV core protein, or adjuvant molecules can be additionally incorporated, or the smaller particles may be encapsulated by liposomes or apolipoproteins, such as, for example, apolipoprotein B or low density lipoproteins, or by any other means of targeting said particles to a specific organ or tissue. Smaller particles may also form small spherical or globular structures consisting of a similar smaller number of HCV E1 or E2 envelope proteins in which lipids, detergents, the HCV core protein, or adjuvant molecules could be additionally incorporated, or which in turn may be encapsulated by liposomes or apolipoproteins, such as, for example, apolipoprotein B or low density lipoproteins, or by any other means of targeting said particles to a specific organ or tissue. The size (i.e. the diameter) of the above-defined particles, as measured by the well-known-in-the-art dynamic light scattering techniques (see further in examples section), is usually between 1 and 100 nm, more preferentially between 2 and 70 nm, even more preferentially between 2 and 40 nm, between 3 and 20 nm, between 5 and 16 nm, between 7 and 14 nm or between 8 and 12 nm.
In particular, the present invention relates to a method for purifying hepatitis C virus (HCV) envelope proteins, or any part thereof, suitable for use in an immunoassay or vaccine, said method comprising:
(i) growing Hansenula or Saccharomyces glycosylation minus strains transformed with an envelope gene encoding an HCV E1 and/or HCV E2 protein, or any part thereof, in a suitable culture medium;
(ii) causing expression of said HCV E1 and/or HCV E2 gene, or any part thereof; and
(iii) purifying said HCV E1 and/or HCV E2 protein, or any part thereof, from said cell culture.
The invention further pertains to a method for purifying hepatitis C virus (HCV) envelope proteins, or any part thereof, suitable for use in an immunoassay or vaccine, said method comprising:
(i) growing Hansenula or Saccharomyces glycosylation minus strains transformed with an envelope gene encoding an HCV E1 and/or HCV E2 protein, or any part thereof, in a suitable culture medium;
(ii) causing expression of said HCV E1 and/or HCV E2 gene, or any part thereof; and
(iii) purifying said intracellularly expressed HCV E1 and/or HCV E2 protein, or any part thereof, upon lysing the transformed host cell.
The invention further pertains to a method for purifying hepatitis C virus (HCV) envelope proteins, or any part thereof, suitable for use in an immunoassay or vaccine, said method comprising:
(i) growing Hansenula or Saccharomyces glycosylation minus strains transformed with an envelope gene encoding an HCV E1 and/or HCV E2 protein, or any part thereof, in a suitable culture medium, in which said HCV E1 and/or HCV E2 protein, or any part thereof, comprises at least two Cys-amino acids;
(ii) causing expression of said HCV E1 and/or HCV E2 gene, or any part thereof; and
(iii) purifying said HCV E1 and/or HCV E2 protein, or any part thereof, in which said Cys-amino acids are reversibly protected by chemical and/or enzymatic means, from said culture.
The invention further pertains to a method for purifying hepatitis C virus (HCV) envelope proteins, or any part thereof, suitable for use in an immunoassay or vaccine, said method comprising:
(i) growing Hansenula or Saccharomyces glycosylation minus strains transformed with an envelope gene encoding an HCV E1 and/or HCV E2 protein, or any part thereof, in a suitable culture medium, in which said HCV E1 and/or HCV E2 protein, or any part thereof, comprises at least two Cys-amino acids;
(ii) causing expression of said HCV E1 and/or HCV E2 gene, or any part thereof; and
(iii) purifying said intracellularly expressed HCV E1 and/or HCV E2 protein, or any part thereof, upon lysing the transformed host cell, in which said Cys-amino acids are reversibly protected by chemical and/or enzymatic means.
The present invention specifically relates to a method for purifying recombinant HCV yeast proteins, or any part thereof, as described herein, in which said purification includes heparin affinity chromatography.
Hence, the present invention also relates to a method for purifying recombinant HCV yeast proteins, or any part thereof, as described above, in which said chemical means is sulfonation.
Hence, the present invention also relates to a method for purifying recombinant HCV yeast proteins, or any part thereof, as described above, in which said reversible protection of Cys-amino acids is exchanged for an irreversible protection by chemical and/or enzymatic means.
Hence, the present invention also relates to a method for purifying recombinant HCV yeast proteins, or any part thereof, as described above, in which said irreversible protection by chemical means is iodo-acetamide.
Hence, the present invention also relates to a method for purifying recombinant HCV yeast proteins, or any part thereof, as described above, in which said irreversible protection by chemical means is NEM or Biotin-NEM or a mixture thereof.
The present invention also relates to a composition as defined above which also comprises HCV core, E1, E2, P7, NS2, NS3, NS4A, NS4B, NS5A and/or NS5B protein, or parts thereof. The core-glycosylated proteins E1, E2, and/or E1/E2 of the present invention may, for example, be combined with other HCV antigens, such as, for example, core, P7, NS3, NS4A, NS4B, NS5A and/or NS5B. The purification of these NS3 proteins will preferentially include a reversible modification of the cysteine residues, and even more preferentially sulfonation of cysteines. Methods to obtain such a reversible modification, including sulfonation, have been described for NS3 proteins in Maertens et al. (PCT/EP99/02547). It should be stressed that the whole content, including all the definitions, of the latter document is incorporated by reference in the present application.
Also, the present invention relates to the use of an envelope protein as described herein for inducing immunity against HCV, characterized in that said HCV envelope protein is used as part of a series of time and compounds. In this regard, it is to be understood that the term "a series of time and compounds" refers to administering with time intervals to an individual the compounds used for eliciting an immune response. The latter compounds may comprise any of the following components: a HCV envelope protein according to the invention, HCV DNA vaccine composition, HCV polypeptides.
In this respect, a series comprises administering, either:
(i) an HCV antigen, such as, for example, a HCV envelope protein according to the invention, with time intervals, or
(ii) an HCV antigen, such as, for example, a HCV envelope protein according to the invention in combination with a HCV DNA vaccine composition, in which said envelope protein and said HCV DNA vaccine composition can be administered simultaneously, or at different time intervals, including at alternating time intervals, or
(iii) either (i) or (ii), possibly in combination with other HCV peptides, with time intervals.
In this regard, it should be clear that a HCV DNA vaccine composition comprises nucleic acids encoding HCV envelope peptides, including E1-, E2-, E1/E2-peptides, NS3 peptide, other HCV peptides, or parts of said peptides. Moreover, it is to be understood that said HCV peptides comprise HCV envelope peptides, including E1-, E2-, E1/E2-peptides, other HCV peptides, or parts thereof. The term "other HCV peptides" refers to any HCV peptide or fragment thereof. In item (ii) of the above scheme, the HCV DNA vaccine composition comprises preferentially nucleic acids encoding HCV envelope peptides. In item (ii) of the above scheme, the HCV DNA vaccine composition consists even more preferentially of nucleic acids encoding HCV envelope peptides, possibly in combination with a HCV-NS3 DNA vaccine composition. In this regard, it should be clear that an HCV DNA vaccine composition comprises a plasmid vector comprising a polynucleotide sequence encoding an HCV peptide as described above, operably linked to transcription regulatory elements. As used herein, a "plasmid vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they have been linked. In general, but not limited to those, plasmid vectors are circular double stranded DNA loops which, in their vector form, are not bound to the chromosome. As used herein, a "polynucleotide sequence" refers to polynucleotides such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term should also be understood to include, as equivalents, analogs of either RNA or DNA made from nucleotide analogs, and single (sense or antisense) and double-stranded polynucleotides. As used herein, the term "transcription regulatory elements" refers to a nucleotide sequence which contains essential regulatory elements, such that upon introduction into a living vertebrate cell it is able to direct the cellular machinery to produce translation products encoded by the polynucleotide. The term "operably linked" refers to a juxtaposition wherein the components are configured so as to perform their usual function. Thus, transcription regulatory elements operably linked to a nucleotide sequence are capable of effecting the expression of said nucleotide sequence.
Those skilled in the art can appreciate that different transcriptional promoters, terminators, carrier vectors or specific gene sequences may be used successfully.
Alternatively, the DNA vaccine may be delivered through a live vector such as adenovirus, canary pox virus, MVA, and the like.
The HCV envelope proteins of the present invention, or the parts thereof, are particularly suited for incorporation into an immunoassay for the detection of anti-HCV antibodies, and/or genotyping of HCV, for prognosing/monitoring of HCV disease, or as a therapeutic agent.

A further aspect of the invention relates to a diagnostic kit for the detection of the presence of anti-HCV antibodies in a sample suspected to comprise anti-HCV antibodies, said kit comprising a HCV envelope protein or part thereof according to the invention. In a specific embodiment thereto, said HCV envelope protein or part thereof is attached to a solid support. In a further embodiment, said sample suspected to comprise anti-HCV antibodies is a biological sample.
The term "biological sample" as used herein, refers to a sample of tissue or fluid isolated from an individual, including but not limited to, for example, serum, plasma, lymph fluid, the external sections of the skin, respiratory-, intestinal- or genito-urinary tracts, oocytes, tears, saliva, milk, blood cells, tumors, organs, gastric secretions, mucus, spinal cord fluid, external secretions such as, for example, excrement, urine, sperm, and the like.
Another aspect of the invention refers to a composition comprising an isolated HCV envelope protein or fragment thereof according to the invention. Said composition may further comprise a pharmaceutically acceptable carrier and can be a medicament or a vaccine.
A further aspect of the invention covers a medicament or a vaccine comprising a HCV envelope protein or part thereof according to the invention.
Yet another aspect of the invention comprises a pharmaceutical composition for inducing a HCV-specific immune response in a mammal, said composition comprising an effective amount of a HCV envelope protein or part thereof according to the invention and, optionally, a pharmaceutically acceptable adjuvant. Said pharmaceutical composition comprising an effective amount of a HCV envelope protein or part thereof according to the invention may also be capable of inducing HCV-specific antibodies in a mammal, or capable of inducing a T-cell function in a mammal. Said pharmaceutical composition comprising an effective amount of a HCV envelope protein or part thereof according to the invention may be a prophylactic composition or a therapeutic composition. In a specific embodiment said mammal is a human.
A "mammal" is to be understood as any member of the higher vertebrate class Mammalia, including humans; characterized by live birth, body hair, and mammary glands in the female that secrete milk for feeding the young. Mammals thus also include non-human primates and trimera mice (Zauberman et al. 1999).

A "vaccine" or "medicament" is a composition capable of eliciting protection against a disease, whether partial or complete, whether against acute or chronic disease; in this case the vaccine or medicament is a prophylactic vaccine or medicament. A vaccine or medicament may also be useful for treatment of an already ill individual, in which case it is called a therapeutic vaccine or medicament. Likewise, a pharmaceutical composition can be used for either prophylactic and/or therapeutic purposes in which cases it is a prophylactic and/or therapeutic composition, respectively.
The HCV envelope proteins of the present invention can be used as such, in a biotinylated form (as explained in WO 93/18054) and/or complexed to Neutralite Avidin (Molecular Probes Inc., Eugene, OR, USA), avidin or streptavidin. It should also be noted that "a vaccine" or "a medicament" may comprise, in addition to an active substance, a "pharmaceutically acceptable carrier" or "pharmaceutically acceptable adjuvant" which may be a suitable excipient, diluent, carrier and/or adjuvant which, by themselves, do not induce the production of antibodies harmful to the individual receiving the composition nor do they elicit protection. Suitable carriers are typically large slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers and inactive virus particles. Such carriers are well known to those skilled in the art. Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: aluminium hydroxide, aluminium in combination with 3-O-deacylated monophosphoryl lipid A as described in WO 93/19780, aluminium phosphate as described in WO 93/24148, N-acetyl-muramyl-L-threonyl-D-isoglutamine as described in U.S. Patent No. 4,606,918, N-acetyl-normuramyl-L-alanyl-D-isoglutamine, N-acetylmuramyl-L-alanyl-D-isoglutamyl-L-alanine-2-(1',2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine, RIBI (ImmunoChem Research Inc., Hamilton, MT, USA) which contains monophosphoryl lipid A, detoxified endotoxin, trehalose-6,6-dimycolate, and cell wall skeleton (MPL + TDM + CWS) in a 2% squalene/Tween 80 emulsion. Any of the three components MPL, TDM or CWS may also be used alone or combined 2 by 2. The MPL may also be replaced by its synthetic analogue referred to as RC-529. Additionally, adjuvants such as Stimulon (Cambridge Bioscience, Worcester, MA, USA), SAF-1 (Syntex) or bacterial DNA-based adjuvants such as ISS (Dynavax) or CpG (Coley Pharmaceuticals) may be used, as well as adjuvants such as combinations between QS21 and 3-de-O-acetylated monophosphoryl lipid A (WO 94/00153), or MF-59 (Chiron), or poly[di(carboxylatophenoxy)phosphazene] based adjuvants (Virus Research Institute), or blockcopolymer based adjuvants such as Optivax (Vaxcel, CytRx) or inulin-based adjuvants, such as Algammulin and GammaInulin (Anutech), Incomplete Freund's Adjuvant (IFA) or Gerbu preparations (Gerbu Biotechnik).
It is to be understood that Complete Freund's Adjuvant (CFA) may be used for non-human applications and research purposes as well. "A vaccine composition" may further contain excipients and diluents, which are inherently non-toxic and non-therapeutic, such as water, saline, glycerol, ethanol, wetting or emulsifying agents, pH buffering substances, preservatives, and the like.
Typically, a vaccine composition is prepared as an injectable, either as a liquid solution or suspension. Injection may be subcutaneous, intramuscular, intravenous, intraperitoneal, intrathecal, intradermal. Other types of administration comprise implantation, suppositories, oral ingestion, enteric application, inhalation, aerosolization or nasal spray or drops. Solid forms, suitable for solution on, or suspension in, liquid vehicles prior to injection may also be prepared. The preparation may also be emulsified or encapsulated in liposomes for enhancing adjuvant effect. The polypeptides may also be incorporated into Immune Stimulating Complexes together with saponins, for example Quil A (ISCOMS). Vaccine compositions comprise an effective amount of an active substance, as well as any other of the above-mentioned components. "Effective amount" of an active substance means that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for prevention or treatment of a disease or for inducing a desired effect. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of the individual to be treated (e.g. human, non-human primate, primate, etc.), the capacity of the individual's immune system to mount an effective immune response, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment, the strain of the infecting pathogen and other relevant factors.
It is expected that the amount will fall in a relatively broad range that can be determined through routine trials. Usually, the amount will vary from 0.01 to 1000 µg/dose, more particularly from 0.1 to 100 µg/dose. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.
The present invention is illustrated by the Examples as set forth below. These Examples are merely illustrative and are not to be construed to restrict or limit the invention in any way.

EXAMPLES

CONSTRUCTION OF pFPMT-MFa-E1-H6 SHUTTLE VECTOR
Plasmids for Hansenula polymorpha transformation were constructed as follows. The pFPMT-MFa-E1-H6 shuttle vector was constructed in a multi-step procedure.
Initially the nucleic acid sequence encoding the HCV E1s protein (SEQ ID NO:2) was cloned after a CHH leader sequence (CHH = Carcinus maenas hyperglycemic hormone) which was subsequently changed for a MFa leader sequence (MFa = Saccharomyces cerevisiae α-mating factor).
At first a pUC18 derivative was constructed harboring the CHH-E1-H6 unit as an EcoRI/BamHI fragment by the seamless cloning method (Padgett, K. A. and Sorge, J. A. 1996). Thereto, the E1s-H6-encoding DNA fragment and the pCHH-Hir-derived acceptor plasmid were generated by PCR as described below.
Generation of E1s-H6-encoding DNA fragment
The E1-H6 DNA fragment (coding for HCV type 1b E1s protein consisting of the amino acids 192 to 326 of E1s elongated with 6 His-residues; SEQ ID NO:5) was isolated by PCR from the plasmid pGEMTE1sH6 (SEQ ID NO:6; Figure 1). The following primers were used thereto:
- CHHE1-F: 5'-agttactcttca.aggtatgaggtgcgcaacgtgtccg-3' (SEQ ID NO:7);
The Eam1104I site is underlined, the dot marks the cleavage site. The bold printed bases are complementary to those of primer CHH-links. The non-marked bases anneal within the start region of E1 (192-326) in sense direction; and
- CHHE1-R: 5'-agttactcttca.cagggatcctccttaatggtgatggtggtggtgcC-3' (SEQ ID NO:8);
The Eam1104I site is underlined, the dot marks the cleavage site. The bold printed bases are complementary to those of primer MF30-rechts. The bases forming the BamHI site useful for later cloning procedures are printed in italics. The non-marked bases anneal in antisense direction within the end of the E1-H6 unit, including the stop codon and three additional bases between the stop codon and the BamHI site.
The reaction mixture was constituted as follows: total volume of 50 µL containing 20 ng of Eco31I-linearized pGEMTE1sH6, each 0.2 µM of primers CHHE1-F and CHHE1-R, dNTP's (each at 0.2 µM), 1 x buffer 2 (Expand Long Template PCR System; Boehringer; Cat No 1681 834), 2.5 U polymerase mix (Expand Long Template PCR System; Boehringer; Cat No 1681 834).
Program 1 was used, said program consisting of the following steps:
1. denaturation: 5 min at 95°C;
2. 10 cycles of 30 sec denaturation at 95°C, 30 sec annealing at 65°C, and 130 sec elongation at 68°C;
3. termination at 4°C.
Then 5 µL 10 x buffer 2 (Expand Long Template PCR System; Boehringer; Cat No 1681 834), 40 µL H2O, and 5 µL of [dATP, dGTP, and dTTP (2 mM each); 10 mM 5-methyl-dCTP] were added to the sample derived from program 1, and further amplification was performed following program 2 consisting of the following steps:
1. denaturation: 5 min at 95°C;
2. 5 cycles of 45 sec denaturation at 95°C, 30 sec annealing at 65°C, and 130 sec elongation at 68°C;
3. termination at 4°C.
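The programmed hold times of the two cycling programs can be added up as a quick planning aid. The sketch below is illustrative only: the helper name `program_seconds` is an assumption, the initial denaturation is read as 5 min, and ramp times between temperatures and the final 4°C hold are ignored, so real run times on a thermocycler will be longer.

```python
def program_seconds(initial_denaturation_s, cycles, step_seconds):
    """Sum of programmed hold times: one initial step plus `cycles` repeats."""
    return initial_denaturation_s + cycles * sum(step_seconds)


# Program 1: 5 min at 95°C, then 10 cycles of 30 s / 30 s / 130 s.
PROGRAM_1_S = program_seconds(5 * 60, 10, (30, 30, 130))

# Program 2: 5 min at 95°C, then 5 cycles of 45 s / 30 s / 130 s.
PROGRAM_2_S = program_seconds(5 * 60, 5, (45, 30, 130))

if __name__ == "__main__":
    total = PROGRAM_1_S + PROGRAM_2_S
    print(f"program 1: {PROGRAM_1_S} s, program 2: {PROGRAM_2_S} s")
    print(f"combined hold time: {total / 60:.1f} min (ramping excluded)")
```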
Generation of pCHH-Hir-derived acceptor plasmid
The acceptor fragment was made by PCR from the pCHH-Hir plasmid (SEQ ID NO:9; Figure 2) and consists of almost the complete pCHH-Hir plasmid, except that the Hir-coding sequence is not present in the PCR product. The following primers were used for this PCR:
1. CHH-links: 5'-agttactcttca.cctcttttccaacgggtgtgtag-3' (SEQ ID NO:10);
The Eam1104I site is underlined, the dot marks the cleavage site. The bold printed bases are complementary to those of primer CHHE1-F. The non-marked bases anneal within the end of the CHH sequence in antisense direction; and
2. MF30-rechts: 5'-agtcactcttca.ctgcaggcatgcaagcttggcg-3' (SEQ ID NO:11);
The Eam1104I site is underlined, the dot marks the cleavage site. The bold printed bases are complementary to those of primer CHHE1-R. The non-marked bases anneal within the pUC18 sequences behind the cloned CHH-Hirudin HL20 of pCHH-Hir, pointing away from the insert.
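The pairing logic of the four primers can be checked computationally. Eam1104I recognizes CTCTTC and cuts one nucleotide past the site on that strand and four nucleotides past it on the opposite strand, leaving a 3-nucleotide 5' overhang that lies outside the recognition site. The sketch below is illustrative and not part of the described procedure: the function names are assumptions, and the sequences are the primers listed above written in lower case without the cleavage dot.

```python
SITE = "ctcttc"  # Eam1104I recognition sequence, cleavage at (1/4)
COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq):
    """Reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overhang(primer):
    """3-nt 5' overhang left on the product after Eam1104I digestion."""
    cut = primer.index(SITE) + len(SITE) + 1  # cut 1 nt past the site
    return primer[cut:cut + 3]


CHHE1_F = "agttactcttcaaggtatgaggtgcgcaacgtgtccg"
CHHE1_R = "agttactcttcacagggatcctccttaatggtgatggtggtggtgcc"
CHH_LINKS = "agttactcttcacctcttttccaacgggtgtgtag"
MF30_RECHTS = "agtcactcttcactgcaggcatgcaagcttggcg"


def compatible(primer_a, primer_b):
    """True if the two digested ends carry complementary overhangs."""
    return overhang(primer_a) == revcomp(overhang(primer_b))


if __name__ == "__main__":
    print(overhang(CHHE1_F), overhang(CHH_LINKS), compatible(CHHE1_F, CHH_LINKS))
    print(overhang(CHHE1_R), overhang(MF30_RECHTS), compatible(CHHE1_R, MF30_RECHTS))
```

The insert start pairs only with the CHH end of the acceptor and the insert end only with the pUC18 side, which is what allows a directional, scar-free ligation after a single-enzyme digest.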
The reaction mixture was constituted as follows: total volume of 50 µL containing 20 ng of Asp718I-linearized pCHH-Hir, each 0.2 µM of primers CHH-links and MF30-rechts, dNTP's (each at 0.2 µM), 1 x buffer 2 (Expand Long Template PCR System; Boehringer; Cat No 1681 834), 2.5 U polymerase mix (Expand Long Template PCR System; Boehringer; Cat No 1681 834).
Program 1 as described above was used.
Then 5 µL 10 x buffer 2 (Expand Long Template PCR System; Boehringer; Cat No 1681 834), 40 µL H2O, and 5 µL of [dATP, dGTP, and dTTP (2 mM each); 10 mM 5-methyl-dCTP] were added to the sample derived from program 1, and further amplification was performed following program 2 as described above.
Generation of vector pCHHE1
The E1s-H6-encoding DNA fragment and the pCHH-Hir-derived acceptor plasmid generated by PCR as described above were purified using the PCR product purification kit (Qiagen) according to the supplier's specifications. Subsequently the purified fragments were digested separately with Eam1104I. Subsequently, the E1s-H6 DNA fragment was ligated into the pCHH-Hir-derived acceptor plasmid using T4 ligase (Boehringer) following the specifications of the supplier.
E. coli XL-Gold cells were transformed with the ligation mixture and the plasmid DNA of several ampicillin-resistant colonies was analyzed by digestion with EcoRI and BamHI. A positive clone was selected and denominated as pCHHE1.
Generation of vector pFPMT-CHH-E1H6
The EcoRI/BamHI fragment of pCHHE1 was ligated with the EcoRI/BamHI digested vector pFPMT121 (SEQ ID NO:12; Figure 3). T4 ligase (Boehringer) was used according to the supplier's instructions. The ligation mixture was used to transform E. coli DH5αF' cells.
Several transformants were analyzed on restriction pattern of the plasmid DNA and a positive clone was withheld which was denominated pFPMT-CHH-E1H6 (SEQ ID NO:13; Figure 4).

Generation of pFPMT-MFa-E1-H6
Finally the shuttle vector pFPMT-MFa-E1-H6 was generated by ligation of three fragments, said fragments being:
1. the 6.961 kb EcoRI/BamHI digested pFPMT121 (SEQ ID NO:12; Figure 3),
2. the 0.245 kb EcoRI/HindIII fragment of pUC18-MFa (SEQ ID NO:62; Figure 36), and
3. the 0.442 kb HindIII/BamHI fragment of a 0.454 kb PCR product derived from pFPMT-CHH-E1H6.
The 0.454 kb PCR product giving rise to fragment No. 3 was obtained by PCR using the following primers:
1. primer MFa-E1 f Hi: 5'-aggggtaagCttggataaaaggtatgaggtgcgCaacgtgtccgggatgt-3' (SEQ ID NO:14); and
2. primer E1 back-Bam: 5'-agttacggatccttaatggtgatggtggtggtgccagttcat-3' (SEQ ID NO:15).
The reaction mixture was constituted as follows: Reaction mixture volume 50 µL; pFPMT-CHH-E1-H6 (EcoRI-linearized; 15 ng/µL), 0.5 µL; primer MFa-E1 f Hi (50 µM), 0.25 µL; primer E1 back-Bam (50 µM), 0.25 µL; dNTP's (all at 2 mM), 5 µL; DMSO, 5 µL; H2O, 33.5 µL; Expand Long Template PCR System (Boehringer Mannheim; Cat No 1681 834) Buffer 2 (10 x concentrated), 5 µL; Expand Long Template PCR System Polymerase mixture (1 U/µL), 0.5 µL.
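As a consistency check on the pipetting scheme, the listed volumes can be summed and converted into final amounts and concentrations. The sketch below is illustrative: the dictionary layout and names are assumptions, and the DMSO volume is read as 5 µL, which is the value at which the listed volumes add up to the stated 50 µL.

```python
TOTAL_UL = 50.0

# Volumes in µL as listed in the reaction mixture.
VOLUMES_UL = {
    "template (15 ng/µL)": 0.5,
    "primer MFa-E1 f Hi (50 µM)": 0.25,
    "primer E1 back-Bam (50 µM)": 0.25,
    "dNTPs (2 mM each)": 5.0,
    "DMSO": 5.0,
    "H2O": 33.5,
    "buffer 2 (10x)": 5.0,
    "polymerase mix (1 U/µL)": 0.5,
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilute a stock concentration into the final reaction volume."""
    return stock * volume_ul / total_ul


TEMPLATE_NG = 15 * VOLUMES_UL["template (15 ng/µL)"]              # total ng
PRIMER_UM = final_concentration(50, 0.25)                        # µM each
DNTP_MM = final_concentration(2, 5.0)                            # mM each
DMSO_PERCENT = final_concentration(100, 5.0)                     # % v/v
BUFFER_X = final_concentration(10, 5.0)                          # fold
POLYMERASE_U = 1 * VOLUMES_UL["polymerase mix (1 U/µL)"]          # total units

if __name__ == "__main__":
    print("sum of volumes:", sum(VOLUMES_UL.values()), "µL")
    print("template:", TEMPLATE_NG, "ng; primers:", PRIMER_UM, "µM each")
    print("dNTPs:", DNTP_MM, "mM each; DMSO:", DMSO_PERCENT, "% v/v")
    print("buffer:", BUFFER_X, "x; polymerase:", POLYMERASE_U, "U")
```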
The PCR program consisting of the following steps was used:
1. denaturation: 5 min at 95°C
2. 29 cycles of 45 sec denaturation at 95°C, 45 sec annealing at 55°C, and 40 sec elongation at 68°C
3. termination at 4°C.
Based on the primers used, the resulting 0.454 kb PCR product contained the codons of E1(192-326) followed by six histidine codons and a "taa" stop codon, upstream flanked by the 22 3'-terminal base pairs of the MFa prepro sequence (including the cloning relevant HindIII site plus a six base pairs overhang) and downstream flanked by a (cloning relevant) BamHI site and a six base pairs overhang.
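The 3' end of the product can be read directly from the reverse primer. The sketch below is illustrative (names are assumptions; the codon table is the subset of the standard genetic code needed here): it reverse-complements primer E1 back-Bam to obtain the sense strand and translates it up to the stop codon, showing the last E1 codons, the six histidine codons, the "taa" stop codon and the BamHI site that follows.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

# Subset of the standard genetic code sufficient for this primer.
CODONS = {"atg": "M", "aac": "N", "tgg": "W", "cac": "H", "cat": "H", "taa": "*"}

E1_BACK_BAM = "agttacggatccttaatggtgatggtggtggtgccagttcat"  # primer, 5' to 3'
BAMHI = "ggatcc"


def revcomp(seq):
    """Reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate_to_stop(seq):
    """Translate in frame 1 until (and including) the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODONS[seq[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


SENSE_END = revcomp(E1_BACK_BAM)          # 3' end of the coding strand
C_TERMINUS = translate_to_stop(SENSE_END)

if __name__ == "__main__":
    print(SENSE_END)
    print(C_TERMINUS)
    print("BamHI site starts at offset", SENSE_END.index(BAMHI))
```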

For the ligation reaction, T4 DNA ligase (Boehringer Mannheim) was used according to the supplier's conditions (sample volume 20 µL).
E. coli HB101 cells were transformed with the ligation mixture and positive clones were retained after restriction analysis of the plasmids isolated from several transformants. A positive plasmid was selected and denominated pFPMT-MFα-E1-H6 (SEQ ID NO:16; Figure 5).

CONSTRUCTION OF pFPMT-CL-E1-H6 SHUTTLE VECTOR
Plasmids for Hansenula polymorpha transformation were constructed as follows. The pFPMT-CL-E1-H6 shuttle vector was constructed in three steps starting from pFPMT-MFα-E1-H6 (SEQ ID NO:16, Figure 5).
In a first step, the MFα-E1-H6 reading frame of pFPMT-MFα-E1-H6 was subcloned into the pUC18 vector. For this purpose, a 1.798 kb SalI/BamHI fragment of pFPMT-MFα-E1-H6 (containing the FMD promoter plus MFα-E1-H6) was ligated to the SalI/BamHI vector fragment of pUC18 with T4 ligase (Boehringer) according to the supplier's conditions. This resulted in the plasmid depicted in Figure 6 (SEQ ID NO:17), further denominated pMa12-1 (pUC18-FMD-MFα-E1-H6). The ligation mixture was used to transform E. coli DH5αF' cells. Several ampicillin-resistant colonies were picked and analyzed by restriction enzyme digestion of plasmid DNA isolated from the picked clones. A positive clone was further analyzed by determining the DNA sequence of the MFα-E1-H6 coding sequence. A correct clone was used for PCR-directed mutagenesis to replace the MFα pre-pro-sequence with the codons of the avian lysozyme pre-sequence ("CL"; corresponding to amino acids 1 to 18 of avian lysozyme; SEQ ID NO:1). The principle of the applied PCR-directed mutagenesis method is based on the amplification of an entire plasmid with the desired alterations located at the 5'-ends of the primers. In downstream steps, the ends of the linear PCR product are modified prior to self-ligation, resulting in the desired altered plasmid.
The following primers were used for the PCR reaction:
1. primer CL hin: 5'-tgcttcctaccactagcagcactaggatatgaggtgcgcaacgtgtccggg-3' (SEQ ID NO:18);
2. primer CL her neu: 5'-tagtactagtattagtaggcttcgcatgaattcccgatgaaggcagagagcg-3' (SEQ ID NO:19).
The underlined 5' regions of the primers contain the codons of about half of the avian lysozyme pre-sequence. Primer CL her neu includes a SpeI restriction site (italic). The non-underlined regions of the primers anneal with the codons for amino acid residues 192 to 199 of E1 (CL hin) or with the region from the "atg" start codon over the EcoRI site up to position -19 (counted from the EcoRI site) of the FMD promoter (CL her neu). The primers are designed to amplify the complete pMa12-1, thereby replacing the codons of the MFα pre-pro-sequence with the codons of the avian lysozyme pre-sequence.
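The primer design described above can be checked by joining the two primers the way blunt-end self-ligation of the whole-plasmid PCR product would join them and translating across the junction. The Python sketch below is an illustration under that assumption; the helper names and the compact codon-table construction are introduced here, and only the two primer sequences come from the text.

```python
from itertools import product

# Primers as given in the text (5'->3').
CL_HIN = "tgcttcctaccactagcagcactaggatatgaggtgcgcaacgtgtccggg"
CL_HER_NEU = "tagtactagtattagtaggcttcgcatgaattcccgatgaaggcagagagcg"

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def translate(seq):
    """Translate the complete codons of a lowercase coding sequence."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# After self-ligation the 5' ends of the two primers abut, so the coding
# strand across the junction is revcomp(CL her neu) followed by CL hin.
junction = reverse_complement(CL_HER_NEU) + CL_HIN

start = junction.find("atg")        # first atg, directly after the EcoRI site
peptide = translate(junction[start:])

print(junction[start - 6:start])    # site immediately preceding the start codon
print("actagt" in junction)         # SpeI site contributed by CL her neu
print(peptide)
```

The translation gives an 18-residue leader, MRSLLILVLCFLPLAALG, fused directly to YEVRNVSG (E1 residues 192 to 199). Nine leader codons come from each primer, in line with the statement that each primer carries about half of the pre-sequence.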
The reaction mixture was constituted as follows: pUC18-FMD-MFα-E1-H6 (pMa12-1; 1.3 ng/µL), 1 µL; primer CL hin (100 µM), 2 µL; primer CL her neu (100 µM), 2 µL; dNTPs (all at 2.5 mM), 8 µL; H2O, 76 µL; Expand Long Template PCR System (Boehringer; Cat No 1681 834) Buffer 2 (10x concentrated), 10 µL; Expand Long Template PCR System Polymerase mixture (1 U/µL), 0.75 µL.
The PCR program consisting of the following steps was applied:
1. denaturation: 15 min at 95°C
2. 35 cycles of 30 sec denaturation at 95°C, 1 min annealing at 60°C, and 1 min elongation at 72°C
3. termination at 4°C.
The resulting PCR product was checked by agarose gel electrophoresis for its correct size (3.5 kb). Thereafter the 3'-A overhangs from the PCR product were removed by a T4 polymerase reaction resulting in blunt ends with 3'- and 5'-OH groups. For this purpose, the PCR product was treated with T4 polymerase (Boehringer; 1 U/µL): to the remaining 95 µL of PCR reaction mix were added 1 µL T4 polymerase and 4 µL dNTPs (all at 2.5 mM). The sample was incubated for 20 min at 37°C. Subsequently, the DNA was precipitated with ethanol and taken up in 16 µL H2O.
Subsequently 5'-phosphates were added to the blunt-ended PCR product by a kinase reaction. For this purpose, to the 16 µL blunt-ended PCR product were added 1 µL T4 polynucleotide kinase (Boehringer; 1 U/µL), 2 µL 10-fold concentrated T4 polynucleotide kinase reaction buffer (Boehringer), and 1 µL ATP (10 mM). The sample was incubated for 30 min at 37°C.
Subsequently the DNA was applied onto a 1% agarose gel and the correct product band was isolated by means of the gel extraction kit (Qiagen) according to the supplier's conditions.
Fifty (50) ng of the purified product was then self-ligated by use of T4 ligase (Boehringer) according to the supplier's conditions. After 72 h incubation at 16°C, the DNA in the ligation mix was precipitated with ethanol and dissolved in 20 µL water.
E. coli DH5αF' cells were subsequently transformed with 10 µL of the ligation sample. The plasmid DNA of several ampicillin-resistant clones was checked by means of restriction enzyme digestion. A positive clone was retained and denominated p27d-3 (pUC18-FMD-CL-E1-H6, SEQ ID NO:20, Figure 7). Subsequently the CL-E1-H6 reading frame was verified by DNA sequencing.
In a last step the pFPMT-CL-E1-H6 shuttle vector was constructed as described below. The 0.486 kb EcoRI/BamHI fragment of p27d-3 (harboring CL-E1(192-326)-H6) was ligated with EcoRI/BamHI-digested pFPMT121 (SEQ ID NO:12, Figure 3). For the reaction, T4 ligase (Boehringer) was used according to the supplier's recommendations. The DNA in the ligation sample was precipitated with ethanol and dissolved in 10 µL H2O. E. coli DH5αF' cells were transformed with 10 µL of the ligation sample, and the plasmid DNA of several ampicillin-resistant colonies was analyzed by digestion with EcoRI and BamHI. Plasmid clone p37-5 (pFPMT-CL-E1-H6; SEQ ID NO:21, Figure 8) showed the desired fragment sizes of 0.486 kb and 6.961 kb. The correct sequence of CL-E1-H6 of p37-5 was verified by sequencing.

CONSTRUCTION OF pFPMT-MFα-E2-H6 AND pMPT-MFα-E2-H6 SHUTTLE VECTORS
Plasmids for Hansenula polymorpha transformation were constructed as follows. The DNA sequence encoding MFα-E2s (amino acids 384-673 of HCV E2)-VIEGR-His6 (SEQ ID NO:5) was isolated as a 1.331 kb EcoRI/BglII fragment from plasmid pSP72E2H6 (SEQ ID NO:22, Figure 9). This fragment was ligated with either of the EcoRI/BglII-digested vectors pFPMT121 (SEQ ID NO:12, Figure 3) or pMPT121 (SEQ ID NO:23, Figure 10) using T4 DNA ligase (Boehringer Mannheim) according to the supplier's recommendations. After transformation of E. coli and checking of plasmid DNA isolated from different transformants by restriction enzyme digestion, positive clones were retained and the resulting shuttle vectors were denominated pFPMT-MFα-E2-H6 (SEQ ID NO:22, Figure 11) and pMPT-MFα-E2-H6 (SEQ ID NO:23, Figure 12), respectively.

CONSTRUCTION OF pFPMT-CL-E2-H6 SHUTTLE VECTOR
The shuttle vector pFPMT-CL-E2-H6 was assembled in a three-step procedure. An intermediate construct was prepared in which the E2 coding sequence was cloned behind the signal sequence of the α-amylase of Schwanniomyces occidentalis. This was done by the seamless cloning method (Padgett, K. A. and Sorge, J. A. 1996).
Generation of E2s-H6 encoding DNA fragment
At first the DNA sequence encoding E2-H6 (amino acids 384 to 673 of HCV E2 extended with the linker peptide "VIEGR" and with 6 His residues, SEQ ID NO:5) was amplified from the pSP72E2H6 plasmid (SEQ ID NO:24, Figure 11) by PCR. The primers used were denoted MF30E2/F and MF30E2/R and have the following sequences:
- primer MF30E2/F: 5'-agtcactcttca.aggcatacccgcgtgtcaggaggg-3' (SEQ ID NO:26; the Eam1104I site is underlined, the dot marks the enzyme's cleavage site; the last codon of the S. occidentalis signal sequence is printed in bold; the non-marked bases anneal with the codons of E2 (amino acids 384-390 of HCV E2));
- primer MF30E2/R: 5'-agtcactcttca.cagggatccttagtgatggtggtgatg-3' (SEQ ID NO:27; the Eam1104I site is underlined, the dot marks the enzyme's cleavage site; the bold printed bases are complementary to the bold printed bases of primer MF30-Rechts (see below); a BamHI site to be introduced into the construct is printed in italic; the non-marked sequence anneals with the stop codon and the six terminal His codons of E2(384-673)-VIEGR-H6 (SEQ ID NO:5)).
The reaction mixture was constituted as follows: total volume of 50 µL containing 20 ng of the 1.33 kb EcoRI/BglII fragment of pSP72E2H6, 0.2 µM each of primers MF30E2/F and MF30E2/R, dNTPs (each 0.2 µM), 1x buffer 2 (Expand Long Template PCR System; Boehringer; Cat No 1681 834), 2.5 U polymerase mix (Expand Long Template PCR System; Boehringer; Cat No 1681 834).
The PCR program 3 consisting of the following steps was used:
1. denaturation: 5 min at 95°C
2. 10 cycles of 30 sec denaturation at 95°C, 30 sec annealing at 65°C, and 1 min elongation at 68°C
3. termination at 4°C.
Then 10 µL 10x buffer 2 (Expand Long Template PCR System; Boehringer; Cat No 834), 40 µL H2O, and 5 µL of [dATP, dGTP, and dTTP (2 mM each); 10 mM 5-methyl-dCTP] were added to the sample derived from PCR program 3, and the reaction was continued with PCR program 4 consisting of the following steps:
1. denaturation: 5 min at 95°C
2. 5 cycles of 45 sec denaturation at 95°C, 30 sec annealing at 65°C, and 1 min elongation at 68°C
3. termination at 4°C.
Generation of pMF30-derived acceptor plasmid
The second fragment originated from the plasmid pMF30 (SEQ ID NO:28, Figure 13). The amplicon was almost the complete pMF30 plasmid, excluding the codons of the mature α-amylase of S. occidentalis; modifications relevant for cloning were introduced by primer design. The following set of primers was used:
- primer MF30-Links: 5'-agtcactcttca.cctcttgtcaaaaataatcggttgag-3' (SEQ ID NO:29; the Eam1104I site is underlined, the dot marks the enzyme's cleavage site; the bold printed "cct" is complementary to the bold printed "agg" of primer MF30E2/F (see above); the non-marked and the bold printed bases anneal with the 26 terminal bases of the codons of the α-amylase of S. occidentalis in pMF30);
- primer MF30-Rechts: 5'-agtcactcttca.ctgcaggcatgcaagcttggcg-3' (SEQ ID NO:11; the Eam1104I site is underlined, the dot marks the enzyme's cleavage site; the bold printed "ctg" is complementary to the bold printed "cag" of primer MF30E2/R (see above); the non-marked bases anneal with pUC18 sequences downstream of the stop codon of the α-amylase of S. occidentalis in pMF30).

The reaction mixture was constituted as follows: total volume of 50 µL containing 20 ng of the BglII-linearized pMF30, 0.2 µM each of primers MF30-Links and MF30-Rechts, dNTPs (each 0.2 µM), 1x buffer 1 (Expand Long Template PCR System; Boehringer; Cat No 1681 834), 2.5 U polymerase mix (Expand Long Template PCR System; Boehringer; Cat No 834). The same PCR programs (programs 3 and 4) as described above were used, except for the elongation times, which were extended from 1 minute to 4 minutes in both programs.
Generation of vector pAMY-E2
The E2s-H6 encoding DNA fragment and the pMF30-derived acceptor plasmid obtained by PCR were checked for their respective sizes by gel electrophoresis on a 1% agarose gel. The PCR products were purified with a PCR product purification kit (Qiagen) according to the supplier's instructions. Subsequently the purified fragments were digested separately with Eam1104I. Ligation of the E2s-H6 fragment with the pMF30-derived acceptor plasmid was performed by using T4 ligase (Boehringer) according to the supplier's recommendations. The ligation mixture was used to transform E. coli DH5αF' cells and the plasmid DNA of several clones was analyzed by EcoRI/BamHI digestion. A positive clone was selected, its plasmid further denominated pAMY-E2, and utilized for further modifications as described below.
Generation of vector pUC18-CL-E2-H6
The pAMY-E2 was subjected to PCR-directed mutagenesis in order to replace the codons of the α-amylase signal sequence with the codons of the avian lysozyme pre-sequence. This is further denominated "CL", corresponding to the first 18 amino acids of the avian lysozyme ORF (SEQ ID NO:1). For this mutagenesis the following primers were used:
- primer CL2 hin: 5'-tctcttcctaccactagcagcactacrgacatacccgcgtgtcaggaggggcag-3' (SEQ ID NO:30); and
- primer CL2 her: 5'-tacttactagt:attacttaactcttccLcatgcTaat tcactggccgtcgtttta-caacgtc-3' (SEQ ID NO:31).
The underlined 5'-regions of the primers contain the DNA sequence of about half of the avian lysozyme pre-sequence. Primer CL2 her includes SpeI (italic) and EcoRI (italic, double underlined) restriction sites. The non-underlined regions of the primers anneal with the codons of amino acid residues 384 to 392 of E2 (CL2 hin) or with the region from the "atg" start codon over the EcoRI site up to position -19 (counted from the EcoRI site) of the FMD promoter (CL2 her). The primers are designed to amplify the complete pAMY-E2 vector, thereby replacing the codons of the α-amylase signal sequence with the codons of the avian lysozyme pre-sequence.
The PCR reaction was performed according to the following program:
1. denaturation: 15 min at 95°C
2. 35 cycles of 30 sec denaturation at 95°C, 1 min annealing at 60°C, and 1 min elongation at 72°C
3. termination at 4°C.
The following reaction mixture was used: pAMY-E2 (1 ng/µL), 1 µL; primer CL2 hin (100 µM), 2 µL; primer CL2 her (100 µM), 2 µL; dNTPs (2.5 mM each), 8 µL; H2O, 76 µL; Expand Long Template PCR System (Boehringer; Cat No 1681 834) Buffer 2 (10x concentrated), 10 µL; Expand Long Template PCR System Polymerase mixture (1 U/µL), 0.75 µL.
The resulting PCR product was checked by gel electrophoresis on a 1% agarose gel. Prior to ligation the PCR fragment was modified as follows. The 3'-A overhangs were removed by T4 polymerase, resulting in blunt ends with 3'- and 5'-OH groups. For this purpose, 1 µL T4 polymerase (Boehringer, 1 U/µL) was added to the residual 95 µL PCR reaction mixture along with 4 µL dNTPs (2.5 mM each). The sample was incubated for 20 min at 37°C. Subsequently the DNA was precipitated with ethanol and dissolved in 16 µL deionized water. This was followed by a kinase treatment to add 5'-phosphates to the blunt-ended PCR product. To the 16 µL dissolved blunt-ended PCR product were added 1 µL T4 polynucleotide kinase (Boehringer, 1 U/µL), 2 µL 10-fold concentrated T4 polynucleotide kinase reaction buffer (Boehringer) and 1 µL ATP (10 mM). The sample was incubated for 30 min at 37°C.
The kinase-treated sample was subsequently separated on a 1% agarose gel. The product band was isolated. The DNA was extracted from the agarose slice by means of the Gel Extraction kit (Qiagen) according to the supplier's recommendations. Fifty (50) ng of the purified product was then self-ligated by use of T4 ligase (Boehringer) according to the supplier's conditions. After 16 h incubation at 16°C, the DNA in the ligation mix was precipitated with ethanol and dissolved in 20 µL H2O (ligation sample).
E. coli DH5αF' cells were transformed with 10 µL of the ligation sample. Several ampicillin-resistant clones were further characterized via restriction analysis of the isolated plasmid DNA. A positive clone was denominated pUC18-CL-E2-H6 and was used for further modifications as described below.

Generation of shuttle vector pFPMT-CL-E2-H6
A 0.966 kb EcoRI/BamHI fragment was isolated from pUC18-CL-E2-H6 (harboring CL-E2(384-673)-VIEGR-H6) and was ligated into the EcoRI/BamHI-digested pFPMT121 (SEQ ID NO:12, Figure 3). For the reaction, T4 ligase (Boehringer) was used according to the supplier's conditions. The ligation sample was precipitated with ethanol and dissolved in 10 µL water. This was used to transform E. coli DH5αF' cells; a positive clone was retained after restriction analysis and the respective plasmid was denominated pFPMT-CL-E2-H6 (SEQ ID NO:32, Figure 14).
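The sizes quoted for the EcoRI/BamHI inserts (0.486 kb for CL-E1(192-326)-H6 and 0.966 kb for CL-E2(384-673)-VIEGR-H6) can be reproduced from the codon counts. The Python sketch below is a worked calculation under two assumptions suggested by the primer sequences but not stated as such in the text: the EcoRI site directly precedes the start codon and the BamHI site directly follows the stop codon. The function name is introduced here for illustration.

```python
def ecori_bamhi_fragment_bp(n_codons_without_stop):
    """Top-strand length (bp) of an EcoRI/BamHI fragment carrying an ORF.

    Assumes 'gaattc' directly precedes the start codon and 'ggatcc'
    directly follows the stop codon.
    """
    orf_bp = 3 * (n_codons_without_stop + 1)  # +1 for the stop codon
    ecori_remainder = 5   # g^aattc leaves 'aattc' on the fragment
    bamhi_remainder = 1   # g^gatcc leaves 'g' on the fragment
    return ecori_remainder + orf_bp + bamhi_remainder


CL = 18              # avian lysozyme pre-sequence, amino acids 1 to 18
E1 = 326 - 192 + 1   # E1 residues 192-326
E2 = 673 - 384 + 1   # E2 residues 384-673
VIEGR = 5            # linker peptide
H6 = 6               # His tag

cl_e1_h6 = ecori_bamhi_fragment_bp(CL + E1 + H6)
cl_e2_h6 = ecori_bamhi_fragment_bp(CL + E2 + VIEGR + H6)
print(cl_e1_h6, cl_e2_h6)  # compare with 0.486 kb and 0.966 kb in the text
```

Under these assumptions the calculation gives 486 bp and 966 bp, matching the fragment sizes used for the diagnostic digests.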

CONSTRUCTION OF pFPMT-CL-K-H6-E1 SHUTTLE VECTOR
The construction of the shuttle vector comprised two steps.
In a first step the pUC18-FMD-CL-H6-K-E1-H6 construct was made by site-directed mutagenesis. The pUC18-FMD-CL-E1-H6 was used as template (SEQ ID NO:20; Figure 7). The following primers were used:
- primer H6K hin neu: 5'-catcacaaatatgaggtgcgcaacgtgtccgggatgtac-3' (SEQ ID NO:37);
- primer H6KRK her neu: 5'-gtaatctatqgtgtcctagtgctgctagtggtaggaagcatag-3' (SEQ ID NO:38).
(The bases providing additional codons are underlined.)
The PCR reaction mixture was constituted as follows: pUC18-FMD-CL-E1-H6 (2 ng/µL), 1 µL; primer H6K hin neu (100 µM), 2 µL; primer H6KRK her neu (100 µM), 2 µL; dNTPs (2.5 mM each), 8 µL; H2O, 76 µL; Expand Long Template PCR System (Boehringer; Cat No 1681 834) Buffer 2 (10x concentrated), 10 µL; Expand Long Template PCR System Polymerase mixture (1 U/µL), 0.75 µL.
The PCR program used consisted of the following steps:
- denaturation step: 15 min at 95°C
- 35 cycles of 30 sec denaturation at 95°C, 1 min annealing at 60°C, and 5 min elongation at 72°C
- termination at 4°C.
An aliquot of the PCR sample was analyzed on a 1% agarose gel to check its size, which was correct (~4.2 kb).
Thereafter the 3'-A overhangs from the PCR product were removed by a T4 polymerase reaction resulting in blunt ends with 3'- and 5'-OH groups. For this purpose, to the remaining 95 µL of the PCR reaction were added 1 µL T4 polymerase (Boehringer; 1 U/µL) and 4 µL dNTPs (2.5 mM each). The sample was incubated for 20 min at 37°C. Subsequently, the DNA in the sample was precipitated with ethanol and dissolved in 16 µL H2O.
Subsequently 5'-phosphates were added to the blunt-ended PCR product by a kinase reaction. For this purpose, to the 16 µL dissolved blunt-ended PCR product were added 1 µL T4 polynucleotide kinase (Boehringer; 1 U/µL), 2 µL 10-fold concentrated T4 polynucleotide kinase reaction buffer (Boehringer), and 1 µL ATP (10 mM). The sample was incubated for 30 min at 37°C.
Subsequently the sample was applied onto a 1 % agarose gel and the correct product band was isolated, by means of the gel extraction kit (Qiagen) according to the supplier's conditions.
Fifty (50) ng of the purified product was then self-ligated by use of T4 ligase (Boehringer) according to the supplier's recommendations. After 72 h incubation at 16°C the DNA in the ligation sample was precipitated with ethanol and dissolved in 10 µL water.
E. coli DH5αF' cells were transformed with 5 µL of the ligation sample. The plasmid DNA of several ampicillin-resistant colonies was analyzed by restriction enzyme digestion; a positive clone was retained and the corresponding plasmid denominated pUC18-FMD-CL-H6-K-E1-H6 (SEQ ID NO:39, Figure 17).
In a second step the transfer vector was constructed by a two-fragment ligation. This construction involved fragments with BclI cohesive ends. Since BclI can cleave its site only on unmethylated DNA, an E. coli dam- strain was transformed with the plasmids involved, pUC18-FMD-CL-H6-K-E1-H6 (SEQ ID NO:39, Figure 17) and pFPMT-CL-E1 (SEQ ID NO:36, Figure 16). From each transformation, an ampicillin-resistant colony was picked and grown in a liquid culture, and the unmethylated plasmid DNAs were prepared for further use. The 1.273 kb BclI/HindIII fragment of the unmethylated plasmid pUC18-FMD-CL-H6-K-E1-H6 (harbouring the FMD promoter, the codons of the CL-H6-K unit, and the start of E1) and the 6.057 kb BclI/HindIII fragment of plasmid pFPMT-CL-E1 (harbouring the missing part of the E1 reading frame starting from the BclI site, without C-terminal His tag, as well as the pFPMT121-located elements except for the FMD promoter) were prepared and ligated together for 72 h at 16°C by use of T4 ligase (Boehringer) in a total volume of 20 µL according to the supplier's specifications. Subsequently, the ligation mixture was placed on a piece of nitrocellulose membrane floating on sterile deionized water in order to desalt the ligation mixture (incubation for 30 min at room temperature). E. coli TOP10 cells were transformed by electroporation with 5 µL of the desalted sample. The plasmid DNA of several resulting ampicillin-resistant colonies was analyzed by restriction enzyme digestion. A positive clone was retained and denominated pFPMT-CL-H6-K-E1 (SEQ ID NO:40, Figure 18).
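The need for an unmethylated plasmid preparation follows from the BclI recognition sequence itself: TGATCA contains GATC, the target of the E. coli Dam methylase, so every BclI site is methylated in a dam+ host. The Python sketch below illustrates this on a short made-up sequence; the function names and the toy sequence are hypothetical and serve only as an illustration.

```python
import re

BCLI = "TGATCA"  # BclI recognition sequence (cuts T^GATCA)
DAM = "GATC"     # E. coli Dam methylase target (adenine methylation)


def find_sites(seq, site):
    """Return 0-based start positions of every (overlapping) occurrence."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def bcli_sites_blocked_by_dam(seq):
    """BclI sites whose internal GATC would be methylated in a dam+ host."""
    dam_positions = set(find_sites(seq, DAM))
    return [p for p in find_sites(seq, BCLI) if p + 1 in dam_positions]


# Hypothetical toy sequence with a single BclI site, for illustration only.
toy = "ccgtTGATCAggcatt"
print(find_sites(toy, BCLI), bcli_sites_blocked_by_dam(toy))
```

Because the GATC lies inside the recognition sequence, the two lists are always identical, which is why the plasmids had to be passaged through a dam- strain before BclI digestion.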
TRANSFORMATION OF HANSENULA POLYMORPHA AND SELECTION OF TRANSFORMANTS
H. polymorpha strain RB11 was transformed (PEG-mediated DNA uptake protocol essentially as described by Klebe, R. J. et al. 1983, with the modification of Roggenkamp, R. et al. 1986) with the different parental shuttle vectors as described in Examples 1 to 5. For each transformation, 72 uracil-prototrophic colonies were selected and used for strain generation by the following procedure. For each colony, a 2 mL liquid culture was inoculated and grown in test tubes for 48 h (37°C; 160 rpm; angle 45°) in selective medium (YNB/glucose, Difco). This step is defined as the first passaging step. A 150 µL aliquot of each culture of the first passaging step was used to inoculate 2 mL fresh YNB/glucose medium. Again, the cultures were incubated as described above (second passaging step). In total, eight such passaging steps were carried out. Aliquots of the cultures after the third and the eighth passaging steps were used to inoculate 2 mL of non-selective YPD medium (Difco). After 48 h of incubation at 37°C (160 rpm; angle 45°; the so-called first stabilization step), 150 µL aliquots of these YPD cultures were used to inoculate fresh 2 mL YPD cultures which were incubated as described above (second stabilization step). Aliquots of the cultures of the second stabilization step were then streaked on plates containing selective YNB/agar. These plates were incubated for four days until macroscopic colonies became visible. A well-defined single colony of each separation was defined as a strain and used for further expression analysis.
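To give a feel for how much growth the passaging scheme imposes, the number of cell doublings can be estimated from the 150 µL into 2 mL transfers. The Python sketch below is a rough estimate only. It assumes that each culture regrows to about the density of the culture it was inoculated from, and it counts only the transfers whose volumes are stated (passaging steps 2 to 8 and the second stabilization step); the constant names are introduced here.

```python
import math

INOCULUM_ML = 0.150      # aliquot carried over at each defined transfer
FRESH_MEDIUM_ML = 2.0    # fresh medium per tube

# Transfers with a stated 150 µL inoculum: passaging steps 2 to 8 (seven
# transfers) plus the second stabilization step (one transfer).
DEFINED_TRANSFERS = 7 + 1

dilution = (INOCULUM_ML + FRESH_MEDIUM_ML) / INOCULUM_ML

# Assumption: each culture regrows to roughly the density of its parent
# culture, so doublings per transfer = log2(dilution factor).
doublings_per_transfer = math.log2(dilution)
total_doublings = DEFINED_TRANSFERS * doublings_per_transfer

print(f"dilution ~{dilution:.1f}-fold, "
      f"~{doublings_per_transfer:.1f} doublings per transfer, "
      f"~{total_doublings:.0f} doublings over {DEFINED_TRANSFERS} transfers")
```

On these assumptions each transfer corresponds to a roughly 14-fold dilution, i.e. a little under four doublings, before the outgrowth of the initial colony and of the first stabilization culture is taken into account.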

Expression analysis was performed on small-scale shake flask cultures. A colony was picked from the above-mentioned YNB/agar plate, inoculated in 2 mL YPD and incubated for 48 h as mentioned above. This 2 mL aliquot was used as seed culture for a 20 mL shake flask culture. YPGlycerol (1%) was used as medium and the shake flask was incubated on a rotary shaker (200 rpm, 37°C). After 48 h of growth, 1% MeOH was added to the culture for induction of the expression cassette. At different time intervals cell pellets of 1 mL aliquots were collected and stored at -20°C until further analysis. Specific protein expression was analyzed by SDS-PAGE/Western blotting. For this purpose, cell pellets were solubilized in sample buffer (TrisHCl - SDS) and incubated for >15 minutes at 95°C. Proteins were separated on a 15% polyacrylamide gel and blotted (wet-blot; bicarbonate buffer) onto nitrocellulose membranes. Blots were developed using a specific murine anti-E1 (IGH 201) or murine anti-E2 (IGH 216, described by Maertens et al. in WO96/04385) as first antibody; rabbit anti-mouse-AP was used as second antibody. Staining was performed with NBT-BCIP.
Positive strains were retained for further investigation.
Five of these positive clones were used in a shake flask expression experiment. A colony of the respective strain was picked from a YNB plate and used to inoculate 2 mL YPD. These cultures were incubated as described above. This cell suspension was used to inoculate a second seed culture of 100 mL YPD medium in a 500 mL shake flask. This shake flask was incubated on a rotary shaker for 48 h at 37°C and 200 rpm. A 25 mL aliquot of this seed culture was used to inoculate 250 mL YPGlycerol (1%) medium and was incubated in a baffled 2 L shake flask under the above-described conditions. At 48 h after inoculation, 1% MeOH (promoter induction) was added and the shake flasks were further incubated under the above-described conditions. At 24 h post induction, the experiment was stopped and cell pellets were collected by centrifugation. The expression level of the five different clones was analyzed by SDS-PAGE/Western blotting (conditions as above). A titration series of each clone was loaded onto the gel and the most productive strain was selected for further fermentation and purification trials.
Surprisingly, H. polymorpha, a yeast strain closely related to Pichia pastoris (Gellissen, G. 2000), is able to express HCV proteins essentially without hyperglycosylation and thus with sugar moieties comparable in size to the HCV envelope proteins expressed by HCV recombinant vaccinia virus-infected mammalian cells.
The Hansenula polymorpha strain RB11 was deposited on April 19, 2002 under the conditions of the Budapest Treaty at the Mycothèque de l'UCL (MUCL), Université Catholique de Louvain, Laboratoire de mycologie, Place Croix du Sud 3 bte 6, B-Louvain-la-Neuve, Belgium and has the MUCL accession number MUCL43805.

CONSTRUCTION OF pSY1aMFE1sH6a VECTOR
The S. cerevisiae expression plasmid was constructed as follows. An E1-coding sequence was isolated as an NsiI/Eco52I fragment from pGEMT-E1sH6 (SEQ ID NO:6, Figure 1), which was made blunt-ended (using T4 DNA polymerase) and cloned in the pYIG5 vector (SEQ ID NO:41, Figure 19) using T4 DNA ligase (Boehringer) according to the supplier's specifications. The cloning was such that the E1s-H6 encoding fragment was joined directly and in frame to the αMF-coding sequence. The ligation mixture was transformed into E. coli DH5αF' cells. Subsequently, the plasmid DNA of several ampicillin-resistant clones was analyzed by restriction digestion and a positive clone was retained and denominated pYIG5E1H6 (ICCG3470; SEQ ID NO:42, Figure 20).
The expression cassette (containing the αMF-sequence and the E1s-coding region with a His tag) was transferred as a BamHI fragment (2790 bp) of pYIG5E1H6 into the BamHI-digested E. coli/S. cerevisiae pSY1 shuttle vector (SEQ ID NO:43, Figure 21). The ligation was performed with T4 DNA ligase (Boehringer) according to the supplier's conditions. The ligation mix was transformed into E. coli DH5αF' cells, and the plasmid DNA of several ampicillin-resistant colonies was analyzed by restriction enzyme digestion. A positive clone was retained and denominated pSY1aMFE1sH6a (ICCG3479; SEQ ID NO:44, Figure 22).

CONSTRUCTION OF pSYYIG5E2H6 VECTOR
The S. cerevisiae expression plasmid pSYYIG5E2H6 was constructed as follows. An E2 coding sequence was isolated as a SalI/KpnI fragment from pBSK-E2sH6 (SEQ ID NO:45, Figure 23), which was made blunt-ended (using T4 DNA polymerase) and subsequently cloned in the pYIG5 vector (SEQ ID NO:41, Figure 19) using T4 DNA ligase (Boehringer) according to the supplier's specifications. The cloning was such that the E2-H6 encoding fragment was joined directly and in frame to the αMF-coding sequence. The ligation mixture was then transformed into E. coli DH5αF' cells, the plasmid DNA of several ampicillin-resistant clones was analyzed by restriction digestion, and a positive clone was retained and denominated pYIG5HCCL-22aH6 (ICCG2424; SEQ ID NO:46, Figure 24).
The expression cassette (containing the αMF-sequence and the E2(384-673) coding region with a His tag) was transferred as a BamHI fragment (3281 bp) of pYIG5HCCL-22aH6 into the BamHI-opened E. coli/S. cerevisiae pSY1 shuttle vector (SEQ ID NO:43, Figure 21). The ligation was performed with T4 DNA ligase (Boehringer) according to the supplier's conditions. The ligation mix was transformed into E. coli DH5αF' cells and the plasmid DNA of several ampicillin-resistant colonies was analyzed by restriction enzyme digestion. A restriction-positive clone was retained and denominated pSYYIG5E2H6 (ICCG2466; SEQ ID NO:47, Figure 25).

CONSTRUCTION OF pSY1YIG7E1s VECTOR
The S. cerevisiae expression plasmid pSY1YIG7E1s was constructed as follows. An E1 coding sequence was isolated as an NsiI/Eco52I fragment from pGEMT-E1s (SEQ ID NO:6, Figure 1), which was made blunt-ended and cloned into the pYIG7 vector (SEQ ID NO:48, Figure 26) using T4 DNA ligase (Boehringer) according to the supplier's specifications. The cloning was such that the E1-encoding fragment was joined directly and in frame to the αMF-coding sequence. The ligation mixture was transformed into E. coli DH5αF' cells, the plasmid DNA of several ampicillin-resistant clones was analyzed by restriction digestion, and a positive clone was retained and denominated pYIG7E1 (SEQ ID NO:49, Figure 27).
The expression cassette (containing the CL leader sequence and the E1(192-326) coding region) was transferred as a BamHI fragment (2790 bp) of pYIG7E1 into the BamHI-digested E. coli/S. cerevisiae pSY1 shuttle vector (SEQ ID NO:43, Figure 21). The ligation was performed with T4 DNA ligase (Boehringer) according to the supplier's conditions. The ligation mix was transformed into E. coli DH5αF' cells and the plasmid DNA of several ampicillin-resistant colonies was analyzed by restriction enzyme digestion. A positive clone was retained and denominated pSY1YIG7E1s (SEQ ID NO:50, Figure 28).

EXAMPLE 10
TRANSFORMATION OF SACCHAROMYCES CEREVISIAE AND SELECTION OF TRANSFORMANTS
In order to overcome hyper-glycosylation problems, often reported for proteins over-expressed in Saccharomyces cerevisiae, a mutant screening was set up. This screening was based on the method of Ballou (Ballou, L. et al. 1991), whereby spontaneous recessive orthovanadate-resistant mutants were selected. Initial strain selection was performed based on the glycosylation pattern of invertase, as observed after native gel electrophoresis. A strain with reduced glycosylation capabilities was retained for further recombinant protein expression experiments and denominated strain IYCC155. The nature of the mutation has not been further studied.
Said glycosylation-deficient strain IYCC155 was transformed with the plasmids as described in Examples 7 to 9, essentially by the lithium acetate method as described by Elble (Elble, R. 1992). Several Ura-complemented strains were picked from a selective YNB + 2% agar plate (Difco) and used to inoculate 2 mL YNB + 2% glucose. These cultures were incubated for 72 h at 37°C and 200 rpm on an orbital shaker, and the culture supernatant and intracellular fractions were analysed for expression of E1 by western blot developed with an E1-specific murine monoclonal antibody (IGH 201). A high-producing clone was retained for further experiments.
The expression of proteins in the S. cerevisiae glycosylation-deficient mutant used here is hampered by the suboptimal growth characteristics of such strains, which lead to a lower biomass yield and thus a lower yield of the desired proteins compared to wild-type S. cerevisiae strains. The yield of the desired proteins was still substantially higher than in mammalian cells.

CONSTRUCTION OF pPICZaIphaD'ElsH6 AND pPICZaIphaE'ElsH6 VECTORS
The shuttle vector pPICZalphaE'ElsH6 was constructed starting from the pPICZalphaA
vector (Invitrogen; SEQ ID NO:51, Figure 29). In a first step said vector was adapted in order to enable cloning of the E1 coding sequence directly behind the cleavage site of the KEX2 or STE13 processing proteases, respectively. Therefore pPICZaIphaA was digested with XhoI
and Notl. The digest was separated on a 1 % agarose gel and the 3519 kb fragment (major part of vector) was isolated and purified by means of a gel extraction kit (Qiagen). This fragment l0 was then ligated using T4 polymerase (Boehringer) according to the supplier's conditions in presence of specific oligonucleotides yielding pPICZaIphaD' (SEQ ID NO:52, Figure 30) or pPICZaIphaE' (SEQ ID N0:53, Figure 31).
The following oligonucleotides were used:
- for constructing pPICZalphaD':
8822: 5'-TCGAGAAAAGGGGCCCGAATTCGCATGC-3' (SEQ ID NO:54); and
8823: 5'-GGCCGCATGCGAATTCGGGCCCCTTTTC-3' (SEQ ID NO:55)
which yield, after annealing, the linker oligonucleotide:
5'-TCGAGAAAAGGGGCCCGAATTCGCATGC-3'     (SEQ ID NO:54)
    3'-CTTTTCCCCGGGCTTAAGCGTACGCCGG-5' (SEQ ID NO:55)
- for constructing pPICZalphaE':
8649: 5'-TCGAGAAAAGAGAGGCTGAAGCCTGCAGCATATGC-3' (SEQ ID NO:56); and
8650: 5'-GGCCGCATATGCTGCAGGCTTCAGCCTCTCTTTTC-3' (SEQ ID NO:57)
which yield, after annealing, the linker oligonucleotide:
5'-TCGAGAAAAGAGAGGCTGAAGCCTGCAGCATATGC-3'     (SEQ ID NO:56)
    3'-CTTTTCTCTCCGACTTCGGACGTCGTATACGCCGG-5' (SEQ ID NO:57)
These shuttle vectors pPICZalphaD' and pPICZalphaE' have newly introduced cloning sites directly behind the cleavage site of the respective processing proteases, KEX2 and STE13.
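The annealing of the two oligonucleotide pairs can be checked with a few lines of Python. This is only an illustrative sketch: the sequences are copied from the text, while the function names and the assumption of a 4-nucleotide 5' overhang on each strand (as left by XhoI and NotI digestion) are introduced here for the example.

```python
# Sketch: check that each linker pair anneals into a duplex with 4-nt
# 5' overhangs. Sequences are taken from the text; helper names are
# hypothetical and used for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str, overhang: int = 4):
    """Return (top 5' overhang, duplex core, bottom 5' overhang).

    Assumes each oligonucleotide carries a 5' overhang of `overhang`
    nucleotides and is otherwise fully complementary to its partner;
    raises ValueError when that is not the case.
    """
    core_top = top[overhang:]
    core_bottom = reverse_complement(bottom[overhang:])
    if core_top != core_bottom:
        raise ValueError("oligonucleotides are not complementary")
    return top[:overhang], core_top, bottom[:overhang]


LINKERS = {
    "pPICZalphaD'": ("TCGAGAAAAGGGGCCCGAATTCGCATGC",
                     "GGCCGCATGCGAATTCGGGCCCCTTTTC"),
    "pPICZalphaE'": ("TCGAGAAAAGAGAGGCTGAAGCCTGCAGCATATGC",
                     "GGCCGCATATGCTGCAGGCTTCAGCCTCTCTTTTC"),
}

if __name__ == "__main__":
    for name, (top, bottom) in LINKERS.items():
        left, core, right = anneal(top, bottom)
        print(f"{name}: 5'-{left} | {core} | {right}-5'")
```

For both pairs the check yields a TCGA overhang on the top strand and a GGCC overhang on the bottom strand, which is what ligation into an XhoI/NotI-digested vector requires.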
The E1-H6 coding sequence was isolated as an NsiI/Eco52I fragment from pGEMT-E1sH6 (SEQ ID NO:6, Figure 1). The fragment was purified using a gel extraction kit (Qiagen) after separation of the digest on a 1 % agarose gel. The resulting fragment was made blunt-ended (using T4 DNA polymerase) and ligated into either pPICZalphaD' or pPICZalphaE' directly behind the respective processing protease cleavage site.
The ligation mixtures were transformed to E. coli TOP10F' cells and plasmid DNA of several zeocin-resistant colonies was analyzed by restriction enzyme digestion. Positive clones were retained and denominated pPICZalphaD'E1sH6 (ICCG3694; SEQ ID NO:58, Figure 32) and pPICZalphaE'E1sH6 (ICCG3475; SEQ ID NO:59, Figure 33), respectively.
EXAMPLE 12
CONSTRUCTION OF pPICZalphaD'E2sH6 AND pPICZalphaE'E2sH6 VECTORS
The shuttle vectors pPICZalphaD' and pPICZalphaE' were constructed as described in Example 11.
The E2-H6 coding sequence was isolated as a SalI/KpnI fragment from pBSK-E2sH6 (SEQ ID NO:45, Figure 23). The fragment was purified with a gel extraction kit (Qiagen) after separation of the digest on a 1 % agarose gel. The resulting fragment was made blunt-ended (using T4 DNA polymerase) and ligated into either pPICZalphaD' or pPICZalphaE' directly behind the respective processing protease cleavage site.
The ligation mixture was transformed to E. coli TOP10F' cells and the plasmid DNA of several zeocin-resistant colonies was analyzed by restriction enzyme digestion. Positive clones were retained and denominated pPICZalphaD'E2sH6 (ICCG3692; SEQ ID NO:60, Figure 34) and pPICZalphaE'E2sH6 (ICCG3476; SEQ ID NO:61, Figure 35), respectively.

TRANSFORMATION OF PICHIA PASTORIS AND SELECTION OF TRANSFORMANTS
The P. pastoris shuttle plasmids as described in Examples 11 and 12 were transformed to P. pastoris cells according to the supplier's conditions (Invitrogen). An E1- and an E2-producing strain were retained for further characterization.

The HCV envelope proteins were expressed in P. pastoris, a yeast well known for the fact that hyperglycosylation is normally absent (Gellissen, G. 2000) and previously used to express dengue virus E protein as a GST fusion (Sugrue, R. J. et al. 1997). Remarkably, the resulting P. pastoris-expressed HCV envelope proteins displayed a glycosylation comparable to that observed in wild-type Saccharomyces strains. More specifically, the HCV envelope proteins produced by P. pastoris are hyperglycosylated (based on the molecular weight of the expression products detected in western blots of proteins isolated from transformed P. pastoris cells).
CULTURE CONDITIONS FOR SACCHAROMYCES CEREVISIAE, HANSENULA

Saccharomyces cerevisiae
Cell banking
Of the selected recombinant clone a master cell bank and working cell bank were prepared. Cryo-vials were prepared from a mid-exponentially grown shake flask culture (incubation conditions as for fermentation seed cultures, see below). Glycerol was added (50 % final conc.) as a cryoprotectant.
Fermentation
Seed cultures were started from a cryo-preserved working cell bank vial and grown in 500 mL medium (YNB supplemented with 2 % sucrose, Difco) in 2 L Erlenmeyer shake flasks at 37°C, 200 rpm for 48 h.
Fermentations were typically performed in Biostat C fermentors with a working volume of 15 L (B. Braun Int., Melsungen, Germany). The fermentation medium contained 1 % yeast extract, 2 % peptone and 2 % sucrose as carbon source. Polyethylene glycol was used as anti-foam agent.
Temperature, pH and dissolved oxygen were typically controlled during the fermentation; applicable set-points are summarised in Table 1. Dissolved oxygen was cascade controlled by agitation/aeration. pH was controlled by addition of NaOH (0.5 M) or H3PO4 solution (8.5 %).
Table 1. Typical parameter settings for S. cerevisiae fermentations
Parameter            Set-point
Temperature          33 - 37 °C
pH                   4.2 - 5.0
DO (growth phase)    10 - 40 % air saturation
DO (induction)       0 - 5 % air saturation
Aeration             0.5 - 1.8 vvm*
Agitation            150 - 900 rpm
* volume replacement per minute
The fermentation was started by the addition of 10 % seed culture. During the growth phase the sucrose concentration was monitored off-line by HPLC analysis (Polyspher OAKC column, Merck).
During the growth phase the dissolved oxygen was controlled by cascade control (agitation/aeration). After complete metabolisation of sucrose, heterologous protein production was driven by the endogenously produced ethanol, supplemented with stepwise addition of EtOH in order to maintain the concentration at approximately 0.5 % (off-line HPLC analysis, Polyspher OAKC column). During this induction phase the dissolved oxygen was controlled below 5 % air saturation by manual adjustment of airflow rate and agitator speed.
Typically the fermentation was harvested 48 to 72 h post induction by concentration via tangential flow filtration followed by centrifugation of the concentrated cell suspension to obtain cell pellets. If not analyzed immediately, cell pellets were stored at -70°C.
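The stepwise ethanol addition during induction amounts to a simple mass balance. The sketch below is illustrative only: it assumes the 0.5 % set-point is expressed as % v/v, that volumes are additive, and that undiluted (100 %) ethanol is dosed; none of these details is stated in the text, and the function name is hypothetical.

```python
def ethanol_addition_l(culture_volume_l: float,
                       measured_pct: float,
                       target_pct: float = 0.5,
                       stock_pct: float = 100.0) -> float:
    """Volume of ethanol stock (L) needed to restore the target concentration.

    Mass balance, assuming additive volumes (an approximation):
        (V * c_measured + v_add * c_stock) / (V + v_add) = c_target
    Solving for v_add gives the expression returned below.
    """
    if measured_pct >= target_pct:
        return 0.0  # nothing to add, concentration is at or above set-point
    return culture_volume_l * (target_pct - measured_pct) / (stock_pct - target_pct)


if __name__ == "__main__":
    # Hypothetical reading: 15 L working volume, HPLC shows 0.2 % ethanol.
    v_add = ethanol_addition_l(15.0, 0.2)
    print(f"add {v_add * 1000:.0f} mL ethanol")
```

In practice the measured value would come from the off-line HPLC analysis, so the calculation would be repeated after every sample.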
Hansenula polymorpha
Cell banking
Of the selected recombinant clone a master cell bank and working cell bank were prepared. Cryo-vials were prepared from a mid-exponentially grown shake flask culture (incubation conditions as for fermentation seed cultures, see below). Glycerol was added (50 % final conc.) as a cryoprotectant.
Fermentation
Seed cultures were started from a cryo-preserved (-70°C) working cell bank vial and grown in 500 mL medium (YPD, Difco) in 2 L Erlenmeyer shake flasks at 37°C, 200 rpm for 48 h.
Fermentations were typically performed in Biostat C fermentors with a working volume of 15 L (B. Braun Int., Melsungen, Germany). The fermentation medium contained 1 % yeast extract, 2 % peptone and 1 % glycerol as carbon source. Polyethylene glycol was used as anti-foam agent.
Temperature, pH, air-in and dissolved oxygen were typically controlled during the fermentation; applicable set-points are summarised in Table 2. Dissolved oxygen was controlled by agitation. pH was controlled by addition of NaOH (0.5 M) or H3PO4 solution (8.5 %).
Table 2. Typical parameter settings for H. polymorpha fermentations
Parameter      Set-point
Temperature    30 - 40 °C
pH             4.2 - 5.0
DO             10 - 40 % air saturation
Aeration       0.5 - 1.8 vvm*
Agitation      150 - 900 rpm
* volume replacement per minute
The fermentation was started by the addition of 10 % seed culture. During the growth phase the glycerol concentration was monitored off-line (Polyspher OAKC column, Merck) and 24 h after complete glycerol consumption 1 % methanol was added in order to induce the heterologous protein expression. The fermentation was harvested 24 h post induction by concentration via tangential flow filtration followed by centrifugation of the concentrated cell suspension to obtain cell pellets. If not analyzed immediately, cell pellets were stored at -70°C.
Pichia pastoris
Small scale protein production experiments with recombinant Pichia pastoris were set up in shake flask cultures. Seed cultures were grown overnight in YPD medium (Difco). Initial medium pH was corrected to 4.5. Shake flasks were incubated on a rotary shaker at 200 - 250 rpm, 37°C.
The small scale production was typically performed at 500 mL scale in 2 L shake flasks and was started with a 10 % inoculation in expression medium containing 1 % yeast extract, 2 % peptone (both Difco), and 2 % glycerol as carbon source. Incubation conditions were as for the seed culture. Induction was started by addition of 1 % MeOH approximately 72 h after inoculation. The cells were collected 24 h post induction by centrifugation. If not analyzed immediately, cell pellets were stored at -70°C.

LEADER PEPTIDE REMOVAL FROM MFα-E1-H6 AND MFα-E2-H6 PROTEINS EXPRESSED IN SELECTED YEAST CELLS
The expression products in Hansenula polymorpha and a Saccharomyces cerevisiae glycosylation minus strain of the HCV E1 and E2 protein constructs with the α-mating factor (αMF) leader sequence of S. cerevisiae were further analyzed. Since both genotype 1b HCV E1s (aa 192-326) and HCV E2s (aa 383-673 extended by the VIEGR (SEQ ID NO:69)-sequence) were expressed as C-terminal his-tagged (H6, HHHHHH, SEQ ID NO:63; said HCV proteins are further on in this Example denoted as αMF-E1-H6 and αMF-E2-H6) proteins, a rapid and efficient purification of the expressed products after guanidinium chloride (GuHCl)-solubilization of the yeast cells was performed on Ni-IDA (Ni-iminodiacetic acid). In brief, cell pellets were resuspended in 50 mM phosphate, 6 M GuHCl, pH 7.4 (9 vol/g cells). Proteins were sulfonated overnight at room temperature (RT) in the presence of 320 mM (4 % w/v) sodium sulfite and 65 mM (2 % w/v) sodium tetrathionate. The lysate was cleared after a freeze-thaw cycle by centrifugation (10,000 g, 30 min, 4°C), and Empigen (Albright & Wilson, UK) and imidazole were added to the supernatant to final concentrations of 1 % (w/v) and 20 mM, respectively. The sample was filtered (0.22 µm) and loaded on a Ni-IDA Sepharose FF column, which was equilibrated with 50 mM phosphate, 6 M GuHCl, 1 % Empigen (buffer A) supplemented with 20 mM imidazole. The column was washed sequentially with buffer A containing 20 mM and 50 mM imidazole, respectively, until the absorbance at 280 nm reached baseline level. The his-tagged products were eluted by applying buffer D: 50 mM phosphate, 6 M GuHCl, 0.2 % (for E1) or 1 % (for E2) Empigen, 200 mM imidazole. The eluted materials were analyzed by SDS-PAGE and western blot using specific monoclonal antibodies directed against E1 (IGH201) or E2 (IGH212).
The E1 products were immediately analyzed by Edman degradation.
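The molar and % (w/v) concentrations quoted for the sulfonation reagents can be cross-checked as follows. This is a sketch under stated assumptions: the molar masses are standard values, and the use of the dihydrate form of sodium tetrathionate is an assumption made here because it reproduces the stated 2 % (w/v); the text does not name the hydrate. Function names are hypothetical.

```python
# Molar masses in g/mol. The dihydrate for tetrathionate is an assumption.
MOLAR_MASS = {
    "sodium sulfite (Na2SO3)": 126.04,
    "sodium tetrathionate dihydrate (Na2S4O6.2H2O)": 306.27,
}


def percent_w_v(conc_mm: float, molar_mass: float) -> float:
    """Convert a millimolar concentration to % (w/v), i.e. g per 100 mL."""
    grams_per_litre = conc_mm / 1000.0 * molar_mass
    return grams_per_litre / 10.0


def lysis_buffer_ml(wet_cell_mass_g: float, volumes_per_gram: float = 9.0) -> float:
    """Buffer volume for resuspension at 9 volumes (mL) per gram of cells."""
    return wet_cell_mass_g * volumes_per_gram


if __name__ == "__main__":
    sulfite = percent_w_v(320, MOLAR_MASS["sodium sulfite (Na2SO3)"])
    tetrathionate = percent_w_v(
        65, MOLAR_MASS["sodium tetrathionate dihydrate (Na2S4O6.2H2O)"])
    print(f"sulfite: {sulfite:.2f} % w/v, tetrathionate: {tetrathionate:.2f} % w/v")
    # Hypothetical 50 g cell pellet
    print(f"buffer for 50 g cells: {lysis_buffer_ml(50):.0f} mL")
```

Both values round to the percentages given in the protocol, which suggests the two ways of stating the concentrations are consistent with each other.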
Since at this stage SDS-PAGE already revealed a very complex picture of protein bands for HCV E2, a further fractionation by size exclusion chromatography was performed. The Ni-IDA eluate was concentrated by ultrafiltration (MWCO 10 kDa, Centriplus, Amicon, Millipore) and loaded on Superdex G200 (10/30 or 16/60; Pharmacia) in PBS, 1 % Empigen or PBS, 3 % Empigen. Elution fractions containing E2 products with a Mr between ~80 kDa and ~45 kDa, i.e. fractions 17-23 of the elution profile in Figure 37 based on the migration on SDS-PAGE (Figure 38), were pooled and alkylated (incubation with 10 mM DTT for 3 h at RT followed by incubation with 30 mM iodoacetamide for 3 h at RT). Samples for amino-terminal sequencing were treated with Endo H (Roche Biochemicals) or left untreated. The glycosylated and deglycosylated E2 products were blotted on PVDF membranes for amino-terminal sequencing. An amido-black stained blot of glycosylated and deglycosylated E2 is shown in Figure 39.
The sequencing of both the E1 and E2 purified products led to the disappointing observation that removal of the signal sequence from the HCV envelope proteins occurs only partially (see Table 3). In addition, the majority of the side products (degradation products and products still containing the leader sequence or part thereof) are glycosylated. This glycosylation resides in part even on the non-cleaved fragment of the signal sequence, which also contains an N-glycosylation site. These sites can be mutated in order to result in less glycosylated side products. However, even more problematic is the finding that some alternatively cleaved products differ by only 1 to 4 amino acids from the desired intact envelope protein. Consequently, purification of the correctly processed product is virtually impossible due to the lack of sufficiently discriminating biochemical characteristics between the different expression products. Several of the degradation products may be the result of a Kex-2-like cleavage (e.g. the cleavage observed after aa 196 of E1, which is a cleavage after an arginine), which is also required for the cleavage of the α-mating factor leader and which can thus not be blocked without disturbing this essential process.
A high E1-producing clone derived from transformation of S. cerevisiae IYCC155 with pSY1YIG7E1s (SEQ ID NO:50; Figure 28) was compared with a high-producing clone derived from transformation of S. cerevisiae IYCC155 with pSY1aMFE1sH6aYIG1E1s (SEQ ID NO:44; Figure 22). The intracellular expression of the E1 protein was evaluated from 2 up to 7 days after induction by means of western blot using the E1-specific monoclonal antibody (IGH 201). As can be judged from Figure 40, maximal expression was observed after 2 days for both strains, but the expression patterns for the two strains are completely different. Expression with the α-mating factor leader results in a very complex pattern of bands, which is a consequence of the fact that the processing of the leader is not efficient. This leads to several expression products with a different amino-terminus, some of which are modified by 1 to 5 N-glycosylations. However, for the E1 expressed with the CL leader a limited number of distinct bands is visible, which reflects the high level of correct CL leader removal and the fact that only this correctly processed material may be modified by N-glycosylation (1 to 5 chains), as observed for Hansenula-derived E1 expressed with the same CL leader (see Example 16).
The hybridoma cell line producing the monoclonal antibody directed against E1 (IGH201) was deposited on March 12, 1998 under the conditions of the Budapest Treaty at the European Collection of Cell Cultures, Centre for Applied Microbiology & Research, Salisbury, Wiltshire SP4 0JG, UK, and has the accession number ECACC 98031216. The monoclonal antibody directed against E2 (IGH212) has been described as antibody 12D11F2 in Example 7.4 by Maertens et al. in WO96/04385.

Table 3. Identification of N-termini of αMF-E1-H6 and αMF-E2-H6 proteins expressed in S. cerevisiae or H. polymorpha. Based on the N-terminal sequencing the amount of N-termini of the mature E1-H6 and E2-H6 proteins could be estimated ("mature" indicating correct removal of the αMF signal sequence). The total amount of protein products was calculated as pmol of protein based on the intensity of the peaks recovered by Edman degradation. Subsequently, for each specific protein (i.e. for each 'detected N-terminus') the mol % versus the total was estimated.
Yeast: S. cerevisiae
  αMF-E1-H6, Experiment 1:
  - 16 % of proteins still containing αMF sequences
  - 18 % of proteins cleaved between aa 195 and 196 of E1
  - 66 % of proteins with correctly removed αMF
  αMF-E1-H6, Experiment 2:
  - 18 % of proteins still containing αMF sequences
  - 33 % of proteins cleaved between aa 195 and 196 of E1
  - 8 % of other E1 cleavage products
  - 44 % of proteins with correctly removed αMF
  αMF-E2-VIEGR-H6: /
Yeast: H. polymorpha
  αMF-E1-H6:
  - 64 % of proteins still containing αMF sequences
  - 6 % of proteins cleaved between aa 192 and 193 of E1
  - 30 % of proteins with correctly removed αMF
  αMF-E2-VIEGR-H6:
  - 75 % of proteins still containing αMF sequences
  - 25 % of proteins with correctly removed αMF
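The mol % estimation described in the caption of Table 3 can be sketched as below. The pmol values are hypothetical placeholders, chosen only so that the output matches the H. polymorpha E1 percentages of Table 3; the actual Edman peak recoveries are not given in the text, and the function name is introduced here for illustration.

```python
def mol_percent(pmol_by_n_terminus: dict) -> dict:
    """Express each detected N-terminus as mol % of the total recovered pmol."""
    total = sum(pmol_by_n_terminus.values())
    if total <= 0:
        raise ValueError("total recovered amount must be positive")
    return {name: 100.0 * pmol / total
            for name, pmol in pmol_by_n_terminus.items()}


if __name__ == "__main__":
    # Hypothetical pmol recoveries per detected N-terminus (illustration only).
    recovered = {
        "still containing alpha-MF sequences": 32.0,
        "cleaved between aa 192 and 193 of E1": 3.0,
        "correctly removed alpha-MF": 15.0,
    }
    for name, pct in mol_percent(recovered).items():
        print(f"{pct:5.1f} %  {name}")
```

Because every value is normalised to the same total, the percentages for one sample should sum to 100 within rounding.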

EXPRESSION OF AN E1 CONSTRUCT IN YEAST SUITABLE FOR LARGE SCALE PRODUCTION AND PURIFICATION

Several other leader sequences were used to replace the S. cerevisiae αMF leader peptide, including CHH (leader sequence of Carcinus maenas hyperglycemic hormone), Amy1 (leader sequence of amylase from S. occidentalis), Gam1 (leader sequence of glucoamylase from S. occidentalis), Phy5 (leader sequence from fungal phytase), pho1 (leader sequence from acid phosphatase from Pichia pastoris) and CL (leader of avian lysozyme C, 1,4-beta-N-acetylmuramidase C), and linked to E1-H6 (i.e. E1 with C-terminal his-tag). All constructs were expressed in Hansenula polymorpha and each of the resulting cell lysates was subjected to western blot analysis. This already allowed the conclusion that the extent of removal of the leader or signal sequence or peptide was extremely low, except for the construct wherein CL is used as leader peptide. This was confirmed for the CHH-E1-H6 construct by Edman degradation of Ni-IDA purified material: no correctly cleaved product could be detected although several different sequences were recovered (see Table 4).
Table 4. Identification of N-termini of CHH-E1-H6 proteins expressed in H. polymorpha, based on N-terminal amino acid sequencing of different protein bands after separation by SDS-PAGE and blotting to a PVDF membrane.
Molecular size    Identified N-termini
45 kD             starts at amino acid 27 of CHH leader = only pre-sequence cleaved, pro-sequence still attached
26 kD             - partially starts at amino acid 1 of CHH leader = no removal of pre-pro-sequence
                  - partially starts at amino acid 9 of CHH leader = product of alternative translation starting at second AUG codon
24 kD             - partially starts at amino acid 1 of CHH leader = no removal of pre-pro-sequence
                  - partially starts at amino acid 9 of CHH leader = product of alternative translation starting at second AUG codon
As mentioned already, the western blots of the cell lysates revealed a pattern of E1-specific protein bands indicative of a higher degree of correct removal of the CL leader peptide. This is surprising since this leader is not derived from a yeast. Amino acid sequencing by Edman degradation of GuHCl-solubilized and Ni-IDA-purified material indeed confirmed that 84 % of the E1 proteins is correctly cleaved and that the material is essentially free of degradation products. Still 16 % of non-processed material is present, but since this material is non-glycosylated it can be easily removed from the mixture, allowing specific enrichment of correctly cleaved and glycosylated E1. Such a method for enrichment may be an affinity chromatography on lectins; other alternatives are also given in Example 19. Alternatively, the higher hydrophobic character of the non-glycosylated material may be used to select and optimize other enrichment procedures. The correct removal of the CL leader peptide from the CL-E1-H6 protein was further confirmed by mass spectrometry, which also confirmed that up to 4 out of the 5 N-glycosylation sites of genotype 1b E1s can be occupied, whereby the sequence NNSS (amino acids 233 to 236; SEQ ID NO:73) is considered to be a single N-glycosylation site.
ENCODING CONSTRUCT
The efficiency of removal of the CL leader peptide from the CL-E2-VIEGR-H6 (further on in this Example denoted as "CL-E2-H6") protein expressed in Hansenula polymorpha was analyzed.
Since the HCV E2s (aa 383-673) was expressed as a his-tagged protein, a rapid and efficient purification of the expressed protein after GuHCl-solubilization of collected cells was performed on Ni-IDA. In brief, cell pellets were resuspended in 30 mM phosphate, 6 M GuHCl, pH 7.2 (9 mL buffer/g cells). The protein was sulfonated overnight at room temperature in the presence of 320 mM (4 % w/v) sodium sulfite and 65 mM (2 % w/v) sodium tetrathionate. The lysate was cleared after a freeze-thaw cycle by centrifugation (10,000 g, 30 min, 4°C). Empigen BB (Albright & Wilson) and imidazole were added to a final concentration of 1 % (w/v) and 20 mM, respectively. All further chromatographic steps were executed on an Äkta FPLC workstation (Pharmacia). The sample was filtered through a 0.22 µm pore size membrane (cellulose acetate) and loaded on a Ni-IDA column (Chelating Sepharose FF loaded with Ni2+, Pharmacia), which was equilibrated with 50 mM phosphate, 6 M GuHCl, 1 % Empigen BB, pH 7.2 (buffer A) supplemented with 20 mM imidazole. The column was washed sequentially with buffer A containing 20 mM and 50 mM imidazole, respectively, until the absorbance at 280 nm reached the baseline level. The his-tagged products were eluted by applying buffer D: 50 mM phosphate, 6 M GuHCl, 0.2 % Empigen BB (pH 7.2), 200 mM imidazole. The purified materials were analysed by SDS-PAGE and western blot using a specific monoclonal antibody directed against E2 (IGH212) (Figure 41). The IMAC-purified E2-H6 protein was also subjected to N-terminal sequencing by Edman degradation. To this end, proteins were treated with N-glycosidase F (Roche) (0.2 U/µg E2, 1 h incubation at 37°C in PBS/3 % Empigen BB) or left untreated. The glycosylated and deglycosylated E2-H6 proteins were subjected to SDS-PAGE and blotted on a PVDF membrane for amino acid sequencing (analysis was performed on a PROCISE™ 492 protein sequencer, Applied Biosystems). Since at this stage SDS-PAGE revealed some degradation products, a further fractionation by size exclusion chromatography was performed. To this end, the Ni-IDA eluate was concentrated by ultrafiltration (MWCO 10 kDa, Centriplus, Amicon, Millipore) and loaded on a Superdex G200 (Pharmacia) in PBS, 1 % Empigen BB.
Elution fractions containing mainly intact E2s-related products with a Mr between ~30 kDa and ~70 kDa, based on the migration on SDS-PAGE, were pooled and eventually alkylated (incubation with 5 mM DTT for 30 minutes at 37°C, followed by incubation with 20 mM iodoacetamide for 30 minutes at 37°C). The possible presence of degradation products after IMAC purification can thus be overcome by a further fractionation of the intact product by means of size exclusion chromatography. An unexpectedly good result was obtained. Based on the N-terminal sequencing, the amount of E2 product from which the CL leader peptide is removed could be estimated. The total amount of protein products is calculated as pmol of protein based on the intensity of the peaks recovered by Edman degradation. Subsequently, for each specific protein (i.e. for each 'detected N-terminus') the mol % versus the total is estimated. In the current experiment, only the correct N-terminus of E2-H6 was detected; other variants of E2-H6 lacking amino acids of the E2 protein or containing N-terminal amino acids not comprised in the E2 protein were absent. In conclusion, the E2-H6 protein expressed by H. polymorpha as CL-E2-H6 protein was isolated without any further in vitro processing as a > 95 % correctly cleaved protein. This is in sharp contrast with the fidelity of leader peptide removal by H. polymorpha of the αMF-E2-H6 protein to the E2-H6 protein, which was estimated to occur in 25 % of the isolated proteins (see Table 3).

PROTEIN EXPRESSED IN HANSENULA POLYMORPHA FROM THE CL-H6-K-E1 ENCODING CONSTRUCT AND IN VITRO PROCESSING OF H6-CONTAINING PROTEINS
The efficiency of removal of the CL leader peptide from the CL-H6-K-E1 protein expressed in H. polymorpha was analyzed, as well as the efficiency of subsequent in vitro processing in order to remove the H6 (his-tag)-adaptor peptide and the Endo Lys-C processing site. Since the HCV E1s (aa 192-326) was expressed as an N-terminal His-K-tagged protein CL-H6-K-E1, a rapid and efficient purification could be performed as described in Example 17. The elution profile of the IMAC-chromatographic purification of H6-K-E1 (and possibly residual CL-H6-K-E1) proteins is shown in Figure 42. After SDS-PAGE and silver staining of the gel and western blot analysis using a specific monoclonal antibody directed against E1 (IGH201) (Figure 43), the elution fractions (63-69) containing the recombinant E1s products were pooled ('IMAC pool') and subjected to an overnight Endoproteinase Lys-C (Roche) treatment (enzyme/substrate ratio of 1/50 (w/w), 37°C) in order to remove the H6-K fusion tail.
Removal of non-processed fusion product was performed by a negative IMAC chromatography step on a Ni-IDA column, whereby Endo Lys-C-processed proteins are collected in the flow-through fraction. To this end, the Endoproteinase Lys-C digested protein sample was applied on a Ni-IDA column after a 10-fold dilution with 10 mM NaH2PO4.3H2O, 1 % (v/v) Empigen BB, pH 7.2 (buffer B), followed by washing with buffer B until the absorbance at 280 nm reached the baseline level. The flow-through was collected in different fractions (1-40) that were screened for the presence of E1s products (Figure 44). The fractions (7-28) containing intact E1 from which the N-terminal H6-K (and possibly residual CL-H6-K) tail is removed (with a Mr between ~15 kDa and ~30 kDa based on the migration on SDS-PAGE followed by silver staining or western blot analysis using a specific monoclonal antibody directed against E1 (IGH201)) were pooled and alkylated (incubation with 5 mM DTT for 30 minutes at 37°C, followed by incubation with 20 mM iodoacetamide for 30 minutes at 37°C).
This material was subjected to N-terminal sequencing (Edman degradation). To this end, protein samples were treated with N-glycosidase F (Roche) (0.2 U/µg E1, 1 h incubation at 37°C in PBS/3 % Empigen BB) or left untreated. The glycosylated and deglycosylated E1 proteins were then separated by SDS-PAGE and blotted on a PVDF membrane for further analysis by Edman degradation (analysis was performed on a PROCISE™ 492 protein sequencer, Applied Biosystems). Based on the N-terminal sequencing, the amount of correctly processed E1 product could be estimated (processing includes correct cleavage of the H6-K sequence).

The total amount of protein products is calculated as pmol of protein based on the intensity of the peaks recovered by Edman degradation. Subsequently, for each specific protein (i.e. for each 'detected N-terminus') the mol % versus the total is estimated. In the current experiment, only the correct N-terminus of E1 was detected and not the N-termini of other processing variants of H6-K-E1. Based thereon, in vitro processing by Endo Lys-C of the H6-K-E1 (and possibly residual CL-H6-K-E1) protein to the E1 protein was estimated to occur with a fidelity of more than 95 %.
HEPARIN
In order to find specific purification steps for HCV envelope proteins from yeast cells, binding with heparin was evaluated. Heparin is known to bind to several viruses and consequently binding to the HCV envelope has already been suggested (Garson, J. A. et al. 1999). In order to analyze this potential binding, heparin was biotinylated and its interaction with HCV E1 was analyzed in microtiter plates coated with either sulfonated HCV E1 from H. polymorpha, alkylated HCV E1 from H. polymorpha (both produced as described in Example 16) or alkylated HCV E1 from a culture of mammalian cells transfected with a vaccinia expression vector. Surprisingly, a strong binding could only be observed with sulfonated HCV E1 from H. polymorpha, while binding with HCV E1 from mammalian cell culture was completely absent. By means of western blot we could show that this binding was specific for the lower molecular weight bands of the HCV E1 protein mixture (Figure 45), corresponding to low-glycosylated mature HCV E1s. Figure 45 also reveals that sulfonation is not essential for heparin binding, since upon removal of this sulfonation binding is still observed for the low molecular weight E1 (lane 4). Alkylation, in contrast, reduces this binding substantially; however, this may be caused by the specific alkylation agent (iodoacetamide) used in this example. This finding further demonstrated the industrial applicability of the CL-HCV-envelope expression cassettes for yeast, since we specifically can enrich HCV E1 preparations towards a preparation with HCV E1 proteins with a higher degree of glycosylation (i.e. more glycosylation sites occupied).

FORMATION AND ANALYSIS OF VIRUS-LIKE PARTICLES (VLPs)
Conversion of the HCV E1 and E2 envelope proteins expressed in H. polymorpha (Examples 16 to 18) to VLPs was done essentially as described by Depla et al. in WO99/67285 and by Bosman et al. in WO01/30815. Briefly, after cultivation of the transformed H. polymorpha cells during which the HCV envelope proteins were expressed, cells were harvested, lysed in GuHCl and sulphonated as described in Example 17. His-tagged proteins were subsequently purified by IMAC and concentrated by ultrafiltration as described in Example 17.
VLP-formation of HCV envelope proteins with sulphonated Cys-thiol groups
The concentrated HCV envelope proteins sulphonated during the isolation procedure were not subjected to a reducing treatment and were loaded on a size-exclusion chromatography column (Superdex G200, Pharmacia) equilibrated with PBS, 1 % (v/v) Empigen. The eluted fractions were analyzed by SDS-PAGE and western blotting. The fractions with a relative Mr ~29-~15 kD (based on SDS-PAGE migration) were pooled, concentrated and loaded on Superdex G200, equilibrated with PBS, 3 % (w/v) betain, to enforce virus-like particle (VLP) formation. The fractions were pooled, concentrated and desalted to PBS, 0.5 % (w/v) betain.
VLP-formation of HCV envelope proteins with irreversibly modified Cys-thiol groups
The concentrated HCV envelope proteins sulphonated during the isolation procedure were subjected to a reducing treatment (incubation in the presence of 5 mM DTT in PBS) to convert the sulphonated Cys-thiol groups to free Cys-thiol groups. Irreversible Cys-thiol modification was performed by (i) incubation for 30 min in the presence of 20 mM iodoacetamide, or by (ii) incubation for 30 min in the presence of 5 mM N-ethylmaleimide (NEM) and 15 mM biotin-N-ethylmaleimide. The proteins were subsequently loaded on a size-exclusion chromatography column (Superdex G200, Pharmacia) equilibrated with PBS, 1 % (v/v) Empigen in case of iodoacetamide-blocking, or with PBS, 0.2 % CHAPS in case of blocking with NEM and biotin-NEM. The eluted fractions were analyzed by SDS-PAGE and western blotting. The fractions with a relative Mr ~29-~15 kD (based on SDS-PAGE migration) were pooled, concentrated and, to force virus-like particle formation, loaded on a Superdex G200 column equilibrated with PBS, 3 % (w/v) betain. The fractions were pooled, concentrated and desalted to PBS, 0.5 % (w/v) betain in case of iodoacetamide-blocking, or to PBS, 0.05 % CHAPS in case of blocking with NEM and biotin-NEM.
VLP-formation of HCV envelope proteins with reversibly modified Cys-thiol groups
The concentrated HCV envelope proteins sulphonated during the isolation procedure were subjected to a reducing treatment (incubation in the presence of 5 mM DTT in PBS) to convert the sulphonated Cys-thiol groups to free Cys-thiol groups. Reversible Cys-thiol modification was performed by incubation for 30 min in the presence of dithiodipyridine (DTDP), dithiocarbamate (DTC) or cysteine. The proteins were subsequently loaded on a size-exclusion chromatography column (Superdex G200, Pharmacia) equilibrated with PBS, 1 % (v/v) Empigen. The eluted fractions were analyzed by SDS-PAGE and western blotting. The fractions with a relative Mr ~29-~15 kD (based on SDS-PAGE migration) were pooled, concentrated and loaded on Superdex G200, equilibrated with PBS, 3 % (w/v) betain, to enforce virus-like particle (VLP) formation. The fractions were pooled, concentrated and desalted to PBS, 0.5 % (w/v) betain.
The elution profiles of size-exclusion chromatography in PBS, 3 % (w/v) betain to obtain VLPs of H. polymorpha-expressed E2-H6 are shown in Figure 46 (sulphonated) and Figure 47 (alkylated with iodoacetamide).
The elution profiles of size-exclusion chromatography in PBS, 3 % (w/v) betain to obtain VLPs of H. polymorpha-expressed E1 are shown in Figure 48 (sulphonated) and Figure 49 (alkylated with iodoacetamide). The resulting VLPs were analyzed by SDS-PAGE and western blotting as shown in Figure 50.
Size-analysis of VLPs formed by H. polymorpha-expressed HCV envelope proteins
The VLP particle size was determined by Dynamic Light Scattering. For the light-scattering experiments, a particle-size analyzer (Model Zetasizer 1000 HS, Malvern Instruments Ltd., Malvern, Worcester, UK) was used which was controlled by photon correlation spectroscopy (PCS) software. Photon correlation spectroscopy or dynamic light scattering (DLS) is an optical method that measures Brownian motion and relates this to the size of particles. Light from a continuous, visible laser beam is directed through an ensemble of macromolecules or particles in suspension and moving under Brownian motion. Some of the laser light is scattered by the particles and this scattered light is measured by a photomultiplier.
Fluctuations in the intensity of scattered light are converted into electrical pulses which are fed into a correlator. This generates the autocorrelation function which is passed to a computer where the appropriate data analysis is performed. The laser used was a 10 mW monochromatic coherent He-Ne laser with a fixed wavelength of 633 nm. For each sample, three to six consecutive measurements were taken.
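The relation between Brownian motion and particle size that DLS relies on is commonly written as the Stokes-Einstein equation, d_H = k_B T / (3 pi eta D), where D is the translational diffusion coefficient obtained from the autocorrelation function. The Python sketch below illustrates only this conversion. The diffusion coefficient, temperature and viscosity are hypothetical example values, not data from the measurements described here, and the instrument software may apply a different or more elaborate analysis.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def hydrodynamic_diameter_nm(diffusion_m2_per_s,
                             temperature_k=298.15,
                             viscosity_pa_s=0.00089):
    """Stokes-Einstein: d_H = k_B * T / (3 * pi * eta * D), returned in nm.

    The default temperature (25 degrees C) and viscosity (roughly that of
    water) are assumptions chosen for illustration only.
    """
    if diffusion_m2_per_s <= 0:
        raise ValueError("diffusion coefficient must be positive")
    diameter_m = (BOLTZMANN * temperature_k
                  / (3 * math.pi * viscosity_pa_s * diffusion_m2_per_s))
    return diameter_m * 1e9


# Hypothetical diffusion coefficient (m^2/s); not a measured value.
example_diffusion = 1.6e-11
print(f"{hydrodynamic_diameter_nm(example_diffusion):.1f} nm")
```

Because the diameter is inversely proportional to D, slower diffusion corresponds to larger particles.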
The results of these experiments are summarized in Table 5.
Table 5. Results of dynamic light scattering analysis on the indicated VLP-compositions of HCV envelope proteins expressed by H. polymorpha. The VLP particle sizes are given as mean diameter of the particles.

Cys-thiol modification        E1-H6       E2-VIEGR-H6   E1
sulphonation                  25-45 nm    20 nm         20-26 nm
alkylation (iodoacetamide)    23-56 nm    20-56 nm      21-25 nm

The observation that sulphonated HCV E1 derived from H. polymorpha still forms particles with a size in the same range as alkylated HCV E1 from Hansenula is surprising. Such an effect was not expected since the high (up to 8 Cys-thiol groups can be modified on HCV E1) net increase of negative charges as a consequence of sulphonation should induce an ionic repulsion between the subunits. The other reversible cysteine modifying agents tested also allowed particle formation; the HCV E1 produced in this way, however, proved to be less stable than the sulphonated material, resulting in disulfide-based aggregation of the HCV E1.
In order to use these other reversible blockers, further optimization of the conditions is required.

ANTIGENIC EQUIVALENCE OF HANSENULA-PRODUCED HCV E1-H6 AND HCV E1 PRODUCED BY VACCINIA-INFECTED MAMMALIAN CELLS
The reactivity of Hansenula-produced HCV E1-H6 with sera from HCV chronic carriers was compared to the reactivity of HCV E1 produced by HCV-recombinant vaccinia virus-infected mammalian cells as described by Depla et al. in WO 99/67285. Both HCV-E1 preparations tested consisted of VLPs wherein the HCV E1 proteins were alkylated with NEM and biotin-NEM. The reactivities of both HCV E1 VLP-preparations with sera from HCV chronic carriers were determined by ELISA. The results are summarized in Table 6. As can be derived from Table 6, no differences in reactivity were noted between HCV E1 expressed in HCV-recombinant vaccinia virus-infected mammalian cells and HCV E1 expressed in H. polymorpha.
Table 6. Antigenicity of E1 produced in a mammalian cell culture or produced in H. polymorpha was evaluated on a panel of sera from human HCV chronic carriers. For this purpose biotinylated E1 was bound to streptavidin-coated ELISA plates. Thereafter human sera were added at a 1/20 dilution and immunoglobulins from the sera bound to E1 were detected with a rabbit-anti-human IgG-Fc specific secondary antibody labeled with peroxidase. Results are expressed as OD-values. The average values are the averages of the OD-values of all serum samples tested.

Serum    Hansenula   mammalian      Serum    Hansenula   mammalian
17766    1.218       1.159          55337    1.591       1.416
17767    1.513       1.363          55348    1.392       1.261
17777    0.806       0.626          55340    1.202       0.959
17784    1.592       1.527          55342    1.599       1.477
17785    1.508       1.439          55345    1.266       1.428
17794    1.724       1.597          55349    1.329       1.137
17798    1.132       0.989          55350    1.486       1.422
17801    1.636       1.504          55352    0.722       1.329
17505    1.053       0.944          55353    1.065       1.157
17810    1.134       0.999          55354    1.118       1.092
17819    1.404       1.24           55355    0.754       0.677
17820    1.308       1.4            55362    1.43        1.349
17826    1.163       1.009          55365    1.612       1.608
17827    1.668       1.652          55368    0.972       0.959
17849    1.595       1.317          55369    1.506       1.377
55333    1.217       1.168

IMMUNOGENIC EQUIVALENCE OF HANSENULA-PRODUCED HCV E1-H6 AND HCV E1 PRODUCED BY VACCINIA-INFECTED MAMMALIAN CELLS
The immunogenicity of Hansenula-produced HCV E1-H6 was compared to the immunogenicity of HCV E1 produced by HCV-recombinant vaccinia virus-infected mammalian cells as described by Depla et al. in WO 99/67285. Both HCV-E1 preparations tested consisted of VLPs wherein the HCV E1 proteins were alkylated with iodoacetamide.
Both VLP preparations were formulated with alum and injected in Balb/c mice (3 intramuscular/subcutaneous injections with a three week interval between each and each consisting of 5 µg E1 in 125 µl containing 0.13% Alhydrogel, Superfos, Denmark). Mice were bled ten days after the third immunization.
Results of this experiment are shown in Figure 51. For the top part of Figure 51, antibodies raised following immunization with VLPs of E1 produced in mammalian cells were determined. Antibody titers were determined by ELISA (see Example 21) wherein either E1 produced in mammalian cells ("M") or Hansenula-produced E1 ("H") were coated directly on the ELISA solid support whereafter the ELISA plates were blocked with casein.
For the bottom part of Figure 51, antibodies raised following immunization with VLPs of Hansenula-produced E1 were determined. Antibody titers were determined by ELISA (see Example 21) wherein either E1 produced in mammalian cells ("M") or Hansenula-produced E1 ("H") were coated directly on the ELISA solid support whereafter the ELISA plates were blocked with casein.
The antibody titers determined were end point titers. The end point titer is determined as the dilution of serum resulting in an OD (as determined by ELISA) equal to two times the mean of the background of the assay.
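The end point definition above can be sketched in Python as follows. This is a minimal illustration, assuming that the OD is measured over a serial dilution and that the crossing point is found by linear interpolation of OD against the log of the dilution; the interpolation scheme, the function name `end_point_titer` and the example readings are assumptions made for illustration and are not taken from the experiments described here.

```python
import math


def end_point_titer(dilution_factors, od_values, background_ods):
    """Return the reciprocal serum dilution at which the OD equals
    two times the mean background OD, or None if no crossing is found.

    The crossing point is interpolated linearly on a log10(dilution) axis
    (an assumption; the source only defines the cutoff itself).
    """
    cutoff = 2.0 * sum(background_ods) / len(background_ods)
    pairs = sorted(zip(dilution_factors, od_values))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(pairs, pairs[1:]):
        # Look for the dilution interval in which the OD falls through the cutoff.
        if od_lo >= cutoff >= od_hi and od_lo != od_hi:
            frac = (od_lo - cutoff) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_titer
    return None


# Hypothetical readings for one serum (not data from this document).
dilutions = [100, 1000, 10000, 100000]
ods = [1.8, 0.9, 0.3, 0.06]
background = [0.05, 0.05]
print(end_point_titer(dilutions, ods, background))
```

If the OD never drops to the cutoff within the tested range, the function returns None, signalling that further dilutions would be needed.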
Figure 51 shows that no significant differences were observed between the immunogenic properties of both E1-compositions and that the determined antibody titers are independent of the antigen used in the ELISA to perform the end point titration.
The yeast-derived HCV E1 induced upon vaccination a protective response similar to the protective response obtained upon vaccination with alkylated HCV E1 derived from mammalian cell culture. The latter response was able to prevent chronic evolution of HCV after an acute infection.

ANTIGENIC AND IMMUNOGENIC PROFILE OF HANSENULA-PRODUCED HCV E1-H6 WHICH IS SULPHONATED
The reactivity of Hansenula-produced HCV E1-H6 with sera from HCV chronic carriers was compared to the reactivity of HCV E1 produced by HCV-recombinant vaccinia virus-infected mammalian cells as described by Depla et al. in WO 99/67285. Both HCV-E1 preparations tested consisted of VLPs wherein the Hansenula-produced HCV E1 proteins were sulphonated and the HCV E1 produced by mammalian cells was alkylated. The results are given in Table 7. Although the overall (average) reactivity was identical, some major differences were noted for individual sera. This implies that the sulphonated material presents at least some of its epitopes in a way different from alkylated HCV E1.
The immunogenicity of Hansenula-produced HCV E1-H6 which was sulphonated was compared to the immunogenicity of Hansenula-produced HCV E1-H6 which was alkylated.
Both HCV-E1 preparations tested consisted of VLPs. Both VLP preparations were formulated with alum and injected in Balb/c mice (3 intramuscular/subcutaneous injections with a three week interval between each and each consisting of 5 µg E1 in 125 µl containing 0.13% Alhydrogel, Superfos, Denmark). Mice were bled ten days after the third immunization.
Antibody titers were determined similarly as described in Example 22.
Surprisingly, immunization with sulphonated material resulted in higher antibody titers, regardless of the antigen used in ELISA to assess these titers (Figure 51; top panel: titration of antibodies raised against alkylated E1; bottom panel: titration of antibodies raised against sulphonated E1; "A": alkylated E1 coated on ELISA plate; "S": sulphonated E1 coated on ELISA plate).
However, in this experiment individual titers differ depending on the antigen used for analysis, which confirms the observation noted with sera from HCV patients.
Consequently, HCV E1 wherein the cysteine thiol groups are modified in a reversible way may be more immunogenic and thus have an increased potency as a vaccine protecting against HCV (chronic infection). In addition, induction of a response to neo-epitopes induced by irreversible blocking is less likely to occur.

Table 7. Antigenicity of alkylated E1 (produced in mammalian cell culture) or sulphonated E1-H6 (produced in H. polymorpha) was evaluated on a panel of sera from human HCV chronic carriers ("patient sera") and a panel of control sera ("blood donor sera"). For this purpose E1 was bound to ELISA plates, after which the plates were further saturated with casein. Human sera were added at a 1/20 dilution and bound immunoglobulins were detected with a rabbit-anti-human IgG-Fc specific secondary antibody labeled with peroxidase. Results are expressed as OD-values. The average values are the averages of the OD-values of all serum samples tested.

patient sera                              blood donor sera
serum    Hansenula   mammalian            serum    Hansenula   mammalian
17766    0.646       0.333                F500     0.055       0.054
17777    0.46        0.447                F504     0.05        0.05
17785    0.74        0.417                F508     0.05        0.054
17794    1.446       1.487                F510     0.05        0.058
17801    0.71        0.902                F511     0.05        0.051
17819    0.312       0.539                F512     0.051       0.057
17827    1.596       1.576                F513     0.051       0.052
17849    0.586       0.964                F527     0.057       0.054
55333    0.69        0.534
55338    0.461       0.233
55340    0.106       0.084
55345    1.474       1.258
55352    1.008       0.668
55355    0.453       0.444
55362    0.362       0.717
55369    0.24        0.452
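The comparison drawn from Table 7 (similar average reactivity, but differences for individual sera) can be re-traced with a short Python sketch. The OD-values are the patient-sera values as they appear in Table 7; the 1.5-fold threshold used to flag individual sera and the function name `summarize` are hypothetical choices made for illustration and are not criteria stated in this document.

```python
from statistics import mean

# (serum, OD with sulphonated Hansenula E1-H6, OD with alkylated mammalian E1),
# transcribed from the patient-sera part of Table 7.
PATIENT_SERA = [
    ("17766", 0.646, 0.333), ("17777", 0.46, 0.447), ("17785", 0.74, 0.417),
    ("17794", 1.446, 1.487), ("17801", 0.71, 0.902), ("17819", 0.312, 0.539),
    ("17827", 1.596, 1.576), ("17849", 0.586, 0.964), ("55333", 0.69, 0.534),
    ("55338", 0.461, 0.233), ("55340", 0.106, 0.084), ("55345", 1.474, 1.258),
    ("55352", 1.008, 0.668), ("55355", 0.453, 0.444), ("55362", 0.362, 0.717),
    ("55369", 0.24, 0.452),
]


def summarize(rows, fold=1.5):
    """Return (mean Hansenula OD, mean mammalian OD, sera differing >= fold).

    The fold threshold is an illustrative assumption, not a source criterion.
    """
    mean_h = mean(r[1] for r in rows)
    mean_m = mean(r[2] for r in rows)
    discordant = [r[0] for r in rows
                  if max(r[1], r[2]) / min(r[1], r[2]) >= fold]
    return mean_h, mean_m, discordant


mean_h, mean_m, discordant = summarize(PATIENT_SERA)
print(f"mean OD Hansenula: {mean_h:.3f}, mammalian: {mean_m:.3f}")
print("sera differing by at least 1.5-fold:", discordant)
```

With these values the two means are close to each other (about 0.71 and 0.69), while several individual sera differ noticeably in one direction or the other.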

IDENTICAL ANTIGENIC REACTIVITY OF HANSENULA-PRODUCED HCV E1-H6 WITH SERA FROM VACCINATED CHIMPANZEES
The reactivities of the E1 produced by HCV-recombinant vaccinia virus-infected mammalian cells and the E1-H6 produced by Hansenula (both alkylated) with sera from vaccinated chimpanzees and with monoclonal antibodies were compared. Thereto, said E1 proteins were coated directly to ELISA plates followed by saturation of the plates with casein. The end point titers of antibodies binding the E1 proteins coated to the ELISA plates were determined for chimpanzee sera and for specific murine monoclonal antibodies, all obtained from animals immunized with E1 produced by mammalian cells. End point titer determination was done as described in Example 22. The murine monoclonal antibodies used were IGH201 (see Example 15), IGH198 (IGH198 = 23C12 in Maertens et al. in WO 96/04385), IGH203 (IGH203 = 1566 in Maertens et al. in WO 96/04385) and IGH202 (IGH202 = 3F3 in Maertens et al. in WO 99/50301).
As can be derived from Figure 53, the reactivities of sera from 7 different chimpanzees are identical when tested with E1 protein produced by either Hansenula or mammalian cells.
The reactivities of the monoclonal antibodies against HCV E1 are also almost equal. Two of the chimpanzees (Yoran and Marti) were involved in a prophylactic vaccine study and were able to clear an acute infection upon challenge while a control animal did not clear the infection.
The five other chimpanzees (Ton, Phil, Marcel, Peggy, Femora) were involved in therapeutic vaccination studies and showed a reduction in liver damage, as measured by ALT in serum and/or histological activity index on liver biopsy, upon the HCV E1 immunizations.
The results obtained in this experiment are clearly different from the findings of Mustilli and coworkers (Mustilli, A. C. et al. 1999) who expressed the HCV E2 protein both in Saccharomyces cerevisiae and Kluyveromyces lactis.
The purified yeast-produced E2 was, however, different from the HCV E2 produced by mammalian (CHO) cells in that a lower reactivity was observed with sera from chimpanzees immunized with HCV E2 produced by mammalian cells while reactivity with monoclonal antibodies was higher for the yeast-produced HCV E2.

GLYCOPROFILING OF HCV E1 BY FLUOROPHORE-ASSISTED CARBOHYDRATE ELECTROPHORESIS (FACE)
The glycosylation profiles of Hansenula-produced HCV E1 and HCV E1 produced by HCV-recombinant vaccinia virus-infected mammalian cells as described by Depla et al. in WO 99/67285 were compared. This was done by means of fluorophore-assisted carbohydrate electrophoresis (FACE). Thereto, oligosaccharides were released from E1s produced by mammalian cells or Hansenula by peptide-N-glycosidase (PNGase F) and labelled with ANTS (the E1 proteins were alkylated with iodoacetamide prior to PNGase F digestion).
ANTS-labeled oligosaccharides were separated by PAGE on a 21% polyacrylamide gel at a current of 15 mA at 4°C for 2-3 h. From Figure 54, it was concluded that the oligosaccharides on E1 produced by mammalian cells and E1-H6 produced by Hansenula migrate like oligomaltose with a degree of polymerization between 7 and 11 monosaccharides. This indicates that the Hansenula expression system surprisingly leads to an E1 protein which is not hyperglycosylated and which has sugar chains with a length similar to the sugar chains added to E1 proteins produced in mammalian cells.

Reference List
Agaphonov,M.O., Beburov,M.Y., Ter Avanesyan,M.D., and Smirnov,V.N. (1995) A disruption-replacement approach for the targeted integration of foreign genes in Hansenula polymorpha. Yeast 11:1241-1247.
Agaphonov,M.O., Trushkina,P.M., Sohn,J.H., Choi,E.S., Rhee,S.K., and Ter Avanesyan,M.D. (1999) Vectors for rapid selection of integrants with different plasmid copy numbers in the yeast Hansenula polymorpha DL1. Yeast 15:541-551.
Alber,T. and Kawasaki,G. (1982) Nucleotide sequence of the triose phosphate isomerase gene of Saccharomyces cerevisiae. J.Mol.Appl.Genet. 1:419-434.
Ammerer,G. (1983) Expression of genes in yeast using the ADCI promoter. Methods Enzymol. 101:192-201.
Ballou,L., Hitzeman,R.A., Lewis,M.S., and Ballou,C.E. (1991) Vanadate-resistant yeast mutants are defective in protein glycosylation. Proc.Natl.Acad.Sci.U.S.A 88:3209-3212.
Beekman,N.J., Schaaper,W.M., Tesser,G.I., Dalsgaard,K., Kamstrup,S., Langeveld,J.P., Boshuizen,R.S., and Meloen,R.H. (1997) Synthetic peptide vaccines: palmitoylation of peptide antigens by a thioester bond increases immunogenicity. J.Pept.Res. 50:357-364.
Burns,J., Butler,J., and Whitesides,G. (1991) Selective reduction of disulfides by Tris(2-carboxyethyl)phosphine. J.Org.Chem. 56:2648-2650.
Cox,H., Mead,D., Sudbery,P., Eland,R.M., Mannazzu,I., and Evans,L. (2000) Constitutive expression of recombinant proteins in the methylotrophic yeast Hansenula polymorpha using the PMA1 promoter. Yeast 16:1191-1203.
Cregg,J.M. (1999) Expression in the methylotrophic yeast Pichia pastoris. In Gene expression systems: using nature for the art of expression, J.M.Fernandez and J.P.Hoeffler, eds (San Diego: Academic Press), pp. 157-191.
Darbre,A. (1986) Practical protein chemistry: a handbook. Wiley & Sons Ltd.

Doms,R.W., Lamb,R.A., Rose,J.K., and Helenius,A. (1993) Folding and assembly of viral membrane proteins. Virology 193:545-562.
Elble,R. (1992) A simple and efficient procedure for transformation of yeasts. Biotechniques 13:18-20.
Gailit,J. (1993) Restoring free sulfhydryl groups in synthetic peptides. Anal.Biochem. 214:334-335.
Garson,J.A., Lubach,D., Passas,J., Whitby,K., and Grant,P.R. (1999) Suramin blocks hepatitis C binding to human hepatoma cells in vitro. J.Med.Virol. 57:238-242.
Gatzke,R., Weydemann,U., Janowicz,Z.A., and Hollenberg,C.P. (1995) Stable multicopy integration of vector sequences in Hansenula polymorpha. Appl.Microbiol.Biotechnol. 43:844-849.
Gellissen,G. (2000) Heterologous protein production in methylotrophic yeasts. Appl.Microbiol.Biotechnol. 54:741-750.
Grakoui,A., Wychowski,C., Lin,C., Feinstone,S.M., and Rice,C.M. (1993) Expression and identification of hepatitis C virus polyprotein cleavage products. J.Virol. 67:1385-1395.
Heile,J.M., Fong,Y.L., Rosa,D., Berger,K., Saletti,G., Campagnoli,S., Bensi,G., Capo,S., Coates,S., Crawford,K., Dong,C., Wininger,M., Baker,G., Cousens,L., Chien,D., Ng,P., Archangel,P., Grandi,G., Houghton,M., and Abrignani,S. (2000) Evaluation of hepatitis C virus glycoprotein E2 for vaccine design: an endoplasmic reticulum-retained recombinant protein is superior to secreted recombinant protein and DNA-based vaccine candidates. J.Virol. 74:6885-6892.
Helenius,A. (1994) How N-linked oligosaccharides affect glycoprotein folding in the endoplasmic reticulum. Mol.Biol.Cell 5:253-265.
Hermanson,G.T. (1996) Bioconjugate techniques. San Diego: Academic Press.
Herscovics,A. and Orlean,P. (1993) Glycoprotein biosynthesis in yeast. FASEB J. 7:540-550.

Hijikata,M., Kato,N., Ootsuyama,Y., Nakagawa,M., and Shimotohno,K. (1991) Gene mapping of the putative structural region of the hepatitis C virus genome by in vitro processing analysis. Proc.Natl.Acad.Sci.U.S.A 88:5547-5551.
Hitzeman,R.A., Clarke,L., and Carbon,J. (1980) Isolation and characterization of the yeast 3-phosphoglycerokinase gene (PGK) by an immunological screening technique. J.Biol.Chem. 255:12073-12080.
Hollenberg,C.P. and Gellissen,G. (1997) Production of recombinant proteins by methylotrophic yeasts. Curr.Opin.Biotechnol. 8:554-560.
Holmgren,A. (1979) Thioredoxin catalyzes the reduction of insulin disulfides by dithiothreitol and dihydrolipoamide. J.Biol.Chem. 254:9627-9632.
Jayabaskaran,C., Davison,P.F., and Paulus,H. (1987) Facile preparation and some applications of an affinity matrix with a cleavable connector arm containing a disulfide bond. Prep.Biochem. 17:121-141.
Julius,D., Brake,A., Blair,L., Kunisawa,R., and Thorner,J. (1984) Isolation of the putative structural gene for the lysine-arginine-cleaving endopeptidase required for processing of yeast prepro-alpha-factor. Cell 37:1075-1089.
Kalef,E., Walfish,P.G., and Gitler,C. (1993) Arsenical-based affinity chromatography of vicinal dithiol-containing proteins: purification of L1210 leukemia cytoplasmic proteins and the recombinant rat c-erb A beta 1 T3 receptor. Anal.Biochem. 212:325-334.
Kato,N., Ootsuyama,Y., Tanaka,T., Nakagawa,M., Nakazawa,T., Muraiso,K., Ohkoshi,S., Hijikata,M., and Shimotohno,K. (1992) Marked sequence diversity in the putative envelope proteins of hepatitis C viruses. Virus Res. 22:107-123.
Kawasaki,G. and Fraenkel,D.G. (1982) Cloning of yeast glycolysis genes by complementation. Biochem.Biophys.Res.Commun. 108:1107-1122.
Klebe,R.J., Harriss,J.V., Sharp,Z.D., and Douglas,M.G. (1983) A general method for polyethylene-glycol-induced genetic transformation of bacteria and yeast. Gene 25:333-341.

Kumar,N., Kella,D., and Kinsella,J.E. (1985) A method for the controlled cleavage of disulfide bonds in proteins in the absence of denaturants. J.Biochem.Biophys.Methods 11:251-263.
Kumar,N., Kella,D., and Kinsella,J.E. (1986) Anomalous effects of denaturants on sulfitolysis of protein disulfide bonds. Int.J.Peptide Prot.Res. 28:586-592.
Maertens,G. and Stuyver,L. (1997) Genotypes and genetic variation of hepatitis C virus. In The molecular medicine of viral hepatitis, T.J.Harrison and A.J.Zuckerman, eds (John Wiley & Sons), pp. 183-233.
Major,M.E. and Feinstone,S.M. (1997) The molecular virology of hepatitis C. Hepatology 25:1527-1538.
Mustilli,A.C., Izzo,E., Houghton,M., and Galeotti,C.L. (1999) Comparison of secretion of a hepatitis C virus glycoprotein in Saccharomyces cerevisiae and Kluyveromyces lactis. Res.Microbiol. 150:179-187.
Nagai,K. and Thogersen,H.C. (1984) Generation of beta-globin by sequence-specific proteolysis of a hybrid protein produced in Escherichia coli. Nature 309:810-812.
Nielsen,P.E. (2001) Targeting double stranded DNA with peptide nucleic acid (PNA). Curr.Med.Chem. 8:545-550.
Okabayashi,K., Nakagawa,Y., Hayasuke,N., Ohi,H., Miura,M., Ishida,Y., Shimizu,M., Murakami,K., Hirabayashi,K., Minamino,H., et al. (1991) Secretory expression of the human serum albumin gene in the yeast, Saccharomyces cerevisiae. J.Biochem.(Tokyo) 110:103-110.
Orum,H. and Wengel,J. (2001) Locked nucleic acids: a promising molecular family for gene-function analysis and antisense drug development. Curr.Opin.Mol.Ther. 3:239-243.
Padgett,K.A. and Sorge,J.A. (1996) Creating seamless junctions independent of restriction sites in PCR cloning. Gene 168:31-35.
Pedersen,J., Lauritzen,C., Madsen,M.T., and Weis,D.S. (1999) Removal of N-terminal polyhistidine tags from recombinant proteins using engineered aminopeptidases. Protein Expr.Purif. 15:389-400.

Pomroy,N.C. and Deber,C.M. (1998) Solubilization of hydrophobic peptides by reversible cysteine PEGylation. Biochem.Biophys.Res.Commun. 245:618-621.
Raymond,C.K. (1999) Recombinant protein expression in Pichia methanolica. In Gene expression systems: using nature for the art of expression, J.M.Fernandez and J.P.Hoeffler, eds (San Diego: Academic Press), pp. 193-209.
Rein,A., Ott,D.E., Mirro,J., Arthur,L.O., Rice,W., and Henderson,L.E. (1996) Inactivation of murine leukemia virus by compounds that react with the zinc finger in the viral nucleocapsid protein. J.Virol. 70:4966-4972.
Roggenkamp,R., Hansen,H., Eckart,M., Janowicz,Z., and Hollenberg,C.P. (1986) Transformation of the methylotrophic yeast Hansenula polymorpha by autonomous replication and integration vectors. Mol.Gen.Genet. 202:302-308.
Rosa,D., Campagnoli,S., Moretto,C., Guenzi,E., Cousens,L., Chin,M., Dong,C., Weiner,A.J., Lau,J.Y., Choo,Q.L., Chien,D., Pileri,P., Houghton,M., and Abrignani,S. (1996) A quantitative test to estimate neutralizing antibodies to the hepatitis C virus: cytofluorimetric assessment of envelope glycoprotein 2 binding to target cells. Proc.Natl.Acad.Sci.U.S.A 93:1759-1763.
Rose,J.K. and Doms,R.W. (1988) Regulation of protein export from the endoplasmic reticulum. Annu.Rev.Cell Biol. 4:257-288.
Russell,D.W., Smith,M., Williamson,V.M., and Young,E.T. (1983) Nucleotide sequence of the yeast alcohol dehydrogenase II gene. J.Biol.Chem. 258:2674-2682.
Russell,P.R. (1983) Evolutionary divergence of the mRNA transcription initiation mechanism in yeast. Nature 301:167-169.
Russell,P.R. (1985) Transcription of the triose-phosphate-isomerase gene of Schizosaccharomyces pombe initiates from a start point different from that in Saccharomyces cerevisiae. Gene 40:125-130.
Russell,P.R. and Hall,B.D. (1983) The primary structure of the alcohol dehydrogenase gene from the fission yeast Schizosaccharomyces pombe. J.Biol.Chem. 258:143-149.

Sambrook,J., Fritsch,E.F., and Maniatis,T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press.
Singh,R. and Kats,L. (1995) Catalysis of reduction of disulfide by selenol. Anal.Biochem. 232:86-91.
Sohn,J.H., Choi,E.S., Kang,H.A., Rhee,J.S., and Rhee,S.K. (1999) A family of telomere-associated autonomously replicating sequences and their functions in targeted recombination in Hansenula polymorpha DL-1. J.Bacteriol. 181:1005-1013.
Stevens,R.C. (2000) Design of high-throughput methods of protein production for structural biology. Structure.Fold.Des 8:R177-R185.
Stuyver,L., van Arnhem,W., Wyseur,A., Hernandez,F., Delaporte,E., and Maertens,G. (1994) Classification of hepatitis C viruses based on phylogenetic analysis of the envelope 1 and nonstructural 5B regions and identification of five additional subtypes. Proc.Natl.Acad.Sci.U.S.A 91:10134-10138.
Sugrue,R.J., Cui,T., Xu,Q., Fu,J., and Chan,Y.C. (1997) The production of recombinant dengue virus E protein using Escherichia coli and Pichia pastoris. J.Virol.Methods 69:159-169.
Thakur,M.L., DeFulvio,J., Richard,M.D., and Park,C.H. (1991) Technetium-99m labeled monoclonal antibodies: evaluation of reducing agents. Int.J.Rad.Appl.Instrum.B 18:227-233.
Vingerhoeds,M.H., Haisma,H.J., Belliot,S.O., Smit,R.H., Crommelin,D.J., and Storm,G. (1996) Immunoliposomes as enzyme-carriers (immuno-enzymosomes) for antibody-directed enzyme prodrug therapy (ADEPT): optimization of prodrug activating capacity. Pharm.Res. 13:604-610.
Wahlestedt,C., Salmi,P., Good,L., Kela,J., Johnsson,T., Hokfelt,T., Broberger,C., Porreca,F., Lai,J., Ren,K., Ossipov,M., Koshkin,A., Jakobsen,N., Skouv,J., Oerum,H., Jacobsen,M.H., and Wengel,J. (2000) Potent and nontoxic antisense oligonucleotides containing locked nucleic acids. Proc.Natl.Acad.Sci.U.S.A 97:5633-5638.

Weydemann,U., Keup,P., Piontek,M., Strasser,A.W., Schweden,J., Gellissen,G., and Janowicz,Z.A. (1995) High-level secretion of hirudin by Hansenula polymorpha--authentic processing of three different preprohirudins. Appl.Microbiol.Biotechnol. 44:377-385.

SEQUENCE LISTING
<110> Innogenetics N.V.
<120> Constructs and methods for expression of recombinant HCV envelope proteins
<130> 134 PCT
<160> 98
<170> PatentIn version 3.1

<210> 1
<211> 18
<212> PRT
<213> avian lysozyme signal peptide
<220>
<221> MISC_FEATURE
<222> (2)..(2)
<223> Xaa is Arg, Lys or Val
<220>
<221> MISC_FEATURE
<222> (3)..(3)
<223> Xaa is Ser, Ala, Val, Arg or Met
<220>
<221> MISC_FEATURE
<222> (4)..(4)
<223> Xaa is Leu or Phe
<220>
<221> MISC_FEATURE
<222> (5)..(5)
<223> Xaa is Leu or Ala
<220>
<221> MISC_FEATURE
<222> (6)..(6)
<223> Xaa is Ile, Thr, Phe or Val
<220>
<221> MISC_FEATURE
<222> (7)..(7)
<223> Xaa is Leu, Phe or Ala
<220>
<221> MISC_FEATURE
<222> (8)..(8)
<223> Xaa is Val, Ile, Ala, Leu or Cys
<220>
<221> MISC_FEATURE
<222> (9)..(9)
<223> Xaa is Leu, Phe, Ala or Ile
<220>
<221> MISC_FEATURE
<222> (10)..(10)
<223> Xaa is Cys, Phe, Ser or Leu
<220>
<221> MISC_FEATURE
<222> (11)..(11)
<223> Xaa is Phe, Leu, Ser or Pro
<220>
<221> MISC_FEATURE
<222> (12)..(12)
<223> Xaa is Leu, Ala or Met
<220>
<221> MISC_FEATURE
<222> (13)..(13)
<223> Xaa is Pro, Ala or Ile
<220>
<221> MISC_FEATURE
<222> (14)..(14)
<223> Xaa is Leu or Ala
<220>
<221> MISC_FEATURE
<222> (15)..(15)
<223> Xaa is Ala, Val, Ser or Met
<220>
<221> MISC_FEATURE
<222> (16)..(16)
<223> Xaa is Ala, Lys or Ser
<220>
<221> MISC_FEATURE
<222> (17)..(17)
<223> Xaa is Leu, Pro, Gln or Ile
<400> 1
Met Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly

<210> 2
<211> 135
<212> PRT
<213> hepatitis C virus
<400> 2
Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

<210> 3
<211> 290
<212> PRT
<213> hepatitis C virus
<400> 3
His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln

<210> 4
<211> 141
<212> PRT
<213> hepatitis C virus
<400> 4
Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp His His His His His His

<210> 5
<211> 301
<212> PRT
<213> hepatitis C virus
<400> 5
His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Val Ile Glu Gly Arg His His His His His His

<210> 6
<211> 3448
<212> DNA
<213> vector pGEMTE1sH6
<400> 6
aatcactagtgcggccgcctgcaggtcgaccatatgggagagctcccaacgcgttggatg  60
catagcttgagtattctatagtgtcacctaaatagcttggcgtaatcatggtcatagctg  120
tttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcata  180
aagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctca  240
ctgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgc  300
gcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctg  360
cgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggtta  420
tccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggcc  480
aggaaccgtaaaaaggccgcgttgctggcgtttttcgataggctccgcccccctgacgag  540
catcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagatac  600
caggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttacc  660
ggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgt  720
aggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaacccccc  780
gttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaaga  840
cacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgta  900
ggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaaggacagta  960
tttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttga  1020
tccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacg  1080
cgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcag  1140
tggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacc  1200
tagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaact  1260
tggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctattt  1320
cgttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggctta  1380
ccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagattta  1440
tcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatcc  1500
gcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaat  1560
agtttgcgcaacgttgttggcattgctacaggcatcgtggtgtcacgctcgtcgtttggt  1620
atggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttg  1680
tgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgca  1740
gtgttatcactcatggttatggcagcactgcataattctcttactgtcatgccatccgta  1800
agatgcttttctgtgactggtgagtactcaaccaagtcattctgagaataccgcgcccgg  1860
cgaccgagttgctcttgcccggcgtcaatacgggataatagtgtatgacatagcagaact  1920
ttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccg  1980
ctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatctttt  2040
actttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaaggga  2100
ataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagc  2160
atttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaa  2220
caaataggggttccgcgcacatttccccgaaaagtgccacctgtatgcggtgtgaaatac  2280
cgcacagatgcgtaaggagaaaataccgcatcaggcgaaattgtaaacgttaatattttg  2340
ttaaaattcgcgttaaatatttgttaaatcagctcattttttaaccaataggccgaaatc  2400
ggcaaaatcccttataaatcaaaagaatagaccgagatagggttgagtgttgttccagtt  2460
tggaacaagagtccactattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtc  2520
tatcagggcgatggcccactacgtgaaccatcacccaaatcaagttttttgcggtcgagg  2580
tgccgtaaagctctaaatcggaaccctaaagggagcccccgatttagagcttgacgggga  2640
aagccggcgaacgtggcgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcg  2700
ctggcaagtgtagcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccg  2760
ctacagggcgcgtccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgc  2820
gggcctcttcgctattacgccagctggcgaaagggggatgtgctgcaaggcgattaagtt  2880
gggtaacgccagggttttcccagtcacgacgttgtaaaacgacggccagtgaattgtaat  2940
acgactcactatagggcgaattgggcccgacgtcgcatgctcccggccgccatggccgcg  3000
ggattccaatgcatatgaggtgcgcaacgtgtccgggatgtaccatgtcacgaacgactg  3060
ctccaactcaagcattgtgtatgaggcagcggacatgatcatgcacacccccgggtgcgt  3120
gccctgcgttcgggagaacaactcttcccgctgctgggtagcgctcacccccacgctcgc  3180
agctaggaacgccagcgtccccactacgacaatacgacgccacgtcgatttgctcgttgg  3240
ggcggctgctttctgttccgctatgtacgtgggggatctctgcggatctgtcttcctcgt  3300
ctcccagctgttcaccatctcgcctcgccggcatgagacggtgcaggactgcaattgctc  3360
aatctatcccggccacataacaggtcaccgtatggcttgggatatgatgatgaactggca  3420
ccaccaccatcaccattaaggatccaag  3448

<210> 7
<211> 37
<212> DNA
<213> synthetic probe or primer
<400> 7
agttactctt caaggtatga ggtgcgcaac gtgtccg  37

<210> 8
<211> 47
<212> DNA
<213> synthetic probe or primer
<400> 8
agttactctt cacagggatc ctccttaatg gtgatggtgg tggtgcc  47

<210> 9
<211> 3067
<212> DNA
<213> vector pCHH-Hir
<400> 9
gcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggca 60 cgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagct 120 cactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaat 180 tgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgccaagcttgc 240 atgcctgcaggtcgaccctagatctctattactgcaggtattcttccgggatttcttcga 300 agtcgccgtcgttgtgagactgcggacgcggggtaccttcgccagtaacgcactggttac 360 gttcgcctttagagcccaggatgcatttgttgccctggccgcaaacgttagagccttcgc 420 acaggcacaggttctgaccggattcagtgcagtcagtgtaaacaaccctcttttccaacg 480 ggtgtgtagttccattctccaccgctagggctgcgctgggctccattggcgaggttttca 540 aggccgctaggatgcgatccatgcgtccgtagccttgcgtggagcgtgcgtgtgcgtgcg 600 ggagtgcgcataggtaggctacggtgatgattgctagcatggcgggaatagttttgctat 660 acatgaattcactggccgtcgttttacaacgtcgtgactgggaaaaccctggcgttaccc 720 aacttaatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaagaggccc 780 gcaccgatcg cccttcccaa cagttgcgca gcctgaatgg cgaatggcgc ctgatgcggt 840 attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa 900 tCtgCtCtga tgccgcatag ttaagccagc cccgacaccc gccaacaccc gctgacgcgc 960 cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc gtctccggga 1020 gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga aagggcctcg 1080 tgatacgcct atttttatag gttaatgtca tgataataat ggtttcttag acgtcaggtg 1140 gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 1200 atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 1260 agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 1320 ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 1380 gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 1440 gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 1500 tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 1560 acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 1620 aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 1680 cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 1740 gccttgatcg ttgggaaccg 
gagctgaatg aagccatacc aaacgacgag cgtgacacca 1800 cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 1860 tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 1920 tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 1980 ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 2040 tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 2100 gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 2160 ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 2220 tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 2280 agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 2340 aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 2400 cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 2460 agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 2520 tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 2580 gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 2640 gcttggagcgaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcg2700 ccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcggaacag2760 gagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggt2820 ttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctat2880 ggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggccttttgctc2940 acatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagt3000 gagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaag3060 cggaaga 3067 <210> 10 <211> 35 <212> DNA
<213> synthetic probe or primer <400> 10 agttactctt cacctctttt ccaacgggtg tgtag 35 <210> 11 <211> 34 <212> DNA
<213> synthetic probe or primer <400> 11 agtcactctt cactgcaggc atgcaagctt ggcg 34 <210>

<211>

<212>
DNA

<213> vector pFPMT121 <400>

ggtaccctgctcaatctccggaatggtgatctgatcgttc~ctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccagatctgaattcccg360 atgaagcagagagcgcaggaggcggtatttatagtgccattcccctctctgagagacccg420 gatggtagtcgagtgtatcggagacagcttgatgtagactCCgtgCCtgCCggCtCCtCt480 tattggcggacaccagtgagacaccccggaacttgctgtttttctgcaaaatccggggtg540 accagtgggagcctatttgcacacacgagcgggacaccccactctggtgaagagtgccaa600 agtcattctt tttcccgttg cggggcagcc gattgcatgt tttaggaaaa tattaccttt 660 gctacaccctgtcagatttaccctccacacatatatattccgtcacctccagggactatt720 attcgtcgttgcgccgccagcggaagatatccagaagctgttttccgagagactcggttg780 gcgcctggtatatttgatggatgtcgcgctgcctcacgtcccggtacccaggaacgcggt840 gggatctcgggcccatcgaagactgtgctccagactgctcgcccagcaggtgtttcttga900 tcgccgcctctaaattgtccgcgcatcgccggtaacatttttccagctcggagtttgcgt960 ttagatacagtttctgcgatgccaaaggagcctgcagattataacctcggatgctgtcat1020 tcagcgcttttaatttgacctccagatagttgctgtatttctgttcccattggctgctgc1080 gcagcttcgtataactcgagttattgttgcgctctgcctcggcgtactggctcatgatct1140 ggatcttgtccgtgtcgcttttcttcgagtgtttctcgcaaacgatgtgcacggcctgca1200 gtgtccaatcggagtcgagctggcgccgaaactggcggatctgagcctccacactgccct1260 gtttctctat ccacggcggaaccgcctcctgccgtttcagaatgttgttcaagtggtact1320 ctgtgcggtc aatgaaggcgttattgccggtgaaatctttgggaagcggttttcctcggg1380 gaagattacg aaattccccgcgtcgttgcgcttcctggatctcgaggagatcgttctccg1440 cgtcgaggag atcgttctccgcgtcgacaccattccttgcggcggcggtgctcaacggcc1500 tcaacctact actgggctgcttcctaatgcaggagtcgcataagggagagcgtcgacaaa1560 cccgcgtttg agaacttgctcaagcttctggtaaacgttgtagtactctgaaacaaggcc1620 ctagcactct gatctgtttctcttgggtagcggtgagtggtttattggagttcactggtt1680 tcagcacatc tgtcatctagacaatattgttactaaatttttttgaactacaattgttcg1740 taattcatct attattatacatcctcgtcagcaatttctggcagacggagtttactaacg1800 tcttgagtat gaggccgaga atccagctct gtggccatac 
tcagtcttga cagcctgctg 1860 atgtggctgc gttcaacgca ataagcgtgt cctccgactc cgagttgtgc tcgttatcgt 1920 cgttctcatc ctcggaaaaa tcacacgaaa gaacatactc accagtaggc tttctggtcc 1980 ctggggcacg gctgtttctg acgtattccg gcgttgataa tagctcgaaa gtgaacgccg 2040 agtcgcgggagtcgaccgatgcccttgagagccttcaacccagtcagctccttccggtgg2100 gcgcggggcatgactatcgtcgccgcacttatgactgtcttctttatcatgcaactcgta2160 ggacaggtgccggcagcgctctgggtcattttcggcgaggaccgctttcgctggagcgcg2220 acgatgatcggcctgtcgcttgcggtattcggaatcttgcacgccctcgctcaagccttc2280 gtcactggtcccgccaccaaacgtttcggcgagaagcaggccattatcgccggcatggcg2340 gccgacgcgctgggctacgtcttgctggcgttcgcgacgcgaggctggatggccttcccc2400 attatgattcttCtCgCttCCggCggCatCgggatgcccgcgttgcaggccatgctgtcc2460 aggcaggtagatgacgaccatcagggacagcttcaaggatcgctcgcggctcttaccagc2520 ctaacttcgatcactggaccgctgatcgtcacggcgatttatgccgcctcggcgagcaca2580 tggaacgggttggcatggattgtaggcgccgccctataccttgtctgcctccccgcgttg2640 cgtcgcggtgcatggagccgggccacctcgacctgaatggaagccggcggcacctcgcta2700 acggattcaccactccaagaattggagccaatcaattcttgcggagaactgtgaatgcgc2760 aaaccaacccttggcagaacatatccatcgcgtccgccatctccagcagccgcacgcggc2820 gcatcggggggggggggggggggggggggcaaacaattcatcattttttttttattcttt2880 tttttgatttcggtttctttgaaatttttttgattcggtaatctccgaacagaaggaaga2940 acgaaggaaggagcacagacttagattggtatatatacgcatatgtagtgttgaagaaac3000 atgaaattgcccagtattcttaacccaactgcacagaacaaaaacctgcaggaaacgaag3060' ataaatcatgtcgaaagctacatataaggaacgtgctgctactcatcctagtcctgttgc3120 tgccaagctatttaatatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgt3180 tcgtaccaccaaggaattactggagttagttgaagcattaggtcccaaaatttgtttact3240 aaaaacacatgtggatatcttgactgatttttccatggagggcacagttaagccgctaaa3300 ggcattatccgccaagtacaattttttactcttcgaagacagaaaatttgctgacattgg3360 taatacagtcaaattgcagtactctgcgggtgtatacagaatagcagaatgggcagacat3420 tacgaatgcacacggtgtggtgggcccaggtattgttagcggtttgaagcaggcggcaga3480 agaagtaacaaaggaacctagaggccttttgatgttagcagaattgtcatgcaagggctc3540 cctatctactggagaatatactaagggtactgttgacattgcgaagagcgacaaagattt3600 tgttatcggctttattgctcaaagagacatgggtggaagagatgaaggttacgattggtt3660 
gattatgacacccggtgtgggtttagatgacaagggagacgcattgggtcaacagtatag3720 aaccgtggatgatgtggtctctacaggatctgacattattattgttggaagaggactatt3780 tgcaaagggaagggatgctaaggtagagggtgaacgttacagaaaagcaggctgggaagc3840 atatttgagaagatgcggccagcaaaactaaaaaactgtattataagtaaatgcatgtat3900 actaaactcacaaattagagcttcaatttaattatatcagttattacccgggaatctcgg3960 tcgtaatgatttttata~.tgacgaaaaaaaaaaaattggaaagaaaagcccccccccccc4020 ccccccccccccccccccccccgcagcgttgggtcctggccacgggtgcgcatgatcgtg4080 ctcctgtcgttgaggacccggctaggctggcggggttgccttactggttagcagaatgaa4140 tcaccgatacgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacctgagca4200 acaacatgaatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagtcagcg4260 ccctgcaccattatgttccggatctgcatcgcaggatgctgctggctaccctgtggaaca4320 cctacatctgtattaacgaagcgctggcattgaccctgagtgatttttctctggtcccgc4380 cgcatccataccgccagttgtttaccctcacaacgttccagtaaccgggcatgttcatca4440 tcagtaacccgtatcgtgagcatcctctctcgtttcatcggtatcattacccccatgaac4500 agaaattcccccttacacggaggcatcaagtgaccaaacaggaaaaaaccgcccttaaca4560 tggcccgctttatcagaagccagacattaacgcttctggagaaactcaacgagctggacg4620 cggatgaacaggcagacatctgtgaatcgcttcacgaccacgctgatgagctttaccgca4680 gctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggaga4740 cggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcag4800 cgggtgttggcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcggagtgt4860 atactggcttaactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtg4920 tgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctc4980 gctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaa5040 ggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaa5100 aggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggct5160 ccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgac5220 aggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttcc5280 gaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttc5340 tcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctg5400 tgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttga5460 
gtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattag5520 cagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggcta5580 cactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaag5640 agttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttg5700 caagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctac5760 ggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatc5820 aaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaag5880 tatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctc5940 agcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactac6000 gatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctc6060 accggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtgg6120 tcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaag6180 tagttcgccagttaatagtttgcgcaacgttgttgccattgctgcaggcatcgtggtgtc6240 acgctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttac6300 atgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcag6360 aagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttac6420 tgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctg6480 agaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgggataataccgc6540 gccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaact6600 ctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactg6660 atcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaa6720 tgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttt6780 tcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatg6840 tatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctga6900 cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta tcacgaggcc 6960 ctttcgtctt caa 6973 <210>

<211>

<212>
DNA

<213>
vector pFPMT-CHH-E1H6 <400>

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccttaatggtgatggtg360 gtggtgccagttcatcatcatatcccaagccatacggtgacctgttatgtggccgggata420 gattgagcaattgcagtcctgcaccgtctcatgccggcgaggcgagatggtgaacagctg480 ggagacgaggaagacagatccgcagagatcccccacgtacatagcggaacagaaagcagc540 cgccccaaccagcaaatcgacgtggcgtcgtattgtcgtagtggggacgctggcgttcct600 agctgcgagcgtgggggtgagcgctacccagcagcgggaagagt'tgttctcccgaacgca660 gggcacgcacccgggggtgtgcatgatcatgtccgctgcctcatacacaatgcttgagtt720 ggagcagtcgttcgtgacatggtacatcccggacacgttgcgcacctcatacctcttttc780 caacgggtgtgtagttccattctccaccgctagggctgcgctgggctccattggcgaggt840 tttcaaggccgctaggatgcgatccatgcgtccgtagccttgcgtggagcgtgcgtgtgc900 gtgcgggagtgcgcataggtaggctacggtgatgattgctagcatggcgggaatagtttt960 gctatacatgaattcccgatgaagcagagagcgcaggaggcggtatttatagtgccattc1020 ccctctctgagagacccggatggtagtcgagtgtatcggagacagcttgatgtagactcc1080 gtgcctgccggctcctcttattggcggacaccagtgagacaccccggaacttgctgtttt.1140 tctgcaaaatccggggtgaccagtgggagcctatttgcacacacgagcgggacaccccac1200 tctggtgaagagtgccaaagtcattctttttcccgttgcggggcagccgattgcatgttt1260 taggaaaatattacctttgctacaccctgtcagatttaccctccacacatatatattccg1320 tcacctccagggactattattcgtcgttgcgccgccagcggaagatatccagaagctgtt1380 ttccgagagactcggttggcgcctggtatatttgatggatgtcgcgctgcctcacgtccc1440 ggtacccaggaacgcggtgggatctcgggcccatcgaagactgtgctccagactgctcgc1500 ccagcaggtgtttcttgatcgccgcctctaaattgtccgcgcatcgccggtaacattttt1560 ccagctcggagtttgcgtttagatacagtttctgcgatgccaaaggagcctgcagattat1620 aacctcggatgctgtcattcagcgcttttaatttgacctccagatagttgctgtatttct1680 gttcccattggctgctgcgcagcttcgtataactcgagttattgttgcgctctgcctcgg1740 cgtactggctcatgatctggatcttgtccgtgtcgcttttcttcgagtgtttctcgcaaa1800 cgatgtgcacggcctgcagtgtccaatcggagtcgagctggcgccgaaactggcggatct1860 
gagcctccacactgccctgtttctctatccacggcggaaccgcctcctgccgtttcagaa1920 tgttgttcaagtggtactctgtgcggtcaatgaaggcgttattgccggtgaaatctttgg1980 gaagcggttttcctcggggaagattacgaaattccccgcgtcgttgcgcttcctggatct2040 cgaggagatcgttctccgcgtcgaggagatcgttctccgcgtcgacaccattccttgcgg2100 cggcggtgctcaacggcctcaacctactactgggctgcttcctaatgcaggagtcgcata2160 agggagagcgtcgacaaacccgcgtttgagaacttgctcaagcttctggtaaacgttgta2220 gtactctgaaacaaggccctagcactctgatctgtttctcttgggtagcggtgagtggtt2280 tattggagttcactggtttcagcacatctgtcatctagacaatattgttactaaattttt2340 ttgaactacaattgttcgtaattcatctattattatacatcctcgtcagcaatttctggc2400 agacggagtttactaacgtcttgagtatgaggccgagaatccagctctgtggccatactc2460 agtcttgacagcctgctgatgtggctgcgttcaacgcaataagcgtgtcctccgactccg2520 agttgtgctcgttatcgtcgttctcatcctcggaaaaatcacacgaaagaacatactcac2580 cagtaggctttctggtccctggggcacggctgtttctgacgtattccggcgttgataata2640 gctcgaaagtgaacgccgagtcgcgggagtcgaccgatgcccttgagagccttcaaccca2700 gtcagctccttccggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttc2760 tttatcatgcaactcgtaggacaggtgccggcagcgctctgggtcattttcggcgaggac2820 cgctttcgctggagcgcgacgatgatcggcctgtcgcttgcggtattcggaatcttgcac2880 gccctcgctcaagccttcgtcactggtcccgccaccaaacgtttcggcgagaagcaggcc2940 attatcgccggcatggcggccgacgcgctgggctacgtcttgctggcgttcgcgacgcga3000 ggctggatggccttccccattatgattcttctcgcttccggcggcatcgggatgcccgcg3060 ttgcaggccatgctgtccaggcaggtagatgacgaccatcagggacagcttcaaggatcg3120 ctcgcggctcttaccagcctaacttcgatcactggaccgctgatcgtcacggcgatttat3180 gccgcctcggcgagcacatggaacgggttggcatggattgtaggcgccgccctatacctt3240 gtctgcctccccgcgttgcgtcgcggtgcatggagccgggccacctcgacctgaatggaa3300 gccggcggcacctcgctaacggattcaccactccaagaattggagccaatcaattcttgc3360 ggagaactgtgaatgcgcaaaccaacccttggcagaacatatccatcgcgtccgccatct3420 ccagcagccgcacgcggcgcatcggggggggggggggggggggggggcaaacaattcatc3480 attttttttttattcttttttttgatttcggtttctttgaaatttttttgattcggtaat3540 ctccgaacagaaggaagaacgaaggaaggagcacagacttagattggtatatatacgcat3600 atgtagtgttgaagaaacatgaaattgcccagtattcttaacccaactgcacagaacaaa3660 
aacctgcaggaaacgaagataaatcatgtcgaaagctacatataaggaacgtgctgctac3720 tcatcctagtcctgttgctgccaagctatttaatatcatgcacgaaaagcaaacaaactt3780 gtgtgcttcattggatgttcgtaccaccaaggaattactggagttagttgaagcattagg3840 tcccaaaatttgtttactaaaaacacatgtggatatcttgactgatttttccatggaggg3900 cacagttaagccgctaaaggcattatccgccaagtacaattttttactcttcgaagacag3960 aaaatttgctgacattggtaatacagtcaaattgcagtactctgcgggtgtatacagaat4020 agcagaatgggcagacattacgaatgcacacggtgtggtgggcccaggtattgttagcgg4080 tttgaagcaggcggcagaagaagtaacaaaggaacctagaggccttttgatgttagcaga4140 attgtcatgcaagggctccctatctactggagaatatactaagggtactgttgacattgc4200 gaagagcgacaaagattttgttatcggctttattgctcaaagagacatgggtggaagaga4260 tgaaggttacgattggttgattatgacacccggtgtgggtttagatgacaagggagacgc4320 attgggtcaacagtatagaaccgtggatgatgtggtctctacaggatctgacattattat4380 tgttggaagaggactatttgcaaagggaagggatgctaaggtagagggtgaacgttacag4440 aaaagcaggctgggaagcatatttgagaagatgcggccagcaaaactaaaaaactgtatt4500 ataagtaaatgcatgtatactaaactcacaaattagagcttcaatttaattatatcagtt4560 attacccgggaatctcggtcgtaatgatttttataatgacgaaaaaaaaaaaattggaaa4620 gaaaagccccccccccccccccccccccccccccccccccgcagcgttgggtcctggcca4680 cgggtgcgcatgatcgtgctcctgtcgttgaggacccggctaggctggcggggttgcctt4740 actggttagcagaatgaatcaccgatacgcgagcgaacgtgaagcgactgctgctgcaaa4800 acgtctgcgacctgagcaacaacatgaatggtcttcggtttccgtgtttcgtaaagtctg4860 gaaacgcggaagtcagcgccctgcaccattatgttccggatctgcatcgcaggatgctgc4920 tggctaccctgtggaacacctacatctgtattaacgaagcgctggcattgaccctgagtg4980 atttttctctggtcccgccgcatccataccgccagttgtttaccctcacaacgttccagt5040 aaccgggcatgttcatcatcagtaacccgtatcgtgagcatcctctctcgtttcatcggt5100 atcattacccccatgaacagaaattcccccttacacggaggcatcaagtgaccaaacagg5160 aaaaaaccgcccttaacatggcccgctttatcagaagccagacattaacgcttctggaga5220 aactcaacgagctggacgcggatgaacaggcagacatctgtgaatcgcttcacgaccacg5280 ctgatgagctttaccgcagctgcctcgcgcgtttcggtgatgacggtgaaaacctctgac5340 acatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaag5400 cccgtcagggcgcgtcagcgggtgttggcgggtgtcggggcgcagccatgacccagtcac5460 
gtagcgatagcggagtgtatactggcttaaetatgcggcatcagagcagattgtactgag5520 agtgcaccatatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcag5580 gcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagc5640 ggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcagg5700 aaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgct5760 ggcgtttttccataggctcc~gcccccctgacgagcatcacaaaaatcgacgctcaagtca5820 gaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccct5880 cgtgcgctctCCtgttCCgaCCCtgCCJCttaCCggataCCtgtCCgCCtttCt.CCCttC5940 gggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgt6000 tcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatc6060 cggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagc6120 cactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtg6180 gtggcctaactacggctacactagaaggacagtatttggtatctgcgctctgctgaagcc6240 agttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtag6300 cggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaaga6360 tcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggat6420 tttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaag6480 ttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaat6540 cagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccc6600 cgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgat6660 accgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaag6720 ggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttg6780 ccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgc6840 tgcaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttccca6900 acgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcgg6960 tcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagc7020 actgcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagta7080 ctcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtc7140 aacacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacg7200 ttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacc72_60 
cactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagc7320 aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat 7380 actcatactc ttcctttttc aatattattg aa,gcatttat cagggttatt gtctcatgag 7440 cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc gcacatttcc 7500 ccgaaaagtg ccacctgacg tctaagaaac cattattatc atgacattaa cctataaaaa 7560 taggcgtatc acgaggccct ttcgtcttca a 7591 <210> 14 <211> 50 <212> DNA
<213> synthetic probe or primer <400> 14 aggggtaagc ttggataaaa ggtatgaggt gcgcaacgtg tccgggatgt 50 <210> 15 <211> 42 <212> DNA
<213> synthetic probe or primer <400> 15 agttacggat ccttaatggt gatggtggtg gtgccagttc at 42 <210>

<211>

<212>
DNA

<213>
vector pFPMT-Mfalfa-E1-H6 <400>

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccttaatggtgatggtg360 gtggtgccagttcatcatcatatcccaagccatacggtgacctgttatgtggccgggata420 gattgagcaattgcagtcctgcaccgtctcatgccggcgaggcgagatggtgaacagctg480 ggagacgaggaagacagatccgcagagatcccccacgtacatagcggaacagaaagcagc540 cgccccaacgagcaaatcgacgtggcgtcgtattgtcgtagtggggacgctggcgttcct600 agctgcgagcgtgggggtgagcgctacccagcagcgggaagagttgttctcccgaacgca660 gggcacgcacccgggggtgtgcatgatcatgtccgctgcctcatacacaatgcttgagtt720 ggagcagtcgttcgtgacatggtacatcccggacacgttgcgcacctcataccttttatc780 caagcttaccccttcttctttagcagcaatgctggcaatagtagtatttataaacaataa840 cccgttatttgtgctgttggaaaatggcaaaacagcaacatcgaaatccccttctaaatc900 tgagtaaccgatgacagcttcagccggaatttgtgccgtttcatcttctgttgtagtgtt960 gactggagcagctaatgcggaggatgctgcgaataaaactgcagtaaaaattgaaggaaa1020 tctcatgaattcccgatgaagcagagagcgcaggaggcggtatttatagtgccattcccc1080 tctctgagagacccggatggtagtcgagtgtatcggagacagcttgatgtagactccgtg1140 cctgccggctcctcttattggcggacaccagtgagacaccccggaacttgctgtttttct1200 gcaaaatccggggtgaccagtgggagcctatttgcacacacgagcgggacaccccactct1260 ggtgaagagtgccaaagtcattctttttcccgttgcggggcagccgattgcatgttttag1320 gaaaatattacctttgctacaccctgtcagatttaccctccacacatatatattccgtca1380 cctccagggactattattcgtcgttgcgccgccagcggaagatatccagaagctgttttc1440 cgagagactcggttggcgcctggtatatttgatggatgtcgcgctgcctcacgtcccggt1500 acccaggaacgcggtgggatctcgggcccatcgaagactgtgctccagactgctcgccca1560 gcaggtgtttcttgatcgccgcctctaaattgtccgcgcatcgccggtaacatttttcca1620 gctcggagtttgcgtttagatacagtttctgcgatgccaaaggagcctgcagattataac1680 ctcggatgctgtcattcagcgcttttaatttgacctccagatagttgctgtatttctgtt1740 cccattggctgctgcgcagcttcgtataactcgagttattgttgcgctctgcctcggcgt1800 actggctcatgatctggatcttgtccgtgtcgcttttcttcgagtgtttctcgcaaacga1860 
tgtgcacggcctgcagtgtccaatcggagtcgagctggcgccgaaactggcggatctgag1920 cctccacactgccctgtttctctatccacggcggaaccgcctcctgccgtttcagaatgt1980 tgttcaagtggtactctgtgcggtcaatgaaggcgttattgccggtgaaatctttgggaa2040 gcggttttcctcggggaagattacgaaattccccgcgtcgttgcgcttcctggatctcga2100 ggagatcgttctccgcgtcgaggagatcgttctccgcgtcgacaccattccttgcggcgg2160 cggtgctcaacggcctcaacctactactgggctgcttcctaatgcaggagtcgcataagg2220 gagagegtcgacaaacccgcgtttgagaacttgctcaagcttctggtaaacgttgtagta2280 ctctgaaacaaggccctagcactctgatctgtttctcttgggtagcggtgagtggtttat2340 tggagttcactggtttcagcacatctgtcatctagacaatattgttactaaatttttttg2400 aactacaattgttcgtaattcatctattattatacatcctcgtcagcaatttctggcaga2460 cggagtttactaacgtcttgagtatgaggccgagaatccagctctgtggccatactcagt2520 cttgacagcctgctgatgtggctgcgttcaacgcaataagcgtgtcctccgactccgagt2580 tgtgctcgttatcgtcgttctcatcctcggaaaaatcacacgaaagaacatactcaccag2640 taggctttctggtccctggggcacggctgtttctgacgtattccggcgttgataatagct2700 cgaaagtgaacgccgagtcgcgggagtcgaccgatgcccttgagagccttcaacccagtc2760 agctccttccggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttcttt2820 atcatgcaactcgtaggacaggtgccggcagcgctctgggtcattttcggcgaggaccgc2880 tttcgctggagcgcgacgatgatcggcctgtcgcttgcggtattcggaatCttgCdCgCC2940 ctcgctcaagccttcgtcactggtcccgccaccaaacgtttcggcgagaagcaggccatt3000 atcgccggcatggcggccgacgcgctgggctacgtcttgctggcgttcgcgacgcgaggc3060 tggatggccttccccattatgattcttctcgcttccggcggcatcgggatgcccgcgttg3120 caggccatgctgtccaggcaggtagatgacgaccatcagggacagcttcaaggatcgctc3180 gcggctcttaccagcctaacttcgatcactggaccgctgatcgtcacggcgatttatgcc3240 gcctcggcgagcacatggaacgggttggcatggattgtaggcgccgccctataccttgtc3300 tgcctccccgcgttgcgtcgcggtgcatggagccgggccacctcgacctgaatggaagcc3360 ggcggcacctcgctaacggattcaccactccaagaattggagccaatcaattcttgcgga3420 gaactgtgaatgcgcaaaccaacccttggcagaacatatccatcgcgtccgccatctcca3480 gcagccgcacgcggcgcatcggggggggggggggggggggggggcaaacaattcatcatt3540 ttttttttattcttttttttgatttcggtttctttgaaatttttttgattcggtaatctc3600 cgaacagaaggaagaacgaaggaaggagcacagacttagattggtatatatacgcatatg3660 
tagtgttgaagaaacatgaaattgcccagtattcttaacccaactgcacagaacaaaaac3720 ctgcaggaaacgaagataaatcatgtcgaaagctacatataaggaacgtgctgctactca3780 tcctagtcctgttgctgccaagctatttaatatcatgcacgaaaagcaaacaaacttgtg3840 tgcttcattggatgttcgtaccaccaaggaattactggagttagttgaagcattaggtcc3900 caaaatttgtttactaaaaacacatgtggatatcttgactgatttttccatggagggcac3960 agttaagccgctaaaggcattatccgccaagtacaattttttactcttcgaagacagaaa4020 atttgctgacattggtaatacagtcaaattgcagtactctgcgggtgtatacagaatagc4080 agaatgggcagacattacgaatgcacacggtgtggtgggcccaggtattgttagcggttt4140 gaagcaggcggcagaagaagtaacaaaggaacctagaggccttttgatgttagcagaatt4200 gtcatgcaagggctccctatctactggagaatatactaagggtactgttgacattgcgaa4260 gagcgacaaagattttgttatcggctttattgctcaaagagacatgggtggaagagatga4320 aggttacgattggttgattatgacacccggtgtgggtttagatgacaagggagacgcatt4380 gggtcaacagtatagaaccgtggatgatgtggtctctacaggatctgacattattattgt4440 tggaagaggactatttgcaaagggaagggatgctaaggtagagggtgaacgttacagaaa4500 agcaggctgggaagcatatttgagaagatgcggccagcaaaactaaaaaactgtattata4560 agtaaatgcatgtatactaaactcacaaattagagcttcaatttaattatatcagttatt4620 acccgggaatctcggtcgtaatgatttttataatgacgaaaaaaaaaaaattggaaagaa4680 aagCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCgCagcgttgggtcctggccacgg4740 gtgcgcatgatcgtgctcctgtcgttgaggacccggctaggctggcggggttgccttact4800 ggttagcagaatgaatcaccgatacgcgagcgaacgtgaagcgactgctgctgcaaaacg4860 tctgcgacctgagcaacaacatgaatggtcttcggtttccgtgtttcgtaaagtctggaa4920 acgcggaagtcagcgccctgcaccattatgttccggatctgcatcgcaggatgctgctgg4980 ctaccctgtggaacacctacatctgtattaacgaagcgctggcattgaccctgagtgatt5040 tttctctggtcccgccgcatccataccgccagttgtttaccctcacaacgttccagtaac5100 cgggcatgttcatcatcagtaacccgtatcgtgagcatcctctctcgtttcatcggtatc5160 attacccccatgaacagaaattcccccttacacggaggcatcaagtgaccaaacaggaaa5220 aaaccgcccttaacatggcccgctttatcagaagccagacattaacgcttctggagaaac5280 tcaacgagctggacgcggatgaacaggcagacatctgtgaatcgcttcacgaccacgctg5340 atgagctttaccgcagctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacaca5400 tgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagccc5460 
gtcagggcgcgtcagcgggtgttggcgggtgtcggggcgcagccatgacccagtcacgta5520 gcgatagcggagtgtatactggcttaactatgcggcatcagagcagattgtactgagagt5580 gcaccatatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcg5640 ctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggt5700 atcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaa5760 gaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggc5820 gtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagag5880 gtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgt5940 gCgCtCtCCtgttCCgaCCCtgccgcttaccggatacctgtccgcctttcteccttcggg6000 aagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcg6060 ctccaagctgggctgtgtgcacgaattccccgttcagcccgaccgctgcgccttatccgg6120 taactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccac6180 tggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtg6240 gcctaactacggctacactagaaggacagtatttggtatctgcgctctgctgaagccagt6300 taccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcgg6360 tggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcc6420 tttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggatttt6480 ggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttt6540 taaatcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcag6600 tgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgt6660 cgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgatacc6720 gcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggc6780 cgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccg6840 ggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctgc6900 aggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacg6960 atcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcc7020 tccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcact7080 gcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactc7140 aaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaac7200 acgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttc7260 
ttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccac7320 tcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaa7380 aacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatact7440 catactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcgg7500 atacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccg7560 aaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaatag7620 gcgtatcacgaggccctttcgtcttcaa 7648 <210> 17 <211> 4453 <212> DNA
<213> vector pUC18-FMD-MFalfa-E1-H6 <220>
<221> misc_feature <222> (1207)..(1208) <223> N is any nucleotide <220>
<221> misc_feature <222> (1386)..(1387) <223> N is any nucleotide <400>

gcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggca60 cgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagct120 cactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaat180 tgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaattcgagct240 cggtacccggggatccttaatggtgatggtggtggtgccagttcatcatcatatcccaag300 ccatacggtgacctgttatgtggccgggatagattgagcaattgcagtcctgcaccgtct360 catgccggcg aggcgagatg gtgaacagct gggagacgag gaagacagat ccgcagagat 420 cccccacgta catagcggaa cagaaagcag ccgccccaac gagcaaatcg acgtggcgtc 480 gtattgtcgt agtggggacg ctggcgttcc tagctgcgag cgtgggggtg agcgctaccc 540 agcagcgggaagagttgttctcccgaacgcagggcacgcacccgggggtgtgcatgatca600 tgtccgctgcctcatacacaatgcttgagttggagcagtcgttcgtgacatggtacatcc660 cggacacgttgcgcacctcataccttttatccaagcttaccccttcttctttagcagcaa720 tgctggcaatagtagtatttataaacaataacccgttatttgtgctgttggaaaatggca780 aaacagcaacatcgaaatccccttctaaatctgagtaaccgatgacagcttcagccggaa840 tttgtgccgtttcatcttctgttgtagtgttgactggagcagctaatgcggaggatgctg900 cgaataaaactgcagtaaaaattgaaggaaatctcatgaattcccgatgaaggcagagag960 cgcaaggaggcggtatttatagtgccattcccctctctgagagacccggatggtagtcga1020 gtgttatcggagacagcttgatgtagactccgtgcctgccggtcctcttattggcggaca1080 ccagtgagacaccccggaacttgctgtttttctgcaaaatccggggtgaccagtgggagc1140 ctatttgcacacacgagcgggacaccccactctggtgaagagtgccaaagtcattctttt1200 tcccgtnncggggcagccgattgcatgttttaggaaaatattacctttgctacaccctgt1260 cagatttaccctccacacatatatattccgtcacctccagggactattcttggctcgttg1320 cgccgccgcggaagatatccagaagctgtgttttccgagagactcggttggcgcctggta1380 tatttnnaggatgtcgcgctgcctcacgtcccggtacccaggaacgcggtgggatctcgg1440 gcccatcgaagactgtgctccagactgctcgcccagcaggtgtttcttgattgccgcctc1500 taaatagtccgcgcatcgccggtaacatttttccagctcggagtttgcgtttagatacat1560 ttctgcgatgccaaaggagcctgcagattataacctcggatgctgtcattcagcgctttt1620 aatttgacctccagatagttgctgtatttctgttccattggctgctggacgttcgtataa1680 ctcgagttattgttgcgctctgcctcggcgtactggctcatgactgactgcggtcgcttc1740 tcgagtgttc,tcgcaacaggacgcctgcaggtcatcgagtcgagctggcgccgaaactgg1800 
cggatctgacctccacactgCCCtgtatCtctatccaccgggaaccgcctCCtgCCgttC1860 cagaatgttgttcaagtggtagctctgtgcggtcaatgaaggcgttattgccggtgaaat1920 ctttgggaagcggtttatcctcggggaagattacgaaattcccgcgcgtcgttgcgcttc1980 ctggatctcgaggaagatcgttctccgcgtcgaggagatcgttctccgcgtcgacctgca2040 ggcatgcaagcttggcactggccgtcgttttacaacgtcgtgactgggaaaaccctggcg2100 ttacccaacttaatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaag2160 aggcccgcaccgatcgcccttcccaacagttgcgcagcctgaatggcgaatggcgcctga2220 tgcggtattttctccttacgcatctgtgcggtatttcacaccgcatatggtgcactctca2280 gtacaatctgctctgatgccgcatagttaagccagccccgacacccgccaacacccgctg2340 acgcgccctgacgggcttgtctgctcccggcatccgcttacagacaagctgtgaccgtct2400 ccgggagctgcatgtgtcagaggttttcaccgtcatcaccgaaacgcgcgagacgaaagg2460 gcctcgtgatacgcctatttttataggttaatgtcatgataataatggtttcttagacgt2520 caggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatac2580 attcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaa2640 aaaggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcat2700 tttgccttcctgtttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatc2760 agttgggtgcacgagtgggttacatcgaactggatctcaacagcggtaagatccttgaga2820 gttttcgccccgaagaacgttttccaatgatgagcacttttaaagttctgctatgtggcg2880 cggtattatcccgtattgacgccgggcaagagcaactcggtcgccgcatacactattctc2940 agaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacag3000 taagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttc3060 tgacaacgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatg3120 taactcgccttgatcgttgggaaccggagctgaatgaagccataccaaacgacgagcgtg3180 acaccacgatgcctgtagcaatggcaacaacgttgcgcaaactattaactggcgaactac3240 ttactctagcttcccggcaacaattaatagactggatggaggcggataaagttgcaggac3300 cacttctgcgctcggcccttccggctggctggtttattgctgataaatctggagccggtg3360 agcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcg3420 tagttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctg3480 agataggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatac3540 tttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttg3600 
ataatctcatgaccaaaatcccttaacgtgagttttcgttccactgagcgtcagaccccg3660 tagaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgc3720 aaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactc3780 tttttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgt3840 agccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgc3900 taatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggact3960 caagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacac4020 agcccagcttggagcpaacgacctacaccgaactgagatacctacagcgtgagctatgag4080 aaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcg4140 gaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctg4200 tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga 4260 gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 4320 ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct 4380 ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg 4440 aggaagcgga aga 4453 <210> 18 <211> 51 <212> DNA
<213> synthetic probe or primer <400> 18 tgcttcctac cactagcagc actaggatat gaggtgcgca acgtgtccgg g 51 <210> 19 <211> 52 <212> DNA
<213> synthetic probe or primer <400> 19 tagtactagt attagtaggc ttcgcatgaa ttcccgatga aggcagagag cg 52 <210> 20 <211> 4252 <212> DNA
<213> vector pUC18-FMD-CL-E1-H6 <220>
<221> misc_feature <222> (1006)..(1007) <223> N is any nucleotide <220>
<221> misc_feature c222> (1185)..(1186) c223> N is any nucleotide c400> 20 gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60 cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120 cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180 tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg aattcgagct 240 cggtacccgg ggatccttaa tggtgatggt ggtggtgcca gttcatcatc atatcccaag 300 ccatacggtg acctgttatg tggccgggat agattgagca attgcagtcc tgcaccgtct 360 catgccggcg aggcgagatg gtgaacagct gggagacgag gaagacagat ccgcagagat 420 cccccacgtacatagcggaacagaaagcagccgccccaacgagcaaatcgacgtggcgtc480 gtattgtcgtagtggggacgctggcgttcctagctgcgagcgtgggggtgagcgctaccc540 agcagcgggaagagttgttctcccgaacgcagggcacgcacccgggggtgtgcatgatca600 tgtccgctgcctcatacacaatgcttgagttggagcagtcgttcgtgacatggtacatcc660 cggacacgttgcgcacctcatatcctagtgctgctagtggtaggaagcatagtactagta720 ttagtaggcttcgcatgaattcccgatgaaggcagagagcgcaaggaggcggtatttata780 gtgccattcccctctctgagagacccggatggtagtcgagtgttatcggagacagcttga840 tgtagactccgtgcctgccggtcctcttattggcggacaccagtgagacaccccggaact900 tgctgtttttctgcaaaatccggggtgaccagtgggagcctatttgcacacacgagcggg960 acaccccactctggtgaagagtgccaaagtcattctttttcccgtnncggggcagccgat1020 tgcatgttttaggaaaatattacctttgctacaccctgtcagatttaccctccacacata1080 tatattccgtcacctccagggactattcttggctcgttgcgccgccgcggaagatatcca1140 gaagctgtgttttccgagagactcggttggcgcctggtatatttnnaggatgtcgcgctg1200 cctcacgtcccggtacccaggaacgcggtgggatctcgggcccatcgaagactgtgctcc1260 agactgctcgcccagcaggtgtttcttgattgccgcctctaaatagtccgcgcatcgccg1320 gtaacatttttccagctcggagtttgcgtttagatacatttctgcgatgccaaaggagcc1380 tgcagattataacctcggatgctgtcattcagcgcttttaatttgacctccagatagttg1440 ctgtatttctgttccattggctgctggacgttcgtataactcgagttattgttgcgctct1500 gcctcggcgtactggctcatgactgactgcggtcgcttctcgagtgttctcgcaacagga1560 cgcctgcaggtcatcgagtcgagctggcgccgaaactggcggatctgacctccacactgc1620 cctgtatctctatccaccgggaaccgcctcctgccgttccagaatgttgttcaagtggta1680 gctctgtgcggtcaatgaaggcgttattgccggtgaaatctttgggaagcggtttatcct1740 
cggggaagattacgaaattcccgcgcgtcgttgcgcttcctggatctcgaggaagatcgt1800 tctccgcgtcgaggagatcgttctccgcgtcgacctgcaggcatgcaagcttggcactgg1860 ccgtcgttttacaacgtcgtgactgggaaaaccctggcgttacccaacttaatcgccttg1920 cagcacatccccctttcgccagctggcgtaatagcgaagaggcccgcaccgatcgccctt1980 cccaacagttgcgcagcctgaatggcgaatggcgcctgatgcggtattttctccttacgc2040 atctgtgcggtatttcacaccgcatatggtgcactctcagtacaatctgctctgatgccg2100 catagttaagccagccccgacacccgccaacacccgctgacgcgccctgacgggcttgtc2160 tgctcccggcatccgcttacagacaagctgtgaccgtctccgggagctgcatgtgtcaga2220 ggttttcaccgtcatcaccgaaacgcgcgagacgaaagggcctcgtgatacgcctatttt2280 tataggttaatgtcatgataataatggtttcttagacgtcaggtggcacttttcggggaa2340 atgtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgtatccgctca2400 tgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagtattc2460 aacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttgctc2520 acccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggtt2580 acatcgaactggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgtt2640 ttccaatgatgagcacttttaaagttctgctatgtggcgcggtattatcccgtattgacg2700 ccgggcaagagcaactcggtcgccgcatacactattctcagaatgacttggttgagtact2760 caccagtcacagaaaagcatcttacggatggcatgacagtaagagaattatgcagtgctg2820 ccataaccatgagtgataacactgcggccaacttacttctgacaacgatcggaggaccga2880 aggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttggg2940 aaccggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaa3000 tggcaacaacgttgcgcaaactattaactggcgaactacttactctagcttcccggcaac3060 aattaatagactggatggaggcggataaagttgcaggaccacttctgcgctcggcccttc3120 cggctggctggtttattgctgataaatctggagccggtgagcgtgggtctcgcggtatca3180 ttgcagcactggggccagatggtaagccctcccgtatcgtagttatctacacgacgggga3240 gtcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcctcactgatta3300 agcattggtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttc3360 atttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcc3420 cttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatctt3480 cttgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctac3540 
cagcggtggtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggct3600 tcagcagagcgcagataccaaatactgtccttctagtgtagccgtagttaggccaccact3660 tcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctg3720 ctgccagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggata3780 aggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttggagcgaacga3840 cctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaag3900 ggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgaggg3960 agcttccagggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgac4020 ttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagca4080 acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg 4140 cgttatcccc tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc 4200 gccgcagccg aacgaccgag cgcagcgagt'cagtgagcga ggaagcggaa ga 4252 <210>

21 <211> 7447 <212> DNA
<213> vector pFPMT-CL-E1-H6 <400> 21

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccttaatggtgatggtg360 gtggtgccagttcatcatcatatcccaagccatacggtgacctgttatgtggccgggata420 gattgagcaattgcagtcctgcaccgtctcatgccggcgaggcgagatggtgaacagctg480 ggagacgaggaagacagatccgcagagatcccccacgtacatagcggaacagaaagcagc540 cgccccaacgagcaaatcgacgtggcgtcgtattgtcgtagtggggacgctggcgttcct600 agctgcgagcgtgggggtgagcgctacccagcagcgggaagagttgttctcccgaacgca660 gggcacgcacccgggggtgtgcatgatcatgtccgctgcctcatacacaatgcttgagtt720 ggagcagtcgttcgtgacatggtacatcccggacacgttgcgcacctcatatcctagtgc780 tgctagtggtaggaagcatagtactagtattagtaggcttcgcatgaattcccgatgaag840 cagagagcgcaggaggcggtatttatagtgccattcccctctctgagagacccggatggt900 agtcgagtgtatcggagacagcttgatgtagactccgtgcctgccggctcctcttattgg960 cggacaccagtgagacaccccggaacttgctgtttttctgcaaaatccggggtgaccagt1020 gggagcctatttgcacacacgagcgggacaccccactctggtgaagagtgccaaagtcat1080 tctttttcccgttgcggggcagccgattgcatgttttaggaaaatattacctttgctaca1140 ccctgtcagatttaccctccacacatatatattccgtcacctccagggactattattcgt1200 cgttgcgccgccagcggaagatatccagaagctgttttccgagagactcggttggcgcct1260 ggtatatttgatggatgtcgcgctgcctcacgtcccggtacccaggaacgcggtgggatc1320 tcgggcccatcgaagactgtgctccagactgctcgcccagcaggtgtttcttgatcgccg1380 cctctaaattgtccgcgcatcgccggtaacatttttccagctcggagtttgcgtttagat1440 acagtttctgcgatgccaaaggagcctgcagattataacctcggatgctgtcattcagcg1500 cttttaatttgacctccagatagttgctgtatttctgttcccattggctgctgcgcagct1560 tcgtataactcgagttattgttgcgctctgcctcggcgtactggctcatgatctggatct1620 tgtccgtgtcgcttttcttcgagtgtttctcgcaaacgatgtgcacggcctgcagtgtcc2680 aatcggagtcgagctggcgccgaaactggcggatctgagcctccacactgccctgtttct1740 ctatccacggcggaaccgcctcctgccgtttcagaatgttgttcaagtggtactctgtgc1800 ggtcaatgaaggcgttattgccggtgaaatctttgggaagcggttttcctcggggaagat1860 
tacgaaattccccgcgtcgttgcgcttcctggatctcgaggagatcgttctccgcgtcga1920 ggagatcgttctccgcgtcgacaccattccttgcggcggcggtgctcaacggcctcaacc1980 tactactgggctgcttcctaatgcaggagtcgcataagggagagcgtcgacaaacccgcg2040 tttgagaacttgctcaagcttctggtaaacgttgtagtactctgaaacaaggccctagca2100 ctctgatctgtttctcttgggtagcggtgagtggtttattggagttcactggtttcagca2160 catctgtcatctagacaatattgttactaaatttttttgaactacaattgttcgtaattc2220 atctattattatacatcetcgtcagcaatttctggcagacggagtttactaacgtcttga2280 gtatgaggccgagaatccagctctgtggccatactcagtcttgacagcctgctgatgtgg2340 ctgcgttcaacgcaataagcgtgtcctccgactccgagttgtgctcgttatcgtcgttct2400 catcctcggaaaaatcacacgaaagaacatactcaccagtaggctttctggtccctgggg2460 cacggctgtttctgacgtattccggcgttgataatagctcgaaagtgaacgccgagtcgc2520 gggagtcgaccgatgcccttgagagccttcaacccagtcagctccttccggtgggcgcgg2580 ggcatgactatcgtcgccgcacttatgactgtcttctttatcatgcaactcgtaggacag2640 gtgccggcagcgctctgggtcattttcggcgaggaccgctttcgctggagcgcgacgatg2700 atcggcctgtcgcttgcggtattcggaatcttgcacgccctcgctcaagccttcgtcact2760 ggtcccgccaccaaacgtttcggcgagaagcaggccattatcgccggcatggcggccgac2820 gcgctgggctacgtcttgctggcgttcgcgacgcgaggctggatggccttccccattatg2880 attcttctcgcttccggcggcatcgggatgcccgcgttgcaggccatgctgtccaggcag2940 gtagatgacgaccatcagggacagcttcaaggatcgctcgcggctcttaccagcctaact3000 tcgatcactggaccgctgatcgtcacggcgatttatgccgcctcggcgagcacatggaac3060 gggttggcatggattgtaggcgccgccctataccttgtctgcctccccgcgttgcgtcgc3120 ggtgcatggagccgggccacctcgacctgaatggaagccggcggcacctcgctaacggat3180 tcaccactccaagaattggagccaatcaattcttgcggagaactgtgaatgcgcaaacca3240 acccttggcagaacatatccatcgcgtccgccatctccagcagccgcacgcggcgcatcg3300 gggggggggggggggggggggggcaaacaattcatcattttttttttattcttttttttg3360 atttcggtttctttgaaatttttttgattcggtaatctccgaacagaaggaagaacgaag3420 gaaggagcacagacttagattggtatatatacgcatatgtagtgttgaagaaacatgaaa3480 ttgcccagtattcttaacccaactgcacagaacaaaaacctgcaggaaacgaagataaat3540 catgtcgaaagctacatataaggaacgtgctgctactcatcctagtcctgttgctgccaa3600 gctatttaatatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtac3660 
caccaaggaattactggagttagttgaagcattaggtcccaaaatttgtttactaaaaac3720 acatgtggatatcttgactgatttttccatggagggcacagttaagccgctaaaggcatt3780 atccgccaagtacaattttttactcttcgaagacagaaaatttgctgacattggtaatac3840 agtcaaattgcagtactctgcgggtgtatacagaatagcagaatgggcagacattacgaa3900 tgcacacggtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagt3960 aacaaaggaacctagaggccttttgatgttagcagaattgtcatgcaagggctccctatc4020 tactggagaatatactaagggtactgttgacattgcgaagagcgacaaagattttgttat4080 cggctttattgctcaaagagacatgggtggaagagatgaaggttacgattggttgattat4140 gacacccggtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaaccgt4200 ggatgatgtggtctctacaggatctgacattattattgttggaagaggactatttgcaaa4260 gggaagggatgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcatattt4320 gagaagatgcggccagcaaaactaaaaaactgtattataagtaaatgcatgtatactaaa4380 ctcacaaattagagcttcaatttaattatatcagttattacccgggaatctcggtcgtaa4440 tgatttttataatgacgaaaaaaaaaaaattggaaagaaaagcccccccccccccccccc4500 ccccccccccccccccgcagcgttgggtcctggccacgggtgcgcatgatcgtgctcctg4560 tcgttgaggacccggctaggctggcggggttgccttactggttagcagaatgaatcaccg4620 atacgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacctgagcaacaaca4680 tgaatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagtcagcgccctgc4740 accattatgttccggatctgeatcgcaggatgctgctggctaccctgtggaacacctaca4800 tctgtattaacgaagcgctggcattgaccctgagtgatttttctctggtcccgccgcatc4860 cataccgccagttgtttaccctcacaacgttccagtaaccgggcatgttcatcatcagta4920 acccgtatcgtgagcatcctctctcgtttcatcggtatcattacccccatgaacagaaat4980 tcccccttacacggaggcatcaagtgaccaaacaggaaaaaaccgcccttaacatggccc5040 gctttatcagaagccagacattaacgcttctggagaaactcaacgagctggacgcggatg5100 aacaggcagacatctgtgaatcgcttcacgaccacgctgatgagctttaccgcagctgcc5160 tcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtca5220 cagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtg5280 ttggcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcggagtgtatactg5340 gcttaactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaat5400 accgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctcgctcac5460 
tgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggt5520 aatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggcca5580 gcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggC'tCCgCCC5640 ccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggact5700 ataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccct5760 gccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatag5820 ctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgca5880 cgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaa5940 cccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagc6000 gaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactag6060 aaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttgg6120 tagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagca6180 gcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtc6240 tgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaag6300 gatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatata6360 tgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgat6420 ctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacg6480 ggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggc6540 tccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgc6600 aactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttc6660 gccagttaatagtttgcgcaacgttgttgccattgctgcaggcatcgtggtgtcacgctc6720 gtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatc6780 ccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaa6840 gttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcat6900 gccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaata6960 gtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgggataataccgcgccaca7020 tagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaag7080 gatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttc7140 agcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgc7200 aaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaata7260 
ttattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtattta7320 gaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtcta7380 agaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcg7440 tcttcaa 7447 <210>

22 <211> 3730 <212> DNA
<213> vector pSP72E2H6 <400> 22

gaactcgagcagctgaagcttgaattcatgagatttccttcaatttttactgcagtttta60 ttcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatgaaacggca120 caaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgct180 gttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgcc240 agcattgctgctaaagaagaaggggtatctctagataaaaggcatacccgcgtgtcagga300 ggggcagcagcctccgataccaggggccttgtgtccctctttagccccgggtcggctcag360 aaaatccagctcgtaaacaccaacggcagttggcacatcaacaggactgccctgaactgc420 aacgactccctccaaacagggttctttgccgcactattctacaaacacaaattcaactcg480 tctggatgcccagagcgcttggccagctgtcgctccatcgacaagttcgctcaggggtgg540 ggtcccctcacttacactgagcctaacagctcggaccagaggccctactgctggcactac600 gcgcctcgaccgtgtggtattgtacccgcgtctcaggtgtgcggtccagtgtattgcttc660 accccgagccctgttgtggtggggacgaccgatcggtttggtgtccccacgtataactgg720 ggggcgaacgactcggatgtgctgattctcaacaacacgcggccgccgcgaggcaactgg780 ttcggctgtacatggatgaatggcactgggttcaccaagacgtgtgggggccccccgtgc840 aacatcgggggggccggcaacaacaccttgacctgccccactgactgttttcggaagcac900 cccgaggccacttacgccagatgcggttctgggccctggctgacacctaggtgtatggtt960 cattacccatataggctctggcactacccctgcactgtcaacttcaccatcttcaaggtt1020 aggatgtacgtggggggcgtggagcacaggttcgaagccgcatgcaattggactcgagga1080 gagcgttgtgacttggaggacagggatagatcagagcttagctcgctgctgctgtctaca1140 acagagtggcaggtgatcgagggcagacaccatcaccaccatcactaatagttaattaac1200 gatctcgacttggttgaacacgttgccaaggcttaagtgaatttactttaaagtcttgca1260 tttaaataaattttctttttatagctttatgacttagtttcaatttatatactattttaa1320 tgacattttcgattcattgattgaaagctatcagatctgccggtctccctatagtgagtc1380 gtattaatttcgataagccaggttaacctgcattaatgaatcggccaacgcgcggggaga1440 ggcggtttgcgtattgggcgCtCttCCgCttCCtCgCtCaCtgaCtCgCtgCgCtCggtC1500 gttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaa1560 tcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgt1620 aaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaa1680 aatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgttt1740 ccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctg1800 tccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctc1860 
agttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagccc1920 gaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgactta1980 tcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgct2040 acagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttggtatc2100 tgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaa2160 caaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaa2220 aaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaa2280 aactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatcctt2340 ttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgac2400 agttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatcc2460 atagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggc2520 cccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaata2580 aaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatc2640 cagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgc2700 aacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttca2760 ttcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaa2820 gcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatca2880 ctcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttt2940 tctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagt3000 tgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtg3060 ctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgaga3120 tccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcacc3180 agcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcg3240 acacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcag3300 ggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggg3360 gttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatg3420 acattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgtttcggtgat3480 gacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcg3540 gatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggc3600 tggcttaactatgcggcatcagagcagattgtactgagagtgcaccatatggacatattg3660 
tcgttagaacgcggctacaattaatacataaccttatgtatcatacacatacgatttagg3720 tgacactata 3730 <210> 23 <211> 7370 <212> DNA
<213> vector pMPT121 <220>
<221> misc_feature <222> (778) .. (778) <223> N is any nucleotide <400> 23 ggtaccctgc tcaatctccg gaatggtgat ctgatcgttc ctgaaaacct cgacattggc 60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgCtatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccagatctgaattcgtt360 tttgtactttagattgatgtcaccaccgtgcactggcagcagtatttatagatggaccgt420 gtggggacggttgggtacacttagcggcagcgctgaccccatctgtgatcaagtagggca480 aaaactggggatgtcggagtcgctgcacggtagcataagaatttactttctggccggttc540 acccgcatttgcactgtggagaaacagcctgtccgacaccccaccagttgccacatcggc600 CC'tCtgCtgCtctggtgattttctggtagcaggcacagacagcagtgggtagcgccgtcc660 ggttaggcaaggtcacgttgtaggctaccccagcaaacagagcctcacatgacaccatcc720 agctgcgtcctcgaagcgaaaagttcggttgcggctgcagaaccccctcagttgccanat780 tcacaagttttacgcgacggctaaagcgagtgggttttaaaaacttgcggtgcaaggatg840 catgcggcaacaattaattggtgcatccagcacagcaagcccagtctcgagatgtccagt900 cgctacagagtggagtacgcactcaaggaacaccgtcgagatggcctcatagaatggatc960 aagggcctgctggccacgccgttcgtcctgtacgcggtgaagagcaacggcatctctgca1020 gtggacgacctcatggtaaactctgaggcaaaacgccgctacgcggaaatcttccacgac1080 ctcgaactcctcatcgacgacaacattgaaatgaccaaagccggcacccccgaattgtct1140 cggctcgtgcagctggttccgagcgttggcagcttcttcacgagactgcctctggaaaag1200 gccttctacatcgaggacgagcgccgcgccatcagcaaacgccggcttgtggccccctcg1260 ttcaacgacgtccggctcattctcaacacggcccagctgttggagatgtcgcggttcttc1320 cattccaaaaccatccgagatcgcaagctgcagctcattacattcgatggtgacatcaca1380 ctgtacgacgacggcaaaaatttcgatgccgagtcgcccatcctgccccacctcatcaaa1440 ctaatggccaaggacctctatgtgggtatcgtcaccgcggccggctacagcgacggaaca1500 agtactacgagcgcctcaagggcctcatcgacgccgtccagacgtccccgctgctcacag1560 gccaccagaaagagaacctgttcattatgggcggcgaggcaaactacctcttccggtaca1620 gtaacgaggagcagagattacgcttctactccaaagacagatggctgctcgagaacatgc1680 tgaattggtccgaggaggacattcatctgacactggactttgcgcaggacgttctaaacg1740 
acctcgttcacaaactgggctcgccagccaccgtggtccgcaaggagcgtcgcgtcggcc1800 tggttccattaccgggccacaagctgatccgcgagcagctcgaggagatcgttctccgcg1860 tcgacaccattccttgcggcggcggtgctcaacggcctcaacctactactgggctgcttc1920 ctaatgcaggagtcgcataagggagagcgtcgactcccgcgactcggcgttcactttcga1980 gctattatcaacgccggaatacgtcagaaacagccgtgccccagggaccagaaagcctac2040 tggtgagtatgttctttcgtgtgatttttccgaggatgagaacgacgataacgagcacaa2100 ctcggagtcggaggacacgcttattgcgttgaacgcagccacatcagcaggctgtcaaga2160 ctgagtatggccacagagctggattctcggcctcatactcaagacgttagtaaactccgt2220 ctgccagaaattgctgacgaggatgtataataatagatgaattacgaacaattgtagttc2280 aaaaaaatttagtaacaatattgtctagatgacagatgtgctgaaaccagtgaactccaa2340 taaaccactcaccgctacccaagagaaacagatcagagtgctagggccttgtttcagagt2400 actacaacgtttaccagaagcttgagcaagttctcaaacgcgggtttgtcgaccgatgcc2460 cttgagagccttcaacccagtcagctccttccggtgggcgcggggcatgactatcgtcgc2520 cgcacttatgactgtcttctttatcatgcaactcgtaggacaggtgccggcagcgctctg2580 ggtcattttcggcgaggaccgctttcgctggagcgcgacgatgatcggcctgtcgcttgc2640 ggtattcggaatcttgcacgCCCtCg'CtCaagccttcgtcactggtcccgccaccaaacg2700 tttcggcgagaagcaggccattatcgccggcatggcggccgacgcgctgggctacgtctt2760 gctggcgttcgcgacgcgaggctggatggccttccccattatgattcttctcgcttccgg2820 cggcatcgggatgcccgcgttgcaggccatgctgtccaggcaggtagatgacgaccatca2880 gggacagcttcaaggatcgctcgcggctcttaccagcctaacttcgatcactggaccgct2940 gatcgtcacggcgatttatgccgcctcggcgagcacatggaacgggttggcatggattgt3000 aggcgccgccctataccttgtctgcctccccgcgttgcgtcgcggtgcatggagccgggc3060 cacctcgacctgaatggaagccggcggcacctcgctaacggattcaccactccaagaatt3120 ggagccaatcaattcttgcggagaactgtgaatgcgcaaaccaacccttggcagaacata3180 tccatcgcgtccgccatctccagcagccgcacgcggcgcatcgggggggggggggggggg3240 ggggggcaaacaattcatcattttttttttattcttttttttgatttcggtttctttgaa3300 atttttttgattcggtaatctccgaacagaaggaagaacgaaggaaggagcacagactta3360 gattggtatatatacgcatatgtagtgttgaagaaacatgaaattgcccagtattcttaa3420 cccaactgcacagaacaaaaacctgcaggaaacgaagataaatcatgtcgaaagctacat3480 ataaggaacgtgctgctactcatcctagtcctgttgctgccaagctatttaatatcatgc3540 
acgaaaagcaaacaaaettgtgtgcttcattggatgttcgtaccaccaaggaattactgg3600 agttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtggatatcttga3660 ctgatttttccatggagggcacagttaagccgctaaaggcattatccgccaagtacaatt3720 ttttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaattgcagtact3780 ctgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacggtgtggtgg3840 gcccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaaggaacctagag3900 gccttttgatgttagcagaattgtcatgcaagggctccctatctactggagaatatacta3960 agggtactgttgacattgcgaagagcgacaaagattttgttatcggctttattgctcaaa4020 gagacatgggtggaagagatgaaggttacgattggttgattatgacacccggtgtgggtt4080 tagatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatgtggtctcta4140 caggatctgacattattattgttggaagaggactatttgcaaagggaagggatgctaagg4200 tagagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagatgcggccagc4260 aaaactaaaaaactgtattataagtaaatgcatgtatactaaactcacaaattagagctt4320 caatttaattatatcagttattacccgggaatctcggtcgtaatgatttttataatgacg4380 aaaaaaaaaaaattggaaagaaaagCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCg4440 cagcgttgggtcctggccacgggtgcgcatgatcgtgctcctgtcgttgaggacccggct4500 aggctggcggggttgccttactggttagcagaatgaatcaccgatacgcgagcgaacgtg4560 aagcgactgctgctgcaaaacgtctgcgacctgagcaacaacatgaatggtcttcggttt4620 ccgtgtttcgtaaagtctggaaacgcggaagtcagcgccctgcaccattatgttccggat4680 ctgcatcgcaggatgctgctggctaccctgtggaacacctacatctgtattaacgaagcg4740 ctggcattgaccctgagtgatttttctctggtcccgccgcatccataccgccagttgttt4800 accctcacaacgttccagtaaccgggcatgttcatcatcagtaacccgtatcgtgagcat4860 cctctctcgtttcatcggtatcattacccccatgaacagaaattcccccttacacggagg4920 catcaagtgaccaaacaggaaaaaaccgcccttaacatggcccgctttatcagaagccag4980 acattaacgcttctggagaaactcaacgagctggacgcggatgaacaggcagacatctgt5040 gaatcgcttcacgaccacgctgatgagctttaccgcagctgcctcgcgcgtttcggtgat5100 gacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcg5160 gatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggc5220 gcagccatgacccagtcacgtagcgatagcggagtgtatactggcttaactatgcggcat5280 cagagcagattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaa5340 
ggagaaaataccgcatcaggcgctcttccgcttcctcgctcactgactcgctgcgctcgg5400 tcgttcggctgcggcgagcg,gtatcagctcactcaaaggcggtaatacggttatccacag5460 aatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaacc5520 gtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcaca5580 aaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgt5640 ttCCCCCtggaagctccctcgtgcgctctcCtgttCCgaCCCtgCCgCttaccggatacc5700 tgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatc5760 tcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagc5820 ccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgact5880 tatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtg5940 ctacagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttggta6000 tctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggca6060 aacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaa6120 aaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacg6180 aaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatcc6240 ttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctg6300 acagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcat6360 ccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctg6420 gccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaa6480 taaaccagcc.agccggaagggccgagcgcagaagtggtcctgcaactttatccgcctcca6540 tccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgc6600 gcaacgttgttgccattgctgcaggcatcgtggtgtcacgctcgtcgtttggtatggctt6660 cattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaa6720 aagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttat6780 cactcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgct6840 tttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccga6900 gttgctcttgcccggcgtcaacacgggataataccgcgccacatagcagaactttaaaag6960 tgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttga7020 gatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttca7080 ccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataaggg7140 
cgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatc7200 agggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaatag7260 gggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatca7320 tgacattaacctataaaaataggcgtatcacgaggccctttcgtcttcaa 7370 <210> 24 <211> 8298 <212> DNA
<213> vector pFMPT-MFalfa-E2-H6 <400> 24 ggtaccctgc tcaatctccg gaatggtgat ctgatcgttc ctgaaaacct cgacattggc 60 tccctcctga cacaggtact cgtacaggtt ccaggtaaac gagtcgtagt tgtcgatcat 120 gacaacgttc ttagaagcgg ccggcatttt gaaggtgact aatagcctaa gaaaatattt 180 aatttaattt tcattaaatt ttcctatact cgctatttca gcttttcatc tcatcacttc 240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccagatctgatagcttt360 caatcaatgaatcgaaaatgtcattaaaatagtatataaattgaaactaagtcataaagc420 tataaaaagaaaatttatttaaatgcaagactttaaagtaaattcacttaagccttggca480 acgtgttcaaccaagtcgagatcgttaattaactattagtgatggtggtgatggtgtctg540 ccctcgatcacctgccactctgttgtagacagcagcagcgagctaagctctgatctatcc600 ctgtcctccaagtcacaacgctctcctcgagtccaattgcatgcggcttcgaacctgtgc660 tccacgccccccacgtacatcctaaccttgaagatggtgaagttgacagtgcaggggtag720 tgccagagcctatatgggtaatgaaccatacacctaggtgtcagccagggcccagaaccg780 catctggcgtaagtggcctcggggtgcttccgaaaacagtcagtggggcaggtcaaggtg840 ttgttgccggcccccccgatgttgcacggggggcccccacacgtcttggtgaacccagtg900 ccattcatccatgtacagccgaaccagttgcctcgcggcggccgcgtgttgttgagaatc960 agcacatccgagtcgttcgccccccagttatacgtggggacaccaaaccgatcggtcgtc1020 cccaccacaacagggctcggggtgaagcaatacactggaccgcacacctgagacgcgggt1080 acaataccacacggtcgaggcgcgtagtgccagcagtagggcctctggtccgagctgtta1140 ggctcagtgtaagtgaggggaccccacccctgagcgaacttgtcgatggagcgacagctg1200 gccaagcgctctgggcatccagacgagttgaatttgtgtttgtagaatagtgcggcaaag1260 aaccctgtttggagggagtcgttgcagttcagggcagtcctgttgatgtgccaactgccg1320 ttggtgtttacgagctggattttctgagccgacccggggctaaagagggacacaaggccc1380 ctggtatcggaggctgctgcccctcctgacacgcgggtatgccttttatctagagatacc1440 ccttcttctttagcagcaatgctggcaatagtagtatttataaacaataacccgttattt1500 gtgctgttggaaaatggcaaaacagcaacatcgaaatccccttctaaatctgagtaaccg1560 atgacagcttcagccggaatttgtgccgtttcatcttctgttgtagtgttgactggagca1620 gctaatgcggaggatgctgcgaataaaactgcagtaaaaattgaaggaaatctcatgaat1680 tcccgatgaagcagagagcgcaggaggcggtatttatagtgccattcccctctctgagag1740 acccggatggtagtcgagtgtatcggagacagcttgatgtagactccgtgcctgccggct1800 
cctcttattggcggacaccagtgagacaccccggaacttgctgtttttctgcaaaatccg1860 gggtgaccagtgggagcctatttgcacacacgagcgggacaccccactctggtgaagagt1920 gccaaagtcattctttttcccgttgcggggcagccgattgcatgttttaggaaaatatta1980 cctttgctacaccctgtcagatttaccctccacacatatatattccgtcacctccaggga2040 ctattattcgtcgttgcgccgccagcggaagatatccagaagctgttttccgagagactc2100 ggttggcgcctggtatatttgatggatgtcgcgctgcctcacgtcccggtacccaggaac2160 gcggtgggatctcgggcccatcgaagactgtgctccagactgctcgcccagcaggtgttt2220 cttgatcgccgcctctaaattgtccgcgcatcgccggtaacatttttccagctcggagtt2280 tgcgtttagatacagtttctgcgatgccaaaggagcctgcagattataacctcggatgct2340 gtcattcagcgcttttaatttgacctccagatagttgctgtatttctgttcccattggct2400 gctgcgcagcttcgtataactcgagttattgttgcgctctgcctcggcgtactggctcat2460 gatctggatcttgtccgtgtcgcttttcttcgagtgtttctcgcaaacgatgtgcacggc2520 ctgcagtgtccaatcggagtcgagctggcgccgaaactggcggatctgagcctccacact2580 gccctgtttctctatccacggcggaaccgcctcctgccgtttcagaatgttgttcaagtg2640 gtactctgtgcggtcaatgaaggcgttattgccggtgaaatctttgggaagcggttttcc2700 tcggggaagattacgaaattccccgcgtcgttgcgcttcctggatctcgaggagatcgtt2760 ctccgcgtcgaggagatcgttctccgcgtcgacaccattccttgcggcggcggtgctcaa2820 CggCCtCaaCCtaCtaCtgggCtgCttCCtaatgcaggagtcgcataagggagagcgtcg2880 acaaacccgcgtttgagaacttgctcaagcttctggtaaacgttgtagtactctgaaaca2940 aggccctagcactctgatctgtttctcttgggtagcggtgagtggtttattggagttcac3000 tggtttcagcacatctgtcatctagacaatattgttactaaatttttttgaactacaatt3060 gttcgtaattcatctattattatacatcctcgtcagcaatttctggcagacggagtttac3120 taacgtcttgagtatgaggccgagaatccagctctgtggccatactcagtcttgacagcc3180 tgctgatgtggctgcgttcaacgcaataagcgtgtcctccgactccgagttgtgctcgtt3240 atcgtcgttctcatcctcggaaaaatcacacgaaagaacatactcaccagtaggctttct3300 ggtccctggggcacggctgtttctgacgtattccggcgttgataatagctcgaaagtgaa3360 cgccgagtcgcgggagtcgaccgatgcccttgagagccttcaacccagtcagctccttcc3420 ggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttctttatcatgcaac3480 tcgtaggacaggtgccggcagcgctctgggtcattttcggcgaggaccgctttcgctgga3540 gcgcgacgatgatcggcctgtcgcttgcggtattcggaatcttgcacgecctcgctcaag3600 
ccttcgtcactggtcccgccaccaaacgtttcggcgagaagcaggccattatcgccggca3660 tggcggccgacgcgctgggctacgtcttgctggcgttcgcgacgcgaggctggatggcct3720 tccccattatgattcttctcgcttccggcggcatcgggatgcccgcgttgcaggccatgc3780 tgtccaggcaggtagatgacgaccatcagggacagcttcaaggatcgctcgcggctctta3840 ccagcctaacttcgatcactggaccgctgatcgtcacggcgatttatgccgcctcggcga3900 gcacatggaacgggttggcatggattgtaggcgccgccctataccttgtctgcctccccg3960 cgttgcgtcgcggtgcatggagccgggccacctcgacctgaatggaagccggcggcacct4020 cgctaacggattcaccactccaagaattggagccaatcaattcttgcggagaactgtgaa4080 tgcgcaaaccaacccttggcagaacatatccatcgcgtccgccatctccagcagccgcac414D

gcggcgcatcggggggggggggggggggggggggcaaacaattcatcattttttttttat4200 tcttttttttgatttcggtttctttgaaatttttttgattcggtaatctccgaacagaag4260 gaagaacgaaggaaggagcacagacttagattggtatatatacgcatatgtagtgttgaa4320 gaaacatgaaattgcccagtattcttaacccaactgcacagaacaaaaacctgcaggaaa4380 cgaagataaatcatgtcgaaagctacatataaggaacgtgctgctactcatcctagtcct4440 gttgctgccaagctatttaatatcatgcacgaaaagcaaacaaacttgtgtgcttcattg4500 gatgttcgtaccaccaaggaattactggagttagttgaagcattaggtcccaaaatttgt4560 ttactaaaaacacatgtggatatcttgactgatttttccatggagggcacagttaagccg4620 ctaaaggcattatccgccaagtacaattttttactcttcgaagacagaaaatttgctgac4680 attggtaatacagtcaaattgcagtactctgcgggtgtatacagaatagcagaatgggca4740 gacattacgaatgcacacggtgtggtgggcccaggtattgttagcggtttgaagcaggcg4800 gcagaagaagtaacaaaggaacctagaggccttttgatgttagcagaattgtcatgcaag4860 ggctccctatctactggagaatatactaagggtactgttgacattgcgaagagcgacaaa4920 gattttgttatcggctttattgctcaaagagacatgggtggaagagatgaaggttacgat4980 tggttgattatgacacccggtgtgggtttagatgacaagggagacgcattgggtcaacag5040 tatagaaccgtggatgatgtggtctctacaggatctgacattattattgttggaagagga5100 ctatttgcaaagggaagggatgctaaggtagagggtgaacgttacagaaaagcaggctgg5160 gaagcatatttgagaagatgcggccagcaaaactaaaaaactgtattataagtaaatgca5220 tgtatactaaactcacaaattagagcttcaatttaattatatcagttattacccgggaat5280 ctcggtcgtaatgatttttataatgacgaaaaaaaaaaaattggaaagaaaagccccccc5340 CCCCCCCCCCCCCCCCCCCCCCCCCCCgCagcgttgggtcctggccacgggtgcgcatga5400 tcgtgctcctgtcgttgaggacccggctaggctggcggggttgccttactggttagcaga5460 atgaatcaccgatacgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacct5520 gagcaacaacatgaatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagt5580 cagcgccctgcaccattatgttccggatctgcatcgcaggatgctgctggctaccctgtg5640 gaacacctacatctgtattaacgaagcgctggcattgaccctgagtgatttttctctggt5700 cccgccgcatccataccgccagttgtttaccctcacaacgttccagtaaccgggcatgtt5760 catcatcagt aacccgtatc gtgagcatcc tctctcgttt catcggtatc attaccccca 5820 tgaacagaaa ttccccctta cacggaggca tcaagtgacc aaacaggaaa aaaccgccct 5880 taacatggcccgctttatcagaagccagacattaacgcttctggagaaactcaacgagct5940 
ggacgcggatgaacaggcagacatctgtgaatcgcttcacgaccacgctgatgagcttta6000 ccgcagctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctccc6060 ggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgc6120 gtcagcgggtgttggcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcgg6180 agtgtatactggcttaactatgcggcatcagagcagattgtactgagagtgcaccatatg6240 cggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgct6300 tcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcac6360 tcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtga6420 gcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccat6480 aggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaac6540 ccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcct6600 gttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcg6660 ctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctg6720 ggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgt6780 cttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacagg6840 attagcagagcgaggtatgtaggeggtgctacagagttcttgaagtggtggcctaactac6900 ggctacactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcgga6960 aaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggttttttt7020 gtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttt7080 tctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgaga7140 ttatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatc7200 taaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacct7260 atctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagata7320 actacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagaccca7380 cgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcaga7440 agtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctaga7500 gtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctgcaggcatcgtg7560 gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 7620 gttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgtt7680 gtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattct7740 
cttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtca7800 ttctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgggataat7860 accgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcga7920 aaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcaccc7980 aactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaagg8040 caaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttc8100 ctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatattt8160 gaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgcca8220 cctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacg8280 aggccctttcgtcttcaa 8298 <210> 25 <211> 8695 <212> DNA
<213> vector pMPT-Mfalfa-E2-H6 <220>
<221> misc_feature <222> (2103)..(2103) <223> N is any nucleotide <400> 25

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccagatctgatagcttt360 caatcaatgaatcgaaaatgtcattaaaatagtatataaattgaaactaagtcataaagc420 tataaaaagaaaatttatttaaatgcaagactttaaagtaaattcacttaagccttggca480 acgtgttcaaccaagtcgagatcgttaattaactattagtgatggtggtgatggtgtctg540 ccctcgatcacctgccactctgttgtagacagcagcagcgagctaagctctgatctatcc600 ctgtcctccaagtcacaacgctctcctcgagtccaattgcatgcggcttcgaacctgtgc660 tccacgccccccacgtacatcctaaccttgaagatggtgaagttgacagtgcaggggtag720 tgccagagcctatatgggtaatgaaccatacacctaggtgtcagccagggcccagaaccg780 catctggcgtaagtggcctcggggtgcttccgaaaacagtcagtggggcaggtcaaggtg840 ttgttgccggcccccccgatgttgcacggggggcccccacacgtcttggtgaacccagtg900 ccattcatccatgtacagccgaaccagttgcctcgcggcggccgcgtgttgttgagaatc960 agcacatccgagtcgttcgccccccagttatacgtggggacaccaaaccgatcggtcgtc1020 cccaccacaacagggctcggggtgaagcaatacactggaccgcacacctgagacgcgggt1080 acaataccacacggtcgaggcgcgtagtgccagcagtagggcctctggtccgagctgtta1140 ggctcagtgtaagtgaggggaccccacccctgagcgaacttgtcgatggagcgacagctg1200 gccaagcgctctgggcatccagacgagttgaatttgtgtttgtagaatagtgcggcaaag1260 aaccctgtttggagggagtcgttgcagttcagggcagtcctgttgatgtgccaactgccg1320 ttggtgtttacgagctggattttctgagccgacccggggctaaagagggacacaaggccc1380 ctggtatcggaggctgctgcccctcctgacacgcgggtatgccttttatctagagatacc1440 ccttcttctttagcagcaatgctggcaatagtagtatttataaacaataacccgttattt1500 gtgctgttggaaaatggcaaaacagcaacatcgaaatccccttctaaatctgagtaaccg1560 atgacagcttcagccggaatttgtgccgtttcatcttctgttgtagtgttgactggagca1620 gctaatgcggaggatgctgcgaataaaactgcagtaaaaattgaaggaaatctcatgaat1680 tcgtttttgtactttagattgatgtcaccaccgtgcactggcagcagtatttatagatgg1740 accgtgtggggacggttgggtacacttagcggcagcgctgaccccatctgtgatcaagta1800 gggcaaaaactggggatgtcggagtcgctgcacggtagcataagaatttactttctggcc1860 
ggttCdCCCgcatttgcactgtggagaaacagcctgtccgacaccccaccagttgccaca1920 tcggccctctgctgctctggtgattttctggtagcaggcacagacagcagtgggtagcgc1980 cgtccggttaggcaaggtcacgttgtaggctaccccagcaaacagagcctcacatgacac2040 catccagctgcgtcctcgaagcgaaaagttcggttgcggctgcagaaccccctcagttgc2100 canattcacaagttttacgcgacggctaaagcgagtgggttttaaaaacttgcggtgcaa2160 ggatgcatgcggcaacaattaattggtgcatccagcacagcaagcccagtctcgagatgt2220 ccagtcgctacagagtggagtacgcactcaaggaacaccgtcgagatggcctcatagaat2280 ggatcaagggcctgctggccacgccgttcgtcctgtacgcggtgaagagcaacggcatct2340 etgcagtggacgacctcatggtaaactctgaggcaaaacgccgctacgcggaaatcttcc2400 acgacctcgaactcctcatcgacgacaacattgaaatgaccaaagccggcacccccgaat2460 tgtctcggctcgtgcagctggttccgagcgttggcagcttcttcacgagactgcctctgg2520 aaaaggccttctacatcgaggacgagcgccgcgccatcagcaaacgccggcttgtggccc2580 cctcgttcaacgacgtccggctcattctcaacacggcccagctgttggagatgtcgcggt2640 tcttccattccaaaaccatccgagatcgcaagctgcagctcattacattcgatggtgaca2700 tcacactgtacgacgacggcaaaaatttcgatgccgagtcgcccatcctgccccacctca2760 tcaaactaatggccaaggacctctatgtgggtatcgtcaccgcggccggctacagcgacg2820 gaacaagtactacgagcgcctcaagggcctcatcgacgccgtccagacgtccccgctgct2880 cacaggccaccagaaagagaacctgttcattatgggcggcgaggcaaactacctcttccg2940 gtacagtaacgaggagcagagattacgcttctactccaaagacagatggctgctcgagaa3000 catgctgaattggtccgaggaggacattcatctgacactggactttgcgcaggacgttct3060 aaacgacctcgttcacaaactgggctcgccagccaccgtggtccgcaaggagcgtcgcgt3120 cggcctggttccattaccgggccacaagct.gatccgcgagcagctcgaggagatcgttct3180 ccgcgtcgacaccattccttgcggcggcggtgctcaacggcctcaacctactactgggct3240 gcttcctaatgcaggagtcgcataagggagagcgtcgactcccgcgactcggcgttcact3300 ttcgagctattatcaacgccggaatacgtcagaaacagccgtgccccagggaccagaaag3360 cctactggtgagtatgttctttcgtgtgatttttccgaggatgagaacgacgataacgag3420 cacaactcggagtcggaggacacgcttattgcgttgaacgcagccacatcagcaggctgt3480 caagactgagtatggccacagagctggattctcggcctcatactcaagacgttagtaaac3540 tccgtctgccagaaattgctgacgaggatgtataataatagatgaattacgaacaattgt3600 agttcaaaaaaatttagtaacaatattgtctagatgacagatgtgctgaaaccagtgaac3660 
tccaataaaccactcaccgctacccaagagaaacagatcagagtgctagggccttgtttc3720 agagtactacaacgtttaccagaagcttgagcaagttctcaaacgcgggtttgtcgaccg3780 atgcccttgagagccttcaacccagtcagctccttccggtgggcgcggggcatgactatc3840 gtcgccgcacttatgactgtcttctttatcatgcaactcgtaggacaggtgccggcagcg3900 ctctgggtcattttcggcgaggaccgctttcgctggagcgcgacgatgatcggcctgtcg3960 cttgcggtattcggaatcttgcacgccctcgctcaagccttcgtcactggtcccgccacc4020 aaacgtttcggcgagaagcaggccattatcgccggcatggcggccgacgcgctgggctac4080 gtcttgctggcgttcgcgacgcgaggctggatggecttccccattatgattcttctcgct4140 tccggcggcatcgggatgcccgcgttgcaggccatgctgtccaggcaggtagatgacgac4200 catcagggacagcttcaaggatcgctcgcggctcttaccagcctaacttcgatcactgga4260 ccgctgatcgtcacggcgatttatgccgcctcggcgagcacatggaacgggttggcatgg4320 attgtaggcgccgccctataccttgtctgcctccccgcgttgcgtcgcggtgcatggagc4380 cgggccacctcgacctgaatggaagccggcggcacctcgctaacggattcaccactccaa4440 gaattggagccaatcaattcttgcggagaactgtgaatgcgcaaaccaacccttggcaga4500 acatatccatcgcgtccgccatctccagcagccgcacgcggcgcatcggggggggggggg4560 gggggggggggcaaacaattcatcattttttttttattcttttttttgatttcggtttct4620 ttgaaatttttttgattcggtaatctccgaacagaaggaagaacgaaggaaggagcacag4680 acttagattggtatatatacgcatatgtagtgttgaagaaacatgaaattgcccagtatt4740 cttaacccaactgcacagaacaaaaacctgcaggaaacgaagataaatcatgtcgaaagc4800 tacatataaggaacgtgctgctactcatcctagtcctgttgctgccaagctatttaatat4860 catgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtaccaccaaggaatt4920 actggagttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtggatat4980 cttgactgatttttccatggagggcacagttaagccgctaaaggcattatccgccaagta5040 caattttttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaattgca5100 gtactctgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacggtgt5160 ggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaaggaacc5220 tagaggccttttgatgttagcagaattgtcatgcaagggctccctatctactggagaata5280 tactaagggtactgttgacattgcgaagagcgacaaagattttgttatcggctttattgc5340 tcaaagagacatgggtggaagagatgaaggttacgattggttgattatgacacccggtgt5400 gggtttagatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatgtggt5460 
ctctacaggatctgacattattattgttggaagaggactatttgcaaagggaagggatgc5520 taaggtagagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagatgcgg5580 ccagcaaaactaaaaaactgtattataagtaaatgcatgtatactaaactcacaaattag5640 agcttcaatttaattatatcagttattacccgggaatctcggtcgtaatgatttttataa5700 tgacgaaaaaaaaaaaattggaaagaaaagcccccccccccccccccccccccccccccc5760 ccccgcagcgttgggtcctggccacgggtgcgcatgatcgtgctcctgtcgttgaggacc5820 cggctaggctggcggggttgccttactggttagcagaatgaatcaccgatacgcgagcga5880 acgtgaagcgactgctgctgcaaaacgtctgcgacctgagcaacaacatgaatggtcttc5940 ggtttccgtgtttcgtaaagtctggaaacgcggaagtcagcgccctgcaccattatgttc6000 cggatctgcatcgcaggatgctgctggctaccctgtggaacacctacatctgtattaacg6060 aagcgctggcattgaccctgagtgatttttctctggtcccgccgcatccataccgccagt6120 tgtttaccctcacaacgttccagtaaccgggcatgttcatcatcagtaacccgtatcgtg6180 agcatcctctctcgtttcatcggtatcattacccccatgaacagaaattcccccttacac6240 ggaggcatcaagtgaccaaacaggaaaaaaccgcccttaacatggcccgctttatcagaa6300 gccagacattaacgcttctggagaaactcaacgagctggacgcggatgaacaggcagaca6360 tctgtgaatcgcttcacgaccacgctgatgagctttaccgcagctgcctcgcgcgtttcg6420 gtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgt6480 aagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtc6540 ggggcgcagccatgacccagtcacgtagcgatagcggagtgtatactggcttaactatgc6600 ggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatg6660 cgtaaggagaaaataccgcatcaggcgctcttccgcttcctcgctcactgactcgctgcg6720 ctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatc6780 cacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccag6840 gaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagca6900 tcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagatacca6960 ggcgtttccccctggaagctCCCt CgtgCgctctcctgttccgaccctgcegcttaccgg7020 atacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtag7080 gtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgt7140 tcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagaca7200 cgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtagg7260 
cggtgctacagagttcttgaagtggtggcctaactacggctacactagaaggacagtatt7320 tggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatc7380 cggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcg7440 cagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtg7500 gaacgaaaactcacgttaagggattttggt'catgagattatcaaaaaggatcttcaceta7560 gatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttg7620 gtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcg7680 ttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttacc7740 atctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatc7800 agcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgc7860 ctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatag7920 tttgcgcaacgttgttgccattgctgcaggcatcgtggtgtcacgctcgtcgtttggtat7980 ggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtg8040 caaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagt8100 gttatcactcatggttatggcagcactgcataattctcttactgtcatgc.catccgtaag8160 atgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcg8220 accgagttgctcttgcccggcgtcaacacgggataataccgcgccacatagcagaacttt8280 aaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgct8340 gttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttac8400 tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat 8460 aagggcgaca cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat 8520 ttatcagggt tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca 8580 aataggggtt ccgcgcacat ttccccgaaa agtgccacct gacgtctaag aaaccattat 8640 tatcatgaca ttaacctata aaaataggcg tatcacgagg ccctttcgtc ttcaa 8695 <210> 26 <211> 36 <212> DNA
<213> synthetic probe or primer <400> 26 agtcactctt caaggcatac ccgcgtgtca ggaggg 36 <210> 27 <211> 39 <212> DNA
<213> synthetic probe or primer <400> 27 agtcactctt cacagggatc cttagtgatg gtggtgatg 39 <210> 28 <211> 4190 <212> DNA
<213> vector pMF30 <400> 28 gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 60 cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 120 cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 180 tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagcttgc 240 atgcctgcag ttgattgcag atgccagatc ccgaaagaac agaggacgga gcgtaaactt 300 gtggcattcc accagaaatt gatacagata agcttccgga gtcaccagct aaaacggaat 360 tgcaagaaataatatcgataactttatcaccactagaatagccggtgttgctgacagtaa420 tatcctgtgacccgtttgaacctaaattattaaaaatggaaatcaattgattagcatcgc480 tacccttcctagtggctatatagtggtctgaagaagaaacaactgaggatttgtaagttg540 aataggcagaatccttcttaatagcttgatttcttatttgatttagtttactgattagct600 cgtagtattctgaatcggtattatatccacttaaccataaagcttctctattggcaggat660 cggaaccaccattgagaccttgttcttggccataataaataattgggataccatcaccca720 aaattataaaagccatgtcattcttaatcaaggatgtgtctgaggtaactgatggaaatc780 taacttggtcatggttttcaataaagtttcccaacaaagagacgtccgaacaagatgact840 gtaacgtggagatcattgaagttaactcactggaagtcgccgaagtatcactgaagaatc900 tatatactggatagtataatggatagttggtaactcctttcatataattctgatatggac960 aagtataagttggatctccttgataaacttcacctaagttataaacaccagaagcgtcct1020 caaacttcgttaatgaagcggtatctacgtgctttgcactatcaattcttaaaccatcga1080 ttgaatagttttgaacaaaatctgacacccaagtttgaaatactcctataacttcattat1140 cctcggtacttaaatctggaagggagacttcagtatcaccttcccaacaatcttcaacat1200 tggtttgatcattataatttgtaatcaaacaataatcgtggaagtaagattgttgattga1260 atggagtgaaactagaataatctacgcttgaaccatctccgttccaagcataatggttgt1320 aaacaacgtcgaccatcaataacatgcttctggaatgcaattcgctagctaattgtttca1380 attcatcagcggtaccaaaattagtgttcaattcatcaatatttttcatccaataaccat1440 ggtaagcataaccataagcagtattgtcaggaatttgctcaacaactggggagatccaga1500 tcgcagtgaaacccataccttgaatataatccaacttgtcgataatccctttataagatc1560 caccacagtacttgcgatcactcactaaacagtcagctgtggtcgagccatcagatctgg1620 caaacctatcagtaacgatttgataaatcgattggtctttccatttatcagctgacgagc1680 taacatccctcttgtcaaaaataatcggttgagcagataccaatcttgagaatgctaaaa1740 ttgctgcaacaactttacttgtaaatccttcagttgaaaatctcattgaattcactggcc1800 
gtcgttttacaacgtcgtgactgggaaaaccctggcgttacccaacttaatcgccttgca1860 gcacatccccctttcgccagctggcgtaatagcgaagaggcccgcaccgatcgcccttcc1920 caacagttgcgcagcctgaatggcgaatggcgcctgatgcggtattttctccttacgcat1980 ctgtgcggtatttcacaccgcatatggtgcactctcagtacaatctgctctgatgccgca2040 tagttaagccagccccgacacccgccaacacccgctgacgcgccctgacgggcttgtctg2100 ctcccggcatccgcttacagacaagctgtgaccgtctccgggagctgcatgtgtcagagg2160 ttttcaccgtcatcaccgaaacgcgcgagacgaaagggcctcgtgatacgcctattttta2220 taggttaatgtcatgataataatggtttcttagacgtcaggtggcacttttcggggaaat2280 gtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgtatccgctcatg2340 agacaataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagtattcaa2400 catttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttgctcac2460 ccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttac2520 atcgaactggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgtttt2580 ccaatgatgagcacttttaaagttctgctatgtggcgcggtattatcccgtattgacgcc2640 gggcaagagcaactcggtcgccgcatacactattctcagaatgacttggttgagtactca2700 ccagtcacagaaaagcatcttacggatggcatgacagtaagagaattatgcagtgctgcc2760 ataaccatgagtgataacactgcggccaacttacttctgacaacgatcggaggaccgaag2820 gagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaa2880 ccggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatg2940 gcaacaacgttgcgcaaactattaactggcgaactacttactctagcttcccggcaacaa3000 ttaatagactggatggaggcggataaagttgcaggaccacttctgcgctcggcccttccg3060 gctggctggtttattgctgataaatctggagccggtgagcgtgggtctcgcggtatcatt3120 gcagcactggggccagatggtaagccctcccgtatcgtagttatctacacgacggggagt3180 caggcaactatggatgaacgaaatagacagatcgctgagataggtgcctcactgattaag3240 cattggtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcat3300 ttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatccct3360 taacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttct3420 tgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctacca3480 gcggtggtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggcttc3540 agcagagcgcagataccaaatactgtccttctagtgtagccgtagttaggccaccacttc3600 
aagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgct3660 gccagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataag3720 gcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttggagcgaacgacc3780 tacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaaggg3840 agaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggag3900 cttccagggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgactt3960 gagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaac4020 gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt ctttcctgcg 4080 ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga taccgctcgc 4140 cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 4190 <210> 29 <211> 38 <212> DNA
<213> synthetic probe or primer <400> 29 agtcactctt cacctcttgt caaaaataat cggttgag 38 <210> 30 <211> 52 <212> DNA
<213> synthetic probe or primer <400> 30 tgcttcctac cactagcagc actaggacat acccgcgtgt caggaggggc ag 52 <210> 31 <211> 57 <212> DNA
<213> synthetic probe or primer <400> 31 tagtactagt attagtaggc ttcgcatgga attcactggc cgtcgtttta caacgtc 57 <210> 32 <211> 7927 <212> DNA
<213> vector pFMPT-CL-E2-H6 <400> 32

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccttagtgatggtggtg360 atggtgtctgccctcgatcacctgccactctgttgtagacagcagcagcgagctaagctc420 tgatctatccctgtcctccaagtcacaacgctctcctcgagtccaattgcatgcggcttc480 gaacctgtgctccacgccccccacgtacatcctaaccttgaagatggtgaagttgacagt540 gcaggggtagtgccagagcctatatgggtaatgaaccatacacctaggtgtcagccaggg600 cccagaaccgcatctggcgtaagtggcctcggggtgcttccgaaaacagtcagtggggca660 ggtcaaggtgttgttgccggcccccccgatgttgcacggggggcccccacacgtcttggt720 gaacccagtgccattcatccatgtacagccgaaccagttgcctcgcggcggccgcgtgtt780 gttgagaatcagcacatccgagtcgttcgccccccagttatacgtggggacaccaaaccg840 atcggtcgtccccaccacaacagggctcggggtgaagcaatacactggaccgcacacctg900 agacgcgggtacaataccacacggtcgaggcgcgtagtgccagcagtagggcctctggtc960 cgagctgttaggctcagtgtaagtgaggggaccccacccctgagcgaacttgtcgatgga1020 gcgacagctggccaagcgctctgggcatccagacgagttgaatttgtgtttgtagaatag1080 tgcggcaaagaaccctgtttggagggagtcgttgcagttcagggcagtcctgttgatgtg1140 ccaactgccgttggtgtttacgagctggattttctgagccgacccggggctaaagaggga1200 cacaaggcccctggtatcggaggctgctgcccctcctgacacgcgggtatgtcctagtgc1260 tgctagtggtaggaagcatagtactagtattagtaggctgcgcatgaattcccgatgaag1320 cagagagegcaggaggcggtatttatagtgccattcccctctctgagagacccggatggt1380 agtcgagtgtatcggagacagcttgatgtagactccgtgcctgccggctcctcttattgg1440 cggacaccagtgagacaccccggaacttgctgtttttctgcaaaatccggggtgaccagt1500 gggagcctatttgcacacacgagcgggacaccccactctggtgaagagtgccaaagtcat1560 tctttttcccgttgcggggcagccgattgcatgttttaggaaaatattacctttgctaca1620.

ccctgtcagatttaccctccacacatatatattccgtcacctccagggactattattcgt1680 cgttgcgccgccagcggaagatatccagaagctgttttccgagagactcggttggcgcct1740 ggtatatttgatggatgtcgcgctgcctcacgtcccggtacccaggaacgcggtgggatc1800 tcgggcccatcgaagactgtgctccagactgctcgcccagcaggtgtttcttgatcgccg1860 cctctaaattgtccgcgcatcgccggtaacatttttccagctcggagtttgcgtttagat1920 acagtttctgcgatgccaaaggagcctgcagattataacctcggatgctgtcattcagcg1980 cttttaatttgacctccagatagttgctgtatttctgttcccattggctgctgcgcagct2040 tcgtataactcgagttattgttgcgctctgcctcggcgtactggctcatgatetggatct2100 tgtccgtgtcgcttttcttc~gagtgtttctcgcaaacgatgtgcacggcctgcagtgtcc2160 aatcggagtcgagctggcgccgaaactggcggatctgagcctccacactgccctgtttct2220 ctatccacggcggaaccgcctcctgccgtttcagaatgttgttcaagtggtactctgtgc2280 ggtcaatgaaggcgttattgccggtgaaatctttgggaagcggttttcctcggggaagat2340 tacgaaattccccgcgtcgttgcgcttcctggatctcgaggagatcgttctccgcgtcga2400 ggagatcgttctccgcgtcgacaccattccttgcggcggcggtgctcaacggcctcaacc2460 tactactggg ctgcttccta atgcaggagt cgcataaggg agagcgtcga caaacccgcg 2520 tttgagaact tgctcaagct tctggtaaac gttgtagtac tctgaaacaa ggccctagca 2580 ctctgatctg tttctcttgg gtagcggtga gtggtttatt ggagttcact ggtttcagca 2640 catctgtcatctagacaatattgttactaaatttttttgaactacaattgttcgtaattc2700 atctattattatacatcctcgtcagcaatttctggcagacggagtttactaacgtcttga2760 gtatgaggccgagaatccagctctgtggccatactcagtcttgacagcctgctgatgtgg2820 ctgcgttcaacgcaataagcgtgtcctccgactccgagttgtgctcgttatcgtcgttct2880 catcctcggaaaaatcacacgaaagaacatactcaccagtaggctttctggtccctgggg2940 cacggctgtttctgacgtattccggcgttgataatagctcgaaagtgaacgccgagtcgc.3000 gggagtcgaccgatgcccttgagagccttcaacccagtcagctccttccggtgggcgcgg3060 ggcatgactatcgtcgccgcacttatgactgtcttctttatcatgcaactcgtaggacag3120 gtgccggcagcgctctgggtcattttcggcgaggaccgctttcgctggagcgcgacgatg3180 atcggcctgtcgcttgcggtattcggaatcttgcacgccctcgctcaagccttcgtcact3240 ggtcccgccaccaaacgtttcggcgagaagcaggccattatcgccggcatggcggccgac3300 gcgctgggctacgtcttgctggcgttcgcgacgcgaggctggatggccttccccattatg3360 attcttctcgcttccggcggcatcgggatgcccgcgttgcaggccatgctgtccaggcag3420 
gtagatgacgaccatcagggacagcttcaaggatcgctcgcggctcttaccagcctaact3480 tcgatcactggaccgctgatcgtcacggcgatttatgccgcctcggcgagcacatggaac3540 gggttggcatggattgtaggcgccgccctataCCttgtCtgCC'tCCCCgCgttgcgtcgc3600 ggtgcatggagccgggccacctcgacctgaatggaagccggcggcacctcgctaacggat3660 tcaccactccaagaattgga~gccaatcaattcttgcggagaactgtgaatgcgcaaacca3720 acccttggcagaacatatccatcgcgtccgccatctccagcagccgcacgcggcgcatcg3780 gggggggggggggggggggggggcaaacaattcatcattttttttttattcttttttttg3840 atttcggtttctttgaaatttttttgattcggtaatctccgaacagaaggaagaacgaag3900 gaaggagcacagacttagattggtatatatacgcatatgtagtgttgaagaaacatgaaa3960 ttgcccagtattcttaacccaactgcacagaacaaaaacctgcaggaaacgaagataaat4020 catgtcgaaagctacatataaggaacgtgctgctactcatcctagtcctgttgctgccaa4080 gctatttaatatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtac4140 caccaaggaattactggagttagttgaagcattaggtcccaaaatttgtttactaaaaac4200 acatgtggatatcttgactgatttttccatggagggcacagttaagccgctaaaggcatt4260 atccgccaagtacaattttttactcttcgaagacagaaaatttgctgacattggtaatac4320 agtcaaattgcagtactctgcgggtgtatacagaatagcagaatgggcagacattacgaa4380 tgcacacggtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagt4440 aacaaaggaacctagaggcc~ttttgatgttagcagaattgtcatgcaagggctccctatc4500 tactggagaatatactaagggtactgttgacattgcgaagagcgacaaagattttgttat4560 cggctttattgctcaaagagacatgggtggaagagatgaaggttacgattggttgattat4620 gacacccggtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaaccgt4680 ggatgatgtggtctctacaggatctgacattattattgttggaagaggactatttgcaaa4740 gggaagggatgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcatattt4800 gagaagatgcggccagcaaaactaaaaaactgtattataagtaaatgcatgtatactaaa4860 ctcacaaattagagcttcaatttaattatatcagttattacccgggaatctcggtcgtaa4920 tgatttttataatgacgaaaaaaaaaaaattggaaagaaaagcccccccccccccccccc4980 CCCCCCCCCCCCCCCCgCagcgttgggtcctggccacgggtgcgcatgatcgtgctcctg5040 tcgttgaggacccggctaggctggcggggttgccttactggttagcagaatgaatcaccg5100 atacgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacctgagcaacaaca5160 tgaatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagtcagcgccctgc5220 
accattatgttccggatctgcatcgcaggatgctgctggctaccctgtggaacacctaca5280 tctgtattaacgaagcgctggcattgaccctgagtgatttttctctggtcccgccgcatc5340 cataccgccagttgtttaccctcacaacgttccagtaaccgggcatgttcatcatcagta5400 acccgtatcgtgagcatcctctctcgtttcatcggtatcattacccccatgaacagaaat5460 tcccccttacacggaggcatcaagtgaccaaacaggaaaaaaccgcccttaacatggccc5520 gctttatcagaagccagacattaacgcttctggagaaactcaacgagctggacgcggatg5580 aacaggcagacatctgtgaatcgcttcacgaccacgctgatgagctttaccgcagctgcc5640 tcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtca5700 cagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtg5760 ttggcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcggagtgtatactg5820 gcttaactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaat5880 accgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctcgctcac5940 tgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggt6000 aatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggcca6060 gcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgccc6120 ccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggact6180 ~ataaagataccaggcgtttccccctggaagctccctcgtgCgCtCtCCtgttCCgaCCCt6240 gccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatag6300 ctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgca6360 cgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaa6420 cccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagc6480 gaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactag6540 aaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagt'tgg6600 tagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagca6660 gcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtc6720 tgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaag6780 gatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatata6840 tgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgat6900 ctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacg6960 ggagggctta ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc 7020 tccagattta tcagcaataa accagccagc 
cggaagggcc gagcgcagaa gtggtcctgc 7080 aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 7140 gccagttaat agtttgcgca acgttgttgc cattgctgca ggcatcgtgg tgtcacgctc 7200 gtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatc7260 ccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaa7320 gttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcat7380 gccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaata7440 gtgtatgcggCgaCCgagttgCtCttgCCCggcgtcaacacgggataataccgcgccaca7500 tagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaag7560 gatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttc7620 agcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgc7680 aaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaata7740 ttattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtattta7800 gaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtcta7860 agaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcg7920 tcttcaa 7927 <210> 33 <211> 24 <212> DNA
<213> synthetic probe or primer <400> 33 taaggatccc cgggtaccga gctc 24 <210> 34 <211> 25 <212> DNA
<213> synthetic probe or primer <400> 34 ccagttcatc atcatatccc aagcc 25 <210> 35 <211> 4234 <212> DNA
<213> vector pUC18-FMD-CL-E1 <220>
<221> misc_feature <222> (988)..(989) <223> N is any nucleotide <220>
<221> misc_feature <222> (1167)..(1168) <223> N is any nucleotide <400> 35

gcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggca 60 cgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagct 120 cactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaat 180 tgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaattcgagct 240 cggtacccggggatccttaccagttcatcatcatatcccaagccatacggtgacctgtta 300 tgtggccggg atagattgag caattgcagt cctgcaccgt ctcatgccgg cgaggcgaga 360 tggtgaacag ctgggagacg aggaagacag atccgcagag atcccccacg tacatagcgg 420 aacagaaagcagccgccccaacgagcaaatcgacgtggcgtcgtattgtcgtagtgggga 480 cgctggcgttcctagctgcgagcgtgggggtgagcgctacccagcagcgggaagagttgt 540 tctcccgaacgcagggcacgcacccgggggtgtgcatgatcatgtccgctgcctcataca 600 caatgcttgagttggagcagtcgttcgtgacatggtacatcccggacacgttgcgcacct 660 catatcctagtgctgctagtggtaggaagcatagtactagtattagtaggcttcgcatga 720 attcccgatg aaggcagaga gcgcaaggag gcggtattta tagtgccatt cccctctctg 780 agagacccgg atggtagtcg agtgttatcg gagacagctt gatgtagact ccgtgcctgc 840 cggtcctctt attggcggac accagtgaga caccccggaa cttgctgttt ttctgcaaaa 900 tccggggtgaccagtgggagcctatttgcacacacgagcgggacaccccactctggtgaa960 gagtgccaaagtcattctttttcccgtnncggggcagccgattgcatgttttaggaaaat1020 attacctttgctacaccctgtcagatttaccctccacacatatatattccgtcacctcca1080 gggactattcttggctcgttgcgccgccgcggaagatatccagaagctgtgttttccgag1140 agactcggttggcgcctggtatatttnnaggatgtcgcgctgcctcacgtcccggtaccc1200 aggaacgcggtgggatctcgggcccatcgaagactgtgctccagactgctcgcccagcag1260 gtgtttcttgattgccgcctctaaatagtccgcgcatcgccggtaacatttttccagctc1320 ggagtttgcgtttagatacatttctgcgatgccaaaggagcctgcagattataacctcgg1380 atgctgtcattcagcgcttttaatttgacctccagatagttgctgtatttctgttccatt1440 ggctgctggacgttcgtataactcgagttattgttgcgctctgcctcggcgtactggctc1500 atgactgactgcggtcgcttctcgagtgttctcgcaacaggacgcctgcaggtcatcgag1560 tcgagctggcgccgaaactggcggatctgacctccacactgccctgtatctctatccacc1620 gggaaccgcctcctgccgttccagaatgttgttcaagtggtagctctgtgcggtcaatga1680 aggcgttattgccggtgaaatctttgggaagcggtttatcctcggggaagattacgaaat1740 tcccgcgcgtcgttgcgcttcctggatctcgaggaagatcgttctccgcgtcgaggagat1800 
cgttctccgcgtcgacctgcaggcatgcaagcttggcactggccgtcgttttacaacgtc1860 gtgactgggaaaaccctggcgttacccaacttaatcgccttgcagcacatccccctttcg1920 ccagctggcgtaatagcgaagaggcccgcaccgatcgcccttcccaacagttgcgcagcc1980 tgaatggcga atggcgectg atgcggtatt ttctccttac gcatctgtgc ggtatttcac 2040 accgcatatg gtgcactctc agtacaatct gctctgatgc cgcatagtta agccagcccc 2100 gaCaCCCgCCaaCaCCCgCtgaCg'Cg'CCCtgaCgggCttgtCtgCtCCCggcatccgctt2160 acagacaagctgtgacegtctccgggagctgcatgtgtcagaggttttcaccgtcatcac2220 cgaaacgcgcgagacgaaagggcctcgtgatacgcctatttttataggttaatgtcatga2280 taataatggtttcttagacgtcaggtggcacttttcggggaaatgtgcgcggaaccccta2340 tttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataaccctgat2400 aaatgcttcaataatattgaaaaaggaagagtatgagtattcaacatttccgtgtcgccc2460 ttattcccttttttgcggcattttgccttcctgtttttgctcacccagaaacgctggtga2520 aagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaactggatctca2580 acagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcactt2640 ttaaagttctgctatgtggcgcggtattatcccgtattgacgccgggcaagagcaactcg2700 gtcgccgcatacactattctcagaatgacttggttgagtactcaccagtcacagaaaagc2760 atcttacggatggcatgacagtaagagaattatgcagtgctgccataaccatgagtgata2820 acactgcggccaacttacttctgacaacgatcggaggaccgaaggagctaaccgcttttt2880 tgcacaacatgggggatcatgtaactcgccttgatcgttgggaaccggagctgaatgaag2940 ccataccaaacgacgagcgtgacaccacgatgcctgtagcaatggcaacaacgttgcgca3000 aactattaactggcgaactacttactctagcttcccggcaacaattaatagactggatgg3060 aggcggataaagttgcaggaccacttctgcgctcggcccttccggctggctggtttattg3120 ctgataaatctggagccggtgagcgtgggtctcgcggtatcattgcagcactggggccag3180 atggtaagccctcccgtatcgtagttatctacacgacggggagtcaggcaactatggatg3240 aacgaaatagacagatcgctgagataggtgcctcactgattaagcattggtaactgtcag3300 accaagtttactcatatatactttagattgatttaaaacttcatttttaatttaaaagga3360 tctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgtgagttttcgt3420 tccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcctttttttc3480 tgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgc3540 cggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgcagatac3600 
caaatactgtccttctagtgtagccgtagttaggccaccacttcaagaactctgtagcac3660 cgcctacatacctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagt3720 cgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagcggtcgggct3780 gaacggggggttcgtgcacacagcccagcttggagcgaacgacctacaccgaactgagat3840 acctacagcgtgagctatgagaaagcgccacgcttcccgaagggagaaaggcggacaggt3900 atccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccagggggaaacg3960 cctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgt4020 gatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcctttttacggt4080 tcctggccttttgctggccttttgctcacatgttctttcctgcgttatcccctgattctg4140 tggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccg4200 agcgcagcgagtcagtgagcgaggaagcggaaga 4234 <210> 36 <211> <212> DNA
<213> vector pFPM2'-CL-E1 <400> 36

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaa,aaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccttaccagttcatcat360 catatcccaagccatacggtgacctgttatgtggccgggatagattgagcaattgcagtc420 ctgcaccgtctcatgccggcgaggcgagatggtgaacagctgggagacgaggaagacaga480 tccgcagagatcccccacgtacatagcggaacagaaagcagccgccccaacgagcaaatc540 gacgtggcgtcgtattgtcgtagtggggacgctggcgttcctagctgcgagcgtgggggt600 gagcgctacccagcagcgggaagagttgttctcccgaacgcagggcacgcacccgggggt660 gtgcatgatcatgtccgctgcctcatacacaatgcttgagttggagcagtcgttcgtgac720 atggtacatcccggacacgttgcgcacctcatatcctagtgctgctagtggtaggaagca780 tagtactagtattagtaggcttcgcatgaattcccgatgaagcagagagcgcaggaggcg840 gtatttatagtgccattcccctctctgagagacccggatggtagtcgagtgtatcggaga900 cagcttgatgtagactccgtgcctgccggctcctcttattggcggacaccagtgagacac960 cccggaacttgctgtttttetgcaaaatccggggtgaccagtgggagcctatttgcacac1020 acgagcgggacaccccactctggtgaagagtgccaaagtcattctttttcccgttgcggg1080 gcagccgattgcatgttttaggaaaatattacctttgctacaccctgtcagatttaccct1140 ccacacatatatattccgtcacctccagggactattattcgtcgttgcgccgccagcgga1200 agatatccagaagctgttttccgagagactcggttggcgcctggtatatttgatggatgt1260 cgcgctgcctcacgtcccggtacccaggaacgcggtgggatctcgggcccatcgaagact1320 gtgctccagactgctcgcccagcaggtgtttcttgatcgccgcctctaaattgtccgcgc1380 atcgccggtaacatttttccagctcggagtttgcgtttagatacagtttctgcgatgcca1440 aaggagcctgcagattataacctcggatgctgtcattcagcgcttttaatttgacctcca1500 gatagttgctgtatttctgttcccattggctgctgcgcagcttcgtataactcgagttat1560 tgttgcgctctgcctcggcgtactggctcatgatctggatcttgtccgtgtcgcttttct1620 tcgagtgtttctcgcaaacgatgtgcacggcctgcagtgtccaatcggagtcgagctggc1680 gccgaaactggcggatctgagcctccacactgccctgtttctctatccacggcggaaccg1740 cctcctgccgtttcagaatgttgttcaagtggtactctgtgcggtcaatgaaggcgttat1800 tgccggtgaaatctttgggaagcggttttcctcggggaagattacgaaattccccgcgtc1860 
gttgcgcttcctggatctcgaggagatcgttctccgcgtcgaggagatcgttctccgcgt1920 cgacaccattccttgcggcggcggtgctcaacggcctcaacctactactgggctgcttcc1980 taatgcaggagtcgcataagggagagcgtcgacaaacccgcgtttgagaacttgctcaag2040 cttctggtaaacgttgtagtactctgaaacaaggccctagcactctgatctgtttctctt2100 gggtagcggtgagtggtttattggagttcactggtttcagcacatctgtcatctagacaa2160 tattgttactaaatttttttgaactacaattgttcgtaattcatctattattatacatcc2220 tcgtcagcaatttctggcagacggagtttactaacgtcttgagtatgaggccgagaatce2280 agctctgtggccatactcagtcttgacagcctgctgatgtggctgcgttcaacgcaataa2340 gcgtgtcctccgactccgagttgtgctcgttatcgtcgttctcatcctcggaaaaatcac2400 acgaaagaacatactcaccagtaggctttctggtccctggggcacggctgtttctgacgt2460 attccggcgttgataatagctcgaaagtgaacgccgagtcgcgggagtcgaccgatgccc2520 ttgagagccttcaacccagtcagctccttccggtgggcgcggggcatgactatcgtcgcc2580 gcacttatgactgtcttctttatcatgcaactcgtaggacaggtgccggcagcgctctgg2640 gtcattttcg gcgaggaccg ctttcgctgg agcgcgacga tgatcggcct gtcgcttgcg 2700 gtattcggaa tcttgcacgc cctcgctcaa gccttcgtca ctggtcccgc caccaaacgt 2760 ttcggcgagaagcaggccattategccggcatggcggccgacgcgctgggctacgtcttg2820 ctggcgttcgcgacgcgaggctggatggccttccccattatgattcttctcgcttccggc2880 ggcatcgggatgcccgcgttgcaggccatgctgtccaggcaggtagatgacgaccatcag2940 ggacagcttcaaggatcgctcgcggctcttaccagcctaacttcgatcactggaccgctg3000 atcgtcacggcgatttatgccgcctcggcgagcacatggaacgggttggcatggattgta3060 ggcgccgccctataccttgtctgcctccccgcgttgcgtcgcggtgcatggagccgggcc3120 acctcgacctgaatggaagccggcggcacctcgctaacggattcaccactccaagaattg3180 gagccaatcaattcttgcggagaactgtgaatgcgcaaaccaacccttggcagaacatat3240 ccatcgcgtccgccatctccagcagccgcacgcggcgcatcggggggggggggggggggg3300 gggggcaaacaattcatcattttttttttattcttttttttgatttcggtttctttgaaa3360 tttttttgattcggtaatctccgaacagaaggaagaacgaaggaaggagcacagacttag3420 attggtatatatacgcatatgtagtgttgaagaaacatgaaattgcccagtattcttaac3480 ccaactgcacagaacaaaaacctgcaggaaacgaagataaatcatgtcgaaagctacata3540 taaggaacgtgctgctactcatcctagtcctgttgctgccaagctatttaatatcatgca3600 cgaaaagcaaacaaacttgtgtgcttcattggatgttcgtaccaccaaggaattactgga3660 
gttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtggatatcttgac3720 tgatttttccatggagggcacagttaagccgctaaaggcattatccgccaagtacaattt3780 tttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaattgcagtactc3840 tgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacggtgtggtggg3900 cccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaaggaacctagagg3960 ccttttgatgttagcagaattgtcatgcaagggctccctatctactggagaatatactaa4020 gggtactgttgacattgcgaagagcgacaaagattttgttatcggctttattgctcaaag4080 agacatgggtggaagagatgaaggttacgattggttgattatgacacccggtgtgggttt4140 agatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatgtggtctctac4200 aggatctgacattattattgttggaagaggactatttgcaaagggaagggatgctaaggt4260 agagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagatgcggccagca4320 aaactaaaaaactgtattataagtaaatgcatgtatactaaactcacaaattagagcttc4380 aatttaattatatcagttattacccgggaatctcggtcgtaatgatttttataatgacga-4440 aaaaaaaaaaattggaaagaaaagccccccccccccccccccccccccccccccccccgc4500 agcgttgggtcctggccacgggtgcgcatgatcgtgctcctgtcgttgaggacccggcta4560 ggctggcggggttgccttactggttagcag.aatgaatcaccgatacgcgagcgaacgtga4620 agcgactgctgctgcaaaacgtctgcgacctgagcaacaacatgaatggtcttcggtttc4680 cgtgtttcgtaaagtctggaaacgcggaagtcagcgccctgcaccattatgttccggatc4740 tgcatcgcaggatgctgctggctaccctgtggaacacctaeatctgtattaacgaagcgc4800 tggcattgaccctgagtgatttttctctggtcccgccgcatccataccgccagttgttta4860 ccctcacaacgttccagtaaccgggcatgttcatcatcagtaacccgtatcgtgagcatc4920 ctctctcgtttcatcggtatcattacccccatgaacagaaattcccccttacacggaggc4980 atcaagtgaccaaacaggaaaaaaccgcccttaacatggcccgctttatcagaagccaga5040 cattaacgcttctggagaaactcaacgagctggacgcggatgaacaggcagacatctgtg5100 aatcgcttcacgaccacgctgatgagctttaccgcagctgcctcgcgcgtttcggtgatg5160 acggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcgg5220 atgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggcg5280 cagccatgacccagtcacgtagcgatagcggagtgtatactggcttaactatgcggcatc5340 agagcagattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaag5400 gagaaaataccgcatcaggcgCtCttCCgCttCC'tCgC'tCaCtgaCtCgCtgcgctcggt5460 
cgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacaga5520 atcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccg5580 taaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaa5640 aaatcgacgctcaagtcagaggtggcgaaacccgacaggaatataaagataccaggcgtt5700 tccccctggaagctccctcgtgcgctctcctgttCCgaCCctgccgcttaccggatacct5760 gtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatct5820 cagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcc5880 cgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgactt5940 atcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgc6000 tacagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttggtat6060 ctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaa6120 acaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaa6180 aaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacga6240 aaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatcct6300 tttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctga6360 cagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatc6420 catagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctgg6480 ccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaat6540 aaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccat6600 ccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcg6660 caacgttgttgccattgctgcaggcatcgtggtgtcacgctcgtcgtttggtatggcttc6720 attcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaa6780 agcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatc6840 actcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgctt6900 ttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgag6960 ttgctcttgcccggcgtcaacacgggataataccgcgccacatagcagaactttaaaagt7020 gctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgag7080 atccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcac7140 cagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggc7200 gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 7260 gggttattgt ctcatgagcg gatacatatt tgaatgtatt 
tagaaaaata aacaaatagg 7320 ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca ttattatcat 7380 gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtcttcaa 7429 <210> 37 <211> 39 <212> DNA
<213> synthetic probe or primer <400> 37 catcacaaat atgaggtgcg caacgtgtcc gggatgtac 39 <210> 38 <211> 42 <212> DNA
<213> synthetic probe or primer <400> 38 gtgatggtgg tgtcctagtg ctgctagtgg taggaagcat ag 42 <210> 39 <211> 4273 <212> DNA
<213> vector pUC18-FMD-CL-E1-H-K6 <220>
<221> misc_feature <222> (1027)..(1028) <223> N is any nucleotide <220>
<221> misc_feature <222> (1206)..(1207) <223> N is any nucleotide <400>

gCJCCCaataCgCaaaCCgCCtC'tCCCCC.JCgcgttggccgattcattaatgcagctggca60 cgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagct120 cactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaat180 tgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaattcgagct240 cggtacccggggatccttaatggtgatggtggtggtgccagttcatcatcatatcccaag300 ccatacggtgacctgttatgtggccgggatagattgagcaattgcagtcctgcaccgtct360 catgccggcg aggcgagatg gtgaacagct gggagacgag gaagacagat ccgcagagat 420 cccccacgta catagcggaa cagaaagcag ccgccccaac gagcaaatcg acgtggcgtc 480 gtattgtcgtagtggggacgctggcgttcctagctgcgagcgtgggggtgagcgctaccc 540 agcagcgggaagagttgttctcccgaacgcagggcacgcacccgggggtgtgcatgatca 600 tgtccgctgcctcatacacaatgcttgagttggagcagtcgttcgtgacatggtacatcc 660 cggacacgttgcgcacctcatatttgtgatggtgatggtggtgtcctagtgctgctagtg 720 gtaggaagcatagtactagtattagtaggcttcgcatgaattcccgatgaaggcagagag 780 cgcaaggaggcggtatttatagtgccattcccctctctgagagacccggatggtagtcga 840 gtgttatcggagacagcttgatgtagactccgtgcctgccggtcctcttattggcggaca 900 ccagtgagac accccggaac ttgctgtttt tctgcaaaat ccggggtgac cagtgggagc 960 ctatttgcac acacgagcgg gacaccccac tctggtgaag agtgccaaag tcattctttt 1020 tcccgtnncg gggcagccga ttgcatgttt taggaaaata ttacctttgc tacaccctgt 1080 cagatttacc ctccacacat atatattccg tcacctccag ggactattct tggctcgttg 1140 cgccgccgcggaagatatccagaagctgtgttttccgagagactcggttggcgcctggta1200 tatttnnaggatgtcgcgctgcctcacgtcccggtacceaggaacgcggtgggatctcgg1260 gcccatcgaagactgtgctccagactgctcgcccagcaggtgtttcttgattgccgcctc1320 taaatagtccgcgcatcgccggtaacatttttccagctcggagtttgcgtttagatacat1380 ttctgcgatgccaaaggagcctgcagattataacctcggatgctgtcattcagcgctttt1440 aatttgacctccagatagttgctgtatttctgttccattggctgctggacgttcgtataa1500 ctcgagttattgttgcgctctgcctcggcgtactggctcatgactgactgcggtcgcttc1560 tcgagtgttctcgcaacaggacgcctgcaggtcatcgagtcgagctggcgccgaaactgg1620 cggatctgacctccacactgccctgtatctctatccaccgggaaccgcctcctgccgttc1680 cagaatgttgttcaagtggtagctctgtgcggtcaatgaaggcgttattgccggtgaaat1740 ctttgggaagcggtttatcctcggggaagattacgaaattcccgcgcgtcgttgcgcttc1800 
ctggatctcgaggaagatcgttctccgcgtcgaggagatcgttctccgcgtcgacctgca1860 ggcatgcaagcttggcactggccgtcgttttacaacgtcgtgactgggaaaaccctggeg1920 ttacccaacttaatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaag1980 aggcccgcaccgatcgcccttcccaacagttgcgcagcctgaatggcgaatggcgcctga2040 tgcggtattttctccttacgcatctgtgcggtatttcacaccgcatatggtgcactctca2100 gtacaatctgctctgatgccgcatagttaagccagccccgacacccgccaacacccgctg2160 acgcgccctgacgggcttgtctgctcccggcatccgcttacagacaagctgtgaccgtct2220 ccgggagctgcatgtgtcagaggttttcaccgtcatcaccgaaacgcgcgagacgaaagg2280 gcctcgtgatacgcctatttttataggttaatgtcatgataataatggtttcttagacgt2340 caggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatac2400 attcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaa2460 aaaggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcat2520 tttgccttcctgtttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatc2580 agttgggtgcacgagtgggttacatcgaactggatctcaacagcggtaagatccttgaga2640 gttttcgccccgaagaacgttttccaatgatgagcacttttaaagttctgctatgtggcg2700 cggtattatcccgtattgacgccgggcaagagcaactcggtcgccgcatacactattctc2760 agaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacag2820 taagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttc2880 tgacaacgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatg2940 taactcgccttgatcgttgggaaccggagctgaatgaagccataccaaacgacgagcgtg3000 acaccacgatgcctgtagcaatggcaacaacgttgcgcaaactattaactggcgaactac3060 ttactctagcttcccggcaacaattaatagactggatggaggcggataaagttgcaggac3120 cacttctgcgctcggcccttccggctggctggtttattgctgataaatctggagccggtg3180 agcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcg3240 tagttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctg3300 agataggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatac3360 tttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttg3420 ataatctcatgaccaaaatcecttaacgtgagttttcgttccactgagcgtcagaccccg3480 tagaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgc3540 aaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactc3600 
tttttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgt3660 agccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgc3720 taatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggact3780 caagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacac3840 agcccagcttggagcgaacgacctacaccgaactgagatacctacagcgtgagctatgag3900 aaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcg3960 gaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctg4020' tcgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcgga4080 gcctatggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggcctt4140 ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct 4200 ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg 4260 aggaagcgga aga 4273 <210> 40 <211> 7330 <212> DNA
<213> vector pFPMT-CL-H6-K-E1 <220>
<221> misc_feature <222> (1098)..(1099) <223> N is any nucleotide <220>
<221> misc_feature <222> (1277)..(1278) <223> N is any nucleotide <400>

ggtaccctgctcaatctccggaatggtgatctgatcgttcctgaaaacctcgacattggc60 tccctcctgacacaggtactcgtacaggttccaggtaaacgagtcgtagttgtcgatcat120 gacaacgttcttagaagcggccggcattttgaaggtgactaatagcctaagaaaatattt180 aatttaattttcattaaattttcctatactcgctatttcagcttttcatctcatcacttc240 ataaacgatataaaccagaaaaagaactattttcaaacacgcttctcaaaagcggtatgt300 ccttccacgtctccttagaatctggcaagtccgcgagggggatccttaccagttcatcat360 catatcccaagccatacggtgacctgttatgtggccgggatagattgagcaattgcagtc420 ctgcaccgtctcatgccggcgaggcgagatggtgaacagctgggagacgaggaagacaga480 tccgcagagatcccccacgtacatagcggaacagaaagcagccgccccaacgagcaaatc540 gacgtggcgtcgtattgtcgtagtggggacgctggcgttcctagctgcgagcgtgggggt600 gagcgctacccagcagcgggaagagttgttctcccgaacgcagggcacgcacccgggggt660 gtgcatgatcatgtccgctgcctcatacacaatgcttgagttggagcagtcgttcgtgac720 atggtacatcccggacacgttgcgcacctcatatttgtgatggtgatggtggtgtcctag780 tgctgctagtggtaggaagcatagtactagtattagtaggcttcgcatgaattcccgatg840 aaggcagagagcgcaaggaggcggtatttatagtgccattcccctctctgagagacccgg900 atggtagtcgagtgttatcggagacagcttgatgtagactccgtgcctgccggtcctctt960 attggcggacaccagtgagacaccccggaacttgctgtttttctgcaaaatccggggtga1020 ccagtgggagcctatttgcacacacgagcgggacaccccactctggtgaagagtgccaaa1080 gtcattcttt ttcccgtnnc ggggcagccg attgcatgtt ttaggaaaat attacctttg 1140 ctacaccctg tcagatttac cctccacaca tatatattcc gtcacctcca gggactattc 1200 ttggctcgtt gcgccgccgc ggaagatatc cagaagctgt gttttccgag agactcggtt 1260 ggcgcctggt atatttnnag gatgtcgcgc tgcctcacgt cccggtaccc aggaacgcgg 1320 tgggatctcgggcccatcgaagactgtgctccagactgctcgcccagcaggtgtttcttg1380 attgccgcctctaaatagtccgcgcatcgccggtaacatttttccagctcggagtttgcg1440 tttagatacatttctgcgatgccaaaggagcctgcagattataacctcggatgctgtcat1500 tcagcgcttttaatttgacctccagatagttgctgtatttctgttccattggctgctgga1560 cgttcgtataactcgagttattgttgcgctetgcctcggcgtactggcteatgactgact1620 gcggtcgcttctcgagtgttctcgcaacaggacgcctgcaggtcatcgagtcgagctggc1680 gccgaaactggcggatctgacctccacactgccctgtatctctatccaccgggaaccgcc1740 tcctgccgttccagaatgttgttcaagtggtagctctgtgcggtcaatgaaggcgttatt1800 
gccggtgaaatctttgggaagcggtttatcctcggggaagattacgaaattcccgcgcgt1860 cgttgcgcttcctggatctcgaggaagatcgttctccgcgtcgaggagatCgttCtCCgC1920 gtcgacctgcaggcatgcaagcttctggtaaacgttgtagtactctgaaacaaggcccta1980 gcactctgatctgtttctcttgggtagcggtgagtggtttattggagttcactggtttca2040 gcacatctgtcatctagacaatattgttactaaatttttttgaactacaattgttcgtaa2100 ttcatctattattatacatcctcgtcagcaatttctggcagacggagtttactaacgtct2160 tgagtatgaggccgagaatccagctctgtggccatactcagtcttgacagcctgctgatg2220 tggctgcgttcaacgcaataagcgtgtcctccgactccgagttgtgctcgttatcgtcgt2280 tctcatcctcggaaaaatcacacgaaagaacatactcaccagtaggctttctggtccctg2340 gggcacggctgtttctgacgtattccggcgttgataatagctcgaaagtgaacgccgagt2400 cgcgggagtcgaccgatgcccttgagagccttcaacccagtcagctccttccggtgggcg2460 cggggcatgactatcgtcgccgcacttatgactgtcttctttatcatgcaactcgtagga2520 caggtgccggcagcgctctgggtcattttcggcgaggaccgctttcgctggagcgcgacg2580 atgatcggcctgtcgcttgcggtattcggaatcttgcacgccctcgctcaagccttcgtc2640 actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 2700 gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 2760 atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 2820 caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 2880 acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 2940 aacgggttggcatggattgtaggcgccgccctataccttgtctgcctccccgcgttgcgt3000 cgcggtgcatggagccgggccacctcgacctgaatggaagccggcggcacctcgctaacg3060 gattcaccactccaagaattggagccaatcaattcttgcggagaactgtgaatgcgcaaa3120 ccaacccttggcagaacatatccatcgcgtCCgCCatCtCCagCagCCgCaCJCggCgCa3180 tcggggggggggggggggggggggggcaaacaattcatcattttttttttattctttttt3240 ttgatttcggtttctttgaaatttttttgattcggtaatctccgaacagaaggaagaacg3300 aaggaaggagcacagacttagattggtatatatacgcatatgtagtgttgaagaaacatg3360 aaattgcccagtattcttaacccaactgcacagaacaaaaacctgcaggaaacgaagata3420 aatcatgtcgaaagctacatataaggaacgtgctgctactcatcctagtcctgttgctgc3480 caagctatttaatatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcg3540 taccaccaaggaattactggagttagttgaagcattaggtcccaaaatttgtttactaaa3600 
aacacatgtggatatcttgactgatttttccatggagggcacagttaagccgctaaaggc3660 attatccgccaagtacaattttttactcttcgaagacagaaaatttgctgacattggtaa3720 tacagtcaaattgcagtactctgcgggtgtatacagaatagcagaatgggcagacattac3780 gaatgcacacggtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaaga3840 agtaacaaaggaacctagaggccttttgatgttagcagaattgtcatgcaagggctccct3900 atctactggagaatatactaagggtactgttgacattgcgaagagcgacaaagattttgt3960 tatcggctttattgctcaaagagacatgggtggaagagatgaaggttacgattggttgat4020 tatgacacccggtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaac4080 cgtggatgatgtggtctctacaggatctgacattattattgttggaagaggactatttgc4140 aaagggaagggatgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcata4200 tttgagaagatgcggccagcaaaactaaaaaactgtattataagtaaatgcatgtatact4260 aaactcacaaattagagcttcaatttaattatatcagttattacccgggaatctcggtcg4320 taatgatttttataatgacgaaaaaaaaaaaattggaaagaaaagccccccccccccccc4380 CCCCCCCCCCCCCCCCCCCgcagcgttgggtcctggccacgggtgcgcatgatcgtgctc4440 ctgtcgttgaggacccggctaggctggcggggttgccttactggttagcagaatgaatca4500 ccgatacgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacctgagcaaca4560 acatgaatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagtcagcgccc4620 tgcaccattatgttccggatctgcatcgcaggatgctgctggctaccctgtggaacacct4680 acatctgtattaacgaagcgctggcattgaccctgagtgatttttctctggtcccgccgc4740 atccataccgccagttgtttaccctcacaacgttccagtaaccgggcatgttcatcatca4800 gtaacccgtatcgtgagcatcctctctcgtttcatcggtatcattacccccatgaacaga4860 aattcccccttacacggaggcatcaagtgaccaaacaggaaaaaaccgcccttaacatgg4920 cccgctttatcagaagccagacattaacgcttctggagaaactcaacgagctggacgcgg4980 atgaacaggcagacatctgtgaatcgcttcacgaccacgctgatgagctttaccgcagct5040 gcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacgg5100 tcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgg5160 gtgttggcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcggagtgtata5220 ctggcttaactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtgtga5280 aataccgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctcgct5340 cactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggc5400 
ggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaagg5460 ccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccg5520 cccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacagg5580 actataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgac5640 cctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctca5700 tagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgt5760 gcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtc5820 caacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcag5880 agcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacac5940 tagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagt6000 tggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaa6060 gcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggg6120 gtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaa6180 aaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtat6240 atatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagc6300 gatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgat6360 acgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcacc6420 ggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcc6480 tgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtag6540 ttcgccagttaatagtttgcgcaacgttgttgccattgctgcaggcatcgtggtgtcacg6600 ctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatg6660 atcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaag6720 taagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgt6780 catgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgaga6840 atagtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgggataataccgcgcc6900 acatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctc6960 aaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatc7020 ttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgc7080 cgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttca7140 atattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtat7200 
ttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgt7260 ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca cgaggccctt 7320 tcgtcttcaa 7330 <210> 41 <211> 5202 <212> DNA
<213> vector pYIG5 <400>

agcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggc60 acgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagc120 tcactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaa180 ttgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaatttaata240 cgactcactatagggaattcgaggatccttcaatatgcgcacatacgctgttatgttcaa300 ggtcccttcgtttaagaacgaaagcggtcttccttttgagggatgtttcaagttgttcaa360 atctatcaaatttgcaaatccccagtctgtatctagagcgttgaatcggtgatgcgattt420 gttaattaaattgatggtgtcaccattaccaggtctagatataccaatggcaaactgagc480 acaacaataccagtccggatcaactggcaccatctctcccgtagtctcatctaatttttc540 ttccggatgaggttccagatataccgcaacacctttattatggtttccctgagggaataa600 tagaatgtcccattcgaaatcaccaattctaaacctgggcgaattgtatttcgggtttgt660 taactcgttccagtcaggaatgttccacgtgaagctatcttccagcaaagtctccacttc720 ttcatcaaattgtggagaatactcccaatgctcttatctatgggacttccgggaaacaca780 gtaccgatacttcccaattcgtcttcagagctcattgtttgtttgaagagactaatcaaa840 gaatcgttttctcaaaaaaattaatatcttaactgatagtttgatcaaaggggcaaaacg900 taggggcaaacaaacggaaaaatcgtttctcaaattttctgatgccaagaactctaacca960 gtcttatctaaaaattgccttatgatccgtctctccggttacagcctgtgtaactgatta1020 atcctgcctttctaatcaccattctaatgttttaattaagggattttgtCttcattaacg1080 gctttcgctcataaaaatgttatgacgttttgcccgcaggcgggaaaccatccacttcac1140 gagactgatctcctctgccggaacaccgggcatctccaacttataagttggagaaataag1200 agaatttcagattgagagaatgaaaaaaaaaaaccctgaaaaaaaaggttgaaaccagtt1260 ccctgaaattattcccctacttgactaataagtatataaagacggtaggtattgattgta1320 attctgtaaatctatttcttaaacttcttaaattctacttttatagttagtctttttttt1380 agttttaaaacaccaagaacttagtttcgaataaacacacataaacaaacaccatgagat1440 ttccttcaatttttactgcagttttattcgcagcatcctccgcattagctgctccagtca1500 acactacaacagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcag1560 atttagaaggggatttcgatgttgctgttttgccattttccaacagcacaaataacgggt1620 tattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctag1680 ataaaaggcctgtcgacggtaccagatctcgacttggttgaacacgttgccaaggcttaa1740 gtgaatttactttaaagtcttgcatttaaataaattttctttttatagctttatgactta1800 gtttcaatttatatactattttaatgacattttcgattcattgattgaaagctttgtgtt1860 
ttttcttgatgcgctattgcattgttcttgtctttttcgccacatgtaatatctgtagta1920 gatacctgatacattgtggatgctgagtgaaattttagttaataatggaggcgctcttaa1980 taattttggggatattggcttttttttttaaagtttacaaatgaattttttccgccagga2040 taacgattctgaagttactcttagcgttcctatcggtacagccatcaaatcatgcctata2100 aatcatgcctatatttgcgtgcagtcagtatcatctacatgaaaaaaactcccgcaattt2160 cttatagaatacgttgaaaattaaatgtacgcgccaagataagataacatatatctagct2220 agatgcagtaatatacacagattcccgcggacgtgggaaggaaaaaattagataacaaaa2280 tctgagtgatatggaaattccgctgtatagctcatatctttcccttcaacaccagaaatg2340 taaaaatcttgttacgaaggatctttttgctaatgtttctcgctcaatcctcatttcttc2400 cctacgaagagtcaaatctacttgttttctgccggtatcaagatccatatcttctagttt2460 caccatcaaagtccaatttctagtatacagtttatgtcccaacgtaacagacaatcaaaa2520 ttggaaaggataagtatccttcaaagaatgattctgcgctggctcctgaaccgcctaatg2580 ggaacagagaagtccaaaacgatgctataagaaccagaaataaaacgataaaaccatacc2640 aggatccaagcttggcactggccgtcgttttacaacgtcgtgactgggaaaaccctggcg2700 ttacccaacttaatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaag2760 aggcccgcaccgatcgcccttcccaacagttgcgcagcctgaatggcgaatgggaaattg2820 taaacgttaatattttgttaaaattcgcgttaaatttttgttaaatcagctcatttttta2880 accaataggccgaaatcggcaaaatcccttataaatcaaaagaatagaccgagatagggt2940 tgagtgttgttccagtttggaacaagagtccactattaaagaacgtggactccaacgtca3000 aagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatcaccctaatcaa3060 gttttttggggtcgaggtgccgtaaagcactaaatcggaaccctaaagggagcccccgat3120 ttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaaag3180 gagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaaccaccacacccg3240 ccgcgcttaatgcgccgctacagggcgcgtcaggtggcacttttcggggaaatgtgcgcg3300 gaacccctatttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaat3360 aaccctgataaatgcttcaataatattgaaaaaggaagagtatgagtattcaacatttcc3420 gtgtcgcccttattcccttttttgcggcattttgccttcctgtttttgctcacccagaaa3480 cgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaac3540 tggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatga3600 tgagcacttttaaagttctgctatgtggcgcggtattatcccgtattgacgccgggcaag3660 
agcaactcggtcgccgcatacactattctcagaatgacttggttgagtactcaccagtca3720 cagaaaagcatcttacggatggcatgacagtaagagaattatgcagtgctgccataacca3780 tgagtgataacactgcggccaacttacttctgacaacgatcggaggaccgaaggagctaa3840 ccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaaccggagc3900 tgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatggcaacaa3960 cgttgcgcaaactattaactggcgaactacttactctagcttcccggcaacaattaatag4020 actggatggaggcggataaagttgcaggaccacttctgcgC'tCggCCCttccggctggct4080 ggtttattgctgataaatctggagccggtgagcgtgggtctcgcggtatcattgcagcac4140 tggggccagatggtaagccctcccgtatcgtagttatctacacgacggggagtcaggcaa4200 ctatggatgaacgaaatagacagatcgctgagataggtgcctcactgattaagcattggt4260 aactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaat4320 ttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgtg4380 agttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatc4440 ctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtgg4500 tttgtttgceggatcaagagctaccaactctttttccgaaggtaactggcttcagcagag4560 cgcagataccaaatactgtccttctagtgtagccgtagttaggccaccacttcaagaact4620 ctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgccagtg4680 gcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagc4740 ggtcgggctgaacggggggttcgtgcacacagcccagcttggagcpaacgacctacaccg4800 aactgagatacctacagcgtgagcattgagaaagcgccacgcttcccgaagggagaaagg4860 cggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccag4920 ggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtc4980 gatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcct5040 ttttacggttcctggccttttgctggccttttgctcacatgttctttcctgcgttatccc5100 ctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagcc5160 gaacgaccgagcgcagcgagtcagtgagcgaggaagcggaag 5202 <210>

<211> 5613 <212> DNA
<213> vector pYIG5E1H6 <400>

ggatccttcaatatgcgcacatacgctgttatgttcaaggtcccttcgtttaagaacgaa 60 agcggtcttccttttgagggatgtttcaagttgttcaaatctatcaaatttgcaaatccc 120 cagtctgtatctagagcgttgaatcggtgatgcgatttgttaattaaattgatggtgtca 180 ccattaccaggtctagatataccaatggcaaactgagcacaacaataccagtccggatca 240 actggcacca~tctctcccgtagtctcatctaatttttcttccggatgagg.ttccagatat 300 accgcaacacctttattatggtttccctgagggaataatagaatgtcccattcgaaatca 360 ccaattctaaacctgggcgaattgtatttcgggtttgttaactcgttccagtcaggaatg 420 ttccacgtgaagctatcttccagcaaagtctccacttcttcatcaaattgtggagaatac 480 tcccaatgctcttatctatgggacttccgggaaacacagtaccgatacttcccaattcgt 540 cttcagagctcattgtttgtttgaagagactaatcaaagaatcgttttctcaaaaaaatt 600 aatatcttaactgatagtttgatcaaaggggcaaaacgtaggggcaaacaaacggaaaaa 660 tcgtttctcaaattttctgatgccaagaactctaaccagtcttatctaaaaattgcctta 720 tgatccgtctctccggttacagcctgtgtaactgattaatcctgcctttctaatcaccat 780 tctaatgttttaattaagggattttgtcttcattaacggctttcgctcataaaaatgtta 840 tgacgttttgcccgcaggcgggaaaccatccacttcacgagactgatctcctctgccgga 900 acaccgggcatctccaacttataagttggagaaataagagaatttcagattgagagaatg960 aaaaaaaaaaaccctgaaaaaaaaggttgaaaccagttccctgaaattattcccctactt1020 gactaataagtatataaagacggtaggtattgattgtaattctgtaaatctatttcttaa1080 acttcttaaattctacttttatagttagtcttttttttag.ttttaaaacaccaagaactt1140 agtttcgaataaacacacataaacaaacaccatgagatttccttcaatttttactgcagt1200 tttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatgaaac1260 ggcacaaattccggctgaagctgtcatcggttacttagatttagaaggggatttcgatgt1320 tgctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactat1380 tgccagcattgctgctaaagaagaaggggtatctctagataaaaggtatgaggtgcgcaa1440 cgtgtccgggatgtaccatgtcacgaacgactgctccaactcaagcattgtgtatgaggc1500 agcggacatgatcatgcacacccccgggtgcgtgccctgcgttcgggagaacaactcttc1560 ccgctgctgggtagcgctcacccccacgctcgcagctaggaacgccagcgtccccactac1620 gacaatacgacgccacgtcgatttgctcgttggggcggctgctttctgttccgctatgta1680 cgtgggggatctctgcggatctgtcttcctcgtctcccagctgttcaccatctcgcctcg1740 ccggcatgagacggtgcaggactgcaattgctcaatctatcccggccacataacaggtca1800 
ccgtatggcttgggatatgatgatgaactggcaccaccaccatcaccattaaagatctcg1860 acttggttgaacacgttgccaaggcttaagtgaatttactttaaagtcttgcatttaaat1920 aaattttctttttatagctttatgacttagtttcaatttatatactattttaatgacatt1980 ttcgattcattgattgaaagctttgtgttttttcttgatgcgctattgcattgttcttgt2040 ctttttcgccacatgtaatatctgtagtagatacctgatacattgtggatgctgagtgaa2100 attttagttaataatggaggcgctcttaataattttggggatattggcttttttttttaa2160 agtttacaaatgaattttttccgccaggataacgattctgaagttactcttagcgttcct2220 atcggtacagccatcaaatcatgcctataaatcatgcctatatttgcgtgcagtcagtat2280 catctacatgaaaaaaactcccgcaatttcttatagaatacgttgaaaattaaatgtacg2340 cgccaagataagataacatatatctagctagatgcagtaatatacacagattcccgcgga2400 cgtgggaaggaaaaaattagataacaaaatctgagtgatatggaaattccgctgtatagc2460 tcatatctttcccttcaacaccagaaatgtaaaaatcttgttacgaaggatctttttgct2520 aatgtttctcgctcaatcctcatttcttccctacgaagagtcaaatctacttgttttctg2580 ccggtatcaagatccatatcttctagtttcaccatcaaagtccaatttctagtatacagt2640 ttatgtcccaacgtaacagacaatcaaaattggaaaggataagtatccttcaaagaatga2700 ttctgcgctggctcctgaaccgcctaatgggaacagagaagtccaaaacgatgctataag2760 aaccagaaataaaacgataaaaccataccaggatccaagcttggcactggccgtcgtttt2820 acaacgtcgtgactgggaaaaccctggcgttacccaacttaatcgccttgcagcacatcc2880 ccctttcgccagctggcgtaatagcgaagaggcccgcaccgatcgcccttcccaacagtt2940 gcgcagcctgaatggcgaatgggaaattgtaaacgttaatattttgttaaaattcgcgtt3000 aaatttttgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatccctta3060 taaatcaaaagaatagaccgagatagggttgagtgttgttccagtttggaacaagagtcc3120 actattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcgatgg3180 cccactacgtgaaccatcaccctaatcaagttttttggggtcgaggtgccgtaaagcact3240 aaatcggaaccctaaagggagcccccgatttagagcttgacggggaaagccggcgaacgt3300 ggcgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagc3360 ggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtc3420 aggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaataca3480 ttcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaa3540 aaggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcatt3600 
ttgccttcctgtttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatca3660 gttgggtgcacgagtgggttacatcgaactggatctcaacagcggtaagatccttgagag3720 ttttcgccccgaagaacgttttccaatgatgagcacttttaaagttctgctatgtggcgc3780 ggtattatcccgtattgacgccgggcaagagcaactcggtcgccgcatacactattctca3840 gaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacagt3900 aagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttct3960 gacaacgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgt4020 aactcgccttgatcgttgggaaccggagctgaatgaagccataccaaacgacgagcgtga.4080 caccacgatgcctgtagcaatggcaacaacgttgcgcaaactattaactggcgaactact4140 tactctagcttcccggcaacaattaatagactggatggaggcggataaagttgcaggacc4200 acttctgcgctcggcccttccggctggctggtttattgctgataaatctggagccggtga4260 gcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcgt4320 agttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctga4380 gataggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatact4440 ttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttga4500 taatctcatgaccaaaatcccttaacgtgagttttcgttccactgagcgtcagaccccgt4560 agaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgca4620 aacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactct4680 ttttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgta4740 gccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgct4800 aatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactc4860 aagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacaca4920 gcccagcttggagcgaacgacctacaccgaactgagatacctacagcgtgagcattgaga4980 aagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcgg5040 aacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgt5100 cgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggag5160 cctatggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggccttt5220 tgctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctt5280 tgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcga5340 ggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcatta5400 
atgcagctggcacgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaa5460 tgtgagttagctcactcattaggcaccccaggctttacactttatgcttccggctcgtat5520 gttgtgtggaattgtgagcggataacaatttcacacaggaaacagctatgaccatgatta5580 cgaatttaatacgactcactatagggaattcga 5613 <210>

<211>
<212> DNA
<213> vector pSY1 <400>

atcgataagcttttcaattcaattcatcattttttttttattcttttttttgatttcggt 60 ttctttgaaatttttttgattcggtaatctccgaacagaaggaagaacgaaggaaggagc 120 acagacttagattggtatatatacgcatatgtagtgttgaagaaacatgaaattgcccag 180 tattcttaacccaactgcacagaacaaaaacctgcaggaaacgaagataaatcatgtcga 240 aagctacatataaggaacgtgctgctactcatcctagtcctgttgctgccaagctattta 300 atatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtaccaccaagg 360 aattactggagttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtgg 420 atatcttgactgatttttccatggagggcacagttaagccgctaaaggcattatccgcca 480 agtacaattttttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaat 540 tgcagtactctgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacg600 gtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaagg660 aacctagaggccttttgatgttagcagaattgtcatgcaagggctccctatctactggag720 aatatactaagggtactgttgacattgcgaagagcgacaaagattttgttatcggcttta780 ttgctcaaagagacatgggtggaagagatgaaggttacgattggttgattatgacacccg840 gtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatg900 tggtctctacaggatctgacattattattgttggaagaggactatttgcaaagggaaggg960 atgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagat1020 gcggccagcaaaactaaaaaactgtattataagtaaatgcatgtatactaaactcacaaa1080 ttagagcttcaatttaattatatcagttattacccgggaatctcggtcgtaatgattttt1140 ataatgacgaaaaaaaaaaaattggaaagaaaaagctttaatgcggtagtttatcacagt1200 taaattgctaacgcagtcaggcaccgtgtatgaaatctaacaatgcgctcatcgtcatcc1260 tcggcaccgtcaccctggatgctgtaggcataggcttggttatgccggtactgccgggcc1320 tcttgcgggatatcgtccattccgacagcatcgccagtcactatggcgtgctgctagcgc1380 tatatgcgttgatgcaatttctatgcgcacccgttctcggagcactgtccgaccgctttg1440 gccgccgcccagtcctgctcgcttcgctacttggagccactatcgactacgcgatcatgg150Q

cgaccacacccgtcctgtggatcctctacgccggacgcatcgtggccggcatcaccggeg1560 ccacaggtgcggttgctggcccctatatcgccgacatcaccgatggggaagatcgggctc1620 gccacttcgggctcatgagcgcttgtttcggcgtgggtatggtggcaggccccgtggccg1680 ggggactgttgggcgccatctccttgcatgcaccattccttgcggcggcggtgctcaacg1740 gcctcaacctactactgggctgcttcctaatgcaggagtcgcataagggagagcgtcgac1800 cgatgcccttgagagccttcaacccagtcagctccttccggtgggcgcggggcatgacta1860 tcgtcgccgcacttatgactgtcttctttatcatgcaactcgtaggacaggtgccggcag1920 cgctctgggtcattttcggcgaggaccgctttcgctggagcgcgacgatgatcggcctgt1980 cgcttgcggtattcggaatcttgcacgccctcgctcaagccttcgtcactggtcccgcca2040 ccaaacgtttcggcgagaagcaggccattatcgccggcatggcggccgacgcgctgggct2100 acgtcttgctggcgttcgcgacgcgaggctggatggccttccccattatgattcttctcg2160 cttccggcggcatcgggatgcccgcgttgcaggccatgctgtccaggcaggtagatgacg2220 accatcagggacagcttcaaggatcgctcgcggctcttaccagcctaacttcgatcactg2280 gaccgctgatcgtcacggcgatttatgccgcctcggcgagcacatggaacgggttggcat2340 ggattgtaggcgccgccctataccttgtctgcctccccgcgttgcgtcgcggtgcatgga2400 gccgggccacctcgacctgaatggaagccggcggcacctcgctaacggattcaccactcc2460 aagaattggagccaatcaattcttgcggagaactgtgaatgcgcaaaccaacccttggca2520 gaacatatccatCJCgtCCgCCatCt cagccgcacgCggCg'CatCtcgggcagcgt2580 CCag tgggtcctggccacgggtgcgcatgatcgtgctcctgtcgttgaggacccggctaggctg2640 gcggggttgccttactggttagcagaatgaatcaccgatacgcgagcgaacgtgaagcga2700 ctgctgctgcaaaacgtctgcgacctgagcaacaacatgaatggtcttcggtttccgtgt2760 ttcgtaaagtctggaaacgcggaagtcagcgccctgcaccattatgttccggatctgcat2820 cgcaggatgctgctggctaccctgtggaacacctacatctgtattaacgaagcgctggca2880 ttgaccctgagtgatttttctCtggtCCCgCCgCatCCataccgccagttgtttaccctc2940 acaacgttccagtaaccgggcatgttcatcatcagtaacccgtatcgtgagcatcctctc3000 tCgtttCatCggtatcattacccccatgaacagaaattcccccttacacggaggcatcaa3060 gtgaccaaacaggaaaaaaccgcccttaacatggcccgctttatcagaagccagacatta3120 acgcttctggagaaactcaacgagctggacgcggatgaacaggcagacatctgtgaatcg3180 cttcacgaccacgctgatgagctttaccgcagctgcctcgcgcgtttcggtgatgacggt3240 gaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggtgccg3300 
ggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggcgcagcca3360 tgacccagtcacgtagcgatagcggagtgtatactggcttaactatgcggcatcagagca3420 gattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaaggagaaa3480 ataccgcatcaggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcg3540 gctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcagg3600 ggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaa3660 ggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcg3720 acgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccc3780 tggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgc3840 ctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttc3900 ggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccg3960 ctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgcc4020 actggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacaga4080 gttcttgaagtggtggcctaactacggctacactagaaggacagtatttggtatctgcgc4140 tctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaac4200 caccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaagg4260 atctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactc4320 acgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaa4380 ttaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagtta4440 ccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagt4500 tgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccag4560 tgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaacca4620 gccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtc4680 tattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgt4740 tgttgccattgctgcaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcag4800 ctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggt4860 tagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcat4920 ggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttttctgt4980 gactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctc5040 ttgcccggcgtcaacacgggataataccgcgccacatagcagaactttaaaagtgctcat5100 
cattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccag5160 ttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgt5220 ttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacg5280 gaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcagggtta5340 ttgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttcc5400 gcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacatt5460 aacctataaaaaataggcgtatcacgaggccctttcgtcttcaagaattctcatgtttga5520 cagcttatcatcgatccacttgtatatttggatgaatttttgaggaattctgaaccagtc5580 ctaaaacgagtaaataggaccggcaattcttcaagcaataaacaggaataccaattatta5640 aaagataacttagtcagatcgtacaataaagctttgaagaaaaatgcgccttattcaatc5700 tttgcataaaaaaatggcccaaaatctcacattggaagacatttgatgacctcatttctt5760 tcaatgaagggcctaacggagttgactaatgttgtgggaaattggaccgataagcgtgct5820 tctgccgtggccaggacaacgtatactcatcagataacagcaatacctgatcactacttc5880 gcactagtttctcggtactatgcatatgatccaatatcaaaggaaatgatagcattgaag5940 gatgagactaatccaattgaggagtggcagcatatagaacagctaaagggtagtgctgaa6000 ggaagcatacgataccccgcatggaatgggataatatcacaggaggtactagactacctt6060 tcatcctacataaatagacgcatataagtacgcatttaagcataaacacgcactatgccg6120 ttcttctcatgtatatatatatacaggcaacacgcagatataggtgcgacgtgaacagtg6180 agctgtatgtgcgcagctcgcgttgcattttcggaagcgctcgttttcggaaacgctttg6240 aagttcctattccgaagttcctattctctagaaagtataggaacttcagagcgcttttga6300 aaaccaaaagcgctctgaagacgcactttcaaaaaaccaaaaacgcaccggactgtaacg6360 agctactaaaatattgcgaataccgcttccacaaacattgctcaaaagtatctctttgct6420 atatatctctgtgctatatccctatataaccatcccatccacctttcgctccttgaactt6480 gcatctaaactcgacctctacattttttatgtttatctctagtattacctcttagacaaa6540 aaaattgtagtaagaactattcatagagttaatcgaaaacaatacgaaaatgtaaacatt6600 tcctatacgtagtatatagagacaaaatagaagaaaccgttcataattttctgaccaatg6660 aagaatcatcaacgctatcactttctgttcacaaagtatgcgcaatccacatcggtatag6720 aatataatcggggatgcctttatcttgaaaaaatgcacccgcagcttcgctagtaatcag6780 taaacgcgggaagtggagtcaggctttttttatggaagagaaaatagacaccaaagtagc6840 cttcttctaaccttaacggacctacagtgcaaaaagttatcaagagactgcattatagag6900 
cgcacaaaggagaaaaaaagtaatctaagatgctttgttagaaaaatagcgctctcggga6960 tgcatttttgtagaacaaaaaagaagtatagattcttgttggtaaaatagcgctctcgcg7020 ttgcatttctgttctgtaaaaatgcagctcagattctttgtttgaaaaattagcgctctc7080 gcgttgcatttttgttttacaaaaatgaagcacagattcttcgttggtaaaatagcgctt7140 tcgcgttgcatttctgttctgtaaaaatgcagctcagattctttgtttgaaaaattagcg7200 ctctcgcgttgcatttttgttctacaaaatgaagcacagatgcttcgttaacaaagatat7260 gctattgaagtgeaagatggaaacgcagaaaatgaaccggggatgcgacgtgcaagatta7320 cctatgcaatagatgcaatagtttctccaggaaccgaaatacatacattgtcttccgtaa7380 agcgctagactatatattattatacaggttcaaatatactatctgtttcagggaaaactc7440 ccaggttcggatgttcaaaattcaatgatgggtaacaagtacgatcgtaaatctgtaaaa7500 cagtttgtcggatattaggctgtatctcctcaaagcgtattcgaatatcattgagaagct7560 gcattttttttttttttttttttttttttttttttatatatatttcaaggatataccatt7620 gtaatgtctgcccctaagaagatcgtcgttttgccaggtgaccacgttggtcaagaaatc7680 acagccgaagccattaaggttcttaaagctatttctgatgttcgttccaatgtcaagttc7740 gatttcgaaaatcatttaattggtggtgctgctatcgatgctacaggtgtcccacttcca7800 gatgaggcgctggaagcctccaagaaggttgatgccgttttgttaggtgctgtgggtggt7860 cctaaatggggtaccggtagtgttagacctgaacaaggtttactaaaaatccgtaaagaa7920 cttcaattgtacgccaacttaagaccatgtaactttgcatccgactctcttttagactta7980 tctccaatcaagccacaatttgctaaaggtactgacttcgttgttgtcagagaattagtg8040 ggaggtatttactttggtaagagaaaggaagacgatggtgatggtgtcgcttgggatagt8100 gaacaatacaccgttccagaagtgcaaagaatcacaagaatggccgctttcatggcccta8160 caacatgagccaccattgcctatttggtccttggataaagctaatgttttggcctcttca8220 agattatggagaaaaactgtggaggaaaccatcaagaacgaattccctacattgaaggtt8280 caacatcaattgattgattctgccgccatgatcctagttaagaacccaacccacctaaat8340 ggtattataatcaccagcaacatgtttggtgatatcatctccgatgaagcctccgttatc8400 ccaggttccttgggtttgttgccatctgcgtccttggcctctttgccagacaagaacacc8460 gcatttggtttgtacgaaccatgccacggttctgctccagatttgccaaagaataaggtt8520 gaccctatcgccactatcttgtctgctgcaatgatgttgaaattgtcattgaacttgcct8580 gaagaaggtaaggccattgaagatgcagttaaaaaggttttggatgcaggtatcagaact8640 ggtgatttaggtggttccaacagtaccaccgaagtcggtgatgctgtcgccgaagaagtt8700 
aagaaaatccttgcttaaaaagattctctttttttatgatatttgtacaaaaaaaaaaaa8760 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaatgcagcgtcacatcggataataatga8820 tggcagccattgtagaagtgccttttgcatttctagtctctttctcggtctagctagttt8880 tactacatcgcgaagatagaatcttagatcacactgcctttgctgagctggatcatatga8940 gtaacaaaagagtggtaaggcctcgttaaaggacaaggacctgagcggaagtgtatcgta9000 aagtagacggagtatactagtatagtctatagtccgtggaattctaagtgccagctttat9060 aatgtcattctccttactacagacccgcctgaaagtagacacatcatcatcagtaagctt9120 tgacaaaaagcattgagtagctaactcttctatgcaatctatagctgttttataaggcat9180 tcaatggacagattgaggtttttgaaacatactagtgaaattagccttaatcccttctcg9240 aagttaatcatgcattatggtgtaaaaaatgcaactagcgttgctctactttttcccgaa9300 ~

tttccaaatacgcagctggggtgattgctcgatttcgtaacgaaagttttgtttataaaa9360 accgcgaaaaccttctgtaacagatagatttttacagcgctgatatacaatgacatcagc9420 tgtaatggaaaataactgaaatatgaatggcgagagactgcttgcttgtattaagcaatg9480 tattatgcagcacttccaacctatggtgtacgatgaaagtaggtgtgtaatcgagacgac9540 aagggggacttttccagttcctgatcattataagaaatacaaaacgttagcatttgcatt9600 tgttggacatgtactgaatacagacgacacaccggtaattgaaaaagaactggattggcc9660 tgatcctgca ctagtgtaca atacaattgt cgatcgaatc ataaatcacc cagaattatc 9720 acagtttata tcggttgcat ttattagtca gttaaaggcc accatcggag agggtttaga 9780 tattaatgta aaaggcacgc taaaccgcag gggaaagggt atcagaaggc ctaaaggcgt 9840 attttttaga tacatggaat ctccatttgt caatacaaag gtcactgcat tcttctctta 9900 tcttcgagat tataataaaa ttgcctcaga atatcacaat aatactaaat tcattctcac 9960 gttttcatgt caagcatatt gggcatctgg cccaaacttc tccgccttga agaatgttat 10020 ttggtgctcc ataattcatg aatacatttc taagtttgtg gaaagagaac aggataaagg 10080 tcatatagga gatcaggagc taccgcctga agaggaccct tctcgtgaac taaacaatgt 10140 acaacatgaa gtcaatagtt taacggaaca agatgcggag gcggatgaag gattgtgggg 10200 tgaaatagat tcattatgtg aaaaatggca gtctgaagcg gagagtcaaa ctgaggcgga 10260 gataatagcc gacaggataa ttggaaatag ccagaggatg gcgaacctca aaattcgtcg 10320 tacaaagttc aaaagtgtct tgtatcatat actaaaggaa ctaattcaat ctcagggaac 10380 cgtaaaggtt tatcgcggta gtagtttttc acacgattcg ataaagataa gcttacatta 10440 tgaagagcag catattacag ccgtatgggt ctacttgata gtaaaatttg aagagcattg 10500 gaagcctgtt gatgtagagg tcgagtttag atgcaagttc aaggagcgaa aggtggatgg 10560 gtaggttata tagggatata gcacagagat atatagcaaa gagatacttt tgaggcaatg 10620 tttgtggaag cggtattcgc aatattttag tagctcgtta cagtccggtg cgtttttggt 10680 tttttgaaag tgcgtcttca gagcgctttt ggttttcaaa agcgctctga agttcctata 10740 ctttctagag aataggaact tcggaatagg aacttcaaag cgtttccgaa aacgagcgct 10800 tccgaaaatg caacgcgagc tgcgcacata cagctcactg ttcacgtcgc acctatatct 10860 gcgtgttgcc tgtatatata tatacatgag aagaacggca tagtgcgtgt ttatgcttaa 10920 atgcgtactt atatgcgtct atttatgtag gatgaaaggt agtctagtac ctcctgtgat 10980 attatcccat tccatgcggg 
gtatcgtatg cttccttcag cactaccctt tagctgttct 11040 atatgctgcc actcctcaat tggattagtc tcatccttca atgcattcat ttcctttgat 11100 attggatcat accctagaag tattacgtga ttttctgccc cttaccctcg ttgctactct 11160 cctttttttc gtgggaaccg ctttagggcc ctcagtgatg gtgttttgta atttatatgc 11220 tcctcttgca tttgtgtctc tacttcttgt tcgcctggag ggaacttctt catttgtatt 11280 agcatggttc acttcagtcc ttccttccaa ctcactcttt ttttgctgta aacgattctc 11340 tgccgccagt tcattgaaac tattgaatat atcctttaga gattccggga tgaataaatc 11400 acctattaaa gcagcttgac gatctggtgg aactaaagta agcaattggg taacgacgct 11460 tacgagcttc ataacatctt cttccgttgg agctggtggg actaataact gtgtacaatc 11520 catttttctc atgagcattt cggtagctct cttcttgtct ttctcgggca atcttcctat 11580 tattatagca atagatttgt atagttgctt tctattgtct aacagcttgt tattctgtag 11640 catcaaatct atggcagcct gacttgcttc ttgtgaagag agcataccat ttccaatcga 11700 agatacgctg gaatcttctg cgctagaatc aagaccatac ggcctaccgg ttgtgagaga 11760 ttccatgggc cttatgacat atcctggaaa gagtagctca tcagacttac gtttactctc 11820 tatatcaata tctacatcag gagcaatcat ttcaataaac agccgacata catcccagac 11880 gctataagct gtacgtgctt ttaccgtcag attcttggct gtttcaatgt cgtccatttt 11940 ggttttcttt taccagtatt gttcgtttga taatgtattc ttgcttatta cattataaaa 12000 tctgtgcaga tcacatgtca aaacaacttt ttatcacaag atagtaccgc aaaacgaacc 12060 tgcgggccgt ctaaaaatta aggaaaagca gcaaaggtgc atttttaaaa tatgaaatga 12120 agataccgca gtaccaatta ttttcgcagt acaaataatg cgcggccggt gcatttttcg 12180 aaagaacgcg agacaaacag gacaattaaa gttagttttt cgagttagcg tgtttgaata 12240 ctgcaagata caagataaat agagtagttg aaactagata tcaattgcac acaagatcgg 12300 cgctaagcat gccacaattt ggtatattat gtaaaacacc acctaaggtg cttgttcgtc 12360 agtttgtgga aaggtttgaa agaccttcag gtgagaaaat agcattatgt gctgctgaac 12420 taacctattt atgttggatg attacacata acggaacagc aatcaagaga gccacattca 12480 tgagctataa tactatcata agcaattcgc tgagtttcga tattgtcaat aaatcactcc 12540 agtttaaata caagacgcaa aaagcaacaa ttctggaagc ctcattaaag aaattgattc 12600 ctgcttggga atttacaatt attccttact atggacaaaa acatcaatct gatatcactg 12660 
atattgtaag tagtttgcaa ttacagttcg aatcatcgga agaagcagat aagggaaata 12720 gccacagtaa aaaaatgcta aagcacttct aagtgagggt gaaagcatct gggagatcac 12780 tgagaaaata ctaaattcgt ttgagtatac ttcgagattt acaaaaacaa aaactttata 12840 ccaattcctc ttcctagcta ctttcatcaa ttgtggaaga ttcagcgata ttaagaacgt 12900 tgatccgaaa tcatttaaat tagtccaaaa taagtatctg ggagtaataa tccagtgttt 12960 agtgacagag acaaagacaa gcgttagtag gcacatatac ttctttagcg caaggggtag 13020 <210> 44 <211> 15810 <212> DNA
<213> vector pSYIaMFEIsH6a <400> 44 atcgataagc ttttcaattc aattcatcat ttttttttta ttcttttttt tgatttcggt 60 ttctttgaaa tttttttgat tcggtaatct ccgaacagaa ggaagaacga aggaaggagc 120 acagacttag attggtatat atacgcatat gtagtgttga agaaacatga aattgcccag 180 tattcttaac ccaactgcac agaacaaaaa cctgcaggaa acgaagataa atcatgtcga 240 aagctacata taaggaacgt gctgctactc atcctagtcc tgttgctgcc aagctattta 300 atatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtaccaccaagg360 aattactggagttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtgg420 atatcttgactgatttttccatggagggcacagttaagccgctaaaggcattatccgcca480 agtacaattttttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaat540 tgcagtactctgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacg600 gtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaagg660 aacctagaggccttttgatgttagcagaattgtcatgcaagggctccctatctactggag720 aatatactaagggtactgttgacattgcgaagagcgacaaagattttgttatcggcttta780 ttgctcaaagagacatgggtggaagagatgaaggttacgattggttgattatgacacccg840 gtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatg900 tggtctctacaggatctgacattattattgttggaagaggactatttgcaaagggaaggg960 atgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagat1020 gcggccagcaaaactaaaaaactgtattataagtaaatgcatgtatactaaactcacaaa1080 ttagagcttcaatttaattatatcagttattacccgggaatctcggtcgtaatgattttt1140 ataatgacgaaaaaaaaaaaattggaaagaaaaagctttaatgcggtagtttatcacagt1200 taaattgctaacgcagtcaggcaccgtgtatgaaatctaacaatgcgctcatcgtcatcc1260 tcggcaccgtcaccctggatgctgtaggcataggcttggttatgccggtactgccgggcc1320 tcttgcgggatatcgtccattccgacagcatcgccagtcactatggcgtgctgctagcgc1380 tatatgcgttgatgcaatttctatgcgcacccgttctcggagcactgtccgaccgctttg1440 gCCJCCgCCC agtCCtgCtC gcttcgctac ttggagccac tatcgactac gcgatcatgg 1500 cgaccacacc cgtcctgtgg atccttcaat atgcgcacat acgctgttat gttcaaggtc 1560 ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 1620 atcaaatttgcaaatccccagtctgtatctagagcgttgaatcggtgatgcgatttgtta1680 attaaattgatggtgtcaccattaccaggtctagatataccaatggcaaactgagcacaa1740 
caataccagtccggatcaactggcaccatctctcccgtagtctcatctaatttttcttcc1800 ggatgaggttccagatataccgcaacacctttattatggtttccctgagggaataataga1860 atgtcccattcgaaatcaccaattctaaacctgggcgaattgtatttcgggtttgttaac1920 tcgttccagtcaggaatgttccacgtgaagctatcttccagcaaagtctccacttcttca1980 tcaaattgtggagaatactcccaatgctcttatctatgggacttccgggaaacacagtac2040 cgatacttcccaattcgtcttcagagctcattgtttgtttgaagagactaatcaaagaat2100 cgttttctcaaaaaaattaatatcttaactgatagtttgatcaaaggggcaaaacgtagg2160 ggcaaacaaacggaaaaatcgtttctcaaattttctgatgccaagaactctaaccagtct2220 tatctaaaaattgccttatgatccgtctctccggttacagcctgtgtaactgattaatcc2280 tgcctttctaatcaccattctaatgttttaattaagggattttgtcttcattaacggctt2340 tcgctcataaaaatgttatgacgttttgcccgcaggcgggaaaccatccacttcacgaga2400 ctgatctcctctgccggaacaccgggcatctccaacttataagttggagaaataagagaa2460 tttcagattgagagaatgaaaaaaaaaaaccctgaaaaaaaaggttgaaaccagttccct2520 gaaattattcccctacttgactaataagtatataaagacggtaggtattgattgtaattc2580 tgtaaatctatttcttaaac.ttcttaaattctacttttat,agttagtcttttttttagtt2640 ttaaaacaccaagaacttagtttcgaataaacacacataaacaaacaccatgagatttcc2700 ttcaatttttactgcagttttattcgcagcatcctccgcattagctgctccagtcaacac2760 tacaacagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagattt2820 agaaggggatttcgatgttgctgttttgccattttccaacagcacaaataacgggttatt2880 gtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctagataa2940 aaggtatgaggtgcgcaacgtgtccgggatgtaccatgtcacgaacgactgctccaactc3000 aagcattgtgtatgaggcagcggacatgatcatgcacacccccgggtgcgtgccctgcgt3060 tcgggagaacaactcttcccgctgctgggtagcgctcacccccacgctcgcagctaggaa3120 cgccagcgtccccactacgacaatacgacgccacgtcgatttgctcgttggggcggctgc3180 tttctgttccgctatgtacgtgggggatctctgcggatctgtcttcctcgtctcccagct3240 gttcaccatctcgcctcgccggcatgagacggtgcaggactgcaattgctcaatctatcc3300 cggccacataacgggtcaccgtatggcttgggatatgatgatgaactggcaccaccacca3360 tcaccattaaagatctcgacttggttgaacacgttgccaaggcttaagtgaatttacttt3420 aaagtcttgcatttaaataaattttctttttatagctttatgacttagtttcaatttata3480 tactattttaatgacattttcgattcattgattgaaagctttgtgttttttcttgatgcg3540 
ctattgcattgttcttgtctttttcgccacatgtaatatctgtagtagatacctgataca3600 ttgtggatgctgagtgaaattttagttaataatggaggcgctcttaataattttggggat3660 attggcttttttttttaaagtttacaaatgaattttttccgccaggataacgattctgaa3720 gttactcttagcgttcctatcggtacagccatcaaatcatgcctataaatcatgcctata3780 tttgcgtgcagtcagtatcatctacatgaaaaaaactcccgcaatttcttatagaatacg3840 ttgaaaattaaatgtacgcgccaagataagataacatatatctagctagatgcagtaata3900 tacacagattcccgcggacgtgggaaggaaaaaattagataacaaaatctgagtgatatg3960 gaaattccgctgtatagctcatatctttcccttcaacaccagaaatgtaaaaatcttgtt4020 acgaaggatctttttgctaatgtttctcgctcaatcctcatttcttccctacgaagagtc4080 aaatctacttgttttctgccggtatcaagatccatatcttctagtttcaccatcaaagtc4140 caatttctagtatacagtttatgtcccaacgtaacagacaatcaaaattggaaaggataa4200 gtatccttcaaagaatgattctgcgctggctcctgaaccgcctaatgggaacagagaagt4260 ccaaaacgatgctataagaaccagaaataaaacgataaaaccataccaggatcctctacg4320 ccggacgcatcgtggccggcatcaccggcgccacaggtgcggttgctggcccctatatcg4380 ccgacatcaccgatggggaagatcgggctcgccacttcgggctcatgagcgcttgtttcg4440 gcgtgggtatggtggcaggccccgtggccgggggactgttgggcgccatctccttgcatg4500 caccattccttgcggcggcggtgctcaacggcctcaacctactactgggctgcttcctaa4560 tgcaggagtcgcataagggagagcgtcgaccgatgcccttgagagccttcaacccagtca4620 gctccttccggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttcttta4680 tcatgcaactcgtaggacaggtgccggcagcgctctgggtcattttcggcgaggaccgct4740 ttcgctggagcgcgacgatgatcggcctgtcgcttgcggtattcggaatcttgcacgccc4800 tcgctcaagccttcgtcactggtcccgccaccaaacgtttcggcgagaagcaggccatta4860 tcgccggcatggcggccgacgcgctgggctacgtcttgctggcgttcgcgacgcgaggct4920 ggatggccttccccattatgattcttctcgcttccggcggcatcgggatgcccgcgttgc4980 aggccatgctgtccaggcaggtagatgacgaccatcagggacagcttcaaggatcgctcg5040 cggctcttaccagcctaacttcgatcactggaccgctgatcgtcacggcgatttatgccg5100 cctcggcgagcacatggaacgggttggcatggattgtaggcgccgccctataccttgtct5160 gcctccccgcgttgcgtcgcggtgcatggagccgggccacctcgacctgaatggaagccg5220 gcggcacctcgctaacggattcaccactccaagaattggagccaatcaattcttgcggag5280 aactgtgaatgcgcaaaccaacccttggcagaacatatccatcgcgtccgccatctccag5340 
cagccgcacgcggcgcatctcgggcagcgttgggtcctggccacgggtgcgcatgatcgt5400 gctcctgtcgttgaggacccggctaggctggcggggttgccttactggttagcagaatga5460 atcaccgatacgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacctgagc5520 aacaacatgaatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagtcagc5580 gccctgcaccattatgttccggatctgcatcgcaggatgctgctggctaccctgtggaac5640 acctacatctgtattaacgaagcgctggcattgaccctgagtgatttttctctggtcccg5700 ccgcatccataccgccagttgtttaccctcacaacgttccagtaaccgggcatgttcatc5760 atcagtaacccgtatcgtgagcatcctctctcgtttcatcggtatcattacccccatgaa5820 cagaaattcccccttacacggaggcatcaagtgaccaaacaggaaaaaaccgcccttaac5880 atggcccgctttatcagaagccagacattaacgcttctggagaaactcaacgagctggac5940 gcggatgaacaggcagacatctgtgaatcgcttcacgaccacgctgatgagctttaccgc6000 agctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggag6060 acggtcacagcttgtctgtaagcggtgccgggagcagacaagcccgtcagggcgcgtcag6120 cgggtgttggcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcggagtgt6180 atactggcttaactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtg6240 tgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctc6300 gctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaa6360 ggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaa6420 aggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggct6480 ccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgac6540 aggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttcc6600 gaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttc6660 tcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctg6720 tgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttga6780 gtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattag6840 cagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggcta6900 cactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaag6960 agttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttg7020 caagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctac7080 ggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatc7140 
aaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaag7200 tatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctc7260 agcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactac7320 gatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctc7380 accggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtgg7440 tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag 7500 tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctgcaggca tcgtggtgtc 7560 acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac 7620 atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag 7680 aagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttac7740 tgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctg7800 agaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgggataataccgc7860 gccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaact7920 ctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactg7980 atcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaa8040 tgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttt8100 tcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatg8160 tatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctga8220 cgtctaagaaaccattattatcatgacattaacctataaaaaataggcgtatcacgaggc8280 cctttcgtcttcaagaattctcatgtttgacagcttatcatcgatccacttgtatatttg8340 gatgaatttttgaggaattctgaaccagtcctaaaacgagtaaataggaccggcaattct8400 tcaagcaataaacaggaataccaattattaaaagataacttagtcagatcgtacaataaa8460 gctttgaagaaaaatgcgccttattcaatctttgcataaaaaaatggcccaaaatctcac8520 attggaagacatttgatgacctcatttctttcaatgaagggcctaacggagttgactaat8580 gttgtgggaaattggaccgataagcgtgcttctgccgtggccaggacaacgtatactcat8640 cagataacagcaatacctgatcactacttcgcactagtttctcggtactatgcatatgat8700 ccaatatcaaaggaaatgatagcattgaaggatgagactaatccaattgaggagtggcag8760 catatagaacagctaaagggtagtgctgaaggaagcatacgataccccgcatggaatggg8820 ataatatcacaggaggtactagactacctttcatcctacataaatagacgcatataagta8880 cgcatttaagcataaacacgcactatgccgttcttctcatgtatatatatatacaggcaa8940 
cacgcagatataggtgcgacgtgaacagtgagctgtatgtgcgcagctcgcgttgcattt9000 tcggaagcgctcgttttcggaaacgctttgaagttcctattccgaagttcctattctcta9060 gaaagtatag,gaacttcagagcgcttttgaaaaccaaaagcgctctgaagacgcactttc9120 aaaaaaccaaaaacgcaccggactgtaacgagctactaaaatattgcgaataccgcttcc9180 acaaacattgctcaaaagtatctctttgctatatatctctgtgctatatccctatataac9240 catcccatcc acctttcgct ccttgaactt gcatctaaac tcgacctcta cattttttat 9300 gtttatctct agtattacct cttagacaaa aaaattgtag taagaactat tcatagagtt 9360 aatcgaaaac aatacgaaaa tgtaaacatt tcctatacgt agtatataga gacaaaatag 9420 aagaaaccgt tcataatttt ctgaccaatg aagaatcatc aacgctatca ctttctgttc 9480 acaaagtatg cgcaatccac atcggtatag aatataatcg gggatgcctt tatcttgaaa 9540 aaatgcaccc gcagcttcgc tagtaatcag taaacgcggg aagtggagtc aggctttttt 9600 tatggaagag aaaatagaca ccaaagtagc cttcttctaa ccttaacgga cctacagtgc 9660 aaaaagttat caagagactg cattatagag cgcacaaagg agaaaaaaag taatctaaga 9720 tgctttgtta gaaaaatagc gctctcggga tgcatttttg tagaacaaaa aagaagtata 9780 gattcttgtt ggtaaaatag cgctctcgcg ttgcatttct gttctgtaaa aatgcagctc 9840 agattctttg tttgaaaaat tagcgctctc gcgttgcatt tttgttttac aaaaatgaag 9900 cacagattct tcgttggtaa aatagcgctt tcgcgttgca tttctgttct gtaaaaatgc 9960 agctcagatt ctttgtttga aaaattagcg ctctcgcgtt gcatttttgt tctacaaaat 10020 gaagcacaga tgcttcgtta acaaagatat gctattgaag tgcaagatgg aaacgcagaa 10080 aatgaaccgg ggatgcgacg tgcaagatta cctatgcaat agatgcaata gtttctccag 10140 gaaccgaaat acatacattg tcttccgtaa agcgctagac tatatattat tatacaggtt 10200 caaatatact atctgtttca gggaaaactc ccaggttcgg atgttcaaaa ttcaatgatg 10260 ggtaacaagt acgatcgtaa atctgtaaaa cagtttgtcg gatattaggc tgtatctcct 10320 caaagcgtat tcgaatatca ttgagaagct gcattttttt tttttttttt tttttttttt 10380 tttttatata tatttcaagg atataccatt gtaatgtctg cccctaagaa gatcgtcgtt 10440 ttgccaggtg accacgttgg tcaagaaatc acagccgaag ccattaaggt tcttaaagct 10500 atttctgatg ttcgttccaa tgtcaagttc gatttcgaaa atcatttaat tggtggtgct 10560 gctatcgatg ctacaggtgt cccacttcca gatgaggcgc tggaagcctc caagaaggtt 10620 gatgccgttt tgttaggtgc 
tgtgggtggt cctaaatggg gtaccggtag tgttagacct 10680 gaacaaggtt tactaaaaat ccgtaaagaa cttcaattgt acgccaactt aagaccatgt 10740 aactttgcat ccgactctct tttagactta tctccaatca agccacaatt tgctaaaggt 10800 actgacttcg ttgttgtcag agaattagtg ggaggtattt actttggtaa gagaaaggaa 10860 gacgatggtg atggtgtcgc ttgggatagt gaacaataca ccgttccaga agtgcaaaga 10920 atcacaagaa tggccgcttt catggcccta caacatgagc caccattgcc tatttggtcc 10980 ttggataaag ctaatgtttt ggcctcttca agattatgga gaaaaactgt ggaggaaacc 11040 atcaagaacg aattccctac attgaaggtt caacatcaat tgattgattc tgccgccatg 11100 atcctagtta agaacccaac ccacctaaat ggtattataa tcaccagcaa catgtttggt 11160 gatatcatct ccgatgaagc ctccgttatc ccaggttcct tgggtttgtt gccatctgcg 11220 tccttggcct ctttgccaga caagaacacc gcatttggtt tgtacgaacc atgccacggt 11280 tctgctccag atttgccaaa gaataaggtt gaccctatcg ccactatctt gtctgctgca 11340 atgatgttga aattgtcatt gaacttgcct gaagaaggta aggccattga agatgcagtt 11400 aaaaaggttt tggatgcagg tatcagaact ggtgatttag gtggttccaa cagtaccacc 11460 gaagtcggtg atgctgtcgc cgaagaagtt aagaaaatcc ttgcttaaaa agattctctt 11520 tttttatgat atttgtacaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 11580 aaaatgcagc gtcacatcgg ataataatga tggcagccat tgtagaagtg ccttttgcat 11640 ttctagtctc tttctcggtc tagctagttt tactacatcg cgaagataga atcttagatc 11700 acactgcctt tgctgagctg gatcatatga gtaacaaaag agtggtaagg cctcgttaaa 11760 ggacaaggac ctgagcggaa gtgtatcgta aagtagacgg agtatactag tatagtctat 11820 agtccgtgga attctaagtg ccagctttat aatgtcattc tccttactac agacccgcct 11880 gaaagtagac acatcatcat cagtaagctt tgacaaaaag cattgagtag ctaactcttc 11940 tatgcaatct atagctgttt tataaggcat tcaatggaca gattgaggtt tttgaaacat 12000 actagtgaaa ttagccttaa tcccttctcg aagttaatca tgcattatgg tgtaaaaaat 12060 gcaactcgcg ttgctctact ttttcccgaa tttccaaata cgcagctggg gtgattgctc 12120 gatttcgtaa cgaaagtttt gtttataaaa accgcgaaaa ccttctgtaa cagatagatt 12180 tttacagcgc tgatatacaa tgacatcagc tgtaatggaa aataactgaa atatgaatgg 12240 cgagagactg cttgcttgta ttaagcaatg tattatgcag cacttccaac ctatggtgta 12300 
cgatgaaagt aggtgtgtaa tcgagacgac aagggggact tttccagttc ctgatcatta 12360 taagaaatac aaaacgttag catttgcatt tgttggacat gtactgaata cagacgacac 12420 accggtaatt gaaaaagaac tggattggcc tgatcctgca ctagtgtaca atacaattgt 12480 cgatcgaatc ataaatcacc cagaattatc acagtttata tcggttgcat ttattagtca 12540 gttaaaggcc accatcggag agggtttaga tattaatgta aaaggcacgc taaaccgcag 12600 gggaaagggt atcagaaggc ctaaaggcgt attttttaga tacatggaat ctccatttgt 12660 caatacaaag gtcactgcat tcttctctta tcttcgagat tataataaaa ttgcctcaga 12720 atatcacaat aatactaaat tcattctcac gttttcatgt caagcatatt gggcatctgg 12780 cccaaacttc tccgccttga agaatgttat ttggtgctcc ataattcatg aatacatttc 12840 taagtttgtg gaaagagaac aggataaagg tcatatagga gatcaggagc taccgcctga 12900 agaggaccct tctcgtgaac taaacaatgt acaacatgaa gtcaatagtt taacggaaca 12960 agatgcggag gcggatgaag gattgtgggg tgaaatagat tcattatgtg aaaaatggca 13020 gtctgaagcg gagagtcaaa ctgaggcgga gataatagcc gacaggataa ttggaaatag 13080 ccagaggatg gcgaacctca aaattcgtcg tacaaagttc aaaagtgtct tgtatcatat 13140 actaaaggaa ctaattcaat ctcagggaac cgtaaaggtt tatcgcggta gtagtttttc 13200 acacgattcg ataaagataa gcttacatta tgaagagcag catattacag ccgtatgggt 13260 ctacttgata gtaaaatttg aagagcattg gaagcctgtt gatgtagagg tcgagtttag 13320 atgcaagttc aaggagcgaa aggtggatgg gtaggttata tagggatata gcacagagat 13380 atatagcaaa gagatacttt tgaggcaatg tttgtggaag cggtattcgc aatattttag 13440 tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca gagcgctttt 13500 ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact tcggaatagg 13560 aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc tgcgcacata 13620 cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata tatacatgag 13680 aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct atttatgtag 13740 gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg gtatcgtatg 13800 cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat tggattagtc 13860 tcatccttca atgcattcat ttcctttgat attggatcat accctagaag tattacgtga 13920 ttttctgccc cttaccctcg ttgctactct cctttttttc gtgggaaccg 
ctttagggcc 13980 ctcagtgatg gtgttttgta atttatatgc tcctcttgca tttgtgtctc tacttcttgt 14040 tcgcctggag ggaacttctt catttgtatt agcatggttc acttcagtcc ttccttccaa 14100 ctcactcttt ttttgctgta aacgattctc tgccgccagt tcattgaaac tattgaatat 14160 atcctttaga gattccggga tgaataaatc acctattaaa gcagcttgac gatctggtgg 14220 aactaaagta agcaattggg taacgacgct tacgagcttc ataacatctt cttccgttgg 14280 agctggtggg actaataact gtgtacaatc catttttctc atgagcattt cggtagctct 14340 cttcttgtct ttctcgggca atcttcctat tattatagca atagatttgt atagttgctt 14400 tctattgtct aacagcttgt tattctgtag catcaaatct atggcagcct gacttgcttc 14460 ttgtgaagag agcataccat ttccaatcga agatacgctg gaatcttctg cgctagaatc 14520 aagaccatac ggcctaccgg ttgtgagaga ttccatgggc cttatgacat atcctggaaa 14580 gagtagctca tcagacttac gtttactctc tatatcaata tctacatcag gagcaatcat 14640 ttcaataaac agccgacata catcccagac gctataagct gtacgtgctt ttaccgtcag 14700 attcttggct gtttcaatgt cgtccatttt ggttttcttt taccagtatt gttcgtttga 14760 taatgtattc ttgcttatta cattataaaa tctgtgcaga tcacatgtca aaacaacttt 14820 ttatcacaag atagtaccgc aaaacgaacc tgcgggccgt ctaaaaatta aggaaaagca 14880 gcaaaggtgc atttttaaaa tatgaaatga agataccgca gtaccaatta ttttcgcagt 14940 acaaataatg cgcggccggt gcatttttcg aaagaacgcg agacaaacag gacaattaaa 15000 gttagttttt cgagttagcg tgtttgaata ctgcaagata caagataaat agagtagttg 15060 aaactagata tcaattgcac acaagatcgg cgctaagcat gccacaattt ggtatattat 15120 gtaaaacacc acctaaggtg cttgttcgtc agtttgtgga aaggtttgaa agaccttcag 15180 gtgagaaaat agcattatgt gctgctgaac taacctattt atgttggatg attacacata 15240 acggaacagc aatcaagaga gccacattca tgagctataa tactatcata agcaattcgc 15300 tgagtttcga tattgtcaat aaatcactcc agtttaaata caagacgcaa aaagcaacaa 15360 ttctggaagc ctcattaaag aaattgattc ctgcttggga atttacaatt attccttact 15420 atggacaaaa acatcaatct gatatcactg atattgtaag tagtttgcaa ttacagttcg 15480 aatcatcgga agaagcagat aagggaaata gccacagtaa aaaaatgcta aagcacttct 15540 aagtgagggt gaaagcatct gggagatcac tgagaaaata ctaaattcgt ttgagtatac 15600 ttcgagattt acaaaaacaa aaactttata 
ccaattcctc ttcctagcta ctttcatcaa 15660 ttgtggaaga ttcagcgata ttaagaacgt tgatccgaaa tcatttaaat tagtccaaaa 15720 taagtatctg ggagtaataa tccagtgttt agtgacagag acaaagacaa gcgttagtag 15780 gcacatatac ttctttagcg caaggggtag 15810 <210>

<211> <212> DNA
<213> vector pBKS-E2sH6 <400>

cacctaaattgtaagcgttaatattttgttaaaattcgcgttaaatttttgttaaatcag60 ctcattttttaaccaataggccgaaatcggcaaaatcccttataaatcaaaagaatagac120 cgagatagggttgagtgttgttccagtttggaacaagagtccactattaaagaacgtgga180 ctccaacgtcaaagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatc240 accctaatcaagttttttggggtcgaggtgccgtaaagcactaaatcggaaccctaaagg300 gagcccccgatttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaa360 gaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaac420 caccacacccgccgcgcttaatgcgccgctacagggcgcgtcccattcgccattcaggct480 gcgcaactgttgggaagggcgatcggtgcgggcctcttcgetattacgccagctggcgaa540 agggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacg600 ttgtaaaacgacggccagtgaattgtaatacgactcactatagggcgaattgggtaccgg660 gccccccctcgaggtcgacggtatcgataagcttgcatgcctgcagttaattaactatta720 gtgatggtggtgatggtgtctgccctcgatcacctgccactctgttgtagacagcagcag780 cgggctaagctctgatctatccctgtcctccaagtcacaacgctctcctcgagtccaatt840 gcatgcggcttcgaacctgtgctccacgccccccacgtacatcctaaccttgaagatggt900 gaagttgacagtgcaggggtagtgccagagcctatatgggtaatgaaccatacacctagg960 tgtcagccagggcccagaaccgcatctggcgtaggtggcctcggggtgcttccgaaaaca1020 gtcagtggggcaggtcaaggtgttgttgccggcccccccgatgttgcacggggggccccc1080 acacgtcttggtgaacccagtgccattcatccatgtacagccgaaccagttgcctcgcgg1140 cggccgcgtgttgttgagaatcagcacatccgagtcgttcgccccccagttatacgtggg1200 gacaccaaaccgatcggtcgtccccaccacaacagggctcggggtgaagcaatacactgg1260 accgcacacctgagacgcgggtacaataccacacggtcgaggcgegtagtgccagcagta1320 gggcctctggtccgagctgttaggctcagtgtaagtgaggggaccccacccctgagcgaa1380 cttgtcgatggagcgacagctggccaagcgctctgggcatccagacgagttgaatttgtg1440 tttgtagaatagtgcggcaaagaaccctgtttggagggagtcgttgcagttcagggcagt1500 cctgttgatgtgccaactgccgttggtgtttacgagctggattttctgagccgacccggg1560 gctaaagagggacacaaggcccctggtatcggaggctgctgcccctcctgacacgcgggt1620 atggtaccgggccecccctcgaggtcgacggtatcgataagcttgatatcgaattcctgc1680 agcccgggggatccactagttctagagcggccgccaccgcggtggagctccagcttttgt1740 tccctttagtgagggttaatttcgagcttggcgtaatcatggtcatagctgtttcctgtg1800 tgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaa1860 
gcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgct1920 ttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggaga1980 ggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtc2040 gttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaa2100 tcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgt2160 aaaaaggcCgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaa2220 aatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgttt2280 ccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctg2340 tccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctc2400 agttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagccc2460 gaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgactta2520 tcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgct2580 acagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttggtatc2640 tgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaa2700 caaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaa2760 aaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaa2820 aactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatcctt2880 ttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagta~aacttggtctgac2940 agttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatcc3000 atagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggc3060 cccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaata3120 aaccagccagccggaagggc'cgagcgcagaagtggtcctgcaactttatccgcctccatc3180 cagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgc3240 aacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttca3300 ttcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaa3360 gcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatca3420 ctcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttt3480 tctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagt3540 tgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtg3600 ctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgaga3660 
tccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcacc3720 agcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcg3780 acacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcag3840 ggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggg3900 gttccgcgcacatttccccgaaaagtgc 3928 <210> 46 <211> 6104 <212> DNA
<213> vector pYIGSHCCL-22aH6 <400> 46 agcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggc60 acgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagc120 tcactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaa180 ttgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaatttaata240 cgactcactatagggaattcgaggatccttcaatatgcgcacatacgctgttatgttcaa300 ggtcccttcgtttaagaacgaaagcggtcttccttttgagggatgtttcaagttgttcaa360 atctatcaaatttgcaaatccccagtctgtatctagagcgttgaatcggtgatgcgattt420 gttaattaaattgatggtgtcaccattaccaggtctagatataccaatggcaaactgagc480 acaacaataccagtccggatcaactggcaccatctctcccgtagtctcatctaatttttc540 ttccggatgaggttccagatataccgcaacacctttattatggtttccctgagggaataa600 tagaatgtcccattcgaaatcaccaattctaaacctgggcgaattgtatttcgggtttgt660 taactcgttccagtcaggaatgttccacgtgaagctatcttccagcaaagtctccacttc720 ttcatcaaattgtggagaatactcccaatgctcttatctatgggacttccgggaaacaca780 gtaccgatacttcccaattcgtcttcagagctcattgtttgtttgaagagactaatcaaa840 gaatcgttttctcaaaaaaattaatatcttaactgatagtttgatcaaaggggcaaaacg900 taggggcaaacaaacggaaaaatcgtttctcaaattttctgatgccaagaactctaacca960 gtcttatctaaaaattgccttatgatccgtctctccggttacagcctgtgtaactgatta1020 atcctgcctttctaatcaccattctaatgttttaattaagggattttgtcttcattaacg1080 gctttcgctcataaaaatgttatgacgttttgcccgcaggcgggaaaccatccacttcac1140 gagactgatctcctctgccggaacaccgggcatctccaacttataagttggagaaataag1200 agaatttcagattgagagaatgaaaaaaaaaaaccctgaaaaaaaaggttgaaaccagtt1260 ccctgaaattattcccctacttgactaataagtatataaagacggtaggtattgattgta1320 attctgtaaatctatttcttaaacttcttaaattctacttttatagttagtctttttttt1380 agttttaaaacaccaagaacttagtttcgaataaacacacataaacaaacaccatgagat1440 ttccttcaatttttactgcagttttattcgcagcatcctccgcattagctgctccagtca1500 acactacaacagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcag1560 atttagaaggggatttcgatgttgctgttttgccattttccaacagcacaaataacgggt1620 tattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctag1680 ataaaaggcatacccgcgtgtcaggaggggcagcagcctccgataccaggggccttgtgt1740 ccctctttagccccgggtcggctcagaaaatccagctcgtaaacaccaacggcagttggc1800 
acatcaacaggactgccctgaactgcaacgactccctccaaacagggttctttgccgcac1860 tattctacaaacacaaattcaactcgtctggatgcccagagcgcttggccagctgtcgct1920 ccatcgacaagttcgctcaggggtggggtcccctcacttacactgagcctaacagctcgg1980 accagaggccctactgctggcactacgcgcctcgaccgtgtggtattgtacccgcgtctc2040 aggtgtgcggtccagtgtattgcttcaccccgagccctgttgtggtggggacgaccgatc2100 ggtttggtgtccccacgtataactggggggcgaacgactcggatgtgctgattctcaaca2160 acacgcggccgccgcgaggcaactggttcggctgtacatggatgaatggcactgggttca2220 ccaagacgtgtgggggccccccgtgcaacatcgggggggccggcaacaacaccttgacct2280 gccccactgactgttttcggaagcaccccgaggccacttacgccagatgcggttctgggc2340 cctggctgacacctaggtgtatggttcattacccatataggctctggcactacccctgca2400 ctgtcaacttcaccatcttcaaggttaggatgtacgtggggggcgtggagcacaggttcg2460 aagccgcatgcaattggactcgaggagagcgttgtgacttggaggacagggatagatcag2520 agc.ttagctcgctgctgctgtctacaacagagtggcaggtgatcgagggcagacaccatc2580 accaccatcactaatagttaattaacgatctcgacttggttgaacacgttgccaaggctt2640 aagtgaatttactttaaagtcttgcatttaaataaattttctttttatagctttatgact2700 tagtttcaatttatatactattttaatgacattttcgattcattgattgaaagctttgtg2760 ttttttcttgatgcgctattgcattgttcttgtctttttcgccacatgtaatatctgtag2820 tagatacctgatacattgtggatgctgagtgaaattttagttaataatggaggcgctctt2880 aataattttggggatattggcttttttttttaaagtttacaaatgaattttttccgccag2940 gataacgattctgaagttactcttagcgttcctatcggtacagccatcaaatcatgccta3000 taaatcatgcctatatttgcgtgcagtcagtatcatctacatgaaaaaaactcccgcaat3060 ttcttatagaatacgttgaaaattaaatgtacgcgccaagataagataacatatatctag3120 ctagatgcagtaatatacacagattcccgcggacgtgggaaggaaaaaattagataacaa3180 aatctgagtgatatggaaattccgctgtatagctcatatctttcccttcaacaccagaaa3240 tgtaaaaatcttgttacgaaggatctttttgctaatgtttctcgctcaatcctcatttct3300 tccctacgaagagtcaaatctacttgttttctgccggtatcaagatccatatcttctagt3360 ttcaccatcaaagtccaatttctagtatacagtttatgtcccaacgtaacagacaatcaa3420 aattggaaaggataagtatccttcaaagaatgattctgcgctggctcctgaaccgcctaa3480 tgggaacagagaagtccaaaacgatgctataagaaccagaaataaaacgataaaaccata3540 ccaggatccaagcttggcactggccgtcgttttacaacgtcgtgactgggaaaaccctgg3600 
cgttacccaacttaatcgccttgcagcacatccccctttcgccagctggcgtaatagcga3660 agaggcccgcaccgatcgcccttcccaacagttgcgcagcctgaatggcgaatgggaaat3720 tgtaaacgttaatattttgttaaaattcgcgttaaatttttgttaaatcagctcattttt3780 taaccaataggccgaaatcggcaaaatcccttataaatcaaaagaatagaccgagatagg3840 gttgagtgttgttccagtttggaacaagagtccactattaaagaacgtggactccaacgt3900 caaagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatcaccctaatc3960 aagttttttggggtcgaggtgccgtaaagcactaaatcggaaccctaaagggagcccccg4020 atttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaa4080 aggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaaccaccacacc4140 cgccgcgcttaatgcgccgctacagggcgcgtcaggtggcacttttcggggaaatgtgcg4200 cggaacccctatttgtttatttttctaaatacattcaaatatgtatccgctcatgagaca4260 ataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagtattcaacattt4320 CCgtgtCgCCCttattCCCttttttgcggcattttgccttcctgtttttgCtCaCCCaga4380 aacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcga4440 actggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaat4500 gatgagcacttttaaagttctgctatgtggcgcggtattatcccgtattgacgccgggca4560 agagcaactcggtcgccgcatacactattctcagaatgacttggttgagtactcaccagt4620 cacagaaaagcatcttacggatggcatgacagtaagagaattatgcagtgctgccataac4680 catgagtgataacactgcggccaacttacttctgacaacgatcggaggaccgaaggagct4740 aaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaaccgga4800 gctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatggcaac4860 aacgttgcgcaaactattaactggcgaactacttactctagcttcccggcaacaattaat4920 agactggatggaggcggataaagttgcaggaccacttctgcgctcggcccttccggctgg4980 ctggtttattgctgataaatctggagccggtgagcgtgggtctcgcggtatcattgcagc5040 actggggccagatggtaagccctcccgtatcgtagttatctacacgacggggagtcaggc5100 aactatggatgaacgaaatagacagatcgctgagataggtgcctcactgattaagcattg5160 gtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcattttta5220 atttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacg5280 tgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgaga5340 tcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggt5400 
ggtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcag5460 agcgcagataccaaatactgtccttctagtgtagccgtagttaggccaccacttcaagaa5520 ctctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgccag5580 tggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgca5640 gcggtcgggctgaacggggggttcgtgcacacagcccagcttggagcgaacgacctacac5700 cgaactgagatacctacagcgtgagcattgagaaagcgccacgcttcccgaagggagaaa5760 ggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttcc5820 agggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcg5880 tcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggc5940 ctttttacggttcctggccttttgctggccttttgCtCdCatgttCtttCCtgCgttatC6000 ccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcag6060 ccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaag 6104 <210>

<211> <212> DNA <213> vector pYYZGSE2H6 <400>

atcgataagcttttcaattcaattcatcattttttttttattcttttttttgatttcggt60 ttctttgaaatttttttgattcggtaatctccgaacagaaggaagaacgaaggaaggagc120 acagacttagattggtatatatacgcatatgtagtgttgaagaaacatgaaattgcccag180 tattcttaacccaactgcacagaacaaaaacctgcaggaaacgaagataaatcatgtcga240 aagctacatataaggaacgtgctgctactcatcctagtcctgttgctgccaagctattta300 atatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtaccaccaagg360 aattactggagttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtgg420 atatcttgactgatttttccatggagggcacagttaagccgctaaaggcattatccgcca480 agtacaattttttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaat540 tgcagtactctgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacg600 gtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaagg660 aacctagaggccttttgatgttagcagaattgtcatgcaagggctccctatctactggag720 aatatactaagggtactgttgacattgcgaagagcgacaaagattttgttatcggcttta780 ttgctcaaagagacatgggtggaagagatgaaggttacgattggttgattatgacacccg840 gtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatg900 tggtctctacaggatctgacattattattgttggaagaggactatttgcaaagggaaggg960 atgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagat1020 gcggccagca aaactaaaaa actgtattat aagtaaatgc atgtatacta aactcacaaa 1080 ttagagcttc aatttaatta tatcagttat tacccgggaa tctcggtcgt aatgattttt 1140 ataatgacga aaaaaaaaaa attggaaaga aaaagcttta atgcggtagt ttatcacagt 1200 taaattgcta acgcagtcag gcaccgtgta tgaaatctaa caatgcgctc atcgtcatcc 1260 tcggcaccgtcaccctggatgctgtaggcataggcttggttatgccggtactgccgggcc1320 tcttgcgggatatcgtccattccgacagcatcgccagtcactatggcgtgctgctagcgc1380 tatatgcgttgatgcaatttctatgcgcacccgttctcggagcactgtccgaccgctttg1440 gCCg'CCg'CCCagtCCtgCtCgcttcgetacttggagccactatcgactacgegatcatgg1500 CgaCCaCa.CCCgtCCtgtggatccttcaatatgcgcacatacgctgttatgttcaaggtc1560 ccttcgtttaagaacgaaagcggtcttccttttgagggatgtttcaagttgttcaaatct1620 atcaaatttgcaaatccccagtctgtatctagagcgttgaatcggtgatgcgatttgtta1680 attaaattgatggtgtcaccattaccaggtctagatataccaatggcaaactgagcacaa1740 caataccagtccggatcaactggcaccatctctcccgtagtctcatctaatttttcttcc1800 
ggatgaggttccagatataccgcaacacctttattatggtttccctgagggaataataga1860 atgtcccattcgaaatcaccaattctaaacctgggcgaattgtatttcgggtttgttaac1920 tcgttccagtcaggaatgttccacgtgaagctatcttccagcaaag~tctccacttcttca1980 tcaaattgtggagaatactcccaatgctcttatctatgggacttccgggaaacacagtac2040 cgatacttcccaattcgtcttcagagctcattgtttgtttgaagagactaatcaaagaat2100 cgttttctcaaaaaaattaatatcttaactgatagtttgatcaaaggggcaaaacgtagg2160 ggcaaacaaacggaaaaatcgtttctcaaattttctgatgccaagaactctaaccagtct2220 tatctaaaaattgccttatgatccgtctctccggttacagcctgtgtaactgattaatcc2280 tgcctttctaatcaccattctaatgttttaattaagggattttgtcttcattaacggctt2340 tcgctcataaaaatgttatgacgttttgcccgcaggcgggaaaccatccacttcacgaga2400 ctgatctcctctgccggaacaccgggcatctccaacttataagttggagaaataagagaa2460 tttcagattgagagaatgaaaaaaaaaaaccctgaaaaaaaaggttgaaaccagttccct2520 gaaattattcccctacttgactaataagtatataaagacggtaggtattgattgtaattc2580 tgtaaatctatttcttaaacttcttaaattctacttttatagttagtcttttttttagtt2640 ttaaaacaccaagaacttagtttcgaataaacacacataaacaaacaccatgagatttcc2700 ttcaatttttactgcagttttattcgcagcatcctccgCattagctgctccagtcaacac2760 tacaacagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagattt2820 agaaggggatttcgatgttgctgttttgccattttccaacagcacaaataacgggttatt2880 gtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctagataa2940 aaggcatacccgcgtgtcaggaggggcagcagcctccgataccaggggccttgtgtccct3000 ctttagccccgggtcggctcagaaaatccagctcgtaaacaccaacggcagttggcacat3060 caacaggactgccctgaactgcaacgactccctccaaacagggttctttgccgcactatt3120 ctacaaacacaaattcaactcgtctggatgcccagagcgcttggccagctgtcgctccat3180 cgacaagttcgctcaggggtggggtcccctcacttacactgagcctaacagctcggacca3240 gaggccctactgctggcactacgcgcctcgaccgtgtggtattgtacccgcgtctcaggt3300 gtgcggtccagtgtattgcttcaccccgagccctgttgtggtggggacga~ccgatcggtt3360 tggtgtccccacgtataactggggggcgaacgactcggatgtgctgattctcaacaacac3420 gcggccgccgcgaggcaactggttcggctgtacatggatgaatggcaetgggttcaccaa3480 gacgtgtgggggccccccgtgcaacatcgggggggccggcaacaacaccttgacctgccc3540 cactgactgttttcggaagcaccccgaggccacttacgccagatgcggttctgggccctg3600 
gctgacacctaggtgtatggttcattacccatataggctctggcactacccctgcactgt3660 caacttcaccatcttcaaggttaggatgtacgtggggggcgtggagcacaggttcgaagc3720 cgcatgcaattggactcgaggagagcgttgtgacttggaggacagggatagatcagagct3780 tagctcgctgctgctgtctacaacagagtggcaggtgatcgagggcagacaccatcacca3840 ccatcactaatagttaattaacgatctcgacttggttgaacacgttgccaaggcttaagt3900 gaatttactttaaagtcttgcatttaaataaattttctttttatagctttatgacttagt3960 ttcaatttatatactattttaatgacattttcgattcattgattgaaagctttgtgtttt4020 ttcttgatgcgctattgcattgttcttgtctttttcgccacatgtaatatctgtagtaga4080 tacctgatacattgtggatgctgagtgaaattttagttaataatggaggcgctcttaata4140 attttggggatattggcttttttttttaaagtttacaaatgaattttttccgccaggata4200 acgattctgaagttactcttagcgttcctatcggtacagccatcaaatcatgcctataaa4260 tcatgcctatatttgcgtgcagtcagtatcatctacatgaaaaaaactcccgcaatttct4320 .tatagaatacgttgaaaattaaatgtacgcgccaagataagataacatatatctagctag4380 atgcagtaatatacacagattcccgcggacgtgggaaggaaaaaattagataacaaaatc4440 tgagtgatatggaaattccgctgtatagctcatatctttcccttcaacaccagaaatgta4500 aaaatcttgttacgaaggatctttttgctaatgtttctcgctcaatcctcatttcttccc4560 tacgaagagtcaaatctacttgttttctgccggtatcaagatccatatcttctagtttca4620 ccatcaaagtccaatttctagtatacagtttatgtcccaacgtaacagacaatcaaaatt4680 ggaaaggataagtatccttcaaagaatgattctgcgctggctcctgaaccgcctaatggg4740 aacagagaagtccaaaacgatgctataagaaccagaaataaaacgataaaaccataccag4800 gatcctctacgccggacgcatcgtggccggcatcaccggcgccacaggtgcggttgctgg4860 cccctatatcgccgacatcaccgatggggaagatcgggctcgccacttcgggctcatgag4920 cgcttgtttcggcgtgggtatggtggcaggccccgtggccgggggactgttgggcgccat4980 ctccttgcatgcaccattccttgcggcggcggtgctcaacggcctcaacctactactggg5040 ctgcttcctaatgcaggagtcgcataagggagagcgtcgaccgatgcccttgagagcctt5100 caacccagtcagctccttccggtgggcgcggggcatgactatcgtcgccgcacttatgac5160 tgtcttctttatcatgcaactcgtaggacaggtgccggcagcgctctgggtcattttcgg5220 cgaggaccgctttcgctggagcgcgacgatgatcggcctgtcgcttgcggtattcggaat5280 cttgcacgccctcgctcaagccttcgtcactggtcccgccaccaaacgtttcggcgagaa5340 gcaggccattatcgccggcatggcggccgacgcgctgggctacgtcttgctggcgttcgc5400 
gacgcgaggctggatggccttccccattatgattcttctcgcttccggcggcatcgggat5460 gcccgcgttgcaggccatgctgtccaggcaggtagatgacgaccatcagggacagcttca5520 aggatcgctcgcggctcttaccagcctaacttcgatcactggaccgctgatcgtcacggc5580 gatttatgccgcctcggcgagcacatggaacgggttggcatggattgtaggcgccgccct5640 ataccttgtctgcctccccgcgttgcgtcgcggtgcatggagccgggccacctcgacctg5700 aatggaagccggcggcacctcgctaacggattcaccactccaagaattggagccaatcaa5760 ttcttgcggagaactgtgaatgcgcaaaccaacccttggcagaacatatccatcgcgtcc5820 gccatctccagcagccgcacgcggcgcatctcgggcagcgttgggtcctggccacgggtg5880 cgcatgatcgtgctcctgtcgttgaggacccggctaggctggcggggttgccttactggt5940 tagcagaatgaatcaccgatacgcgagcgaacgtgaagcgactgctgctgcaaaacgtct6000 gcgacctgagcaacaacatgaatggtcttcggtttccgtgtttcgtaaagtctggaaacg6060 cggaagtcagcgccctgcaccattatgttccggatctgcatcgcaggatgctgctggcta6120 ccctgtggaacacctacatctgtattaacgaagcgctggcattgaccctgagtgattttt6180 ctctggtcccgccgcatccataccgccagttgtttaccctcacaacgttccagtaaccgg6240 gcatgttcatcatcagtaacccgtatcgtgagcatcctctctcgtttcatcggtatcatt6300 acccccatgaacagaaattcccccttacacggaggcatcaagtgaccaaacaggaaaaaa6360 ccgcccttaacatggcccgctttatcagaagccagacattaacgcttctggagaaactca6420 acgagctggacgcggatgaacaggcagacatctgtgaatcgcttcacgaccacgctgatg6480 agctttaccgcagctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgc6540 agctcccggagacggtcacagcttgtctgtaagcggtgccgggagcagacaagcccgtca6600 gggcgcgtcagcgggtgttggcgggtgtcggggcgcagccatgacccagtcacgtagcga6660 tagcggagtgtatactggcttaactatgcggcatcagagcagattgtactgagagtgcac6720 catatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgctct6780 tccgcttcctCgCtCdCtgaCtCJCtgCgCtcggtcgttcggctgcggcgagcggtatca6840 gctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaac6900 atgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgttt6960 ttccataggctCCgCCCCCCtgacgagcatcacaaaaatcgacgctcaagtcagaggtgg7020 cgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgc7080 tctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagc7140 gtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctcc7200 
aagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaac7260 tatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggt7320 aacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcct7380 aactacggctacactagaaggacagtatttggtatctgcgctctgctgaagccagttacc7440 ttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggt7500 ttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttg7560 atcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtc7620 atgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaa7680 tcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgag7740 gcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtg7800 tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga 7860 gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag 7920 cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa 7980 gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctgcaggc 8040 atcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacgatca8100 aggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccg8160 atcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcat8220 aattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaacc8280 aagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaacacgg8340 gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg 8400 gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt 8460 gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca 8520 ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata 8580 ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac 8640 atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa 8700 gtgccacctg acgtctaaga aaccattatt atcatgacat taacctataa aaaataggeg 8760 tatcacgagg ccctttcgtc ttcaagaatt ctcatgtttg acagcttatc atcgatccac 8820, ttgtatattt ggatgaattt ttgaggaatt ctgaaccagt cctaaaacga gtaaatagga 8880 ccggcaattc ttcaagcaat aaacaggaat accaattatt aaaagataac ttagtcagat 8940 cgtacaataa agctttgaag 
aaaaatgcgc cttattcaat ctttgcataa aaaaatggcc 9000 caaaatctca cattggaaga catttgatga cctcatttct ttcaatgaag ggcctaacgg 9060 agttgactaa tgttgtggga aattggaccg ataagcgtgc ttctgccgtg gccaggacaa 9120 cgtatactca tcagataaca gcaatacctg atcactactt cgcactagtt tctcggtact 9180 atgcatatga tccaatatca aaggaaatga tagcattgaa ggatgagact aatccaattg 9240 aggagtggca gcatatagaa cagctaaagg gtagtgctga aggaagcata cgataccccg 9300 catggaatgg gataatatca caggaggtac tagactacct ttcatcctac ataaatagac 9360 gcatataagt acgcatttaa gcataaacac gcactatgcc gttcttctca tgtatatata 9420 tatacaggca acacgcagat ataggtgcga cgtgaacagt gagctgtatg tgcgcagctc 9480 gcgttgcatt ttcggaagcg ctcgttttcg gaaacgcttt gaagttccta ttccgaagtt 9540 cctattctct agaaagtata ggaacttcag agcgcttttg aaaaccaaaa gcgctctgaa 9600 gacgcacttt caaaaaacca aaaacgcacc ggactgtaac gagctactaa aatattgcga 9660 ataccgcttc cacaaacatt gctcaaaagt atctctttgc tatatatctc tgtgctatat 9720 ccctatataa ccatcccatc cacctttcgc tccttgaact tgcatctaaa ctcgacctct 9780 acatttttta tgtttatctc tagtattacc tcttagacaa aaaaattgta gtaagaacta 9840 ttcatagagt taatcgaaaa caatacgaaa atgtaaacat ttcctatacg tagtatatag 9900 agacaaaata gaagaaaccg ttcataattt tctgaccaat gaagaatcat caacgctatc 9960 actttctgtt cacaaagtat gcgcaatcca catcggtata gaatataatc ggggatgcct 10020 ttatcttgaa aaaatgcacc cgcagcttcg ctagtaatca gtaaacgcgg gaagtggagt 10080 caggcttttt ttatggaaga gaaaatagac accaaagtag ccttcttcta accttaacgg 10140 acctacagtg caaaaagtta tcaagagact gcattataga gcgcacaaag gagaaaaaaa 10200 gtaatctaag atgctttgtt agaaaaatag cgctctcggg atgcattttt gtagaacaaa 10260 aaagaagtat agattcttgt tggtaaaata gcgctctcgc gttgcatttc tgttctgtaa 10320 aaatgcagct cagattcttt gtttgaaaaa ttagcgctct cgcgttgcat ttttgtttta 10380 caaaaatgaa gcacagattc ttcgttggta aaatagcgct ttcgcgttgc atttctgttc 10440 tgtaaaaatg cagctcagat tctttgtttg aaaaattagc gctctcgcgt tgcatttttg 10500 ttctacaaaa tgaagcacag atgcttcgtt aacaaagata tgctattgaa gtgcaagatg 10560 gaaacgcaga aaatgaaccg gggatgcgac gtgcaagatt acctatgcaa tagatgcaat 10620 agtttctcca ggaaccgaaa 
tacatacatt gtcttccgta aagcgctaga ctatatatta 10680 ttatacaggt tcaaatatac tatctgtttc agggaaaact cccaggttcg gatgttcaaa 10740 attcaatgat gggtaacaag tacgatcgta aatctgtaaa acagtttgtc ggatattagg 10800 ctgtatctcc tcaaagcgta ttcgaatatc attgagaagc tgcatttttt tttttttttt 10860 tttttttttt ttttttatat atatttcaag gatataccat tgtaatgtct gcccctaaga 10920 agatcgtcgt tttgccaggt gaccacgttg gtcaagaaat cacagccgaa gccattaagg 10980 ttcttaaagc tatttctgat gttcgttcca atgtcaagtt cgatttcgaa aatcatttaa 11040 ttggtggtgc tgctatcgat gctacaggtg tcccacttcc agatgaggcg ctggaagcct 11100 ccaagaaggt tgatgccgtt ttgttaggtg ctgtgggtgg tcctaaatgg ggtaccggta 11160 gtgttagacc tgaacaaggt ttactaaaaa tccgtaaaga acttcaattg tacgccaact 11220 taagaccatg taactttgca tccgactctc ttttagactt atctccaatc aagccacaat 11280 ttgctaaagg tactgacttc gttgttgtca gagaattagt gggaggtatt tactttggta 11340 agagaaagga agacgatggt gatggtgtcg cttgggatag tgaacaatac accgttccag 11400 aagtgcaaag aatcacaaga atggccgctt tcatggccct acaacatgag ccaccattgc 11460 ctatttggtc cttggataaa gctaatgttt tggcctcttc aagattatgg agaaaaactg 11520 tggaggaaac catcaagaac gaattcccta cattgaaggt tcaacatcaa ttgattgatt 11580 ctgccgccat gatcctagtt aagaacccaa cccacctaaa tggtattata atcaccagca 11640 acatgtttgg tgatatcatc tccgatgaag cctccgttat cccaggttcc ttgggtttgt 11700 tgccatctgc gtccttggcc tctttgccag acaagaacac cgcatttggt ttgtacgaac 11760 catgccacgg ttctgctcca gatttgccaa agaataaggt tgaccctatc gccactatct 11820 tgtctgctgc aatgatgttg aaattgtcat tgaacttgcc tgaagaaggt aaggccattg 11880 aagatgcagt taaaaaggtt ttggatgcag gtatcagaac tggtgattta ggtggttcca 11940 acagtaccac cgaagtcggt gatgctgtcg ccgaagaagt taagaaaatc cttgcttaaa 12000 aagattctct ttttttatga tatttgtaca aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 12060 aaaaaaaaaa aaaaatgcag cgtcacatcg gataataatg atggcagcca ttgtagaagt 12120 gccttttgca tttctagtct ctttctcggt ctagctagtt ttactacatc gcgaagatag 12180 aatcttagat cacactgcct ttgctgagct ggatcatatg agtaacaaaa gagtggtaag 12240 gcctcgttaa aggacaagga cctgagcgga agtgtatcgt aaagtagacg gagtatacta 12300 
gtatagtcta tagtccgtgg aattctaagt gccagcttta taatgtcatt ctccttacta 12360 cagacccgcc tgaaagtaga cacatcatca tcagtaagct ttgacaaaaa gcattgagta 12420 gctaactctt ctatgcaatc tatagctgtt ttataaggca ttcaatggac agattgaggt 12480 ttttgaaaca tactagtgaa attagcctta atcccttctc gaagttaatc atgcattatg 12540 gtgtaaaaaa tgcaactcgc gttgctctac tttttcccga atttccaaat acgcagctgg 12600 ggtgattgct cgatttcgta acgaaagttt tgtttataaa aaccgcgaaa accttctgta 12660 acagatagat ttttacagcg ctgatataca atgacatcag ctgtaatgga aaataactga 12720 aatatgaatg gcgagagact gcttgcttgt attaagcaat gtattatgca gcacttccaa 12780 cctatggtgt acgatgaaag taggtgtgta atcgagacga caagggggac ttttccagtt 12840 cctgatcatt ataagaaata caaaacgtta gcatttgcat ttgttggaca tgtactgaat 12900 acagacgaca caccggtaat tgaaaaagaa ctggattggc ctgatcctgc actagtgtac 12960 aatacaattg tcgatcgaat cataaatcac ccagaattat cacagtttat atcggttgca 13020 tttattagtc agttaaaggc caccatcgga gagggtttag atattaatgt aaaaggcacg 13080 ctaaaccgca ggggaaaggg tatcagaagg cctaaaggcg tattttttag atacatggaa 13140 tctccatttg tcaatacaaa ggtcactgca ttcttctctt atcttcgaga ttataataaa 13200 attgcctcag aatatcacaa taatactaaa ttcattctca cgttttcatg tcaagcatat 13260 tgggcatctg gcccaaactt ctecgccttg aagaatgtta tttggtgctc cataattcat 13320 gaatacattt ctaagtttgt ggaaagagaa caggataaag gtcatatagg agatcaggag 13380 ctaccgcctg aagaggaccc ttctcgtgaa ctaaacaatg tacaacatga agtcaatagt 13440 ttaacggaac aagatgcgga ggcggatgaa ggattgtggg gtgaaataga ttcattatgt 13500 gaaaaatggc agtctgaagc ggagagtcaa actgaggcgg agataatagc cgacaggata 13560 attggaaata gccagaggat ggcgaacctc aaaattcgtc gtacaaagtt caaaagtgtc 13620 ttgtatcata tactaaagga actaattcaa tctcagggaa ccgtaaaggt ttatcgcggt 13680 agtagttttt cacacgattc gataaagata agcttacatt atgaagagca gcatattaca 13740 gccgtatggg tctacttgat agtaaaattt gaagagcatt ggaagcctgt tgatgtagag 13800 gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 13860 agcacagaga tatatagcaa agagatactt ttgaggcaat gtttgtggaa gcggtattcg 13920 caatatttta gtagctcgtt acagtccggt gcgtttttgg ttttttgaaa 
gtgcgtcttc 13980 agagcgcttt tggttttcaa aagcgctctg aagttcctat actttctaga gaataggaac 14040 ttcggaatag gaacttcaaa gcgtttccga aaacgagcgc ttccgaaaat gcaacgcgag 14100 ctgcgcacat acagctcact gttcacgtcg cacctatatc tgcgtgttgc ctgtatatat 14160 atatacatga gaagaacggc atagtgcgtg tttatgctta aatgcgtact tatatgcgtc 14220 tatttatgta ggatgaaagg tagtctagta cctcctgtga tattatccca ttccatgcgg 14280 ggtatcgtat gcttccttca gcactaccct ttagctgttc tatatgctgc cactcctcaa 14340 ttggattagt ctcatccttc aatgcattca tttcctttga tattggatca taccctagaa 14400 gtattacgtg attttCtgCC CCttaCCCtC gttgctactc tCCttttttt cgtgggaacc 14460 gctttagggc cctcagtgat ggtgttttgt aatttatatg ctcctcttgc atttgtgtct .14520 ctacttcttg ttcgcctgga gggaacttct tcatttgtat tagcatggtt cacttcagtc 14580 cttccttcca actcactctt tttttgctgt aaacgattct ctgccgccag ttcattgaaa 14640 ctattgaata tatcctttag agattccggg atgaataaat cacctattaa agcagcttga 14700 cgatctggtg gaactaaagt aagcaattgg gtaacgacgc ttacgagctt cataacatct 14760 tcttccgttg gagctggtgg gactaataac tgtgtacaat ccatttttct catgagcatt 14820 tcggtagctc tcttcttgtc tttctcgggc aatcttccta ttattatagc aatagatttg 14880 tatagttgct ttctattgtc taacagcttg ttattctgta gcatcaaatc tatggcagcc 14940 tgacttgctt cttgtgaaga gagcatacca tttccaatcg aagatacgct ggaatcttct 15000 gcgctagaat caagaccata cggcctaccg gttgtgagag attccatggg ccttatgaca 15060 tatcctggaa agagtagctc atcagactta cgtttactct ctatatcaat atctacatca 15120 ggagcaatca tttcaataaa cagccgacat acatcccaga cgctataagc tgtacgtgct 15180 tttaccgtca gattcttggc tgtttcaatg tcgtccattt tggttttctt ttaccagtat 15240 tgttcgtttg ataatgtatt cttgcttatt acattataaa atctgtgcag atcacatgtc 15300 aaaacaactt tttatcacaa gatagtaccg caaaacgaac ctgcgggccg tctaaaaatt 15360 aaggaaaagc agcaaaggtg catttttaaa atatgaaatg aagataccgc agtaccaatt 15420 attttcgcag tacaaataat gcgcggccgg tgcatttttc gaaagaacgc gagacaaaca 15480 ggacaattaa agttagtttt tcgagttagc gtgtttgaat actgcaagat acaagataaa 15540 tagagtagtt gaaactagat atcaattgca cacaagatcg gcgctaagca tgccacaatt 15600 tggtatatta tgtaaaacac cacctaaggt 
gcttgttcgt cagtttgtgg aaaggtttga 15660 aagaccttca ggtgagaaaa tagcattatg tgctgctgaa ctaacctatt tatgttggat 15720 gattacacat aacggaacag caatcaagag agccacattc atgagctata atactatcat 15780 aagcaattcg ctgagtttcg atattgtcaa taaatcactc cagtttaaat acaagacgca 15840 aaaagcaaca attctggaag cctcattaaa gaaattgatt cctgcttggg aatttacaat 15900 tattccttac tatggacaaa aacatcaatc tgatatcact gatattgtaa gtagtttgca 15960 attacagttc gaatcatcgg aagaagcaga taagggaaat agccacagta aaaaaatgct 16020 aaagcacttc taagtgaggg tgaaagcatc tgggagatca ctgagaaaat actaaattcg 16080 tttgagtata cttcgagatt tacaaaaaca aaaactttat accaattcct cttcctagct 16140 actttcatca attgtggaag attcagcgat attaagaacg ttgatccgaa atcatttaaa 16200 ttagtccaaa ataagtatct gggagtaata atccagtgtt tagtgacaga gacaaagaca 16260 agcgttagta ggcacatata cttctttagc gcaaggggta g 16301 <210>

<211> <212> DNA <213> vector pYIG7 <400>

agcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggc 60 acgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagc 120 tcactcattaggCaCCCCaggCtttaCaCtttatgCttCCggctcgtatgttgtgtggaa 180 ttgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaatttaata 240 cgactcactatagggaattcggatccttcaatatgcgcacatacgctgttatgttcaagg 300 tcccttcgtttaagaacgaaagcggtcttccttttgagggatgtttcaagttgttcaaat 360 ctatcaaatttgcaaatccccagtctgtatctagagcgttgaatcggtgatgcgatttgt 420 taattaaattgatggtgtcaccattaccaggtctagatataccaatggcaaactgagcac 480 aacaataccagtccggatcaactggcaccatctctcccgtagtctcatctaatttttctt 540 ccggatgaggttccagatataccgcaacacctttattatggtttccctgagggaataata 600 gaatgtcccattcgaaatcaccaattctaaacctgggcgaattgtatttcgggtttgtta 660 actcgttccagtcaggaatgttccacgtgaagctatcttccagcaaagtctccacttctt 720 catcaaattgtggagaatactcccaatgctcttatctatgggacttccgggaaacacagt 780 accgatacttcccaattcgtcttcagagctcattgtttgtttgaagagactaatcaaaga 840 atcgttttctcaaaaaaattaatatcttaactgatagtttgatcaaaggggcaaaacgta 900 ggggcaaacaaacggaaaaatcgtttctcaaattttctgatgccaagaactctaaccagt 960 cttatctaaaaattgccttatgatccgtctctccggttacagcctgtgtaactgattaat1020 cctgcctttctaatcaccattctaatgttttaattaagggattttgtcttcattaacggc1080 tttcgctcataaaaatgttatgacgttttgcccgcaggcgggaaaccatccacttcacga1140 gactgatctcctctgccggaacaccgggcatctccaacttataagttggagaaataagag1200 aatttcagattgagagaatgaaaaaaaaaaaccctgaaaaaaaaggttgaaaccagttcc1260 ctgaaattattcccctacttgactaataagtatataaagacggtaggtattgattgtaat1320 tctgtaaatctatttcttaaacttcttaaattctacttttatagttagtcttttttttag1380 ttttaaaacaccaagaacttagtttcgaataaacacacataaacaaacaccatgaggtct1440 ttgctaatactagtgctttgcttcctgcccctggctgctctgggggtaccagatctcgac1500 ttggttgaacacgttgccaaggcttaagtgaatttactttaaagtcttgcatttaaataa1560 attttctttttatagctttatgacttagtttcaatttatatactattttaatgacatttt1620 cgattcattgattgaaagctttgtgttttttcttgatgcgctattgcattgttcttgtct1680 ttttcgccacatgtaatatctgtagtagatacctgatacattgtggatgctgagtgaaat1740 tttagttaataatggaggcgctcttaataattttggggatattggcttttttttttaaag1800 
tttacaaatgaattttttccgccaggataacgattctgaagttactcttagcgttcctat1860 cggtacagccatcaaatcatgcctataaatcatgcctatatttgcgtgcagtcagtatca1920 tctacatgaaaaaaactcccgcaatttcttatagaatacgttgaaaattaaatgtacgcg1980 ccaagataagataacatatatctagctagatgcagtaatatacacagattcccgcggacg2040 tgggaaggaaaaaattagataacaaaatctgagtgatatggaaattccgctgtatagctc2100 atatctttcccttcaacaccagaaatgtaaaaatcttgttacgaaggatctttttgctaa2160 tgtttctcgctcaatcctcatttcttccctacgaagagtcaaatctacttgttttctgcc2220 ggtatcaagatccatatcttctagtttcaccatcaaagtccaatttctagtatacagttt2280 atgtcccaacgtaacagacaatcaaaattggaaaggataagtatccttcaaagaatgatt2340 ctgcgctggctcctgaaccgcctaatgggaacagagaagtccaaaacgatgctataagaa2400 ccagaaataaaacgataaaaccataccaggatccaagcttggcactggccgtcgttttac2460 aacgtcgtgactgggaaaaccctggcgttacccaacttaatcgccttgcagcacatcccc2520 ctttcgccagctggcgtaatagcgaagaggcccgcaccgatcgcccttcccaacagttgc2580 gcagcctgaatggcgaatgggaaattgtaaacgttaatattttgttaaaattcgcgttaa2640 atttttgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatcccttata2700 aatcaaaagaatagaccgagatagggttgagtgttgttccagtttggaacaagagtccac2760 tattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcgatggcc2820 cactacgtgaaccatcaccctaatcaagttttttggggtcgaggtgccgtaaagcactaa2880 atcggaaccctaaagggagcccccgatttagagcttgacggggaaagccggcgaacgtgg2940 cgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcgg3000 tcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtcag3060 gtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacatt3120 caaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaa3180 ggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcatttt3240 gccttcctgtttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatcagt3300 tgggtgcacgagtgggttacatcgaactggatctcaacagcggtaagatccttgagagtt3360 ttcgccccgaagaacgttttccaatgatgagcacttttaaagttctgctatgtggcgcgg3420 tattatcccgtattgacgccgggcaagagcaactcggtcgccgcatacactattctcaga3480 atgacttggttgagtactcaccagtcacagaaaagcatettacggatggcatgacagtaa3540 gagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctga3600 
caacgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaa3660 ctcgccttgatcgttgggaaccggagctgaatgaagccataccaaacgacgagcgtgaca3720 ccacgatgcctgtagcaatggcaacaacgttgcgcaaactattaactggcgaactactta3780 ctctagcttcccggcaacaattaatagactggatggaggcggataaagttgcaggaccac3840 ttctgcgctcggcccttccggctggctggtttattgctgataaatctggagccggtgagc3900 gtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcgtag3960 ttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctgaga4020 taggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatacttt4080 agattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgata4140 atctcatgaccaaaatcccttaacgtgagttttcgttccactgagcgtcagaccccgtag4200 aaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgcaaa4260 caaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactcttt4320 ttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgtagc4380 cgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgctaa4440 tcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaa4500 gacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagc4560 ccagcttggagcgaacgacctacaccgaactgagatacctacagcgtgagcattgagaaa4620 gcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcggaa4680 caggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcg4740 ggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcc4800 tatggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggccttttg4860 ctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttg4920 agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg 4980 aagcggaag 4989 <210> 49 <211> 5422 <212> DNA
<213> vector pYIG7E1 <400> 49

agcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggc60 acgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagc120 tcactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaa180 ttgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgaatttaata240 cgactcactatagggaattcggatccttcaatatgcgcacatacgctgttatgttcaagg300 tcccttcgtttaagaacgaaagcggtcttccttttgagggatgtttcaagttgttcaaat360 ctatcaaatttgcaaatccccagtctgtatctagagcgttgaatcggtgatgcgatttgt420 taattaaattgatggtgtcaccattaccaggtctagatataccaatggcaaactgagcac480 aacaataccagtccggatcaactggcaccatctctcccgtagtctcatctaatttttctt540 ccggatgaggttccagatataccgcaacacctttattatggtttccctgagggaataata600 gaatgtcccattcgaaatcaccaattctaaacctgggcgaattgtatttcgggtttgtta660 actcgttccagtcaggaatgttccacgtgaagctatcttccagcaaagtctccacttctt720 catcaaattgtggagaatactcccaatgctcttatctatgggacttccgggaaacacagt780 accgatacttcccaattcgtcttcagagctcattgtttgtttgaagagactaatcaaaga840 atcgttttctcaaaaaaattaatatcttaactgatagtttgatcaaaggggcaaaacgta900 ggggcaaacaaacggaaaaatcgtttctcaaattttctgatgccaagaactctaaccagt960 cttatctaaaaattgccttatgatccgtctctccggttacagcctgtgtaactgattaat1020 cctgcctttctaatcaccattctaatgttttaattaagggattttgtcttcattaacggc1080 tttcgctcat aaaaatgtta tgacgttttg cccgcaggcg ggaaaccatc cacttcacga 1140 gactgatctc ctctgccgga acaccgggca tctccaactt ataagttgga gaaataagag 1200 aatttcagattgagagaatgaaaaaaaaaaaccctgaaaaaaaaggttgaaaccagttcc1260 ctgaaattattcccctacttgactaataagtatataaagacggtaggtattgattgtaat1320 tctgtaaatctatttcttaaacttcttaaattctacttttatagttagtcttttttttag1380 ttttaaaacaccaagaacttagtttcgaataaacacacataaacaaacaccatgaggtct1440 ttgctaatactagtgctttgcttcctgcccctggctgctctggggtatgaggtgcgcaac1500 gtgtccgggatgtaccatgtcacgaacgactgctccaactcaagcattgtgtatgaggca2560 gcggacatgatcatgcacacccccgggtgcgtgccctgcgttcgggagaacaactcttcc1620 cgctgctgggtagcgctcacccccacgctcgcagctaggaacgccagcgtCCCCaCCaCg1680 acaatacgacgccacgtcgatttgctcgttggggcggctgctttctgttccgctatgtac1740 gtgggggacctctgcggatctgtcttcctcgtctcccagctgttcaccatctcgcctcgc1800 
cggcatgagacggtgcaggactgcaattgctcaatctatcccggccacataacgggtcac1860 cgtatggcttgggatatgatgatgaactggtaatagacccttctcacctcggccgataag1920 ctcagatctcgacttggttgaacacgttgccaaggcttaagtgaatttactttaaagtct1980 tgcatttaaataaattttctttttatagctttatgacttagtttcaatttatatactatt2040 ttaatgacattttcgattcattgattgaaagctttgtgttttttcttgatgcgctattgc2100 attgttcttgtctttttcgccacatgtaatatctgtagtagatacctgatacattgtgga2160 tgctgagtgaaattttagttaataatggaggcgctcttaataattttggggatattggct2220 tttttttttaaagtttacaaatgaattttttccgccaggataacgattctgaagttactc2280 ttagcgttcctatcggtacagccatcaaatcatgcctataaatcatgcctatatttgcgt2340 gcagtcagtatcatctacatgaaaaaaactcccgcaatttcttatagaatacgttgaaaa2400 ttaaatgtacgcgccaagataagataacatatatctagct~agatgcagtaatatacacag2460 attcccgcggacgtgggaaggaaaaaattagataacaaaatctgagtgatatggaaattc2520 cgctgtatagctcatatctttcccttcaacaccagaaatgtaaaaatcttgttacgaagg2580 atctttttgctaatgtttctcgctcaatcctcatttcttccctacgaagagtcaaatcta2640 cttgttttctgccggtatcaagatccatatcttctagtttcaccatcaaagtccaatttc2700 tagtatacagtttatgtcccaacgtaacagacaatcaaaattggaaaggataagtatcct2760 tcaaagaatgattctgcgctggctcctgaaccgcctaatgggaacagagaagtccaaaac2820 gatgctataagaaccagaaataaaacgataaaaccataccaggatccaagcttggcactg2880 gccgtcgttttacaacgtcgtgactgggaaaaccctggcgttacccaacttaatcgcctt2940 gcagcacatccccctttcgccagctggcgtaatagcgaagaggCCCgCa.CCgatCgCCCt3000 tcccaacagt tgcgcagcct gaatggcgaa tgggaaattg taaacgttaa tattttgtta 3060 aaattcgcgt taaatttttg ttaaatcagc tcatttttta accaataggc cgaaatcggc 3120 aaaatccctt ataaatcaaa agaatagacc gagatagggt tgagtgttgt tccagtttgg 3180 aacaagagtc cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctat 3240 cagggcgatg gcccactacg tgaaccatca ccctaatcaa gttttttggg gtcgaggtgc 3300 cgtaaagcac taaatcggaa ccctaaaggg agcccccgat ttagagcttg acggggaaag 3360 ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg 3420 gcaagtgtag cggtcacgct gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta 3480 cagggcgcgt caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt 3540 ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata 
aatgcttcaa 3600 taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt 3660 tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat 3720 gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag 3780 atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg 3840 ctatgtggcg cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata 3900 cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat 3960 ggcatgacag taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc 4020 aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg 4080 ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac 4140 gacgagcgtg acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact 4200 ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa 4260 gttgcaggac cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct 4320 ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc 4380 tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga 4440 cagaacgctg agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac 4500 tcatatatac tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag 4560 atcctttttg ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg 4620 tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc 4680 tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag 4740 ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc 4800 cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac 4860 ctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttacc4920 gggttggactcaagacgatagttaccggataaggcgcagcggtcgggctgaacggggggt4980 tcgtgcacacagcccagcttggagcgaacgacctacaccgaactgagatacctacagcgt5040 gagcattgagaaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagc5100 ggcagggtcggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctt5160 tatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtca5220 ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt 5280 tgctggcctt ttgctcacat 
gttctttcct gcgttatccc ctgattctgt ggataaccgt 5340 attaccgcct ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag 5400 tcagtgagcg aggaagcgga ag 5422 <210>

50 <211> 15621 <212> DNA
<213> vector pSY1YIG7E1s <400> 50

atcgataagcttttcaattcaattcatcattttttttttattcttttttttgatttcggt60 ttctttgaaatttttttgattcggtaatctccgaacagaaggaagaacgaaggaaggagc120 acagacttagattggtatatatacgcatatgtagtgttgaagaaacatgaaattgcccag180 tattcttaacccaactgcacagaacaaaaacctgcaggaaacgaagataaatcatgtcga240 aagctacatataaggaacgtgctgctactcatcctagtcc~tgttgctgccaagctattta300 atatcatgcacgaaaagcaaacaaacttgtgtgcttcattggatgttcgtaccaccaagg360 aattactggagttagttgaagcattaggtcccaaaatttgtttactaaaaacacatgtgg420 atatcttgactgatttttccatggagggcacagttaagccgctaaaggcattatccgcca480 agtacaattttttactcttcgaagacagaaaatttgctgacattggtaatacagtcaaat540 tgcagtactctgcgggtgtatacagaatagcagaatgggcagacattacgaatgcacacg600 gtgtggtgggcccaggtattgttagcggtttgaagcaggcggcagaagaagtaacaaagg660 aacctagaggccttttgatgttagcagaattgtcatgcaagggctccctatctactggag720 aatatactaagggtactgttgacattgcgaagagcgacaaagattttgttatcggcttta780 ttgctcaaagagacatgggtggaagagatgaaggttacgattggttgattatgacacccg840 gtgtgggtttagatgacaagggagacgcattgggtcaacagtatagaaccgtggatgatg900 tggtctctacaggatctgacattattattgttggaagaggactatttgcaaagggaaggg960 atgctaaggtagagggtgaacgttacagaaaagcaggctgggaagcatatttgagaagat1020 gcggccagcaaaactaaaaaactgtattataagtaaatgcatgtatactaaactcacaaa1080 ttagagcttcaatttaattatatcagttattacccgggaatctcggtcgtaatgattttt1140 ataatgacgaaaaaaaaaaaattggaaagaaaaagctttaatgcggtagtttatcacagt1200 taaattgctaacgcagtcaggcaccgtgtatgaaatctaacaatgcgctcatcgtcatcc1260 tcggcaccgtcaccctggatgctgtaggcataggcttggttatgccggtactgccgggcc1320 tcttgcgggatatcgtccattccgacagcatcgccagtcactatggcgtgctgctagcgc1380 tatatgcgttgatgcaatttctatgcgcacccgttctcggagcactgtccgaccgctttg1440 gCCJCCgCCC agtCCtgCtC gcttcgctac ttggagccac tatcgactac gcgatcatgg 1500 cgaccacacc cgtcctgtgg atcctggtat ggttttatcg ttttatttct ggttcttata 1560 gcatcgtttt ggacttctctgttcccattaggcggttcaggagccagcgcagaatcattc1620 tttgaaggat acttatcctttccaattttgattgtctgttacgttgggacataaactgta1680 tactagaaat tggactttgatggtgaaactagaagatatggatcttgataccggcagaaa1740 acaagtagat ttgactcttcgtagggaagaaatgaggattgagcgagaaacattagcaaa1800 aagatccttc 
gtaacaagatttttacatttctggtgttgaagggaaagatatgagctata1860 cagcggaatt tccatatcactcagattttgttatctaattttttccttcccacgtccgcg1920 ggaatctgtg tatattactgcatctagctagatatatgttatcttatcttggcgcgtaca1980 tttaattttc aacgtattctataagaaattgcgggagtttttttcatgtagatgatactg2040 actgcacgca aatataggcatgatttataggcatgatttgatggctgtaccgataggaac2100 gctaagagta acttcagaatcgttatcctggcggaaaaaattcatttgtaaactttaaaa2160 aaaaaagccaatatccccaaaattattaagagcgcctccattattaactaaaatttcact2220 cagcatccacaatgtatcaggtatctactacagatattacatgtggcgaaaaagacaaga2280 acaatgcaatagcgcatcaagaaaaaacacaaagctttcaatcaatgaatcgaaaatgtc2340 attaaaatagtatataaattgaaactaagtcataaagctataaaaagaaaatttatttaa2400 atgcaagactttaaagtaaattcacttaagccttggcaacgtgttcaaccaagtcgagat2460 ctgagcttatcggccgaggtgagaagggtctattaccagttcatcatcatatcccaagcc2520 atacggtgacccgttatgtggccgggatagattgagcaattgcagtcctgcaccgtctca2580 tgccggcgaggcgagatggtgaacagctgggagacgaggaagacagatccgcagaggtcc2640 cccacgtacatagcggaacagaaagcagccgccccaacgagcaaatcgacgtggcgtcgt2700 attgtcgtggtggggacgctggcgttcctagctgcgagcgtgggggtgagcgctacccag2760 cagcgggaagagttgttctcccgaacgcagggcacgcacccgggggtgtgcatgatcatg2820 tccgctgcctcatacacaatgcttgagttggagcagtcgttcgtgacatggtacatcccg2880 gacacgttgcgcacctcataccccagagcagccaggggcaggaagcaaagcactagtatt2940 agcaaagacctcatggtgtttgtttatgtgtgtttattcgaaactaagttcttggtgttt3000 taaaactaaaaaaaagactaactataaaagtagaatttaagaagtttaagaaatagattt3060 acagaattacaatcaatacctaccgtctttatatacttattagtcaagtaggggaataat3120 ttcagggaactggtttcaaccttttttttcagggtttttttttttcattctctcaatctg3180 aaattctcttatttctccaacttataagttggagatgcceggtgttccggcagaggagat3240 cagtctcgtgaagtggatggtttcccgcctgcgggcaaaacgtcataacatttttatgag3300 cgaaagccgttaatgaagacaaaatcccttaattaaaacattagaatggtgattagaaag3360 gcaggattaatcagttacacaggctgtaaccggagagacggatcataaggcaatttttag3420 ataagactggttagagttcttggcatcagaaaatttgagaaacgatttttccgtttgttt3480 gcccctacgttttgcccctttgatcaaactatcagttaagatattaatttttttgagaaa3540 acgattctttgattagtctcttcaaacaaacaatgagctctgaagacgaattgggaagta3600 
tcggtactgtgtttcccggaagtcccatagataagagcattgggagtattctccacaatt3660 tgatgaagaagtggagactttgctggaagatagcttcacgtggaacattcctgactggaa3720 cgagttaacaaacccgaaatacaattcgcccaggtttagaattggtgatttcgaatggga3780 cattctattattccctcagggaaaccataataaaggtgttgcggtatatctggaacctca3840 tccggaagaaaaattagatgagactacgggagagatggtgccagttgatccggactggta3900 ttgttgtgctcagtttgccattggtatatctagacctggtaatggtgacaccatcaattt3960 aattaacaaatcgcatcaccgattcaacgctctagatacagactggggatttgcaaattt4020 gatagatttgaacaacttgaaacatccctcaaaaggaagaccgctttcgttcttaaacga4080 agggaccttgaacataacagcgtatgtgcgcatattgaaggatcctctacgccggacgca4140 tcgtggccggcatcaccggcgccacaggtgcggttgctggcccctatatcgccgacatca4200 ccgatggggaagatcgggctcgccacttcgggctcatgagcgcttgtttcggcgtgggta4260 tggtggcaggccccgtggccgggggactgttgggcgccatctccttgcatgcaccattcc4320 ttgcggcggcggtgctcaacggcctcaacctactactgggctgcttcctaatgcaggagt4380 cgcataagggagagcgtcgaccgatgcccttgagagccttcaacccagtcagctccttcc4440 ggtgggcgcggggcatgactatcgtcgccgcacttatgactgtcttctttatcatgcaac4500 tcgtaggacaggtgccggcagcgctctgggtcattttcggcgaggaccgctttcgctgga4560 gcgcgacgatgatcggcctgtcgcttgcggtattcggaatcttgcacgccctcgctcaag4620 ccttcgtcactggtcCCgccaccaaacgtttcggcgagaagcaggccattatcgccggca4680 tggcggccgacgcgctgggctacgtcttgctggcgttcgcgacgcgaggctggatggcct4740 tccccattatgattcttctcgcttccggcggcatcgggatgcccgcgttgcaggccatgc4800 tgtccaggcaggtagatgacgaccatcagggacagcttcaaggatcgctcgcggctctta4860 ccagcctaacttcgatcactggaccgctgatcgtcacggcgatttatgccgcctcggcga4920 gcacatggaacgggttggcatggattgtaggcgccgccctataccttgtctgCCtCCCCg4980 cgttgcgtcgcggtgcatggagccgggccacctcgacctgaatggaagccggcggcacct5040 cgctaacggattcaccactccaagaattggagccaatcaattcttgcggagaactgtgaa5100 tgcgcaaaccaacccttggcagaacatatccatcgcgtccgccatctccagcagccgcac5160 gcggcgcatctcgggcagcgttgggtcctggccacgggtgcgcatgatcgtgctcctgtc5220 gttgaggacccggctaggctggcggggttgccttactggttagcagaatgaatcaccgat5280 acgcgagcgaacgtgaagcgactgctgctgcaaaacgtctgcgacctgagcaacaacatg5340 aatggtcttcggtttccgtgtttcgtaaagtctggaaacgcggaagtcagcgccctgcac5400 
cattatgttccggatctgcatcgcaggatgctgctggctaccctgtggaacacctacatc5460 tgtattaacgaagcgctggcattgaccctgagtgatttttctctggtcccgccgcatcca5520 taccgccagttgtttaccctcacaacgttccagtaaccgggcatgttcatcatcagtaac5580 ccgtatcgtgagcatcctctctcgtttcatcggtatcattacccccatgaacagaaattc5640 ccccttacacggaggcatcaagtgaccaaacaggaaaaaaccgcccttaacatggcccgc5700 tttatcagaagccagacattaacgcttctggagaaactcaacgagctggacgcggatgaa5760 caggcagacatctgtgaatcgcttcacgaccacgctgatgagctttaccgcagctgcctc5820 gcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcaca5880 gcttgtctgtaagcggtgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttg5940 gcgggtgtcggggcgcagccatgacccagtcacgtagcgatagcggagtgtatactggct6000 taactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaatacc6060 gcacagatgcgtaaggagaaaataccgcatcaggcgctcttccgcttcctcgctcactga6120 ctcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaat6180 acggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagca6240 aaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccc6300 tgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactata6360 aagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgcc6420 gcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctc6480 acgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacga6540 accccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaaccc6600 ggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgag6660 gtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaag6720 gacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtag6780 ctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagca6840 gattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctga6900 cgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggat6960 cttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatga7020 gtaaacttgg.tctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctg7080 tctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacggga7140 gggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctcc7200 
agatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaac7260 tttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgcc7320 agttaatagtttgcgcaacgttgttgccattgctgcaggcatcgtggtgtcacgctcgtc7380 gtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccc7440 catgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagtt7500 ggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcatgcc7560 atccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtg7620 tatgcggcgaccgagttgctcttgcccggcgtcaacacgggataataccgcgccacatag7680 cagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggat7740 cttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagc7800 atcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaa7860 aaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatatta7920 ttgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaa7980 aaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaaga8040 aaccattattatcatgacattaacctataaaaaataggcgtatcacgaggccctttcgtc8100 ttcaagaattctcatgtttgacagcttatcatcgatccacttgtatatttggatgaattt8160 ttgaggaattctgaaccagtcctaaaacgagtaaataggaccggcaattcttcaagcaat8220 aaacaggaataccaattattaaaagataacttagtcagatcgtacaataaagctttgaag8280 aaaaatgcgccttattcaatctttgcataaaaaaatggcccaaaatctcacattggaaga8340 catttgatga cctcatttct ttcaatgaag ggcctaacgg agttgactaa tgttgtggga 8400 aattggaccg ataagcgtgc ttctgccgtg gccaggacaa cgtatactca tcagataaca 8460 gcaatacctg atcactactt cgcactagtt tctcggtact atgcatatga tccaatatca 8520 aaggaaatga tagcattgaa ggatgagact aatccaattg aggagtggca gcatatagaa 8580 cagctaaagg gtagtgctga aggaagcata cgataccccg catggaatgg gataatatca 8640 caggaggtac tagactacct ttcatcctac ataaatagac gcatataagt acgcatttaa 8700 gcataaacac gcactatgcc gttcttctca tgtatatata tatacaggca acacgcagat 8760 ataggtgcga cgt.gaacagt gagctgtatg tgcgcagctc gcgttgcatt ttcggaagcg 8820 ctcgttttcg gaaacgcttt gaagttccta ttccgaagtt cctattctct agaaagtata 8880 ggaacttcag agcgcttttg aaaaccaaaa gcgctctgaa gacgcacttt~caaaaaacca 8940 aaaacgcacc ggactgtaac gagctactaa aatattgcga 
ataccgcttc cacaaacatt ~ 9000 gctcaaaagt atctctttgc tatatatctc tgtgctatat ccctatataa ccatcccatc 9060 cacctttcgc tccttgaact tgcatctaaa ctcgacctct acatttttta tgtttatctc 9120 tagtattacc tcttagacaa aaaaattgta gtaagaacta ttcatagagt taatcgaaaa 9180 caatacgaaa atgtaaacat ttcctatacg tagtatatag agacaaaata gaagaaaccg 9240 ttcataattt tctgaccaat gaagaatcat caacgctatc actttctgtt cacaaagtat 9300 gcgcaatcca catcggtata gaatataatc ggggatgcct ttatcttgaa aaaatgcacc 9360 cgcagcttcg ctagtaatca gtaaacgcgg gaagtggagt caggcttttt ttatggaaga 9420 gaaaatagac accaaagtag ccttcttcta accttaacgg acctacagtg caaaaagtta 9480 tcaagagact gcattataga gcgcacaaag gagaaaaaaa gtaatctaag atgctttgtt 9540 agaaaaatag cgctctcggg atgcattttt gtagaacaaa aaagaagtat agattcttgt' 9600 tggtaaaata gcgctctcgc gttgcatttc tgttctgtaa aaatgcagct cagattcttt 9660 gtttgaaaaa ttagcgctct cgcgttgcat ttttgtttta caaaaatgaa gcacagattc 9720 ttcgttggta aaatagcgct ttcgcgttgc atttctgttc tgtaaaaatg cagctcagat 9780 tctttgtttg aaaaattagc gctctcgcgt tgcatttttg ttctacaaaa tgaagcacag 9840 atgcttcgtt aacaaagata tgctattgaa gtgcaagatg gaaacgcaga aaatgaaccg 9900 gggatgcgac gtgcaagatt acctatgcaa tagatgcaat agtttctcca ggaaccgaaa 9960 tacatacatt gtcttccgta aagcgctaga ctatatatta ttatacaggt tcaaatatac 10020 tatctgtttc agggaaaact cccaggttcg gatgttcaaa attcaatgat gggtaacaag 10080 tacgatcgta aatctgtaaa acagtttgtc ggatattagg ctgtatctcc tcaaagcgta 10140 ttcgaatatc attgagaagc tgcatttttt tttttttttt tttttttttt ttttttatat 10200 atatttcaag gatataccat tgtaatgtct gcccctaaga agatcgtcgt tttgccaggt 10260 gaccacgttg gtcaagaaat cacagccgaa gccattaagg ttcttaaagc tatttctgat 10320 gttcgttcca atgtcaagtt cgatttcgaa aatcatttaa ttggtggtgc tgctatcgat 10380 gctacaggtg tcccacttcc agatgaggcg ctggaagcct ccaagaaggt tgatgccgtt 10440 ttgttaggtg ctgtgggtgg tcctaaatgg ggtaccggta gtgttagacc tgaacaaggt 10500 ttactaaaaa tccgtaaaga acttcaattg tacgccaact taagaccatg taactttgca 10560 tccgactctc ttttagactt atctccaatc aagccacaat ttgctaaagg tactgacttc 10620 gttgttgtca gagaattagt gggaggtatt 
tactttggta agagaaagga agacgatggt 10680 gatggtgtcg cttgggatag tgaacaatac accgttccag aagtgcaaag aatcacaaga 10740 atggccgctt tcatggccct acaacatgag ccaccattgc ctatttggtc cttggataaa 10800 gctaatgttt tggcctcttc aagattatgg agaaaaactg tggaggaaac catcaagaac 10860 gaattcccta cattgaaggt tcaacatcaa ttgattgatt ctgccgccat gatcctagtt 10920 aagaacccaa cccacctaaa tggtattata atcaccagca acatgtttgg tgatatcatc 10980 tccgatgaag cctccgttat cccaggttcc ttgggtttgt tgccatctgc gtccttggcc 11040 tctttgccag acaagaacac cgcatttggt ttgtacgaac catgccacgg ttctgctcca 11100 gatttgccaa agaataaggt tgaccctatc gccactatct tgtctgctgc aatgatgttg 11160 aaattgtcat tgaacttgcc tgaagaaggt aaggccattg aagatgcagt taaaaaggtt 11220 ttggatgcag gtatcagaac tggtgattta ggtggttcca acagtaccac cgaagtcggt 11280 gatgctgtcg ccgaagaagt taagaaaatc cttgcttaaa aagattctct ttttttatga 11340 tatttgtaca aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaatgcag 11400 cgtcacatcg gataataatg atggcagcca ttgtagaagt gccttttgca tttctagtct 11460 ctttctcggt ctagctagtt ttactacatc gcgaagatag aatcttagat cacactgcct 11520 ttgctgagct ggatcatatg agtaacaaaa gagtggtaag gcctcgttaa aggacaagga 11580 cctgagcgga agtgtatcgt aaagtagacg gagtatacta gtatagtcta tagtccgtgg 11640 aattctaagt gccagcttta taatgtcatt ctccttacta cagacccgcc tgaaagtaga 11700 cacatcatca tcagtaagct ttgacaaaaa gcattgagta gctaactctt ctatgcaatc 11760 tatagctgtt ttataaggca ttcaatggac agattgaggt ttttgaaaca tactagtgaa 11820 attagcctta atcccttctc gaagttaatc atgcattatg gtgtaaaaaa tgcaactcgc 11880 gttgctctac tttttcccga atttccaaat acgcagctgg ggtgattgct cgatttcgta 11940 acgaaagttt tgtttataaa aaccgcgaaa accttctgta acagatagat ttttacagcg 12000 ctgatataca atgacatcag ctgtaatgga aaataactga aatatgaatg gcgagagact 12060 gcttgcttgt attaagcaat gtattatgca gcacttccaa cctatggtgt acgatgaaag 12120 taggtgtgta atcgagacga caagggggac ttttccagtt cctgatcatt ataagaaata 12180 caaaacgtta gcatttgcat ttgttggaca tgtactgaat acagacgaca caccggtaat 12240 tgaaaaagaa ctggattggc ctgatcctgc actagtgtac aatacaattg tcgatcgaat 12300 cataaatcac 
ccagaattat cacagtttat atcggttgca tttattagtc agttaaaggc 12360 caccatcgga gagggtttag atattaatgt aaaaggcacg ctaaaccgca ggggaaaggg 12420 tatcagaagg cctaaaggcg tattttttag atacatggaa tctccatttg tcaatacaaa 12480 ggtcactgca ttcttctctt atcttcgaga ttataataaa attgcctcag aatatcacaa 12540 taatactaaa ttcattctca cgttttcatg tcaagcatat tgggcatctg gcccaaactt 12600 ctccgccttg aagaatgtta tttggtgctc cataattcat gaatacattt ctaagtttgt 12660 ggaaagagaa caggataaag gtcatatagg agatcaggag ctaccgcetg aagaggaccc 12720 ttctcgtgaa ctaaacaatg tacaacatga agtcaatagt ttaacggaac aagatgcgga 12780 ggcggatgaa ggattgtggg gtgaaataga ttcattatgt gaaaaatggc agtctgaagc 12840 ggagagtcaa actgaggcgg agataatagc cgacaggata attggaaata gccagaggat 12900 ggcgaacctc aaaattcgtc gtacaaagtt caaaagtgtc ttgtatcata tactaaagga 12960 actaattcaa tctcagggaa ccgtaaaggt ttatcgcggt agtagttttt cacacgattc 13020 gataaagata agcttacatt atgaagagca gcatattaca gccgtatggg tctacttgat 13080 agtaaaattt gaagagcatt ggaagcctgt tgatgtagag gtcgagttta gatgcaagtt 13140 caaggagcga aaggtggatg ggtaggttat atagggatat agcacagaga tatatagcaa 13200 agagatactt ttgaggcaat gtttgtggaa gcggtattcg caatatttta gtagctcgtt 13260 acagtccggt gcgtttttgg ttttttgaaa gtgcgtcttc agagcgcttt tggttttcaa 13320 aagcgctctg aagttcctat actttctaga gaataggaac ttcggaatag gaacttcaaa 13380 gcgtttccga aaacgagcgc ttccgaaaat gcaacgcgag ctgcgcacat acagctcact 13440 gttcacgtcg cacctatatc tgcgtgttgc ctgtatatat atatacatga gaagaacggc 13500 atagtgcgtg tttatgctta aatgcgtact tatatgcgtc tatttatgta ggatgaaagg 13560 tagtctagta cctcctgtga tattatccca ttccatgcgg ggtatcgtat gcttccttca 13620 gcactaccct ttagctgttc tatatgctgc cactcctcaa ttggattagt ctcatccttc 13680 aatgcattca tttcctttga tattggatca taccctagaa gtattacgtg attttctgcc 13740 ccttaccctc gttgctactc tccttttttt cgtgggaacc gctttagggc cctcagtgat 13800 ggtgttttgt aatttatatg ctcctcttgc atttgtgtct ctacttcttg ttcgcctgga 13860 gggaacttct tcatttgtat tagcatggtt cacttcagtc cttccttcca actcactctt 13920 tttttgctgt aaacgattct ctgccgccag ttcattgaaa ctattgaata tatcctttag 
13980 agattccggg atgaataaat cacctattaa agcagcttga cgatctggtg gaactaaagt 14040 aagcaattgg gtaacgacgc ttacgagctt cataacatct tcttccgttg gagctggtgg 14100 gactaataac tgtgtacaat ccatttttct catgagcatt tcggtagctc tcttcttgtc 14160 tttctcgggc aatcttccta ttattatagc aatagatttg tatagttgct ttctattgtc 14220 taacagcttg ttattctgta gcatcaaatc tatggcagcc tgacttgctt cttgtgaaga 14280 gagcatacca tttccaatcg aagatacgct ggaatcttct gcgctagaat caagaccata 14340 cggcctaccg gttgtgagag attccatggg ccttatgaca tatcctggaa agagtagctc 14400 atcagactta cgtttactct ctatatcaat atctacatca ggagcaatca tttcaataaa 14460 cagccgacat acatcccaga cgctataagc tgtacgtgct tttaccgtca gattcttggc 14520 tgtttcaatg tcgtccattt tggttttctt ttaccagtat tgttcgtttg ataatgtatt 14580 cttgcttatt acattataaa atctgtgcag atcacatgtc aaaacaactt tttatcacaa 14640 gatagtaccg caaaacgaac ctgcgggccg tctaaaaatt aaggaaaagc agcaaaggtg 14700 catttttaaa atatgaaatg aagataccgc agtaccaatt attttcgcag tacaaataat 14760 gcgcggccgg tgcatttttc gaaagaacgc gagacaaaca ggacaattaa agttagtttt 14820 tcgagttagc gtgtttgaat actgcaagat acaagataaa tagagtagtt gaaactagat 14880 atcaattgca cacaagatcg gcgctaagca tgccacaatt tggtatatta tgtaaaacac 14940 cacctaaggt gcttgttcgt cagtttgtgg aaaggtttga aagaccttca ggtgagaaaa 15000 tagcattatg tgctgctgaa ctaacctatt tatgttggat gattacacat aacggaacag 15060 caatcaagag agccacattc atgagctata atactatcat aagcaattcg ctgagtttcg 15120 atattgtcaa taaatcactc cagtttaaat acaagacgca aaaagcaaca attctggaag 15180 cctcattaaa gaaattgatt cctgcttggg aatttacaat tattccttac tatggacaaa 15240 aacatcaatc tgatatcact gatattgtaa gtagtttgca attacagttc gaatcatcgg 15300 aagaagcaga taagggaaat agccacagta aaaaaatgct aaagcacttc taagtgaggg 15360 tgaaagcatc tgggagatca ctgagaaaat actaaattcg tttgagtata cttcgagatt 15420 tacaaaaaca aaaactttat accaattcct cttcctagct actttcatca attgtggaag 15480 attcagcgat attaagaacg ttgatccgaa atcatttaaa ttagtccaaa ataagtatct 15540 gggagtaata atccagtgtt tagtgacaga gacaaagaca agcgttagta ggcacatata 15600 cttctttagc gcaaggggta g 15621 ~210>

51 <211> 3593 <212> DNA
<213> vector pPICZalphaA <400> 51

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacag60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct600 ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagteaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaagagaggc1200 tgaagctgaattcacgtggcccagccggccgtctcggatcggtacctcgagccgcggcgg1260 ccgccagctttctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgacc1320 atcatcatcatcatcattgagtttgtagccttagacatgactgttcctcagttcaagttg1380 ggcacttacgagaagaccggtcttgctagattctaatcaagaggatgtcagaatgccatt1440 tgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtatag1500 gattttttttgtcattttgtttcttctcgtacgagcttgctcctgatcagcctatctcgc1560 agctgatgaatatcttgtggtaggggtttgggaaaatcattcgagtttga.tgtttttctt1620 ggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggat1680 cccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctc1740 ggactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccc1800 tctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaaga1860 
gaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctt1920 tttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaag1980 ttaataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaact2040 ttttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttg2100 acaattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaa2160 ccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcgg2220 tcgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccg2280 gtgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccgg2340 acaacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcgg2400 aggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagc2460 agccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtgg2520 ccgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccg2580 tcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccct2640 ccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccct2700 atttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttctt2760 ttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaag2820 gttttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggc,2880 cagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgc2940 ccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacagga3000 ctataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgacc3060 ctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaa3120 tgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtg3180 cacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtcc3240 aacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcaga3300 gcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacact3360 agaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagtt3420 ggtagotctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 3480 cagcagatta cgogcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 3540 tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag atc 3593 <210>

52 <211> 3547 <212> DNA
<213> vector pPICZalphaD' <400> 52

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacag60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct600 ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaaggggccc1200 gaattcgcatgcggccgccagctttctagaacaaaaactcatctcagaagaggatctgaa1260 tagcgccgtcgaccatcatcatcatcatcattgagtttgtagccttagacatgactgttc1320 ctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagaggat1380 gtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaa1440 cctatatagtataggattttttttgtcattttgtttcttctcgtacgagcttgctcctga1500 tcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattcgagt1560 ttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagacct1620 tcgtttgtgcggatcccccacacaccatagcttcaaaatgtttctactccttttttactc1680 ttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacccaagcacagca1740 tactaaattttccctctttcttcctctagggtgtcgttaattacccgtactaaaggtttg1800 gaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaattt1860 
ttatcacgtttctttttcttgaaatttttttttttagtttttttctctttcagtgacctc1920 cattgatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttctt1980 gttctattacaactttttttacttcttgttcattagaaagaaagcatagcaatctaatct2040 aaggggcggtgttgacaattaatcatcggcatagtatatcggcatagtataatacgacaa2100 ggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcga2160 cgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgggacttcgtgga2220 ggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagcgcggtccagga2280 ccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctggacgagctgta2340 cgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccgggccggccatgac2400 cgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaactg2460 cgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggcccacgggtcccag2520 gcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgct2580 tacattcacgCCCtCCCCCCaCatCCg'CtCtaaccgaaaaggaaggagttagacaacctg2640 aagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatat2700 ttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaa2760 ccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagctggagaccaaca2820 tgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttt2880 tccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggc2940 gaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgct3000 ctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcg3060 tggcgctttctcaatgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctcca3120 agctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaact3180 atcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggta3240 acaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggccta3300 actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 3360 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3420 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 3480 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 3540 tgagatc 3547 <210>

53 <211> 3558 <212> DNA
<213> vector pPICZalphaE' <400> 53

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacag60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct600 ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaagagaggc1200 tgaagcctgcagcatatgctcgaggccgccagctttctagaacaaaaactcatctcagaa1260 gaggatctgaatagcgccgtcgaccatcatcatcatcatcattgagtttgtagccttaga1320 catgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattcta1380 atcaagaggatgtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttt1440 tttatttgtaacctatatagtataggattttttttgtcattttgtttcttctcgtacgag1500 cttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaa1560 atcattcgagtttgatgtttttcttggtatttcccactcctcttcagagtacagaagatt1620 aagtgagaccttcgtttgtgcggatcccccacacaccatagcttcaaaatgtttctactc1680 cttttttactcttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacc1740 caagcacagcatactaaattttccctctttcttcctctagggtgtcgttaattacccgta1800 ctaaaggtttggaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggc1860 
aataaaaatttttatcacgtttctttttcttgaaatttttttttttagtttttttctctt1920 tcagtgacctccattgatatttaagttaataaacggtcttcaatttctcaagtttcagtt1980 tcatttttcttgttctattacaactttttttacttcttgttcattagaaagaaagcatag2040 caatctaatctaaggggcggtgttgacaattaatcatcggcatagtatatcggcatagta2100 taatacgacaaggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctc2160 accgcgcgcgacgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgg2220 gacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagc2280 gcggtccaggaccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctg'2340 gacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccggg2400 ccggccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccg2460 gccggcaactgcgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggccc2520 acgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagt2580 tatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagt2640 tagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaac2700 .

gttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacat2760 tatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagct2820 ggagaccaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgtt2880 gctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaag2940 tcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctc3000 cctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctccc3060 ttcgggaagcgtggegctttctcaatgctcacgctgtaggtatctcagttcggtgtaggt3120 cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt 3180 atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc 3240 agccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaa3300 gtggtggcctaactacggctacactagaaggacagtatttggtatctgcgctctgctgaa3360 gccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccacegctgg3420 tagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaaga3480 agatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagg3540 gattttggtc atgagatc 3558 <210> 54 <211> 28 <212> DNA
<213> synthetic probe or primer <400> 54 tcgagaaaag gggcccgaat tcgcatgc 28 <210> 55 <211> 28 <212> DNA
<213> synthetic probe or primer <400> 55 ggccgcatgc gaattcgggc cccttttc 28 <210> 56 <211> 35 <212> DNA
<213> synthetic probe or primer <400> 56 tcgagaaaag agaggctgaa gcctgcagca tatgc 35 <210> 57 <211> 35 <212> DNA
<213> synthetic probe or primer <400> 57 ggccgcatat gctgcaggct tcagcctctc ttttc 35 <210> 58 <211> 3997 <212> DNA
<213> vector pPTCZalphaD~ElsH6 <400> 58 agatctaaca tccaaagacg aaaggttgaa tgaaaccttt ttgccatccg acatccacag 60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct600 ' ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaaggtatga1200 ggtgcgcaacgtgtccgggatgtaccatgtcacgaacgactgctccaactcaagcattgt1260 gtatgaggcagcggacatgatcatgcacacccccgggtgcgtgccctgcgttcgggagaa1320 CaaCtCttCCCgCtgCtgggtagCgCtCaCCCCCa.CgCtCgcagctaggaacgccagcgt1380 ccccactacgacaatacgacgccacgtcgatttgctcgttggggcggctgctttctgttc1440 cgctatgtacgtgggggatctctgcggatctgtcttcctcgtctcccagctgttcaccat1500 ctcgcctcgccggcatgagacggtgcaggactgcaattgctcaatctatcccggccacat1560 aacaggtcaccgtatggcttgggatatgatgatgaactggcaccaccaccatcaccatta1620 aagatctaagcttgaatcccgcggccatgcgaattcgcatgcggccgccagctttctaga1680 acaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatca1740 ttgagtttgtagccttagacatgactgttcctcagttcaagttgggcacttacgagaaga1800 
ccggtcttgctagattctaatcaagaggatgtcagaatgccatttgcctgagagatgcag1860 gcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcatt1920 ttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatctt1980 gtggtaggggtttgggaaaatcattcgagtttgatgtttttcttggtatttcccactcct2040 cttcagagtacagaagattaagtgagaccttcgtttgtgcggatcccccacacaccatag2100 cttcaaaatgtttctactccttttttactcttccagattttctcggactccgcgcatcgc2160 cgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttcctctagg2220 gtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctcgtttctt2280 tttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaaatttttt2340 tttttagtttttttctctttcagtgacctccattgatatttaagttaataaacggtcttc2400 aatttctcaagtttcagtttcatttttcttgttctattacaactttttttacttcttgtt2460 cattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaatcatcggc2520 atagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggccaagttgac2580 cagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccga2640 ccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacga2700 cgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccctggcctg2760 ggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaa2820 cttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtgggggcggga2880 gttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagcaggactg2940 acacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttg3000 tcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctc3060 taaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatag3120 ttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagac3180 gcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcga3240 aggctttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaaggccagga3300 accgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatc3360 acaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccagg3420 CgtttCCCCCtggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggat3480 acctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggt3540 atctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttc3600 
agcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacg3660 acttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcg3720 gtgctacagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttg3780 gtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccg3840 gcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgca3900 gaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtgga3960 acgaaaactcacgttaagggattttggtcatgagatc 3997 <210> 59 <211> 4004 <212> DNA
<213> vector pPICZalphaE'E1sH6 <400> 59

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacag60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct600 ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaagagaggc1200 tgaagcctatgaggtgcgcaacgtgtccgggatgtaccatgtcacgaacgactgctccaa1260 ctcaagcattgtgtatgaggcagcggacatgatcatgcacacccccgggtgcgtgccctg1320 cgttcgggagaacaactcttcccgctgctgggtagcgctcacccccacgctcgcagctag1380 gaacgccagcgtccccactacgacaatacgacgccacgtcgatttgctcgttggggcggc1440 tgctttctgttccgctatgtacgtgggggatctctgcggatctgtcttcctcgtctccca1500 gctgttcaccatctcgcctcgccggcatgagacggtgcaggactgcaattgctcaatcta1560 tcccggccacataacgggtcaccgtatggcttgggatatgatgatgaactggcaccacca1620 ccatcaccattaaagatctaagcttgaatcccgcggccatggcatatgcggccgccagct1680 ttctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatc1740 atcatcattgagtttgtagccttagacatgactgttcctcagttcaagttgggcacttac1800 gagaagaccggtettgctagattctaatcaagaggatgtcagaatgccatttgcctgaga1860 
gatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttt1920 tgtcattttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatga1980 atatcttgtggtaggggtttgggaaaatcattcgagtttgatgtttttcttggtatttcc2040 cactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcccccacac2100 accatagcttcaaaatgtttctactccttttttactcttccagattttctcggactccgc2160 gcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttc2220 ctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctc2280 gtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaa2340 atttttttttttagtttttttctctttcagtgacctccattgatatttaagttaataaac2400 ggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactttttttact2460 tcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaat2520 catcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggcca2580 agttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttct2640 ggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtcc2700 gggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccc2760 tggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgt2820 ccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtggg2880 ggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagc2940 aggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttt3000 tcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccaca3060 tccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttt3120 tttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctg3180 tacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttggga3240 cgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaag3300 gccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgac3360 gagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaaga3420 taccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgctt3480 accggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgc3540 tgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccc3600 cccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggta3660 
agacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtat3720 gtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaaggaca3780 gtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctct3840 tgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagatt3900 acgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgct3960 cagtggaacgaaaactcacgttaagggattttggtcatgagatc 4004 <210> 60 <211> 4492 <212> DNA
<213> vector pPICZalphaD'E2sH6 <400> 60

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacag 60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt 120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc 180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta 240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta 300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg 360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct 420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg 480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt 540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct 600 ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct 660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact 720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat 780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaaggcatac1200 ccgcgtgtcaggaggggcagcagcctccgataccaggggccttgtgtccctctttagccc1260 cgggtcggctcagaaaatccagctcgtaaacaccaacggcagttggcacatcaacaggac1320 tgccctgaactgcaacgactccctccaaacagggttctttgecgcactattctacaaaca1380 caaattcaaCtcgtctggatgcccagagcgcttggccagctgtcgctccatcgacaagtt1440 cgctcaggggtggggtcccctcacttacactgagcctaacagctcggaccagaggcccta1500 ctgctggcactacgcgcctcgaccgtgtggtattgtacccgcgtctcaggtgtgcggtcc1560 agtgtattgcttcaccccgagccctgttgtggtggggacgaccgatcggtttggtgtccc1620 cacgtataactggggggcgaacgactcggatgtgctgattctcaacaacacgcggccgcc1680 gcgaggcaactggttcggctgtacatggatgaatggcactgggttcaccaagacgtgtgg1740 gggccccccgtgcaacatcgggggggccggcaacaacaccttgacctgccccactgactg1800 
ttttcggaagcaccccgaggccacctacgccagatgcggttctgggccctggctgacacc1860 taggtgtatggttcattacccatataggctctggcactacccctgcactgtcaacttcac1920 catcttcaaggttaggatgtacgtggggggcgtggagcacaggttcgaagccgcatgcaa1980 ttggactcgaggagagcgttgtgacttggaggacagggatagatcagagcttagcccgct2040 gctgctgtctacaacagagtggcaggtgatcgagggcagacaccatcaccaccatcacta2100 atagttaattaactgcaggcatgcaagcttatcgataccgtcgacgaattcgcatgcggc2160 cgccagctttctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgacca2220 tcatcatcatcatcattgagtttgtagccttagacatgactgttcctcagttcaagttgg2280 gcacttacgagaagaccggtcttgctagattctaatcaagaggatgtcagaatgccattt2340 gcctgagagatgcaggcttcatttttgatacttttttatttgtaacctat'atagtatagg2400 attttttttgtcattttgtttcttctcgtacgagcttgctcctgatcagcctatctcgca2460 gctgatgaatatcttgtggtaggggtttgggaaaatcattcgagtttgatgtttttcttg2520 gtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatc2580 ccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcg2640 gactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccct2700 ctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagag2760 accgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttcttt2820 ttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagt2880 taataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactt2940 tttttacttcttgttcattagaaagaaagcatagcaatetaatctaaggggcggtgttga3000 caattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaac3060 catggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggt3120 cgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccgg3180 tgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccgga3240 caacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcgga3300 ggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagca3360 gccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggc3420 cgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgt3480 cccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctc3540 cccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtcccta3600 
tttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttt3660 tttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaagg3720 ttttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggcc3780 agcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcc3840 cccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggac3900 tataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccc3960 tgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaat4020 gctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgc4080 acgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtcca4140 acccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagag4200 cgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacacta4260 gaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttg4320 gtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagc4380 agcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggt4440 ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga tc 4492 <210> 61 <211> 4431 <212> DNA
<213> vector pPICZalphaE'E2sH6 <400> 61

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacag60 gtccattctcacacataagtgccaaacgcaacaggaggggatacactagcagcagaccgt120 tgcaaacgcaggacctccactcctcttctcctcaacacccacttttgccatcgaaaaacc180 agcccagttattgggcttgattggagctcgctcattccaattccttctattaggctacta240 acaccatgactttattagcctgtctatcctggcccccctggcgaggttcatgtttgttta300 tttccgaatgcaacaagctccgcattacacccgaacatcactccagatgagggctttctg360 agtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaacgct420 gtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcg480 ttgaaatgctaacggccagttggtcaaaaagaaacttccaaaagtcggcataccgtttgt540 cttgtttggtattgattgacgaatgctcaaaaataatctcattaatgcttagcgcagtct600 ctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgct660 ttttggatgattatgcattgtctccacattgtatgcttccaagattctggtgggaatact720 gctgatagcctaacgttcatgatcaaaatttaactgttctaacccctacttgacagcaat780 atataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagctt840 actttcataattgcgactggttccaattgacaagcttttgattttaacgacttttaacga900 caacttgagaagatcaaaaaacaactaattattcgaaacgatgagatttccttcaatttt960 tactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacaga1020 agatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaagggga1080 tttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaa1140 tactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaagagaggc1200 tgaagcccatacccgcgtgtcaggaggggcagcagcctccgataccaggggccttgtgtc1260 cctctttagccccgggtcggctcagaaaatccagctcgtaaacaccaacggcagttggca1320 catcaacaggactgccctgaactgcaacgactccctccaaacagggttctttgccgcact1380 attctacaaacacaaattcaactcgtctggatgcccagagcgcttggccagctgtcgctc1440 catcgacaagttcgctcaggggtggggtcccctcacttacactgagcctaacagctcgga1500 ccagaggccctactgctggcactacgcgcctcgaccgtgtggtattgtacccgcgtctca1560 ggtgtgcggtccagtgtattgcttcaccccgagccctgttgtggtggggacgaccgatcg1620 gtttggtgtccccacgtataactggggggcgaacgactcggatgtgctgattctcaacaa1680 cacgcggccgccgcgaggcaactggttcggctgtacatggatgaatggcactgggttcac1740 caagacgtgtgggggccccccgtgcaacatcgggggggccggcaacaacaccttgacctg1800 ccccactgactgttttcggaagcaccccgaggccacctacgccagatgcggttctgggcc1860 
ctggctgacacctaggtgtatggttcattacccatataggctctggcactacccctgcac1920 tgtcaacttcaccatcttcaaggttaggatgtacgtggggggcgtggagcacaggttcga1980 agccgcatgcaattggactcgaggagagcgttgtgacttggaggacagggatagatcaga2040 gcttagcccgctgctgctgtctacaacagagtggcaggtgatcgagggcagacaccatca2100 ccaccatcactaatagttaattaactgcaggcatgcaagcttatcgataccgtcgaccat2160 catcatcatcatcattgagtttgtagccttagacatgactgttcctcagttcaagttggg2220 cacttacgagaagaccggtcttgctagattctaatcaagaggatgtcagaatgccatttg2280 cctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtatagga2340 ttttttttgtcattttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcag2400 ctgatgaatatcttgtggtaggggtttgggaaaatcattcgagtttgatgtttttcttgg2460 tatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatec2520 cccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcgg2580 actccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctc2640 tttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagaga2700 ccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttt2760 tcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagtt2820 aataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaacttt2880 ttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgac2940 aattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaacc3000 atggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtc3060 gagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggt3120 gtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggac3180 aacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggag3240 gtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcag3300 ccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggcc3360 gaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtc3420 ccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctcc3480 ccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctat3540 ttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttctttt3600 ttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggt3660 
tttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggcca3720 gcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgccc3780 ccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggact3840 ataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccct3900 gccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatg3960 ctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgca4020 cgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaa4080 cccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagc4140 gaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactag4200 aaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttgg4260 tagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagca4320 gcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtc4380 tgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagatc 4431 <210> 62 <211> 2880 <212> DNA

<213> vector pUC18MFa <400> 62

gcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggca60 cgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagct120 cactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaat180 tgtgagcggataacaatttcacacaggaaacagctatgaccatgattacgccaagcttac240 cccttcttctttagcagcaatgctggcaatagtagtatttataaacaataacccgttatt300 tgtgctgttg gaaaatggca aaacagcaac atcgaaatcc ccttctaaat ctgagtaacc 360 gatgacagct tcagccggaa tttgtgccgt ttcatcttct gttgtagtgt tgactggagc 420 agctaatgcg gaggatgctg cgaataaaac tgcagtaaaa attgaaggaa atctcatgaa 480 ttcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa 540 tcgccttgcagcacatccccctttcgecagctggcgtaatagcgaagaggcccgcaccga600 tcgcccttcccaacagttgcgcagcctgaatggcgaatggcgcctgatgcggtattttct660 ccttacgcatctgtgcggtatttcacaccgcatatggtgcactctcagtacaatctgctc720 tgatgccgcatagttaagccagccccgacaCCCgCCaaCaCCCgCtgaCgCg'CCCtgaCg780 ggcttgtctgctcccggcatccgcttacagacaagctgtgaccgtctccgggagctgcat840 gtgtcagaggttttcaccgtcatcaccgaaacgcgcgagacgaaagggcctcgtgatacg900 cctatttttataggttaatgtcatgataataatggtttcttagacgtcaggtggcacttt960 tcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgta1020 tccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagtat1080 gagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgt1140 ttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacg1200 agtgggttacatcgaactggatctcaacagcggtaagatccttgagagttttcgccccga1260 agaacgttttccaatgatgagcacttttaaagttctgctatgtggcgcggtattatcccg1320 tattgacgccgggcaagagcaactcggtcgccgcatacactattctcagaatgacttggt1380 tgagtactcaccagtcacagaaaagcatcttacggatggcatgacagtaagagaattatg1440 cagtgctgccataaccatgagtgataacactgcggccaacttacttctgacaacgatcgg1500 aggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttga1560 tcgttgggaaccggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcc1620 tgtagcaatggcaacaacgttgcgcaaactattaactggcgaactacttactctagcttc1680 ccggcaacaattaatagactggatggaggcggataaagttgcaggaccacttctgcgctc1740 ggcccttccggctggctggtttattgctgataaat.ctggagccggtgagcgtgggtctcg1800 
cggtatcattgcagcactggggccagatggtaagccctcccgtatcgtagttatctacac1860 gacggggagtcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcctc1920 actgattaagcattggtaactgtcagaccaagtttactcatatatactttagattgattt1980 aaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgac2040 caaaatcccttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaa2100 aggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaacc2160 accgctaccagcggtggtttgtttgccggatcaagagctaccaactctttttccgaaggt2220 aactggcttcagcagagcgcagataccaaatactgtccttctagtgtagccgtagttagg2280 ccaccacttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttacc2340 agtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaagacgatagtt2400 accggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttgga2460 gcgaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgct2520 tcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcg2580 cacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggtttcgcca2640 cctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaa2700 cgccagcaacgcggcctttttacggttcctggccttttgctggccttttgctcacatgtt2760 ctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctga2820 taccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaaga2880 <210> 63 <211> 6 <212> PRT
<213> adaptor peptide <400> 63 His His His His His His <210> 64 <211> 6 <212> PRT
<213> adaptor peptide <400> 64 Glu Glu Gly Glu Pro Lys <210> 65 <211> 6 <212> PRT
<213> adaptor peptide <400> 65 Glu Glu Ala Glu Pro Lys <210> 66 <211> 5 <212> PRT
<213> processing site <220>
<221> MISC_FEATURE
<222> (5)..(5) <223> X is any amino acid <400> 66 Ile Glu Gly Arg Xaa <210> 67 <211> 5 <212> PRT
<213> processing site <220>
<221> MISC_FEATURE
<222> (5)..(5) <223> X is any amino acid <400> 67 Ile Asp Gly Arg Xaa <210> 68 <211> 5 <212> PRT
<213> processing site <220>
<221> MISC_FEATURE
<222> (5)..(5) <223> X is any amino acid <400> 68 Ala Glu Gly Arg Xaa <210> 69 <211> 5 <212> PRT
<213> adaptor peptide <400> 69 Val Ile Glu Gly Arg <210> 70 <211> 4 <212> PRT
<213> adaptor peptide <400> 70 Ile Glu Gly Arg <210> 71 <211> 4 <212> PRT
<213> adaptor peptide <400> 71 Ile Asp Gly Arg <210> 72 <211> 4 <212> PRT
<213> adaptor peptide <400> 72 Ala Glu Gly Arg <210> 73 <211> 4 <212> PRT
<213> HCV E1 <400> 73 Asn Asn Ser Ser <210> 74 <211> 8 <212> PRT
<213> FLAG epitope <400> 74 Asp Tyr Lys Asp Asp Asp Asp Lys <210> 75 <211> 12 <212> PRT
<213> Protein C epitope <400> 75 Glu Asp Gln Val Asp Pro Arg Leu Ile Asp Gly Lys <210> 76 <211> 11 <212> PRT
<213> VSV epitope <400> 76 Tyr Thr Asp Ile Glu Met Asn Arg Leu Gly Lys <210> 77 <211> 9 <212> PRT
<213> streptag <400> 77 Ala Trp Arg His Pro Gln Phe Gly Gly <210> 78 <211> 12 <212> PRT
<213> Tag100 epitope <400> 78 Glu Glu Thr Ala Arg Phe Gln Pro Gly Tyr Arg Ser <210> 79 <211> 10 <212> PRT
<213> c-myc epitope <400> 79 Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu <210> 80 <211> 11 <212> PRT
<213> HA epitope <400> 80 Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu <210> 81 <211> 9 <212> PRT
<213> HA epitope <400> 81 Tyr Pro Tyr Asp Val Pro Asp Tyr Ala <210> 82 <211> 12 <212> PRT
<213> HA epitope <400> 82 Cys Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu <210> 83 <211> 6 <212> PRT
<213> thrombin cleavage site <400> 83 Leu Val Pro Arg Gly Ser <210> 84 <211> 4 <212> PRT
<213> collagenase recognition site <220>
<221> MISC_FEATURE
<222> (2)..(2) <223> Xaa is any amino acid, but most frequently a neutral amino acid <400> 84 Pro Xaa Gly Pro <210> 85 <211> 192 <212> PRT
<213> hepatitis C virus <400> 85 Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys Pro Asn Ser Ser Val Val Tyr Glu Ala Ala Asp Ala Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro Gln Ala Ile Met Asp Met Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Glu Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala <210> 86 <211> 209 <212> PRT
<213> hepatitis C virus <400> 86 Met Leu Gly Lys Leu Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Ala Arg Val Leu Glu Asp Gly Val Ile Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys Pro Asn Ser Ser Val Val Tyr Glu Ala Ala Asp Ala Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 87 <211> 192 <212> PRT
<213> hepatitis C virus <400> 87 Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly <210> 88 <211> 209 <212> PRT
<213> hepatitis C virus <400> 88 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 89 <211> 209 <212> PRT
<213> hepatitis C virus <400> 89 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Val Glu Val Lys Asn Asn Ser Asn Ser Tyr Met Ala Thr Asn Asp Cys Ser Asn Ser Ser Ile Ile Trp Gln Leu Glu Gly Ala Val Leu His Thr Pro Gly Cys Val Pro Cys Glu Leu Ala Asp Asn Thr Ser Arg Cys Trp Val Pro Val Thr Pro Asn Met Ala Ile Arg Gln Pro Gly Glu Leu Thr Lys Gly Leu Arg Ala His Val Asp Val Ile Val Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Leu Met Ile Ala Ala Gln Val Val Val Val Ser Pro Gln His His His Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 90 <211> 209 <212> PRT
<213> hepatitis C virus <400> 90 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Val Thr Ala Pro Val Ser Ala Val Glu Val Lys Asn Thr Ser Gln Ala Tyr Met Ala Thr Asn Asp Cys Ser Asn Asn Ser Ile Val Trp Gln Leu Glu Asp Ala Val Leu His Val Pro Gly Cys Val Pro Cys Glu Asn Ser Ser Gly Arg Phe His Cys Trp Ile Pro Ile Ser Pro Asn Ile Ala Val Ser Lys Pro Gly Ala Leu Thr Lys Gly Leu Arg Ala Arg Ile Asp Ala Val Val Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met Ile Ala Ala Gln Ala Phe Ile Val Ala Pro Lys Arg His Tyr Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 91 <211> 209 <212> PRT
<213> hepatitis C virus <400> 91 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Ile His Pro Ala Ala Ser Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu His Thr Pro Gly Cys Ile Pro Cys Val Gln Asp Gly Asn Thr Ser Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Lys Tyr Val Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 92 <211> 209 <212> PRT
<213> hepatitis C virus <400> 92 Met Ser Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Thr Ser Ala Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Glu His His Ile Leu His Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro Leu Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr Ala Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 93 <211> 209 <212> PRT
<213> hepatitis C virus <400> 93 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Val Pro Tyr Arg Asn Ala Ser Gly Ile Tyr His Val Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu His Ala Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Val Ser Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg Gln His Ala Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Ser Gly His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 94 <211> 209 <212> PRT
<213> hepatitis C virus <400> 94 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu Gly Gly Val Ala Ala Ala Phe Ala His Gly Val Arg Ala Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Thr Pro Ala Ser Ala Leu Thr Tyr Gly Asn Ser Ser Gly Leu Tyr His Leu Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Leu Glu Ala Asp Ala Met Ile Leu His Leu Pro Gly Cys Leu Pro Cys Val Arg Val Asn Asn Gln Ser Thr Cys Trp His Ala Val Ser Pro Thr Leu Ala Ile Pro Asn Ala Ser Thr Pro Ala Thr Gly Phe Arg Arg His Val Asp Leu Leu Ala Gly Ala Ala Val Val Cys Ser Ser Leu Tyr Ile Gly Asp Leu Cys Gly Ser Leu Phe Leu Ala Gly Gln Leu Phe Thr Phe Gln Pro Arg Arg His Trp Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Thr Gly His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 95 <211> 209 <212> PRT
<213> hepatitis C virus <400> 95 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu Gly Gly Ile Ala Ala Ala Leu Ala His Gly Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Thr Pro Ala Ser Ala Val His Tyr Ala Asn Lys Ser Gly Leu Tyr His Leu Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Pro Ala Val Ile Met His Leu Pro Gly Cys Val Pro Cys Val Lys Val Gly Asn Gln Ser Thr Cys Trp Leu Pro Ala Ser Pro Thr Leu Ala Val Pro Asn Ala Ser Thr Pro Leu Thr Arg Phe Arg Lys His Val Asp Leu Met Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Ile Cys Gly Gly Leu Phe Leu Leu Gly Gln Val Val Thr Ile Arg Pro Arg Leu His Gln Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Thr Gly Lys Ile Thr Gly His Arg Met Ala Trp Asp Ile Met Met Asn Trp <210> 96 <211> 209 <212> PRT
<213> hepatitis C virus <400> 96 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Leu Ala Asp Leu Met Gly Tyr Ile Pro Val Leu Gly Gly Pro Leu Gly Gly Val Ala Ala Ala Leu Ala His Gly Val Arg Ala Ile Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Leu Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Ile Gln Val Lys Asn Ala Ser Gly Ile Tyr His Leu Thr Asn Asp Cys Ser Asn Asn Ser Ile Val Phe Glu Ala Glu Thr Met Ile Leu His Leu Pro Gly Cys Val Pro Cys Ile Lys Ala Gly Asn Glu Ser Arg Cys Trp Leu Pro Val Ser Pro Thr Leu Ala Val Pro Asn Ser Ser Val Pro Ile His Gly Phe Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Ile Gly Asp Leu Cys Gly Ser Ile Phe Leu Val Gly Gln Leu Phe Thr Phe Arg Pro Lys Tyr His Gln Val Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp <210> 97 <211> 363 <212> PRT
<213> hepatitis C virus <400> 97 Glu Thr His Val Thr Gly Gly Asn Ala Gly Arg Thr Thr Ala Gly Pro Val Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr Gln His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Arg Leu Thr Asp Phe Ala Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu Asp Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly Asn Asn Thr Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Met Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn Tyr Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Glu Phe Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala <210> 98 <211> 363 <212> PRT
<213> hepatitis C virus <400> 98 His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp Lys Phe Ala Gln Gly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly Val Gly Ser Ala Val Val Ser Leu Val Ile Lys Trp Glu Tyr Val Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ala Cys Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala
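
For readers who wish to process the PRT entries of the sequence listing computationally, the following is a minimal sketch in Python and not part of the disclosure. It converts a three-letter residue listing into the one-letter code. The helper name `to_one_letter` is hypothetical, and the sketch assumes that any digits-only token is a position number of the kind printed beneath the residues in a sequence listing, which is skipped.

```python
# Illustrative sketch only; the helper name is hypothetical.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert space-separated three-letter residue codes to one-letter code."""
    residues = []
    for token in listing.split():
        if token.isdigit():
            # Assumed to be a printed position number, not a residue.
            continue
        if token not in THREE_TO_ONE:
            # Surface damaged tokens instead of guessing the residue.
            raise ValueError(f"unrecognized residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# First ten residues of SEQ ID NO:89 as given in the listing.
fragment = "Met Leu Gly Lys Val Ile Asp Thr Leu Thr"
print(to_one_letter(fragment))  # MLGKVIDTLT
```

Comparing `len(to_one_letter(...))` for a full entry with its `<211>` value is a simple way to detect residues lost or damaged in transcription.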

Claims

1. A recombinant nucleic acid comprising a nucleotide sequence encoding a protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof.

2. The recombinant nucleic acid according to claim 1 wherein said protein is characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof, A1, A2, A3 and A4 are adaptor peptides which can be different or the same, PS1 and PS2 are processing sites which can be different or the same, HCVENV is a HCV envelope protein or a part thereof, a, b, c, d, e and f are 0 or 1, and wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.
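
By way of illustration only, the structural formula of claim 2 can be read as six independent 0/1 switches on the optional elements. The following Python sketch, which is not part of the claims, enumerates the resulting layouts. The function name `construct_layout` is hypothetical, and the optional overlap of adaptor peptides with processing sites is not modelled.

```python
from itertools import product


def construct_layout(a, b, c, d, e, f):
    """Return the ordered elements present for the given 0/1 indices.

    Hypothetical helper: it models only the presence or absence of each
    optional element, not the case where an adaptor is part of a
    processing site.
    """
    n_terminal = [name for name, flag in (("A1", a), ("PS1", b), ("A2", c)) if flag]
    c_terminal = [name for name, flag in (("A3", d), ("PS2", e), ("A4", f)) if flag]
    return ["CL", *n_terminal, "HCVENV", *c_terminal]


# Every combination of the six indices a..f, each being 0 or 1.
layouts = ["-".join(construct_layout(*flags)) for flags in product((0, 1), repeat=6)]

print(len(layouts))   # 64
print(layouts[0])     # CL-HCVENV
print(layouts[-1])    # CL-A1-PS1-A2-HCVENV-A3-PS2-A4
```

The simplest layout, with all indices equal to 0, corresponds to the leader peptide joined directly to the envelope protein as in claim 1.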

3. The recombinant nucleic acid according to claim 1 or 2 further comprising regulatory elements allowing expression of said protein in a eukaryotic host cell.

4. The recombinant nucleic acid according to any of claims 1 to 3 wherein the avian lysozyme leader peptide CL has an amino acid sequence defined by SEQ ID NO:1.

5. The recombinant nucleic acid according to claim 2 or 3 wherein A has an amino acid sequence chosen from SEQ ID NOs:63-65, 70-72 and 74-82, wherein PS has an amino acid sequence chosen from SEQ ID NOs:66-68 and 83-84 or wherein PS is a dibasic site such as Lys-Lys, Arg-Arg, Lys-Arg and Arg-Lys or a monobasic site such as Lys, and wherein HCVENV is chosen from SEQ ID NOs:85-98 and fragments thereof.
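
As a hedged illustration of the dibasic processing sites named in claim 5, the following Python sketch locates Lys-Lys, Arg-Arg, Lys-Arg and Arg-Lys pairs in a one-letter amino acid sequence. The function name `find_dibasic_sites` and the example peptide are hypothetical and are not taken from the sequence listing. The presence of such a pair indicates only a candidate site, not that cleavage occurs.

```python
# The four dibasic pairs named in claim 5, in one-letter code (K = Lys, R = Arg).
DIBASIC_PAIRS = {"KK", "RR", "KR", "RK"}


def find_dibasic_sites(sequence: str):
    """Return (1-based position, pair) for each dibasic pair in the sequence.

    Overlapping pairs are all reported, e.g. 'KRK' yields both KR and RK.
    """
    sequence = sequence.upper()
    return [
        (i + 1, sequence[i:i + 2])
        for i in range(len(sequence) - 1)
        if sequence[i:i + 2] in DIBASIC_PAIRS
    ]


# Hypothetical peptide used purely for demonstration.
print(find_dibasic_sites("AAKRGG"))  # [(3, 'KR')]
```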

6. A vector comprising the recombinant nucleic acid according to any of claims 1 to 5.

7. The vector according to claim 6 which is an expression vector.

8. The vector according to claim 6 or 7 which is an autonomously replicating vector or an integrative vector.

9. The vector according to any of claims 6 to 8 which is chosen from SEQ ID NOs: 20, 21, 32, 35, 36, 39, 40, 49 and 50.

10. A host cell comprising the recombinant nucleic acid according to any of claims 1 to 5 or the vector according to any of claims 6 to 9.

11. The host cell according to claim 10 which is capable of expressing the protein comprising an avian lysozyme leader peptide or a functional equivalent thereof joined to an HCV envelope protein or a part thereof.

12. The host cell according to claim 10 or 11 which is capable of expressing the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof, A1, A2, A3 and A4 are adaptor peptides which can be different or the same, PS1 and PS2 are processing sites which can be different or the same, HCVENV is a HCV envelope protein or a part thereof, a, b, c, d, e and f are 0 or 1, and wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.

13. The host cell according to any of claims 10 to 12 which is capable of translocating the protein CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f] to the endoplasmic reticulum upon removal of the CL peptide wherein said protein and said CL peptide are derived from the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof, A1, A2, A3 and A4 are adaptor peptides which can be different or the same, PS1 and PS2 are processing sites which can be different or the same, HCVENV is a HCV envelope protein or a part thereof, a, b, c, d, e and f are 0 or 1, and wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.

14. The host cell according to any of claims 10 to 13 which is capable of processing the processing sites PS1 and/or PS2 in said protein translocated to the endoplasmic reticulum.

15. The host cell according to any of claims 10 to 13 which is capable of N-glycosylating said protein translocated to the endoplasmic reticulum.

16. The host cell according to claim 14 which is capable of N-glycosylating said protein translocated to the endoplasmic reticulum and processed at said sites PS1 and/or PS2.
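
For orientation only, candidate N-glycosylation sites of the kind referred to in claims 15 and 16 are conventionally recognized by the sequon Asn-X-Ser/Thr, where X is any residue other than Pro. The following Python sketch, which is not part of the claims, scans a one-letter sequence for that motif. The function name `find_sequons` is hypothetical. A matching sequon is necessary but not sufficient for glycosylation, so the output lists candidate sites only.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Ten-residue stretch of SEQ ID NO:89 (Val Glu Val Lys Asn Asn Ser Asn Ser Tyr)
# written in one-letter code.
print(find_sequons("VEVKNNSNSY"))  # [5]
```

In this stretch only the first Asn is followed by the required pattern (Asn-Asn-Ser); the other two Asn residues are not followed by Ser or Thr at the second position after them.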

17. The host cell according to any of claims 10 to 16 which is a eukaryotic cell.

18. The host cell according to any of claims 10 to 16 which is a fungal cell.

19. The host cell according to claim 17 which is a yeast cell.

20. The host cell according to claim 19 which is a Saccharomyces cell, such as a Saccharomyces cerevisiae cell, a Saccharomyces kluyveri cell, or a Saccharomyces uvarum cell, a Schizosaccharomyces cell, such as a Schizosaccharomyces pombe cell, a Kluyveromyces cell, such as a Kluyveromyces lactis cell, a Yarrowia cell, such as a Yarrowia lipolytica cell, a Hansenula cell, such as a Hansenula polymorpha cell, a Pichia cell, such as a Pichia pastoris cell, an Aspergillus cell, a Neurospora cell, such as a Neurospora crassa cell, or a Schwanniomyces cell, such as a Schwanniomyces occidentalis cell, or a mutant cell derived from any thereof.

21. A method for producing a HCV envelope protein or part thereof in a host cell, said method comprising transforming said host cell with the recombinant nucleic acid according to any of claims 1 to 5 or with the vector according to any of claims 6 to 9, and wherein said host cell is capable of expressing a protein comprising the avian lysozyme leader peptide or a functional equivalent thereof joined to a HCV envelope protein or a part thereof.

22. A method for producing a HCV envelope protein or part thereof in a host cell, said method comprising transforming said host cell with the recombinant nucleic acid according to any of claims 1 to 5 or with the vector according to any of claims 6 to 9, and wherein said host cell is capable of expressing the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof, A1, A2, A3 and A4 are adaptor peptides which can be different or the same, PS1 and PS2 are processing sites which can be different or the same, HCVENV is a HCV envelope protein or a part thereof, a, b, c, d, e and f are 0 or 1, and wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.

23. The method according to claim 21 or 22 wherein said host cell is capable of translocating the protein CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f] to the endoplasmic reticulum upon removal of the CL peptide wherein said protein and said CL peptide are derived from the protein characterized by the structure CL-[(A1)a - (PS1)b - (A2)c]-HCVENV-[(A3)d - (PS2)e - (A4)f]
wherein:
CL is an avian lysozyme leader peptide or a functional equivalent thereof, A1, A2, A3 and A4 are adaptor peptides which can be different or the same, PS1 and PS2 are processing sites which can be different or the same, HCVENV is a HCV envelope protein or a part thereof, a, b, c, d, e and f are 0 or 1, and wherein, optionally, A1 and/or A2 are part of PS1 and/or wherein A3 and/or A4 are part of PS2.

24. The method according to any of claims 21 to 23 wherein said host cell is capable of processing the processing sites PS1 and/or PS2 in said protein translocated to the endoplasmic reticulum.

25. The method according to any of claims 21 to 23 further comprising in vitro processing of the processing sites PS1 and/or PS2.

26. The method according to any of claims 21 to 23 wherein said host cell is capable of N-glycosylating said protein translocated to the endoplasmic reticulum.

27. The method according to claim 24 wherein said host cell is capable of N-glycosylating said protein translocated to the endoplasmic reticulum and processed at said sites PS1 and/or PS2.

28. The method according to any of claims 21 to 27 wherein said host cell is a eukaryotic cell.

29. The method according to any of claims 21 to 27 wherein said host cell is a fungal cell.

30. The method according to any of claims 21 to 27 wherein said host cell is a yeast cell.

31. The method according to any of claims 21 to 27 wherein said host cell is a Saccharomyces cell, such as a Saccharomyces cerevisiae cell, a Saccharomyces kluyveri cell, or a Saccharomyces uvarum cell, a Schizosaccharomyces cell, such as a Schizosaccharomyces pombe cell, a Kluyveromyces cell, such as a Kluyveromyces lactis cell, a Yarrowia cell, such as a Yarrowia lipolytica cell, a Hansenula cell, such as a Hansenula polymorpha cell, a Pichia cell, such as a Pichia pastoris cell, an Aspergillus cell, a Neurospora cell, such as a Neurospora crassa cell, or a Schwanniomyces cell, such as a Schwanniomyces occidentalis cell, or a mutant cell derived from any thereof.

32. The method according to any of claims 21 to 27 further comprising cultivation of said host cells in a suitable medium to obtain expression of said protein.

33. The method according to claim 32 further comprising isolation of the expressed protein from a culture of said host cells, or from said host cells.

34. The method according to claim 33 wherein said isolation step involves lysis of said host cells in the presence of a chaotropic agent.

35. The method according to claim 33 or 34 wherein the cysteine thiol-groups in the isolated proteins are chemically modified and wherein said chemical modification is reversible or irreversible.

36. The method according to any of claims 32 to 35 involving heparin affinity chromatography.