US20090049562A1 - Porcine cmp-n-acetylneuraminic acid hydroxylase gene - Google Patents

Porcine CMP-N-acetylneuraminic acid hydroxylase gene

Publication number
US20090049562A1
US20090049562A1 US12/061,351 US6135108A US2009049562A1 US 20090049562 A1 US20090049562 A1 US 20090049562A1 US 6135108 A US6135108 A US 6135108A US 2009049562 A1 US2009049562 A1 US 2009049562A1
Authority
US
United States
Prior art keywords
cmp
sequence
porcine
neu5ac
seq
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/061,351
Inventor
Chihiro Koike
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Revivicor Inc
Original Assignee
Revivicor Inc
Application filed by Revivicor Inc
Priority to US12/061,351
Publication of US20090049562A1
Assigned to NATIONAL INSTITUTES OF HEALTH - DIRECTOR DEITR. CONFIRMATORY LICENSE (SEE DOCUMENT FOR DETAILS). Assignors: UNIVERSITY OF PITTSBURGH


Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0071 Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00 Rearing or breeding animals, not otherwise provided for; New breeds of animals
    • A01K67/027 New breeds of vertebrates
    • A01K67/0271 Chimeric animals, e.g. comprising exogenous cells
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00 Rearing or breeding animals, not otherwise provided for; New breeds of animals
    • A01K67/027 New breeds of vertebrates
    • A01K67/0275 Genetically modified vertebrates, e.g. transgenic
    • A01K67/0276 Knockout animals
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00 Drugs for disorders of the nervous system
    • A61P25/28 Drugs for disorders of the nervous system for treating neurodegenerative disorders of the central nervous system, e.g. nootropic agents, cognition enhancers, drugs for treating Alzheimer's disease or other forms of dementia
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P43/00 Drugs for specific purposes, not provided for in groups A61P1/00-A61P41/00
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P7/00 Drugs for disorders of the blood or the extracellular fluid
    • A61P7/04 Antihaemorrhagics; Procoagulants; Haemostatic agents; Antifibrinolytic agents
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/8509 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00 Genetically modified animals
    • A01K2217/07 Animals genetically altered by homologous recombination
    • A01K2217/075 Animals genetically altered by homologous recombination inducing loss of function, i.e. knock out
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2227/00 Animals characterised by species
    • A01K2227/10 Mammal
    • A01K2227/108 Swine
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2267/00 Animals characterised by purpose
    • A01K2267/02 Animal zootechnically ameliorated
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2267/00 Animals characterised by purpose
    • A01K2267/02 Animal zootechnically ameliorated
    • A01K2267/025 Animal producing cells or organs for transplantation

Definitions

  • the present invention provides porcine CMP-N-Acetylneuraminic-Acid Hydroxylase (CMP-Neu5Ac hydroxylase) protein, cDNA, and genomic DNA regulatory sequences. Furthermore, the present invention includes porcine animals, tissues, and organs, as well as cells and cell lines derived from such animals, tissues, and organs, which lack expression of functional CMP-Neu5Ac hydroxylase. Such animals, tissues, organs, and cells can be used in research and in medical therapy, including in xenotransplantation, and in industrial livestock farming operations. In addition, methods are provided to prepare organs, tissues, and cells lacking the porcine CMP-Neu5Ac hydroxylase gene for use in xenotransplantation.
  • Xenograft transplantation represents a potentially attractive alternative to artificial organs for human transplantation.
  • the potential pool of nonhuman organs is virtually limitless. Pigs are considered the most likely source of xenograft organs.
  • the supply of pigs is plentiful, breeding programs are well established, and their organ size and physiology are compatible with humans. Therefore, xenotransplantation with pig organs offers a potential solution to the shortage of organs available for clinical transplantation.
  • HAR hyperacute rejection
  • CRPs complement regulatory proteins
  • This antigen is chemically related to the human A, B, and O blood antigens, and it is present on many parasites and infectious agents, such as bacteria and viruses. Most mammalian tissue also contains this antigen, with the notable exception of Old World monkeys, apes and humans (see Joziasse, et al., J. Biol. Chem., 264, 14290-97 (1989)). Individuals without such carbohydrate epitopes produce abundant naturally occurring antibodies (IgM as well as IgG) specific to the epitopes. Many humans show significant levels of circulating IgG with specificity for gal-α-gal carbohydrate determinants (Galili, et al., J. Exp.
  • α-GT α-galactosyltransferase
  • PCT patent application WO 2004/016742 to Immerge Biotherapeutics, Inc. describes α(1,3)-galactosyltransferase null cells, methods of selecting GGTA-1 null cells, α(1,3)-galactosyltransferase null swine produced therefrom (referred to as a viable GGTA-1 null swine), methods for making such swine, and methods of using cells, tissues and organs of such a null swine for xenotransplantation.
  • N-glycolylneuraminic acid is a member of the sialic acid family of carbohydrates.
  • sialic acids are abundant and ubiquitous.
  • Sialic acid is a generic designation used for N-acylneuraminic acids (Neu5Acyl) and their derivatives.
  • N-Acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc) are two of the most abundant derivatives of sialic acids.
  • the Neu5Gc epitope is located in the terminal position in the glycan chains of glycoconjugates. Due to this exposed position, it plays an important role in cellular recognition, e.g. in the case of inflammatory reactions, maturation of immune cells, differentiation processes, hormone-, pathogen- and toxin binding (Varki, A., Glycobiology, 2, pp. 25-40 (1992)).
  • Neu5Gc-containing glycoconjugates act as antigens and can induce the formation of antibodies.
  • These antigens and antibodies have been referred to as Hanganutziu-Deicher (HD) antigens and antibodies (Hanganutziu, M., CR Soc. Biol. (Paris), 91, p. 1457 (1924); Deicher, H., Z. Hyg., 106, p. 561 (1926)).
  • Hanganutziu-Deicher antigens are detectable in many human tumors (colon carcinoma, retinoblastoma, melanoma and carcinoma of the breast) as well as in chicken tumor tissues (Higashi, H., et al. Cancer Res., 45, pp. 3796-3802 (1985)).
  • Although the amount of antigen in tumors is very small (usually less than 1% of the total amount of sialic acid, often in the range of from 0.01 to 0.1%), it is capable of inducing the formation of Hanganutziu-Deicher antibodies (Higashihara, T., et al., Int Arch Allergy Appl Immunol., 95, pp. 231-235 (1991)).
  • This immunological reaction is a potential barrier to xenotransplantation of Neu5Gc-containing pig organs to humans.
  • the Neu5Gc epitope is formed by the addition of a hydroxyl group to the N-acetyl moiety of Neu5Ac.
  • the enzyme that catalyzes the hydroxylation is CMP-Neu5Ac hydroxylase.
  • CMP-Neu5Ac hydroxylase determines the presence of the Neu5Gc epitope on cell surfaces.
  • Purification studies of CMP-Neu5Ac hydroxylase in mammals have shown that it is a soluble, cytosolic oxygenase that is dependent on cytochrome b5 and cytochrome b5 reductase (Kawano, T., et al., J. Biol. Chem., 269, pp.
  • Neu5Gc acts as an adhesion molecule for pathogens, allowing for entry into the cell (Kelm, S. and Schauer, R., Int. Rev. Cytol, 179, pp. 137-240 (1997)). This causes disease and economic losses in certain livestock species.
  • enterotoxigenic Escherichia coli with K99 fimbriae infect newborn piglets by binding to Neu5Gc in gangliosides such as Neu5Gcα2→3Galβ1→4Glcβ1→1′ceramide [GM3(Neu5Gc)], N-glycolylsialoparagloboside and GM2(Neu5Gc) attached to intestinal absorptive and mucus secreting cells, causing a potentially lethal diarrhea (Malykh, Y., et al., Biochem. J., 370, pp.
  • Pig rotavirus infects newborn pigs by binding to GM3(Neu5Gc), causing diarrhea.
  • CMP-Neu5Ac hydroxylase has been isolated from mouse liver and pig submandibular glands to homogeneity and characterized (Kawano, T., et al., J. Biol. Chem., 269, pp. 9024-9029 (1994); Schneckenburger, P., et al., Glycoconj. J., 11, pp. 194-203 (1994); and, Schlenzka, W., et al., Glycobiology, 4, pp. 675-683 (1994)).
  • JP-A 06 113838 describes the protein and DNA sequences of murine CMP-Neu5Ac hydroxylase, as well as a monoclonal antibody that specifically binds to the hydroxylase.
  • PCT Publication No. WO 97/03200A1 to Boehringer Mannheim GmbH discloses a partial cDNA for the porcine CMP-Neu5Ac hydroxylase.
  • This application discloses a cDNA sequence beginning in the middle of Exon 8 of the CMP-Neu5Ac hydroxylase gene (further disclosed as GenBank accession number Y15010).
  • PCT Publication No. WO 02/088351 to RBC Biotechnology discloses a partial cDNA and genomic sequence (exons 7-11 as well as partial genomic sequence surrounding each exon) of porcine CMP-NeuAc hydroxylase.
  • methods are provided to generate porcine cells and animals lacking the CMP-NeuAc hydroxylase epitope, optionally, in combination with other genetic modifications, such as inactivation of the alpha-1,3-galactosyltransferase gene and/or insertion of complement proteins.
  • the full length cDNA sequence, peptide sequence, and genomic organization of the porcine CMP-Neu5Ac hydroxylase gene have been determined. To date, only partial cDNA and genomic sequences have been identified.
  • the present invention provides novel porcine CMP-Neu5Ac hydroxylase protein, cDNA, cDNA variants, and genomic DNA sequence.
  • the present invention includes porcine animals, tissues, and organs, as well as cells and cell lines derived from such animals, tissue, and organs, which lack expression of functional CMP-Neu5Ac hydroxylase. Such animals, tissues, organs, and cells can be used in research and in medical therapy, including xenotransplantation.
  • methods are provided to prepare organs, tissues, and cells lacking the porcine CMP-Neu5Ac hydroxylase gene for use in xenotransplantation.
  • One aspect of the present invention provides the full length cDNA of porcine CMP-Neu5Ac hydroxylase.
  • the full length cDNA is shown in Table 1 (SEQ ID No 1) and the full length peptide sequence is provided in Table 2 (SEQ ID No 2).
  • the start codon for the full-length cDNA is located in the 3′ portion of Exon 4, and the stop codon is found in the 3′ portion of Exon 17.
  • Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 1 or 2 are provided.
  • nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25 or 30 nucleotide or amino acid sequences of SEQ ID Nos 1 or 2 are also provided. Further provided is any nucleotide sequence that hybridizes, optionally under stringent conditions, to SEQ ID No 1, as well as, nucleotides homologous thereto.
  • nucleic acid and peptide sequences encoding three novel variants of CMP-Neu5Ac hydroxylase are provided (Tables 3-8, FIG. 2).
  • SEQ ID No 3 represents the cDNA of a variant of the gene, variant-1, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15a, 16, 17, and 18.
  • SEQ ID No 5 represents the cDNA of a variant of the gene, variant-2, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 12a.
  • SEQ ID No 7 represents the cDNA of a variant of the gene, variant-3, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11 and 11a.
  • SEQ ID Nos 4, 6 and 8 represent the amino acid sequences of variant-1, variant-2 and variant-3, respectively. Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 3-8 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25, or 30 nucleotide or amino acid sequence of SEQ ID Nos 3-8 are also provided. Further provided is any nucleotide sequence that hybridizes, optionally under stringent conditions, to SEQ ID Nos 3, 5 and 7, as well as, nucleotides homologous thereto.
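The exon composition of the three variants listed above can be compared programmatically. The following is a minimal Python sketch, not part of the disclosure: the exon lists are taken from the text, while the names `COMMON_PREFIX`, `VARIANT_EXONS`, `shared_exons` and `unique_exons` are hypothetical and introduced only for illustration.

```python
# Exon composition of the three cDNA variants, as listed in the text.
# All identifiers here are illustrative and do not come from the patent.
COMMON_PREFIX = ["1", "4", "5", "6", "7", "8", "9", "10", "11"]

VARIANT_EXONS = {
    "variant-1": COMMON_PREFIX + ["12", "13", "15a", "16", "17", "18"],
    "variant-2": COMMON_PREFIX + ["12", "12a"],
    "variant-3": COMMON_PREFIX + ["11a"],
}


def shared_exons(variants):
    """Return the exons present in every variant, in first-variant order."""
    lists = list(variants.values())
    return [e for e in lists[0] if all(e in other for other in lists[1:])]


def unique_exons(variants, name):
    """Return the exons found only in the named variant."""
    others = {e for n, exons in variants.items() if n != name for e in exons}
    return [e for e in variants[name] if e not in others]


if __name__ == "__main__":
    print("shared:", shared_exons(VARIANT_EXONS))
    for name in VARIANT_EXONS:
        print(name, "only:", unique_exons(VARIANT_EXONS, name))
```

On these lists, the three variants share Exons 1 and 4-11 and differ only downstream of Exon 11.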
  • a further embodiment provides nucleic acid sequences representing genomic DNA sequences of the CMP-Neu5Ac hydroxylase gene (Table 9, FIG. 1).
  • SEQ ID Nos 10-28 represent Exons 1, 4-11, 11a, 12, 12a, 13-15, 15a, 16-18, respectively, and SEQ ID Nos 29-45 represent Introns 1a, 1b, 4-15, 15a, 16, and 17, respectively.
  • SEQ ID No. 9 represents the 5′ untranslated region of the CMP-Neu5Ac hydroxylase gene.
  • SEQ ID No. 46 (Table 10) represents the genomic DNA and regulatory sequence of CMP-Neu5Ac hydroxylase.
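The correspondence between SEQ ID numbers and exons or introns can be expanded into a lookup table. The sketch below is illustrative only and assumes that "respectively" means the labels are assigned to consecutive SEQ ID numbers in the order listed; the helper `expand` and the dictionary `SEQ_ID` are hypothetical names.

```python
def expand(spec):
    """Expand a label list such as '1, 4-11, 11a' into individual labels."""
    labels = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            labels.extend(str(n) for n in range(int(low), int(high) + 1))
        else:
            labels.append(part)
    return labels


# Label lists copied from the text; the SEQ ID assignment below assumes
# the listed order maps onto consecutive SEQ ID numbers.
EXONS = expand("1, 4-11, 11a, 12, 12a, 13-15, 15a, 16-18")
INTRONS = expand("1a, 1b, 4-15, 15a, 16, 17")

SEQ_ID = {}
SEQ_ID.update({10 + i: f"Exon {e}" for i, e in enumerate(EXONS)})
SEQ_ID.update({29 + i: f"Intron {x}" for i, x in enumerate(INTRONS)})

if __name__ == "__main__":
    for number in sorted(SEQ_ID):
        print(f"SEQ ID No {number}: {SEQ_ID[number]}")
```

The expanded lists contain 19 exon labels and 17 intron labels, which matches the sizes of the ranges 10-28 and 29-45 and serves as a quick consistency check on the listing.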
  • genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 47.
  • SEQ ID No. 47 represents the 5′ contiguous genomic sequence containing 5′ UTR, Exon 1 and a portion of intronic sequence located 3′ of Exon 1 (Table 11).
  • the genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 48.
  • SEQ ID NO. 48 represents a contiguous genomic sequence containing intronic sequence located 5′ to Exon 4, Exon 4, Intron 4, Exon 5, Intron 5, Exon 6, Intron 6, Exon 7, Intron 7, Exon 8, Intron 8, Exon 9, Intron 9, Exon 10, Intron 10, Exon 11, Intron 11, Exon 12, Intron 12, Exon 13, Intron 13, Exon 14, Intron 14, Exon 15, Intron 15, Exon 16, Intron 16, Exon 17, Intron 17, and Exon 18 (Table 12).
  • nucleotide sequences that contain at least 2775, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500 or 10,000 contiguous nucleotides of SEQ ID NO. 48 are provided, as well as nucleotide sequences at least 80, 85, 90, 95, 98, or 99% homologous to SEQ ID NO. 48.
  • genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 49.
  • SEQ ID NO. 49 represents contiguous genomic sequences containing Intronic sequence 5′ to Exon 4, Exon 4, Intron 4, Exon 5, Intron 5, Exon 6, Intron 6, Exon 7, Intron 7 and Exon 8.
  • genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 50.
  • SEQ ID NO. 50 represents contiguous genomic sequences containing Exon 12, Intron 12, Exon 13, Intron 13, Exon 14, Intron 14, Exon 15, Intron 15, Exon 16, Intron 16, Exon 17, Intron 17, and Exon 18.
  • nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 9-45, 46, 47, 48, 49 and 50 are provided.
  • nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25, 30, 50, 100, 150, 200, 300, 400, 500 or 1000 contiguous nucleotide or amino acid sequences of SEQ ID Nos 9-45, 46, 47, and 48 are also provided. Further provided is any nucleotide sequence that hybridizes, optionally under stringent conditions, to SEQ ID Nos 9-45, 46, 47, 48, 49 and 50, as well as, nucleotides homologous thereto.
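The "at least N contiguous nucleotides" criterion above can be illustrated with a short Python sketch. It uses the standard-library `difflib.SequenceMatcher` to find the longest stretch shared by two sequences on the same strand. The function names and the toy sequences are hypothetical and are not drawn from the SEQ ID listings; a real comparison would use a dedicated alignment tool.

```python
from difflib import SequenceMatcher


def longest_shared_run(query: str, reference: str) -> int:
    """Length of the longest contiguous stretch common to both sequences."""
    q, r = query.upper(), reference.upper()
    # autojunk=False disables difflib's heuristic that ignores frequent items,
    # which would otherwise misbehave on long low-complexity sequences.
    matcher = SequenceMatcher(None, q, r, autojunk=False)
    match = matcher.find_longest_match(0, len(q), 0, len(r))
    return match.size


def meets_contiguous_threshold(query: str, reference: str, min_len: int = 10) -> bool:
    """True if the sequences share at least `min_len` contiguous residues."""
    return longest_shared_run(query, reference) >= min_len


if __name__ == "__main__":
    # Toy sequences for illustration only.
    reference = "ACGTTGCAAGGCTTAACCGGTTAGC"
    query = "TTTGCAAGGCTTAACCAAA"
    print(longest_shared_run(query, reference))
    print(meets_contiguous_threshold(query, reference, min_len=10))
```

This checks one strand only; comparing against the reverse complement as well would be a natural extension.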
  • nucleic acid constructs are provided that contain cDNA or variants thereof encoding CMP-Neu5Ac hydroxylase. These cDNA sequences can be derived from SEQ ID Nos. 1-8, or any fragment thereof. Constructs can contain one, or more than one, internal ribosome entry site (IRES). The construct can also contain a promoter operably linked to the nucleic acid sequence encoding CMP-Neu5Ac hydroxylase, or, alternatively, the construct can be promoterless. In another embodiment, nucleic acid constructs are provided that contain nucleic acid sequences that permit random or targeted insertion into a host genome.
  • the expression vector can contain selectable marker sequences, such as, for example, enhanced Green Fluorescent Protein (eGFP) gene sequences, initiation and/or enhancer sequences, poly A-tail sequences, and/or nucleic acid sequences that provide for the expression of the construct in prokaryotic and/or eukaryotic host cells.
  • eGFP enhanced Green Fluorescent Protein
  • nucleic acid targeting vector constructs are also provided wherein homologous recombination in somatic cells can be achieved. These targeting vectors can be transformed into mammalian cells to target the CMP-Neu5Ac hydroxylase gene via homologous recombination.
  • the targeting vectors can contain a 3′ recombination arm and a 5′ recombination arm that are homologous to the genomic sequence of the CMP-Neu5Ac hydroxylase gene.
  • the homologous DNA sequence can include at least 15 bp, 20 bp, 25 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 2 kbp, 4 kbp, 5 kbp, 10 kbp, 15 kbp, 20 kbp, or 50 kbp of sequence homologous to the CMP-Neu5Ac hydroxylase sequence.
  • the homologous DNA sequence can include one or more intron and/or exon sequences.
  • the DNA sequence can be homologous to Intron 5 and Intron 6 of the CMP-Neu5Ac hydroxylase gene (see, for example, FIGS. 6-8).
  • the DNA sequence can be homologous to Intron 5, a 55 bp portion of Exon 6, and Intron 6 of the CMP-Neu5Ac hydroxylase gene, and contain enhanced Green Fluorescent Protein sequence in an in-frame orientation 3′ to the 55 bp portion of Exon 6 (see, for example, FIGS. 10 and 11).
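The logic of a two-arm targeting vector can be pictured as a string operation: whatever lies between the sequences matched by the 5′ and 3′ arms is exchanged for the vector's payload. The Python sketch below is a bookkeeping illustration only, not a model of the cellular recombination process. The function `replace_between_arms` and all sequences are hypothetical toy values, not taken from the SEQ ID listings.

```python
def replace_between_arms(genome: str, arm5: str, arm3: str, payload: str = "") -> str:
    """Replace whatever lies between the two homology arms with `payload`.

    An empty payload models a deletion; a non-empty payload models the
    insertion of a marker sequence at the same position.
    """
    start = genome.find(arm5)
    if start == -1:
        raise ValueError("5' arm not found in genome")
    inner_start = start + len(arm5)
    inner_end = genome.find(arm3, inner_start)
    if inner_end == -1:
        raise ValueError("3' arm not found downstream of the 5' arm")
    return genome[:inner_start] + payload + genome[inner_end:]


# Toy sequences for illustration only.
ARM5 = "ACGTACGT"    # stands in for the Intron 5 homology arm
TARGET = "GGGCCC"    # stands in for the Exon 6 segment to be replaced
ARM3 = "TGCATGCA"    # stands in for the Intron 6 homology arm
GENOME = "TTTT" + ARM5 + TARGET + ARM3 + "AAAA"

if __name__ == "__main__":
    print(replace_between_arms(GENOME, ARM5, ARM3))            # deletion
    print(replace_between_arms(GENOME, ARM5, ARM3, "ATGGTG"))  # marker insertion
```

The arms themselves are left unchanged in the result, which mirrors the requirement that they be homologous to the genomic sequence flanking the targeted region.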
  • Another embodiment of the present invention provides oligonucleotide primers capable of hybridizing to porcine CMP-Neu5Ac hydroxylase cDNA or genomic sequence, such as Seq ID Nos. 1, 3, 5, 7, 9-45, 46, 47 or 48.
  • the primers hybridize under stringent conditions to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47 or 48.
  • Another embodiment provides oligonucleotide probes capable of hybridizing to porcine CMP-Neu5Ac hydroxylase nucleic acid sequences, such as SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, or 48.
  • the polynucleotide primers or probes can have at least 14 bases, 20 bases, preferably 30 bases, or 50 bases which hybridize to a polynucleotide of the present invention.
  • the probe or primer can be at least 14 nucleotides in length, and in a preferred embodiment, is at least 15, 20, 25, 28, or 30 nucleotides in length.
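The length limits stated for primers and probes, together with the basic strand relationship they rely on, can be sketched in Python as follows. The helper names are hypothetical, the exact-match test is a crude stand-in for hybridization (which in practice depends on temperature, salt and mismatches), and the template is a toy sequence.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def meets_min_length(oligo: str, min_len: int = 14) -> bool:
    """Check the minimum length; 14 is the lowest value stated in the text."""
    return len(oligo) >= min_len


def has_perfect_site(oligo: str, template: str) -> bool:
    """True if the oligo has an exact-match site on either strand.

    `template` is one strand written 5' to 3'; the other strand is
    represented by its reverse complement.
    """
    return oligo in template or oligo in reverse_complement(template)


if __name__ == "__main__":
    template = "GGATCCATGGCTAGCTAGGAATTC"   # toy sequence, illustration only
    primer = reverse_complement(template[3:20])
    print(primer, meets_min_length(primer), has_perfect_site(primer, template))
```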
  • mammalian cells lacking at least one allele of the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein are provided. These cells can be obtained as a result of homologous recombination. Particularly, by inactivating at least one allele of the CMP-NeuAc hydroxylase gene, cells can be produced which have reduced capability for expression of functional Hanganutziu-Deicher antigens.
  • alleles of the CMP-Neu5Ac hydroxylase gene are rendered inactive according to the process, sequences and/or constructs described herein, such that the resultant CMP-Neu5Ac hydroxylase enzyme can no longer generate Hanganutziu-Deicher antigens.
  • the CMP-Neu5Ac hydroxylase gene can be transcribed into RNA, but not translated into protein.
  • the CMP-Neu5Ac hydroxylase gene can be transcribed in an inactive truncated form. Such a truncated RNA may either not be translated or can be translated into a nonfunctional protein.
  • the CMP-Neu5Ac hydroxylase gene can be inactivated in such a way that no transcription of the gene occurs.
  • the CMP-Neu5Ac hydroxylase gene can be transcribed and then translated into a nonfunctional protein.
  • porcine animals are provided in which at least one allele of the CMP-Neu5Ac hydroxylase gene is inactivated via a genetic targeting event produced according to the process, sequences and/or constructs described herein.
  • porcine animals are provided in which both alleles of the CMP-Neu5Ac hydroxylase gene are inactivated via a genetic targeting event.
  • the gene can be targeted via homologous recombination.
  • the gene can be disrupted, i.e. a portion of the genetic code can be altered, thereby affecting transcription and/or translation of that segment of the gene. For example, disruption of a gene can occur through substitution, deletion (“knock-out”) or insertion (“knock-in”) techniques. Additional genes for a desired protein or regulatory sequence that modulate transcription of an existing sequence can be inserted.
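One way a deletion or insertion within a coding sequence can affect translation is by shifting the reading frame: because codons are three nucleotides long, a net length change that is not a multiple of three alters every downstream codon. The sketch below illustrates only that arithmetic. The function name and the example lengths are hypothetical, and the text does not state that any particular disruption relies on this mechanism.

```python
def shifts_reading_frame(deleted_bp: int, inserted_bp: int = 0) -> bool:
    """True if the net change in coding length is not a multiple of three."""
    if deleted_bp < 0 or inserted_bp < 0:
        raise ValueError("lengths must be non-negative")
    return (inserted_bp - deleted_bp) % 3 != 0


if __name__ == "__main__":
    # Hypothetical lengths chosen only to show both outcomes.
    for deleted, inserted in [(100, 0), (99, 0), (0, 4), (10, 7)]:
        print(deleted, inserted, shifts_reading_frame(deleted, inserted))
```

A change that preserves the frame can still inactivate a gene, for example by removing essential residues or introducing a stop codon, so this check is a necessary-condition illustration rather than a predictor of function.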
  • porcine cells lacking one allele, optionally both alleles, of the porcine CMP-Neu5Ac hydroxylase gene can be used as donor cells for nuclear transfer into enucleated oocytes to produce cloned, transgenic animals.
  • porcine CMP-Neu5Ac hydroxylase knockouts can be created in embryonic stem cells, which are then used to produce offspring.
  • Offspring lacking a single allele of the functional CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein can be bred to further produce offspring lacking functionality in both alleles through Mendelian-type inheritance.
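The expected outcome of breeding animals that lack a single functional allele follows from a simple Punnett square. The Python sketch below enumerates the allele combinations, writing "+" for a functional allele and "-" for an inactivated one; this notation and the function name are assumptions made for illustration, and the calculation assumes simple Mendelian segregation of a single autosomal locus.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_ratios(parent_a: str, parent_b: str) -> dict:
    """Expected genotype fractions from crossing two diploid genotypes.

    Each parent is a two-character string such as '+-'; each offspring
    receives one allele from each parent with equal probability.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    for cross in [("+-", "+-"), ("+-", "--")]:
        print(cross, offspring_ratios(*cross))
```

Under these assumptions, crossing two single-allele knockouts is expected to yield one quarter of offspring lacking both functional alleles, and crossing a single-allele knockout with a double knockout is expected to yield one half.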
  • Cells, tissues and/or organs can be harvested from these animals for use in xenotransplantation strategies.
  • the elimination of the Hanganutziu-Deicher antigens can reduce the immune rejection of the transplanted cell, tissue or organ due to the Neu5Gc epitope.
  • animals lacking at least one allele of the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein can be less susceptible or resistant to enterotoxigenic infection and disease such as, for example, E. coli infection, rotavirus infection, and gastroenteritis coronavirus. Such animals can be used, for example, in commercial farming.
  • in one aspect of the present invention, a pig can be prepared by a method in accordance with any aspect of the present invention.
  • Genetically modified pigs can be used as a source of tissue and/or organs for transplantation therapy.
  • a pig embryo prepared in this manner or a cell line developed therefrom can also be used in cell-transplantation therapy.
  • a method of therapy comprising the administration of genetically modified cells lacking porcine CMP-Neu5Ac hydroxylase to a patient, wherein the cells have been prepared from an embryo or animal lacking CMP-Neu5Ac hydroxylase.
  • This aspect of the invention extends to the use of such cells in medicine, e.g.
  • the cells can be organized into tissues or organs, for example, heart, lung, liver, kidney, pancreas, corneas, nervous (e.g. brain, central nervous system, spinal cord), skin, or the cells can be islet cells, blood cells (e.g. haemocytes, i.e. red blood cells, leucocytes) or haematopoietic stem cells or other stem cells (e.g. bone marrow).
  • CMP-Neu5Ac hydroxylase-deficient pigs also lack genes encoding other xenoantigens, such as, for example, porcine iGb3 synthase (see, for example, U.S. Patent Application 60/517,524), and/or porcine Forssman synthase (see, for example, U.S. Patent Application 60/568,922).
  • porcine cells are provided that lack the α1,3 galactosyltransferase gene and the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein.
  • porcine α1,3 galactosyltransferase gene knockout cells are further modified to knock out the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein.
  • CMP-Neu5Ac hydroxylase deficient pigs produced according to the process, sequences and/or constructs described herein, optionally lacking one or more additional genes associated with an adverse immune response, can be modified to express complement inhibiting proteins, such as, for example, CD59, DAF, and/or MCP. Alternatively, pigs that express such complement inhibiting proteins can be further modified to eliminate the expression of at least one allele of the CMP-Neu5Ac hydroxylase gene.
  • These animals can be used as a source of tissue and/or organs for transplantation therapy.
  • a pig embryo prepared in this manner or a cell line developed therefrom can also be used in cell-trans
  • Elimination of the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein can reduce a human being's immunological response to the Neu5Gc epitope and remove an immunological barrier to xenotransplantation.
  • the present invention is directed to novel nucleic acid sequences encoding the full-length cDNA and peptide. Information about the genomic organization, intronic sequences and regulatory regions of the gene is also provided.
  • the invention provides isolated and substantially purified cDNA molecules having one of SEQ ID Nos: 1, 3, 5 or 7, or a fragment thereof.
  • DNA sequences comprising the full-length genomic sequence of the CMP-NeuAc hydroxylase gene are provided in SEQ ID Nos 9-45, 46, 47, 48, 49 or 50 or fragments thereof.
  • primers for amplifying porcine CMP-Neu5Ac hydroxylase cDNA or genomic sequence derived from SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 or 50 are provided.
  • probes for identifying CMP-Neu5Ac hydroxylase nucleic acid sequences derived from SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 or 50, or fragments thereof are provided.
  • DNA represented by SEQ ID Nos 9-45, 46, 47, 48, 49 or 50, or fragments thereof can be used to construct pigs lacking functional CMP-Neu5Ac hydroxylase genes.
  • the invention also provides a porcine chromosome lacking a functional CMP-NeuAc hydroxylase gene and a transgenic pig lacking a functional CMP-NeuAc hydroxylase protein produced according to the process, sequences and/or constructs described herein.
  • Such pigs can be used as tissue sources for xenotransplantation into humans.
  • CMP-NeuAc hydroxylase-deficient pigs produced according to the process, sequences and/or constructs described herein also lack other genes associated with adverse immune responses in xenotransplantation, such as, for example, the α1,3 galactosyltransferase gene, iGb3 synthetase gene, or FSM synthase gene.
  • pigs lacking CMP-Neu5Ac hydroxylase produced according to the process, sequences and/or constructs described herein and/or other genes associated with adverse immune responses in xenotransplantation express complement inhibiting factors such as, for example, CD59, DAF, and/or MCP.
  • FIG. 1 represents the genomic organization of the porcine CMP-Neu5Ac hydroxylase gene. Closed bars depict each numbered exon. The length of the introns between the exons illustrates relative distances. Open boxes also represent exons that appear in some variants (see FIG. 2); “start” and “stop” denote start and stop codons, respectively. The approximate scale is depicted at the bottom of the figure.
  • FIG. 2 depicts cDNA sequences of the CMP-Neu5Ac hydroxylase gene.
  • Variant-1 contains exon 15a in place of exons 14 and 15.
  • Variant-2 contains exon 12a, and variant-3 contains exon 11a.
  • “Start” and “stop” denote the start and stop codons, respectively.
  • FIG. 3 illustrates four non-limiting examples of targeting vectors, along with their corresponding genomic organization.
  • the selectable marker gene in this particular non-limiting example is eGFP (enhanced green fluorescent protein).
  • eGFP can be inserted in the DNA constructs to inactivate the porcine CMP-NeuAc hydroxylase gene.
  • FIG. 4 illustrates transcription factor binding sites located within exon 1 (228 bp) and its 5′-flanking region spanning 601 bp.
  • FIG. 5 depicts oligonucleotide sequences that can be used for DNA construction of porcine CMP-Neu5Ac hydroxylase gene targeting vector.
  • FIG. 6 is a schematic diagram illustrating the production of a 3′-arm segment from the porcine CMP-Neu5Ac hydroxylase gene using primers pDH3 and pDH4, and its insertion into a vector (pCRII).
  • FIG. 7 is a schematic diagram illustrating the production of a 5′-arm segment from the porcine CMP-Neu5Ac hydroxylase gene using primers pDH1 and pDH2, followed by pDH2a, pDH2b, and pDH2c, and its insertion into a vector (pCRII) in which a 3′-arm has previously been inserted.
  • FIG. 8 is a non-limiting example of a schematic illustrating a targeting vector that can be utilized to delete Exon 6 of the porcine CMP-Neu5Ac hydroxylase gene through homologous recombination.
  • FIG. 9 represents oligonucleotide sequences used in generating an enhanced green fluorescent protein expression vector for use in a knock-in strategy.
  • FIG. 10 is a schematic illustrating the insertion of an eGFP fragment with a polyA signal into the targeting vector pDHΔex6.
  • FIG. 11 is a schematic illustrating a knock-in vector for expression of eGFP.
  • FIG. 12 is a schematic illustrating homologous recombination resulting in a frameshift between the targeting cassette DNA construct (pDHΔex6) and genomic DNA.
  • FIG. 13 is a schematic illustrating homologous recombination resulting in a frameshift between the targeting cassette DNA construct (pDHΔex6) and genomic DNA.
  • a “target DNA sequence” is a DNA sequence to be modified by homologous recombination.
  • the target DNA can be in any organelle of the animal cell including the nucleus and mitochondria and can be an intact gene, an exon or intron, a regulatory sequence or any region between genes.
  • a “targeting DNA sequence” is a DNA sequence containing the desired sequence modifications.
  • the targeting DNA sequence can be substantially isogenic with the target DNA.
  • a “homologous DNA sequence or homologous DNA” is a DNA sequence that is at least about 80%, 85%, 90%, 95%, 98% or 99% identical with a reference DNA sequence.
  • a homologous sequence hybridizes under stringent conditions to the target sequence. Stringent hybridization conditions include those that will allow hybridization to occur if there is at least 85% and preferably at least 95% or 98% identity between the sequences.
  • an “isogenic or substantially isogenic DNA sequence” is a DNA sequence that is identical to or nearly identical to a reference DNA sequence.
  • the term “substantially isogenic” refers to DNA that is at least about 97-99% identical with the reference DNA sequence, and preferably at least about 99.5-99.9% identical with the reference DNA sequence, and in certain uses 100% identical with the reference DNA sequence.
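  • The identity thresholds in the definitions above can be illustrated with a minimal sketch. It assumes the two sequences are already aligned and of equal length; the function names, the example sequences, and the choice of the lowest stated cutoffs (about 80% for "homologous", about 97% for "substantially isogenic") are illustrative assumptions and not part of the specification.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def classify(identity: float) -> str:
    """Label an identity value using the lowest cutoffs named in the definitions."""
    if identity >= 97.0:   # "substantially isogenic": at least about 97-99%
        return "substantially isogenic"
    if identity >= 80.0:   # "homologous": at least about 80%
        return "homologous"
    return "neither"


# Hypothetical 20-nt sequences differing only at the last position.
reference = "ATGGCTAGCTAGGCTAACGT"
candidate = "ATGGCTAGCTAGGCTAACGA"

identity = percent_identity(reference, candidate)
print(identity, classify(identity))
```

  • A real comparison would first align the sequences (allowing gaps); this sketch only shows how the percentage cutoffs partition the result.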
  • Homologous recombination refers to the process of DNA recombination based on sequence homology.
  • Gene targeting refers to homologous recombination between two DNA sequences, one of which is located on a chromosome and the other of which is not.
  • Non-homologous or random integration refers to any process by which DNA is integrated into the genome that does not involve homologous recombination.
  • a “selectable marker gene” is a gene, the expression of which allows cells containing the gene to be identified.
  • a selectable marker can be one that allows a cell to proliferate on a medium that prevents or slows the growth of cells without the gene. Examples include antibiotic resistance genes and genes which allow an organism to grow on a selected metabolite.
  • the gene can facilitate visual screening of transformants by conferring on cells a phenotype that is easily identified. Such an identifiable phenotype can be, for example, the production of luminescence or the production of a colored compound, or the production of a detectable change in the medium surrounding the cell.
  • contiguous is used herein in its standard meaning, i.e., without interruption, or uninterrupted.
  • pig refers to any pig species, including pig species such as Large White, Landrace, Meishan, Minipig.
  • oocyte describes the mature animal ovum which is the final product of oogenesis and also the precursor forms being the oogonium, the primary oocyte and the secondary oocyte respectively.
  • fragment means a portion or partial sequence of a nucleotide or peptide sequence.
  • derivatives and analogs means a nucleotide or peptide sequence which retains essentially the same biological function or activity as such nucleotide or peptide.
  • an analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature polypeptide.
  • DNA (deoxyribonucleic acid) sequences provided herein are represented by the bases adenine (A), thymine (T), cytosine (C), and guanine (G).
  • Transfection refers to the introduction of DNA into a host cell. Cells do not naturally take up DNA. Thus, a variety of technical “tricks” are utilized to facilitate gene transfer. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, CaPO4 and electroporation. (J. Sambrook, E. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989). Transformation of the host cell is the indicia of successful transfection.
  • One aspect of the present invention provides novel, full length nucleic acid cDNA sequences of the porcine CMP-Neu5Ac hydroxylase gene (FIG. 2, Table 1, Seq ID No 1).
  • Another aspect of the present invention provides predicted amino acid peptide sequences of the porcine CMP-Neu5Ac hydroxylase gene (Table 2, Seq ID No 2).
  • the ATG start codon for the full-length cDNA is located in the 3′ portion of Exon 4, and the stop codon TAG is found in the 3′ portion of Exon 17.
  • Nucleic and amino acid sequences at least 90, 95, 98 or 99% homologous to Seq ID Nos 1 or 2 are provided.
  • nucleotide and peptide sequences that contain at least 10, 15, 17, 20 or 25 contiguous nucleic or amino acids of Seq ID Nos 1 or 2 are also provided. Further provided are fragments, derivatives and analogs of Seq ID Nos 1-2. Fragments of Seq ID Nos. 1-2 can include any contiguous nucleic acid or peptide sequence that includes at least about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90.
  • Seq ID No 3 represents the cDNA of a variant of the gene, variant-1, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15a, 16, 17, and 18.
  • Exon 15a is a cryptic Exon that normally appears in Intron 15, approximately 460 bp upstream of Exon 16.
  • the start codon for variant-1 is located in Exon 4, while the stop codon is located in Exon 17.
  • Seq ID No 5 represents the cDNA of a variant of the gene, variant-2, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 12a.
  • Exon 12a is a cryptic Exon which is retained from a partial sequence of Intron 12 (see SEQ ID. No. 21). The start codon for variant-2 is located in Exon 4, while the stop codon is located in the terminal end of Exon 12a.
  • Seq ID No 7 represents the cDNA of a variant of the gene, variant-3, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11 and 11a.
  • Exon 11a is a cryptic Exon which is retained from a partial sequence of Intron 11 (see Seq ID No. 19).
  • the start codon for variant-3 is located in Exon 4, while the stop codon is located in Exon 11a.
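  • The exon composition of the three variants described above can be written down as a small data structure. The following is a sketch only: the dictionary and helper names are hypothetical, and the exon lists are copied from the descriptions of Seq ID Nos 3, 5 and 7.

```python
# Exon lists as stated for variant-1 (Seq ID No 3), variant-2 (Seq ID No 5)
# and variant-3 (Seq ID No 7). Names of the containers are illustrative.
VARIANT_EXONS = {
    "variant-1": ["1", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
                  "15a", "16", "17", "18"],
    "variant-2": ["1", "4", "5", "6", "7", "8", "9", "10", "11", "12", "12a"],
    "variant-3": ["1", "4", "5", "6", "7", "8", "9", "10", "11", "11a"],
}


def common_prefix(exon_lists):
    """Leading exons shared, in order, by every variant."""
    prefix = []
    for group in zip(*exon_lists):
        if len(set(group)) != 1:
            break
        prefix.append(group[0])
    return prefix


def cryptic_exons(exons):
    """Exons carrying the 'a' suffix used in the text for cryptic exons."""
    return [e for e in exons if e.endswith("a")]


shared = common_prefix(VARIANT_EXONS.values())
print("shared 5' exons:", shared)
for name, exons in VARIANT_EXONS.items():
    print(name, "cryptic exon(s):", cryptic_exons(exons))
```

  • Laid out this way, the variants share Exons 1 and 4 through 11 and diverge only at their 3′ ends, where each carries a single cryptic exon.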
  • Another aspect of the present invention provides predicted amino acid peptide sequences of three novel variants of the porcine CMP-Neu5Ac Hydroxylase gene transcript. Seq ID Nos 4, 6 and 8 represent the amino acid sequences of variant-1, variant-2 and variant-3, respectively. Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to Seq ID Nos 3-8 are provided.
  • nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25, 30, 50, 100, 150, 200, 300, 400, 500 or 1000 contiguous nucleotide or amino acid sequences of Seq ID Nos 3-8 are also provided. Further provided are fragments, derivatives and analogs of Seq ID Nos 3-8. Fragments of Seq ID Nos. 3-8 can include any contiguous nucleic acid or peptide sequence that includes at least about 10 bp, 15 bp, 17 bp, 20 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 5 kbp or 10 kbp.
  • nucleic acid constructs that contain cDNA or variants thereof encoding CMP-Neu5Ac hydroxylase. These cDNA sequences can be SEQ ID NO 1, 3, 5 or 7, or derived from SEQ ID Nos. 2, 4, 6, or 8 or any fragment thereof. Constructs can contain one, or more than one, internal ribosome entry site (IRES). The construct can also contain a promoter operably linked to the nucleic acid sequence encoding CMP-Neu5Ac hydroxylase, or, alternatively, the construct can be promoterless. In another embodiment, nucleic acid constructs are provided that contain nucleic acid sequences that permit random or targeted insertion into a host genome.
  • the expression vector can contain selectable marker sequences, such as, for example, enhanced Green Fluorescent Protein (eGFP) gene sequences, initiation and/or enhancer sequences, poly A-tail sequences, and/or nucleic acid sequences that provide for the expression of the construct in prokaryotic and/or eukaryotic host cells. Suitable vectors and selectable markers are described below.
  • the expression constructs can further contain sites for transcription initiation, termination, and/or ribosome binding sites.
  • the constructs can be expressed in any prokaryotic or eukaryotic cell, including, but not limited to yeast cells, bacterial cells, such as E. coli, mammalian cells, such as CHO cells, and/or plant cells.
  • Promoters for use in such constructs include, but are not limited to, the phage lambda PL promoter, E. coli lac, E. coli trp, E. coli phoA, E. coli tac promoters, SV40 early, SV40 late, retroviral LTRs, PGK1, GAL1, GAL10 genes, CYC1, PHO5, TRP1, ADH1, ADH2, glyceraldehyde phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, triose phosphate isomerase, phosphoglucose isomerase, glucokinase, alpha-mating factor pheromone, PRB1, GUT2, GPD1 promoter, metallothionein promoter, and/or mammalian viral promoters, such as those derived from adenovirus and vaccinia virus.
  • Other promoters will be known to
  • Nucleic acid sequences representing the genomic DNA organization of the CMP-Neu5Ac hydroxylase gene are also provided.
  • Seq ID Nos 10-28 represent Exons 1, 4-11, 11a, 12, 12a, 13-15, 15a, and 16-18, respectively.
  • Exons 11a, 12a, and 15a are cryptic Exons that are retained in certain variant transcripts of CMP-Neu5Ac hydroxylase.
  • SEQ ID Nos 29-45 represent Intronic sequence between Exon 1 and Exon 4 (hereinafter Intron 1a and Intron 1b, respectively), 4-15, 15a, 16, and 17, respectively.
  • Intron 15a is the 3′ downstream portion of Intron 15 that follows the cryptic Exon 15a.
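  • The correspondence between SEQ ID numbers and genomic features stated above can be expanded into an explicit lookup table. This is a sketch; the variable names are hypothetical, and the only inputs are the ranges given in the text (SEQ ID Nos 10-28 for the exons, 29-45 for the intronic sequences).

```python
# Exons 1, 4-11, 11a, 12, 12a, 13-15, 15a, 16-18 (in the order listed in the text).
exons = (["1"] + [str(n) for n in range(4, 12)]
         + ["11a", "12", "12a", "13", "14", "15", "15a", "16", "17", "18"])

# Introns 1a, 1b, 4-15, 15a, 16, 17 (in the order listed in the text).
introns = ["1a", "1b"] + [str(n) for n in range(4, 16)] + ["15a", "16", "17"]

exon_ids = range(10, 29)     # SEQ ID Nos 10-28
intron_ids = range(29, 46)   # SEQ ID Nos 29-45

# The ranges and the feature lists must have the same number of entries.
if len(exons) != len(exon_ids) or len(introns) != len(intron_ids):
    raise ValueError("SEQ ID range does not match the number of listed features")

seq_id_to_feature = {}
for seq_id, name in zip(exon_ids, exons):
    seq_id_to_feature[seq_id] = f"Exon {name}"
for seq_id, name in zip(intron_ids, introns):
    seq_id_to_feature[seq_id] = f"Intron {name}"

for seq_id in sorted(seq_id_to_feature):
    print(seq_id, seq_id_to_feature[seq_id])
```

  • Under this expansion, SEQ ID No 19 falls on Exon 11a and SEQ ID No 21 on Exon 12a, which agrees with the cross-references given in the descriptions of variant-3 and variant-2.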
  • nucleic acid sequence representing the genomic DNA sequence of the porcine CMP-Neu5Ac hydroxylase gene (Table 10, SEQ ID No. 46) is also provided.
  • contiguous genomic sequence representing the 5′ contiguous genomic sequence containing 5′ UTR, Exon 1 and a portion of intronic sequence located between Exon 1 and Exon 4 (Intron 1a) (SEQ ID No. 47, Table 11) is provided.
  • Contiguous genomic sequence containing an intronic sequence located between Exon 1 and Exon 4 (Intron 1b) through Exon 18 (SEQ ID No. 48, Table 12) is also provided.
  • Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 9-45, 46, 47, 48, 49 and 50 are provided.
  • Fragments of SEQ ID Nos. 9-45, 46, 47, 48, 49, and 50 can include any contiguous nucleic acid or peptide sequence of at least about 10 bp, 15 bp, 17 bp, 20 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 5 kbp or 10 kbp.
  • binding sites are located in the 5′UTR and Exon 1 of the porcine CMP-Neu5Ac hydroxylase genome, and include binding sites for transcription factors such as, for example, ETSF, MZF1, SF1, CMYB, MEF2, TATA, MEF2, NMP4, CAAT, AP1, BRN2, SATB1, ATF, GAT1, USF, WHN, NMP4, ZF5, NFKB, ZBP89, MOK2, ZF5, NFY, and MYCMAX.
  • SEQ ID NO. 49 represents contiguous genomic sequences containing Intronic sequence 5′ to Exon 4, Exon 4, Intron 4, Exon 5, Intron 5, Exon 6, Intron 6, Exon 7, Intron 7 and Exon 8 (Table 13). Further, nucleotide sequences that contain at least 1750, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, 4900, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, or 20000 contiguous nucleotides of SEQ ID NO. 49 are provided, as well as nucleotide
  • SEQ ID NO. 50 represents contiguous genomic sequences containing Exon 12, Intron 12, Exon 13, Intron 13, Exon 14, Intron 14, Exon 15, Intron 15, Exon 16, Intron 16, Exon 17, Intron 17, and Exon 18.
  • the present invention further provides oligonucleotide probes and primers which hybridize to the hereinabove-described sequences (SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50). Oligonucleotides are provided that can be homologous to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50, and fragments thereof. Oligonucleotides that hybridize under stringent conditions to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50 and fragments thereof, are also provided. Stringent conditions describe conditions under which hybridization will occur only if there is at least about 85%, about 90%, about 95%, or at least about 98% homology between the sequences.
  • the oligonucleotide can have at least 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 75 or 100 bases which hybridize to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50, and fragments thereof.
  • Such oligonucleotides can be used as primers and probes to detect the sequences provided herein.
  • the probe or primer can be at least 14 nucleotides in length, and in a preferred embodiment, is at least 15, 20, 25, 28, 30, or 35 nucleotides in length.
  • oligonucleotide probes and primers that are complementary to sequences contained in Seq ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50, and fragments thereof.
  • the rules for complementary pairing are well known: cytosine (“C”) always pairs with guanine (“G”) and thymine (“T”) or uracil (“U”) always pairs with adenine (“A”). It is recognized that it is not necessary for the primer or probe to be 100% complementary to the target nucleic acid sequence, as long as the primer or probe sufficiently hybridizes and can recognize the corresponding complementary sequence. A certain degree of pair mismatch can generally be tolerated.
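  • The pairing rules and the tolerance for a limited number of mismatches can be sketched as follows. The function names, the example sequences, and the default limit of one mismatch are assumptions made for illustration; the specification does not fix a numerical mismatch limit.

```python
# C pairs with G; T (or U) pairs with A. The complement is reported as DNA.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, written 5' to 3' as DNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe: str, target_site: str) -> int:
    """Mismatched positions when a probe (5'->3') is laid against a target site (5'->3')."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    expected = reverse_complement(target_site)
    observed = probe.upper().replace("U", "T")  # accept probes given in RNA form
    return sum(1 for p, e in zip(observed, expected) if p != e)


def hybridizes(probe: str, target_site: str, max_mismatches: int = 1) -> bool:
    """True if the probe is complementary to the site within the assumed tolerance."""
    return count_mismatches(probe, target_site) <= max_mismatches


site = "GATTACAG"                      # hypothetical target site
print(reverse_complement(site))        # perfectly complementary probe
print(hybridizes("CTGTTATC", site))    # probe with a single mismatch
```

  • Whether a given mismatch is actually tolerated depends on its position and on the assay conditions; the count here is only a first-pass screen.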
  • Oligonucleotide sequences used as the hybridizing region of a primer can also be used as the hybridizing region of a probe. Suitability of a primer sequence for use as a probe depends on the hybridization characteristics of the primer. Similarly, an oligonucleotide used as a probe can be used as a primer.
  • primers and probes can be prepared by, for example, the addition of nucleotides to either the 5′ or 3′ ends, which nucleotides are complementary to the target sequence or are not complementary to the target sequence. So long as primer compositions serve as a point of initiation for extension on the target sequences, and so long as the primers and probes comprise at least 14 consecutive nucleotides contained within the above mentioned SEQ ID Nos. such compositions are within the scope of the invention.
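  • The requirement of at least 14 consecutive nucleotides contained within a listed sequence can be checked with a short sketch. The reference string below is a hypothetical stand-in for one of the SEQ ID Nos, and the function names are illustrative; only the given strand is searched, so a complete check would also search the reverse complement.

```python
def longest_shared_run(oligo: str, reference: str) -> int:
    """Length of the longest stretch of the oligo that occurs contiguously in the reference."""
    o, r = oligo.upper(), reference.upper()
    for length in range(len(o), 0, -1):            # try the longest stretches first
        for start in range(len(o) - length + 1):
            if o[start:start + length] in r:
                return length
    return 0


def meets_minimum(oligo: str, reference: str, minimum: int = 14) -> bool:
    """True if the oligo contains at least `minimum` consecutive reference nucleotides."""
    return longest_shared_run(oligo, reference) >= minimum


# Hypothetical reference and a primer built from 16 reference nucleotides
# plus a 4-nt 5' extension that is not taken from the reference.
reference = "GGCATTCAGGTCCATGACCTGAAGTCAGGATCC"
primer = "AAAA" + reference[5:21]

print(longest_shared_run(primer, reference), meets_minimum(primer, reference))
```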
  • the probes and primers herein can be selected by the following criteria, which are factors to be considered, but are not exclusive or determinative.
  • the probes and primers are selected from the region of the CMP-Neu5Ac hydroxylase nucleic acid sequence identified in SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49, 50, and fragments thereof.
  • the probes and primers lack homology with sequences of other genes that would be expected to compromise the test.
  • the probes or primers lack secondary structure formation in the amplified nucleic acid which can interfere with extension by the amplification enzyme such as E. coli DNA polymerase, preferably that portion of the DNA polymerase referred to as the Klenow fragment. This can be accomplished by employing up to about 15% by weight, preferably 5-10% by weight, dimethyl sulfoxide (DMSO) in the amplification medium and/or increasing the amplification temperatures to 30°-40° C.
  • the probes or primers should contain approximately 50% guanine and cytosine nucleotides, as measured by the formula [cytosine (C)+guanine (G)]/[adenine (A)+thymine (T)+cytosine (C)+guanine (G)].
  • the probe or primer does not contain multiple consecutive adenine and thymine residues at the 3′ end of the primer which can result in less stable hybrids.
  • the probes and primers of the invention can be about 10 to 30 nucleotides long, preferably at least 10, 11, 12, 13, 14, 15, 20, 25, or 28 nucleotides in length, including specifically 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides.
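  • Several of the selection criteria above lend themselves to a quick automated screen. The sketch below takes GC content as G plus C over all four bases, and it assumes a 40-60% window for "approximately 50%" and a limit of two for "multiple consecutive" A/T residues at the 3′ end; those two numbers, the function names, and the example sequences are assumptions, while the 10 to 30 nucleotide length range comes from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases: (G + C) / (A + T + C + G)."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def three_prime_at_run(seq: str) -> int:
    """Number of consecutive A/T residues at the 3' end of the oligo."""
    run = 0
    for base in reversed(seq.upper()):
        if base not in "AT":
            break
        run += 1
    return run


def check_primer(seq: str, gc_window=(0.40, 0.60), max_at_run: int = 2) -> dict:
    """Report which of the screened criteria the oligo satisfies."""
    return {
        "length_ok": 10 <= len(seq) <= 30,                       # range stated in the text
        "gc_ok": gc_window[0] <= gc_fraction(seq) <= gc_window[1],  # assumed window
        "three_prime_ok": three_prime_at_run(seq) <= max_at_run,    # assumed limit
    }


good = "GCATTCAGGTCCATGACCTG"   # hypothetical 20-mer ending in G
bad = "GCATTCAGGTCCATGAATTA"    # same length, A/T-rich 3' end

print(check_primer(good))
print(check_primer(bad))
```

  • Criteria such as lack of homology to other genes or absence of secondary structure need sequence-database and folding analysis and are not covered by this screen.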
  • the nucleotides as used in the present invention can be ribonucleotides, deoxyribonucleotides and modified nucleotides such as inosine or nucleotides containing modified groups which do not essentially alter their hybridization characteristics.
  • Probe and primer sequences are represented throughout the specification as single stranded DNA oligonucleotides from the 5′ to the 3′ end. Any of the probes can be used as such, or in their complementary form, or in their RNA form (wherein T is replaced by U).
  • the probes and primers according to the invention can be prepared by cloning of recombinant plasmids containing inserts including the corresponding nucleotide sequences, optionally by cleaving the latter out from the cloned plasmids upon using the adequate nucleases and recovering them, e.g. by fractionation according to molecular weight.
  • the probes and primers according to the present invention can also be synthesized chemically, for instance by the conventional phosphotriester or phosphodiester methods or automated embodiments thereof. In one such automated embodiment diethylphosphoramidites are used as starting materials and can be synthesized as described by Beaucage, et al., Tetrahedron Letters 22:1859-1862 (1981).
  • the oligonucleotides used as primers or probes can also comprise nucleotide analogues such as phosphorothioates (Matsukura S., Naibunpi Gakkai Zasshi. 43(6):527-32 (1967)), alkylphosphorothioates (Miller P., et al., Biochemistry 18(23):5134-43 (1979)), peptide nucleic acids (Nielsen P., et al., Science 254(5037):1497-500 (1991); Nielsen P., et al., Nucleic Acids Res.
  • morpholino nucleic acids, locked nucleic acids, pseudocyclic oligonucleobases, 2′-O,4′-C-ethylene bridged nucleic acids or can contain intercalating agents (Asseline J., et al., Proc. Natl. Acad. Sci. USA 81(11):3297-301 (1984)).
  • the stability of the probe and primer to target nucleic acid hybrid should be chosen to be compatible with the assay conditions. This can be accomplished by avoiding long AT-rich sequences, by terminating the hybrids with GC base pairs, and/or by designing the probe with an appropriate Tm.
  • the beginning and end points of the probe should be chosen so that the length and % GC result in a Tm about 2-10° C. higher than the temperature at which the final assay will be performed.
  • the base composition of the probe is significant because G-C base pairs exhibit greater thermal stability compared to A-T base pairs due to additional hydrogen bonding. Thus, hybridization involving complementary nucleic acids of higher G-C content will be stable at higher temperatures.
  • Conditions such as ionic strength and incubation temperature under which a probe will be used should also be taken into account when designing a probe. It is known that hybridization will increase as the ionic strength of the reaction mixture increases, and that the thermal stability of the hybrids will increase with increasing ionic strength. Chemical reagents, such as formamide, urea, DMSO and alcohols, which disrupt hydrogen bonds, will increase the stringency of hybridization. Destabilization of the hydrogen bonds by such reagents can greatly reduce the Tm. In general, optimal hybridization for synthetic oligonucleotide probes of about 10-50 bases in length occurs approximately 5° C. below the melting temperature for a given duplex.
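  • The relationship between probe composition, Tm and assay temperature can be illustrated with a rough estimate. The specification gives no Tm formula; the sketch below assumes the common rule of thumb for short oligonucleotides of 2 °C per A or T and 4 °C per G or C (often called the Wallace rule), which ignores ionic strength and denaturants. The function names and the example probe are hypothetical; the 2-10 °C margin is the one stated in the text.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (assumed rule of thumb)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def assay_temperature_ok(seq: str, assay_temp_c: float,
                         low: float = 2.0, high: float = 10.0) -> bool:
    """True if the estimated Tm is about 2-10 degrees C above the assay temperature."""
    margin = wallace_tm(seq) - assay_temp_c
    return low <= margin <= high


probe = "GCATTCAGGTCCATGACCTG"   # hypothetical 20-mer: 9 A/T and 11 G/C
print(wallace_tm(probe))
print(assay_temperature_ok(probe, 55.0))
print(assay_temperature_ok(probe, 61.0))
```

  • Because G-C pairs contribute twice as much as A-T pairs in this estimate, it also reflects the point made above that higher G-C content gives hybrids that are stable at higher temperatures.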
  • the stringency of the assay conditions determines the amount of complementarity needed between two nucleic acid strands forming a hybrid.
  • the degree of stringency is chosen such as to maximize the difference in stability between the hybrid formed with the target and the non-target nucleic acid. In the present case, single base pair changes need to be detected, which requires conditions of very high stringency.
  • the length of the target nucleic acid sequence and, accordingly, the length of the probe sequence can also be important. In some cases, there can be several sequences from a particular region, varying in location and length, which will yield probes and primers with the desired hybridization characteristics. In other cases, one sequence can be significantly better than another which differs merely by a single base.
  • Although oligonucleotide probes and primers of different lengths and base composition can be used, preferred oligonucleotide probes and primers of this invention are between about 14 and 30 bases in length and have a sufficient stretch in the sequence which is perfectly complementary to the target nucleic acid sequence.
  • Regions in the target DNA or RNA which are known to form strong internal structures inhibitory to hybridization are less preferred.
  • probes with extensive self-complementarity should be avoided.
  • hybridization is the association of two single strands of complementary nucleic acids to form a hydrogen bonded double strand. It is implicit that if one of the two strands is wholly or partially involved in a hybrid, it will be less able to participate in formation of a new hybrid.
  • There can be intramolecular and intermolecular hybrids formed within the molecules of one type of probe if there is sufficient self complementarity. Such structures can be avoided through careful probe design.
  • Specific primers and sequence specific oligonucleotide probes can be used in a polymerase chain reaction that enables amplification and detection of CMP-Neu5Ac hydroxylase nucleic acid sequences.
  • Gene targeting allows for the selective manipulation of animal cell genomes. Using this technique, a particular DNA sequence can be targeted and modified in a site-specific and precise manner. Different types of DNA sequences can be targeted for modification, including regulatory regions, coding regions and regions of DNA between genes. Examples of regulatory regions include: promoter regions, enhancer regions, terminator regions and introns. By modifying these regulatory regions, the timing and level of expression of a gene can be altered. Coding regions can be modified to alter, enhance or eliminate the protein within a cell. Introns and exons, as well as inter-genic regions, are suitable targets for modification.
  • Modifications of DNA sequences can be of several types, including insertions, deletions, substitutions, or any combination thereof.
  • a specific example of a modification is the inactivation of a gene by site-specific integration of a nucleotide sequence that disrupts expression of the gene product, i.e. a “knock out”.
  • one approach to disrupting the CMP-Neu5Ac hydroxylase gene is to insert a selectable marker into the targeting DNA such that homologous recombination between the targeting DNA and the target DNA can result in insertion of the selectable marker into the coding region of the target gene. For example, see FIGS. 3, 12, and 13. In this way, for example, the CMP-Neu5Ac hydroxylase gene sequence is disrupted, rendering the encoded enzyme nonfunctional.
  • homologous recombination permits site-specific modifications in endogenous genes and thus novel alterations can be engineered into the genome.
  • a primary step in homologous recombination is DNA strand exchange, which involves a pairing of a DNA duplex with at least one DNA strand containing a complementary sequence to form an intermediate recombination structure containing heteroduplex DNA (see, for example Radding, C. M. (1982) Ann. Rev. Genet. 16: 405; U.S. Pat. No. 4,888,274).
  • the heteroduplex DNA can take several forms, including a three-DNA-strand-containing triplex form wherein a single complementary strand invades the DNA duplex (Hsieh, et al., Genes and Development 4: 1951 (1990); Rao, et al., (1991) PNAS 88:2984) and, when two complementary DNA strands pair with a DNA duplex, a classical Holliday recombination joint or chi structure (Holliday, R., Genet. Res. 5: 282 (1964)) can form, or a double-D loop (“Diagnostic Applications of Double-D Loop Formation” U.S. Ser. No. 07/755,462, filed Sep. 4, 1991).
  • a heteroduplex structure can be resolved by strand breakage and exchange, so that all or a portion of an invading DNA strand is spliced into a recipient DNA duplex, adding or replacing a segment of the recipient DNA duplex.
  • a heteroduplex structure can result in gene conversion, wherein a sequence of an invading strand is transferred to a recipient DNA duplex by repair of mismatched bases using the invading strand as a template (Genes, 3rd Ed. (1987) Lewin, B., John Wiley, New York, N.Y.; Lopez, et al., Nucleic Acids Res. 15: 5643 (1987)).
  • formation of heteroduplex DNA at homologously paired joints can serve to transfer genetic sequence information from one DNA molecule to another.
  • the ability of homologous recombination (gene conversion and classical strand breakage/rejoining) to transfer genetic sequence information between DNA molecules renders targeted homologous recombination a powerful method in genetic engineering and gene manipulation.
  • In homologous recombination, the incoming DNA interacts with and integrates into a site in the genome that contains a substantially homologous DNA sequence.
  • In non-homologous (“random” or “illicit”) integration, the incoming DNA is not found at a homologous sequence in the genome but integrates elsewhere, at one of a large number of potential locations.
  • studies with higher eukaryotic cells have revealed that the frequency of homologous recombination is far less than the frequency of random integration. The ratio of these frequencies has direct implications for “gene targeting” which depends on integration via homologous recombination (i.e. recombination between the exogenous “targeting DNA” and the corresponding “target DNA” in the genome).
  • the present invention uses homologous recombination to inactivate the porcine CMP-Neu5Ac hydroxylase gene in cells, such as fibroblasts.
  • the DNA can comprise at least a portion of the gene(s) at the particular locus with introduction of an alteration into at least one, optionally both copies, of the native gene(s), so as to prevent expression of a functional enzyme and production of a Hanganutziu-Deicher antigen molecule.
  • the alteration can be an insertion, deletion, replacement or combination thereof.
  • the cells having a single unmutated copy of the target gene are amplified and can be subjected to a second targeting step, where the alteration can be the same or different from the first alteration, usually different, and where a deletion, or replacement is involved, can be overlapping at least a portion of the alteration originally introduced.
  • a targeting vector with the same arms of homology, but containing a different mammalian selectable marker, can be used.
  • the resulting transformants are screened for the absence of a functional target antigen and the DNA of the cell can be further screened to ensure the absence of a wild-type target gene.
  • homozygosity as to a phenotype can be achieved by breeding hosts heterozygous for the mutation.
  • Porcine cells that can be genetically modified can be obtained from a variety of different organs and tissues such as, but not limited to, brain, heart, lungs, glands, brain, eye, stomach, spleen, pancreas, kidneys, liver, intestines, uterus, bladder, skin, hair, nails, ears, nose, mouth, lips, gums, teeth, tongue, salivary glands, tonsils, pharynx, esophagus, large intestine, small intestine, rectum, anus, pylorus, thyroid gland, thymus gland, suprarenal capsule, bones, cartilage, tendons, ligaments, skeletal muscles, smooth muscles, blood vessels, blood, spinal cord, trachea, ureters, urethra, hypothalamus, pituitary, adrenal glands, ovaries, oviducts, uterus, vagina, mammary glands, testes, seminal vesicles, penis, lymph, lymph nodes and lymph vessels.
  • porcine cells can be selected from the group consisting of, but not limited to, epithelial cells, fibroblast cells, neural cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T), macrophages, monocytes, mononuclear cells, cardiac muscle cells, other muscle cells, ⁇ hosphate cells, cumulus cells, epidermal cells, endothelial cells, Islets of Langerhans cells, blood cells, blood precursor cells, bone cells, bone precursor cells, neuronal stem cells, primordial stem cells, hepatocytes, keratinocytes, umbilical vein endothelial cells, aortic endothelial cells, microvascular endothelial cells, fibroblasts, liver stellate cells, aortic smooth muscle cells, cardiac myocytes, neurons, Kupffer cells, smooth muscle cells, Schwann cells, and epithelial cells, erythrocyte
  • embryonic stem cells can be used.
  • An embryonic stem cell line can be employed or embryonic stem cells can be obtained freshly from a host, such as a porcine animal.
  • the cells can be grown on an appropriate fibroblast-feeder layer or grown in the presence of leukemia inhibiting factor (LIF).
  • the porcine cells can be fibroblasts; in one specific embodiment, the porcine cells can be fetal fibroblasts.
  • Fibroblast cells are a preferred somatic cell type because they can be obtained from developing fetuses and adult animals in large quantities.
  • These cells can be easily propagated in vitro with a rapid doubling time and can be clonally propagated for use in gene targeting procedures.
  • Cells homozygous at a targeted locus can be produced by introducing DNA into the cells, where the DNA has homology to the target locus and includes a marker gene, allowing for selection of cells comprising the integrated construct.
  • the homologous DNA in the target vector will recombine with the chromosomal DNA at the target locus (see, for example, FIGS. 3, 12, and 13).
  • the marker gene can be flanked on both sides by homologous DNA sequences, a 3′ recombination arm and a 5′ recombination arm (See, for example, FIG. 11).
  • Methods for the construction of targeting vectors have been described in the art, see, for example, Dai et al., Nature Biotechnology 20: 251-255, 2002; WO 00/51424.
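  • The layout of a targeting construct (5′ recombination arm, marker gene, 3′ recombination arm) and the outcome of a double cross-over can be pictured with a toy string model. This is only a sketch: the function name, the short sequences, and the "[eGFP]" placeholder standing in for the marker cassette are hypothetical, and real recombination is not a text substitution.

```python
from typing import Optional


def recombine(genome: str, five_arm: str, marker: str, three_arm: str) -> Optional[str]:
    """Replace whatever lies between the two arms in the genome with the marker.

    Returns None when either arm has no identical counterpart in the genome,
    which in this toy model stands for 'no homologous recombination'.
    """
    start = genome.find(five_arm)
    if start == -1:
        return None
    after_five = start + len(five_arm)
    three_start = genome.find(three_arm, after_five)   # 3' arm must lie downstream
    if three_start == -1:
        return None
    return genome[:after_five] + marker + genome[three_start:]


# Hypothetical locus: flank, 5' arm, segment to be replaced, 3' arm, flank.
five_arm = "ACGTACGTAC"
three_arm = "CATCATCATC"
genome = "TTTT" + five_arm + "GGGGGG" + three_arm + "AAAA"

targeted = recombine(genome, five_arm, "[eGFP]", three_arm)
print(targeted)
```

  • In this model the segment between the arms is lost and the marker takes its place, which mirrors the combined deletion and insertion described for the targeting vectors.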
  • constructs can be prepared for homologous recombination at a target locus.
  • the construct can include at least 50 bp, 100 bp, 500 bp, 1 kbp, 2 kbp, 4 kbp, 5 kbp, 10 kbp, 15 kbp, 20 kbp, or 50 kbp of sequence homologous with the target locus.
  • the sequence can include any contiguous sequence of the porcine CMP-Neu5Ac hydroxylase gene, including at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 550, 600, 700, 750, 800, 850, 900, 1000, 5000 or 10,000 contiguous nucleotides of Seq ID Nos 9-45, 46, 47, 48, 49, and 50, or any combination or fragment thereof.
  • Fragments of Seq ID Nos. 9-45, 46, 47, 48, 49 and 50 can include any contiguous nucleic acid or peptide sequence that includes at least about 10 bp, 15 bp, 17 bp, 20 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 5 kbp or 10 kbp.
  • target DNA sequences such as, for example, the size of the target locus, availability of sequences, relative efficiency of double cross-over events at the target locus and the similarity of the target sequence with other sequences.
  • the targeting DNA can include a sequence in which DNA substantially isogenic flanks the desired sequence modifications with a corresponding target sequence in the genome to be modified.
  • the substantially isogenic sequence can be at least about 95%, 97-98%, 99.0-99.5%, 99.6-99.9%, or 100% identical to the corresponding target sequence (except for the desired sequence modifications).
  • the targeting DNA and the target DNA preferably can share stretches of DNA at least about 75, 150 or 500 base pairs that are 100% identical. Accordingly, targeting DNA can be derived from cells closely related to the cell line being targeted; or the targeting DNA can be derived from cells of the same cell line or animal as the cells being targeted.
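The identity criteria above (overall percent identity, plus a minimum stretch of 100% identical sequence) can be checked with a short script. The following is a minimal sketch, assuming the two sequences are already aligned and of equal length; the function names, default thresholds, and toy sequences are illustrative assumptions and are not taken from any Seq ID No.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def longest_identical_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive matching positions."""
    best = run = 0
    for x, y in zip(a.upper(), b.upper()):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def is_substantially_isogenic(a: str, b: str,
                              min_identity: float = 95.0,
                              min_run: int = 75) -> bool:
    """Apply both criteria; the defaults mirror the lowest values quoted in the text."""
    return (percent_identity(a, b) >= min_identity
            and longest_identical_run(a, b) >= min_run)


# Hypothetical 20-nt toy sequences with a single mismatch.
target = "ACGTACGTACGTACGTACGT"
arm = "ACGTACGTACCTACGTACGT"

print(percent_identity(target, arm))           # 95.0
print(longest_identical_run(target, arm))      # 10
print(is_substantially_isogenic(target, arm))  # False: run is shorter than 75
```

Real homology arms are kilobases long and would first need to be aligned with a dedicated alignment tool; this sketch only covers the comparison step.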
  • the DNA constructs can be designed to modify the endogenous, target CMP-Neu5Ac hydroxylase.
  • the homologous sequence for targeting the construct can have one or more deletions, insertions, substitutions or combinations thereof designed to disrupt the function of the resultant gene product.
  • the alteration can be the insertion of a selectable marker gene fused in reading frame with the upstream sequence of the target gene.
  • Suitable selectable marker genes include, but are not limited to: genes conferring the ability to grow on certain media substrates, such as the tk gene (thymidine kinase) or the hprt gene (hypoxanthine phosphoribosyltransferase) which confer the ability to grow on HAT medium (hypoxanthine, aminopterin and thymidine); the bacterial gpt gene (guanine/xanthine phosphoribosyltransferase) which allows growth on MAX medium (mycophenolic acid, adenine, and xanthine). See, for example, Song, K-Y., et al. Proc. Nat'l Acad. Sci.
  • selectable markers include: genes conferring resistance to compounds such as antibiotics, genes conferring the ability to grow on selected substrates, genes encoding proteins that produce detectable signals such as luminescence, such as green fluorescent protein, enhanced green fluorescent protein (eGFP).
  • antibiotic resistance genes such as the neomycin resistance gene (neo) (Southern, P., and P. Berg, J. Mol. Appl. Genet.
  • hygromycin resistance gene (Nucleic Acids Research 11:6895-6911 (1983), and Te Riele, H., et al., Nature 348:649-651 (1990)).
  • selectable marker genes include: acetohydroxy acid synthase (AHAS), alkaline phosphatase (AP), beta-galactosidase (LacZ), beta-glucuronidase (GUS), chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP), red fluorescent protein (RFP), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), horseradish peroxidase (HRP), luciferase (Luc), nopaline synthase (NOS), octopine synthase (OCS), and derivatives thereof.
  • Multiple selectable markers are available that confer resistance to ampicillin, bleomycin, chloramphenicol, gentamycin, hygromycin, kanamycin, lincomycin, methotrexate, phosphinothricin, puromycin, and tetracycline.
  • Combinations of selectable markers can also be used.
  • a neo gene (with or without its own promoter, as discussed above) can be cloned into a DNA sequence which is homologous to the CMP-Neu5Ac hydroxylase gene.
  • the HSV-tk gene can be cloned such that it is outside of the targeting DNA (another selectable marker could be placed on the opposite flank, if desired). After introducing the DNA construct into the cells to be targeted, the cells can be selected on the appropriate antibiotics.
  • those cells which are resistant to G418 and ganciclovir are most likely to have arisen by homologous recombination in which the neo gene has been recombined into the CMP-Neu5Ac hydroxylase gene but the tk gene has been lost because it was located outside the region of the double crossover.
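The positive-negative selection logic described above can be written out as a simple truth table. This is a minimal sketch under simplifying assumptions: the `Cell` class and the three example cells are hypothetical, and each marker is treated as strictly present or absent. In practice some random integrants also lose tk, which is why the text says surviving cells are "most likely", not certain, to be homologous recombinants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    label: str
    has_neo: bool  # neo present -> survives G418 (positive selection)
    has_tk: bool   # HSV-tk retained -> killed by ganciclovir (negative selection)


def survives_selection(cell: Cell) -> bool:
    """A cell survives double selection only if it carries neo and has lost tk."""
    return cell.has_neo and not cell.has_tk


# Hypothetical outcomes after introducing the construct.
cells = [
    Cell("untransfected", has_neo=False, has_tk=False),
    Cell("random integration", has_neo=True, has_tk=True),
    Cell("homologous recombinant", has_neo=True, has_tk=False),
]

survivors = [c.label for c in cells if survives_selection(c)]
print(survivors)  # ['homologous recombinant']
```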
  • Deletions can be at least about 50 bp, more usually at least about 100 bp, and generally not more than about 20 kbp, where the deletion can normally include at least a portion of the coding region including a portion of or one or more exons, a portion of or one or more introns, and may or may not include a portion of the flanking non-coding regions, particularly the 5′-non-coding region (transcriptional regulatory region).
  • the homologous region can extend beyond the coding region into the 5′-non-coding region or alternatively into the 3′-non-coding region.
  • Insertions generally do not exceed 10 kbp, usually do not exceed 5 kbp, and are generally at least 50 bp, more usually at least 200 bp.
  • the region(s) of homology can include mutations, where mutations can further inactivate the target gene, in providing for a frame shift, or changing a key amino acid, or the mutation can correct a dysfunctional allele, etc.
  • the mutation can be a subtle change, not exceeding about 5% of the homologous flanking sequences.
  • the construct can be prepared in accordance with methods known in the art; various fragments can be brought together, introduced into appropriate vectors, cloned, analyzed and then manipulated further until the desired construct has been achieved (see, for example, FIGS. 5-11).
  • Various modifications can be made to the sequence, to allow for restriction analysis, excision, identification of probes, etc.
  • Silent mutations can be introduced, as desired. At various stages, restriction analysis, sequencing, amplification with the polymerase chain reaction, primer repair, in vitro mutagenesis, etc. can be employed.
  • the construct can be prepared using a bacterial vector, including a prokaryotic replication system, e.g. an origin recognizable by E. coli; at each stage the construct can be cloned and analyzed.
  • a marker the same as or different from the marker to be used for insertion, can be employed, which can be removed prior to introduction into the target cell.
  • Once the vector containing the construct has been completed, it can be further manipulated, such as by deletion of the bacterial sequences, linearization, or introducing a short deletion in the homologous sequence. After final manipulation, the construct can be introduced into the cell.
  • Techniques which can be used to allow the DNA construct entry into the host cell include calcium phosphate/DNA co-precipitation, microinjection of DNA into the nucleus, electroporation, bacterial protoplast fusion with intact cells, transfection, or any other technique known by one skilled in the art.
  • the DNA can be single or double stranded, linear or circular, relaxed or supercoiled DNA.
  • Keown et al. Methods in Enzymology Vol. 185, pp. 527-537 (1990).
  • the present invention further includes recombinant constructs comprising one or more of the sequences as broadly described above (for example in Tables 9-12).
  • the constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation.
  • the construct can also include regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available.
  • the following vectors are provided by way of example: pBs, pQE-9 (Qiagen), phagescript, PsiX174, pBluescript SK, pBsKS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
  • any other plasmids and vectors can be used as long as they are replicable and viable in the host.
  • Vectors known in the art and those commercially available (and variants or derivatives thereof) can in accordance with the invention be engineered to include one or more recombination sites for use in the methods of the invention.
  • Such vectors can be obtained from, for example, Vector Laboratories Inc., Invitrogen, Promega, Novagen, NEB, Clontech, Boehringer Mannheim, Pharmacia, EpiCenter, OriGenes Technologies Inc., Stratagene, PerkinElmer, Pharmingen, and Research Genetics.
  • vectors of interest include eukaryotic expression vectors such as pFastBac, pFastBacHT, pFastBacDUAL, pSFV, and pTet-Splice (Invitrogen), pEUK-C1, pPUR, pMAM, pMAMneo, pBI101, pBI121, pDR2, pCMVEBNA, and pYACneo (Clontech), pSVK3, pSVL, pMSG, pCH110, and pKK232-8 (Pharmacia, Inc.), p3′SS, pXT1, pSG5, pPbac, pMbac, pMC1neo, and pOG44 (Stratagene, Inc.), and pYES2, pAC360, pBlueBacHis A, B, and C, pVL1392, pBlueBacIII, pC
  • vectors suitable for use in the invention include pUC18, pUC19, pBlueScript, pSPORT, cosmids, phagemids, YAC's (yeast artificial chromosomes), BAC's (bacterial artificial chromosomes), P1 (Escherichia coli phage), pQE70, pQE60, pQE9 (Qiagen), pBS vectors, PhageScript vectors, BlueScript vectors, pNH8A, pNH16A, pNH18A, pNH46A (Stratagene), pcDNA3 (Invitrogen), pGEX, pTrsfus, pTrc99A, pET-5, pET-9, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia), pSPORT1, pSPORT2, pCMVSPORT2.0 and pSV-SPORT1 (Invitrogen) and variants or derivative
  • Additional vectors of interest include pTrxFus, pThioHis, pLEX, pTrcHis, pTrcHis2, pRSET, pBlueBacHis2, pcDNA3.1/His, pcDNA3.1(-)/Myc-His, pSecTag, pEBVHis, pPIC9K, pPIC3.5K, pAO815, pPICZ, pPICZA, pPICZB, pPICZC, pGAPZA, pGAPZB, pGAPZC, pBlueBac4.5, pBlueBacHis2, pMelBac, pSinRep5, pSinHis, pIND, pIND(SP1), pVgRXR, pcDNA2.1, pYES2, pZErO1.1, pZErO-2.1, pCR-Blunt,
  • Additional vectors include, for example, pPC86, pDBLeu, pDBTrp, pPC97, p2.5, pGAD1-3, pGAD10, pACt, pACT2, pGADGL, pGADGH, pAS2-1, pGAD424, pGBT8, pGBT9, pGAD-GAL4, pLexA, pBD-GAL4, pHISi, pHISi-1, placZi, pB42AD, pDG202, pJK202, pJG4-5, pNLexA, pYESTrp and variants or derivatives thereof.
  • any other plasmids and vectors known in the art can be used as long as they are replicable and viable in the host.
  • Cells that have been homologously recombined to knock-out expression of the porcine CMP-Neu5Ac hydroxylase gene can then be grown in appropriately-selected medium to identify cells providing the appropriate integration. Those cells which show the desired phenotype can then be further analyzed by restriction analysis, electrophoresis, Southern analysis, polymerase chain reaction, or another technique known in the art. By identifying fragments which show the appropriate insertion at the target gene site, cells can be identified in which homologous recombination has occurred to inactivate or otherwise modify the target gene.
  • the presence of the selectable marker gene inserted into the CMP-Neu5Ac hydroxylase gene establishes the integration of the target construct into the host genome. Those cells which show the desired phenotype can then be further analyzed by restriction analysis, electrophoresis, Southern analysis, polymerase chain reaction, monoclonal antibody assays, Fluorescent Activated Cell Sorter (FACS), or any other techniques or methods known in the art to analyze the DNA in order to establish whether homologous or non-homologous recombination occurred.
  • Primers can also be used which are complementary to a sequence within the construct and complementary to a sequence outside the construct and at the target locus. In this way, one can only obtain DNA duplexes having both of the primers present in the complementary chains if homologous recombination has occurred. By demonstrating the presence of the primer sequences or the expected size sequence, the occurrence of homologous recombination is supported.
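The junction-PCR screen described above can be illustrated in silico: one primer anneals inside the construct (here, in the marker) and the other anneals in genomic sequence outside the homology arm, so a product forms only at a correctly targeted locus. This is a minimal sketch with exact string matching; all sequences are hypothetical toy sequences, and the primers are far shorter than real primers would be.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, fwd: str, rev: str):
    """Return the product length if both primers (written 5'->3') anneal in a
    convergent orientation on the template's top strand, otherwise None."""
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)  # where the reverse primer binds, top-strand view
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Hypothetical toy sequences.
outside = "TTGACCATGG"      # genomic flank outside the 5' arm
arm5 = "GATTACAGATTACA"     # 5' recombination arm
native = "AGAGAGAG"         # native sequence replaced on targeting
marker = "CCCGGGAAATTT"     # selectable marker
arm3 = "TGCATGCATGCA"       # 3' recombination arm

wild_type = outside + arm5 + native + arm3
targeted = outside + arm5 + marker + arm3
random_integrant = "CACACACACA" + arm5 + marker + arm3 + "GTGTGTGT"

fwd_primer = outside                     # anneals outside the construct
rev_primer = reverse_complement(marker)  # anneals inside the construct

for name, locus in [("wild type", wild_type),
                    ("targeted", targeted),
                    ("random integrant", random_integrant)]:
    print(name, amplicon_length(locus, fwd_primer, rev_primer))
# wild type None
# targeted 36
# random integrant None
```

Only the targeted locus carries both primer sites on the same molecule, which is the basis of the screen; the expected product size (36 nt in this toy case) gives the second check mentioned in the text.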
  • An alternative method for screening homologous recombination events includes utilizing monoclonal or polyclonal antibodies specific for porcine CMP-Neu5Ac Hydroxylase and/or Neu5Gc, as described in, for example, Malykh, et al., European Journal of Cell Biology 80, 48-58 (2001), Malykh, et al., Glycoconjugate J. 15, 885-893 (1998).
  • porcine cells lacking expression of functional CMP-Neu5Ac Hydroxylase due to homologous recombination events include, but are not limited to, Southern Blot analysis, Northern Blot analysis, specific lectin binding assays, and/or sequence analysis, or by using anti-Neu5Gc or anti-CMP-Neu5Ac hydroxylase antibody assays as described, for example, in Y. Malykh, et al. Biochem J. 370: 601-607 (2003); Y. Malykh, et al. European Journal of Cell Biology 80: 48-58 (2001); Y. Malykh, et al. Glycoconjugate J. 15: 885-893 (1998). See generally, for example, A. Sharma, et al. Transplantation 75(4): 430-436 (2003).
  • the cell lines obtained from the first round of targeting are likely to be heterozygous for the targeted allele.
  • Homozygosity in which both alleles are modified, can be achieved in a number of ways. One approach is to grow up a number of cells in which one copy has been modified and then to subject these cells to another round of targeting of the remaining porcine CMP-Neu5Ac hydroxylase allele using a different selectable marker.
  • homozygotes can be obtained by breeding animals heterozygous for the modified allele, according to traditional Mendelian genetics. In some situations, it can be desirable to have two different modified alleles. This can be achieved by successive rounds of gene targeting or by breeding heterozygotes, each of which carries one of the desired modified alleles.
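The expected genotype ratios from breeding heterozygotes follow from a single-locus Punnett square. A minimal sketch, assuming simple Mendelian segregation with equal viability of all genotypes; the allele symbols (`+` for wild type, `a` and `b` for two different modified alleles) are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Expected offspring genotype fractions for a single-locus cross.
    Each parent is a 2-character string, one character per allele."""
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two heterozygotes carrying the same modified allele:
# one quarter of offspring are expected to be homozygous for it.
print(cross("+a", "+a"))

# Heterozygotes carrying different modified alleles:
# one quarter of offspring are expected to carry both ('ab').
print(cross("+a", "+b"))
```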
  • cells homozygous for the nonfunctional CMP-Neu5Ac hydroxylase gene can be subject to further genetic modification.
  • By introducing a construct comprising substantially the same homologous DNA, possibly with extended sequences, having the marker gene portion of the original construct deleted, one may be able to obtain homologous recombination with the target locus.
  • By using a combination of marker genes for integration, one providing positive selection and the other negative selection, in the removal step one can select against the cells retaining the marker genes.
  • porcine cells are provided that lack the CMP-Neu5Ac hydroxylase gene and the α(1,3)GT gene.
  • Animals lacking functional CMP-Neu5Ac hydroxylase can be produced according to the present invention, and then cells from this animal can be used to knockout the α(1,3)GT gene.
  • Homozygous α(1,3)GT negative porcine have recently been reported (Phelps et al. Science 2003; WO 04/028243).
  • cells from these α(1,3)GT knockout animals can be used and further modified to inactivate the CMP-Neu5Ac hydroxylase gene.
  • porcine cells are also provided that lack the porcine CMP-Neu5Ac hydroxylase gene and produce human complement inhibiting proteins.
  • Animals lacking functional porcine CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be further modified to express human complement inhibiting proteins, such as, but not limited to, CD59 (cDNA reported by Philbrick, W. M., et al. (1990) Eur. J. Immunol. 20:87-92), human decay accelerating factor (DAF) (cDNA reported by Medof, et al. (1987) Proc. Natl. Acad. Sci. USA 84: 2007), and human membrane cofactor protein (MCP) (cDNA reported by Lublin, D., et al. (1988) J. Exp. Med. 168: 181-194).
  • cells from transgenic pigs producing human complement inhibiting proteins can be used and further modified to inactivate the porcine CMP-Neu5Ac hydroxylase gene.
  • Transgenic pigs producing human complement inhibiting proteins are known in the art (see, for example, U.S. Pat. No. 6,166,288).
  • porcine cells are provided that lack the porcine CMP-Neu5Ac hydroxylase gene and the porcine Forssman synthetase (FSM) gene.
  • Animals lacking functional porcine CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be further modified to knockout the porcine FSM synthetase gene, which is involved in the production of gal-α-gal epitopes, and plays a role in xenotransplant rejection.
  • the porcine FSM synthetase gene has recently been identified (see U.S. Application 60/568,922).
  • cells from these FSM synthetase gene knockout animals can be used and further modified to inactivate the porcine CMP-Neu5Ac hydroxylase gene.
  • porcine cells are provided that lack the porcine CMP-Neu5Ac hydroxylase gene and the porcine isogloboside 3 synthase gene.
  • Animals lacking functional porcine CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be used to knockout the porcine iGb3 synthase gene.
  • the porcine iGb3 synthase gene has recently been reported (U.S. Application No. 60/517,524).
  • cells from these porcine iGb3 synthase gene knockout animals can be used and further modified to inactivate the porcine CMP-Neu5Ac hydroxylase gene.
  • porcine cells are provided that lack the porcine CMP-Neu5Ac hydroxylase gene, the α(1,3)GT gene, the FSM synthetase gene, and the porcine iGb3 synthase gene.
  • Animals lacking functional CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be used to knockout the α(1,3)GT gene, the FSM synthetase gene, and the porcine iGb3 synthase gene.
  • Homozygous α(1,3)GT-negative porcine have recently been reported (Phelps et al.
  • cells from these α(1,3)GT knockout animals can be used and further modified to inactivate the porcine iGb3 synthase gene, the porcine FSM synthetase gene, and the CMP-Neu5Ac hydroxylase gene, and, in addition, express human complement inhibiting proteins, such as, but not limited to, CD59, human decay accelerating factor (DAF), and human membrane cofactor protein (MCP).
  • the present invention provides methods of producing a transgenic pig that lacks expression of CMP-Neu5Ac hydroxylase through the genetic modification of porcine totipotent embryonic cells.
  • the animals can be produced by: (a) identifying one or more target CMP-Neu5Ac hydroxylase nucleic acid genomic sequences in an animal; (b) preparing one or more homologous recombination vectors targeting the CMP-Neu5Ac hydroxylase nucleic acid genomic sequences; (c) inserting the one or more targeting vectors into the genomes of a plurality of totipotent cells of the animal, thereby producing a plurality of transgenic totipotent cells; (d) obtaining a tetraploid blastocyst of the animal; (e) inserting the plurality of totipotent cells into the tetraploid blastocyst, thereby producing a transgenic embryo; (f) transferring the embryo to a recipient female animal; and (g) allowing the embryo to develop to term in the female animal.
  • the totipotent cells can be embryonic stem (ES) cells.
  • the isolation of ES cells from blastocysts, the establishing of ES cell lines and their subsequent cultivation are carried out by conventional methods as described, for example, by Doetschman et al., J. Embryol. Exp. Morph. 87:27-45 (1985); Li et al., Cell 69:915-926 (1992); Robertson, E. J. “Teratocarcinomas and Embryonic Stem Cells: A Practical Approach,” ed. E. J. Robertson, IRL Press, Oxford, England (1987); Wurst and Joyner, “Gene Targeting: A Practical Approach,” ed. A. L.
  • the cells can be plated onto a feeder layer in an appropriate medium, for example, such as fetal bovine serum enhanced DMEM.
  • Cells containing the construct can be detected by employing a selective medium, and after sufficient time for colonies to grow, colonies can be picked and analyzed for the occurrence of homologous recombination.
  • Polymerase chain reaction can be used, with one primer inside the construct sequence and one outside it but at the target locus. Those colonies which show homologous recombination can then be used for embryo manipulation and blastocyst injection.
  • Blastocysts can be obtained from superovulated females.
  • the embryonic stem cells can then be trypsinized and the modified cells added to a droplet containing the blastocysts. At least one of the modified embryonic stem cells can be injected into the blastocoel of the blastocyst.
  • blastocysts can be returned to each uterine horn of pseudopregnant females. Females are then allowed to go to term and the resulting litters screened for mutant cells having the construct.
  • the blastocysts are selected for different parentage from the transformed ES cells. By providing for a different phenotype of the blastocyst and the ES cells, chimeric progeny can be readily detected, and then genotyping can be conducted to probe for the presence of the modified CMP-Neu5Ac hydroxylase gene.
  • the totipotent cells can be embryonic germ (EG) cells.
  • Embryonic Germ cells are undifferentiated cells functionally equivalent to ES cells, that is they can be cultured and transfected in vitro, then contribute to somatic and germ cell lineages of a chimera (Stewart et al., Dev. Biol. 161:626-628 (1994)).
  • EG cells are derived by culture of primordial germ cells, the progenitors of the gametes, with a combination of growth factors: leukemia inhibitory factor, steel factor and basic fibroblast growth factor (Matsui, et al., Cell 70:841-847 (1992); Resnick, et al., Nature 359:550-551 (1992)).
  • the cultivation of EG cells can be carried out using methods known to one skilled in the art, such as described in Donovan et al., “Transgenic Animals, Generation and Use,” Ed. L. M. Houdebine, Harwood Academic Publishers (1997).
  • Tetraploid blastocysts for use in the invention can be obtained by natural zygote production and development, or by known methods by electrofusion of two-cell embryos and subsequently cultured as described, for example, by James, et al., Genet. Res. Camb. 60:185-194 (1992); Nagy and Rossant, “Gene Targeting: A Practical Approach,” ed. A. L. Joyner, IRL Press, Oxford, England (1993); or by Kubiak and Tarkowski, Exp. Cell Res. 157:561-566 (1985).
  • the introduction of the ES cells or EG cells into the blastocysts can be carried out by any method known in the art, for example, as described by Wang, et al., EMBO J. 10:2437-2450 (1991).
  • a “plurality” of totipotent cells can encompass any number of cells greater than one.
  • the number of totipotent cells for use in the present invention can be about 2 to about 30 cells, about 5 to about 20 cells, or about 5 to about 10 cells.
  • about 5-10 ES cells taken from a single cell suspension are injected into a blastocyst immobilized by a holding pipette in a micromanipulation apparatus. Then the embryos are incubated for at least 3 hours, possibly overnight, prior to introduction into a female recipient animal via methods known in the art (see for example Robertson, E. J. “Teratocarcinomas and Embryonic Stem Cells: A Practical Approach” IRL Press, Oxford, England (1987)). The embryo can then be allowed to develop to term in the female animal.
  • the present invention provides a method for cloning a pig lacking a functional CMP-Neu5Ac hydroxylase gene via somatic cell nuclear transfer.
  • a wide variety of methods to accomplish mammalian cloning are currently being rapidly developed and reported; any method that accomplishes the desired result can be used in the present invention. Nonlimiting examples of such methods are described below.
  • the pig can be produced by a nuclear transfer process comprising the following steps: obtaining desired differentiated pig cells to be used as a source of donor nuclei; obtaining oocytes from a pig; enucleating the oocytes; transferring the desired differentiated cell or cell nucleus into the enucleated oocyte, e.g., by fusion or injection, to form NT units; activating the resultant NT unit; and transferring said cultured NT unit to a host pig such that the NT unit develops into a fetus.
  • a donor cell nucleus which has been modified to alter the CMP-Neu5Ac hydroxylase gene, is transferred to a recipient porcine oocyte.
  • the use of this method is not restricted to a particular donor cell type.
  • the donor cell can be as described in Wilmut, et al., Nature 385:810 (1997); Campbell, et al., Nature 380:64-66 (1996); or Cibelli, et al., Science 280:1256-1258 (1998). All cells of normal karyotype, including embryonic, fetal and adult somatic cells, which can be used successfully in nuclear transfer can in principle be employed. Fetal fibroblasts are a particularly useful class of donor cells.
  • Donor cells can also be, but do not have to be, in culture and can be quiescent.
  • Nuclear donor cells which are quiescent are cells which can be induced to enter quiescence or exist in a quiescent state in vivo.
  • Prior art methods have also used embryonic cell types in cloning procedures (Campbell, et al., Nature, 380:64-68, 1996, and Stice, et al., Biol. Reprod., 54:100-110, 1996).
  • Somatic nuclear donor cells may be obtained from a variety of different organs and tissues such as, but not limited to, skin, mesenchyme, lung, pancreas, heart, intestine, stomach, bladder, blood vessels, kidney, urethra, reproductive organs, and a disaggregated preparation of a whole or part of an embryo, fetus or adult animal.
  • nuclear donor cells are selected from the group consisting of epithelial cells, fibroblast cells, neural cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T), macrophages, monocytes, mononuclear cells, cardiac muscle cells, other muscle cells, granulosa cells, cumulus cells, epidermal cells or endothelial cells.
  • the nuclear donor cell is an embryonic stem cell.
  • fibroblast cells can be used as donor cells.
  • the nuclear donor cells of the invention are germ cells of an animal. Any germ cell of an animal species in the embryonic, fetal, or adult stage may be used as a nuclear donor cell. In a suitable embodiment, the nuclear donor cell is an embryonic germ cell.
  • Nuclear donor cells may be arrested in any phase of the cell cycle (G0, G1, G2, S, M) so as to ensure coordination with the acceptor cell. Any method known in the art may be used to manipulate the cell cycle phase. Methods to control the cell cycle phase include, but are not limited to, G0 quiescence induced by contact inhibition of cultured cells, G0 quiescence induced by removal of serum or other essential nutrient, G0 quiescence induced by senescence, G0 quiescence induced by addition of a specific growth factor; G0 or G1 quiescence induced by physical or chemical means such as heat shock, hyperbaric pressure or other treatment with a chemical, hormone, growth factor or other substance; S-phase control via treatment with a chemical agent which interferes with any.
  • Methods for isolation of oocytes are well known in the art. Essentially, this can comprise isolating oocytes from the ovaries or reproductive tract of a pig. A readily available source of pig oocytes is slaughterhouse materials. For the combination of techniques such as genetic engineering, nuclear transfer and cloning, oocytes must generally be matured in vitro before these cells can be used as recipient cells for nuclear transfer, and before they can be fertilized by the sperm cell to develop into an embryo.
  • This process generally requires collecting immature (prophase I) oocytes from mammalian ovaries, e.g., bovine ovaries obtained at a slaughterhouse, and maturing the oocytes in a maturation medium prior to fertilization or enucleation until the oocyte attains the metaphase II stage, which in the case of bovine oocytes generally occurs about 18-24 hours post-aspiration. This period of time is known as the “maturation period”.
  • a metaphase II stage oocyte can be the recipient oocyte; at this stage it is believed that the oocyte can be or is sufficiently “activated” to treat the introduced nucleus as it does a fertilizing sperm.
  • Metaphase II stage oocytes which have been matured in vivo have been successfully used in nuclear transfer techniques. Essentially, mature metaphase II oocytes can be collected surgically from either non-superovulated or superovulated porcine 35 to 48, or 39-41, hours past the onset of estrus or past the injection of human chorionic gonadotropin (hCG) or similar hormone.
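The collection windows quoted above (35 to 48 hours, or 39-41 hours, after the onset of estrus or the hCG injection) translate directly into clock times. A minimal sketch, assuming a single known reference time; the function name and the example date are hypothetical.

```python
from datetime import datetime, timedelta


def collection_window(reference: datetime,
                      earliest_h: float = 35,
                      latest_h: float = 48):
    """Return (start, end) of the oocyte collection window, measured in hours
    after the reference event (onset of estrus or hCG injection)."""
    if earliest_h > latest_h:
        raise ValueError("earliest_h must not exceed latest_h")
    return (reference + timedelta(hours=earliest_h),
            reference + timedelta(hours=latest_h))


# Hypothetical hCG injection time.
hcg_injection = datetime(2024, 1, 1, 8, 0)

broad = collection_window(hcg_injection)           # 35 to 48 hours
narrow = collection_window(hcg_injection, 39, 41)  # 39 to 41 hours

print(broad[0], "to", broad[1])    # 2024-01-02 19:00:00 to 2024-01-03 08:00:00
print(narrow[0], "to", narrow[1])  # 2024-01-02 23:00:00 to 2024-01-03 01:00:00
```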
  • the oocytes can be enucleated. Prior to enucleation the oocytes can be removed and placed in appropriate medium, such as HECM containing 1 milligram per milliliter of hyaluronidase prior to removal of cumulus cells. The stripped oocytes can then be screened for polar bodies, and the selected metaphase II oocytes, as determined by the presence of polar bodies, are then used for nuclear transfer. Enucleation follows.
  • Enucleation can be performed by known methods, such as described in U.S. Pat. No. 4,994,384.
  • metaphase II oocytes can be placed in either HECM, optionally containing 7.5 micrograms per milliliter cytochalasin B, for immediate enucleation, or can be placed in a suitable medium, for example an embryo culture medium such as CR1aa, plus 10% estrus cow serum, and then enucleated later, preferably not more than 24 hours later, and more preferably 16-18 hours later.
  • Enucleation can be accomplished microsurgically using a micropipette to remove the polar body and the adjacent cytoplasm.
  • the oocytes can then be screened to identify those which have been successfully enucleated.
  • One way to screen the oocytes is to stain the oocytes with 1 microgram per milliliter Hoechst 33342 dye in HECM, and then view the oocytes under ultraviolet irradiation for less than 10 seconds.
  • the oocytes that have been successfully enucleated can then be placed in a suitable culture medium, for example, CR1aa plus 10% serum.
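The working concentrations quoted for the enucleation steps (7.5 micrograms per milliliter cytochalasin B and 1 microgram per milliliter Hoechst 33342) are typically reached by diluting a concentrated stock, using C1·V1 = C2·V2. A minimal sketch follows; the stock concentrations and the 10 mL batch volume are assumptions chosen for illustration and do not appear in the text.

```python
def stock_volume_ul(final_ug_per_ml: float,
                    final_volume_ml: float,
                    stock_ug_per_ml: float) -> float:
    """Volume of stock (in microliters) needed so that the final medium
    reaches the desired concentration, from C1*V1 = C2*V2."""
    if stock_ug_per_ml <= final_ug_per_ml:
        raise ValueError("stock must be more concentrated than the final medium")
    return final_ug_per_ml * final_volume_ml * 1000.0 / stock_ug_per_ml


# Assumed stocks: cytochalasin B at 5 mg/mL, Hoechst 33342 at 1 mg/mL.
cytochalasin_ul = stock_volume_ul(7.5, 10.0, 5000.0)
hoechst_ul = stock_volume_ul(1.0, 10.0, 1000.0)

print(cytochalasin_ul)  # 15.0 (microliters of stock per 10 mL of HECM)
print(hoechst_ul)       # 10.0
```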
  • a single mammalian cell of the same species as the enucleated oocyte can then be transferred into the perivitelline space of the enucleated oocyte used to produce the NT unit.
  • the mammalian cell and the enucleated oocyte can be used to produce NT units according to methods known in the art.
  • the cells can be fused by electrofusion. Electrofusion is accomplished by providing a pulse of electricity that is sufficient to cause a transient breakdown of the plasma membrane. This breakdown of the plasma membrane is very short because the membrane reforms rapidly. Thus, if two adjacent membranes are induced to break down and upon reformation the lipid bilayers intermingle, small channels can open between the two cells.
  • Due to the thermodynamic instability of such a small opening, it enlarges until the two cells become one. See, for example, U.S. Pat. No. 4,997,384 by Prather et al.
  • electrofusion media can be used including, for example, sucrose, mannitol, sorbitol and phosphate buffered solution. Fusion can also be accomplished using Sendai virus as a fusogenic agent (Graham, Wistar Inst. Symp. Monogr., 9, 19, 1969).
  • the nucleus can be injected directly into the oocyte rather than using electroporation fusion. See, for example, Collas and Barnes, Mol. Reprod. Dev., 38:264-267 (1994).
  • the resultant fused NT units are then placed in a suitable medium until activation, for example, CR1aa medium. Typically activation can be effected shortly thereafter, for example less than 24 hours later, or about 4-9 hours later.
  • the NT unit can be activated by any method that accomplishes the desired result. Such methods include, for example, culturing the NT unit at sub-physiological temperature, in essence by applying a cold, or actually cool temperature shock to the NT unit. This can be most conveniently done by culturing the NT unit at room temperature, which is cold relative to the physiological temperature conditions to which embryos are normally exposed. Alternatively, activation can be achieved by application of known activation agents. For example, penetration of oocytes by sperm during fertilization has been shown to activate prefusion oocytes to yield greater numbers of viable pregnancies and multiple genetically identical pigs after nuclear transfer. Also, treatments such as electrical and chemical shock can be used to activate NT embryos after fusion. See, for example, U.S.
  • activation can be effected by simultaneously or sequentially increasing levels of divalent cations in the oocyte and reducing phosphorylation of cellular proteins in the oocyte. This can generally be effected by introducing divalent cations into the oocyte cytoplasm, e.g., magnesium, strontium, barium or calcium, e.g., in the form of an ionophore.
  • Other methods of increasing divalent cation levels include the use of electric shock, treatment with ethanol and treatment with caged chelators.
  • Phosphorylation can be reduced by known methods, for example, by the addition of kinase inhibitors, e.g., serine-threonine kinase inhibitors, such as 6-dimethyl-aminopurine, staurosporine, 2-aminopurine, and sphingosine.
  • phosphorylation of cellular proteins can be inhibited by introduction of a phosphatase into the oocyte, e.g., phosphatase 2A and phosphatase 2B.
  • the activated NT units can then be cultured in a suitable in vitro culture medium until the generation of cell colonies.
  • Culture media suitable for culturing and maturation of embryos are well known in the art. Examples of known media, which can be used for embryo culture and maintenance, include Ham's F-10+10% fetal calf serum (FCS), Tissue Culture Medium-199 (TCM-199)+10% fetal calf serum, Tyrodes-Albumin-Lactate-Pyruvate (TALP), Dulbecco's Phosphate Buffered Saline (PBS), Eagle's and Whitten's media.
  • the cultured NT unit or units can be washed and then placed in a suitable medium contained in well plates which preferably contain a suitable confluent feeder layer.
  • Suitable feeder layers include, by way of example, fibroblasts and epithelial cells.
  • the NT units are cultured on the feeder layer until the NT units reach a size suitable for transferring to a recipient female, or for obtaining cells which can be used to produce cell colonies.
  • these NT units can be cultured until at least about 2 to 400 cells, more preferably about 4 to 128 cells, and most preferably at least about 50 cells.
  • Activated NT units can then be transferred (embryo transfers) to the oviduct of a female pig.
  • the female pig can be an estrus-synchronized recipient gilt.
  • Crossbred gilts (Large White/Duroc/Landrace) (280-400 lbs) can be used.
  • the gilts can be synchronized as recipient animals by oral administration of 18-20 mg Regu-Mate (Altrenogest, Hoechst, Warren, N.J.) mixed into the feed. Regu-Mate can be fed for 14 consecutive days.
  • One thousand units of Human Chorionic Gonadotropin (hCG, Intervet America, Millsboro, Del.) can then be administered i.m. about 105 h after the last Regu-Mate treatment.
  • Embryo transfers can then be performed about 22-26 h after the hCG injection.
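  • The recipient synchronization timeline described above (14 consecutive days of Regu-Mate, hCG about 105 h after the last Regu-Mate treatment, embryo transfer about 22-26 h after hCG) can be laid out as a simple calendar calculation. The following is a minimal sketch only; the function name `recipient_schedule`, the example start date, and the assumption that Regu-Mate is fed at the same time each day are illustrative and are not taken from the specification.

```python
from datetime import datetime, timedelta


def recipient_schedule(first_feed, feed_days=14, hcg_delay_h=105,
                       transfer_window_h=(22, 26)):
    """Return key time points for synchronizing a recipient gilt.

    Assumption (not from the source): one Regu-Mate feeding per day at the
    same clock time, so the last feeding is (feed_days - 1) days after the first.
    """
    last_feed = first_feed + timedelta(days=feed_days - 1)
    hcg = last_feed + timedelta(hours=hcg_delay_h)
    earliest = hcg + timedelta(hours=transfer_window_h[0])
    latest = hcg + timedelta(hours=transfer_window_h[1])
    return {
        "last_regumate": last_feed,
        "hcg_injection": hcg,
        "transfer_earliest": earliest,
        "transfer_latest": latest,
    }


# Hypothetical example: first feeding on 1 January 2024 at 08:00.
plan = recipient_schedule(datetime(2024, 1, 1, 8, 0))
for step, when in plan.items():
    print(f"{step:18s} {when:%Y-%m-%d %H:%M}")
```

  • Because the specification gives the intervals as approximate ("about"), the computed times are best read as targets rather than exact deadlines.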
  • the pregnancy can be brought to term and result in the birth of live offspring.
  • the pregnancy can be terminated early and embryonic cells can be harvested.
  • the methods for embryo transfer and recipient animal management in the present invention are standard procedures used in the embryo transfer industry. Synchronous transfers are important for success of the present invention, i.e., the stage of the NT embryo is in synchrony with the estrus cycle of the recipient female. See, for example, Seidel, G. E., Jr. “Critical review of embryo transfer procedures with cattle” in Fertilization and Embryonic Development in Vitro (1981) L. Mastroianni, Jr. and J. D. Biggers, ed., Plenum Press, New York, N.Y., page 323.
  • the present invention provides viable porcine in which both alleles of the CMP-Neu5Ac hydroxylase gene have been inactivated.
  • the invention also provides organs, tissues, and cells derived from such porcine, which are useful for xenotransplantation.
  • the invention provides porcine organs, tissues and/or purified or substantially pure cells or cell lines obtained from pigs that lack any expression of functional CMP-Neu5Ac hydroxylase.
  • the invention provides organs that are useful for xenotransplantation.
  • Any porcine organ can be used, including, but not limited to: brain, heart, lungs, glands, brain, eye, stomach, spleen, pancreas, kidneys, liver, intestines, uterus, bladder, skin, hair, nails, ears, nose, mouth, lips, gums, teeth, tongue, salivary glands, tonsils, pharynx, esophagus, large intestine, small intestine, rectum, anus, pylorus, thyroid gland, thymus gland, suprarenal capsule, bones, cartilage, tendons, ligaments, skeletal muscles, smooth muscles, blood vessels, blood, spinal cord, trachea, ureters, urethra, hypothalamus, pituitary, adrenal glands, ovaries, oviducts, uterus, vagina, mammary glands, testes, seminal vesicles, penis, lymph, lymph nodes
  • the invention provides tissues that are useful for xenotransplantation.
  • Any porcine tissue can be used, including, but not limited to: epithelium, connective tissue, blood, bone, cartilage, muscle, nerve, adenoid, adipose, areolar, bone, brown adipose, cancellous, muscle, cartilaginous, cavernous, chondroid, chromaffin, dartoic, elastic, epithelial, fatty, fibrohyaline, fibrous, Gamgee, gelatinous, granulation, gut-associated lymphoid, Haller's vascular, hard hemopoietic, indifferent, interstitial, investing, islet, lymphatic, lymphoid, mesenchymal, mesonephric, mucous connective, multilocular adipose, myeloid, nasion soft, nephrogenic, nodal, osseous, osteogenic, osteoid, periapical, reticular, reticular
  • the invention provides cells and cell lines from porcine animals that lack expression of functional alpha1,3GT.
  • these cells or cell lines can be used for xenotransplantation.
  • Cells from any porcine tissue or organ can be used, including, but not limited to: epithelial cells, fibroblast cells, neural cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T), macrophages, monocytes, mononuclear cells, cardiac muscle cells, other muscle cells, ⁇ hosphate cells, cumulus cells, epidermal cells, endothelial cells, Islets of Langerhans cells, pancreatic insulin secreting cells, pancreatic alpha-2 cells, pancreatic beta cells, pancreatic alpha-1 cells, blood cells, blood precursor cells, bone cells, bone precursor cells, neuronal stem cells, primordial stem cells, hepatocytes, keratinocytes, umbilical vein endothelial cells
  • pancreatic cells including, but not limited to, Islets of Langerhans cells, insulin secreting cells, alpha-2 cells, beta cells, alpha-1 cells from pigs that lack expression of functional alpha-1,3-GT are provided.
  • Nonviable derivatives include tissues stripped of viable cells by enzymatic or chemical treatment. These tissue derivatives can be further processed via crosslinking or other chemical treatments prior to use in transplantation.
  • the derivatives include extracellular matrix derived from a variety of tissues, including skin, urinary bladder, or organ submucosal tissues.
  • tendons, joints and bones stripped of viable tissue to include heart valves and other nonviable tissues as medical devices are provided.
  • the cells can be administered into a host in a wide variety of ways.
  • Preferred modes of administration are parenteral, intraperitoneal, intravenous, intradermal, epidural, intraspinal, intrasternal, intra-articular, intra-synovial, intrathecal, intra-arterial, intracardiac, intramuscular, intranasal, subcutaneous, intraorbital, intracapsular, topical, transdermal patch, via rectal, vaginal or urethral administration including via suppository, percutaneous, nasal spray, surgical implant, internal surgical paint, infusion pump, or via catheter.
  • the agent and carrier are administered in a slow release formulation such as a direct tissue injection or bolus, implant, microparticle, microsphere, nanoparticle or nanosphere.
  • disorders that can be treated by infusion of the disclosed cells include, but are not limited to, diseases resulting from a failure or dysfunction of normal blood cell production and maturation (i.e., aplastic anemia and hypoproliferative stem cell disorders); neoplastic, malignant diseases in the hematopoietic organs (e.g., leukemia and lymphomas); broad spectrum malignant solid tumors of non-hematopoietic origin; autoimmune conditions; and genetic disorders.
  • Such disorders include, but are not limited to, diseases resulting from a failure or dysfunction of normal blood cell production and maturation, hyperproliferative stem cell disorders, including aplastic anemia, pancytopenia, agranulocytosis, thrombocytopenia, red cell aplasia, Blackfan Diamond syndrome, due to drugs, radiation, or infection, idiopathic; hematopoietic malignancies including acute lymphoblastic (lymphocytic) leukemia, chronic lymphocytic leukemia, acute myelogenous leukemia, chronic myelogenous leukemia, acute malignant myelosclerosis, multiple myeloma, polycythemia vera, agnogenic myelometaplasia, Waldenstrom's macroglobulinemia, Hodgkin's lymphoma, non-Hodgkin's lymphoma; immunosuppression in patients with malignant, solid tumors including malignant melanoma, carcinoma of the stomach, ovarian carcinoma, breast carcinoma,
  • Diseases or pathologies include neurodegenerative diseases, hepatodegenerative diseases, nephrodegenerative disease, spinal cord injury, head trauma or surgery, viral infections that result in tissue, organ, or gland degeneration, and the like.
  • Such neurodegenerative diseases include but are not limited to, AIDS dementia complex; demyelinating diseases, such as multiple sclerosis and acute transverse myelitis; extrapyramidal and cerebellar disorders, such as lesions of the corticospinal system; disorders of the basal ganglia or cerebellar disorders; hyperkinetic movement disorders, such as Huntington's Chorea and senile chorea; drug-induced movement disorders, such as those induced by drugs that block CNS dopamine receptors; hypokinetic movement disorders, such as Parkinson's disease; progressive supranuclear palsy; structural lesions of the cerebellum; spinocerebellar degenerations, such as spinal ataxia, Friedreich's ataxia, cerebellar cortical degenerations, multiple systems
  • the present invention provides viable porcine for purposes of farming applications in which one or both alleles of the CMP-Neu5Ac hydroxylase gene have been inactivated.
  • Inactivation of one or both alleles of the CMP-Neu5Ac hydroxylase gene can reduce the susceptibility of porcine animals to zoonotic diseases and infections in pigs such as, for example, E. coli , pig rotavirus, and pig transmissible gastroenteritis coronavirus, and any other zoonotic or enterotoxigenic organism that utilizes Neu5Gc in a host animal.
  • the reduction in disease susceptibility allows greater economic realization of farming operations due to the ability to harvest more healthy animals, and the reduction of animal death due to enterotoxigenic organisms.
  • porcine GenomeWalker™ libraries were constructed using Universal GenomeWalker™ Library kit (Clontech). Gene-specific and nested primer pairs were designed from the partial cDNA sequence provided by GenBank Accession #A59058.
  • 5′- or 3′-RACE analysis: To identify the 5′ and 3′ ends of porcine CMP-Neu5Ac hydroxylase gene transcripts, 5′- and 3′-RACE procedures were performed using the Marathon cDNA Amplification Kit (Clontech) with poly A+ RNA isolated from adult porcine spleen as a template. First strand cDNA synthesis from 1 μg of poly A+ RNA was accomplished using 20 U of AMV-RT and 1 pmol of the supplied cDNA Synthesis Primer by incubating at 48° C. for 2 hours. Second strand cDNA synthesis involved incubating the entire first strand reaction with a supplied enzyme cocktail composed of RNase H, E. coli DNA polymerase I, and E.
  • porcine GenomeWalker™ libraries were constructed using the Universal GenomeWalker™ Library Kit (Clontech, Palo Alto, Calif.).
  • porcine genomic DNA was separately digested with a single blunt-cutting restriction endonuclease (DraI, EcoRV, PvuII, ScaI, or StuI). After phenol-chloroform extraction, ethanol precipitation, and resuspension of the restricted fragments, a portion of each digested aliquot was used in a separate ligation reaction with the GenomeWalker adapters provided with the kit. This process created five libraries for use in the PCR-based cloning strategy. Primer pairs identified in Table 13 were used in a genome walking strategy.
  • the thermal cycling conditions recommended by the manufacturer were employed in all GenomeWalker-PCR experiments on a Perkin Elmer Gene Amp System 9600 or 9700 thermocycler.
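  • The library construction step above digests genomic DNA separately with each of five blunt-cutting enzymes, giving one fragment pool per enzyme. A minimal in silico sketch of that idea follows. The recognition sequences are the commonly published ones for these enzymes (each cuts in the middle of its 6-bp palindromic site, leaving blunt ends); the function name `blunt_digest` and the short test sequence are hypothetical and stand in for real genomic DNA.

```python
# Commonly published recognition sites; each enzyme cuts at the site midpoint.
BLUNT_CUTTERS = {
    "DraI": "TTTAAA",
    "EcoRV": "GATATC",
    "PvuII": "CAGCTG",
    "ScaI": "AGTACT",
    "StuI": "AGGCCT",
}


def blunt_digest(seq, site):
    """Split seq at the midpoint of every occurrence of a palindromic site."""
    half = len(site) // 2
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + half
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Hypothetical toy sequence containing one EcoRV site and one DraI site.
toy = "AAGATATCCCTTTAAAGG"
libraries = {name: blunt_digest(toy, site) for name, site in BLUNT_CUTTERS.items()}
for name, frags in libraries.items():
    print(name, frags)
```

  • Because the sites are palindromic, searching one strand is sufficient; an enzyme with no site in the input simply returns the uncut sequence.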
  • Subcloning and sequencing of amplified products: PCR products amplified from genomic DNA, GenomeWalker-PCR (Clontech), and 5′-3′-RACE were gel-purified using the Qiagen Gel Extraction Kit (Qiagen, Valencia, Calif.), if necessary, then subcloned into the pCRII vector provided with the Original TA Cloning Kit (Invitrogen, Carlsbad, Calif.). Plasmid DNA minipreps of pCRII-ligated inserts were prepared with the QIAprep Spin Miniprep Kit (Qiagen) as directed.
  • Primer Synthesis: All oligonucleotides used as primers in the various PCR-based methods were synthesized on an ABI 394 DNA Synthesizer (Applied Biosystems, Inc., Foster City, Calif.) using solid phase synthesis and phosphoramidite nucleoside chemistry, unless otherwise stated.
  • CMP-Neu5Ac hydroxylase knock-out target vector: A vector targeting Exon 6 of the porcine CMP-Neu5Ac Hydroxylase gene for knockout can be constructed.
  • a portion of Intron 6 is amplified by PCR for use as a 3′-arm of the targeting vector utilizing primers such as pDH3 (5′-CTCCTGGAAGCTTCTGTCAAGACGAAC-3′) and pDH4 (5′-GCCTGATACACAGTGCTGTGCAATGGT-3′) (see FIG. 5 ).
  • the amplified PCR product of approximately 3.7 kb can be inserted into the pCRII vector after restriction enzyme digestion utilizing EcoRI and ApaI. See FIG. 6 .
  • a portion of Intron 5 can be amplified by PCR for use as a 5′-arm in the targeting vector utilizing primers such as pDH1 (5′-ACCACCCAAGTCTGGAATCTTCTTACACT-3′) and pDH2 (5′-GACTCTCATACAAAAGCTAAGCTGGGTAAG-3′) (see FIG. 5 ).
  • successive PCR amplifications can be performed to introduce an EcoNI restriction site into the 3′ portion of the 5′-arm utilizing primers such as pDH1 in conjunction with primers such as pDH2a (5′-GACTCTCATACAAAACCTAAGCTGGGTAAG-3′), pDH2b (5′-GACTCTCATACAAAACCTAGGCTGGGTAAG-3′), and pDH2c (5′-GACTCTCATACAAAACCTAGGCTAGGTAAG-3′), respectively (see FIG. 5 ).
  • the amplified PCR product of approximately 2.6 kb containing the engineered EcoNI site can be restriction enzyme digested using ApaI and EcoNI, and inserted into the pCRII vector containing the previously inserted 3′-arm (see FIG. 7), generating a targeting vector (pDHΔex6) containing an approximate 6.3 kb porcine CMP-Neu5Ac hydroxylase targeting sequence (see FIG. 8).
  • pDHΔex6 can be further modified by an in-frame insertion of an enhanced green fluorescent protein sequence at the terminal 3′ end of Exon 6 of the porcine CMP-Neu5Ac hydroxylase gene.
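  • The successive amplifications with pDH2a, pDH2b, and pDH2c can be read as three single-base changes relative to pDH2 that together create an EcoNI site. The sketch below checks this directly from the primer sequences given above. It assumes the commonly published EcoNI recognition sequence CCTNNNNNAGG; the helper name `differences` is illustrative. It also checks that the stated arm sizes (about 3.7 kb and about 2.6 kb) add up to the stated targeting sequence of about 6.3 kb.

```python
import re

# Reverse primers for the 5'-arm, copied from the text above.
PRIMERS = {
    "pDH2":  "GACTCTCATACAAAAGCTAAGCTGGGTAAG",
    "pDH2a": "GACTCTCATACAAAACCTAAGCTGGGTAAG",
    "pDH2b": "GACTCTCATACAAAACCTAGGCTGGGTAAG",
    "pDH2c": "GACTCTCATACAAAACCTAGGCTAGGTAAG",
}

# Assumed EcoNI recognition sequence: CCT, any five bases, AGG.
ECONI = re.compile(r"CCT[ACGT]{5}AGG")


def differences(a, b):
    """List (1-based position, base in a, base in b) wherever a and b differ."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


names = list(PRIMERS)
# One entry per successive primer pair: pDH2->pDH2a, pDH2a->pDH2b, pDH2b->pDH2c.
steps = [differences(PRIMERS[p], PRIMERS[q]) for p, q in zip(names, names[1:])]
# 1-based start positions of any EcoNI site in each primer.
sites = {n: [m.start() + 1 for m in ECONI.finditer(s)] for n, s in PRIMERS.items()}

arm_3_kb, arm_5_kb = 3.7, 2.6
total_kb = round(arm_3_kb + arm_5_kb, 1)

print(steps)
print(sites)
print(total_kb)
```

  • Under these assumptions, each round changes exactly one base, and only pDH2c contains a complete EcoNI site, which is consistent with the site being introduced stepwise.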
  • a portion of Intron 5 and a portion of Exon 6 of the porcine CMP-Neu5Ac hydroxylase gene can be amplified by PCR utilizing primers such as pDH5 (5′-CCTTATACTGGCCCCAATTGGATCTTAC-3′) and pDH6 (5′-CCTTATACTGGCCCCAATTGGATCTTAC-3′) (see FIG.
  • pIRES-EGFP, a vector containing the EGFP and a poly A tail, following restriction enzyme digestion with MunI and EcoRV.
  • PCR amplification can be performed on the pIRES-EGFP vector containing the insertion utilizing primers such as pDH7 (5′-CTTACCTAGCCTAGGTTTTGTATGAGAGTC-3′) and pDH8 (5′-GACAAACCACAATTGGAATGCACTCGAG-3′) (see FIG. 9).
  • the PCR amplified product can be restriction enzyme digested using EcoNI and MunI and inserted into the previously constructed pDHΔex6 targeting vector (see FIG. 10).
  • the resultant targeting vector (pDHΔex6-EGFP) is illustrated in FIG. 11.
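  • The EcoNI/MunI cloning step relies on the amplified EGFP cassette carrying both restriction sites in its primers. As a quick consistency check on the sequences listed above, the sketch below shows that pDH7 is the exact reverse complement of pDH2c (so it carries the same engineered EcoNI site) and that pDH5 and pDH8 each contain a MunI site. The recognition sequences used (EcoNI: CCTNNNNNAGG, MunI: CAATTG) are the commonly published ones and are an assumption here; the function name `reverse_complement` is illustrative.

```python
import re

PDH2C = "GACTCTCATACAAAACCTAGGCTAGGTAAG"
PDH5 = "CCTTATACTGGCCCCAATTGGATCTTAC"
PDH7 = "CTTACCTAGCCTAGGTTTTGTATGAGAGTC"
PDH8 = "GACAAACCACAATTGGAATGCACTCGAG"

ECONI = re.compile(r"CCT[ACGT]{5}AGG")  # assumed EcoNI recognition sequence
MUNI = "CAATTG"                         # assumed MunI recognition sequence

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


same_junction = reverse_complement(PDH2C) == PDH7
pdh7_has_econi = ECONI.search(PDH7) is not None
muni_in_primers = {name: MUNI in seq for name, seq in [("pDH5", PDH5), ("pDH8", PDH8)]}

print(same_junction, pdh7_has_econi, muni_in_primers)
```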
  • Fetal fibroblast cells are isolated from 10 fetuses of the same pregnancy at day 33 of gestation. After removing the head and viscera, fetuses are washed with Hanks' balanced salt solution (HBSS; Gibco-BRL, Rockville, Md.), placed in 20 ml of HBSS, and diced with small surgical scissors. The tissue is pelleted and resuspended in 50-ml tubes with 40 ml of DMEM and 100 U/ml collagenase (Gibco-BRL) per fetus. Tubes are incubated for 40 min in a shaking water bath at 37° C.
  • the digested tissue is allowed to settle for 3-4 min and the cell-rich supernatant is transferred to a new 50-ml tube and pelleted.
  • the cells are then resuspended in 40 ml of DMEM containing 10% fetal calf serum (FCS), 1X nonessential amino acids, 1 mM sodium pyruvate and 2 ng/ml bFGF, and seeded into 10 cm dishes.
  • 10 μg of linearized pDHΔex6-EGFP vector is introduced into 2 million cells using Lipofectamine 2000 (Invitrogen, Carlsbad, Calif.) following manufacturer's guidelines.
  • the transfected cells are seeded into 48-well plates at a density of 2,000 cells per well and grown to confluence. Following confluence, cells are sorted via Fluorescence Activated Cell Sorting (FACS) (FACSCalibur, Becton Dickinson, San Jose, Calif.), wherein only cells having undergone homologous recombination and expressing the EGFP are selected (see, for example, FIG. 13).
  • Selected cells are then reseeded, and grown to confluency. Once confluency is reached, several small aliquots are frozen back for future use, and the remainder are utilized for PCR and Southern Blot verification of homologous recombination.
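  • The transfection and seeding quantities given above imply a certain plate count and DNA load per cell. The following is a rough arithmetic sketch only. It assumes, hypothetically, that all 2 million transfected cells are seeded and that none are lost in handling; the function name `seeding_plan` is illustrative.

```python
import math


def seeding_plan(total_cells, cells_per_well, wells_per_plate, dna_ug):
    """Return (wells needed, plates needed, picograms of DNA per cell)."""
    wells = math.ceil(total_cells / cells_per_well)
    plates = math.ceil(wells / wells_per_plate)
    # 1 microgram = 1e6 picograms
    pg_per_cell = dna_ug * 1e6 / total_cells
    return wells, plates, pg_per_cell


# Values from the text: 2 million cells, 2,000 cells per well, 48-well plates, 10 ug DNA.
wells, plates, pg_per_cell = seeding_plan(2_000_000, 2_000, 48, 10)
print(wells, plates, pg_per_cell)
```

  • In practice fewer wells may be seeded, so the plate count here is an upper bound under the stated assumption.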
  • the putative targeted clones can be screened by PCR across the Exon 6/EGFP insert utilizing a primer complementary to the EGFP sequence and a primer complementary to a sequence outside the vector as the antisense primer.
  • the PCR products can be analyzed by Southern Blotting using an EGFP probe to identify the positive clones by the presence of the expected band from the targeted allele.
  • Donor cells are genetically manipulated to produce cells heterozygous for porcine CMP-Neu5Ac hydroxylase as described generally above.
  • Nuclear transfer can be performed by methods that are well known in the art (see, e.g., Dai et al., Nature Biotechnology 20: 251-255, 2002; and Polejaeva et al., Nature 407:86-90, 2000), using EGFP selected porcine fibroblasts as nuclear donors that are produced as described in detail hereinabove.
  • Oocytes can be isolated from synchronized super ovulated sexually mature Large-White X Landrace outcross gilts as described, for example, in I. Polejaeva et al. Nature 407: 505 (2000). Donor cells are synchronized in presumptive G0/G1 by serum starvation (0.5%) between 24 to 120 hours. Oocyte enucleation, nuclear transfer, electrofusion, and electroactivation can be performed essentially as described in, for example, A. C. Boquest et al., Biol. Reproduction 68: 1283 (2002). Reconstructed embryos can be cultured overnight and can be transferred to the oviducts of asynchronous (−1 day) recipients. Pregnancies can be confirmed and monitored by real-time ultrasound.

Abstract

The present invention provides porcine CMP-N-Acetylneuraminic-Acid Hydroxylase (CMP-Neu5Ac hydroxylase) protein, cDNA, and genomic DNA regulatory sequences. Furthermore, the present invention includes porcine animals, tissues, and organs, as well as cells and cell lines derived from such animals, tissues, and organs, which lack expression of functional CMP-Neu5Ac hydroxylase. Such animals, tissues, organs, and cells can be used in research and in medical therapy, including in xenotransplantation, and in industrial livestock farming operations.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application claims priority to U.S. Provisional patent application Ser. No. 60/476,396, filed Jun. 6, 2003.
  • FIELD OF THE INVENTION
  • The present invention provides porcine CMP-N-Acetylneuraminic-Acid Hydroxylase (CMP-Neu5Ac hydroxylase) protein, cDNA, and genomic DNA regulatory sequences. Furthermore, the present invention includes porcine animals, tissues, and organs, as well as cells and cell lines derived from such animals, tissues, and organs, which lack expression of functional CMP-Neu5Ac hydroxylase. Such animals, tissues, organs, and cells can be used in research and in medical therapy, including in xenotransplantation, and in industrial livestock farming operations. In addition, methods are provided to prepare organs, tissues, and cells lacking the porcine CMP-Neu5Ac hydroxylase gene for use in xenotransplantation.
  • BACKGROUND OF THE INVENTION
  • The unavailability of acceptable human donor organs, the low rate of long term success due to host versus graft rejection, and the serious risks of infection and cancer are the main challenges now facing the field of tissue and organ transplantation. Because the demand for acceptable organs exceeds the supply, many people die each year while waiting for organs to become available. To help meet this demand, research has been focused on developing alternatives to allogenic transplantation. Dialysis is available to patients suffering from kidney failure, artificial heart models have been tested, and other mechanical systems have been developed to assist or replace failing organs. Such approaches, however, are quite expensive. The need for frequent and periodic access to dialysis machines greatly limits the freedom and quality of life of patients undergoing such therapy.
  • Xenograft transplantation represents a potentially attractive alternative to artificial organs for human transplantation. The potential pool of nonhuman organs is virtually limitless. Pigs are considered the most likely source of xenograft organs. The supply of pigs is plentiful, breeding programs are well established, and their organ size and physiology are compatible with humans. Therefore, xenotransplantation with pig organs offers a potential solution to the shortage of organs available for clinical transplantation.
  • Host rejection of such cross-species tissue remains a major concern in this area. The immunological barriers to xenotransplantation have been, and remain, formidable. The first immunological hurdle is “hyperacute rejection” (HAR). HAR is defined by the ubiquitous presence of high titers of pre-formed natural antibodies binding to the foreign tissue. The binding of these natural antibodies to target epitopes on the donor organ endothelium is believed to be the initiating event in HAR. This binding, within minutes of perfusion of the donor organ with the recipient blood, is followed by complement activation, platelet and fibrin deposition, and ultimately by interstitial edema and hemorrhage in the donor organ, all of which cause failure of the organ in the recipient (Strahan, et al. (1996) Frontiers in Bioscience 1, pp. 34-41).
  • Some noted xenotransplants of organs from apes or old-world monkeys (e.g., baboons) into humans have been tolerated for months without rejection. However, such attempts have ultimately failed due to a number of immunological factors. Even with heavy immunosuppression to suppress HAR, a low-grade innate immune response, attributable in part to failure of complement regulatory proteins (CRPs) within the graft tissue to control activation of heterologous complement on graft endothelium, ultimately leads to destruction of the transplanted organs (Starzl, Immunol. Rev., 141, 213-44 (1994)). In an effort to develop a pool of acceptable organs for xenotransplantation into humans, researchers have engineered animals that produce human CRPs, an approach which has been demonstrated to delay, but not eliminate, xenograft destruction in primates (McCurry, et al., Nat. Med., 1, 423-27 (1995); Bach et al., Immunol. Today, 17, 379-84 (1996)).
  • In addition to complement-mediated attack, human rejection of discordant xenografts appears to be mediated by a common antigen: the galactose-α(1,3)-galactose (gal-α-gal) terminal residue of many glycoproteins and glycolipids (Galili et al., Proc. Nat. Acad. Sci., (USA), 84, 1369-73 (1987); Cooper, et al., Immunol. Rev., 141, 31-58 (1994); Galili, et al., Springer Sem. Immunopathol, 15, 155-171 (1993); Sandrin, et al., Transplant Rev., 8, 134 (1994)). This antigen is chemically related to the human A, B, and O blood antigens, and it is present on many parasites and infectious agents, such as bacteria and viruses. Most mammalian tissue also contains this antigen, with the notable exception of old world monkeys, apes and humans (see Joziasse, et al., J. Biol. Chem., 264, 14290-97 (1989)). Individuals without such carbohydrate epitopes produce abundant naturally occurring antibodies (IgM as well as IgG) specific to the epitopes. Many humans show significant levels of circulating IgG with specificity for gal-α-gal carbohydrate determinants (Galili, et al., J. Exp. Med., 162, 573-82 (1985); Galili, et al., Proc. Nat. Acad. Sci. (USA), 84, 1369-73 (1987)). The α-galactosyltransferase (α-GT) enzyme catalyzes the formation of gal-α-gal moieties. Research has focused on the modulation or elimination of this enzyme to reduce or eliminate the expression of gal-α-gal moieties on the cell surface of xenotissue.
  • The elimination of the α-galactosyltransferase gene from porcine has long been considered one of the most significant hurdles to accomplishing xenotransplantation from pigs to humans. Two alleles in the pig genome encode the α-GT gene. Single allelic knockouts of the α-GT gene in pigs were reported in 2002 (Dai, et al. Nature Biotechnol., 20:251 (2002); Lai, et al., Science, 295:1089 (2002)).
  • Recently, double allelic knockouts of the α-GT gene have been accomplished (Phelps, et al., Science, 299: pp. 411-414 (2003)). WO 2004/028243 to Revivicor Inc. describes porcine animal, tissue, organ, cells and cell lines, which lack all expression of functional α1,3 galactosyltransferase (α1,3-GT). Accordingly, the animals, tissues, organs and cells lacking functional expression of α1,3-GT can be used in xenotransplantation and for other medical purposes.
  • PCT patent application WO 2004/016742 to Immerge Biotherapeutics, Inc. describes α(1,3)-galactosyltransferase null cells, methods of selecting GGTA-1 null cells, α(1,3)-galactosyltransferase null swine produced therefrom (referred to as a viable GGTA-1 null swine), methods for making such swine, and methods of using cells, tissues and organs of such a null swine for xenotransplantation.
  • One of the earliest known xenoantigens other than gal-α-gal is an epitope that Hanganutziu-Deicher antibodies recognize, and which has long been associated with serum disease. The epitope has been identified as N-glycolylneuraminic acid (Neu5Gc), a member of the sialic acid family of carbohydrates. Among carbohydrates, sialic acids are abundant and ubiquitous. Sialic acid is a generic designation used for N-acylneuraminic acids (Neu5Acyl) and their derivatives. N-Acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc) are two of the most abundant derivatives of sialic acids.
  • The Neu5Gc epitope is located in the terminal position in the glycan chains of glycoconjugates. Due to this exposed position, it plays an important role in cellular recognition, e.g. in the case of inflammatory reactions, maturation of immune cells, differentiation processes, hormone-, pathogen- and toxin binding (Varki, A., Glycobiology, 2, pp. 25-40 (1992)).
  • Glycoconjugates containing Neu5Gc are immunogenic in humans. In healthy humans, Neu5Gc is not detectable, although Neu5Gc is abundant in most mammals. The lack of Neu5Gc in man is due to an exon deletion in the human gene that prevents the formation of functional enzyme (Chou, H. H., et al. Proc. Natl. Acad. Sci. (USA), 95, pp. 11751-11756 (1998); Irie, A., et al. J. Biol. Chem., 273, pp. 15866-15871 (1998)). Thus, Neu5Gc-containing glycoconjugates act as antigens and can induce the formation of antibodies. Historically, the antibodies have been referred to as Hanganutziu-Deicher (HD) antigens and antibodies (Hanganutziu, M., CR Soc. Biol. (Paris), 91, p. 1457 (1924); Deicher, H., Z. Hyg., 106, p. 561 (1926)). Hanganutziu-Deicher antigens are detectable in many human tumors (colon carcinoma, retinoblastoma, melanoma and carcinoma of the breast) as well as in chicken tumor tissues (Higashi, H., et al. Cancer Res., 45, pp. 3796-3802 (1985)). Although the amount of antigen in tumors is very small (usually less than 1% of the total amount of sialic acid, often in the range of from 0.01 to 0.1%), it is capable of inducing the formation of Hanganutziu-Deicher antibodies (Higashihara, T., et al., Int Arch Allergy Appl Immunol., 95, pp. 231-235 (1991)). This immunological reaction is a potential barrier to xenotransplantation of Neu5Gc-containing pig organs to humans.
  • The Neu5Gc epitope is formed by the addition of a hydroxyl group to the N-acetyl moiety of Neu5Ac. The enzyme that catalyzes the hydroxylation is CMP-Neu5Ac hydroxylase. Thus, the expression of the CMP-Neu5Ac hydroxylase gene determines the presence of the Neu5Gc epitope on cell surfaces. Purification studies of CMP-Neu5Ac hydroxylase in mammals have shown that it is a soluble, cytosolic oxygenase that is dependent on cytochrome b5 and cytochrome b5 reductase (Kawano, T., et al., J. Biol. Chem., 269, pp. 9024-9029 (1994); Schneckenburger, P., et al., Glycoconj. J., 11, pp. 194-203 (1994); Schlenzka, W., et al., Glycobiology, 4, pp. 675-683 (1994); Kozutsumi, Y., et al., J. Biochem. (Tokyo), 108, pp. 704-706 (1990); and, Shaw, L., et al. Eur. J. Biochem., 219, pp. 1001-1011 (1994)).
  • Another important feature of Neu5Gc is that it acts as an adhesion molecule for pathogens, allowing for entry into the cell (Kelm, S. and Schauer, R., Int. Rev. Cytol, 179, pp. 137-240 (1997)). This causes disease and economic losses in certain livestock species. Specifically, enterotoxigenic Escherichia coli with K99 fimbriae infect newborn piglets by binding to Neu5Gc in gangliosides such as Neu5Gcα2→3Galβ1→4Glcβ1→1′ ceramide [GM3(Neu5Gc)], N-glycolylsialoparagloboside and GM2(Neu5Gc) attached to intestinal absorptive and mucus secreting cells, causing a potentially lethal diarrhea (Malykh, Y., et. al., Biochem. J., 370, pp. 601-607 (2003); Kyogashima, M., et al., (1993); Teneberg, S., et al., FEBS Letters, 263, pp. 10-14 (1990); Isobe, T., et al., Anal. Biochem., 236, pp. 35-40 (1996); Lindahl, M. and Carlstedt, I., J. Gen. Microbiol., 136, pp. 1609-1614 (1990); King, T. P., et al., Proceedings of the 6th International Symposium on Digestive Physiology in Pigs, pp. 290-293, (1994)). Pig rotavirus infects pig newborns causing diarrhea by binding to GM3(Neu5Gc). Pig transmissible gastroenteritis coronavirus infects pigs via entry into glycoconjugates containing α2,3-bound Neu5Gc (Schultz, B., et al., J. Virol., 70, pp. 5634-5637 (1996)).
  • CMP-Neu5Ac hydroxylase has been isolated from mouse liver and pig submandibular glands to homogeneity and characterized (Kawano, T., et al., J. Biol. Chem., 269, pp. 9024-9029 (1994); Schneckenburger, P., et al., Glycoconj. J., 11, pp. 194-203 (1994); and, Schlenzka, W., et al., Glycobiology, 4, pp. 675-683 (1994)).
  • Schlenzka, et al. (Glycobiology, Vol. 4, pp. 675-683 (1994)) purified the enzyme from pig submandibular glands using ion exchange chromatography, chromatography with immobilized triazine dyes, hydrophobic interaction chromatography and gel filtration. Schneckenburger et al. (Glycoconj. J., Vol. 11, pp. 194-203 (1994)) isolated the CMP-Neu5Ac hydroxylase from mouse liver. Both the CMP-Neu5Ac hydroxylase from pig submandibular glands and the one from mouse liver are soluble monomers having a molecular weight of 65 kDa. Their catalytic interactions with CMP-Neu5Ac and cytochrome b5 are very similar to one another. The activity of these enzymes seems to be dependent on an iron-containing prosthetic group.
  • JP-A 06 113838 describes the protein and DNA sequences of murine CMP-Neu5Ac hydroxylase, as well as a monoclonal antibody that specifically binds to the hydroxylase.
  • PCT Publication No. WO 97/03200A1 to Boehringer Mannheim GmbH discloses a partial cDNA for the porcine CMP-Neu5Ac hydroxylase. This application discloses a cDNA sequence beginning in the middle of Exon 8 of the CMP-Neu5Ac hydroxylase gene (further disclosed as GenBank accession number Y15010).
  • Martensen, L., et al. (Eur. J. Biochem., Vol. 268, pp. 5157-5166 (2001)) discloses a full length amino acid sequence of porcine CMP-Neu5Ac hydroxylase.
  • PCT Publication No. WO 02/088351 to RBC Biotechnology discloses a partial cDNA and genomic sequence (exons 7-11 as well as partial genomic sequence surrounding each exon) of porcine CMP-NeuAc hydroxylase. In addition, methods are provided to generate porcine cells and animals lacking the CMP-NeuAc hydroxylase epitope, optionally, in combination with other genetic modifications, such as inactivation of the alpha-1,3-galactosyltransferase gene and/or insertion of complement proteins.
  • It is an object of the present invention to provide genomic and regulatory sequences of the porcine CMP-Neu5Ac hydroxylase gene.
  • It is an object of the present invention to provide the full length cDNA, as well as novel variants of the CMP-Neu5Ac hydroxylase gene.
  • It is another object of the invention to provide novel nucleic acid and amino acid sequences that encode the CMP-Neu5Ac hydroxylase gene.
  • It is yet a further object of the present invention to provide cells, tissues and/or organs deficient in the CMP-Neu5Ac hydroxylase gene.
  • It is another object of the present invention to generate animals, particularly pigs, lacking a functional CMP-Neu5Ac hydroxylase gene.
  • It is yet a further object of the present invention to provide cells, tissues and/or organs deficient in the CMP-Neu5Ac hydroxylase gene for use in xenotransplantation of non-human organs to human recipients in need thereof.
  • SUMMARY OF THE INVENTION
  • The full length cDNA sequence, peptide sequence, and genomic organization of the porcine CMP-Neu5Ac hydroxylase gene have been determined. To date, only partial cDNA and genomic sequences have been identified. The present invention provides novel porcine CMP-Neu5Ac hydroxylase protein, cDNA, cDNA variants, and genomic DNA sequence. Furthermore, the present invention includes porcine animals, tissues, and organs, as well as cells and cell lines derived from such animals, tissues, and organs, which lack expression of functional CMP-Neu5Ac hydroxylase. Such animals, tissues, organs, and cells can be used in research and in medical therapy, including xenotransplantation. In addition, methods are provided to prepare organs, tissues, and cells lacking the porcine CMP-Neu5Ac hydroxylase gene for use in xenotransplantation.
  • One aspect of the present invention provides the full length cDNA of porcine CMP-Neu5Ac hydroxylase. The full length cDNA is shown in Table 1 (SEQ ID No 1) and the full length peptide sequence is provided in Table 2 (SEQ ID No 2). The start codon for the full-length cDNA is located in the 3′ portion of Exon 4, and the stop codon is found in the 3′ portion of Exon 17. Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 1 or 2 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25 or 30 nucleotide or amino acid sequences of SEQ ID Nos 1 or 2 are also provided. Further provided is any nucleotide sequence that hybridizes, optionally under stringent conditions, to SEQ ID No 1, as well as, nucleotides homologous thereto.
  • In one embodiment, nucleic acid and peptide sequences encoding three novel variants of CMP-Neu5Ac hydroxylase are provided (Tables 3-8, FIG. 2). SEQ ID No 3 represents the cDNA of a variant of the gene, variant-1, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15a, 16, 17, and 18. SEQ ID No 5 represents the cDNA of a variant of the gene, variant-2, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 12a. SEQ ID No 7 represents the cDNA of a variant of the gene, variant-3, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11 and 11a. SEQ ID Nos 4, 6 and 8 represent the amino acid sequences of variant-1, variant-2 and variant-3, respectively. Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 3-8 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25, or 30 nucleotide or amino acid sequence of SEQ ID Nos 3-8 are also provided. Further provided is any nucleotide sequence that hybridizes, optionally under stringent conditions, to SEQ ID Nos 3, 5 and 7, as well as, nucleotides homologous thereto.
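The exon composition of the three variants can be compared against the full-length cDNA with a short script. The following is a minimal sketch in Python, assuming the full-length cDNA comprises Exons 1 and 4-18 (as labeled in Table 1); the names `FULL_LENGTH`, `VARIANTS` and `exon_differences` are illustrative only and are not part of the disclosure.

```python
# Exon lists transcribed from the text; the full-length list assumes
# Exons 1 and 4-18, as labeled in Table 1.
FULL_LENGTH = ["1"] + [str(n) for n in range(4, 19)]

VARIANTS = {
    "variant-1": ["1"] + [str(n) for n in range(4, 14)] + ["15a", "16", "17", "18"],
    "variant-2": ["1"] + [str(n) for n in range(4, 13)] + ["12a"],
    "variant-3": ["1"] + [str(n) for n in range(4, 12)] + ["11a"],
}


def exon_differences(variant_exons, reference_exons=FULL_LENGTH):
    """Return (exons missing from the variant, exons unique to the variant)."""
    missing = [e for e in reference_exons if e not in variant_exons]
    extra = [e for e in variant_exons if e not in reference_exons]
    return missing, extra


if __name__ == "__main__":
    for name, exons in VARIANTS.items():
        missing, extra = exon_differences(exons)
        print(f"{name}: lacks exons {missing}, adds exons {extra}")
```

Under these assumptions, variant-1 differs from the full-length cDNA only by carrying Exon 15a where the full-length cDNA carries Exons 14 and 15, while variant-2 and variant-3 terminate in Exons 12a and 11a, respectively.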
  • A further embodiment provides nucleic acid sequences representing genomic DNA sequences of the CMP-Neu5Ac hydroxylase gene (Table 9, FIG. 1). SEQ ID Nos 10-28 represent Exons 1, 4-11, 11a, 12, 12a, 13-15, 15a, 16-18, respectively, and SEQ ID Nos 29-45 represent Introns 1a, 1b, 4-15, 15a, 16, and 17, respectively. SEQ ID No. 9 represents the 5′ untranslated region of the CMP-Neu5Ac hydroxylase gene. SEQ ID No. 46 (Table 10) represents the genomic DNA and regulatory sequence of CMP-Neu5Ac hydroxylase.
  • In another embodiment, the genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 47. SEQ ID No. 47 represents the 5′ contiguous genomic sequence containing 5′ UTR, Exon 1 and a portion of intronic sequence located 3′ of Exon 1 (Table 11).
  • In another embodiment, the genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 48. SEQ ID NO. 48 represents a contiguous genomic sequence containing intronic sequence located 5′ to Exon 4, Exon 4, Intron 4, Exon 5, Intron 5, Exon 6, Intron 6, Exon 7, Intron 7, Exon 8, Intron 8, Exon 9, Intron 9, Exon 10, Intron 10, Exon 11, Intron 11, Exon 12, Intron 12, Exon 13, Intron 13, Exon 14, Intron 14, Exon 15, Intron 15, Exon 16, Intron 16, Exon 17, Intron 17, and Exon 18 (Table 12). In addition, nucleotide sequences that contain at least 2775, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500 or 10,000 contiguous nucleotides of SEQ ID NO. 48 are provided, as well as nucleotide sequences at least 80, 85, 90, 95, 98, or 99% homologous to SEQ ID NO. 48.
  • In another embodiment, the genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 49. SEQ ID NO. 49 represents contiguous genomic sequences containing Intronic sequence 5′ to Exon 4, Exon 4, Intron 4, Exon 5, Intron 5, Exon 6, Intron 6, Exon 7, Intron 7 and Exon 8. Further, nucleotide sequences that contain at least 1750, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, 4900, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, or 20000 contiguous nucleotides of SEQ ID NO. 49 are provided, as well as nucleotide sequences at least 80, 85, 90, 95, 98, or 99% homologous to SEQ ID NO. 49.
  • In another embodiment, the genomic sequence of the porcine CMP-Neu5Ac hydroxylase gene is represented by SEQ ID No. 50. SEQ ID NO. 50 represents contiguous genomic sequences containing Exon 12, Intron 12, Exon 13, Intron 13, Exon 14, Intron 14, Exon 15, Intron 15, Exon 16, Intron 16, Exon 17, Intron 17, and Exon 18. Nucleotide sequences that contain at least 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000 or 20,000 contiguous nucleotides of SEQ ID NO. 50 are provided, as well as nucleotide sequences at least 80, 85, 90, 95, 98, or 99% homologous to SEQ ID NO. 50.
  • In further embodiments, nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 9-45, 46, 47, 48, 49 and 50 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25, 30, 50, 100, 150, 200, 300, 400, 500 or 1000 contiguous nucleotide or amino acid sequences of SEQ ID Nos 9-45, 46, 47, and 48 are also provided. Further provided is any nucleotide sequence that hybridizes, optionally under stringent conditions, to SEQ ID Nos 9-45, 46, 47, 48, 49 and 50, as well as, nucleotides homologous thereto.
  • Another aspect of the present invention provides nucleic acid constructs that contain cDNA or variants thereof encoding CMP-Neu5Ac hydroxylase. These cDNA sequences can be derived from Seq ID Nos. 1-8, or any fragment thereof. Constructs can contain one, or more than one, internal ribosome entry site (IRES). The construct can also contain a promoter operably linked to the nucleic acid sequence encoding CMP-Neu5Ac hydroxylase, or, alternatively, the construct can be promoterless. In another embodiment, nucleic acid constructs are provided that contain nucleic acid sequences that permit random or targeted insertion into a host genome. In addition to the nucleic acid sequences the expression vector can contain selectable marker sequences, such as, for example, enhanced Green Fluorescent Protein (eGFP) gene sequences, initiation and/or enhancer sequences, poly A-tail sequences, and/or nucleic acid sequences that provide for the expression of the construct in prokaryotic and/or eukaryotic host cells.
  • In another embodiment, nucleic acid targeting vector constructs are also provided wherein homologous recombination in somatic cells can be achieved. These targeting vectors can be transformed into mammalian cells to target the CMP-Neu5Ac hydroxylase gene via homologous recombination. In one embodiment, the targeting vectors can contain a 3′ recombination arm and a 5′ recombination arm that are homologous to the genomic sequence of a CMP-Neu5Ac hydroxylase. The homologous DNA sequence can include at least 15 bp, 20 bp, 25 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 2 kbp, 4 kbp, 5 kbp, 10 kbp, 15 kbp, 20 kbp, or 50 kbp of sequence homologous to the CMP-Neu5Ac hydroxylase sequence. In another embodiment, the homologous DNA sequence can include one or more intron and/or exon sequences. In a specific embodiment, the DNA sequence can be homologous to Intron 5 and Intron 6 of the CMP-Neu5Ac hydroxylase gene (see, for example, FIGS. 6-8). In another specific embodiment, the DNA sequence can be homologous to Intron 5, a 55 bp portion of Exon 6, and Intron 6 of the CMP-Neu5Ac hydroxylase gene, and contain enhanced Green Fluorescent Protein sequence in an in-frame orientation 3′ to the 55 bp portion of Exon 6 (see, for example, FIGS. 10 and 11).
  • Another embodiment of the present invention provides oligonucleotide primers capable of hybridizing to porcine CMP-Neu5Ac hydroxylase cDNA or genomic sequence, such as Seq ID Nos. 1, 3, 5, 7, 9-45, 46, 47 or 48. In a preferred embodiment, the primers hybridize under stringent conditions to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47 or 48. Another embodiment provides oligonucleotide probes capable of hybridizing to porcine CMP-Neu5Ac hydroxylase nucleic acid sequences, such as SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, or 48. The polynucleotide primers or probes can have at least 14 bases, 20 bases, preferably 30 bases, or 50 bases which hybridize to a polynucleotide of the present invention. The probe or primer can be at least 14 nucleotides in length, and in a preferred embodiment, is at least 15, 20, 25, 28, or 30 nucleotides in length.
  • In another aspect of the present invention, mammalian cells lacking at least one allele of the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein are provided. These cells can be obtained as a result of homologous recombination. Particularly, by inactivating at least one allele of the CMP-NeuAc hydroxylase gene, cells can be produced which have reduced capability for expression of functional Hanganutziu-Deicher antigens.
  • In embodiments of the present invention, alleles of the CMP-Neu5Ac hydroxylase gene are rendered inactive according to the process, sequences and/or constructs described herein, such that the resultant CMP-Neu5Ac hydroxylase enzyme can no longer generate Hanganutziu-Deicher antigens. In one embodiment, the CMP-Neu5Ac hydroxylase gene can be transcribed into RNA, but not translated into protein. In another embodiment, the CMP-Neu5Ac hydroxylase gene can be transcribed in an inactive truncated form. Such a truncated RNA may either not be translated or can be translated into a nonfunctional protein. In an alternative embodiment, the CMP-Neu5Ac hydroxylase gene can be inactivated in such a way that no transcription of the gene occurs. In a further embodiment, the CMP-Neu5Ac hydroxylase gene can be transcribed and then translated into a nonfunctional protein.
  • In a further aspect of the present invention, porcine animals are provided in which at least one allele of the CMP-Neu5Ac hydroxylase gene is inactivated via a genetic targeting event produced according to the process, sequences and/or constructs described herein. In another aspect of the present invention, porcine animals are provided in which both alleles of the CMP-Neu5Ac hydroxylase gene are inactivated via a genetic targeting event. The gene can be targeted via homologous recombination. In other embodiments, the gene can be disrupted, i.e. a portion of the genetic code can be altered, thereby affecting transcription and/or translation of that segment of the gene. For example, disruption of a gene can occur through substitution, deletion (“knock-out”) or insertion (“knock-in”) techniques. Additional genes for a desired protein or regulatory sequence that modulate transcription of an existing sequence can be inserted.
  • In another aspect of the present invention, porcine cells lacking one allele, optionally both alleles, of the porcine CMP-Neu5Ac hydroxylase gene can be used as donor cells for nuclear transfer into enucleated oocytes to produce cloned, transgenic animals. Alternatively, porcine CMP-Neu5Ac hydroxylase knockouts can be created in embryonic stem cells, which are then used to produce offspring. Offspring lacking a single allele of the functional CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein can be bred to further produce offspring lacking functionality in both alleles through Mendelian-type inheritance. Cells, tissues and/or organs can be harvested from these animals for use in xenotransplantation strategies. The elimination of the Hanganutziu-Deicher antigens can reduce the immune rejection of the transplanted cell, tissue or organ due to the Neu5Gc epitope.
  • Alternatively, animals lacking at least one allele of the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein can be less susceptible or resistant to enterotoxigenic infection and disease such as, for example, E. coli infection, rotavirus infection, and gastroenteritis coronavirus infection. Such animals can be used, for example, in commercial farming.
  • In one aspect of the present invention, a pig can be prepared by a method in accordance with any aspect of the present invention. Genetically modified pigs can be used as a source of tissue and/or organs for transplantation therapy. A pig embryo prepared in this manner or a cell line developed therefrom can also be used in cell-transplantation therapy. Accordingly, there is provided in a further aspect of the invention a method of therapy comprising the administration of genetically modified cells lacking porcine CMP-Neu5Ac hydroxylase to a patient, wherein the cells have been prepared from an embryo or animal lacking CMP-Neu5Ac hydroxylase. This aspect of the invention extends to the use of such cells in medicine, e.g. cell-transplantation therapy, and also to the use of cells derived from such embryos in the preparation of a cell or tissue graft for transplantation. The cells can be organized into tissues or organs, for example, heart, lung, liver, kidney, pancreas, corneas, nervous (e.g. brain, central nervous system, spinal cord), skin, or the cells can be islet cells, blood cells (e.g. haemocytes, i.e. red blood cells, leucocytes) or haematopoietic stem cells or other stem cells (e.g. bone marrow).
  • In another aspect of the present invention, CMP-Neu5Ac hydroxylase-deficient pigs also lack genes encoding other xenoantigens, such as, for example, porcine iGb3 synthase (see, for example, U.S. Patent Application 60/517,524), and/or porcine Forssman synthase (see, for example, U.S. Patent Application 60/568,922). In another embodiment, porcine cells are provided that lack the α1,3 galactosyltransferase gene and the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein. In another embodiment, porcine α1,3 galactosyltransferase gene knockout cells are further modified to knock out the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein. In addition, CMP-Neu5Ac hydroxylase deficient pigs produced according to the process, sequences and/or constructs described herein, optionally lacking one or more additional genes associated with an adverse immune response, can be modified to express complement inhibiting proteins, such as, for example, CD59, DAF, and/or MCP. Conversely, pigs that express complement inhibiting proteins, such as, for example, CD59, DAF, and/or MCP, can be further modified to eliminate the expression of at least one allele of the CMP-Neu5Ac hydroxylase gene. These animals can be used as a source of tissue and/or organs for transplantation therapy. A pig embryo prepared in this manner or a cell line developed therefrom can also be used in cell-transplantation therapy.
  • DESCRIPTION OF THE INVENTION
  • Elimination of the CMP-Neu5Ac hydroxylase gene produced according to the process, sequences and/or constructs described herein can reduce a human being's immunological response to the Neu5Gc epitope and remove an immunological barrier to xenotransplantation. The present invention is directed to novel nucleic acid sequences encoding the full-length cDNA and peptide. Information about the genomic organization, intronic sequences and regulatory regions of the gene is also provided. In one aspect, the invention provides isolated and substantially purified cDNA molecules having one of SEQ ID Nos: 1, 3, 5 or 7, or a fragment thereof. In another aspect of the invention, DNA sequences comprising the full-length genome of the CMP-NeuAc hydroxylase gene are provided in SEQ ID Nos 9-45, 46, 47, 48, 49 or 50 or fragments thereof. In another aspect, primers for amplifying porcine CMP-Neu5Ac hydroxylase cDNA or genomic sequence derived from SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 or 50 are provided. Additionally, probes for identifying CMP-Neu5Ac hydroxylase nucleic acid sequences derived from SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 or 50, or fragments thereof, are provided. DNA represented by SEQ ID Nos 9-45, 46, 47, 48, 49 or 50, or fragments thereof, can be used to construct pigs lacking functional CMP-Neu5Ac hydroxylase genes. Thus, the invention also provides a porcine chromosome lacking a functional CMP-NeuAc hydroxylase gene and a transgenic pig lacking a functional CMP-NeuAc hydroxylase protein produced according to the process, sequences and/or constructs described herein. Such pigs can be used as tissue sources for xenotransplantation into humans. 
In an alternate embodiment, CMP-NeuAc hydroxylase-deficient pigs produced according to the process, sequences and/or constructs described herein also lack other genes associated with adverse immune responses in xenotransplantation, such as, for example, the α1,3 galactosyltransferase gene, iGb3 synthetase gene, or FSM synthase gene. In another embodiment, pigs lacking CMP-Neu5Ac hydroxylase produced according to the process, sequences and/or constructs described herein and/or other genes associated with adverse immune responses in xenotransplantation express complement inhibiting factors such as, for example, CD59, DAF, and/or MCP.
  • BRIEF DESCRIPTION OF THE FIGURES
  • FIG. 1 represents the genomic organization of the porcine CMP-Neu5Ac hydroxylase gene. Closed bars depict each numbered exon. The length of the introns between the exons illustrates relative distances. Open boxes also represent exons that appear in some variants (see FIG. 2); “start” and “stop” denote start and stop codons, respectively. The approximate scale is depicted at the bottom of the figure.
  • FIG. 2 depicts cDNA sequences of the CMP-Neu5Ac hydroxylase gene. Variant-1 contains exon 15a in place of exons 14 and 15. Variant-2 contains exon 12a, and variant-3 contains exon 11a. “Start” and “stop” denote the start and stop codons, respectively.
  • FIG. 3 illustrates four non-limiting examples of targeting vectors, along with their corresponding genomic organization. The selectable marker gene in this particular non-limiting example is eGFP (enhanced green fluorescent protein). eGFP can be inserted in the DNA constructs to inactivate the porcine CMP-NeuAc hydroxylase gene.
  • FIG. 4 illustrates transcription factor binding sites located within exon 1 (228 bp) and its 5′-flanking region spanning 601 bp.
  • FIG. 5 depicts oligonucleotide sequences that can be used for DNA construction of porcine CMP-Neu5Ac hydroxylase gene targeting vector.
  • FIG. 6 is a schematic diagram illustrating the production of a 3′-arm segment from the porcine CMP-Neu5Ac hydroxylase gene using primers pDH3 and pDH4, and its insertion into a vector (pCRII).
  • FIG. 7 is a schematic diagram illustrating the production of a 5′-arm segment from the porcine CMP-Neu5Ac hydroxylase gene using primers pDH1 and pDH2, followed by pDH2a, pDH2b, and pDH2c, and its insertion into a vector (pCRII) in which a 3′-arm has previously been inserted.
  • FIG. 8 is a non-limiting example of a schematic illustrating a targeting vector that can be utilized to delete Exon 6 of the porcine CMP-Neu5Ac hydroxylase gene through homologous recombination.
  • FIG. 9 represents oligonucleotide sequences used in generating an enhanced green fluorescent protein expression vector for use in a knock-in strategy.
  • FIG. 10 is a schematic illustrating the insertion of an eGFP fragment with a polyA signal into the targeting vector pDHΔex6.
  • FIG. 11 is a schematic illustrating a knock-in vector for expression of eGFP.
  • FIG. 12 is a schematic illustrating homologous recombination resulting in a frameshift between the targeting cassette DNA construct (pDHΔex6) and genomic DNA.
  • FIG. 13 is a schematic illustrating homologous recombination resulting in a frameshift between the targeting cassette DNA construct (pDHΔex6) and genomic DNA.
  • DEFINITIONS
  • A “target DNA sequence” is a DNA sequence to be modified by homologous recombination. The target DNA can be in any organelle of the animal cell including the nucleus and mitochondria and can be an intact gene, an exon or intron, a regulatory sequence or any region between genes.
  • A “targeting DNA sequence” is a DNA sequence containing the desired sequence modifications. The targeting DNA sequence can be substantially isogenic with the target DNA.
  • A “homologous DNA sequence or homologous DNA” is a DNA sequence that is at least about 80%, 85%, 90%, 95%, 98% or 99% identical with a reference DNA sequence. A homologous sequence hybridizes under stringent conditions to the target sequence. Stringent hybridization conditions include those that will allow hybridization to occur if there is at least 85% and preferably at least 95% or 98% identity between the sequences.
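The percent-identity thresholds in this definition can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming the two sequences have already been aligned to equal length without gaps; the text does not specify an alignment method, and the function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumption: both sequences are the same length and contain no gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True if the identity is at least the given percentage (e.g. 80, 85, 90)."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Two 10-nucleotide toy sequences that differ at one position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 90.0
```

Sequences of unequal length would first need an alignment step, which this sketch deliberately leaves out.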
  • An “isogenic or substantially isogenic DNA sequence” is a DNA sequence that is identical to or nearly identical to a reference DNA sequence. The term “substantially isogenic” refers to DNA that is at least about 97-99% identical with the reference DNA sequence, and preferably at least about 99.5-99.9% identical with the reference DNA sequence, and in certain uses 100% identical with the reference DNA sequence.
  • “Homologous recombination” refers to the process of DNA recombination based on sequence homology.
  • “Gene targeting” refers to homologous recombination between two DNA sequences, one of which is located on a chromosome and the other of which is not.
  • “Non-homologous or random integration” refers to any process by which DNA is integrated into the genome that does not involve homologous recombination.
  • A “selectable marker gene” is a gene, the expression of which allows cells containing the gene to be identified. A selectable marker can be one that allows a cell to proliferate on a medium that prevents or slows the growth of cells without the gene. Examples include antibiotic resistance genes and genes which allow an organism to grow on a selected metabolite. Alternatively, the gene can facilitate visual screening of transformants by conferring on cells a phenotype that is easily identified. Such an identifiable phenotype can be, for example, the production of luminescence or the production of a colored compound, or the production of a detectable change in the medium surrounding the cell.
  • The term “contiguous” is used herein in its standard meaning, i.e., without interruption, or uninterrupted.
  • The term “porcine” refers to any pig species, including breeds such as Large White, Landrace, Meishan, and Minipig.
  • The term “oocyte” describes the mature animal ovum which is the final product of oogenesis and also the precursor forms being the oogonium, the primary oocyte and the secondary oocyte respectively.
  • The term “fragment” means a portion or partial sequence of a nucleotide or peptide sequence.
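The definitions of “contiguous” and “fragment” can be made concrete with a simple substring test. The following is a minimal sketch in Python, assuming sequences are held as plain strings; the function name `is_fragment` and the default minimum length of 10 are illustrative choices, not requirements stated in the text.

```python
def is_fragment(candidate: str, reference: str, min_length: int = 10) -> bool:
    """True if `candidate` is an uninterrupted stretch of `reference`
    that is at least `min_length` residues long."""
    candidate = candidate.upper()
    reference = reference.upper()
    return len(candidate) >= min_length and candidate in reference


if __name__ == "__main__":
    # First 32 nucleotides of the full length cDNA in Table 1.
    reference = "CCCGACGTCCTGGCAGCGCCCAGGCACTGTTA"
    print(is_fragment("GGCAGCGCCCAGG", reference))   # contiguous, 13 nt
    print(is_fragment("GGCAGC", reference))          # contiguous but too short
    print(is_fragment("GGCAGCTTTTCCAGG", reference))  # interrupted
```

Because the test is an exact substring match, a candidate with even one internal mismatch is treated as non-contiguous.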
  • The terms “derivative” and “analog” mean a nucleotide or peptide sequence which retains essentially the same biological function or activity as such nucleotide or peptide. For example, an analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature polypeptide.
  • DNA (deoxyribonucleic acid) sequences provided herein are represented by the bases adenine (A), thymine (T), cytosine (C), and guanine (G).
  • Amino acid sequences provided herein are represented by the following abbreviations:
  • A alanine
    B aspartate or asparagine
    C cysteine
    D aspartate
    E glutamate
    F phenylalanine
    G glycine
    H histidine
    I isoleucine
    K lysine
    L leucine
    M methionine
    N asparagine
    P proline
    Q glutamine
    R arginine
    S serine
    T threonine
    V valine
    W tryptophan
    Y tyrosine
    Z glutamate or glutamine
  • “Transfection” refers to the introduction of DNA into a host cell. Cells do not naturally take up DNA. Thus, a variety of technical “tricks” are utilized to facilitate gene transfer. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, CaPO4 and electroporation (J. Sambrook, E. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989). Transformation of the host cell is the indication of successful transfection.
  • I. Complete cDNA Sequence and Variants of the Porcine CMP-Neu5Ac Hydroxylase Gene
  • One aspect of the present invention provides novel, full length nucleic acid cDNA sequences of the porcine CMP-Neu5Ac hydroxylase gene (FIG. 2, Table 1, Seq ID No 1). Another aspect of the present invention provides predicted amino acid peptide sequences of the porcine CMP-Neu5Ac hydroxylase gene (Table 2, Seq ID No 2). The ATG start codon for the full-length cDNA is located in the 3′ portion of Exon 4, and the stop codon TAG is found in the 3′ portion of Exon 17. Nucleic and amino acid sequences at least 90, 95, 98 or 99% homologous to Seq ID Nos 1 or 2 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 15, 17, 20 or 25 contiguous nucleic or amino acids of Seq ID Nos 1 or 2 are also provided. Further provided are fragments, derivatives and analogs of Seq ID Nos 1-2. Fragments of Seq ID Nos. 1-2 can include any contiguous nucleic acid or peptide sequence that includes at least about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 550, 600, 700, 750, 800, 850, 900, 1000, 5000 or 10,000 nucleotides.
  • TABLE 1
    Full length cDNA (Exons 1 & 4-18; Seq ID No 1)
    CCCGACGTCCTGGCAGCGCCCAGGCACTGTTA
    TTGGTGCCTCCTGTGTCCACGCGCTTCCCGGC
    CAGGCAGCCCTGGCGGATCCTATTTTCTGTTC
    CCCCGATTCTGGTACCTCTCCCTCCCGCCCTC
    GGTGCGCAGCCGTCCTCCTGCAGTGCCTGCTC
    CTCCAGGGGCGAAACCGATCAGGGATCAGGCC
    ACCCGCCTCCTGAACATCCCTCCTTAGTTCCC
    ACAGTCTAATGCCTTGTGGAAGCAAATGAGCC
    ACAGAAGCTGAAGGAAAAACCACCATTCTTTC
    TTAATACCTGGAGAGAGGCAACGACAGACTAT
    GAGCAGCATCGAACAAACGACGGAGATCCTGT
    TGTGCCTCTCACCTGCCGAAGCTGCCAATCTC
    AAGGAAGGAATCAATTTTGTTCGAAATAAGAG
    CACTGGCAAGGATTACATCTTATTTAAGAATA
    AGAGCCGCCTGAAGGCATGTAAGAACATGTGC
    AAGCACCAAGGAGGCCTCTTCATTAAAGACAT
    TGAGGATCTAAATGGAAGGTCTGTTAAATGCA
    CAAAACACAACTGGAAGTTAGATGTAAGCAGC
    ATGAAGTATATCAATCCTCCTGGAAGCTTCTG
    TCAAGACGAACTGGTTGTAGAAAAGGATGAAG
    AAAATGGAGTTTTGCTTCTAGAACTAAATCCT
    CCTAACCCGTGGGATTCAGAACCCAGATCTCC
    TGAAGATTTGGCATTTGGGGAAGTGCAGATCA
    CGTACCTTACTCACGCCTGCATGGACCTCAAG
    CTGGGGGACAAGAGAATGGTGTTCGACCCTTG
    GTTAATCGGTCCTGCTTTTGCGCGAGGATGGT
    GGTTACTACACGAGCCTCCATCTGATTGGCTG
    GAGAGGCTGAGCCGCGCAGACTTAATTTACAT
    CAGTCACATGCACTCAGACCACCTGAGTTACC
    CAACACTGAAGAAGCTTGCTGAGAGAAGACCA
    GATGTTCCCATTTATGTTGGCAACACGGAAAG
    ACCTGTATTTTGGAATCTGAATCAGAGTGGCG
    TCCAGTTGACTAATATCAATGTAGTGCCATTT
    GGAATATGGCAGCAGGTAGACAAAAATCTTCG
    ATTCATGATCTTGATGGATGGCGTTCATCCTG
    AGATGGACACTTGCATTATTGTGGAATACAAA
    GGTCATAAAATACTCAATACAGTGGATTGCAC
    CAGACCCAATGGAGGAAGGCTGCCTATGAAGG
    TTGCATTAATGATGAGTGATTTTGCTGGAGGA
    GCTTCAGGCTTTCCAATGACTTTCAGTGGTGG
    AAAATTTACTGAGGAATGGAAAGCCCAATTCA
    TTAAAACAGAAAGGAAGAAACTCCTGAACTAC
    AAGGCTCGGCTGGTGAAGGACCTACAACCCAG
    AATTTACTGCCCCTTTCCTGGGTATTTCGTGG
    AATCCCACCCAGCAGACAAGTATATTAAGGAA
    ACAAACATCAAAAATGACCCAAATGAACTCAA
    CAATCTTATCAAGAAGAATTCTGAGGTGGTAA
    CCTGGACCCCAAGACCTGGAGCCACTCTTGAT
    CTGGGTAGGATGCTAAAGGACCCAACAGACAG
    CAAGGGCATCGTAGAGCCTCCAGAAGGGACTA
    AGATTTACAAGGATTCCTGGGATTTTGGCCCA
    TATTTGAATATCTTGAATGCTGCTATAGGAGA
    TGAAATATTTCGTCACTCATCCTGGATAAAAG
    AATACTTCACTTGGGCTGGATTTAAGGATTAT
    AACCTGGTGGTCAGGATGATTGAGACAGATGA
    GGACTTCAGCCCTTTGCCTGGAGGATATGACT
    ATTTGGTTGACTTTCTGGATTTATCCTTTCCA
    AAAGAAAGACCAAGCCGGGAACATCCATATGA
    GGAAATTCGGAGCCGGGTTGATGTCATCAGAC
    ACGTGGTAAAGAATGGTCTGCTCTGGGATGAC
    TTGTACATAGGATTCCAAACCCGGCTTCAGCG
    GGATCCTGATATATACCATCATCTGTTTTGGA
    ATCATTTTCAAATAAAACTCCCCCTCACACCA
    CCTGACTGGAAGTCCTTCCTGATGTGCTCTGG
    GTAGAGAGGACCTGAGCTGTCCCAGGGGTGCC
    CAACAACATGAAAAAATCAAGAATTTATTGCT
    GCTACGTCAAAGCTTATACCAGAGATTATGCC
    TTATAGACATTAGCAATGGATAATTATATGTT
    GCACTTGTGAAATGTGCACATATCCTGTTTAT
    GAATCACCACATAGCCAGATTATCAATATTTT
    ACTTATTTCGTAAAAAATCCACAATTTTCCAT
    AACAGAATCAACGTGTGCAATAGGAACAAGAT
    TGCTATGGAAAACGAGGGTAACAGGAGGAGAT
    ATTAATCCAAGCATAGAAGAAATAGACAAATG
    AGGGGCCATAAGGGGAATATAGGGAAGAGAAA
    AAAATTAAGATGGAATTTTAAAAGGAGAATGT
    AAAAAATAGATATTTGTTCCTTAATAGGTTGA
    TTCCTCAAATAGAGCCCATGAATATAATCAAA
    TAGGAAGGGTTCATGACTGTTTTCAATTTTTC
    AAAAAGCTTTGTTGAAATCATAGACTTGCAAA
    ACAAGGCTGTAGAGGCCACCCTAAAATGGAAA
    ATTTCACTGGGACTGAAATTATTTTGATTCAA
    TGACAAAATTTGTTATTTACTGCGGATTATAA
    ACTCTAACAAATAGCGATCTCTTTGCTTCATA
    AAAACATAAACACTAGCTAGTAATAAAATGAG
    TTCTGCAG
  • TABLE 2
    Full length Amino Acid Sequence
    M S S I E Q T T E I L L C L S P A E A A Seq ID No 2
    N L K E G I N F V R N K S T G K D Y I L
    F K N K S R L K A C K N M C K H Q G G L
    F I K D I E D L N G R S V K C T K H N W
    K L D V S S M K Y I N P P G S F C Q D E
    L V V E K D E E N G V L L L E L N P P N
    P W D S E P R S P E D L A F G E V Q I T
    Y L T H A C M D L K L G D K R M V F D P
    W L I G P A F A R G W W L L H E P P S D
    W L E R L S R A D L I Y I S H M H S D H
    L S Y P T L K K L A E R R P D V P I Y V
    G N T E R P V F W N L N Q S G V Q L T N
    I N V V P F G I W Q Q V D K N L R F M I
    L M D G V H P E M D T C I I V E Y K G H
    K I L N T V D C T R P N G G R L P M K V
    A L M M S D F A G G A S G F P M T F S G
    G K F T E E W K A Q F I K T E R K K L L
    N Y K A R L V K D L Q P R I Y C P F P G
    Y F V E S H P A D K Y I K E T N I K N D
    P N E L N N L I K K N S E V V T W T P R
    P G A T L D L G R M L K D P T D S K G I
    V E P P E G T K I Y K D S W D F G P Y L
    N I L N A A I G D E I F R H S S W I K E
    Y F T W A G F K D Y N L V V R M I E T D
    E D F S P L P G G Y D Y L V D F L D L S
    F P K E R P S R E H P Y E E I R S R V D
    V I R H V V K N G L L W D D L Y I G F Q
    T R L Q R D P D I Y H H L F W N H F Q I
    K L P L T P P D W K S F L M C S G
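  • The correspondence between the cDNA of Table 1 and the amino acid sequence of Table 2 can be illustrated with a minimal sketch. It assumes the standard genetic code and uses only the first 60 nucleotides of the open reading frame as they appear in Table 1, beginning at the ATG start codon. The names `translate`, `CODON_TABLE` and `ORF_START` are hypothetical and chosen for illustration only; they are not part of the disclosure.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), with codons
# enumerated in T, C, A, G order at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate an in-frame cDNA string, stopping at the first stop codon."""
    seq = cdna.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First 60 nucleotides of the open reading frame, copied from Table 1.
ORF_START = "ATGAGCAGCATCGAACAAACGACGGAGATCCTGTTGTGCCTCTCACCTGCCGAAGCTGCC"

print(translate(ORF_START))
```

  • The printed residues can be compared by eye against the first row of Table 2. Translating an entire table entry would require first removing the line breaks and locating the start codon, which this sketch does not attempt.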
  • Variants
  • Another aspect of the present invention provides novel cDNA sequences of three novel variants of the CMP-Neu5Ac hydroxylase gene transcript (FIG. 2, Tables 3, 5, and 7, Seq ID Nos. 3, 5, and 7). Seq ID No 3 represents the cDNA of a variant of the gene, variant-1, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15a, 16, 17, and 18. Exon 15a is a cryptic Exon that normally appears in Intron 15, approximately 460 bp upstream of Exon 16. The start codon for variant-1 is located in Exon 4, while the stop codon is located in Exon 17. Seq ID No 5 represents the cDNA of a variant of the gene, variant-2, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 12a. Exon 12a is a cryptic Exon which is retained from a partial sequence of Intron 12 (see Seq ID No. 21). The start codon for variant-2 is located in Exon 4, while the stop codon is located in the terminal end of Exon 12a. Seq ID No 7 represents the cDNA of a variant of the gene, variant-3, that includes Exons 1, 4, 5, 6, 7, 8, 9, 10, 11 and 11a. Exon 11a is a cryptic Exon which is retained from a partial sequence of Intron 11 (see Seq ID No. 19). The start codon for variant-3 is located in Exon 4, while the stop codon is located in Exon 11a. Another aspect of the present invention provides predicted amino acid sequences of three novel variants of the porcine CMP-Neu5Ac hydroxylase gene transcript. Seq ID Nos 4, 6 and 8 represent the amino acid sequences of variant-1, variant-2 and variant-3, respectively. Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to Seq ID Nos 3-8 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 15, 17, 20, 25, 30, 50, 100, 150, 200, 300, 400, 500 or 1000 contiguous nucleotides or amino acids of Seq ID Nos 3-8 are also provided. Further provided are fragments, derivatives and analogs of Seq ID Nos 3-8. Fragments of Seq ID Nos. 3-8 can include any contiguous nucleic acid or peptide sequence that includes at least about 10 bp, 15 bp, 17 bp, 20 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 5 kbp or 10 kbp.
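  • The exon composition of the three variants relative to the full-length transcript can be summarized with a short sketch. The exon lists are taken from the text above and from the Table 1 heading (Exons 1 and 4-18 for the full-length cDNA); the names `TRANSCRIPTS`, `exon_range` and `compare_to_full_length` are hypothetical and serve only as an illustration.

```python
def exon_range(first, last):
    """Return exon labels first..last as strings, e.g. ['4', '5', ...]."""
    return [str(n) for n in range(first, last + 1)]


# Exon labels are kept as strings because the cryptic exons carry a
# letter suffix (11a, 12a, 15a).
TRANSCRIPTS = {
    "full-length": ["1"] + exon_range(4, 18),
    "variant-1": ["1"] + exon_range(4, 13) + ["15a", "16", "17", "18"],
    "variant-2": ["1"] + exon_range(4, 12) + ["12a"],
    "variant-3": ["1"] + exon_range(4, 11) + ["11a"],
}


def compare_to_full_length(name):
    """Return (exons absent from the variant, exons found only in the variant)."""
    full = TRANSCRIPTS["full-length"]
    variant = TRANSCRIPTS[name]
    skipped = [e for e in full if e not in variant]
    cryptic = [e for e in variant if e not in full]
    return skipped, cryptic


for name in ("variant-1", "variant-2", "variant-3"):
    skipped, cryptic = compare_to_full_length(name)
    print(name, "lacks", skipped, "and adds", cryptic)
```

  • Under these assumptions, variant-1 differs from the full-length transcript by the absence of Exons 14 and 15 and the presence of the cryptic Exon 15a, while variant-2 and variant-3 terminate in the cryptic Exons 12a and 11a, respectively.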
  • TABLE 3
    Variant-1 cDNA (Exons 1, 4-13, 15a, 16, 17, 18; Seq ID No 3)
    CCCGACGTCCTGGCAGCGCCCAGGCACTGTT
    ATTGGTGCCTCCTGTGTCCACGCGCTTCCCG
    GCCAGGCAGCCCTGGCGGATCCTATTTTCTG
    TTCCCCCGATTCTGGTACCTCTCCCTCCCGC
    CCTCGGTGCGCAGCCGTCCTCCTGCAGTGCC
    TGCTCCTCCAGGGGCGAAACCGATCAGGGAT
    CAGGCCACCCGCCTCCTGAACATCCCTCCTT
    AGTTCCCACAGTCTAATGCCTTGTGGAAGCA
    AATGAGCCACAGAAGCTGAAGGAAAAACCAC
    CATTTCTTTCTTAATACCTGGAGAGAGGCAA
    CGACAGACTATGAGCAGCATCGAACAAACGA
    CGGAGATCCTGTTGTGCCTCTCACCTGCCGA
    AGCTGCCAATCTCAAGGAAGGAATCAATTTT
    GTTCGAAATAAGAGCACTGGCAAGGATTACA
    TCTTATTTAAGAATAAGAGCCGCCTGAAGGC
    ATGTAAGAACATGTGCAAGCACCAAGGAGGC
    CTCTTCATTAAAGACATTGAGGATCTAAATG
    GAAGGTCTGTTAAATGCACAAAACACAACTG
    GAAGTTAGATGTAAGCAGCATGAAGTATATC
    AATCCTCCTGGAAGCTTCTGTCAAGACGAAC
    TGGTTGTAGAAAAGGATGAAGAAAATGGAGT
    TTTGCTTCTAGAACTAAATCCTCCTAACCCG
    TGGGATTCAGAACCCAGATCTCCTGAAGATT
    TGGCATTTGGGGAAGTGCAGATCACGTACCT
    TACTCACGCCTGCATGGACCTCAAGCTGGGG
    GACAAGAGAATGGTGTTCGACCCTTGGTTAA
    TCGGTCCTGCTTTTGCGCGAGGATGGTGGTT
    ACTACACGAGCCTCCATCTGATTGGCTGGAG
    AGGCTGAGCCGCGCAGACTTAATTTACATCA
    GTCACATGCACTCAGACCACCTGAGTTACCC
    AACACTGAAGAAGCTTGCTGAGAGAAGACCA
    GATGTTCCCATTTATGTTGGCAACACGGAAA
    GACCTGTATTTTGGAATCTGAATCAGAGTGG
    CGTCCAGTTGACTAATATCAATGTAGTGCCA
    TTTGGAATATGGCAGCAGGTAGACAAAAATC
    TTCGATTCATGATCTTGATGGATGGCGTTCA
    TCCTGAGATGGACACTTGCATTATTGTGGAA
    TACAAAGGTCATAAAATACTCAATACAGTGG
    ATTGCACCAGACCCAATGGAGGAAGGCTGCC
    TATGAAGGTTGCATTAATGATGAGTGATTTT
    GCTGGAGGAGCTTCAGGCTTTCCAATGACTT
    TCAGTGGTGGAAAATTTACTGAGGAATGGAA
    AGCCCAATTCATTAAAACAGAAAGGAAGAAA
    CTCCTGAACTACAAGGCTCGGCTGGTGAAGG
    ACCTACAACCCAGAATTTACTGCCCCTTTCC
    TGGGTATTTCGTGGAATCCCACCCAGCAGAC
    AAGTATATTAAGGAAACAAACATCAAAAATG
    ACCCAAATGAACTCAACAATCTTATCAAGAA
    GAATTCTGAGGTGGTAACCTGGACCCCAAGA
    CCTGGAGCCACTCTTGATCTGGGTAGGATGC
    TAAAGGACCCAACAGACAGATCCTGTGTCAG
    GAGTTGGGATTCTTTGAAGATTCGGAGCCGG
    GTTGATGTCATCAGACACGTGGTAAAGAATG
    GTCTGCTCTGGGATGACTTGTACATAGGATT
    CCAAACCCGGCTTCAGCGGGATCCTGATATA
    TACCATCATCTGTTTTGGAATCATTTTCAAA
    TAAAACTCCCCCTCACACCACCTGACTGGAA
    GTCCTTCCTGATGTGCTCTGGGTAGAGAGGA
    CCTGAGCTGTCCCAGGGGTGCCCAACAACAT
    GAAAAAATCAAGAATTTATTGCTGCTACGTC
    AAAGCTTATACCAGAGATTATGCCTTATAGA
    CATTAGCAATGGATAATTATATGTTGCACTT
    GTGAAATGTGCACATATCCTGTTTATGAATC
    ACCACATAGCCAGATTATCAATATTTTACTT
    ATTTCGTAAAAAATCCACAATTTTCCATAAC
    AGAATCAACGTGTGCAATAGGAACAAGATTG
    CTATGGAAAACGAGGGTAACAGGAGGAGATA
    TTAATCCAAGCATAGAAGAAATAGACAAATG
    AGGGGCCATAAGGGGAATATAGGGAAGAGAA
    AAAAATTAAGATGGAATTTTAAAAGGAGAAT
    GTAAAAAATAGATATTTGTTCCTTAATAGGT
    TGATTCCTCAAATAGAGCCCATGAATATAAT
    CAAATAGGAAGGGTTCATGACTGTTTTCAAT
    TTTTCAAAAAGCTTTGTTGAAATCATAGACT
    TGCAAAACAAGGCTGTAGAGGCCACCCTAAA
    ATGGAAAATTTCACTGGGACTGAAATTATTT
    TGATTCAATGACAAAATTTGTTATTTACTGC
    GGATTATAAACTCTAACAAATAGCGATCTCT
    TTGCTTCATAAAAACATAAACACTAGCTAGT
    AATAAAATGAGTTCTGCAG
  • TABLE 4
    Variant-1 Amino Acid Sequence
    M S S I E Q T T E I L L C L S P A E A A Seq ID No 4
    N L K E G I N F V R N K S T G K D Y I L
    F K N K S R L K A C K N M C K H Q G G L
    F I K D I E D L N G R S V K C T K H N W
    K L D V S S M K Y I N P P G S F C Q D E
    L V V E K D E E N G V L L L E L N P P N
    P W D S E P R S P E D L A F G E V Q I T
    Y L T H A C M D L K L G D K R M V F D P
    W L I G P A F A R G W W L L H E P P S D
    W L E R L S R A D L I Y I S H M H S D H
    L S Y P T L K K L A E R R P D V P I Y V
    G N T E R P V F W N L N Q S G V Q L T N
    I N V V P F G I W Q Q V D K N L R F M I
    L M D G V H P E M D T C I I V E Y K G H
    K I L N T V D C T R P N G G R L P M K V
    A L M M S D F A G G A S G F P M T F S G
    G K F T E E W K A Q F I K T E R K K L L
    N Y K A R L V K D L Q P R I Y C P F P G
    Y F V E S H P A D K Y I K E T N I K N D
    P N E L N N L I K K N S E V V T W T P R
    P G A T L D L G R M L K D P T D R S C V
    R S W D S L K I R S R V D V I R H V V K
    N G L L W D D L Y I G F Q T R L Q R D P
    D I Y H H L F W N H F Q I K L P L T P P
    D W K S F L M C S G
  • TABLE 5
    Variant-2 cDNA (Exons 1, 4-12, 12a; Seq ID No 5)
    CCCGACGTCCTGGCAGCGCCCAGGCACTG
    TTATTGGTGCCTCCTGTGTCCACGCGCTT
    CCCGGCCAGGCAGCCCTGGCGGATCCTAT
    TTTCTGTTCCCCCGATTCTGGTACCTCTC
    CCTCCCGCCCTCGGTGCGCAGCCGTCCTC
    CTGCAGTGCCTGCTCCTCCAGGGGCGAAA
    CCGATCAGGGATCAGGCCACCCGCCTCCT
    GAACATCCCTCCTTAGTTCCCACAGTCTA
    ATGCCTTGTGGAAGCAAATGAGCCACAGA
    AGCTGAAGGAAAAACCACCATTCTTTCTT
    AATACCTGGAGAGAGGCAACGACAGACTA
    TGAGCAGCATCGAACAAACGACGGAGATC
    CTGTTGTGCCTCTCACCTGCCGAAGCTGC
    CAATCTCAAGGAAGGAATCAATTTTGTTC
    GAAATAAGAGCACTGGCAAGGATTACATC
    TTATTTAAGAATAAGAGCCGCCTGAAGGC
    ATGTAAGAACATGTGCAAGCACCAAGGAG
    GCCTCTTCATTAAAGACATTGAGGATCTA
    AATGGAAGGTCTGTTAAATGCACAAAACA
    CAACTGGAAGTTAGATGTAAGCAGCATGA
    AGTATATCAATCCTCCTGGAAGCTTCTGT
    CAAGACGAACTGGTTGTAGAAAAGGATGA
    AGAAAATGGAGTTTTGCTTCTAGAACTAA
    ATCCTCCTAACCCGTGGGATTCAGAACCC
    AGATCTCCTGAAGATTTGGCATTTGGGGA
    AGTGCAGATCACGTACCTTACTCACGCCT
    GCATGGACCTCAAGCTGGGGGACAAGAGA
    ATGGTGTTCGACCCTTGGTTAATCGGTCC
    TGCTTTTGCGCGAGGATGGTGGTTACTAC
    ACGAGCCTCCATCTGATTGGCTGGAGAGG
    CTGAGCCGCGCAGACTTAATTTACATCAG
    TCACATGCACTCAGACCACCTGAGTTACC
    CAACACTGAAGAAGCTTGCTGAGAGAAGA
    CCAGATGTTCCCATTTATGTTGGCAACAC
    GGAAAGACCTGTATTTTGGAATCTGAATC
    AGAGTGGCGTCCAGTTGACTAATATCAAT
    GTAGTGCCATTTGGAATATGGCAGCAGGT
    AGACAAAAATCTTCGATTCATGATCTTGA
    TGGATGGCGTTCATCCTGAGATGGACACT
    TGCATTATTGTGGAATACAAAGGTCATAA
    AATACTCAATACAGTGGATTGCACCAGAC
    CCAATGGAGGAAGGCTGCCTATGAAGGTT
    GCATTAATGATGAGTGATTTTGCTGGAGG
    AGCTTCAGGCTTTCCAATGACTTTCAGTG
    GTGGAAAATTTACTGAGGAATGGAAAGCC
    CAATTCATTAAAACAGAAAGGAAGAAACT
    CCTGAACTACAAGGCTCGGCTGGTGAAGG
    ACCTACAACCCAGAATTTACTGCCCCTTT
    CCTGGGTATTTCGTGGAATCCCACCCAGC
    AGACAAGTATGGCTGGATATTTTATATAA
    CGTGTTTACGCATAAGTTAATATATGCTG
    AATGAGTGATTTAGCTGTGAAACAACATG
    AAATGAGAAAGAATGATTAGTAGGGGTCT
    GGAGCTTATTTTAACAAGCAGCCTGAAAA
    CAGAAAGTATGAATAAAAAAAATTAAATG
    CAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  • TABLE 6
    Variant-2 Amino Acid Sequence
    M S S I E Q T T E I L L C L S P A E A A Seq ID No 6
    N L K E G I N F V R N K S T G K D Y I L
    F K N K S R L K A C K N M C K H Q G G L
    F I K D I E D L N G R S V K C T K H N W
    K L D V S S M K Y I N P P G S F C Q D E
    L V V E K D E E N G V L L L E L N P P N
    P W D S E P R S P E D L A F G E V Q I T
    Y L T H A C M D L K L G D K R M V F D P
    W L I G P A F A R G W W L L H E P P S D
    W L E R L S R A D L I Y I S H M H S D H
    L S Y P T L K K L A E R R P D V P I Y V
    G N T E R P V F W N L N Q S G V Q L T N
    I N V V P F G I W Q Q V D K N L R F M I
    L M D G V H P E M D T C I I V E Y K G H
    K I L N T V D C T R P N G G R L P M K V
    A L M M S D F A G G A S G F P M T F S G
    G K F T E E W K A Q F I K T E R K K L L
    N Y K A R L V K D L Q P R I Y C P F P G
    Y F V E S H P A D K Y G W I F Y I T C L
    R I S
  • TABLE 7
    Variant-3 cDNA (Exons 1, 4-11, 11a; Seq ID No 7)
    CCCGACGTCCTGGCAGCGCCCAGGCACTG
    TTATTGGTGCCTCCTGTGTCCACGCGCTT
    CCCGGCCAGGCAGCCCTGGCGGATCCTAT
    TTTCTGTTCCCCCGATTCTGGTACCTCTC
    CCTCCCGCCCTCGGTGCGCAGCCGTCCTC
    CTGCAGTGCCTGCTCCTCCAGGGGCGAAA
    CCGATCAGGGATCAGGCCACCCGCCTCCT
    GAACATCCCTCCTTAGTTCCCACAGTCTA
    ATGCCTTGTGGAAGCAAATGAGCCACAGA
    AGCTGAAGGAAAAACCACCATTCTTTCTT
    AATACCTGGAGAGAGGCAACGACAGACTA
    TGAGCAGCATCGAACAAACGACGGAGATC
    CTGTTGTGCCTCTCACCTGCCGAAGCTGC
    CAATCTCAAGGAAGGAATCAATTTTGTTC
    GAAATAAGAGCACTGGCAAGGATTACATC
    TTATTTAAGAATAAGAGCCGCCTGAAGGC
    ATGTAAGAACATGTGCAAGCACCAAGGAG
    GCCTCTTCATTAAAGACATTGAGGATCTA
    AATGGAAGGTCTGTTAAATGCACAAAACA
    CAACTGGAAGTTAGATGTAAGCAGCATGA
    AGTATATCAATCCTCCTGGAAGCTTCTGT
    CAAGACGAACTGGTTGTAGAAAAGGATGA
    AGAAAATGGAGTTTTGCTTCTAGAACTAA
    ATCCTCCTAACCCGTGGGATTCAGAACCC
    AGATCTCCTGAAGATTTGGCATTTGGGGA
    AGTGCAGATCACGTACCTTACTCACGCCT
    GCATGGACCTCAAGCTGGGGGACAAGAGA
    ATGGTGTTCGACCCTTGGTTAATCGGTCC
    TGCTTTTGCGCGAGGATGGTGGTTACTAC
    ACGAGCCTCCATCTGATTGGCTGGAGAGG
    CTGAGCCGCGCAGACTTAATTTACATCAG
    TCACATGCACTCAGACCACCTGAGTTACC
    CAACACTGAAGAAGCTTGCTGAGAGAAGA
    CCAGATGTTCCCATTTATGTTGGCAACAC
    GGAAAGACCTGTATTTTGGAATCTGAATC
    AGAGTGGCGTCCAGTTGACTAATATCAAT
    GTAGTGCCATTTGGAATATGGCAGCAGGT
    AGACAAAAATCTTCGATTCATGATCTTGA
    TGGATGGCGTTCATCCTGAGATGGACACT
    TGCATTATTGTGGAATACAAAGGTCATAA
    AATACTCAATACAGTGGATTGCACCAGAC
    CCAATGGAGGAAGGCTGCCTATGAAGGTT
    GCATTAATGATGAGTGATTTTGCTGGAGG
    AGCTTCAGGCTTTCCAATGACTTTCAGTG
    GTGGAAAATTTACTGGTAATTCTTTATAT
    CAAAATGATGCCAAGGAGTTGGCATGGCA
    CTTTGCTAAATGCTGTGTGAATCAATACA
    AAGATAATTAGGACATGGTTCTTCCTCAC
    AAGAGGTGTGCAATCTTATTGGGAAATCA
    TACTTGCAAGTCACAAATATAGACTAAAG
    TTTCCAGCTGAGAATATGCTGATGGAGCA
    TGAAACACTAAGGAGACAGGGAGAATCTC
    AGGAAAAATCAAGAATAATTTGGATCAAA
    TGGATTCCTGACATAGAACATAGAGCTGA
    TCAGAAAGAGTCTGACATTGGTAATCCAG
    GCTTAAGTGCTCTTTGTATGTGGTTCAGA
    ACAGAGTGTGGGCAGCCTGAGGGGGATAC
    ATACCCTTGACCTCGTGGAAAGCTCATAC
    GGGGGAGGGATGAGGCTAAGGAAGCCCCT
    CTAAAGTGTGGGATTACGAGAGGTTGGGG
    GGGTGGTAGGGAAAATAGTGGTCAAAGAG
    TATAAACTTCCAGTTACAAGATGAATAAA
    TTCTAGGGGTATAATAACAGCATGGCACT
    ATAGATAGCATATTGTACTATATACTGGA
    AGTGCTGAGAGTAGATCTTACATGTTCTA
    ACCACACACACACACACACACACACACAC
    ACCACACACACACACCACACACACACACG
    TGCACACAAACAGAAATGGTAATTATGTG
    AGGTGATGGCGGTGTTAACTAACTTTATT
    GTGGTCATCATTTAGCCATACATGCATGT
    CATGAAATCACCATGTTGTACACCTTAAA
    GTTATGTAATACTAGATGTCAGTTATATC
    TCAAAGCTAGAAAAAATGTGGGGACCAAG
    GCAGAAGCTCTTCTGCTCTGTGTCTAAGG
    GTGGTTCTGGGGCTGGGATGGGGAGGATG
    GTTAAGTGGTATATTTTTTTCATACCTTT
    GCTCAGTACTATCATTGTAAGTGTTCAAT
    ATATGTCTGCTTAATAAATTAATGTTTTT
    AGTAAAAAAAAAAAAAAAAAAAAAAAAAA
    AA
  • TABLE 8
    Variant-3 Amino Acid Sequence
    M S S I E Q T T E I L L C L S P A E A A Seq ID No 8
    N L K E G I N F V R N K S T G K D Y I L
    F K N K S R L K A C K N M C K H Q G G L
    F I K D I E D L N G R S V K C T K H N W
    K L D V S S M K Y I N P P G S F C Q D E
    L V V E K D E E N G V L L L E L N P P N
    P W D S E P R S P E D L A F G E V Q I T
    Y L T H A C M D L K L G D K R M V F D P
    W L I G P A F A R G W W L L H E P P S D
    W L E R L S R A D L I Y I S H M H S D H
    L S Y P T L K K L A E R R P D V P I Y V
    G N T E R P V F W N L N Q S G V Q L T N
    I N V V P F G I W Q Q V D K N L R F M I
    L M D G V H P E M D T C I I V E Y K G H
    K I L N T V D C T R P N G G R L P M K V
    A L M M S D F A G G A S G F P M T F S G
    G K F T G N S L Y Q N D A K E L A W H F
    A K C C V N Q Y K D N
  • In other aspects of the present invention, nucleic acid constructs are provided that contain cDNA or variants thereof encoding CMP-Neu5Ac hydroxylase. These cDNA sequences can be SEQ ID NO 1, 3, 5 or 7, or derived from SEQ ID Nos. 2, 4, 6, or 8 or any fragment thereof. Constructs can contain one, or more than one, internal ribosome entry site (IRES). The construct can also contain a promoter operably linked to the nucleic acid sequence encoding CMP-Neu5Ac hydroxylase, or, alternatively, the construct can be promoterless. In another embodiment, nucleic acid constructs are provided that contain nucleic acid sequences that permit random or targeted insertion into a host genome. In addition to the nucleic acid sequences, the expression vector can contain selectable marker sequences, such as, for example, enhanced Green Fluorescent Protein (eGFP) gene sequences, initiation and/or enhancer sequences, poly-A tail sequences, and/or nucleic acid sequences that provide for the expression of the construct in prokaryotic and/or eukaryotic host cells. Suitable vectors and selectable markers are described below. The expression constructs can further contain sites for transcription initiation, termination, and/or ribosome binding sites. The constructs can be expressed in any prokaryotic or eukaryotic cell, including, but not limited to, yeast cells, bacterial cells, such as E. coli, mammalian cells, such as CHO cells, and/or plant cells.
  • Promoters for use in such constructs include, but are not limited to, the phage lambda PL promoter, E. coli lac, E. coli trp, E. coli phoA, E. coli tac promoters, SV40 early, SV40 late, retroviral LTRs, the PGK1, GAL1, GAL10, CYC1, PHO5, TRP1, ADH1 and ADH2 genes, promoters for glyceraldehyde phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, triose phosphate isomerase, phosphoglucose isomerase, glucokinase, alpha-mating factor pheromone, PRB1, GUT2, GPD1 promoter, metallothionein promoter, and/or mammalian viral promoters, such as those derived from adenovirus and vaccinia virus. Other promoters will be known to one skilled in the art.
  • II. Genomic Sequences of the CMP-Neu5Ac Hydroxylase Gene
  • Nucleic acid sequences representing the genomic DNA organization of the CMP-Neu5Ac hydroxylase gene (FIG. 1, Table 9) are also provided. Seq ID Nos 10-28 represent Exons 1, 4-11, 11a, 12, 12a, 13-15, 15a, and 16-18, respectively. Exons 11a, 12a, and 15a are cryptic Exons that are retained in certain variant transcripts of CMP-Neu5Ac hydroxylase. SEQ ID Nos 29-45 represent the two intronic sequences located between Exon 1 and Exon 4 (hereinafter Intron 1a and Intron 1b) and Introns 4-15, 15a, 16, and 17, respectively. Intron 15a is the 3′ downstream portion of Intron 15 that follows the cryptic Exon 15a. Seq ID No. 9 represents the 5′ untranslated region of the porcine CMP-Neu5Ac hydroxylase gene. A nucleic acid sequence representing the genomic DNA sequence of the porcine CMP-Neu5Ac hydroxylase gene (Table 10, SEQ ID No. 46) is also provided. In addition, a contiguous 5′ genomic sequence containing the 5′ UTR, Exon 1 and a portion of the intronic sequence located between Exon 1 and Exon 4 (Intron 1a) (SEQ ID No. 47, Table 11) is provided. A contiguous genomic sequence extending from an intronic sequence located between Exon 1 and Exon 4 (Intron 1b) through Exon 18 (SEQ ID No. 48, Table 12) is also provided. Nucleotide and amino acid sequences at least 80, 85, 90, 95, 98 or 99% homologous to SEQ ID Nos 9-45, 46, 47, 48, 49 and 50 are provided. In addition, nucleotide and peptide sequences that contain at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 550, 600, 700, 750, 800, 850, 900, 1000, 5000 or 10,000 contiguous nucleotides or amino acids of SEQ ID Nos 9-45, 46, 47, 48, 49 and 50 are also provided, as well as any nucleotide sequence 80, 85, 90, 95, 98 or 99% homologous thereto. Further provided are fragments, derivatives and analogs of SEQ ID Nos 9-45, 46, 47, 48, 49, and 50. Fragments of Seq ID Nos. 9-45, 46, 47, 48, 49, and 50 can include any contiguous nucleic acid or peptide sequence of at least about 10 bp, 15 bp, 17 bp, 20 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 5 kbp or 10 kbp.
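  • The assignment of sequence identifiers to genomic regions described above can be expressed as a simple lookup, sketched below. The ordering of exons and introns follows the text (Seq ID No 9 for the 5′ UTR, Seq ID Nos 10-28 for the exons, Seq ID Nos 29-45 for the introns); the names `EXONS`, `INTRONS`, `SEQ_ID` and `numbered` are hypothetical and used for illustration only.

```python
def numbered(first, last):
    """Return region labels first..last as strings."""
    return [str(n) for n in range(first, last + 1)]


# Exons in the order listed for Seq ID Nos 10-28.
EXONS = (["1"] + numbered(4, 11) + ["11a", "12", "12a"]
         + numbered(13, 15) + ["15a"] + numbered(16, 18))

# Introns in the order listed for Seq ID Nos 29-45.
INTRONS = ["1a", "1b"] + numbered(4, 15) + ["15a", "16", "17"]

# Build the region -> Seq ID No lookup.
SEQ_ID = {"5' UTR": 9}
SEQ_ID.update({f"Exon {e}": 10 + i for i, e in enumerate(EXONS)})
SEQ_ID.update({f"Intron {n}": 29 + i for i, n in enumerate(INTRONS)})

# The list lengths must match the stated identifier ranges.
assert len(EXONS) == 28 - 10 + 1
assert len(INTRONS) == 45 - 29 + 1

for region in ("Exon 11a", "Exon 12a", "Exon 15a", "Intron 5"):
    print(region, "-> Seq ID No", SEQ_ID[region])
```

  • The lookup places Exon 11a at Seq ID No 19 and Exon 12a at Seq ID No 21, which agrees with the cross-references given for variant-3 and variant-2.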
  • In addition, regulatory regions in the form of putative transcription factor binding sites of the genomic sequence have been identified (see FIG. 4). These binding sites are located in the 5′ UTR and Exon 1 of the porcine CMP-Neu5Ac hydroxylase gene, and include binding sites for transcription factors such as, for example, ETSF, MZF1, SF1, CMYB, MEF2, TATA, MEF2, NMP4, CAAT, AP1, BRN2, SATB1, ATF, GAT1, USF, WHN, NMP4, ZF5, NFKB, ZBP89, MOK2, ZF5, NFY, and MYCMAX.
  • TABLE 9
    Genomic Organizational Sequences
    ctgccagcctaagccacagccacagc 5′UTR Seq ID No 9
    aacgctgggtctgagccatgtctgca
    gcctatgccagagctccccgcagcgc
    cggatgcttaacccactgagcaaggc
    cagggattgaaccctcgtcctcatgg
    atagcagttgagttgtttccacggaa
    ctcttaggggaactcctgattatttt
    ttatttaaatttatatttctctgact
    ttttcgtgtgctcatcagccactgac
    tgtgtatctccattagtcatggtttg
    ttaactctgtcattcaaaccctcttc
    atccttgctacgcagataacatcatt
    ataataaaatcgtgcctgaagaccag
    tgacgcccccaagctaagttactgct
    tcccctggggggaaaaagaagcaccg
    cgcgggcgctgacacgaagtccgggc
    agaggaagacggggcagaggaagacg
    ggggagcagtgggagcagcgggcagg
    gcgcgggaagcactggggatgttccg
    cgttggcaggagggtgttgggcgagc
    tcccggtgatgcaggggggaggagcc
    ttttccgaagtagcgggacaagagcc
    acgggaaggaactgttctgagttccc
    agt
    CCCGACGTCCTGGCAGCGCCCAGGCA Exon 1 Seq ID No 10
    CTGTTATTGGTGCCTCCTGTGTCCAC
    GCGCTTCCCGGCCAGGCAGCCCTGGC
    GGATCCTATTTTCTGTTCCCCCGATT
    CTGGTACCTCTCCCTCCCGCCCTCGG
    TGCGCAGCCGTCCTCCTGCAGTGCCT
    GCTCCTCCAGGGGCGAAACCGATCAG
    GGATCAGGCCACCCGCCTCCT
    gtgagaaggcttcgccgctgctgccg Intron 1a Seq ID No 29
    ctggcgccggcagcgccctccacgca
    cttcgtagtgggcgcgcgccctcctg
    cattgtttctaaaagattttttttta
    tccgcttatgctatcagttactgagg
    aagtatttacaaatctactattattt
    tgaatttgcctttttctccttatagt
    ttatcagtatctcttgagactgttat
    tggtgcctgcaaatttaaaatgattg
    gggttttatgaggaagtgaacctttt
    atctttatgaaacgcctaactgaggc
    aatgttaattgcttaaaatactttct
    tattatcagtgtggccatgccagtgt
    cctcttggttagaatttgcctgat
    ctgccaaagctgggagatgggggaaa Intron 1b Seq ID No 30
    gtagagtgggttattgaaactgaata
    tagagttcagcatctaaaagcgaggt
    agtagaggaggaagctgtgtcaacgg
    aaatactgagctgggttcacatcctc
    tttctccacacag
    TCTAATGCCTTGTGGAAGCAAATGAG Exon 4 Seq ID No 11
    CCACAGAAGCTGAAGGAAAAACCACC
    ATTCTTTCTTAATACCTGGAGAGAGG
    CAACGACAGACTATGAGCAG
    gcaagtgagagggggctttagctgtc Intron 4 Seq ID No 31
    agggaaggcggagataaacccttgat
    gggtaggatggccattgaaaggaggg
    gagaaatttgccccagcaggtagcca
    ccaagcttggggacttggagggaggg
    ctttcaaacgtattttcataaaaaag
    acctgtggagctgtcaatgctcaggg
    attctctcttaaaatctaacagtatt
    aatctgctaaaacatttgccttttca
    tag
    CATCGAACAAACGACGGAGATCCTGT Exon 5 Seq ID No 12
    TGTGCCTCTCACCTGCCGAAGCTGCC
    AATCTCAAGGAAGGAATCAATTTTGT
    TCGAAATAAGAGCACTGGCAAGGATT
    ACATCTTATTTAAGAATAAGAGCCGC
    CTGAAGGCATGTAAGAACATGTGCAA
    GCACCAAGGAGGCCTCTTCATTAAAG
    ACATTGAGGATCTAAATGGAAG
    gtactgagaatcctttgctttctccc Intron 5 Seq ID No 32
    tggcgatcctttctcccaattaggtt
    tggcaggaaatgtgctcattgagaaa
    ttttaaatgatccaatcaacatgcta
    tttcccccagcacatgcctaactttt
    tcttaagctcctttacggcagctctc
    tgattttgatttatgaccttgactta
    atttcccatcctctctgaagaactat
    tgtttaaaatgtattcctagttgata
    aacagtgaaacttctaaggcacatgt
    gtgtgtgtgtgtgtgtgtgtgtgtgt
    ttaccagcttttatattcaaagactc
    aagcctcttttggatttcctttcctg
    ctctctcagaagtgtgtgtgtgaggt
    gagtgcttgtccaaacactgccctag
    aacagagagactttccctgatgaaaa
    cccgaaaaatggcagagctctagctg
    cacctggcctcaacagcggctcttct
    gatcatttcttggaagaacgagtgct
    ggtaccccttttccccagccccttga
    ttaaacctgcatatcgcttgcctccc
    catctcaggagcaattctaggaggga
    gggtgggctttcttttcaggattgac
    aaagctacccagcttgcaaaccaggg
    ggatctggggggggggtttgcacctg
    atgctcccccactgataatgaatgag
    ggattgaccccatcttttcaagcttt
    gcttcagcctaacttgactctcgtag
    tgtttcagccgtttccatattaggct
    tgtcttccaccgtgtcgtgtcgtcaa
    tcttatttctcaggtcatctgtgggc
    agtttagtgcgaatggactcagaggt
    aactggtagctgtccaagagctccct
    gctctaactgtatagaagatcaccac
    ccaagtctggaatcttcttacactgg
    cccacagacttgcatcactgcatact
    tagcttcagggcccagctcccaggtt
    aagtgctgtcatacctgtagcttgct
    tggctctgcagatagggttgctagat
    taggcaaatagagggtgcccagtcaa
    atttgcatttcagataaacaacgaat
    atatttttagttagatatgtttcagg
    cactgcatgggacatacttttggtag
    gcagcctactctggaagaacctcttg
    gttgtttgctgacagactgcttttga
    gtcccttgcatcttctgggtggtttc
    aagttagggagacctcagccataggt
    tgttctgtcaccaagaagcttctgca
    agcacgtgcaggccttgaggtcttcc
    gacttgtggcccggggactctgcttt
    ttctctgtccttttttctccttagtg
    ggccatgtcctgtggtgttgtcttag
    ccagttgtttaagggagtgttgcagc
    tttatgattaagagcatggtctttcc
    ttgcaaactgcttggtttagaagcct
    ggctccaccacttagcggctctgtga
    cctcggacacatttcttagcctttct
    gggcctcgctcttcttcctcataaag
    tgaaaatgaaagtagacaaagccttc
    tctgtctggctactgagaggatggag
    tgatttcatacacataaagcacttaa
    aataatgtctggcatatgatacatgc
    tcaataaatgtcacttacatttgcta
    ttattattactctgccatgatcttgt
    gtagcttaagaacagaggtctttaca
    ggaattcaggctgttcttgaatctgg
    cttgctcagcttaatatggtaattgc
    tttgccacagactggtcttcctctcc
    ttcacccaaagccttagggggtgaac
    gatcccagtttcaacctattctgttg
    gcaggctaacatggagatggcaccat
    cttagctctgctgcaggtggggagcc
    agattcacccagctttgctcccagat
    acagctccccaagcatttatatgctg
    aaactccatcccaagagcagtctaca
    tggtacactcccccatccatctctcc
    aaatttggctgcttctacttaggctc
    tctgtgcagcaattcacctgaaatat
    ctcttccacgatacagtcaagggcag
    tgacctacctgttccaccttcccttc
    ctcagccatttttcttctttgtacat
    aatcaagatcaggaactctcataagc
    tgtggtcctcattttgtcaatctaat
    ttcacagcctcttggcacatgaagct
    gtcctctctctcctttctgcctactg
    cccatgagcagttgtgacactgccac
    atttctcctttaacgacccagcctgc
    tgaatagctgcatttggaatgttttc
    aatttttgttaatttatttatttcat
    cttttttttttttttttttttttttt
    ttttttagggccgcacccatgggata
    tggaggttcccaggctagggatccaa
    tgggagctgtagctgctggcctacac
    cacagccacagcaatgcacaattcga
    gccaatctttgacctacaccagagct
    cacggcaacactggattcttaaccca
    ctgattgaggccagggatcaaactct
    cgtcctcatagatacgagtcagattc
    gttaacctctgagccatgatagttgt
    tagttactcattgatgagaaaggaag
    tgtcacaaaatatcctccataagtcg
    aagtttgaatatgttttctgccttgt
    tactagaaaagagcattaaaaattct
    tgattggaatgaagcttggaaaaaat
    cagcatagtttactgatatataagtg
    aaaatagaccttgttagtttaaacca
    tctgatatttctggtggaagacatat
    ttgtctgtaaaaaaaaaaaatcttga
    acctgtttaaaaaaaaaacttgactg
    gaaacactaccaaaatatgggagttc
    ctactgggacacagcagaaatgaatc
    taactagtatccatgaggacacaggt
    ttgatgcctggcctcgctaagtgggt
    taaggatatggtgttgctgcagctcc
    aattcaacccctatcctgggaacccc
    catatgccaccctaaaaagcaaaaag
    aaaggtgctgccctaaaaagcaaaaa
    gaaagaaagaaagacagccagacaga
    ctaccaaatatggagaggaaatggaa
    cttttaggccctatctccaactatca
    catccctatcaccgtctggtaagaaa
    tggaaaaaatattactaagcctcctt
    tgttgctacaattaatctgattctca
    ttctgaagcagtgttgccagagttaa
    caaataaaaatgcaaagctgggtagt
    taaatttgaattacagataaacaaat
    tttcagtatatgttcaatatcgtgta
    agacgttttaaaataattttttattt
    atctgaaatttatatttttcctgtat
    tttatctggcaaccatgatcagaaat
    ctttaaacaatcaggaagtctttttt
    cttagacaaatgaaaatttgagttga
    tcttaggtttagtacactatactagg
    ggccaagggttatagtgtgactatta
    aatcacagataatctttattactaca
    ttatttccttatactggccccacttg
    gatcttacccagcttagcttttgtat
    gagagtcatccttaaagatgacttta
    ttctttaaaaaaaaaaacaaatttta
    agggctgcacccatagcatatagaag
    ttcctaggctagcggtcaaattagag
    ctgcagctgccagcctatgccacagc
    cacagcaatgccagatctgagctgca
    tctgtgacctacactgcagcttgcag
    caatgctggatccttaacccattgaa
    caatgccagggattgaacacacatcc
    tcatggatactgctcaggttcctaac
    ctgctgagccacagttggaactccaa
    agcagactttattctgatggctctgc
    tgatctctaacacgttattttgtgcc
    atggtgtttatcttcactttactcaa
    gtcagggaaacacgaagagtctcata
    caggataaacccaaggagaaatgtgc
    aaagtcacatacaaatcaaactgaca
    aaaatcaaatacaaggaaaaaatatc
    ttcactttcaaaatcacctactgatg
    atgagtttatatttccttggatattt
    gaatattagctatttttttcctttca
    tgagttttgtgttcaaccaactacag
    tcgtttactttgatcacagaataatg
    catttaagccttaaatagattaatat
    ttattttcaccatttcataaacctaa
    gtacaatttccatccag
    GTCTGTTAAATGCACAAAACACAACT Exon 6 Seq ID No 13
    GGAAGTTAGATGTAAGCAGCATGAAG
    TATATCAATCCTCCTGGAAGCTTCTG
    TCAAGACGAACTGG
    gtaaataccatcaatactgatcaatg Intron 6 Seq ID No 33
    ttttctgctgttactgtcattggggt
    ccctcttgtcaacttgtttccaatct
    cattagaagccttggatgcattctga
    ttttaaactgaggtattttaaaagta
    accatcactgaaaattctaggcaagt
    tttctctaaaaaatcccttcattcat
    tcatttgttcagtaagtatttgatga
    gaccttaccatgtgtaaacattgcac
    taggtattaagaaatacaaagatgga
    taagatagagtcggcgtaaatgagat
    gatataatgagacgttataatgaaac
    tcacaattccagttgggaaataaagt
    ccttcaaattccatgactctttctgg
    cacacgttagaggctacagcttctgt
    gtgattctcatgctggctccacttcc
    actttttccttcttcctactcaagaa
    agcctatagaaatatgagtaagaagg
    gcttaatcataggaataaatttgtct
    ctgttctaagtgattaaaaatgtctt
    tatcagtataaaaagttacttgggaa
    gattcttaaaactgcttttacacact
    gttctagaatgactgttatataaata
    aaaaagtagatttgatctaacacaat
    taaatgacctttggaaatattgacta
    attctcaccttgcccctcaaagggat
    gcctgaaccatttccttcttttgcca
    gaaagcccccaccctttgtctgttga
    cctagcctaggaaatcttcagatcac
    gttgttagcacgaactggttacatgt
    gctgtacaaatactatttaattcatc
    tgattaaaaaaaaagagataagaagc
    aaaagtttgactatcttaaactgttt
    gcgtaggtgagaggacaattgaccat
    ctactttatgagtatgtaacccagaa
    acttaaagctccttaagggagctaag
    tcttttggataagacctatagtgaga
    ccttttagcaaaatggttaagactga
    atggagctcactagcgtgggttcata
    tcctgatgctcaaacacgcaattaaa
    tgactttaggtgggttagtctctgtt
    ccttagtttcctcaatgggagataat
    attggtagtagcgattttactgggtt
    gttgaaagaacatctgttaaatgttc
    agaacgtgttacgacagagtacagag
    taatgatttgcttgtatatgtatgac
    tcaaatagtctgccatatgccttgtg
    actgggtcctgtggagcaggaaggag
    ggatttcccacccagcagaaagttgg
    gtaaactggaaaatagactgaggcca
    ggaaatgatgcaaagcgttgatgttc
    actgccacggcaggtgaagggcaggg
    ccagagttgtcagtagggtcagggga
    ggactggaaataaccaagacccactg
    cacttttcagcctttgctccagtaag
    gtaatgttgtgagagtagaaaatttt
    gttaacagaacccacttttcagtaca
    gtgctaccaatactgtagtgatttca
    taccacatcccaagaaagaaaaagat
    ggctcaatcccatgtgagctgagatt
    atttggttttattgttaaataaatag
    cattgtgtggtcatcattaaaaaagg
    tagatgttaggaaagtagaaggaaga
    agactctcacctacattttcatcact
    gttttggtatctgccagttgtcacct
    tggtccccttccccgcctctcccctg
    cctcctcttcctccttctcctttttt
    tggaatacaattcaggtaccataaaa
    tttacccttttagagtgtttgactca
    atggtttttagtattttcacatgttg
    tgctattactatcactatataattcc
    aggtcattcacatcaccccccaaaga
    aaccttctaactattagcagtccatt
    cccttcttccctcagcccctggcaac
    cactaatctacttactgtctccatgg
    atgttcctatattgaatcaagctagc
    ataaaccccacttgctcatggtcata
    attcttttttatagtgctaaattaca
    tttgctaatattcaattaaggatttc
    tatgtccatattcataaggaatattg
    gtgtgtagttttctctttgtgtgata
    tctttgtctggttgggggatcagagt
    aataattactgctctcatagaatgaa
    ttgagaagtgttccctccttttctat
    ttattggaagagtttgtgaagtatat
    tggtattgattcttctttaaacattt
    ggtcagattcaccagtgaagccatct
    gggccatggctaatctttgtgaaaag
    ttttttgattactaattaaatctctt
    taatttgttatgggtctgctcctcag
    acgttctagttcttcttgagtcagtt
    ttgttcatttgtttcttcctaggact
    ttctccctttcatttggattatttag
    attgatagtaatatcccccttttaat
    tcctggctgtagtaatttgggtcttt
    tctcttttttcttggtcagtttagct
    aaaggtttgtaattgtattaatcttt
    tcaaataactaacttttttgttttgt
    ttgttttttgttttttgttttttgtt
    ttttgtttttttttgctttttaaggc
    tgcacctgaggcatatggaagttctc
    aggctagaggtctaatcggagctaca
    gctgctggcctataccacaaccatag
    caatgccagattcaagctgcatctgc
    gacctacaccacaactcggccaggga
    tcacacccgcaacctcatggttccta
    gtcggatttgttaaccactgtgccac
    gacgggaactcccgcccatttttttt
    aacacctcatactttaacataaagat
    gggcttcacatggactgatagctcaa
    atgaggaaggtaagactatgaaagta
    atggaagaaatgtagactatttttgt
    gacctagagattactgatacttcttg
    acttttcaaacaatacttcaaaagta
    cagcccaaagggaaaaaagaaagaaa
    aaagaaacacacatatacacaaacct
    agtgaataagatatcatcgatacact
    acagatttctatgaactggaagaccc
    catggacaaagttaaagaacatatga
    tagtttgagtgattattttgcaatat
    ttacaaccaatgagggaatattatcc
    agcttataggaggaagtaatgcaaat
    cgacaagaaaaagataggaaacccaa
    tataaaaattaagaaaatacaaaaat
    taagaaaggatatgaactagcatttt
    acaaaagaaaaatctccaaaagtcaa
    tcagcacatgaaaatatgctcaaacc
    taattattagaaaactacagactgaa
    gcaatgaggtgctttactttacatct
    ttttgactgataaaaagttagaaaca
    aaggtgatatcaaatgtcagggataa
    aaggatatagaaatcgtcatgcctgt
    ggtgggagtatggccggtgcagtcat
    gtgggaaggtaatctgacagtggtta
    ggcagagcaggtttatgaatacactg
    tggcccatcaatcccacgcctgttta
    tgtaccaaagaaatcctgttgtggca
    gaatctatgggtccacccctgggagc
    atgaattaataaaatgtggcaccagg
    gtgtgtgaaactccagctagagatga
    gatgtccacatggcaacatgaatgca
    tcttagaaacatagatttgagtgaaa
    aagagtaagaaacagccgggaaaccc
    aataccatttataaaaattaaagatg
    cacacatacaatgtagtaaatatttt
    gcatgaactttcaaatggttgcctac
    agggggggagagtaaagaagagtaga
    aaacaaagataaagggagtaagtaag
    tagctctgcctggactgaatataatg
    tgtcatgaactgagaaatatggttaa
    cataatcctcttaacttgaggtccta
    aatgaatgaatgagtccactattcat
    ttacccattctttaatgtgtattgca
    ttataatccatttttttagaaccaac
    gaattttgttcccataactactaatc
    agcctgccttttctccctcattccct
    tatcagctcaggggcattcctagttt
    ttcaaacgttcctcatttgaaccaaa
    aatagcatcattgtttaaattatact
    tgttttcaaatacgatgcttatatat
    tccaagtgtgtttgcccattttctta
    ggtggtagaaatttttcattctactt
    ttctatctactcagattttcccgttg
    gaattatttccattgctattaaactt
    agaagtcccccctgtgatatgccatt
    tttttcatactttttaagcacttggt
    tgcttttctttgtgtctttaagcacc
    tagaatacttataaccattgcacagc
    actgtgtatcaggcagcccttcctct
    tccactaatttatggtccttctctta
    gactatattaaactgttatttaatta
    ggatcctctcttcgtccttatgattt
    aattattatagttttctaatatgttt
    ttattataattcctcttcattattcc
    tccctattaaaaattttaatgaattc
    catttgtttgttcttctagttaaata
    ttaagtcataatccaaataacttaga
    tgtcattagtttatgtggtcaaagta
    aggataccacatctttatagatgcag
    gcagttggcagatgtcatgattttct
    tcagtgcataaatgcaatttatcttt
    gagcaaggggcataaaaacttttatg
    gtattggctttgaaataatagttaag
    aactgcagactcagtttttcctgctt
    ttcttgaaaaagaacacttctaaaga
    aggaaaatccttaagcatggatatcg
    atgtaattttctgaaagtctcctgta
    attccttgggatttttgttgttgttt
    gttggtcggtttttttgggtttttgt
    ttgtttgttttgttttgttttgtttt
    gcttttagggctgcacctgtggcata
    tggaagttcccaggctaggggtccaa
    ctggagctacagctgccagcctactc
    cacagccacagcaacatgggatccta
    gctgcatctgtgacctaaccacagct
    cttggtaatgccagattgttaaccca
    ctgagcaatgccagagatcgaatctg
    cctcctcatggacactagtcagatta
    gtttctgctgagccacaatgggaatt
    cccaattccttgtatttttgaactgg
    ttatgtgctagcatataattttgttt
    cttgaatctttgtgggtttttttttt
    tttttttttttgtctcttgtcttttt
    aaggctgcacccacagcatatggagg
    ttcccaggctagaggtcaaattggag
    ctacagctgccagcctacacaacaac
    tgcagcaaagtggggcccaacttata
    tgacagttcgtggcaatgccggattc
    ctaacccactgagcagggccagggat
    cgaacctgagtttccagtcagtttcg
    ttaaccactgagccatgatagtaact
    cctgtttgttcagtcttgaacctcct
    ttttaattctttattccttgagggtg
    aaataattgccataataatactatca
    tttattacatgccttctctgtgctag
    gcatagtgacactttaggatttatta
    tatcacttaatccctacaacaactct
    gcaaagtatgtatcataatcctattt
    gacagatcaggaaattgcagcccagg
    atgcagataatatgcatccatcacaa
    gtgactagatatagtccctctgctat
    tcagcagggtctcattgcctttccat
    tccaaatgcaatagtttgcatctatt
    gtatatgtgttttggggtttttttgt
    ctttttttttttttttgtcttttctg
    gggcctcacccttggcataggtaggt
    tcccaggctaggggtcaaattgaagc
    tgcagctgccagcctacaccacagcc
    acagcaactcgggatctgagcctcat
    ctgcaacctacaccaaagctcacggc
    aacaccggatccttaacccactgagt
    gaggccagagatcaaaccggcaacct
    catggttcctagtcggattcattaac
    cactgagccacgatgggaactcccta
    aatgcaatagtttgctctattaaccc
    caaactcccagtccatcccactccct
    cctcctccctcttggcaaccacaagt
    ctgttctccatgtccatgattttctt
    ttctggggaaagtttcatttgtgcca
    tttttcattttacgggtaatttttac
    ttcagtttcttccactagcagttgtc
    ttaaagtgagtataattaatattcat
    ttggaaaatgtaagcaaaacattttt
    taaagggccatgcccacagcatatga
    aagtttctgggccaggggttgaatcc
    aggctccaagttgcagctgtgcccta
    cactgcagctgggcaatgctggatcc
    tttaacccactgtgcccggctaggga
    tcaaacctgcatttccacagctaccc
    gagccattgcagttggattcttaacc
    cactgcactacagtgggaactcccac
    aaaacattttttaatgtcctttgaat
    aaagtaggaaagtgctcgtctttgag
    ggcagggcggcaatgccatttccaca
    aggtttgctttggcttgggacctcat
    ctgctgtcatttagtaatgaataaaa
    ttgctgacagtaataggattaactgt
    gtgtggagatagccagggttagagat
    aaaaacactggagaagtcaaataagt
    tgctcgaggtcctctagctaataagc
    tattaagtgggagagtgagggctaga
    aacaggccatctgtctcccaagcaca
    tgtccattagtggtttgctgatagcc
    ttccagaacaacagagaggactctca
    aacatggtcttgcctccctccaattg
    atcccctccatgtgcctcacagcggg
    tctttctaaaattaagttctgatttt
    aattctcccttgctatagcacttagg
    tatggctttcagccgtgcaataaaaa
    gcaggcaagagtggctcaatcatata
    ggaggttgtttttcttagatcccaag
    caggtaatcctgggcattatggttgt
    tctgcgtttatcaaggagccaaattc
    tctatcacctcctgttctatcctcct
    cagtatctggctctattcttcagcat
    ctcaagatggcttgtgctcctccaag
    catggcagtcaaattccacacaagag
    ggggaaatatgaagggcagacagtgc
    tggtctcctgagctgtccctctttgt
    cggggaaataaatgtattccttcaag
    tcccgtgagacttctgaagtagacgt
    ctgcttacgtctcacccaccagaact
    atgtaaactgcacatagtgctaggtc
    tacatagccactcataactgccaggg
    ggtgggaaatctttaaataggtgtac
    caccacacaattaggatgctaatagt
    aagggagaaggagagaataggttttg
    cgcaagccaccagcatgcctgccaca
    attgcttaaaattcttcattgacccc
    tcattgccacaggatgaaatccaaac
    gccttcttagttgggaatctgaccta
    cctgtctctcccacctggttcagaca
    ccattctccttggtcataaaattcca
    gtcatttgtgaacatccagctccccc
    atgcctccatgcctttgcacatgctg
    ttcttttatcttttatgttgtccttt
    tatcttttatccaaaagagatatccc
    atcatcacatctcttttgtcagcccc
    caaatactttgtctttcaagttcagc
    tggaggattacctcctatttgaaatc
    agctttgtctcttacaaccaaacaag
    gttttccttccgagacactcccacag
    caccttgaactcatctctatcaatca
    ttcatttgataatgaagttgttggtg
    gtatgcctgtgtctctgacacatctg
    cgatctcatgagttccttaagtggaa
    tgtgaatagcgggatgaacagtattg
    gtcttcagccctcatctctgcagatg
    ttgcttgacccaaatgagcgttgcct
    tttattttgattttgctttgatttgt
    ctactccatgtacttgagccatgcat
    ttctgtcttagcgatgctttttaaaa
    gtcattttttggttgattatccagat
    ttgtccacctttgcttctag
    TTGTAGAAAAGGATGAAGAAAATGGA Exon 7 Seq ID No 14
    GTTTTGCTTCTAGAACTAAATCCTCC
    TAACCCGTGGGATTCAGAACCCAGAT
    CTCCTGAAGATTTGGCATTTGGGGAA
    GTGCAG
    gtaaggaaatgttaaattgcaatatt Intron 7 Seq ID No 34
    cttaaaaacacaaataaagctaacat
    atcaatttatatatatatatatatat
    atatttttttttttttttacatctta
    tattaccttgagtattcttggaagtg
    gctagttaggacatataataaagtta
    ttctgaagtctttttttttctttttc
    catggtgagcagtggcttgatgtgga
    tctcagctcccagacgaggcactgaa
    cctgagccgcagtggtgaaagcacca
    agttctagccactagaccaccaggga
    actccctattctaaattcttgagcac
    attatttaggaacctcaggaacttgg
    caggattacaggaaatatatctagat
    ttaaaaaaaaatcttttaacagaggt
    cccaaaggagagtcatgcacagctat
    gggaggaagttcagaaactgcccttg
    ctaccagatcactgtcagataaaatg
    gccagctacatgtttctgcacattgc
    cctaagatctttacaaacttttctgt
    gcatttttccacttttaaaagaaaat
    ttcggggttcctgttgttgctcagtg
    gttaacgaacccaactagtatccatg
    gggacaggggttcgagccctggcctc
    actcagtgggttaagaatctggcatt
    gctgtggctgtggcgtaggctggcgg
    ctacagctcagattggacccctagcc
    tgagaacctccatatgccgcaggtat
    ggccctaaaaaaaaaaaaaaagagag
    agagagaatttcctccagaaaaaaca
    ctttggtagtttgggagaagtaaaca
    accaaaaattaatttttctggagtat
    tcgggaagcttgtaaaaatgggctct
    tacttttttgaggagacaaatgggaa
    cctacccagaagaggcacaatcacct
    gcatttgatttcttgacctctcccta
    ccttctttgctggctttccacatttg
    gatttctgtgaccttatctctgctcc
    ttggtgttttcatttttcctgtggac
    gtgccagactatgggaagggagtaag
    gcgttgatttagaatcctgtagtctc
    tgcctgtctctagtcattgttttcac
    ccttctcaaaggaccttgacatcctg
    agtgagtccgcaagtaatttagggga
    gaagccttagaagccagtgcagccag
    gctacatgactgtgtccacccactgg
    aaccagtcatttttatacctattcac
    agcccccctaccatttaaatccccag
    aggtctgccataacatctgtaactcc
    ctttcctggtaaattgtgttctaaaa
    gactggtaacaaaagatattctgtgg
    tacagagcataattaaatacctggga
    gctgatttgagtggggtaaatcaact
    ggtttgacccctaaaacccaccatga
    gcatttctgttctaataaagtaatgc
    ccgtgctgggaagttctacggaaatg
    ctcctgctgtgtctttcttgagtcct
    gtgtcattgaacatgcttaggagcaa
    aggtcccccatgtggcttgtctgcta
    accagcccagttccttgttctggctg
    gtaatgatccgatcatctgaatctca
    ctgtcttccaacag
    ATCACGTACCTTACTCACGCCTGCAT Exon 8 Seq ID No 15
    GGACCTCAAGCTGGGGGACAAGAGAA
    TGGTGTTCGACCCTTGGTTAATCGGT
    CCTGCTTTTGCGCGAGGATGGTGGTT
    ACTACACGAGCCTCCATCTGATTGGC
    TGGAGAGGCTGAGCCGCGCAGACTTA
    ATTTACATCAGTCACATGCACTCAGA
    CCACCTGAG
    gtaaggaagggtgagccctcaactcc Intron 8 Seq ID No 35
    gaagaaaatgctgcaataaaagcact
    gttggttttcagctttttttgtaatc
    actgctcattctgaggtagattcgct
    tgggctgataaaaagagaactaattc
    agataaatgcttgcatttgcatagcc
    tctttttttaaaaacttttttttttt
    ttttttttttttggcttttcagggct
    gaacctgtggcatatggaggttccca
    ggctaggggtcgaatcagagctgtag
    ccccgggcctatgccactgccatagc
    aacatgcatagcctcctttttaaagt
    gccttcctgttttataccattgggat
    gtgagaagagctattgtggaaangag
    catggggtnataaccctggacctctc
    acgtcctaccctcaggntagtgggaa
    aactctgagtttaaggacatcaaagt
    gactcctttttagttacattatggng
    gaatcagcncatatttttacaagggg
    cggagngtaanctgttggagtttaca
    agacatatggtggcattgcaactact
    taaccctactattatagcacaaaagc
    agccatagtcggtcctgaaggagcct
    gatgccttcagctttataggcaatga
    cgtgtgaatatcacaaacagtttcct
    gtgtcaccaaacatgattgccttttg
    atttccctttcaaccctttaaaaaaa
    ggtaaaagcccttcttagcattcagc
    agcaggtcgctgtgttttgccaactc
    ctgatctgtagcatttcgacaacact
    gagctctcaacttttgaaccctgagt
    ccaccacatccttcagtgaaaccaga
    gccatgtgatactaaggatagaaacg
    gaaacttcctgaatccaggcgatcaa
    ataggagggagaaagaggaactttca
    ttgacaaaaccacaaatattgtgaat
    ggactgttacaaatattgtgaatgct
    cctattcccaaccccctggcttcatt
    acagggtcctatgtgttcatccttat
    tgagaaatttgtattgctactgccag
    gttgccaatacccagcggtgcccatg
    gtgttctaaaatgaagcaatttcaac
    tttatttttttttcctgtgactttac
    atgacaagttcacatgaaggatatac
    tttgatagtaatgtccatggttaggg
    aatatacattgtttgctggttgactg
    gcccctggatttttctattgaaagtc
    catgagatctcgaaggcacaggtgtg
    ttctctcgctttttaaggaaagggtt
    taaaaacttaagtaattaacagcttt
    agtaacaaattacctataacacactt
    aaaaaccgaataccacccactggagt
    attgtgctacgattaaaaatctactt
    gtctactacatgatatctttgtccca
    cagaaggttctggaaccaaacttgta
    atttcaggattatgagagccctgagt
    tcacgcattgtgtaataactatgttg
    tgtggtagtcaatttgtacagcttgc
    ttagagagaacaatgtcaagttaagg
    aggcgattgctttatagtgcctgtca
    caagatgccattgccattgtcctagc
    aagagatattctatgggagtatacta
    cattttagtgaggataagaacttttt
    atggcatttagtccggtcatttccca
    accactgtcctgaaaaccaatttcat
    tttgatttcaggggcttgtgtgggca
    aagttgccaggcattaaaaagccact
    tctcaactgtagtatcacaatgcttt
    agttgggtagtgtattgcagatagct
    tatggctgaaaagttaccaagccttg
    cagttttcactcctttgagtttattt
    ccttgacagaattgaccctgagtttt
    ttgactcttacctgctcaactaataa
    acaccagagtcatttatctccattgc
    tcttgtctgacctttatttaccgaat
    aatgccttatgggttcacaaaaacaa
    ggggggagggggccagcatgccttag
    aaactgtctttagtcaagaaatgnga
    ttttattatgtaaatatatgagtatt
    ataatagatagtgttattaatagaca
    ccagcaagaattgtcaataatttaaa
    aatcacaaattaaaatacatccatgt
    tagnatcatttatcctaactcccaaa
    gccctttaaagtggaagatttagatg
    ttaacccagagattaaagacatgttc
    aaagaatccttgatttttttttgaat
    cccttgtttttagagaagaaaaccta
    atgattttccccctctggattctaca
    tattaaatatagttttggaacttgaa
    tattagtatggttaataagtgctgat
    atgctgattttgtttatatttttctt
    atgagtaaatatcctatatcaccaga
    cattatagtctatgtacaaatatgat
    tcttaaacctgatagcacattcatta
    gagttggaattgcctttttttttttt
    ttttttacagttgcacctgcaacata
    tgaaagttcccaggctaggggttgaa
    tccaagctgcagctgccaccctacat
    tacagccgtagtaacagcagatccga
    gctgcatctgcaacctatgctgcagc
    tcagggcaatgccagatccactgagt
    gaagccagggatggaacttgcatcct
    catagagacaacgtcgtgtccttaac
    ccactgagccagaacaggaactccag
    aatttcctttcaatagaagaagcacc
    aagtttaggatcagaaagcctgaatt
    tgaataccaatttactatttgttagt
    catatatttctgagtgtgtttcctca
    tttattaaaagcagactaaaagatga
    gagggtcttttgttgagaatcaaata
    caataacatgtgaaagtgtgtaacac
    tatgattgaaatatacctacacagcc
    atttatttgtttattgttcatgtttt
    gccacccacacagtagtatataatcc
    ttttatgtaataaatgctaataatga
    aagttggcaacttatgtaagtactca
    aaatgctggaggtcatgggatactga
    ctgggatactacagaggtaatgtcat
    ttcctctgcgctaaacttattgtctt
    gtagttagggactgactctctttagg
    acaaggagttcattctgtataccatg
    tgtggctatcacccttcgaagttgaa
    aaactgccccagggtgggcacccatc
    cgttctcttagatatatggccgagac
    ctttctctcactgggagggaaccaca
    ctgaggaatgagaaaaaaaaaaggaa
    aatcaagatgaaaccagaaacctctt
    tggcataacttctccactctgtactt
    tttgttagaactacccttgcacaaag
    cagcatcagtgtggaagacagaattt
    gcacacctggtttgatatacatgccg
    tggtatatgggatgttctaacaataa
    agaggactctcccaggaaatctcctc
    actgttatagtcagccttgaggaaag
    agctcttcttttggactctggggaga
    gtctagtttttcagttccttgcttct
    cggtcaacgtgttggtgtaaggatca
    cactctctcttatactagataattct
    attttttcacctttcaacctgtctat
    ccttctgaccctag
    TTACCCAACACTGAAGAAGCTTGCTG Exon 9 Seq ID No 16
    AGAGAAGACCAGATGTTCCCATTTAT
    GTTGGCAACACGGAAAGACCTGTATT
    TTGGAATCTGAATCAGAGTGGCGTCC
    AGTTGACTAATATCAATGTAGTGCCA
    TTTGGAATATGGCAGCAG
    gtctgtgttctttccacatgtttggg Intron 9 Seq ID No 36
    ttatcctttctgggataaatttgagg
    cgagatagaaactttaagactaaaga
    aacaatggcctactttttttgtacat
    ggtcctgtgtaaatctctatttgagc
    tgaaataagatggtcttcctctccaa
    ttatccatggtatgactctgatggat
    aacaaatccagttctgaaaaaagggg
    atttctttccagaagagaggacagtt
    tcttcaaatattgaattaaaagcaaa
    atagatgtaaaccgttgttggtttta
    ttgttgaattccag
    GTAGACAAAAATCTTCGATTCATGAT Exon 10 Seq ID No 17
    CTTGATGGATGGCGTTCATCCTGAGA
    TGGACACTTGCATTATTGTGGAATAC
    AAAG
    gtattttcttgccctcatcagcatga Intron 10 Seq ID No 37
    aattgctcttggtagaaaggataata
    atagttatccaaaacatcatcctatg
    ttcatctgtttcttccctcttcattt
    tccatagagtacagtatattctatct
    ctgtcttaggaaaatggactgtcatt
    catataatcttacagagaatcaatta
    gtaatgtactctatgccgtgacaggt
    gcgaaggttttttttgaaggcaacag
    ataaaaatatcctatatttcacctat
    tgtaatttccttaaaactgacattat
    tgaataaatgttttactttcatcttg
    aatattattatgttatggaatcatac
    actttaccccaataatcatcgaaaag
    aatttccaaaaggttgagagagttgt
    gttgatctgattactttcctctgcat
    cctttgagcttaacctttgaatatag
    tttgctaaggaaagtagtctgtttat
    gatcctggagtggaatcaggctaagt
    gtcctcattcagaacccactgaatca
    gacagaatgaatttatttccttgaaa
    gttcaaaatgtgtcactcagagtata
    aattttcaaatcttactctctctttt
    ccttggatgtgagcaattcttcgata
    attgaatgaggcagattatatagact
    tacatggaagactgttggcctgagaa
    ttcaaactatggtgttcaagacttca
    cngngagtccgatgccatttgtttcc
    cacag
    GTCATAAAATACTCAATACAGTGGAT Exon 11 Seq ID No 18
    TGCACCAGACCCAATGGAGGAAGGCT
    GCCTATGAAGGTTGCATTAATGATGA
    GTGATTTTGCTGGAGGAGCTTCAGGC
    TTTCCAATGACTTTCAGTGGTGGAAA
    ATTTACTG
    GTAATTCTTTATATCAAAATGATGCC Exon 11a Seq ID No 19
    AAGGAGTTGGCATGGCACTTTGCTAA
    ATGCTGTGTGAATCAATACAAAGATA
    ATTAGGACATGGTTCTTCCTCACAAG
    AGGTGTGCAATCTTATTGGGAAATCA
    TACTTGCAAGTCACAAATATAGACTA
    AAGTTTCCAGCTGAGAATATGCTGAT
    GGAGCATGAAACACTAAGGAGACAGG
    GAGAATCTCAGGAAAAATCAAGAATA
    ATTTGGATCAAATGGATTCCTGACAT
    AGAACATAGAGCTGATCAGAAAGAGT
    CTGACATTGGTAATCCAGGCTTAAGT
    GCTCTTTGTATGTGGTTCAGAACAGA
    GTGTGGGCAGCCTGAGGGGGATACAT
    ACCCTTGACCTCGTGGAAAGCTCATA
    CGGGGGAGGGATGAGGCTAAGGAAGC
    CCCTCTAAAGTGTGGGATTACGAGAG
    GTTGGGGGGGTGGTAGGGAAAATAGT
    GGTCAAAGAGTATAAACTTCCAGTTA
    CAAGATGAATAAATTCTAGGGGTATA
    ATAACAGCATGGCACTATAGATAGCA
    TATTGTACTATATACTGGAAGTGCTG
    AGAGTAGATCTTACATGTTCTAACCA
    CACACACACACACACACACACACACA
    CCACACACACACACCACACACACACA
    CGTGCACACAAACAGAAATGGTAATT
    ATGTGAGGTGATGGCGGTGTTAACTA
    ACTTTATTGTGGTCATCATTTAGCCA
    TACATGCATGTCATGAAATCACCATG
    TTGTACACCTTAAAGTTATGTAATAC
    TAGATGTCAGTTATATCTCAAAGCTA
    GAAAAAATGTGGGGACCAAGGCAGAA
    GCTCTTCTGCTCTGTGTCTAAGGGTG
    GTTCTGGGGCTGGGATGGGGAGGATG
    GTTAAGTGGTATATTTTTTTCATACC
    TTTGCTCAGTACTATCATTGTAAGTG
    TTCAATATATGTCTGCTTAATAAATT
    AATGTTTTTAGTAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAA
    gtaattctttatatcaaaatgatgcc Intron 11 Seq ID No 38
    aaggagttggcatggcactttgctaa
    atgctgtgtgaatcaatacaaagata
    attaggacatggttcttcctcacaag
    aggtgtgcaatcttattgggaaatca
    tacttgcaagtcacaaatatagacta
    aagtttccagctgagaatatgctgat
    ggagcatgaaacactaaggagacagg
    gagaatctcaggaaaaatcaagaata
    atttggatcaaatggattcctgacat
    agaacatagagctgatcagaaagagt
    ctgacattggtaatccaggcttaagt
    gctctttgtatgtggttcagaacaga
    gtgtgggcagcctgagggggatacat
    acccttgacctcgtggaaagctcata
    cgggggagggatgaggctaaggaagc
    ccctctaaagtgtgggattacgagag
    gttgggggggtggtagggaaaatagt
    ggtcaaagagtataaacttccagtta
    caagatgaataaattctaggggtata
    ataacagcatggcactatagatagca
    tattgtactatatactggaagtgctg
    agagtagatcttacatgttctaacca
    cacacacacacacacacacacacaca
    ccacacacacacaccacacacacaca
    cgtgcacacaaacagaaatggtaatt
    atgtgaggtgatggcggtgttaacta
    actttattgtggtcatcatttagcca
    tacatgcatgtcatgaaatcaccatg
    ttgtacaccttaaagttatgtaatac
    tagatgtcagttatatctcaaagcta
    gaaaaaatgtggggaccaaggcagaa
    gctcttctgctctgtgtctaagggtg
    gttctggggctgggatggggaggatg
    gttaagtggtatatttttttcatacc
    tttgctcagtactatcattgtaagtg
    ttcaatatatgtctgcttaataaatt
    aatgtttttagtaagtaatctctgtt
    tagtaatgtgtcagaaatgccctact
    tgcaataggaagaaaacctgtccagt
    cccttccttttttctgtaagtctgat
    ttcattgcctcccagaatgcatcacc
    atgtgagagatagagggaaggtgctg
    tccttatggggttaacagtgtgacta
    gggaggcaaaatatacctactaaagg
    gtggtagcataattcagttcttatgt
    gagtatgtgtatgtgtgtgagtatgt
    gcacatgcacatacattttaaaaggt
    ctgtaatatactaacatgttcatagt
    ggttacacctagcttataggtaacat
    tttttcccctgtatccttgtttgtgt
    ttatcaaattttcataacagtaatgg
    tagaaggagtacctgacatggtacca
    tacatgctnggncctgcctaatttct
    cnatttcctttattgcccataccccc
    attgcttgacaagcataagtccatac
    tggcttgttttcgttcctcagactca
    gtacaccatgtagctccatgccctgg
    gtctttgtatgtgctatttctactgc
    ttagagtgctattgcccctgaccacc
    acgtggtcagcaacttctcttctgcg
    tctgtgtctatggtctatgattccag
    atgtcatcttcactaactacccttct
    aatatgcccttccatcccacccgtcc
    tcatccttaccccagccactctctat
    ttggtggctctgttttattttcttcc
    tagctcatcactctttgaaatgaact
    tatttacttattcaatatttgcttct
    ttcactagaatgaatgctccatgaga
    gcagggacctgctttatcttgctcgc
    cactgtattcacagtgcctagaacta
    cgtctggcacatagtaggtgctcaat
    aaatatcgatcaaatgaaagaatgag
    caaacgaacaaatgaacaacacgtga
    ggtaggcatcatgattccatcaacag
    aggagaaaaccagacttaaagnaatg
    aagtggnggagctgcatttgatcttg
    actgactccaacatccatgctcttga
    ccactgtgcatctccagagtgtaatg
    aacatactttacttttatattccacc
    aaaataacaaagccatgcccatgtta
    gtagagagttaatcgacagtgccctt
    aaaatatgcatgcacccagggtacaa
    ctatgcatgctgccctgtgttttcag
    ttggatccaaatgaattgccgtaaac
    aaagaggggattcaatgtctttgact
    agtttgggatattttcctagtaacca
    actttgcaaaataaagccactaatga
    caaggagctttgttctacttctgcat
    cactcaactgtcaatttttatctctt
    gcaagacttctaatctactagaactt
    ttgtttttctgtgatttctgaacaga
    gaagactaatccaaaccctgtcattc
    cag
    AGGAATGGAAAGCCCAATTCATTAAA Exon 12 Seq ID No 20
    ACAGAAAGGAAGAAACTCCTGAACTA
    CAAGGCTCGGCTGGTGAAGGACCTAC
    AACCCAGAATTTACTGCCCCTTTCCT
    GGGTATTTCGTGGAATCCCACCCAGC
    AGACAA
    GTATGGCTGGATATTTTATATAACGT Exon 12a Seq ID No 21
    GTTTACGCATAAGTTAATATATGCTG
    AATGAGTGATTTAGCTGTGAAACAAC
    ATGAAATGAGAAAGAATGATTAGTAG
    GGGTCTGGAGCTTATTTTAACAAGCA
    GCCTGAAAACAGAGAGTATGAATAAA
    AAAAATTAAATAC
    gtatggctggatattttatataacgt Intron 12 Seq ID No 39
    gtttacgcataagttaatatatgctg
    aatgagtgatttagctgtgaaacaac
    atgaaatgagaaagaatgattagtag
    gggtctggagcttattttaacaagca
    gcctgaaaacagagagtatgaataaa
    aaaaattaaatacaagagtgtgctat
    taccaattatgtataatagtcttgta
    catctaacttcaattccaatcactat
    atgcttatactaaaaaacgaagtata
    gagtcaaccttctttgactaacagct
    cttccctagtcagggacattagctca
    agtatagtctttatttttcctggggt
    aagaaaagaaggattgggaagtagga
    atgcaaagaaataaaaaataattctg
    tcattgttcaaataagaatgtcatct
    gaaaataaactgccttacatgggaat
    gctcttatttgtcag
    GTATATTAAGGAAACAAACATCAAAA Exon 13 Seq ID No 22
    ATGACCCAAATGAACTCAACAATCTT
    ATCAAGAAGAATTCTGAGGTGGTAAC
    CTGGACCCCAAGACCTGGAGCCACTC
    TTGATCTGGGTAGGATGCTAAAGGAC
    CCAACAGACAG
    gtttgacttgaatatttacagggaac Intron 13 Seq ID No 40
    aaaaatgatttctgaattttttcatg
    tttatgagaaaataaagggcatacct
    atggcctcttggcaggtccctgtttg
    taggaatattaagtttttcttgacta
    gcatcctgagcttgtcatgcattaag
    atctacacaccaccctttaaagtggg
    agtcttactgtataaaataaactatt
    aaataagtatctttcaactctggggt
    ggggggggagactgagttttttcaca
    gtcctatataataattttcttatcct
    ataaaataattaggagttcccgtagt
    ggctcagcaatagcaaacccgactag
    tatcgatgaggatgcgggttcgattc
    ctggcccccctcagtgggttaaggat
    ctggcattgccgtgagctgtggtgta
    ggtggcagacacggctcagatcccac
    gttactgtggctgtggcataggccag
    cagctccagctctgattagaccctta
    gcctgggaacttccatatgctgtggg
    tgtggccttgaaaaaaaataaataaa
    taagataattactcaaatgttttcct
    tgtctcagaaccttacttcaggataa
    agagtgagaaagttttttttatgaag
    ggccattattacagctcaaaaataag
    ttgtcttcagcaagtagaaagcaata
    agcctgagagttagtgttcctatcag
    tgtaaatattacctcctcgccaatcc
    ccagacagtccatttgaacaattaac
    ggtgccctgggagtacagttcagaaa
    cattaatgtggatgttccagacctgt
    atttttataagtacttgtcttgagcc
    ggatggaaccatcattcctcaccatt
    atttagaagtggactgtgactctgtt
    ggagatcagggcacacggttaccaaa
    agcacacccttctcctggccttacct
    ttgcaaagctggggtctgggacacag
    tcagctgattatacccttttactaac
    ttcccacagctcaaatctggtcaatt
    ctccttcacaaatctcttaaaaatcc
    atcactcacctccagcctcttctgct
    gtggccttgattcagcctctcacaat
    ttttttttaaccagaattctggcagt
    ggcccctgacttgcctctgtgctccc
    agccccgctgtcctctgatccatcct
    ccatgccagcctttttcaatctgctg
    gtcacgattcattgatgggttaggaa
    atcaatggcatcacaactagcattta
    gaaaaaggaaataggcgttcccgccg
    tggcacagcagaaataaatccgacta
    ggaaccataaggttgcgggttcaacc
    cctggccttgttcagtgggttaagga
    tccggcattgccgtgggctgttttgt
    aagtcacagacatggctctgatccgg
    cattgctgtggctctggcgtaggcct
    gcagcatcagctccaattagacccct
    atcctgggagcctccatatgctgcaa
    gtgcagccctaaaaaaaataaaaaaa
    taaaaaaaaataaataaaagaagtag
    acaaattgtatagaacaaccctgagt
    atgttgcctgagcacatataacaagg
    gtaagtattatttcaggaaactctgg
    tttcacagatactcttggcatatgga
    cccctagagtcctgatgtaaaatata
    ttcttcctgggatcttaggcaagaag
    tttgaaagctccaactctgcactgct
    gccaaagaaatgatttttaagtgcaa
    aactcttcccgttcccttccctgtat
    aaaattccataggatctctccagtgc
    ctctaggataaaggcagttttcattc
    tctagttcaaggtgagagaagatttt
    aattatttcacgttttagtggggaat
    tcaagagtctggcacctgacatttgc
    tgaactctctccattatccctctcta
    gttccccagacgcatcctatggtaga
    aattcgcaaactagagtgagcgtcag
    agtaacccaaggaaactgggtaaatg
    cagctccctgggctctaccccctgag
    attctgattcagtagatctgaagcag
    agccctggaatatgcatatgcatcat
    tgtgtcacaccaagcattctgggtaa
    tgagagttgatgttaggttctcagta
    gtaagacaagtatagagattccgggg
    gactgagtgctcagctctgccttggg
    gaggagggagagggctaaagagaaca
    ggagatggggacagggaatgctcaac
    ctccaatcttaggcatttgagctatg
    tcttaggggtcaggaggaggttacca
    atatagtgattaagagattgaggttc
    cagtcagagggatatgctggagaagg
    ggggtgaaaataatgtcataggtttg
    gtgagtgcagatactttgagtttttt
    aatatttttattgaaatatagttgat
    ttacaatgctcttagtgagtacaatt
    actttgaataagtgcatagatgtatg
    ccattcttccagaaatgatttattga
    gctcctttgggcatcatgctaagtac
    aggggaaacagctgtgaagaggtcct
    tcccttatgaagtcattcatcccctt
    cagtaaatgaaggtaaaggaaaagga
    tgagacagggacgccgtgttggacca
    gggtcagaaaggccttataagacctt
    gcctggagggcaaggaacttgcctgt
    gagtaaggagagcttgagaaagcgat
    aaagcaaagaaggaacattactgcat
    tgtgttttagaaaaaccatgtcctgg
    ggaagaactcctagagtcaggggggc
    cagttgggagactgtgcttttttcca
    ggaggagataagtgaggctgctggct
    gagatggagcaaggatttagagaagc
    agatatgagattcatttagaagttag
    acattttaggatctgacacataattt
    atcaccaaaaccagtgcatctctggc
    tttgggccaccagttttggagaagtg
    gaatgtagggacctaccattacctgc
    caatctttactacacagatgcctatt
    tccctcctcatatttcctttctccag
    atcacgtcctattctattgccaggac
    tcaagattccaccttgcatgcagtga
    tccatcttcacactggatggacagct
    ctagggatgtcagagcacactcttgt
    ccatactgctgactgggtctcctgtc
    agcccatctgtctatcagctgtggta
    ttattagtataataagagggctgtat
    atgagagacacaaaattctaggtgta
    gctcaaagataggctagagttattcc
    tatgtacaacaaatatttatgggacc
    ccttctgtgtactgtcatggttgctg
    ctttcatcatacttgtagtctaatgg
    aggtgggggcagggcaggaataagcg
    gatgtccacaaatcagtaagaccact
    tatattcaacattttcataatttagt
    tatttgagcccaaagggtccacatcc
    gtggtattccaacttttttttccccg
    gacatggatctttatctttttttttt
    tttcttttttgcggccagacctgcgg
    catatggaagttcccaggccaggggt
    tgaatgggagttgcagctgcctggtc
    tacaccacagccacagcaaggtggga
    tctgagctgcatctgtgacatacacc
    gcagctgaggtaacaccagattctga
    acccactgaatgaggccagggatgga
    acccgtctccttatgaacactatgtc
    atgttcttcaccctctgagccacaac
    gggaactccagacttcgtctttaaat
    gtattctgacttggagagctatcaca
    ctaagcaattaacaggagctgacctg
    gtttaggctggggtggggccctactc
    ctcaatgttccctgaggcacatctgt
    gggacccctgggcatcatctatctga
    gcagccttagagctgctcatccagtt
    gactgttgatgtagaagtgcaaactt
    ctgccttccttatttgttgctttctt
    ttttcattgttctctcccctttgtgt
    ctttaag
    CAAGGGCATCGTAGAGCCTCCAGAAG Exon 14 Seq ID No 23
    GGACTAAGATTTACAAGGATTCCTGG
    GATTTTGGCCCATATTTGAATATCTT
    GAATGCTGCTATAGGAGATGAAATAT
    TTCGTCACTCATCCTGGATAAAAGAA
    TACTTCACTTGGGCTGGATTTAAGGA
    TTATAACCTGGTGGTCAGG
    gtatgctatgaagttattatttgttt Intron 14 Seq ID No 41
    ttgttttcttgtattacagagctata
    tgaaaacctcttagtattccagttgg
    tttctcaataagcattcattgagcct
    tactgactgtcagacggagggcgtat
    tggactatgtgctgaaacaatccttt
    gttgaaaatgtagggaatgttgaaaa
    tgtagggaatgaaatgtagatccagc
    tctgtttctcttttggaggattcttt
    ttcctccatcaccgtgtcttggttct
    tgtttgttttgggtttttgtgggtgt
    tgtattgtgttgtgttggttatggca
    gtgacagctatttaaactgtgaaacg
    ggggagttcccgtcgtggcgcagtgg
    ttaacgaatccgactgggaaccatga
    ggttgcgggttcggtccctgcccttg
    ctcagtgggttaacgatccggcgttg
    ccgtgagctgtggtgtaggttgcaga
    cacggctcggatcccgcgttgctgtg
    gctctagcgtaggccagcggctacag
    ctccgattggacccctagcctgggaa
    cctccatatgccgcaggagcggccca
    aagaaatagcaaaaagacaaaataaa
    taaataaataaataagtaagtaaaat
    aaactgtgaaacggggagttcccttc
    atggctcagcagttaacaaacccagc
    taggatccatgaggatgtaggttcga
    tccctggccttgctcagtgggttaag
    aatccagcgttgctgtgagctgtgat
    gtaggtcgcagatgcagcccagatcc
    tgcattgctgtggctgtggcgtaggc
    tggcagctgaagctccgattcaaccc
    ctagcctgggaacatccatatgctgc
    aggtgtggccttaagaggcaaaaaaa
    taaaaaaataaaaaataaataaattg
    tgggacagacaggtggctccactgca
    gagctggtgtcctgtagcagcctgga
    agcaggtaaggtaaggactgcagctg
    ggtaaggactgaattgcaccaactgg
    gaagtaagcctagatctagaacttaa
    gttagccctgacatagacacacagag
    ctcaccagctaagtggttcagcttat
    aagctggtcactgaaactgaggatgt
    ccacaaaagcaaaataagtagcaaca
    ggcagcgggatgcaagagaaagagga
    ggcctaaaatggtctgggaatccctg
    ccatacctatattttatcctacttat
    atttagtgcctgaatgtgtgcctgga
    gagcaaagtttagggaaagcatcggg
    aaatgcacagtattcatacccttagg
    aacaaagatcagttacctccagggta
    aagactatttccaagtttaaatttca
    acccctgaacattagtactgggtacc
    aggcaacacttgccatcctcaaaatc
    aatgaatcctaaaattcaacctgggg
    gtcagtgacagtctgtgacaaagttt
    ttgctggtcagtaacgaaataagtat
    gagcaccatctgagtatggtcaccaa
    gatgtcaactctctttcctttggacg
    aattgtcattattccaagattaggtc
    ctttctatttttgaggtgtgaaaaca
    tctttcctttcataaaataaaaggat
    agtaggtggaagaattttttttgttt
    tttggtctttttgctatttctttggg
    ccgcttctgcagcatatggaggttcc
    caggccaggggtcgaatcggagcttt
    agccaccggcccacgccagagccaca
    gcaacacgggatccaagccgcatctg
    cagcctacaccacagctcacggcaat
    gccggatcgttaacccactgagcaag
    ggcagggaccgaacccgcaacctcat
    ggttcctagtcggattcgttaaccac
    tgcgccacgacgggaactcctaatga
    tactcttttatatttagctactatgt
    gatgatgagaaacagtccacatttta
    ttattttttagccaatttgatatctc
    attactaagataatgataattttctc
    tataaattttatttaagttagtgtta
    tgaagtggttttgctagtgtagaagg
    ctaggatttgaattcagttcaagaaa
    gaagagagggagggagggagagggat
    gggtagagggatggggcagtgggaga
    gagcaaagaggagagacagtttttgt
    attaattctgcttcattgctatcatt
    taagggcacttgggtcttgcacattc
    tagaatttctaaggaccttgaccgcc
    agattgatatgcttcttccctttacc
    atgttgtcatttgaacag
    ATGATTGAGACAGATGAGGACTTCAG Exon 15 Seq ID No 24
    CCCTTTGCCTGGAGGATATGACTATT
    TGGTTGACTTTCTGGATTTATCCTTT
    CCAAAAGAAAGACCAAGCCGGGAACA
    TCCATATGAGGAA
    gtaagcaggaataccagtggaagtgc Intron 15 Seq ID No 42
    ccctttcttccttccttcctaaataa
    acttttttattttggaacaactttag
    agttacagaaaagttgcaaagatatt
    atagacagtagtgtttatatatatat
    ataaatttttttttgctttttatgac
    cacacctgtggcatatggaggttccc
    agtctaggggttgaattggagctaca
    gctgccagtctgtgccataaccacag
    caatgcaggatctgggccacgtctgt
    gacctacaccaaagctcacagctgga
    ttcttaacccactgagcaaggccagg
    gattgaacctgcatcctcgtggttcc
    tagttggattcgtttccgctttgccg
    caatgggaactccaaattattgttaa
    tatcttactttactggggtacatttg
    ttacaaccaatactctgatactgaaa
    cattactgttaactccgtacttgctt
    ctttttgagtcatttgcaaagactgg
    cttcttgacctgcttccttccaaaca
    gctggcctgcctatgctgttctcaga
    cctgcaagcactgatctctgcccccc
    ttgccttctctccagtggtgtctcct
    tccccaaacaaacccagtgtggctct
    ggaaagggagttaagtcaacataaac
    caacacatattttgttgagctccaat
    tttgagcaaatccctcacctacggca
    gacaggcatgatgttaagaactaggg
    ctttggacacaaggtcaagaccaaga
    agggttcctcacccctactgattcag
    ataaccaataatgaggctttgaatcc
    ctgtccaaaggttgttttttttccct
    tctattgagcttcttgccaccttatc
    agttttttttatgacagtcaaatgac
    atgatatatgtgagcatacatggtaa
    tttttaattctatataaatgaatcac
    taaataaattaggaggatatatagtc
    cacctttaagcgtattacacgtgtca
    catgaatgtgtggcgacttaattgta
    gaggtttaaatgtagcttcctataat
    agatgtgttcctaaactacattttaa
    tcattggacttgtatttttatgttag
    cacttgctgttgaagaaaagcctatg
    ccaaaagttcagtgaaaccaataatc
    cactgccagctttctgagttaaaaaa
    aatccctgggttttcacacacaggaa
    caccctgtgtgaaacactcatttaga
    gcaaaatgcatctgataaggagttcc
    tgttgtgcctcaactggttaaggacc
    tgacattctccatgagaatgtgagtt
    tgatccccggccccactcgatgggtt
    aaggatctggtgttgccacaaactgc
    agctccgattcatctcctagcctaga
    aacttccacagcccagaatatgccac
    agaattcggctgtttaaaaaaaaaaa
    gaaaaaaaaaagaatcataaatgtgt
    tggtttgttcaccaaatacatgataa
    cttgctcttgccaagctcagcttcat
    aaatattaagtcatttaatacagcag
    ccaccttatgaacagatattactata
    cttcccatttacagataaggaaaatg
    ccatatttaaccaagagattaaataa
    ctttcccgaggtcttatagcaagtaa
    atcatggtgcaggggtttgaccacac
    gcagtctatcctccagagtctgtgta
    tttagccactgttttactttcaaatt
    taaatttataaaacttctaaattatc
    tgttaaccataatctttggaattttt
    aaaaccacgagttcctataaaatgtt
    tcattgaaagtaagtcacttttccat
    agcttttgataatacatctgtaggat
    aaagtaagccacagctctcttgcaga
    cttggtacaccctggggcaaagcatc
    atgcctgtcacgtacatggtggtcct
    tactttgactctcagtgcttttattg
    cccaggaattttgtgagatttctagt
    tgttgaggtttgtttaaagaggttat
    gccggtacttggaagagctcttttct
    tgctacctggagccttctcatatttc
    ctttttgaggagggacatgaattgcc
    tttcaaactcataaatatattttcta
    gtacacaagtctccatcttccttaga
    cgcatggctcctggagttctccatcc
    tcctgctccactttgggtgggctcct
    ctctgggtctgccaccaatctgccac
    ccagagacatccttgacccacttcca
    gaccccaccatggcttcactttcttc
    gctttcctcctttgtggaaccttctg
    cttaagaatctgaggaagaaaatttg
    cacgtgagctaaactggaggtacttt
    cctgcctggtcttgcacgatagcttg
    gctgagcccatgatgctgggtggctg
    ttactttccatggacacccgaaggcg
    ttgctcctttggcttctagttgcatg
    cagtgttgcttatcccaggctgatct
    ttcttccactgtaggtgacttttaag
    aattaagggattaatctatatctaca
    acaacaacaacaaagaccttttcaag
    ctgaggtagggctttctgtatatgtt
    tggagtggttatccagcagactttac
    ttgaaggcaggggtcatatcctcaag
    tgctcataaacggaccacagaaagat
    ctcataattgggtggagctgggtggg
    gaccgtgtcatgtggccaggaaatgc
    cagatgggaagggagtggcccttact
    gagctccagctgaactctgaattttc
    tagaaaactcagaaatctggattttt
    catgtgtaatacccagatttatagat
    gtggaaagctaattcttttttttttt
    aagggactataggcaatgaactaaga
    tctaggttgtatttggacaaggggtc
    atcagtttaagctgtgtagttgagcg
    ctcagctattgggctgagggacccct
    aaatactgagacggggaggtccttgc
    tctggggcatcacaagtacactccct
    ggtctcattcaaacacttttcctaca
    aaattgatcccatttcttcagtgcac
    tgtctgaatgcatttggcccagagcc
    gtgctgaggcatagggaaggggtcca
    cggtttcatggcatcgttttgtgctg
    tgtgtccctgctgtcgtccaggatac
    ctacctctcctcctcctgcatctgaa
    tgtccccccacagactctctgggatt
    ctacagcctctggcctgttcctcaga
    cacctcttacctgccagctttccaga
    ttcacattagttagtccaaatctact
    gccgtcagtgactcacttcatttctt
    cttctccgaggcagttcagcccggta
    cagttgttttgtcaacacttcagttg
    agtctggaagatgtgcatgggttatg
    cacgagagcggtccatcattttgagc
    tagaagtcctttctcagcccagagac
    aagtcctcatctcctttacttcctga
    ctcttcttcctctgcatccttccaag
    atatctctttctccagccaccaccta
    aatctcttcttttcccggggttccgt
    gctcaacccactcttcttcttaaatc
    tgtggctgggtgaacgcatctgctgg
    caccacttctctgctaaagactccaa
    aaatccataggtcctgcccggccttt
    gcccacctctctccaacactgtccag
    ctttagatgtagagctaatcccccca
    gagatatcattccctggatgtctaag
    tcctttggtatctcactttcagcgtg
    ttcaaaatcctcttacaactgttctt
    tctccttttccatcttgattattggc
    aacatgccagcctttcccctaccccc
    agcagtgagccaagctagaaacaagg
    gcttaatcttcaatctttccttctcc
    atccctaaacctaatgagtctccaag
    cccttcccagtttacaccctaaatgt
    tgctcaaaacatcccctagttcttcc
    acgtgctctcctctatattgaaaggt
    caagaaaggccatcttccctccactg
    tgaggaaatagatcttgatactgccc
    ctgagctgggcagtcctcgacctgac
    aaactgtgcagtgtttctaaatctct
    actggcaaaatgagagtgcctttgac
    ctgtgttgcgatctcagatcacagtg
    gatgtaattgttttataggaatggtg
    aacgaaaaagaagtaaatccctaatg
    ccaaactcctgatcattctatgtcat
    ttaatagcctgtcatttatgataaag
    tttcctctactggcattagcacaata
    cttctcaggaaaaaaaaatatgatgc
    cagatactgaaaagctcctgggtaaa
    catgaacatgggtaccgataaaatgg
    tgaagccagtccaatcttagagtgac
    ttcccttcatgctacttcatgctctt
    ttttttttttttttttaagaaaaacc
    ccttttttttttctcacaccagtcac
    agaggagaccgaggcttagcaaggtt
    aaggtcacatgattagtaagtgctgg
    gctgaaactcaaaaccatctctgctt
    gtctcctaaccctgtgcacctctgac
    tattcaacag
    ATCCTGTGTCAGGAGTTGGGATTCTT Exon 15a Seq ID No 25
    TGAAG
    gtaagggccttgaccaccgaattaag Intron 15a Seq ID No 43
    gtaatcttgctctgtggcaggccttg
    ttttcagtattttaagtacactggct
    caggtaatcctcacaacagccccagg
    aggaatgttctattacctccactgta
    tagatgaggaacttgaggcacagaat
    ggttgccaaggtcacacagctatatt
    gggggttcatacccagccatccaact
    ctgtctgtactctctgccactctgca
    cccccagctcctgatccacttcctgt
    ttccatccctcgatttctgctgcact
    caggggcccctctccccctcggcctg
    tgagatctgcttcagtaggcttttct
    ccctgactcctccatccctgtcctta
    caggcagctgcttctctccgggacac
    gaggggtccatacggacactctctac
    tggctgggttgcgcctaactcgtgat
    tcctcctctgtttcag
    ATTCGGAGCCGGGTTGATGTCATCAG Exon 16 Seq ID No 26
    ACACGTGGTAAAGAATGGTCTGCTCT
    GGGATGACTTGTACATAGGATTCCAA
    ACCCGGCTTCAGCGGGATCCTGATAT
    ATACCATCATCT
    gtaagtccgaaaatgcctgtcgtgtg Intron 16 Seq ID No 44
    tgccttaggctgctgcggaggaggcc
    agggctatataagcagagtcagtgac
    tgactgtgccctgcagtgttgatggc
    catggagattccaccgttagagcttt
    tttctttgttaaccttgaaggcaaat
    ctggttaggaagataactttcaaaga
    gtcaccatctggacattcatgcccat
    gtgcttcaatcctgtatacaagcagt
    ttagagtacagggaagggaaggacat
    tatgaaagggagagggtgtgtttgga
    tccagcagctccatcctcagaattta
    tctgaagacactgcaaaattactaag
    aatcactatgacaagaatgaggatgg
    ggtgatatggcaaagttgtgatcctg
    gaagaccttcatctcccatgttgccc
    aactctgaacatgaatttggtgaact
    agttggttaaggggatgatcctccaa
    gtttctccctggttgagctccaaaaa
    ccatgtaagtttctcatagcaaaacc
    gtataggtccttagggctttagttgg
    aatatttgtgctgaaatgctggaaag
    ccccatttgccatttttgtatttgca
    aaataatcatcaagaggggagaatgc
    attctttcatgaccactgaccctctg
    aaaaggtcaggaatttagtctgaagt
    aggcaagcctcctaccccgcttctgc
    catgagcttgcacgcacaggcctgtc
    ttgacatttcttctttatagatttct
    ttttgaatatcttgaaattgctttaa
    aaatatttaaagaatgtagaattata
    taaaataaaaaggaaataaccccaca
    cctcccacaaaaccctgtttcctgcc
    tttctccacccactctccagggtaac
    acttggtaacagcatagttgtatcac
    cccaggcctatttttgagcatatcag
    catttcaagaaatgtattttttctca
    ataaaacatcccttatagttgaggag
    gggaggttatcattcctgggttttgt
    tttttttttttttttaatgtaatcct
    ggtacatcggtaatttgcatttttta
    ttcattaatatctttggtatttctag
    tgttgggacacacaggtcaacctcag
    tttttgggtttttttttttgtctttt
    tgtctttctagggccacacctgcagc
    atatggacgttcccaagctaggagtc
    taatcagagctgtagccaccagccta
    cgtcatagccatagcaacgtcagatc
    caagccgtgtctgtgacctacaagca
    cagctcatggcaacaccggatcctta
    accactgaacgaggccaggggatcga
    acacacatcctcatggatcctagtca
    tgttcattaaccactgagtcatgatg
    ggaactccaacttcaactattttaat
    gtctgtaaaacattccatttggaaac
    catttcatttgtaaagcaaaatgaaa
    acattttgttcattttcaacagagtt
    cgtagctgacttctgttctggaaaaa
    aggaaatggagcaaatttgagtgaga
    aagattcaaagataacttttctttta
    aaaaaaattatatcttggaaacttct
    gggctattgattctgaagactatttt
    tctatatactgttttgatagcaaagt
    tcataaatgtgaaaggatcctgcgat
    gaatcttgggaagcagtcatagccca
    atatatctttgttgcttttaaaatga
    gatttagtttactaaatatttttctg
    atcataaaaataacacagatctaccg
    cagaaaatttggaaaaaaaaaaactt
    ttaaattcaaaaaacagttaaaccac
    aaatgatcccaccatccagagagcaa
    tttgtactttggtgtctagttcatct
    ttctttttctgtttacaagcacatat
    accacaagcattttttcaaaaaatga
    aaatgggataatactatacatacgtc
    tgtacacctgcatagttactgaacag
    tctttgatctaccctgtaagtttcta
    acttttcattatttgaaatgatgttt
    tggcaaagaaatatgtaggtgtgtct
    cgcacactttcataatgatttcttag
    gataaatttcttaggataaattcata
    atgatttcttataataatccatactc
    tgccaactgatcttcagggaagccaa
    ctcgccttctcagaaataacatataa
    cccatttacttgccctctcaccaata
    ctaggtcctaatgtttttgtgtacag
    attctatatttttacatacaagaatt
    ccttaaagcaaggcatgtcacagaaa
    aatagaaggaagacacaattgtcatg
    tttaaggactgcattctgtaccaaaa
    atgctaagttaaatgaacatctgaaa
    cagtacagaaacgctatctttcaggg
    aaagctgagtaccaggtactgaacag
    attttggcaaatacagcaggcatgga
    tgtttccaaaacatgtttttctactt
    tatctcttacag
    GTTTTGGAATCATTTTCAAATAAAAC Exon 17 Seq ID No 27
    TCCCCCTCACACCACCTGACTGGAAG
    TCCTTCCTGATGTGCTCTGGGTAGAG
    AGGACCTGAGCTGTCCCAG
    gtaaagcatcctgcaggtctgggaga Intron 17 Seq ID No 45
    cactcttattctccagcccatcacac
    tgtgtttggcatcagaattaagcagg
    cactatgcctatcagaaaacctgact
    tttgggggaatgaaagaagctaacat
    tacaagaatgtctgtgtttaaaaata
    agtcaataagggagttcccatcgtgg
    ctcagtggtaacgaaccctactagta
    tccattgaggacacaggttcaatatc
    tggcctcactcagtcggctaaggatc
    cagtgatgccgtgagctgcagtgtag
    gccacagacgtggctcagatctggtg
    ctgctgtggctatggtgtaggccggc
    cccctgtaactccaattcgaccccta
    ggctgggaacctaaaaagaccccaaa
    aaagtcgctttaatgaatagtgaata
    catccagcccaaagtccacagactct
    ttggtctggttgtggcaaacatacag
    ccagttaacaaacaagacaaaaatta
    tcctaggtggtcagtgggggttcaga
    gctgaatcctgaacactggaaggaaa
    acagcaaccaaatccaaatactgtat
    ggttttgcttatatgtagaatctaaa
    ttcaaagcaaatgagcaaaccaattg
    aaacagttatggaagacaagcaggtg
    gttgtcaggggggagataaggggagg
    caggaaagacctgggcgagggagatt
    aagaggtaccaactttcagttgcaaa
    acaaatgagtcaccagtatgaaatgt
    gcaatgtgggaaatacaggccataac
    tttataatctcttttttttttttgtc
    ttttttgccttttctaaggctgctcc
    cgtggcatatggaggttcccaggcta
    ggagtccaaacagagctgtagctgcc
    agcctacaccagagccacagcaacac
    gggaaccttaacccgctgagcaaggc
    cagggatcgaacccgagtcctcacag
    atgccagtagggttcattaaccactg
    agccacgacaggaattccagggtctg
    ttgtgttcttaaaacacttccaggag
    agtgagtggtatgtcataagtaaaca
    ataaatgttaaccacaacaagcttat
    gaaataaacaggaaagccatatgacc
    tacaatcagtcattgggagaatccac
    aaaaggttgagcagaggatcaattcc
    agctcacactccagttttagattctc
    ccctgccttaaagcatcacagactac
    ataatctgagctgaagaataaaaatt
    aaaactcaccccagtgcaaaacagaa
    atgaaaaagtattaaaacgaggttca
    tactgttgttcattagcaatatcttt
    tattcacag
    GGGTGCCCAACAACATGAAAAAATCA Exon 18 Seq ID No 28
    AGAATTTATTGCTGCTACGTCAAAGC
    TTATACCAGAGATTATGCCTTATAGA
    CATTAGCAATGGATAATTATATGTTG
    CACTTGTGAAATGTGCACATATCCTG
    TTTATGAATCACCACATAGCCAGATT
    ATCAATATTTTACTTATTTCGTAAAA
    AATCCACAATTTTCCATAACAGAATC
    AACGTGTGCAATAGGAACAAGATTGC
    TATGGAAAACGAGGGTAACAGGAGGA
    GATATTAATCCAAGCATAGAAGAAAT
    AGACAAATGAGGGGCCATAAGGGGAA
    TATAGGGAAGAGAAAAAAATTAAGAT
    GGAATTTTAAAAGGAGAATGTAAAAA
    ATAGATATTTGTTCCTTAATAGGTTG
    ATTCCTCAAATAGAGCCCATGAATAT
    AATCAAATAGGAAGGGTTCATGACTG
    TTTTCAATTTTTCAAAAAGCTTTGTT
    GAAATCATAGACTTGCAAAACAAGGC
    TGTAGAGGCCACCCTAAAATGGAAAA
    TTTCACTGGGACTGAAATTATTTTGA
    TTCAATGACAAAATTTGTTATTTACT
    GCGGATTATAAACTCTAACAAATAGC
    GATCTCTTTGCTTCATAAAAACATAA
    ACACTAGCTAGTAATAAAATGAGTTC
    TGCAG
  • TABLE 10
    Genomic Sequence of the CMP-Neu5Ac Hydroxylase Gene
    ctgccagcctaagccacagccacagcaacgctgggtc Seq ID No. 46
    tgagccatgtctgcagcctatgccagagctccccgca
    gcgccggatgcttaacccactgagcaaggccagggat
    tgaaccctcgtcctcatggatagcagttgagttgttt
    ccacggaactcttaggggaactcctgattatttttta
    tttaaatttatatttctctgactttttcgtgtgctca
    tcagccactgactgtgtatctccattagtcatggttt
    gttaactctgtcattcaaaccctcttcatccttgcta
    cgcagataacatcattataataaaatcgtgcctgaag
    accagtgacgcccccaagctaagttactgcttcccct
    ggggggaaaaagaagcaccgcgcgggcgctgacacga
    agtccgggcagaggaagacggggcagaggaagacggg
    ggagcagtgggagcagcgggcagggcgcgggaagcac
    tggggatgttccgcgttggcaggagggtgttgggcga
    gctcccggtgatgcaggggggaggagccttttccgaa
    gtagcgggacaagagccacgggaaggaactgttctga
    gttcccagtCCCGACGTCCTGGCAGCGCCCAGGCACT
    GTTATTGGTGCCTCCTGTGTCCACGCGCTTCCCGGCC
    AGGCAGCCCTGGCGGATCCTATTTTCTGTTCCCCCGA
    TTCTGGTACCTCTCCCTCCCGCCCTCGGTGCGCAGCC
    GTCCTCCTGCAGTGCCTGCTCCTCCAGGGGCGAAACC
    GATCAGGGATCAGGCCACCCGCCTCCTGAACATCCCT
    CCTTAGTTCCCACAGgtgagaaggcttcgccgctgct
    gccgctggcgccggcagcgccctccacgcacttcgta
    gtgggcgcgcgccctcctgcattgtttctaaaagatt
    tttttttatccgcttatgctatcagttactgaggaag
    tatttacaaatctactattattttgaatttgcctttt
    tctccttatagtttatcagtatctcttgagactgtta
    ttggtgcctgcaaatttaaaatgattggggttttatg
    aggaagtgaaccttttatctttatgaaacgcctaact
    gaggcaatgttaattgcttaaaatactttctttatta
    tcagtgtggccatgccagtgtcctcttggttagaatt
    tgcctgat.............................
    ............ctgccaaagctgggagatgggggaa
    agtagagtgggttattgaaactgaatatagagttcag
    catctaaaagcgaggtagtagaggaggaagctgtgtc
    aacggaaatactgagctgggttcacatcctctttctc
    cacacagTCTAATGCCTTGTGGAAGCAAATGAGCCAC
    AGAAGCTGAAGGAAAAACCACCATTCTTTCTTAATAC
    CTGGAGAGAGGCAACGACAGACTATGAGCAGgcaagt
    gagagggggctttagctgtcagggaaggcggagataa
    acccttgatgggtaggatggccattgaaaggagggga
    gaaatttgccccagcaggtagccaccaagcttgggga
    cttggagggagggctttcaaacgtattttcataaaaa
    agacctgtggagctgtcaatgctcagggattctctct
    taaaatctaacagtattaatctgctaaaacatttgcc
    ttttcatagCATCGAACAAACGACGGAGATCCTGTTG
    TGCCTCTCACCTGCCGAAGCTGCCAATCTCAAGGAAG
    GAATCAATTTTGTTCGAAATAAGAGCACTGGCAAGGA
    TTACATCTTATTTAAGAATAAGAGCCGCCTGAAGGCA
    TGTAAGAACATGTGCAAGCACCAAGGAGGCCTCTTCA
    TTAAAGACATTGAGGATCTAAATGGAAGgtactgaga
    atcctttgctttctccctggcgatcctttctcccaat
    taggtttggcaggaaatgtgctcattgagaaatttta
    aatgatccaatcaacatgctatttcccccagcacatg
    cctaactttttcttaagctcctttacggcagctctct
    gattttgatttatgaccttgacttaatttcccatcct
    ctctgaagaactattgtttaaaatgtattcctagttg
    ataaacagtgaaacttctaaggcacatgtgtgtgtgt
    gtgtgtgtgtgtgtgtgtttaccagcttttatattca
    aagactcaagcctcttttggatttcctttcctgctct
    ctcagaagtgtgtgtgtgaggtgagtgcttgtccaaa
    cactgccctagaacagagagactttccctgatgaaaa
    cccgaaaaatggcagagctctagctgcacctggcctc
    aacagcggctcttctgatcatttcttggaagaacgag
    tgctggtaccccttttccccagccccttgattaaacc
    tgcatatcgcttgcctccccatctcaggagcaattct
    aggagggagggtgggctttcttttcaggattgacaaa
    gctacccagcttgcaaaccagggggatctgggggggg
    ggtttgcacctgatgctcccccactgataatgaatga
    gggattgaccccatcttttcaagctttgcttcagcct
    aacttgactctcgtagtgtttcagccgtttccatatt
    aggccttccaccgtgtcgtgtcgtcaatcttatttct
    caggtcatctgtgggcagtttagtgcgaatggactca
    gaggtaactggtagctgtccaagagctccctgctcta
    actgtatagaagatcaccacccaagtctggaatcttc
    ttacactggcccacagacttgcatcactgcatactta
    gcttcagggcccagctcccaggttaagtgctgtcata
    cctgtagcttgcttggctctgcagatagggttgctag
    attaggcaaatagagggtgcccagtcaaatttgcatt
    tcagataaacaacgaatatatttttagttagatatgt
    ttcaggcactgcatgggacatacttttggtaggcagc
    ctactctggaagaacctcttggttgtttgctgacaga
    ctgcttttgagtcccttgcatcttctgggtggtttca
    agttagggagacctcagccataggttgttctgtcacc
    aagaagcttctgcaagcacgtgcaggccttgaggtct
    tccgacttgtggcccggggactctgctttttctctgt
    ccttttttctccttagtgggccatgtcctgtggtgtg
    tcttagccagttgtttaagggagtgttgcagctttat
    gattaagagcatggtctttccttgcaaactgcttggt
    ttagaagcctggctccaccacttagcggctctgtgac
    ctcggacacatttcttagcctttctgggcctcgctct
    tcttcctcataaagtgaaaatgaaagtagacaaagcc
    ttctctgtctggctactgagaggatggagtgatttca
    tacacataaagcacttaaaataatgtctggcatatga
    tacatgctcaataaatgtcacttacatttgctattat
    tattactctgccatgatcttgtgtagcttaagaacag
    aggtctttacaggaattcaggctgttcttgaatctgg
    cttgctcagcttaatatggtaattgctttgccacaga
    ctggtcttcctctccttcacccaaagccttagggggt
    gaacgatcccagtttcaacctattctgttggcaggct
    aacatggagatggcaccatcttagctctgctgcaggt
    ggggagccagattcacccagctttgctcccagataca
    gctccccaagcatttatatgctgaaactccatcccag
    agcagtctacatggtacactcccccatccatctctcc
    aaatttggctgcttctacttaggctctctgtgcagca
    attcacctgaaatatctcttccacgatacagtcaagg
    gcagtgacctacctgttccaccttcccttcctcagcc
    atttttcttctttgtacataatcaagatcaggaactc
    tcataagctgtggtcctcattttgtcaatctaatttc
    acagcctcttggcacatgaagctgtcctctctctcct
    ttctgcctactgcccatgagcagttgtgacactgcca
    catttctcctttaacgacccagcctgctgaatagctg
    catttggaatgttttcaatttttgttaatttatttat
    ttcatcttttttttttttttttttttttttttttttt
    agggccgcacccatgggatatggaggttcccaggcta
    gggatccaatgggagctgtagctgctggcctacacca
    cagccacagcaatgcacaattcgagccaatctttgac
    ctacaccagagctcacggcaacactggattcttaacc
    cactgattgaggccagggatcaaactctcgtcctcat
    agatacgagtcagattcgttaacctctgagccatgat
    agttgttagttactcattgatgagaaaggaagtgtca
    caaaatatcctccataagtcgaagtttgaatatgttt
    tctgccttgttactagaaaagagcattaaaaattctt
    gattggaatgaagcttggaaaaaatcagcatagttta
    ctgatatataagtgaaaatagaccttgttagtttaaa
    ccatctgatatttctggtggaagacatatttgtctgt
    aaaaaaaaaaaatcttgaacctgtttaaaaaaaaaac
    ttgactggaaacactaccaaaatatgggagttcctac
    tgggacacagcagaaatgaatctaactagtatccatg
    aggacacaggtttgatgcctggcctcgctaagtgggt
    taaggatatggtgttgctgcagctccaattcaacccc
    tatcctgggaacccccatatgccaccctaaaaagcaa
    aaagaaaggtgctgccctaaaaagcaaaaagaaagaa
    agaaagacagccagacagactaccaaatatggagagg
    aaatggaacttttaggccctatctccaactatcacat
    ccctatcaccgtctggtaagaaatggaaaaaatatta
    ctaagcctcctttgttgctacaattaatctgattctc
    attctgaagcagtgttgccagagttaacaaataaaaa
    tgcaaagctgggtagttaaatttgaattacagataaa
    caaattttcagtatatgttcaatatcgtgtaagacgt
    tttaaaataattttttatttatctgaaatttatattt
    ttcctgtattttatctggcaaccatgatcagaaatct
    ttaaacaatcaggaagtcttttttcttagacaaatga
    aaatttgagttgatcttaggtttagtacactatacta
    ggggccaagggttatagtgtgactattaaatcacaga
    taatctttattactacattatttccttatactggccc
    cacttggatcttacccagcttagcttttgtatgagag
    tcatccttaaagatgactttattctttaaaaaaaaaa
    acaaattttaagggctgcacccatagcatatagaagt
    tcctaggctagcggtcaaattagagctgcagctgcca
    gcctatgccacagccacagcaatgccagatctgagct
    gcatctgtgacctacactgcagcttgcagcaatgctg
    gatccttaacccattgaacaatgccagggattgaaca
    cacatcctcatggatactgctcaggttcctaacctgc
    tgagccacagttggaactccaaagcagactttattct
    gatggctctgctgatctctaacacgttattttgtgcc
    atggtgtttatcttcactttactcaagtcagggaaac
    acgaagagtctcatacaggataaacccaaggagaaat
    gtgcaaagtcacatacaaatcaaactgacaaaaatca
    aatacaaggaaaaaatatcttcactttcaaaatcacc
    tactgatgatgagtttatatttccttggatatttgaa
    tattagctatttttttcctttcatgagttttgtgttc
    aaccaactacagtcgtttactttgatcacagaataat
    gcatttaagccttaaatagattaatatttattttcac
    catttcataaacctaagtacaatttccatccagGTCT
    GTTAAATGCACAAAACACAACTGGAAGTTAGATGTAA
    GCAGCATGAAGTATATCAATCCTCCTGGAAGCTTCTG
    TCAAGACGAACTGGgtaaataccatcaatactgatca
    atgttttctgctgttactgtcattggggtccctcttg
    tcaacttgtttccaatctcattagaagccttggatgc
    attctgattttaaactgaggtattttaaaagtaacca
    tcactgaaaattctaggcaagttttctctaaaaaatc
    ccttcattcattcatttgttcagtaagtatttgatga
    gaccttaccatgtgtaaacattgcactaggtattaag
    aaatacaaagatggataagatagagtcggcgtaaatg
    agatgatataatgagacgttataatgaaactcacaat
    tccagttgggaaataaagtccttcaaattccatgact
    ctttctggcacacgttagaggctacagcttctgtgtg
    attctcatgctggctccacttccactttttccttctt
    cctactcaagaaagcctatagaaatatgagtaagaag
    ggcttaatcataggaataaatttgtctctgttctaag
    tgattaaaaatgtctttatcagtataaaaagttactt
    gggaagattcttaaaactgcttttacacactgttcta
    gaatgactgttatataaataaaaaagtagatttgatc
    taacacaattaaatgacctttggaaatattgactaat
    tctcaccttgcccctcaaagggatgcctgaaccattt
    ccttcttttgccagaaagcccccaccctttgtctgtt
    gacctagcctaggaaatcttcagatcacgttgttagc
    acgaactggttacatgtgctgtacaaatactatttaa
    ttcatctgattaaaaaaaaagagataagaagcaaaag
    tttgactatcttaaactgtttgcgtaggtgagaggac
    aattgaccatctactttatgagtatgtaacccagaaa
    cttaaagctccttaagggagctaagtcttttggataa
    gacctatagtgagaccttttagcaaaatggttaagac
    tgaatggagctcactagcgtgggttcatatcctgatg
    ctcaaacacgcaattaaatgactttaggtgggttagt
    ctctgttccttagtttcctcaatgggagataatattg
    gtagtagcgattttactgggttgttgaaagaacatct
    gttaaatgttcagaacgtgttacgacagagtacagag
    taatgatttgcttgtatatgtatgactcaaatagtct
    gccatatgccttgtgactgggtcctgtggagcaggaa
    ggagggatttcccacccagcagaaagttgggtaaact
    ggaaaatagactgaggccaggaaatgatgcaaagcgt
    tgatgttcactgccacggcaggtgaagggcagggcca
    gagttgtcagtagggtcaggggaggactggaaataac
    caagacccactgcacttttcagcctttgctccagtaa
    ggtaatgttgtgagagtagaaaattttgttaacagaa
    cccacttttcagtacagtgctaccaatactgtagtga
    tttcataccacatcccaagaaagaaaaagatggctca
    atcccatgtgagctgagattatttggttttattgtta
    aataaatagcattgtgtggtcatcattaaaaaaggta
    gatgttaggaaagtagaaggaagaagactctcaccta
    cattttcatcactgttttggtatctgccagttgtcac
    cttggtccccttccccgcctctcccctgcctcctctt
    cctccttctcctttttttggaatacaattcaggtacc
    ataaaatttacccttttagagtgtttgactcaatggt
    ttttagtattttcacatgttgtgctattactatcact
    atataattccaggtcattcacatcaccccccaaagaa
    accttctaactattagcagtccattcccttcttccct
    cagcccctggcaaccactaatctacttactgtctcca
    tggatgttcctatattgaatcaagctagcataaaccc
    cacttgctcatggtcataattcttttttatagtgcta
    aattacatttgctaatattcaattaaggatttctatg
    tccatattcataaggaatattggtgtgtagttttctc
    tttgtgtgatatctttgtctggttgggggatcagagt
    aataattactgctctcatagaatgaattgagaagtgt
    tccctccttttctatttattggaagagtttgtgaagt
    atattggtattgattcttctttaaacatttggtcaga
    ttcaccagtgaagccatctgggccatggctaatcttt
    gtgaaaagttttttgattactaattaaatctctttaa
    tttgttatgggtctgctcctcagacgttctagttctt
    cttgagtcagttttgttcatttgtttcttcctaggac
    tttctccctttcatttggattatttagattgatagta
    atatcccccttttaattcctggctgtagtaatttggg
    tcttttctcttttttcttggtcagtttagctaaaggt
    ttgtaattgtattaatcttttcaaataactaactttt
    ttgttttgtttgttttttgttttttgttttttgtttt
    ttgtttttttttgctttttaaggctgcacctgaggca
    tatggaagttctcaggctagaggtctaatcggagcta
    cagctgctggcctataccacaaccatagcaatgccag
    attcaagctgcatctgcgacctacaccacaactcggc
    cagggatcacacccgcaacctcatggttcctagtcgg
    atttgttaaccactgtgccacgacgggaactcccgcc
    cattttttttaacacctcatactttaacataaagatg
    ggcttcacatggactgatagctcaaatgaggaaggta
    agactatgaaagtaatggaagaaatgtagactatttt
    tgtgacctagagattactgatacttcttgacttttca
    aacaatacttcaaaagtacagcccaaagggaaaaaag
    aaagaaaaaagaaacacacatatacacaaacctagtg
    aataagatatcatcgatacactacagatttctatgaa
    ctggaagaccccatggacaaagttaaagaacatatga
    tagtttgagtgattattttgcaatatttacaaccaat
    gagggaatattatccagcttataggaggaagtaatgc
    aaatcgacaagaaaaagataggaaacccaatataaaa
    attaagaaaatacaaaaattaagaaaggatatgaact
    agcattttacaaaagaaaaatctccaaaagtcaatca
    gcacatgaaaatatgctcaaacctattaattattaga
    aaactacagactgaagcaatgaggtgctttactttac
    atctttttgactgataaaaagttagaaacaaaggtga
    tatcaaatgtcagggataaaaggatatagaaatcgtc
    atgcctgtggtgggagtatggccggtgcagtcatgtg
    ggaaggtaatctgacagtggttaggcagagcaggttt
    atgaatacactgtggcccatcaatcccacgcctgttt
    atgtaccaaagaaatcctgttgtggcagaatctatgg
    gtccacccctgggagcatgaattaataaaatgtggca
    ccagggtgtgtgaaactccagctagagatgagatgtc
    cacatggcaacatgaatgcatcttagaaacatagatt
    tgagtgaaaaagagtaagaaacagccgggaaacccaa
    taccatttataaaaattaaagatgcacacatacaatg
    tagtaaatattttgcatgaactttcaaatggttgcct
    acagggggggagagtaaagaagagtagaaaacaaaga
    taaagggagtaagtaagtagctctgcctggactgaat
    ataatgtgtcatgaactgagaaatatggttaacataa
    tcctcttaacttgaggtcctaaatgaatgaatgagtc
    cactattcatttacccattctttaatgtgtattgcat
    tataatccatttttttagaaccaacgaattttgttcc
    cataactactaatcagcctgccttttctccctcattc
    ccttatcagctcaggggcattcctagtttttcaaacg
    ttcctcatttgaaccaaaaatagcatcattgtttaaa
    ttatacttgttttcaaatacgatgcttatatattcca
    agtgtgtttgcccattttcttaggtggtagaaatttt
    tcattctacttttctatctactcagattttcccgttg
    gaattatttccattgctattaaacttagaagtccccc
    ctgtgatatgccatttttttcatactttttaagcact
    tggttgcttttctttgtgtctttaagcacctagaata
    cttataaccattgcacagcactgtgtatcaggcagcc
    cttcctcttccactaatttatggtccttctcttagac
    tatattaaactgttatttaattaggatcctctcttcg
    tccttatgatttaattattatagttttctaatatgtt
    tttattataattcctcttcattattcctccctattaa
    aaattttaatgaattccatttgtttgttcttctagtt
    aaatattaagtcataatccaaataacttagatgtcat
    tagtttatgtggtcaaagtaaggataccacatcttta
    tagatgcaggcagttggcagatgtcatgattttcttc
    agtgcataaatgcaatttatctttgagcaaggggcat
    aaaaacttttatggtattggctttgaaataatagtta
    agaactgcagactcagtttttcctgcttttcttgaaa
    aagaacacttctaaagaaggaaaatccttaagcatgg
    atatcgatgtaattttctgaaagtctcctgtaattcc
    ttgggatttttgttgttgtttgttggtcggttttttt
    gggtttttgtttgtttgttttgttttgttttgttttg
    cttttagggctgcacctgtggcatatggaagttccca
    ggctaggggtccaactggagctacagctgccagccta
    ctccacagccacagcaacatgggatcctagctgcatc
    tgtgacctaaccacagctcttggtaatgccagattgt
    taacccactgagcaatgccagagatcgaatctgcctc
    ctcatggacactagtcagattagtttctgctgagcca
    caatgggaattcccaattccttgtatttttgaactgg
    ttatgtgctagcatataattttgtttcttgaatcttt
    gtgggtttttttttttttttttttttgtctcttgtct
    ttttaaggctgcacccacagcatatggaggttcccag
    gctagaggtcaaattggagctacagctgccagcctac
    acaacaactgcagcaaagtggggcccaacttatatga
    cagttcgtggcaatgccggattcctaacccactgagc
    agggccagggatcgaacctgagtttccagtcagtttc
    gttaaccactgagccatgatagtaactcctgtttgtt
    cagtcttgaacctcctttttaattctttattccttga
    gggtgaaataattgccataataatactatcatttatt
    acatgccttctctgtgctaggcatagtgacactttag
    gatttattatatcacttaatccctacaacaactctgc
    aaagtatgtatcataatcctatttgacagatcaggaa
    attgcagcccaggatgcagataatatgcatccatcac
    aagtgactagatatagtccctctgctattcagcaggg
    tctcattgcctttccattccaaatgcaatagtttgca
    tctattgtatatgtgttttggggtttttttgtctttt
    tttttttttttgtcttttctggggcctcacccttggc
    ataggtaggttcccaggctaggggtcaaattgaagct
    gcagctgccagcctacaccacagccacagcaactcgg
    gatctgagcctcatctgcaacctacaccaaagctcac
    ggcaacaccggatccttaacccactgagtgaggccag
    agatcaaaccggcaacctcatggttcctagtcggatt
    cattaaccactgagccacgatgggaactccctaaatg
    caatagtttgctctattaaccccaaactcccagtcca
    tcccactccctcctcctccctcttggcaaccacaagt
    ctgttctccatgtccatgattttcttttctggggaaa
    gtttcatttgtgccatttttcattttacgggtaattt
    ttacttcagtttcttccactagcagttgtcttaaagt
    gagtataattaatattcatttggaaaatgtaagcaaa
    acattttttaaagggccatgcccacagcatatgaaag
    tttctgggccaggggttgaatccaggctccaagttgc
    agctgtgccctacactgcagctgggcaatgctggatc
    ctttaacccactgtgcccggctagggatcaaacctgc
    atttccacagctacccgagccattgcagttggattct
    taacccactgcactacagtgggaactcccacaaaaca
    ttttttaatgtcctttgaataaagtaggaaagtgctc
    gtctttgagggcagggcggcaatgccatttccacaag
    gtttgctttggcttgggacctcatctgctgtcattta
    gtaatgaataaaattgctgacagtaataggattaact
    gtgtgtggagatagccagggttagagataaaaacact
    ggagaagtcaaataagttgctcgaggtcctctagcta
    ataagctattaagtgggagagtgagggctagaaacag
    gccatctgtctcccaagcacatgtccattagtggttt
    gctgatagccttccagaacaacagagaggactctcaa
    acatggtcttgcctccctccaattgatcccctccatg
    tgcctcacagcgggtctttctaaaattaagttctgat
    tttaattctcccttgctatagcacttaggtatggctt
    tcagccgtgcaataaaaagcaggcaagagtggctcaa
    tcatataggaggttgtttttcttagatcccaagcagg
    taatcctgggcattatggttgttctgcgtttatcaag
    gagccaaattctctatcacctcctgttctatcctcct
    cagtatctggctctattcttcagcatctcaagatggc
    ttgtgctcctccaagcatggcagtcaaattccacaca
    agagggggaaatatgaagggcagacagtgctggtctc
    ctgagctgtccctctttgtcggggaaataaatgtatt
    ccttcatgtcccgtgagacttctgaagtagacgtctg
    cttacgtctcacccaccagaactatgtaaactgcaca
    tagtgctaggtctacatagccactcataactgccagg
    gggtgggaaatctttaaataggtgtaccaccacacaa
    ttaggatgctaatagtaagggagaaggagagaatagg
    ttttgcgcaagccaccagcatgcctgccacaattgct
    taaaattcttcattgacccctcattgccacaggatga
    aatccaaacgccttcttagttgggaatctgacctacc
    tgtctctcccacctggttcagacaccattctccttgg
    tcataaaattccagtcatttgtgaacatccagctccc
    ccatgcctccatgcctttgcacatgctgttcttttat
    cttttatgttgtccttttatcttttatccaaaagaga
    tatcccatcatcacatctcttttgtcagcccccaaat
    actttgtctttcaagttcagctggaggattacctcct
    atttgaaatcagctttgtctcttacaaccaaacaagg
    ttttccttccgagacactcccacagcaccttgaactc
    atctctatcaatcattcatttgattgtaatgaagttg
    ttggtggtatgcctgtgtctctgacacatctgcgatc
    tcatgagttccttaagtggaatgtgaatagcgggatg
    aacagtattggtcttcagccctcatctctgcagatgt
    tgcttgacccaaatgagcgttgccttttattttgatt
    ttgctttgatttgtctactccatgtacttgagccatg
    catttctgtcttagcgatgctttttaaaagtcatttt
    ttggttgattatccagatttgtccacctttgcttcta
    gTTGTAGAAAAGGATGAAGAAAATGGAGTTTTGCTTC
    TAGAACTAAATCCTCCTAACCCGTGGGATTCAGAACC
    CAGATCTCCTGAAGATTTGGCATTTGGGGAAGTGCAG
    gtaaggaaatgttaaattgcaatattcttaaaaacac
    aaataaagctaacatatcaatttatatatatatatat
    atatatatttttttttttttttacatcttatattacc
    ttgagtattcttggaagtggctagttaggacatataa
    taaagttattctgaagtctttttttttctttttccat
    ggtgagcagtggcttgatgtggatctcagctcccaga
    cgaggcactgaacctgagccgcagtggtgaaagcacc
    aagttctagccactagaccaccagggaactccctatt
    ctaaattcttgagcacattatttaggaacctcaggaa
    cttggcaggattacaggaaatatatctagatttaaaa
    aaaaatcttttaacagaggtcccaaaggagagtcatg
    cacagctatgggaggaagttcagaaactgcccttgct
    accagatcactgtcagataaaatggccagctacatgt
    ttctgcacattgccctaagatctttacaaacttttct
    gtgcatttttccacttttaaaagaaaatttcggggtt
    cctgttgttgctcagtggttaacgaacccaactagta
    tccatggggacaggggttcgagccctggcctcactca
    gtgggttaagaatctggcattgctgtggctgtggcgt
    aggctggcggctacagctcagattggacccctagcct
    gagaacctccatatgccgcaggtatggccctaaaaaa
    aaaaaaaaagagagagagagaatttcctccagaaaaa
    acactttggtagtttgggagaagtaaacaaccaaaaa
    ttaatttttctggagtattcgggaagcttgtaaaaat
    gggctcttacttttttgaggagacaaatgggaaccta
    cccagaagaggcacaatcacctgcatttgatttcttg
    acctctccctaccttctttgctggctttccacatttg
    gatttctgtgaccttatctctgctccttggtgttttc
    atttttcctgtggacgtgccagactatgggaagggag
    taaggcgttgatttagaatcctgtagtctctgcctgt
    ctctagtcattgttttcacccttctcaaaggaccttg
    acatcctgagtgagtccgcaagtaatttaggggagaa
    gccttagaagccagtgcagccaggctacatgactgtg
    tccacccactggaaccagtcatttttatacctattca
    cagcccccctaccatttaaatccccagaggtctgcca
    taacatctgtaactccctttcctggtaaattgtgttc
    taaaagactggtaacaaaagatattctgtggtacaga
    gcataattaaatacctgggagctgatttgagtggggt
    aaatcaactggtttgacccctaaaacccaccatgagc
    atttctgttctaataaagtaatgcccgtgctgggaat
    tgtgttctacggaaatgctcctgctgtgtctttcttg
    agtcctgtgtcattgaacatgcttaggagcaaaggtc
    ccccatgtggcttgtctgctaaccagcccagttcctt
    gttctggctggtaatgatccgatcatctgaatctcac
    tgtcttccaacagATCACGTACCTTACTCACGCCTGC
    ATGGACCTCAAGCTGGGGGACAAGAGAATGGTGTTCG
    ACCCTTGGTTAATCGGTCCTGCTTTTGCGCGAGGATG
    GTGGTTACTACACGAGCCTCCATCTGATTGGCTGGAG
    AGGCTGAGCCGCGCAGACTTAATTTACATCAGTCACA
    TGCACTCAGACCACCTGAGgtaaggaagggtgagccc
    tcaactccgaagaaaatgctgcaataaaagcactgtt
    ggttttcagctttttttgtaatcactgctcattctga
    ggtagattcgcttgggctgataaaaagagaactaatt
    cagataaatgcttgcatttgcatagcctcttttttta
    aaaactttttttttttttttttttttttggcttttca
    gggctgaacctgtggcatatggaggttcccaggctag
    gggtcgaatcagagctgtagccccgggcctatgccac
    tgccatagcaacatgcatagcctcctttttaaagtgc
    cttcctgttttataccattgggatgtgagaagagcta
    ttgtggaaangagcatggggtnataaccctggacctc
    tcacgtcctaccctcaggntagtgggaaaactctgag
    tttaaggacatcaaagtgactcctttttagttacatt
    atggnggaatcagcncatatttttacaaggggcggag
    ngtaanctgttggagtttacaagacatatggtggcat
    tgcaactacttaaccctactattatagcacaaaagca
    gccatagtcggtcctgaaggagcctgatgccttcagc
    tttataggcaatgacgtgtgaatatcacaaacagttt
    cctgtgtcaccaaacatgattgccttttgatttccct
    ttcaaccctttaaaaaaaggtaaaagcccttcttagc
    attcagcagcaggtcgctgtgttttgccaactcctga
    tctgtagcatttcgacaacactgagctctcaactttt
    gaaccctgagtccaccacatccttcagtgaaaccaga
    gccatgtgatactaaggatagaaacggaaacttcctg
    aatccaggcgatcaaataggagggagaaagaggaact
    ttcattgacaaaaccacaaatattgtgaatggactgt
    tacaaatattgtgaatgctcctattcccaaccccctg
    gcttcattacagggtcctatgtgttcatccttattga
    gaaatttgtattgctactgccaggttgccaataccca
    gcggtgcccatggtgttctaaaatgaagcaatttcaa
    ctttatttttttttcctgtgactttacatgacaagtt
    cacatgaaggatatactttgatagtaatgtccatggt
    tagggaatatacattgtttgctggttgactggcccct
    ggatttttctattgaaagtccatgagatctcgaaggc
    acaggtgtgttctctcgctttttaaggaaagggttta
    aaaacttaagtaattaacagctttagtaacaaattac
    ctataacacacttaaaaaccgaataccacccactgga
    gtattgtgctacgattaaaaatctacttgtctactac
    atgatatctttgtcccacagaaggttctggaaccaaa
    cttgtaatttcaggattatgagagccctgagttcacg
    cattgtgtaataactatgttgtgtggtagtcaatttg
    tacagcttgcttagagagaacaatgtcaagttaagga
    ggcgattgctttatagtgcctgtcacaagatgccatt
    gccattgtcctagcaagagatattctatgggagtata
    ctacattttagtgaggataagaactttttatggcatt
    tagtccggtcatttcccaaccactgtcctgaaaacca
    atttcattttgatttcaggggcttgtgtgggcaaagt
    tgccaggcattaaaaagccacttctcaactgtagtat
    cacaatgctttagttgggtagtgtattgcagatagct
    tatggctgaaaagttaccaagccttgcagttttcact
    cctttgagtttatttccttgacagaattgaccctgag
    ttttttgactcttacctgctcaactaataaacaccag
    agtcatttatctccattgctcttgtctgacctttatt
    taccgaataatgccttatgggttcacaaaaacaaggg
    gggagggggccagcatgccttagaaactgtctttagt
    caagaaatgngattttattatgtaaatatatgagtat
    tataatagatagtgttattaatagacaccagcaagaa
    ttgtcaataatttaaaaatcacaaattaaaatacatc
    catgttagnatcatttatcctaactcccaaagccctt
    taaagtggaagatttagatgttaacccagagattaaa
    gacatgttcaaagaatccttgatttttttttgaatcc
    cttgtttttagagaagaaaacctaatgattttccccc
    tctggattctacatattaaatatagttttggaacttg
    aatattagtatggttaataagtgctgatatgctgatt
    ttgtttatatttttcttatgagtaaatatcctatatc
    accagacattatagtctatgtacaaatatgattctta
    aacctgatagcacattcattagagttggaattgcctt
    ttttttttttttttttacagttgcacctgcaacatat
    gaaagttcccaggctaggggttgaatccaagctgcag
    ctgccaccctacattacagccgtagtaacagcagatc
    cgagctgcatctgcaacctatgctgcagctcagggca
    atgccagatccactgagtgaagccagggatggaactt
    gcatcctcatagagacaacgtcgtgtccttaacccac
    tgagccagaacaggaactccagaatttcctttcaata
    gaagaagcaccaagtttaggatcagaaagcctgaatt
    tgaataccaatttactatttgttagtcatatatttct
    gagtgtgtttcctcatttattaaaagcagactaaaag
    atgagagggtcttttgttgagaatcaaatacaataac
    atgtgaaagtgtgtaacactatgattgaaatatacct
    acacagccatttatttgtttattgttcatgttttgcc
    acccacacagtagtatataatccttttatgtaataaa
    tgctaataatgaaagttggcaacttatgtaagtactc
    aaaatgctggaggtcatgggatactgactgggatact
    acagaggtaatgtcatttcctctgcgctaaacttatt
    gtctgtagttagggactgactctctttaggacaagga
    gttcattctgtataccatgtgtggctatcacccttcg
    aagttgaaaaactgccccagggtgggcacccatccgt
    tctcttagatatatggccgagacctttctctcactgg
    gagggaaccacactgaggaatgagaaaaaaaaaagga
    aaatcaagatgaaaccagaaacctctttggcataact
    tctccactctgtactttttgttagaactacccttgca
    caaagcagcatcagtgtggaagacagaatttgcacac
    ctggtttgatatacatgccgtggtatatgggatgttc
    taacaataaagaggactctcccaggaaatctcctcac
    tgttatagtcagccttgaggaaagagctcttcttttg
    gactctggggagagtctagtttttcagttccttgctt
    ctcggtcaacgtgttggtgtaaggatcacactctctc
    ttatactagataattctattttttcacctttcaacct
    gtctatccttctgaccctagTTACCCAACACTGAAGA
    AGCTTGCTGAGAGAAGACCAGATGTTCCCATTTATGT
    TGGCAACACGGAAAGACCTGTATTTTGGAATCTGAAT
    CAGAGTGGCGTCCAGTTGACTAATATCAATGTAGTGC
    CATTTGGAATATGGCAGCAGgtctgtgttctttccac
    atgtttgggttatcctttctgggataaatttgaggcg
    agatagaaactttaagactaaagaaacaatggcctac
    tttttttgtacatggtcctgtgtaaatctctatttga
    gctgaaataagatggtcttcctctccaattatccatg
    gtatgactctgatggataacaaatccagttctgaaaa
    aaggggatttctttccagaagagaggacagtttcttc
    aaatattgaattaaaagcaaaatagatgtaaaccgtt
    gttggttttattgttgaattccagGTAGACAAAAATC
    TTCGATTCATGATCTTGATGGATGGCGTTCATCCTGA
    GATGGACACTTGCATTATTGTGGAATACAAAGgtatt
    ttcttgccctcatcagcatgaaattgctcttggtaga
    aaggataataatagttatccaaaacatcatcctatgt
    tcatctgtttcttccctcttcattttccatagagtac
    agtatattctatctctgtcttaggaaaatggactgtc
    attcatataatcttacagagaatcaattagtaatgta
    ctctatgccgtgacaggtgcgaaggttttttttgaag
    gcaacagataaaaatatcctatatttcacctattgta
    atttccttaaaactgacattattgaataaatgtttta
    ctttcatcttgaatattattatgttatggaatcatac
    actttaccccaataatcatcgaaaagaatttccaaaa
    ggttgagagagttgtgttgatctgattactttcctct
    gcatcctttgagcttaacctttgaatatagtttgcta
    aggaaagtagtctgtttatgatcctggagtggaatca
    ggctaagtgtcctcattcagaacccactgaatcagac
    agaatgaatttatttccttgaaagttcaaaatgtgtc
    actcaagagtataaattttcaaatcttactctctctt
    ttccttggatgtgagcaattcttcgataattgaatga
    ggcagattatatagacttacatggaagactgttggcc
    tgagaattcaaactatggtgttcaagacttcacngng
    agtccgatgccatttgtttcccacagGTCATAAAATA
    CTCAATACAGTGGATTGCACCAGACCCAATGGAGGAA
    GGCTGCCTATGAAGGTTGCATTAATGATGAGTGATTT
    TGCTGGAGGAGCTTCAGGCTTTCCAATGACTTTCAGT
    GGTGGAAAATTTACTGgtaattctttatatcaaaatg
    atgccaaggagttggcatggcactttgctaaatgctg
    tgtgaatcaatacaaagataattaggacatggttctt
    cctcacaagaggtgtgcaatcttattgggaaatcata
    cttgcaagtcacaaatatagactaaagtttccagctg
    agaatatgctgatggagcatgaaacactaaggagaca
    gggagaatctcaggaaaaatcaagaataatttggatc
    aaatggattcctgacatagaacatagagctgatcaga
    aagagtctgacattggtaatccaggcttaagtgctct
    ttgtatgtggttcagaacagagtgtgggcagcctgag
    ggggatacatacccttgacctcgtggaaagctcatac
    gggggagggatgaggctaaggaagcccctctaaagtg
    tgggattacgagaggttgggggggtggtagggaaaat
    agtggtcaaagagtataaacttccagttacaagatga
    ataaattctaggggtataataacagcatggcactata
    gatagcatattgtactatatactggaagtgctgagag
    tagatcttacatgttctaaccacacacacacacacac
    acacacacacaccacacacacacaccacacacacaca
    cgtgcacacaaacagaaatggtaattatgtgaggtga
    tggcggtgttaactaactttattgtggtcatcattta
    gccatacatgcatgtcatgaaatcaccatgttgtaca
    ccttaaagttatgtaatactagatgtcagttatatct
    caaagctagaaaaaatgtggggaccaaggcagaagct
    cttctgctctgtgtctaagggtggttctggggctggg
    atggggaggatggttaagtggtatatttttttcatac
    ctttgctcagtactatcattgtaagtgttcaatatat
    gtctgcttaataaattaatgtttttagtaagtaatct
    ctgtttagtaatgtgtcagaaatgccctacttgcaat
    aggaagaaaacctgtccagtcccttccttttttctgt
    aagtctgatttcattgcctcccagaatgcatcaccat
    gtgagagatagagggaaggtgctgtccttatggggtt
    aacagtgtgactagggaggcaaaatatacctactaaa
    gggtggtagcataattcagttcttatgtgagtatgtg
    tatgtgtgtgagtatgtgcacatgcacatacatttta
    aaaggtctgtaatatactaacatgttcatagtggtta
    cacctagcttataggtaacattttttcccctgtatcc
    ttgtttgtgtttatcaaattttcataacagtaatggt
    agaaggagtacctgacatggtaccatacatgctnggn
    cctgcctaatttctcnatttcctttattgcccatacc
    cccattgcttgacaagcataagtccatactggcttgt
    tttcgttcctcagactcagtacaccatgtagctccat
    gccctgggtctttgtatgtgctatttctactgcttag
    agtgctattgcccctgaccaccacgtggtcagcaact
    tctcttctgcgtctgtgtctatggtctatgattccag
    atgtcatcttcactaactacccttctaatatgccctt
    ccatcccacccgtcctcatccttaccccagccactct
    ctatttggtggctctgttttattttcttcctagctca
    tcactctttgaaatgaacttatttacttattcaatat
    ttgcttctttcactagaatgaatgctccatgagagca
    gggacctgctttatcttgctcgccactgtattcacag
    tgcctagaactacgtctggcacatagtaggtgctcaa
    taaatatcgatcaaatgaaagaatgagcaaacgaaca
    aatgaacaacacgtgaggtaggcatcatgattccatc
    aacagaggagaaaaccagacttaaagnaatgaagtgg
    nggagctgcatttgatcttgactgactccaacatcca
    tgctcttgaccactgtgcatctccagagtgtaatgaa
    catactttacttttatattccaccaaaataacaaagc
    catgcccatgttagtagagagttaatcgacagtgccc
    ttaaaatatgcatgcacccagggtacaactatgcatg
    ctgccctgtgttttcagttggatccaaatgaattgcc
    gtaaacaaagaggggattcaatgtctttgactagttt
    gggatattttcctagtaaccaactttgcaaaataaag
    ccactaatgacaaggagctttgttctacttctgcatc
    actcaactgtcaatttttatctcttgcaagacttcta
    atctactagaacttttgtttttctgtgatttctgaac
    agagaagactaatccaaaccctgtcattccagAGGAA
    TGGAAAGCCCAATTCATTAAAACAGAAAGGAAGAAAC
    TCCTGAACTACAAGGCTCGGCTGGTGAAGGACCTACA
    ACCCAGAATTTACTGCCCCTTTCCTGGGTATTTCGTG
    GAATCCCACCCAGCAGACAAgtatggctggatatttt
    atataacgtgtttacgcataagttaatatatgctgaa
    tgagtgatttagctgtgaaacaacatgaaatgagaaa
    gaatgattagtaggggtctggagcttattttaacaag
    cagcctgaaaacagagagtatgaataaaaaaaattaa
    atacaagagctattaccaattatgtataatagtcttg
    tacatctaacttcaattccaatcactatatgcttata
    ctaaaaaacgaagtatagagtcaaccttctttgacta
    acagctcttccctagtcagggacattagctcaagtat
    agtctttatttttcctggggtaagaaaagaaggattg
    ggaagtaggaatgcaaagaaataaaaaataattctgt
    cattgttcaaataagaatgtcatctgaaaataaactg
    ccttacatgggaatgctcttatttgtcagGTATATTA
    AGGAAACAAACATCAAAAATGACCCAAATGAACTCAA
    CAATCTTATCAAGAAGAATTCTGAGGTGGTAACCTGG
    ACCCCAAGACCTGGAGCCACTCTTGATCTGGGTAGGA
    TGCTAAAGGACCCAACAGACAGgtttgacttgaatat
    ttacagggaacaaaaatgatttctgaattttttcatg
    tttatgagaaaataaagggcatacctatggcctcttg
    gcaggtccctgtttgtaggaatattaagtttttcttg
    actagcatcctgagcttgtcatgcattaagatctaca
    caccaccctttaaagtgggagtcttactgtataaaat
    aaactattaaataagtatctttcaactctggggtggg
    gggggagactgagttttttcacagtcctatataataa
    ttttcttatcctataaaataattaggagttcccgtag
    tggctcagcaatagcaaacccgactagtatcgatgag
    gatgcgggttcgattcctggcccccctcagtgggtta
    aggatctggcattgccgtgagctgtggtgtaggtggc
    agacacggctcagatcccacgttactgtggctgtggc
    ataggccagcagctccagctctgattagacccttagc
    ctgggaacttccatatgctgtgggtgtggccttgaaa
    aaaaataaataaataagataattactcaaatgttttc
    cttgtctcagaaccttacttcaggataaagagtgaga
    aagttttttttatgaagggccattattacagctcaaa
    aataagttgtcttcagcaagtagaaagcaataagcct
    gagagttagtgttcctatcagtgtaaatattacctcc
    tcgccaatccccagacagtccatttgaacaattaacg
    gtgccctgggagtacagttcagaaacattaatgtgga
    tgttccagacctgtatttttataagtacttgtcttga
    gccggatggaaccatcattcctcaccattatttagaa
    gtggactgtgactctgttggagatcagggcacacggt
    taccaaaagcacacccttctcctggccttacctttgc
    aaagctggggtctgggacacagtcagctgattatacc
    cttttactaacttcccacagctcaaatctggtcaatt
    ctccttcacaaatctcttaaaaatccatcactcacct
    ccagcctcttctgctgtggccttgattcagcctctca
    caatttttttttaaccagaattctggcagtggcccct
    gacttgcctctgtgctcccagccccgctgtcctctga
    tccatcctccatgccagcctttttcaatctgctggtc
    acgattcattgatgggttaggaaatcaatggcatcac
    aactagcatttagaaaaaggaaataggcgttcccgcc
    gtggcacagcagaaataaatccgactaggaaccataa
    ggttgcgggttcaacccctggccttgttcagtgggtt
    aaggatccggcattgccgtgggctgttttgtaagtca
    cagacatggctctgatccggcattgctgtggctctgg
    cgtaggcctgcagcatcagctccaattagacccctat
    cctgggagcctccatatgctgcaagtgcagccctaaa
    aaaaataaaaaaataaaaaaaaataaataaaagaagt
    agacaaattgtatagaacaaccctgagtatgttgcct
    gagcacatataacaagggtaagtattatttcaggaaa
    ctctggtttcacagatactcttggcatatggacccct
    agagtcctgatgtaaaatatattcttcctgggatctt
    aggcaagaagtttgaaagctccaactctgcactgctg
    ccaaagaaatgatttttaagtgcaaaactcttcccgt
    tcccttccctgtataaaattccataggatctctccag
    tgcctctaggataaaggcagttttcattctctagttc
    aaggtgagagaagattttaattatttcacgttttagt
    ggggaattcaagagtctggcacctgacatttgctgaa
    ctctctccattatccctctctagttccccagacgcat
    cctatggtagaaattcgcaaactagagtgagcgtcag
    agtaacccaaggaaactgggtaaatgcagctccctgg
    gctctaccccctgagattctgattcagtagatctgaa
    gcagagccctggaatatgcatatgcatcattgtgtca
    caccaagcattctgggtaatgagagttgatgttaggt
    tctcagtagtaagacaagtatagagattccgggggac
    tgagtgctcagctctgccttggggaggagggagaggg
    ctaaagagaacaggagatggggacagggaatgctcaa
    cctccaatcttaggcatttgagctatgtcttaggggt
    caggaggaggttaccaatatagtgattaagagattga
    ggttccagtcagagggatatgctggagaaggggggtg
    aaaataatgtcataggtttggtgagtgcagatacttt
    gagttttttaatatttttattgaaatatagttgattt
    acaatgctcttagtgagtacaattactttgaataagt
    gcatagatgtatgccattcttccagaaatgatttatt
    gagctcctttgggcatcatgctaagtacaggggaaac
    agctgtgaagaggtccttcccttatgaagtcattcat
    ccccttcagtaaatgaaggtaaaggaaaaggatgaga
    cagggacgccgtgttggaccagggtcagaaaggcctt
    ataagaccttgcctggagggcaaggaacttgcctgtg
    agtaaggagagcttgagaaagcgataaagcaaagaag
    gaacattactgcattgtgttttagaaaaaccatgtcc
    tggggaagaactcctagagtcaggggggccagttggg
    agactgtgcttttttccaggaggagataagtgaggct
    gctggctgagatggagcaaggatttagagaagcagat
    atgagattcatttagaagttagacattttaggatctg
    acacataatttatcaccaaaaccagtgcatctctggc
    tttgggccaccagttttggagaagtggaatgtaggga
    cctaccattacctgccaatctttactacacagatgcc
    tatttccctcctcatatttcctttctccagatcacgt
    cctattctattgccaggactcaagattccaccttgca
    tgcagtgatccatcttcacactggatggacagctcta
    gggatgtcagagcacactcccatactgctgactgggt
    ctcctgtcagcccatctgtctatcagctgtggtatta
    ttagtataataagagggctgtatatgagagacacaaa
    attctaggtgtagctcaaagataggctagagttattc
    ctatgtacaacaaatatttatgggaccccttctgtgt
    actgtcatggttgctgctttcatcatacttgtagtct
    aatggaggtgggggcagggcaggaataagcggatgtc
    cacaaaatcagtaagaccacttatattcaacattttc
    ataatttagttatttgagcccaaagggtccacatccg
    tggtattccaacttttttttccccggacatggatctt
    tatctttttttttttttcttttttgcggccagacctg
    cggcatatggaagttcccaggccaggggttgaatggg
    agttgcagctgcctggtctacaccacagccacagcaa
    ggtgggatctgagctgcatctgtgacatacaccgcag
    ctgaggtaacaccagattctgaacccactgaatgagg
    ccagggatggaacccgtctccttatgaacactatgtc
    atgttcttcaccctctgagccacaacgggaactccag
    acttcgtctttaaatgtattctgacttggagagctat
    cacactaagcaattaacaggagctgacctggtttagg
    ctggggtggggccctactcctcaatgttccctgaggc
    acatctgtgggacccctgggcatcatctatctgagca
    gccttagagctgctcatccagttgactgttgatgtag
    aagtgcaaacttctgccttccttatgctttctttttt
    cattgttctctcccctttgtgtctttaagCAAGGGCA
    TCGTAGAGCCTCCAGAAGGGACTAAGATTTACAAGGA
    TTCCTGGGATTTTGGCCCATATTTGAATATCTTGAAT
    GCTGCTATAGGAGATGAAATATTTCGTCACTCATCCT
    GGATAAAAGAATACTTCACTTGGGCTGGATTTAAGGA
    TTATAACCTGGTGGTCAGGgtatgctatgaagttatt
    atttgtttttgttttcttgtattacagagctatatga
    aaacctcttagtattccagttggtttctcaataagca
    ttcattgagccttactgactgtcagacggagggcgta
    ttggactatgtgctgaaacaatcctttgttgaaaatg
    tagggaatgttgaaaatgtagggaatgaaatgtagat
    ccagctctgtttctcttttggaggattctttttcctc
    catcaccgtgtcttggttcttgtttgttttgggtttt
    tgtgggtgttgtattgtgttgtgttggttatggcagt
    gacagctatttaaactgtgaaacgggggagttcccgt
    cgtggcgcagtggttaacgaatccgactgggaaccat
    gaggttgcgggttcggtccctgcccttgctcagtggg
    ttaacgatccggcgttgccgtgagctgtggtgtaggt
    tgcagacacggctcggatcccgcgttgctgtggctct
    agcgtaggccagcggctacagctccgattggacccct
    agcctgggaacctccatatgccgcaggagcggcccaa
    agaaatagcaaaaagacaaaataaataaataaataaa
    taagtaagtaaaataaactgtgaaacggggagttccc
    ttcatggctcagcagttaacaaacccagctaggatcc
    atgaggatgtaggttcgatccctggccttgctcagtg
    ggttaagaatccagcgttgctgtgagctgtgatgtag
    gtcgcagatgcagcccagatcctgcattgctgtggct
    gtggcgtaggctggcagctgaagctccgattcaaccc
    ctagcctgggaacatccatatgctgcaggtgtggcct
    taagaggcaaaaaaataaaaaaataaaaaataaataa
    attgtgggacagacaggtggctccactgcagagctgg
    tgtcctgtagcagcctggaagcaggtaaggtaaggac
    tgcagctgggtaaggactgaattgcaccaactgggaa
    gtaagcctagatctagaacttaagttagccctgacat
    agacacacagagctcaccagctaagtggttcagctta
    taagctggtcactgaaactgaggatgtccacaaaagc
    aaaataagtagcaacaggcagcgggatgcaagagaaa
    gaggaggcctaaaatggtctgggaatccctgccatac
    ctatattttatcctacttatatttagtgcctgaatgt
    gtgcctggagagcaaagtttagggaaagcatcgggaa
    atgcacagtattcatacccttaggaacaaagatcagt
    tacctccagggtaaagactatttccaagtttaaattt
    caacccctgaacattagtactgggtaccaggcaacac
    ttgccatcctcaaaatcaatgaatcctaaaattcaac
    ctgggggtcagtgacagtctgtgacaaagtttttgct
    ggtcagtaacgaaataagtatgagcaccatctgagta
    tggtcaccaagatgtcaactctctttcctttggacga
    attgtcattattccaagattaggtcctttctattttt
    gaggtgtgaaaacatctttcctttcataaaataaaag
    gatagtaggtggaagaattttttttgttttttggtct
    ttttgctatttctttgggccgcttctgcagcatatgg
    aggttcccaggccaggggtcgaatcggagctttagcc
    accggcccacgccagagccacagcaacacgggatcca
    agccgcatctgcagcctacaccacagctcacggcaat
    gccggatcgttaacccactgagcaagggcagggaccg
    aacccgcaacctcatggttcctagtcggattcgttaa
    ccactgcgccacgacgggaactcctaatgatactctt
    ttatatttagctactatgtgatgatgagaaacagtcc
    acattttattattttttagccaatttgatatctcatt
    actaagataatgataattttctctataaattttattt
    aagttagtgttatgaagtggttttgctagtgtagaag
    gctaggatttgaattcagttcaagaaagaagagaggg
    agggagggagagggatgggtagagggatggggcagtg
    ggagagagcaaagaggagagacagtttttgtattaat
    tctgcttcattgctatcatttaagggcacttgggtct
    tgcacattctagaattttctaaggaccttgaccgcca
    gattgatatgcttcttccctttaccatgttgtcattt
    gaacagATGATTGAGACAGATGAGGACTTCAGCCCTT
    TGCCTGGAGGATATGACTATTTGGTTGACTTTCTGGA
    TTTATCCTTTCCAAAAGAAAGACCAAGCCGGGAACAT
    CCATATGAGGAAgtaagcaggaataccagtggaagtg
    cccctttcttccttccttcctaaataaacttttttat
    tttggaacaactttagagttacagaaaagttgcaaag
    atattatagacagtagtgtttatatatatatataaat
    ttttttttgctttttatgaccacacctgtggcatatg
    gaggttcccagtctaggggttgaattggagctacagc
    tgccagtctgtgccataaccacagcaatgcaggatct
    gggccacgtctgtgacctacaccaaagctcacagctg
    gattcttaacccactgagcaaggccagggattgaacc
    tgcatcctcgtggttcctagttggattcgtttccgct
    ttgccgcaatgggaactccaaattattgttaatatct
    tactttactggggtacatttgttacaaccaatactct
    gatactgaaacattactgttaactccgtacttgcttc
    tttttgagtcatttgcaaagactggcttcttgacctg
    cttccttccaaacagctggcctgcctatgctgttctc
    agacctgcaagcactgatctctgccccccttgccttc
    tctccagtggtgtctccttccccaaacaaacccagtg
    tggctctggaaagggagttaagtcaacataaaccaac
    acatattttgttgagctccaattttgagcaaatccct
    cacctacggcagacaggcatgatgttaagaactaggg
    ctttggacacaaggtcaagaccaagaagggttcctca
    cccctactgattcagataaccaataatgaggctttga
    atccctgtccaaaggttgttttttttcccttctattg
    agcttcttgccaccttatcagttttttttatgacagt
    caaatgacatgatatatgtgagcatacatggtaattt
    ttaattctatataaatgaatcactaaataaataggag
    gatatatagtccacctttaagcgtattacacgtgtca
    catgaatgtggcgacttaattgtagaggtttaaatgt
    agcttcctataatagatgtgttcctaaactacatttt
    aatcattggacttgtatttttatgttagcacttgctg
    ttgaagaaaagcctatgccaaaagttcagtgaaacca
    ataatccactgccagctttctgagttaaaaaaaatcc
    ctgggttttcacacacaggaacaccctgtgtgaaaca
    ctcatttagagcaaaatgcatctgataaggagttcct
    gttgtgcctcaactggttaaggacctgacattctcca
    tgagaatgtgagtttgatccccggccccactcgatgg
    gttaaggatctggtgttgccacaaactgcagctccga
    ttcatctcctagcctagaaacttccacagcccagaat
    atgccacagaattcggctgtttaaaaaaaaaaagaaa
    aaaaaaagaatcataaatgtgttggtttgttcaccaa
    atacatgataacttgctcttgccaagctcagcttcat
    aaatattaagtcatttaatacagcagccaccttatga
    acagatattactatacttcccatttacagataaggaa
    aatgccatatttaaccaagagattaaataactttccc
    gaggtcttatagcaagtaaatcatggtgcaggggttt
    gaccacacgcagtctatctccagagtctgtgtattta
    gccactgttttactttcaaatttaaatttataaaact
    tctaaattatctgttaaccataatctttggaattttt
    aaaaccacgagttcctataaaatgtttcattgaaagt
    aagtcacttttccatagcttttgataatacatctgta
    ggataaagtaagccacagctctcttgcagacttggta
    caccctggggcaaagcatcatgcctgtcacgtacatg
    gtggtccttactttgactctcagtgcttttattgccc
    aggaattttgtgagatttctagttgttgaggtttgtt
    taaagaggttatgccggtacttggaagagctcttttc
    ttgctacctggagccttctcatatttcctttttgagg
    agggacatgaattgcctttcaaactcataaatatatt
    ttctagtacacaagtctccatcttccttagacgcatg
    gctcctggagttctccatcctcctgctccactttggg
    tgggctcctctctgggtctgccaccaatctgccaccc
    agagacatccttgacccacttccagaccccaccatgg
    cttcactttcttcgctttcctcctttgtggaaccttc
    tgcttaagaatctgaggaagaaaatttgcacgtgagc
    taaactggaggtactttcctgcctggtcttgcacgat
    agcttggctgagcccatgatgctgggtggctgttact
    ttccatggacacccgaaggcgttgctcctttggcttc
    tagttgcatgcagtgttgcttatcccaggctgatctt
    tcttccactgtaggtgacttttaagaattaagggatt
    aatctatatctacaacaacaacaacaaagaccttttc
    aagctgaggtagggctttctgtatatgtttggagtgg
    ttatccagcagactttacttgaaggcaggggtcatat
    cctcaagtgctcataaacggaccacagaaagatctca
    taattgggtggagctgggtggggaccgtgtcatgtgg
    ccaggaaatgccagatgggaagggagtggcccttact
    gagctccagctgaactctgaattttctagaaaactca
    gaaatctggatttttcatgtgtaatacccagatttat
    agatgtggaaagctaattctttttttttttaagggac
    tataggcaatgaactaagatctaggttgtatttggac
    aaggggtcatcagtttaagctgtgtagttgagcgctc
    agctattgggctgagggacccctaaatactgagacgg
    ggaggtccttgctctggggcatcacaagtacactccc
    tggtctcattcaaacacttttcctacaaaattgatcc
    catttcttcagtgcactgtctgaatgcatttggccca
    gagccgtgctgaggcatagggaaggggtccacggttt
    catggcatcgttttgtgctgtgtgtccctgctgtcgt
    ccaggatacctacctctcctcctcctgcatctgaatg
    tccccccacagactctctgggattctacagcctctgg
    cctgttcctcagacacctcttacctgccagctttcca
    gattcacattagttagtccaaatctactgccgtcagt
    gactcacttcatttcttcttctccgaggcagttcagc
    ccggtacagttgttttgtcaacacttcagttgagtct
    ggaagatgtgcatgggttatgcacgagagcggtccat
    cattttgagctagaagtcctttctcagcccagagaca
    agtcctcatctcctttacttcctgactcttcttcctc
    tgcatccttccaagatatctctttctccagccaccac
    ctaaatctcttcttttcccggggttccgtgctcaacc
    cactcttcttcttaaatctgtggctgggtgaacgcat
    ctgctggcaccacttctctgctaagactccaaaaatc
    cataggtcctgcccggcctttgcccacctctctccaa
    cactgtccagctttagatgtagagctaatccccccag
    agatatcattccctggatgtctaagtcctttggtatc
    tcactttcagcgtgttcaaaatcctcttacaactgtt
    ctttctccttttccatcttgattattggcaacatgcc
    agcctttcccctacccccagcagtgagccaagctaga
    aacaagggcttaatcttcaatctttccttctccatcc
    ctaaacctaatgagtctccaagcccttcccagtttac
    accctaaatgttgctcaaaacatcccctagttcttcc
    acgtgctctcctctatattgaaaggtcaagaaaggcc
    atcttccctccactgtgaggaaatagatcttgatact
    gcccctgagctgggcagtcctcgacctgacaaactgt
    gcagtgtttctaaatctctactggcaaaatgagagtg
    cctttgacctgtgttgcgatctcagatcacagtggat
    gtaattgttttataggaatggtgaacgaaaaagaagt
    aaatccctaatgccaaactcctgatcattctatgtca
    tttaatagcctgtcatttatgataaagtttcctctac
    tggcattagcacaatacttctcaggaaaaaaaaatat
    gatgccagatactgaaaagctcctgggtaaacatgaa
    catgggtaccgataaaatggtgaagccagtccaatct
    tagagtgacttcccttcatgctacttcatgctctttt
    ttttttttttttttaagaaaaaccccttttttttttc
    tcacaccagtcacagaggagaccgaggcttagcaagg
    ttaaggtcacatgattagtaagtgctgggctgaaact
    caaaaccatctctgcttgtctcctaaccctgtgcacc
    tctgactattcaacagATCCTGTGTCAGGAGTTGGGA
    TTCTTTGAAGgtaagggccttgaccaccgaattaagg
    taatcttgctctgtggcaggccttgttttcagtattt
    taagtacactggctcaggtaatcctcacaacagcccc
    aggaggaatgttctattacctccactgtatagatgag
    gaacttgaggcacagaatggttgccaaggtcacacag
    ctatattgggggttcatacccagccatccaactctgt
    ctgtactctctgccactctgcacccccagctcctgat
    ccacttcctgtttccatccctcgatttctgctgcact
    caggggcccctctccccctcggcctgtgagatctgct
    tcagtaggcttttctccctgactcctccatccctgtc
    cttacaggcagctgcttctctccgggacacgaggggt
    ccatacggacactctctactggctgggttgcgcctaa
    ctcgtgattcctcctctgtttcagATTCGGAGCCGGG
    TTGATGTCATCAGACACGTGGTAAAGAATGGTCTGCT
    CTGGGATGACTTGTACATAGGATTCCAAACCCGGCTT
    CAGCGGGATCCTGATATATACCATCATCTgtaagtcc
    gaaaatgcctgtcgtgtgtgccttaggctgctgcgga
    ggaggccagggctatataagcagagtcagtgactgac
    tgtgccctgcagtgttgatggccatggagattccacc
    gttagagcttttttctttgttaaccttgaaggcaaat
    ctggttaggaagataactttcaaagagtcaccatctg
    gacattcatgcccatgtgcttcaatcctgtatacaag
    cagtttagagtacagggaagggaaggacattatgaaa
    gggagagggtgtgtttggatccagcagctccatcctc
    agaatttatctgaagacactgcaaaattactaagaat
    cactatgacaagaatgaggatggggtgatatggcaaa
    gttgtgatcctggaagaccttcatctcccatgttgcc
    caactctgaacatgaatttggtgaactagttggttaa
    ggggatgatcctccaagtttctccctggttgagctcc
    aaaaaccatgtaagtttctcatagcaaaaccgtatag
    gtccttagggctttagttggaatatttgtgctgaaat
    gctggaaagccccatttgccatttttgtatttgcaaa
    ataatcatcaagaggggagaatgcattctttcatgac
    cactgaccctctgaaaaggtcaggaatttagtctgaa
    gtaggcaagcctcctaccccgcttctgccatgagctt
    gcacgcacaggcctgtcttgacatttcttctttatag
    atttctttttgaatatcttgaaattgctttaaaaata
    tttaaagaatgtagaattatataaaataaaaaggaaa
    taaccccacacctcccacaaaaccctgtttcctgcct
    ttctccacccactctccagggtaacacttggtaacag
    catagttgtatcaccccaggcctatttttgagcatat
    cagcatttcaagaaatgtattttttctcaataaaaca
    tcccttatagttgaggaggggaggttatcattcctgg
    gttttgttttttttttttttttaatgtaatcctggta
    catcggtaatttgcattttttattcattaatatcttt
    ggtatttctagtgttgggacacacaggtcaacctcag
    tttttgggtttttttttttgtctttttgtctttctag
    ggccacacctgcagcatatggacgttcccaagctagg
    agtctaatcagagctgtagccaccagcctacgtcata
    gccatagcaacgtcagatccaagccgtgtctgtgacc
    tacaagcacagctcatggcaacaccggatccttaacc
    actgaacgaggccaggggatcgaacacacatcctcat
    ggatcctagtcatgttcattaaccactgagtcatgat
    gggaactccaacttcaactattttaatgtctgtaaaa
    cattccatttggaaaccatttcatttgtaaagcaaaa
    tgaaaacattttgttcattttcaacagagttcgtagc
    tgacttctgttctggaaaaaaggaaatggagcaaatt
    tgagtgagaaagattcaaagataacttttcttttaaa
    aaaaattatatcttggaaacttctgggctattgattc
    tgaagactatttttctatatactgttttgatagcaaa
    gttcataaatgtgaaaggatcctgcgatgaatcttgg
    gaagcagtcatagcccaatatatctttgttgctttta
    aaatgagatttagtttactaaatatttttctgatcat
    aaaaataacacagatctaccgcagaaaatttggaaaa
    aaaaaaacttttaaattcaaaaaacagttaaaccaca
    aatgatcccaccatccagagagcaatttgtactttgg
    tgtctagttcatctttctttttctgtttacaagcaca
    tataccacaagcattttttcaaaaaatgaaaatggga
    taatactatacatacgtctgtacacctgcatagttac
    tgaacagtctttgatctaccctgtaagtttctaactt
    ttcattatttgaaatgatgttttggcaaagaaatatg
    taggtgtgtctcgcacactttcataatgatttcttag
    gataaatttcttaggataaattcataatgatttctta
    taataatccatactctgccaactgatcttcagggaag
    ccaactcgccttctcagaaataacatataacccattt
    acttgccctctcaccaatactaggtcctaatgttttt
    gtgtacagattctatatttttacatacaagaattcct
    taaagcaaggcatgtcacagaaaaatagaaggaagac
    acaattgtcatgtttaaggactgcattctgtaccaaa
    aatgctaagttaaatgaacatctgaaacagtacagaa
    acgctatctttcagggaaagctgagtaccaggtactg
    aacagattttggcaaatacagcaggcatggatgtttc
    caaaacatgtttttctactttatctcttacagGTTTT
    GGAATCATTTTCAAATAAAACTCCCCCTCACACCACC
    TGACTGGAAGTCCTTCCTGATGTGCTCTGGGTAGAGA
    GGACCTGAGCTGTCCCAGgtaaagcatcctgcaggtc
    tgggagacactcttattctccagcccatcacactgtg
    tttggcatcagaattaagcaggcactatgcctatcag
    aaaacctgacttttgggggaatgaaagaagctaacat
    tacaagaatgtctgtgtttaaaaataagtcaataagg
    gagttcccatcgtggctcagtggtaacgaaccctact
    agtatccattgaggacacaggttcaatatctggcctc
    actcagtcggctaaggatccagtgatgccgtgagctg
    cagtgtaggccacagacgtggctcagatctggtgctg
    ctgtggctatggtgtaggccggccccctgtaactcca
    attcgacccctaggctgggaacctaaaaagaccccaa
    aaaagtcgctttaatgaatagtgaatacatccagccc
    aaagtccacagactctttggtctggttgtggcaaaca
    tacagccagacaaacaagacaaaaattatcctaggtg
    gtcagtgggggttcagagctgaatcctgaacactgga
    aggaaaacagcaaccaaatccaaatactgtatggttt
    tgcttatatgtagaatctaaattcaaagcaaatgagc
    aaaccaattgaaacagttatggaagacaagcaggtgg
    ttgtcaggggggagataaggggaggcaggaaagacct
    gggcgagggagattaagaggtaccaactttcagttgc
    aaaacaaatgagtcaccagtatgaaatgtgcaatgtg
    ggaaatacaggccataactttataatctctttttttt
    ttttgtcttttttgccttttctaaggctgctcccgtg
    gcatatggaggttcccaggctaggagtccaaacagag
    ctgtagctgccagcctacaccagagccacagcaacac
    gggaaccttaacccgctgagcaaggccagggatcgaa
    cccgagtcctcacagatgccagtagggttcattacca
    ctgagccacgacaggaattccagggtctgttgtgttc
    ttaaaacacttccaggagagtgagtggtatgtcataa
    gtaaacaataaatgttaaccacaacaagcttatgaaa
    taaacaggaaagccatatgacctacaatcagtcattg
    ggagaatccacaaaaggttgagcagaggatcaattcc
    agctcacactccagttttagattctcccctgccttaa
    agcatcacagactacataatctgagctgaagaataaa
    aattaaaactcaccccagtgcaaaacagaaatgaaaa
    agtattaaaacgaggttcatactgttgttcattagca
    atatcttttattcacagGGGTGCCCAACAACATGAAA
    AAATCAAGAATTTATTGCTGCTACGTCAAAGCTTATA
    CCAGAGATTATGCCTTATAGACATTAGCAATGGATAA
    TTATATGTTGCACTTGTGAAATGTGCACATATCCTGT
    TTATGAATCACCACATAGCCAGATTATCAATATTTTA
    CTTATTTCGTAAAAAATCCACAATTTTCCATAACAGA
    ATCAACGTGTGCAATAGGAACAAGATTGCTATGGAAA
    ACGAGGGTAACAGGAGGAGATATTAATCCAAGCATAG
    AAGAAATAGACAAATGAGGGGCCATAAGGGGAATATA
    GGG
  • TABLE 11
    Contiguous 5′ Genomic Sequence of the Porcine
    CMP-Neu5Ac Hydroxylase Gene
    ctgccagcctaagccacagccacagcaacgctgggtc Seq ID No. 47
    tgagccatgtctgcagcctatgccagagctccccgca
    gcgccggatgcttaacccactgagcaaggccagggat
    tgaaccctcgtcctcatggatagcagttgagttgttt
    ccacggaactcttaggggaactcctgattatttttta
    tttaaatttatatttctctgactttttcgtgtgctca
    tcagccactgactgtgtatctccattagtcatggttt
    gttaactctgtcattcaaaccctcttcatccttgcta
    cgcagataacatcattataataaaatcgtgcctgaag
    accagtgacgcccccaagctaagttactgcttcccct
    ggggggaaaaagaagcaccgcgcgggcgctgacacga
    agtccgggcagaggaagacggggcagaggaagacggg
    ggagcagtgggagcagcgggcagggcgcgggaagcac
    tggggatgttccgcgttggcaggagggtgttgggcga
    gctcccggtgatgcaggggggaggagccttttccgaa
    gtagcgggacaagagccacgggaaggaactgttctga
    gttcccagtCCCGACGTCCTGGCAGCGCCCAGGCACT
    GTTATTGGTGCCTCCTGTGTCCACGCGCTTCCCGGCC
    AGGCAGCCCTGGCGGATCCTATTTTCTGTTCCCCCGA
    TTCTGGTACCTCTCCCTCCCGCCCTCGGTGCGCAGCC
    GTCCTCCTGCAGTGCCTGCTCCTCCAGGGGCGAAACC
    GATCAGGGATCAGGCCACCCGCCTCCTGAACATCCCT
    CCTTAGTTCCCACAGgtgagaaggcttcgccgctgct
    gccgctggcgccggcagcgccctccacgcacttcgta
    gtgggcgcgcgccctcctgcattgtttctaaaagatt
    tttttttatccgcttatgctatcagttactgaggaag
    tatttacaaatctactattattttgaatttgcctttt
    tctccttatagtttatcagtatctcttgagactgtta
    ttggtgcctgcaaatttaaaatgattggggttttatg
    aggaagtgaaccttttatctttatgaaacgcctaact
    gaggcaatgttaattgcttaaaatactttctttatta
    tcagtgtggccatgccagtgtcctcttggttagaatt
    tgcctgat
  • TABLE 12
    Contiguous 3′ Genomic Sequence of the Porcine
    CMP-Neu5Ac Hydroxylase Gene
    ctgccaaagctgggagatgggggaaagtagagtgggt Seq ID No. 48
    tattgaaactgaatatagagttcagcatctaaaagcg
    aggtagtagaggaggaagctgtgtcaacggaaatact
    gagctgggttcacatcctctttctccacacagTCTAA
    TGCCTTGTGGAAGCAAATGAGCCACAGAAGCTGAAGG
    AAAAACCACCATTCTTTCTTAATACCTGGAGAGAGGC
    AACGACAGACTATGAGCAGgcaagtgagagggggctt
    tagctgtcaGggaaggcggagataaacccttgatggg
    taggatggccattgaaaggaggggagaaatttgcccc
    agcaggtagccaccaagcttggggacttggagggagg
    gctttcaaacgtattttcataaaaaagacctgtggag
    ctgtcaatgctcagggattctctcttaaaatctaaca
    gtattaatctgctaaaacatttgccttttcatagCAT
    CGAACAAACGACGGAGATCCTGTTGTGCCTCTCACCT
    GCCGAAGCTGCCAATCTCAAGGAAGGAATCAATTTTG
    TTCGAAATAAGAGCACTGGCAAGGATTACATCTTATT
    TAAGAATAAGAGCCGCCTGAAGGCATGTAAGAACATG
    TGCAAGCACCAAGGAGGCCTCTTCATTAAAGACATTG
    AGGATCTAAATGGAAGgtactgagaatcctttgcttt
    ctccctggcgatcctttctcccaattaggtttggcag
    gaaatgtgctcattgagaaattttaaatgatccaatc
    aacatgctatttcccccagcacatgcctaactttttc
    ttaagctcctttacggcagctctctgattttgattta
    tgaccttgacttaatttcccatcctctctgaagaact
    attgtttaaaatgtattcctagttgataaacagtgaa
    acttctaaggcacatgtgtgtgtgtgtgtgtgtgtgt
    gtgtgtttaccagcttttatattcaaagactcaagcc
    tcttttggatttcctttcctgctctctcagaagtgtg
    tgtgtgaggtgagtgcttgtccaaacactgccctaga
    acagagagactttccctgatgaaaacccgaaaaatgg
    cagagctctagctgcacctggcctcaacagcggctct
    tctgatcatttcttggaagaacgagtgctggtacccc
    ttttccccagccccttgattaaacctgcatatcgctt
    gcctccccatctcaggagcaattctaggagggagggt
    gggctttcttttcaggattgacaaagctacccagctt
    gcaaaccagggggatctggggggggggtttgcacctg
    atgctcccccactgataatgaatgagggattgacccc
    atcttttcaagctttgcttcagcctaacttgactctc
    gtagtgtttcagccgtttccatattaggctcttccac
    cgtgtcgtgtcgtcaatcttatttctcaggtcatctg
    tgggcagtttagtgcgaatggactcagaggtaactgg
    tagctgtccaagagctccctgctctaactgtatagaa
    gatcaccacccaagtctggaatcttcttacactggcc
    cacagacttgcatcactgcatacttagcttcagggcc
    cagctcccaggttaagtgctgtcatacctgtagcttg
    cttggctctgcagatagggttgctagattaggcaaat
    agagggtgcccagtcaaatttgcatttcagataaaca
    acgaatatatttttagttagatatgtttcaggcactg
    catgggacatacttttggtaggcagcctactctggaa
    gaacctcttggttgtttgctgacagactgcttttgag
    tcccttgcatcttctgggtggtttcaagttagggaga
    cctcagccataggttgttctgtcaccaagaagcttct
    gcaagcacgtgcaggccttgaggtcttccgacttgtg
    gcccggggactctgctttttctctgtccttttttctc
    cttagtgggccatgtcctgtggtgttgtcttagccag
    ttgtttaagggagtgttgcagctttatgattaagagc
    atggtctttccttgcaaactgcttggtttagaagcct
    ggctccaccacttagcggctctgtgacctcggacaca
    tttcttagcctttctgggcctcgctcttcttcctcat
    aaagtgaaaatgaaagtagacaaagccttctctgtct
    ggctactgagaggatggagtgatttcatacacataaa
    gcacttaaaataatgtctggcatatgatacatgctca
    ataaatgtcacttacatttgctattattattactctg
    ccatgatcgtagcttaagaacagaggtctttacagga
    attcaggctgttcttgaatctggcttgctcagcttaa
    tatggtaattgctttgccacagactggtcttcctctc
    cttcacccaaagccttagggggtgaacgatcccagtt
    tcaacctattctgttggcaggctaacatggagatggc
    accatcttagctctgctgcaggtggggagccagattc
    acccagctttgctcccagatacagctccccaagcatt
    tatatgctgaaactccatcccaagagcagtctacatg
    gtacactcccccatccatctctccaaatttggctgct
    tctacttaggctctctgtgcagcaattcacctgaaat
    atctcttccacgatacagtcaagggcagtgacctacc
    tgttccaccttcccttcctcagccatttttcttcttt
    gtacataatcaagatcaggaactctcataagctgtgg
    tcctcattttgtcaatctaatttcacagcctcttggc
    acatgaagctgtcctctctctcctttctgcctactgc
    ccatgagcagttgtgacactgccacatttctccttta
    acgacccagcctgctgaatagctgcatttggaatgtt
    ttcaatttttgttaatttatttatttcatcttttttt
    tttttttttttttttttttttttagggccgcacccat
    gggatatggaggttcccaggctagggatccaatggga
    gctgtagctgctggcctacaccacagccacagcaatg
    cacaattcgagccaatctttgacctacaccagagctc
    acggcaacactggattcttaacccactgattgaggcc
    agggatcaaactctcgtcctcatagatacgagtcaga
    tcgttaacctctgagccatgatagttgttagttactc
    attgatgagaaaggaagtgtcacaaaatatcctccat
    aagtcgaagtttgaatatgttttctgccttgttacta
    gaaaagagcattaaaaattcttgattggaatgaagct
    tggaaaaaatcagcatagtttactgatatataagtga
    aaatagaccttgttagtttaaaccatctgatatttct
    ggtggaagacatatttgtctgtaaaaaaaaaaaatct
    tgaacctgtttaaaaaaaaaacttgactggaaacact
    accaaaatatgggagttcctactgggacacagcagaa
    atgaatctaactagtatccatgaggacacaggtttga
    tgcctggcctcgctaagtgggttaaggatatggtgtt
    gctgcagctccaattcaacccctatcctgggaacccc
    catatgccaccctaaaaagcaaaaagaaaggtgctgc
    cctaaaaagcaaaaagaaagaaagaaagacagccaga
    cagactaccaaatatggagaggaaatggaacttttag
    gccctatctccaactatcacatccctatcaccgtctg
    gtaagaaatggaaaaaatattactaagcctcctttgt
    tgctacaattaatctgattctcattctgaagcagtgt
    tgccagagttaacaaataaaaatgcaaagctgggtag
    ttaaatttgaattacagataaacaaattcagtatatg
    ttcaatatcgtgtaagacgttttaaaataatttttat
    ttatctgaaatttatatttttcctgtattttatctgg
    caaccatgatcagaaatctttaaacaatcaggaagtc
    ttttttcttagacaaatgaaaatttgagttgatctta
    ggtttagtacactatactaggggccaagggttatagt
    gtgactattaaatcacagataatctttattactacat
    tatttccttatactggccccacttggatcttacccag
    cttagcttttgtatgagagtcatccttaaagatgact
    ttattctttaaaaaaaaaaacaaattttaagggctgc
    acccatagcatatagaagttcctaggctagcggtcaa
    attagagctgcagctgccagcctatgccacagccaca
    gcaatgccagatctgagctgcatctgtgacctacact
    gcagcttgcagcaatgctggatccttaacccattgaa
    caatgccagggattgaacacacatcctcatggatact
    gctcaggttcctaacctgctgagccacagttggaact
    ccaaagcagactttattctgatggctctgctgatctc
    taacacgttattttgtgccatggtgtttatcttcact
    ttactcaagtcagggaaacacgaagagtctcatacag
    gataaacccaaggagaaatgtgcaaagtcacatacaa
    atcaaactgacaaaaatcaaatacaaggaaaaaatat
    cttcactttcaaaatcacctactgatgatgagtttat
    atttccttggatatttgaatattagctatttttttcc
    tttcatgagttttgtgttcaaccaactacagtcgttt
    actttgatcacagaataatgcatttaagccttaaata
    gattaatatttattttcaccatttcataaacctaagt
    acaatttccatccagGTCTGTTAAATGCACAAAACAC
    AACTGGAAGTTAGATGTAAGCAGCATGAAGTATATCA
    ATCCTCCTGGAAGCTTCTGTCAAGACGAACTGGgtaa
    ataccatcaatactgatcaatgttttctgctgttact
    gtcattggggtccctcttgtcaacttgtttccaatct
    cattagaagccttggatgcattctgattttaaactga
    ggtattttaaaagtaaccatcactgaaaattctaggc
    aagttttctctaaaaaatcccttcattcattcatttg
    ttcagtaagtatttgatgagaccttaccatgtgtaaa
    cattgcactaggtattaagaaatacaaagatggataa
    gatagagtcggcgtaaatgagatgatataatgagacg
    ttataatgaaactcacaattccagttgggaaataaag
    tccttcaaattccatgactctttctggcacacgttag
    aggctacagcttctgtgtgattctcatgctggctcca
    cttccactttttccttcttcctactcaagaaagccta
    tagaaatatgagtaagaagggcttaatcataggaata
    aatttgtctctgttctaagtgattaaaaatgtcttta
    tcagtataaaaagttacttgggaagattcttaaaact
    gcttttacacactgttctagaatgactgttatataaa
    taaaaaagtagatttgatctaacacaattaaatgacc
    tttggaaatattgactaattctcaccttgcccctcaa
    agggatgcctgaaccatttccttcttttgccagaaag
    cccccaccctttgtctgttgacctagcctaggaaatc
    ttcagatcacgttgttagcacgaactggttacatgtg
    ctgtacaaatactatttaattcatctgattaaaaaaa
    aagagataagaagcaaaagtttgactatcttaaactg
    tttgcgtaggtgagaggacaattgaccatctacttta
    tgagtatgtaacccagaaacttaaagctccttaaggg
    agctaagtcttttggataagacctatagtgagacctt
    ttagcaaaatggttaagactgaatggagctcactagc
    gtgggttcatatcctgatgctcaaacacgcaattaaa
    tgactttaggtgggttagtctctgttccttagtttcc
    tcaatgggagataatattggtagtagcgattttactg
    ggttgttgaaagaacatctgttaaatgttcagaacgt
    gttacgacagagtacagagtaatgatttgcttgtata
    tgtatgactcaaatagtctgccatatgccttgtgact
    gggtcctgtggagcaggaaggagggatttcccaccca
    gcagaaagttgggtaaactggaaaatagactgaggcc
    aggaaatgatgcaaagcgttgatgttcactgccacgg
    caggtgaagggcagggccagagttgtcagtagggtca
    ggggaggactggaaataaccaagacccactgcacttt
    tcagcctttgctccagtaaggtaatgtgtgagagtag
    aaaattttgttaacagaacccacttttcagtacagtg
    ctaccaatactgtagtgatttcataccacatcccaag
    aaagaaaaagatggctcaatcccatgtgagctgagat
    tatttggttttattgttaaataaatagcattgtgtgg
    tcatcattaaaaaaggtagatgttaggaaagtagaag
    gaagaagactctcacctacattttcatcactgttttg
    gtatctgccagttgtcaccttggtccccttccccgcc
    tctcccctgcctcctcttcctccttctcctttttttg
    gaatacaattcaggtaccataaaatttacccttttag
    agtgtttgactcaatggtttttagtattttcacatgt
    tgtgctattactatcactatataattccaggtcattc
    acatcaccccccaaagaaaccttctaactattagcag
    tccattcccttcttccctcagcccctggcaaccacta
    atctacttactgtctccatggatgttcctatattgaa
    tcaagctagcataaaccccacttgctcatggtcataa
    ttcttttttatagtgctaaattacatttgctaatatt
    caattaaggatttctatgtccatattcataaggaata
    ttggtgtgtagttttctctttgtgatatcttgtctgg
    ttgggggatcagagtaataattactgctctcatagaa
    tgaattgagaagtgttccctccttttctatttattgg
    aagagtttgtgaagtatattggtattgattcttcttt
    aaacatttggtcagattcaccagtgaagccatctggg
    ccatggctaatctttgtgaaaagttttttgattacta
    attaaatctctttaatttgttatgggtctgctcctca
    gacgttctagttcttcttgagtcagttttgttcattt
    gtttcttcctaggactttctccctttcatttggatta
    tttagattgatagtaatatcccccttttaattcctgg
    ctgtagtaatttgggtcttttctcttttttcttggtc
    agtttagctaaaggtttgtaattgtattaatcttttc
    aaataactaacttttttgttttgtttgttttttgttt
    tttgttttttgttttttgtttttttttgctttttaag
    gctgcacctgaggcatatggaagttctcaggctagag
    gtctaatcggagctacagctgctggcctataccacaa
    ccatagcaatgccagattcaagctgcatctgcgacct
    acaccacaactcggccagggatcacacccgcaacctc
    atggttcctagtcggatttgttaaccactgtgccacg
    acgggaactcccgcccattttttttaacacctcatac
    tttaacataaagatgggcttcacatggactgatagct
    caaatgaggaaggtaagactatgaaagtaatggaaga
    aatgtagactatttttgtgacctagagattactgata
    cttcttgacttttcaaacaatacttcaaaagtacagc
    ccaaagggaaaaaagaaagaaaaaagaaacacacata
    tacacaaacctagtgaataagatatcatcgatacact
    acagatttctatgaactggaagaccccatggacaaag
    ttaaagaacatatgatagtttgagtgattattttgca
    atatttacaaccaatgagggaatattatccagcttat
    aggaggaagtaatgcaaatcgacaagaaaaagatagg
    aaacccaatataaaaattaagaaaatacaaaaattaa
    gaaaggatatgaactagcattttacaaaagaaaaatc
    tccaaaagtcaatcagcacatgaaaatatgctcaaac
    ctattaattattagaaaactacagactgaagcaatga
    ggtgctttactttacatctttttgactgataaaaagt
    tagaaacaaaggtgatatcaaatgtcagggataaaag
    gatatagaaatcgtcatgcctgtggtgggagtatggc
    cggtgcagtcatgtgggaaggtaatctgacagtggtt
    aggcagagcaggtttatgaatacactgtggcccatca
    atcccacgcctgtttatgtaccaaagaaatcctgttg
    tggcagaatctatgggtccacccctgggagcatgaat
    taataaaatgtggcaccagggtgtgtgaaactccagc
    tagagatgagatgtccacatggcaacatgaatgcatc
    ttagaacatagatttgagtgaaaaagagtaagaaaca
    gccgggaaacccaataccatttataaaaattaaagat
    gcacacatacaatgtagtaaatattttgcatgaactt
    tcaaatggttgcctacagggggggagagtaaagaaga
    gtagaaaacaaagataagggagtaagtaagtagctct
    gcctggactgaatataatgtgtcatgaactgagaaat
    atggttaacataatcctcttaacttgaggtcctaaat
    gaatgaatgagtccactattcatttacccattcttta
    atgtgtattgcattataatccatttttttagaaccaa
    cgaattttgttcccataactactaatcagcctgcctt
    ttctccctcattcccttatcagctcaggggcattcct
    agtttttcaaacgttcctcatttgaaccaaaaatagc
    atcattgtttaaattatacttgttttcaaatacgatg
    cttatatattccaagtgtgtttgcccattttcttagg
    tggtagaaatttttcattctacttttctatctactca
    gattttcccgttggaattatttccattgctattaaac
    ttagaagtcccccctgtgatatgccatttttttcata
    ctttttaagcacttggttgcttttctttgtgtcttta
    agcacctagaatacttataaccattgcacagcactgt
    gtatcaggcagcccttcctcttccactaatttatggt
    ccttctcttagactatattaaactgttatttaattag
    gatcctctcttcgtccttatgatttaattattatagt
    tttctaatatgtttttattataattcctcttcattat
    tcctccctattaaaaattttaatgaattccatttgtt
    tgttcttctagttaaatattaagtcataatccaaata
    acttagatgtcattagtttatgtggtcaaagtaagga
    taccacatctttatagatgcaggcagttggcagatgt
    catgattttcttcagtgcataaatgcaatatctttga
    gcaaggggcataaaaacttttatggtattggctttga
    aataatagttaagaactgcagactcagtttttcctgc
    ttttcttgaaaaagaacacttctaaagaaggaaaatc
    cttaagcatggatatcgatgtaattttctgaaagtct
    cctgtaattccttgggatttttgttgttgtttgttgg
    tcggtttttttgggtttttgtttgtttgttttgtttt
    gttttgttttgcttttagggctgcacctgtggcatat
    ggaagttcccaggctaggggtccaactggagctacag
    ctgccagcctactccacagccacagcaacatgggatc
    ctagctgcatctgtgacctaaccacagctcttggtaa
    tgccagattgttaacccactgagcaatgccagagatc
    gaatctgcctcctcatggacactagtcagattagttt
    ctgctgagccacaatgggaattcccaattccttgtat
    ttttgaactggttatgtgctagcatataattttgttt
    cttgaatctttgtgggttttttttttttttttttttt
    gtctcttgtctttttaaggctgcacccacagcatatg
    gaggttcccaggctagaggtcaaattggagctacagc
    tgccagcctacacaacaactgcagcaaagtggggccc
    aacttatatgacagttcgtggcaatgccggattccta
    acccactgagcagggccagggatcgaacctgagtttc
    cagtcagtttcgttaaccactgagccatgatagtaac
    tcctgtttgttcagtcttgaacctcctttttaattct
    ttattccttgagggtgaaataattgccataataatac
    tatcatttattacatgccttctctgtgctaggcatag
    tgacactttaggatttattatatcacttaatccctac
    aacaactctgcaaagtatgtatcataatcctatttga
    cagatcaggaaattgcagcccaggatgcagataatat
    gcatccatcacaagtgactagatatagtccctctgct
    attcagcagggtctcattgcctttccattccaaatgc
    aatagtttgcatctattgtatatgtgttttggggttt
    ttttgtctttttttttttttttgtcttttctggggcc
    tcacccttggcataggtaggttcccaggctaggggtc
    aaattgaagctgcagctgccagcctacaccacagcca
    cagcaactcgggatctgagcctcatctgcaacctaca
    ccaaagctcacggcaacaccggatccttaacccactg
    agtgaggccagagatcaaaccggcaacctcatggttc
    ctagtcggattcattaaccactgagccacgatgggaa
    ctccctaaatgcaatagtttgctctattaaccccaaa
    ctcccagtccatcccactccctcctcctccctcttgg
    caaccacaagtctgttctccatgtccatgattttctt
    ttctggggaaagtttcatttgtgccatttttcatttt
    acgggtaatttttacttcagtttcttccactagcagt
    tgtcttaaagtgagtataattaatattcatttggaaa
    atgtaagcaaaacattttttaaagggccatgcccaca
    gcatatgaaagtttctgggccaggggttgaatccagg
    ctccaagttgcagctgtgccctacactgcagctgggc
    aatgctggatcctttaacccactgtgcccggctaggg
    atcaaacctgcatttccacagctacccgagccattgc
    agttggattcttaacccactgcactacagtgggaact
    cccacaaaacattttttaatgtcctttgaataaagta
    ggaaagtgctcgtctttgagggcagggcggcaatgcc
    atttccacaaggtttgctttggcttgggacctcatct
    gctgtcatttagtaatgaataaaattgctgacagtaa
    taggattaactgtgtgtggagatagccagggttagag
    ataaaaacactggagaagtcaaataagttgctcgagg
    tcctctagctaataagctattaagtgggagagtgagg
    gctagaaacaggccatctgtctcccaagcacatgtcc
    attagtggtttgctgatagccttccagaacaacagag
    aggactctcaaacatggtcttgcctccctccaattga
    tcccctccatgtgcctcacagcgggtctttctaaaat
    taagttctgattttaattctcccttgctatagcactt
    aggtatggctttcagccgtgcaataaaaagcaggcaa
    gagtggctcaatcatataggaggttgtttttcttaga
    tcccaagcaggtaatcctgggcattatggttgttctg
    cgtttatcaaggagccaaattctctatcacctcctgt
    tctatcctcctcagtatctggctctattcttcagcat
    ctcaagatggcttgtgctcctccaagcatggcagtca
    aattccacacaagagggggaaatatgaagggcagaca
    gtgctggtctcctgagctgtccctccggggaaataaa
    tgtattccttcaagtcccgtgagacttctgaagtaga
    cgtctgcttacgtctcacccaccagaactatgtaaac
    tgcacatagtgctaggtctacatagccactcataact
    gccagggggtgggaaatctttaaataggtgtaccacc
    acacaattaggatgctaatagtaagggagaaggagag
    aataggttttgcgcaagccaccagcatgcctgccaca
    attgcttaaaattcttcattgacccctcattgccaca
    ggatgaaatccaaacgccttcttagttgggaatctga
    cctacctgtctctcccacctggttcagacaccattct
    ccttggtcataaaattccagtcatttgtgaacatcca
    gctcccccatgcctccatgcctttgcacatgctgttc
    ttttatcttttatgttgtccttttatcttttatccaa
    aagagatatcccatcatcacatctcttttgtcagccc
    ccaaatactttgtctttcaagttcagctggaggatta
    cctcctatttgaaatcagctttgtctcttacaaccaa
    acaaggttttccttccgagacactcccacagcacctt
    gaactcatctctatcaatcattcatttgattgtaatg
    aagttgttggtggtatgcctgtgtctctgacacatct
    gcgatctcatgagttccttaagtggaatgtgaatagc
    gggatgaacagtattggtcttcagccctcatctctgc
    agatgttgcttgacccaaatgagcgttgccttttatt
    ttgattttgctttgatttgtctactccatgtacttga
    gccatgcatttctgtcttagcgatgctttttaaaagt
    cattttttggttgattatccagatttgtccacctttg
    cttctagTTGTAGAAAAGGATGAAGAAAATGGAGTTT
    TGCTTCTAGAACTAAATCCTCCTAACCCGTGGGATTC
    AGAACCCAGATCTCCTGAAGATTTGGCATTTGGGGAA
    GTGCAGgtaaggaaatgttaaattgcaatattcttaa
    aaacacaaataaagctaacatatcaatttatatatat
    atatatatatatatttttttttttttttacatcttat
    attaccttgagtattcttggaagtggctagttaggac
    atataataaagttattctgaagtctttttttttcttt
    ttccatggtgagcagtggcttgatgtggatctcagct
    cccagacgaggcactgaacctgagccgcagtggtgaa
    agcaccaagttctagccactagaccaccagggaactc
    cctattctaaattcttgagcacattatttaggaacct
    caggaacttggcaggattacaggaaatatatctagat
    ttaaaaaaaaatcttttaacagaggtcccaaaggaga
    gtcatgcacagctatgggaggaagttcagaaactgcc
    cttgctaccagatcactgtcagataaaatggccagct
    acatgtttctgcacattgccctaagatctttacaaac
    ttttctgtgcatttttccacttttaaaagaaaatttc
    ggggttcctgttgttgctcagtggttaacgaacccaa
    ctagtatccatggggacaggggttcgagccctggcct
    cactcagtgggttaagaatctggcattgctgtggctg
    tggcgtaggctggcggctacagctcagattggacccc
    tagcctgagaacctccatatgccgcaggtatggccct
    aaaaaaaaaaaaaaagagagagagagaatttcctcca
    gaaaaaacactttggtagtttgggagaagtaaacaac
    caaaaattaatttttctggagtattcgggaagcttgt
    aaaaatgggctcttacttttttgaggagacaaatggg
    aacctacccagaagaggcacaatcacctgcatttgat
    ttcttgacctctccctaccttctttgctggctttcca
    catttggatttctgtgaccttatctctgctccttggt
    gttttcatttttcctgtggacgtgccagactatggga
    agggagtaaggcgttgatttagaatcctgtagtctct
    gcctgtctctagtcattgttttcacccttctcaaagg
    accttgacatcctgagtgagtccgcaagtaatttagg
    ggagaagccttagaagccagtgcagccaggctacatg
    actgtgtccacccactggaaccagtcatttttatacc
    tattcacagcccccctaccatttaaatccccagaggt
    ctgccataacatctgtaactccctttcctggtaaatt
    gtgttctaaaagactggtaacaaaagatattctgtgg
    tacagagcataattaaatacctgggagctgatttgag
    tggggtaaatcaactggtttgacccctaaaacccacc
    atgagcatttctgttcaataaagtaatgcccgtgctg
    ggaattgtgttctacggaaatgctcctgctgtgtctt
    tcttgagtcctgtgtcattgaacatgcttaggagcaa
    aggtcccccatgtggcttgtctgctaaccagcccagt
    tccttgttctggctggtaatgatccgatcatctgaat
    ctcactgtcttccaacagATCACGTACCTTACTCACG
    CCTGCATGGACCTCAAGCTGGGGGACAAGAGAATGGT
    GTTCGACCCTTGGTTATCGGTCCTGCTTTTGCGCGAG
    GATGGTGGTTACTACACGAGCCTCCATCTGATTGGCT
    GGAGAGGCTGAGCCGCGCAGACTTAATTTACATCAGT
    CACATGCACTCAGACCACCTGAGgtaaggaagggtga
    gccctcaactccgaagaaaatgctgcaataaaagcac
    tgttggttttcagctttttttgtaatcactgctcatt
    ctgaggtagattcgcttgggctgataaaaagagaact
    aattcagataaatgcttgcatttgcatagcctctttt
    tttaaaaactttttttttttttttttttttttggctt
    ttcagggctgaacctgtggcatatggaggttcccagg
    ctaggggtcgaatcagagctgtagccccgggcctatg
    ccactgccatagcaacatgcatagcctcctttttaaa
    gtgccttcctgttttataccattgggatgtgagaaga
    gctattgtggaaangagcatggggtnataaccctgga
    cctctcacgtcctaccctcaggntagtgggaaaactc
    tgagtttaaggacatcaaagtgactcctttttagtta
    cattatggnggaatcagcncatatttttacaaggggc
    ggagngtaanctgttggagtttacaagacatatggtg
    gcattgcaactacttaaccctactattatagcacaaa
    agcagccatagtcggtcctgaaggagcctgatgcctt
    cagctttataggcaatgacgtgtgaatatcacaaaca
    gtttcctgtgtcaccaaacatgattgccttttgattt
    ccctttcaaccctttaaaaaaaggtaaaagcccttct
    tagcattcagcagcaggtcgctgtgttttgccaactc
    ctgatctgtagcatttcgacaacactgagctctcaac
    ttttgaaccctgagtccaccacatccttcagtgaaac
    cagagccatgtgatactaaggatagaaacggaaactt
    cctgaatccaggcgatcaaataggagggagaaagagg
    aactttcattgacaaaaccacaaatattgtgaatgga
    ctgttacaaatattgtgaatgctcctattcccaaccc
    cctggcttcattacagggtcctatgtgttcatcctta
    ttgagaaatttgtattgctactgccaggttgccaata
    cccagcggtgcccatggtgttctaaaatgaagcaatt
    tcaactttatttttttttcctgtgactttacatgaca
    agttcacatgaaggatatactttgatagtaatgtcca
    tggttagggaatatacattgtttgctggttgactggc
    ccctggatttttctattgaaagtccatgagatctcga
    aggcacaggtgtgttctctcgctttttaaggaaaggg
    tttaaaaacttaagtaattaacagctttagtaacaaa
    ttacctataacacacttaaaaaccgaataccacccac
    tggagtattgtgctacgattaaaaatctacttgtcta
    ctacatgatatctttgtcccacagaaggttctggaac
    caaacttgtaatttcaggattatgagagccctgagtt
    cacgcattgtgtaataactatgttgtgtggtagtcaa
    tttgtacagcttgcttagagagaacaatgtcaagtta
    aggaggcgattgctttatagtgcctgtcacaagatgc
    cattgccattgtcctagcaagagatattctatgggag
    tatactacattttagtgaggataagaactttttatgg
    catttagtccggtcatttcccaaccactgtcctgaaa
    accaatttcattttgatttcaggggcttgtgtgggca
    aagttgccaggcattaaaaagccacttctcaactgta
    gtatcacaatgctttagttgggtagtgtattgcagat
    agcttatggctgaaaagttaccaagccttgcagtttt
    cactcctttgagtttatttccttgacagaattgaccc
    tgagttttttgactcttacctgctcaactaataaaca
    ccagagtcatttatctccattgctcttgtctgacctt
    tatttaccgaataatgccttatgggttcacaaaaaca
    aggggggagggggccagcatgccttagaaactgtctt
    tagtcaagaaatgngattttattatgtaaatatatga
    gtattataatagatagtgttattaatagacaccagca
    agaattgtcaataatttaaaaatcacaaattaaaata
    catccatgttagnatcatttatcctaactcccaaagc
    cctttaaagtggaagatttagatgttaacccagagat
    taaagacatgttcaaagaatccttgatttttttttga
    atcccttgtttttagagaagaaaacctaatgattttc
    cccctctggattctacatattaaatatagttttggaa
    cttgaatattagtatggttaataagtgctgatatgct
    gattttgtttatatttttcttatgagtaaatatccta
    tatcaccagacattatagtctatgtacaaatatgatt
    cttaaacctgatagcacattcattagagttggaattg
    ccttttttttttttttttttacagttgcacctgcaac
    atatgaaagttcccaggctaggggttgaatccaagct
    gcagctgccaccctacattacagccgtagtaacagca
    gatccgagctgcatctgcaacctatgctgcagctcag
    ggcaatgccagatccactgagtgaagccagggatgga
    acttgcatcctcatagagacaacgtcgtgtccttaac
    ccactgagccagaacaggaactccagaatttcctttc
    aatagaagaagcaccaagtttaggatcagaaagcctg
    aatttgaataccaatttactattttagtcatatattt
    ctgagtgtgntcctcatttattaaaagcagactaaaa
    gatgagagggtcttttgttgagaatcaaatacaataa
    catgtgaaagtgtgtaacactatgattgaaatatacc
    tacacagccatttatttgtttattgttcatgttttgc
    cacccacacagtagtatataatccttttatgtaataa
    atgctaataatgaaagttggcaacttatgtaagtact
    caaaatgctggaggtcatgggatactgactgggatac
    tacagaggtaatgtcatttcctctgcgctaaacttat
    tgtcttgtagttagggactgactctctttaggacaag
    gagttcattctgtataccatgtgtggctatcaccctt
    cgaagttgaaaaactgccccagggtgggcacccatcc
    gttctcttagatatatggccgagacctttctctcact
    gggagggaaccacactgaggaatgagaaaaaaaaaag
    gaaaatcaagatgaaaccagaaacctctttggcataa
    cttctccactctgtactttttgttagaactacccttg
    cacaaagcagcatcagtgtggaagacagaatttgcac
    acctggtttgatatacatgccgtggtatatgggatgt
    tctaacaataaagaggactctcccaggaaatctcctc
    actgttatagtcagccttgaggaaagagctcttcttt
    tggactctggggagagtctagtttttcagttccttgc
    ttctcggtcaacgtgttggtgtaaggatcacactctc
    tcttatactagataattctattttttcaccTTTcaac
    ctgtctatccttctgaccctagTTACCCAACACTGAA
    GAAGCTTGCTGAGAGAAGACCAGATGTTCCCATTTAT
    GTTGGCAACACGGAAAGACCTGTATTTTGGAATCTGA
    ATCAGAGTGGCGTCCAGTTGACTAATATCAATGTAGT
    GCCATTTGGAATATGGCAGCAGgtctgtgttctttcc
    acatgtttgggttatcctttctgggataaatttgagg
    cgagatagaaactttaagactaaagaaacaatggcct
    actttttttgtacatggtcctgtgtaaatctctattt
    gagctgaaataagatggtcttcctctccaattatcca
    tggtatgactctgatggataacaaatccagttctgaa
    aaaaggggatttctttccagaagagaggacagtttct
    tcaaatattgaattaaaagcaaaatagatgtaaaccg
    ttgttggttttattgttgaattccagGTAGACAAAAA
    TCTTCGATTCATGATCTTGATGGATGGCGTTCATCCT
    GAGATGGACACTTGCATTATTGTGGAATACAAAGgta
    ttttcttgccctcatcagcatgaaattgctcttggta
    gaaaggataataatagttatccaaaacatcatcctat
    gttcatctgtttcttccctcttcattttccatagagt
    acagtatattctatctctgtcttaggaaaatggactg
    tcattcatataatcttacagagaatcaattagtaatg
    tactctatgccgtgacaggtgcgaaggttttttttga
    aggcaacagataaaaatatcctatatttcacctattg
    taatttccttaaaactgacattattgaataaatgttt
    tactttcatcttgaatattattatgttatggaatcat
    acactttaccccaataatcatcgaaaagaatttccaa
    aaggttgagagagttgtgttgatctgattactttcct
    ctgcatcctttgagcttaacctttgaatatagtttgc
    taaggaaagtagtctgtttatgatcctggagtggaat
    caggctaagtgtcctcattcagaacccactgaatcag
    acagaatgaatttatttccttgaaagttcaaaatgtg
    tcactcaagagtataaattttcaaatcttactctctc
    ttttccttggatgtgagcaattcttcgataattgaat
    gaggcagattatatagacttacatggaagactgttgg
    cctgagaattcaaactatggtgttcaagacttcacng
    ngagtccgatgccatttgtttcccacagGTCATAAAA
    TACTCAATACAGTGGATTGCACCAGACCCAATGGAGG
    AAGGCTGCCTATGAAGGTTGCATTAATGATGAGTGAT
    TTTGCTGGAGGAGCTTCAGGCTTTCCAATGACTTTCA
    GTGGTGGAAAATTTACTGgtaattctttatatcaaaa
    tgatgccaaggagttggcatggcactttgctaaatgc
    tgtgtgaatcaatacaaagataattaggacatggttc
    ttcctcacaagaggtgtgcaatcttattgggaaatca
    tacttgcaagtcacaaatatagactaaagtttccagc
    tgagaatatgctgatggagcatgaaacactaaggaga
    cagggagaatctcaggaaaaatcaagaataatttgga
    tcaaatggattcctgacatagaacatagagctgatca
    gaaagagtctgacattggtaatccaggcttaagtgct
    ctttgtatgtggttcagaacagagtgtgggcagcctg
    agggggatacatacccttgacctcgtggaaagctcat
    acgggggagggatgaggctaaggaagcccctctaaag
    tgtgggattacgagaggttgggggggtggtagggaaa
    atagtggtcaaagagtataaacttccagttacaagat
    gaataaattctaggggtataataacagcatggcacta
    tagatagcatattgtactatatactggaagtgctgag
    agtagatcttacatgttctaaccacacacacacacac
    acacacacacacaccacacacacacaccacacacaca
    cacgtgcacacaaacagaaatggtaattatgtgaggt
    gatggcggtgttaactaactttattgtggtcatcatt
    tagccatacatgcatgtcatgaaatcaccatgttgta
    caccttaaagttatgtaatactagatgtcagttatat
    ctcaaagctagaaaaaatgtggggaccaaggcagaag
    ctcttctgctctgtgtctaagggtggttctggggctg
    ggatggggaggatggttaagtggtatatttttttcat
    acctttgctcagtactatcattgtaagtgttcaatat
    atgtctgcttaataaattaatgtttttagtaagtaat
    ctctgtttagtaatgtgtcagaaatgccctacttgca
    ataggaagaaaacctgtccagtcccttccttttttct
    gtaagtctgatttcattgcctcccagaatgcatcacc
    atgtgagagatagagggaaggtgctgtccttatgggg
    ttaacagtgtgactagggaggcaaaatatacctacta
    aagggtggtagcataattcagttcttatgtgagtatg
    tgtatgtgtgtgagtatgtgcacatgcacatacattt
    taaaaggtctgtaatatactaacatgttcatagtggt
    tacacctagcttataggtaacattttttcccctgtat
    ccttgtttgtgtttatcaaattttcataacagtaatg
    gtagaaggagtacctgacatggtaccatacatgctng
    gncctgcctaatttctcnatttcctttattgcccata
    cccccattgcttgacaagcataagtccatactggctt
    gttttcgttcctcagactcagtacaccatgtagctcc
    atgccctgggtctttgtatgtgctatttctactgctt
    agagtgctattgcccctgaccaccacgtggtcagcaa
    cttctcttctgcgtctgtgtctatggtctatgattcc
    agatgtcatcttcactaactacccttctaatatgccc
    ttccatcccacccgtcctcatccttaccccagccact
    ctctatttggtggctctgttttattttcttcctagct
    catcactctttgaaatgaacttatttacttattcaat
    tgcttctttcactagaatgaatgctccatgagagcag
    ggacctgctttatcttgctcgccactgtattcacagt
    gcctagaactacgtctggcacatagtaggtgctcaat
    aaatatcgatcaaatgaaagaatgagcaaacgaacaa
    atgaacaacacgtgaggtaggcatcatgattccatca
    acagaggagaaaaccagacttaaagnaatgaagtggn
    ggagctgcatttgatcttgactgactccacatccatg
    ctcttgaccactgtgcatctccagagtgtaatgaaca
    tactttacttttatattccaccaaaataacaaagcca
    tgcccatgttagtagagagttaatcgacagtgccctt
    aaaatatgcatgcacccagggtacaactatgcatgct
    gccctgtgttttcagttggatccaaatgaattgccgt
    aaacaaagaggggattcaatgtctttgactagtttgg
    gatattttcctagtaaccaactttgcaaaataaagcc
    actaatgacaaggagctttgttctacttctgcatcac
    tcaactgtcaatttttatctcttgcaagacttctaat
    ctactagaacttttgtttttctgtgatttctgaacag
    agaagactaatccaaaccctgtcattccagAGGAATG
    GAAAGCCCAATTCATTAAAACAGAAAGGAAGAAACTC
    CTGAACTACAAGGCTCGGCTGGTGAAGGACCTACAAC
    CCAGAATTTACTGCCCCTTTCCTGGGTATTTCGTGGA
    ATCCCACCCAGCAGACAAgtatggctggatattttat
    ataacgtgtttacgcataagttaatatatgctgaatg
    agtgatttagctgtgaaacaacatgaaatgagaaaga
    atgattagtaggggtctggagcttattttaacaagca
    gcctgaaaacagagagtatgaataaaaaaaattaaat
    acaagagtgtgctattaccaattatgtataatagtct
    tgtacatctaacttcaattccaatcactatatgctta
    tactaaaaaacgaagtatagagtcaaccttctttgac
    taacagctcttccctagtcagggacattagctcaagt
    atagtctttatttttcctggggtaagaaaagaaggat
    tgggaagtaggaatgcaaagaaataaaaaataattct
    gtcattgttcaaataagaatgtcatctgaaaataaac
    tgccttacatgggaatgctcttatttgtcagGTATAT
    TAAGGAAACAAACATCAAAAATGACCCAAATGAACTC
    AACAATCTTATCAAGAAGAATTCTGAGGTGGTAACCT
    GGACCCCAAGACCTGGAGCCACTCTTGATCTGGGTAG
    GATGCTAAAGGACCCAACAGACAGgtttgacttgaat
    atttacagggaacaaaaatgatttctgaattttttca
    tgtttatgagaaaataaagggcatacctatggcctct
    tggcaggtccctgtttgtaggaatattaagtttttct
    tgactagcatcctgagcttgtcatgcattaagatcta
    cacaccaccctttaaagtgggagtcttactgtataaa
    ataaactattaaataagtatctttcaactctggggtg
    gggggggagactgagttttttcacagtcctatataat
    aattttcttatcctataaaataattaggagttcccgt
    agtggctcagcaatagcaaacccgactagtatcgatg
    aggatgcgggttcgattcctggcccccctcagtgggt
    taaggatctggcattgccgtgagctgtggtgtaggtg
    gcagacacggctcagatcccacgttactgtggctgtg
    gcataggccagcagctccagctctgattagaccctta
    gcctgggaacttccatatgctgtgggtgtggccttga
    aaaaaaataaataaataagataattactcaaatgttt
    tccttgtctcagaaccttacttcaggataaagagtga
    gaaagttttttttatgaagggccattattacagctca
    aaaataagttgtcttcagcaagtagaaagcaataagc
    ctgagagttagtgttcctatcagtgtaaatattacct
    cctcgccaatccccagacagtccatttgaacaattaa
    cggtgccctgggagtacagttcagaaacattaatgtg
    gatgttccagacctgtatttttataagtacttgtctt
    gagccggatggaaccatcattcctcaccattatttag
    aagtggactgtgactctgttggagatcagggcacacg
    gttaccaaaagcacacccttctcctggccttaccttt
    gcaaagctggggtctgggacacagtcagctgattata
    cccttttactaacttcccacagctcaaatctggtcaa
    ttctccttcacaaatctcttaaaaatccatcactcac
    ctccagcctcttctgctgtggccttgattcagcctct
    cacaatttttttttaaccagaattctggcagtggccc
    ctgacttgcctctgtgctcccagccccgctgtcctct
    gatccatcctccatgccagccttcaatctgctggtca
    cgattcattgatgggttaggaaatcaatggcatcaca
    actagcatttagaaaaaggaaataggcgttcccgccg
    tggcacagcagaaataaatccgactaggaaccataag
    gttgcgggttcaacccctggccttgttcagtgggtta
    aggatccggcattgccgtgggctgttttgtaagtcac
    agacatggctctgatccggcattgctgtggctctggc
    gtaggcctgcagcatcagctccaattagacccctatc
    ctgggagcctccatatgctgcaagtgcagccctaaaa
    aaaataaaaaaataaaaaaaaataaataaaagaagta
    gacaaattgtatagaacaaccctgagtatgttgcctg
    agcacatataacaagggtaagtattatttcaggaaac
    tctggtttcacagatactcttggcatatggaccccta
    gagtcctgatgtaaaatatattcttcctgggatctta
    ggcaagaagtttgaaagctccaactctgcactgctgc
    caaagaaatgatttttaagtgcaaaactcttcccgtt
    cccttccctgtataaaattccataggatctctccagt
    gcctctaggataaaggcagttttcattctctagttca
    aggtgagagaagattttaattatttcacgttttagtg
    gggaattcaagagtctggcacctgacatttgctgaac
    tctctccattatccctctctagttccccagacgcatc
    ctatggtagaaattcgcaactagagtgagcgtcagag
    taacccaaggaaactgggtaaatgcagctccctgggc
    tctaccccctgagattctgattcagtagatctgaagc
    agagccctggaatatgcatatgcatcattgtgtcaca
    ccaagcattctgggtaatgagagttgatgttaggttc
    tcagtagtaagacaagtatagagattccgggggactg
    agtgctcagctctgccttggggaggagggagagggct
    aaagagaacaggagatggggacagggaatgctcaacc
    tccaatcttaggcatttgagctatgtcttaggggtca
    ggaggaggttaccaatatagtgattaagagattgagg
    ttccagtcagagggatatgctggagaaggggggtgaa
    aataatgtcataggtttggtgagtgcagatactttga
    gttttttaatatttttattgaaatatagttgatttac
    aatgctcttagtgagtacaattactttgaataagtgc
    atagatgtatgccattcttccagaaatgatttattga
    gctcctttgggcatcatgctaagtacaggggaaacag
    ctgtgaagaggtccttcccttatgaagtcattcatcc
    ccttcagtaaatgaaggtaaaggaaaaggatgagaca
    gggacgccgtgttggaccagggtcagaaaggccttat
    aagaccttgcctggagggcaaggaacttgcctgtgag
    taaggagagcttgagaaagcgataaagcaaagaagga
    acattactgcattgtgttttagaaaaaccatgtcctg
    gggaagaactcctagagtcaggggggccagttgggag
    actgtgcttttttccaggaggagataagtgaggctgc
    tggctgagatggagcaaggatttagagaagcagatat
    gagattcatttagaagttagacattttaggatctgac
    acataatttatcaccaaaaccagtgcatctctggctt
    tgggccaccagttttggagaagtggaatgtagggacc
    taccattacctgccaatctttactacacagatgccta
    tttccctcctcatatttcctttctccagatcacgtcc
    tattctattgccaggactcaagattccaccttgcatg
    cagtgatccatcttcacactggatggacagctctagg
    gatgtcagagcacactcttgtccatactgctgactgg
    gtctcctgtcagcccatctgtctatcagctgtggtat
    tattagtataataagagggctgtatatgagagacaca
    aaattctaggtgtagctcaaagataggctagagttat
    tcctatgtacaacaaatatttatgggaccccttctgt
    gtactgtcatggttgctgctttcatcatacttgtagt
    ctaatggaggtgggggcagggcaggaataagcggatg
    tccacaaaatcagtaagaccacttatattcaacattt
    tcataatttagttatttgagcccaaagggtccacatc
    cgtggtattccaacttttttttccccggacatggatc
    tttatctttttttttttttcttttttgcggccagacc
    tgcggcatatggaagttcccaggccaggggttgaatg
    ggagttgcagctgcctggtctacaccacagccacagc
    aaggtgggatctgagctgcatctgtgacatacaccgc
    agctgaggtaacaccagattctgaacccactgaatga
    ggccagggatggaacccgtctccttatgaacactatg
    tcatgttcttcaccctctgagccacaacgggaactcc
    agacttcgtctttaaatgtattctgacttggagagct
    atcacactaagcaattaacaggagctgacctggttta
    ggctggggtggggccctactcctcaatgttccctgag
    gcacatctgtgggacccctgggcatcatctatctgag
    cagccttagagctgctcatccagttgactgttgatgt
    agaagtgcaaacttctgccttccttatttgttgcttt
    cttttttcattgttctctcccctttgtgtctttaagC
    AAGGGCATCGTAGAGCCTCCAGAAGGGACTAAGATTT
    ACAAGGATTCCTGGGATTTTGGCCCATATTTGAATAT
    CTTGAATGCTGCTATAGGAGATGAAATATTTCGTCAC
    TCATCCTGGATAAAAGAATACTTCACTTGGGCTGGAT
    TTAAGGATTATAACCTGGTGGTCAGGgtatgctatga
    agttattatttgtttttgttttcttgtattacagagc
    tatatgaaaacctcttagtattccagttggtttctca
    ataagcattcattgagccttactgactgtcagacgga
    gggcgtattggactatgtgctgaaacaatcctttgtt
    gaaaatgtagggaatgttgaaaatgtagggaatgaaa
    tgtagatccagctctgtttctcttttggaggattctt
    tttcctccatcaccgtgtcttggttcttgtttgtttt
    gggtttttgtgggtgttgtattgtgttgtgttggtta
    tggcagtgacagctatttaaactgtgaaacgggggag
    ttcccgtcgtggcgcagtggttaacgaatccgactgg
    gaaccatgaggttgcgggttcggtccctgcccttgct
    cagtgggttaacgatccggcgttgccgtgagctgtgg
    tgtaggttgcagacacggctcggatcccgcgttgctg
    tggctctagcgtaggccagcggctacagctccgattg
    gacccctagcctgggaacctccatatgccgcaggagc
    ggcccaaagaaatagcaaaaagacaaaataaataaat
    aaataaataagtaagtaaaataaactgtgaaacgggg
    agttcccttcatggctcagcagttaacaaacccagct
    aggatccatgaggatgtaggttcgatccctggccttg
    ctcagtgggttaagaatccagcgttgctgtgagctgt
    gatgtaggtcgcagatgcagcccagatcctgcattgc
    tgtggctgtggcgtaggctggcagctgaagctccgat
    tcaacccctagcctgggaacatccatatgctgcaggt
    gtggccttaagaggcaaaaaaataaaaaaataaaaaa
    taaataaattgtgggacagacaggtggctccactgca
    gagctggtgtcctgtagcagcctggaagcaggtaagg
    taaggactgcagctgggtaaggactgaattgcaccaa
    ctgggaagtaagcctagatctagaacttaagttagcc
    ctgacatagacacacagagctcaccagctaagtggtt
    cagcttataagctggtcactgaaactgaggatgtcca
    caaaagcaaaataagtagcaacaggcagcgggatgca
    agagaaagaggaggcctaaaatggtctgggaatccct
    gccatacctatattttatcctacttatatttagtgcc
    tgaatgtgtgcctggagagcaaagtttagggaaagca
    tcgggaaatgcacagtattcatacccttaggaacaaa
    gatcagttacctccagggtaaagactatttccaagtt
    taaatttcaacccctgaacattagtactgggtaccag
    gcaacacttgccatcctcaaaatcaatgaatcctaaa
    attcaacctgggggtcagtgacagtctgtgacaaagt
    ttttgctggtcagtaacgaaataagtatgagcaccat
    ctgagtatggtcaccaagatgtcaactctctttcctt
    tggacgaattgtcattattccaagattaggtcctttc
    tatttttgaggtgtgaaaacatctttcctttcataaa
    ataaaaggatagtaggtggaagaattttttttgtttt
    ttggtctttttgctatttctttgggccgcttctgcag
    catatggaggttcccaggccaggggtcgaatcggagc
    tttagccaccggcccacgccagagccacagcaacacg
    ggatccaagccgcatctgcagcctacaccacagctca
    cggcaatgccggatcgttaacccactgagcaagggca
    gggaccgaacccgcaacctcatggttcctagtcggat
    tcgttaaccactgcgccacgacgggaactcctaatga
    tactcttttatatttagctactatgtgatgatgagaa
    acagtccacattttattattttttagccaatttgata
    tctcattactaagataatgataattttctctataaat
    tttatttaagttagtgttatgaagtggttttgctagt
    gtagaaggctaggatttgaattcagttcaagaaagaa
    gagagggagggagggagagggatgggtagagggatgg
    ggcagtgggagagagcaaagaggagagacagtttttg
    tattaattctgcttcattgctatcatttaagggcact
    tgggtcttgcacattctagaattttctaaggaccttg
    accgccagattgatatgcttcttccctttaccatgtt
    gtcatttgaacagATGATTGAGACAGATGAGGACTTC
    AGCCCTTTGCCTGGAGGATATGACTATTTGGTTGACT
    TTCTGGATTTATCCTTTCCAAAAGAAAGACCAAGCCG
    GGAACATCCATATGAGGAAgtaagcaggaataccagt
    ggaagtgcccctttcttccttccttcctaaataaact
    tttttattttggaacaactttagagttacagaaaagt
    tgcaaagatattatagacagtagtgtttatatatata
    tataaatttttttttgctttttatgaccacacctgtg
    gcatatggaggttcccagtctaggggttgaattggag
    ctacagctgccagtctgtgccataaccacagcaatgc
    aggatctgggccacgtctgtgacctacaccaaagctc
    acagctggattcttaacccactgagcaaggccaggga
    ttgaacctgcatcctcgtggttcctagttggattcgt
    ttccgctttgccgcaatgggaactccaaattattgtt
    aatatcttactttactggggtacatttgttacaacca
    atactctgatactgaaacattactgttaactccgtac
    ttgcttctttttgagtcatttgcaaagactggcttct
    tgacctgcttccttccaaacagctggcctgcctatgc
    tgttctcagacctgcaagcactgatctctgcccccct
    tgccttctctccagtggtgtctccttccccaaacaaa
    cccagtgtggctctggaaagggagttaagtcaacata
    aaccaacacatattttgttgagctccaattttgagca
    aatccctcaccacggcagacaggcatgatgttaagaa
    ctagggctttggacacaaggtcaagaccaagaagggt
    tcctcacccctactgattcagataaccaataatgagg
    ctttgaatccctgtccaaaggttgttttttttccctt
    ctattgagcttcttgccaccttatcagttttttttat
    gacagtcaaatgacatgatatatgtgagcatacatgg
    taatttttaattctatataaatgaatcactaaataaa
    ttaggaggatatatagtccacctttaagcgtattaca
    cgtgtcacatgaatgtgtggcgacttaattgtagagg
    tttaaatgtagcttcctataatagatgtgttcctaaa
    ctacattttaatcattggacttgtatttttatgttag
    cacttgctgttgaagaaaagcctatgccaaaagttca
    gtgaaaccaataatccactgccagctttctgagttaa
    aaaaaatccctgggttttcacacacaggaacaccctg
    tgtgaaacactcatttagagcaaaatgcatctgataa
    ggagttcctgttgtgcctcaactggttaaggacctga
    cattctccatgagaatgtgagtttgatccccggcccc
    actcgatgggttaaggatctggtgttgccacaaactg
    cagctccgattcatctcctagcctagaaacttccaca
    gcccagaatatgccacagaattcggctgtttaaaaaa
    aaaaagaaaaaaaaaagaatcataaatgtgttggttt
    gttcaccaaatacatgataacttgctcttgccaagct
    cagcttcataaatattaagtcatttaatacagcagcc
    accttatgaacagatattactatacttcccatttaca
    gataaggaaaatgccatatttaaccaagagattaaat
    aactttcccgaggtcttatagcaagtaaatcatggtg
    caggggtttgaccacacgcagtctatctccagagtct
    gtgtatttagccactgttttactttcaaatttaaatt
    tataaaacttctaaattatctgttaaccataatcttt
    ggaatttttaaaaccacgagttcctataaaatgtttc
    attgaaagtaagtcacttttccatagcttttgataat
    acatctgtaggataaagtaagccacagctctcttgca
    gacttggtacaccctggggcaaagcatcatgcctgtc
    acgtacatggtggtccttactttgactctcagtgctt
    ttattgcccaggaattttgtgagatttctagttgttg
    aggtttgtttaaagaggttatgccggtacttggaaga
    gctcttttcttgctacctggagccttctcatatttcc
    tttttgaggagggacatgaattgcctttcaaactcat
    aaatatattttctagtacacaagtctccatcttcctt
    agacgcatggctcctggagttctccatcctcctgctc
    cactttgggtgggctcctctctgggtctgccaccaat
    ctgccacccagagacatccttgacccacttccagacc
    ccaccatggcttcactttcttcgctttcctcctttgt
    ggaaccttctgcttaagaatctgaggaagaaaatttg
    cacgtgagctaaactggaggtactttcctgcctggtc
    ttgcacgatagcttggctgagcccatgatgctgggtg
    gctgttactttccatggacacccgaaggcgttgctcc
    tttggcttctagttgcatgcagtgttgcttatcccag
    gctgatctttcttccactgtaggtgacttttaagaat
    taagggattaatctatatctacaacaacaacaacaaa
    gaccttttcaagctgaggtagggctttctgtatatgt
    ttggagtggttatccagcagactttacttgaaggcag
    gggtcatatcctcaagtgctcataaacggaccacaga
    aagatctcataattgggtggagctgggtggggaccgt
    gtcatgtggccaggaaatgccagatgggaagggagtg
    gcccttactgagctccagctgaactctgaattttcta
    gaaaactcagaaatctggatttttcatgtgtaatacc
    cagatttatagatgtggaaagctaattcttttttttt
    ttaagggactataggcaatgaactaagatctaggttg
    tatttggacaaggggtcatcagtttaagctgtgtagt
    tgagcgctcagctattgggctgagggacccctaaata
    ctgagacggggaggtccttgctctggggcatcacaag
    tacactccctggtctcattcaaacacttttcctacaa
    aattgatcccatttcttcagtgcactgtctgaatgca
    tttggcccagagccgtgctgaggcatagggaaggggt
    ccacggtttcatggcatcgttttgtgctgtgtgtccc
    tgctgtcgtccaggatacctacctctcctcctcctgc
    atctgaatgtccccccacagactctctgggattctac
    agcctctggcctgttcctcagacacctcttacctgcc
    agctttccagattcacattagttagtccaaatctact
    gccgtcagtgactcacttcatttcttcttctccgagg
    cagttcagcccggtacagttgttttgtcaacacttca
    gttgagtctggaagatgtgcatgggttatgcacgaga
    gcggtccatcattttgagctagaagtcctttctcagc
    ccagagacaagtcctcatctcctttacttcctgactc
    ttcttcctctgcatccttccaagatatctctttctcc
    agccaccacctaaatctcttcttttcccggggttccg
    tgctcaacccactcttcttcttaaatctgtggctggg
    tgaacgcatctgctggcaccacttctctgctaaagac
    tccaaaaatccataggtcctgcccggcctttgcccac
    ctctctccaacactgtccagctttagatgtagagcta
    atccccccagagatatcattccctggatgtctaagtc
    ctttggtatctcactttcagcgtgttcaaaatcctct
    tacaactgttctttctccttttccatcttgattattg
    gcaacatgccagcctttcccctacccccagcagtgag
    ccaagctagaaacaagggcttaatcttcaatctttcc
    ttctccatccctaaacctaatgagtctccaagccctt
    cccagtttacaccctaaatgttgctcaaaacatcccc
    tagttcttccacgtgctctcctctatattgaaaggtc
    aagaaaggccatcttccctccactgtgaggaaataga
    tcttgatactgcccctgagctgggcagtcctcgacct
    gacaaactgtgcagtgtttctaaatctctactggcaa
    aatgagagtgcctttgacctgtgttgcgatctcagat
    cacagtggatgtaattgttttataggaatggtgaacg
    aaaaagaagtaaatccctaatgccaaactcctgatca
    ttctatgtcatttaatagcctgtcatttatgataaag
    tttcctctactggcattagcacaatacttctcaggaa
    aaaaaaatatgatgccagatactgaaaagctcctggg
    taaacatgaacatgggtaccgataaaatggtgaagcc
    agtccaatcttagagtgacttcccttcatgctacttc
    atgctcttttttttttttttttttaagaaaaacccct
    tttttttttctcacaccagtcacagaggagaccgagg
    cttagcaaggttaaggtcacatgattagtaagtgctg
    ggctgaaactcaaaaccatctctgcttgtctcctaac
    cctgtgcacctctgactattcaacagATCCTGTGTCA
    GGAGTTGGGATTCTTTGAAGgtaagggccttgaccac
    cgaattaaggtaatcttgctctgtggcaggccttgtt
    ttcagtattttaagtacactggctcaggtaatcctca
    caacagccccaggaggaatgttctattacctccactg
    tatagatgaggaacttgaggcacagaatggttgccaa
    ggtcacacagctatattgggggttcatacccagccat
    ccaactctgtctgtactctctgccactctgcaccccc
    agctcctgatccacttcctgtttccatccctcgattt
    ctgctgcactcaggggcccctctccccctcggcctgt
    gagatctgcttcagtaggcttttctccctgactcctc
    catccctgtccttacaggcagctgcttctctccggga
    cacgaggggtccatacggacactctctactggctggg
    ttgcgcctaactcgtgattcctcctctgtttcagATT
    CGGAGCCGGGTTGATGTCATCAGACACGTGGTAAAGA
    ATGGTCTGCTCTGGGATGACTTGTACATAGGATTCCA
    AACCCGGCTTCAGCGGGATCCTGATATATACCATCAT
    CTgtaagtccgaaaatgcctgtcgtgtgtgccttagg
    ctgctgcggaggaggccagggctatataagcagagtc
    agtgactgactgtgccctgcagtgttgatggccatgg
    agattccaccgttagagcttttttctttgttaacctt
    gaaggcaaatctggttaggaagataactttcaaagag
    tcaccatctggacattcatgcccatgtgcttcaatcc
    tgtatacaagcagtttagagtacagggaagggaagga
    cattatgaaagggagagggtgtgtttggatccagcag
    ctccatcctcagaatttatctgaagacactgcaaaat
    tactaagaatcactatgacaagaatgaggatggggtg
    atatggcaaagttgtgatcctggaagaccttcatctc
    ccatgttgcccaactctgaacatgaatttggtgaact
    agttggttaaggggatgatcctccaagtttctccctg
    gttgagctccaaaaaccatgtaagtttctcatagcaa
    aaccgtataggtccttagggctttagttggaatattt
    gtgctgaaatgctggaaagccccatttgccatttttg
    tatttgcaaaataatcatcaagaggggagaatgcatt
    ctttcatgaccactgaccctctgaaaaggtcaggaat
    ttagtctgaagtaggcaagcctcctaccccgcttctg
    ccatgagcttgcacgcacaggcctglcttgacatttc
    ttctttatagatttctttttgaatatcttgaaattgc
    tttaaaaatatttaaagaatgtagaattatataaaat
    aaaaaggaaataaccccacacctcccacaaaaccctg
    tttcctgcctttctccacccactctccagggtaacac
    ttggtaacagcatagttgtatcaccccaggcctattt
    ttgagcatatcagcatttcaagaaatgtattttttct
    caataaaacatcccttatagttgaggaggggaggtta
    tcattcctgggttttgttttttttttttttttaatgt
    aatcctggtacatcggtaatttgcattttttattcat
    taatatctttggtatttctagtgttgggacacacagg
    tcaacctcagtttttgggtttttttttttgtcttttt
    gtctttctagggccacacctgcagcatatggacgttc
    ccaagctaggagtctaatcagagctgtagccaccagc
    ctacgtcatagccatagcaacgtcagatccaagccgt
    gtctgtgacctacaagcacagctcatggcaacaccgg
    atccttaaccactgaacgaggccaggggatcgaacac
    acatcctcatggatcctagtcatgttcattaaccact
    gagtcatgatgggaactccaacttcaactattttaat
    gtctgtaaaacattccatttggaaaccatttcatttg
    taaagcaaaatgaaaacattttgttcattttcaacag
    agttcgtagctgacttctgttctggaaaaaaggaaat
    ggagcaaatttgagtgagaaagattcaaagataactt
    ttcttttaaaaaaaattatatcttggaaacttctggg
    ctattgattctgaagactatttttctatatactgttt
    tgatagcaaagttcataaatgtgaaaggatcctgcga
    tgaatcttgggaagcagtcatagcccaatatatcttt
    gttgcttttaaaatgagatttagtttactaaatattt
    ttctgatcataaaaataacacagatctaccgcagaaa
    atttggaaaaaaaaaaacttttaaattcaaaaaacag
    ttaaaccacaaatgatcccaccatccagagagcaatt
    tgtactttggtgtctagttcatctttctttttctgtt
    tacaagcacatataccacaagcattttttcaaaaaat
    gaaaatgggataatactatacatacgtctgtacacct
    gcatagttactgaacagtctttgatctaccctgtaag
    tttctaacttttcattatttgaaatgatgttttggca
    aagaaatatgtaggtgtgtctcgcacactttcataat
    gatttcttaggataaatttcttaggataaattcataa
    tgatttcttataataatccatactctgccaactgatc
    ttcagggaagccaactcgccttctcagaaataacata
    taacccatttacttgccctctcaccaatactaggtcc
    taatgtttttgtgtacagattctatatttttacatac
    aagaattccttaaagcaaggcatgtcacagaaaaata
    gaaggaagacacaattgtcatgtttaaggactgcatt
    ctgtaccaaaaatgctaagttaaatgaacatctgaaa
    cagtacagaaacgctatctttcagggaaagctgagta
    ccaggtactgaacagattttggcaaatacagcaggca
    tggatgtttccaaaacatgtttttctactttatctct
    tacagGTTTTGGAATCATTTTCAAATAAAACTCCCCC
    TCACACCACCTGACTGGAAGTCCTTCCTGATGTGCTC
    TGGGTAGAGAGGACCTGAGCTGTCCCAGgtaaagcat
    cctgcaggtctgggagacactcttattctccagccca
    tcacactgtgtttggcatcagaattaagcaggcacta
    tgcctatcagaaaacctgacttttgggggaatgaaag
    aagctaacattacaagaatgtctgtgtttaaaaataa
    gtcaataagggagttcccatcgtggctcagtggtaac
    gaaccctactagtatccattgaggacacaggttcaat
    atctggcctcactcagtcggctaaggatccagtgatg
    ccgtgagctgcagtgtaggccacagacgtggctcaga
    tctggtgctgctgtggctatggtgtaggccggccccc
    tgtaactccaattcgacccctaggctgggaacctaaa
    aagaccccaaaaaagtcgctatgaatagtgaatacat
    ccagcccaaagtccacagactctttggtctggttgtg
    gcaaacatacagccagttaacaaacaagacaaaaatt
    atcctaggtggtcagtgggggttcagagctgaatcct
    gaacactggaaggaaaacagcaaccaaatccaaatac
    tgtatggttttgcttatatgtagaatctaaattcaaa
    gcaaatgagcaaaccaattgaaacagttatggaagac
    aagcaggtggttgtcaggggggagataaggggaggca
    ggaaagacctgggcgagggagattaagaggtaccaac
    tttcagttgcaaaacaaatgagtcaccagtatgaaat
    gtgcaatgtgggaaatacaggccataactttataatc
    tcttttttttttttgtcttttttgccttttctaaggc
    tgctcccgtggcatatggaggttcccaggctaggagt
    ccaaacagagctgtagctgccagcctacaccagagcc
    acagcaacacgggaaccttaacccgctgagcaaggcc
    agggatcgaacccgagtcctcacagatgccagtaggg
    ttcattaaccactgagccacgacaggaattccagggt
    ctgttgtgttcttaaaacacttccaggagagtgagtg
    gtatgtcataagtaaacaataaatgttaaccacaaca
    agcttatgaaataaacaggaaagccatatgacctaca
    atcagtcattgggagaatccacaaaaggttgagcaga
    ggatcaattccagctcacactccagttttagattctc
    ccctgccttaaagcatcacagactacataatctgagc
    tgaagaataaaaattaaaactcaccccagtgcaaaac
    agaaatgaaaaagtattaaaacgaggttcatactgtt
    gttcattagcaatatcttttattcacagGGGTGCCCA
    ACAACATGAAAAAATCAAGAATTTATTGCTGCTACGT
    CAAAGCTTATACCAGAGATTATGCCTTATAGACATTA
    GCAATGGATAATTATATGTTGCACTTGTGAAATGTGC
    ACATATCCTGTTTATGAATCACCACATAGCCAGATTA
    TCAATATTTTACTTATTTCGTAAAAAATCCACAATTT
    TCCATAACAGAATCAACGTGTGCAATAGGAACAAGAT
    TGCTATGGAAAACGAGGGTAACAGGAGGAGATATTAA
    TCCAAGCATAGAAGAAATAGACAAATGAGGGGCCATA
    AGGGGAATATAGGG
  • TABLE 13
    SEQ ID NO. 49
    TCTAATGCCTTGTGGAAGCAAATGAGCCACAGAAGCT
    GAAGGAAAAACCACCATTCTTTCTTAATACCTGGAGA
    GAGGCAACGACAGACTATGAGCAG
    gcaagtgagagggggctttagctgtcagggaaggcgg
    agataaacccttgatgggtaggatggccattgaaagg
    aggggagaaatttgccccagcaggtagccaccaagct
    tggggacttggagggagggctttcaaacgtattttca
    taaaaaagacctgtggagctgtcaatgctcagggatt
    ctctcttaaaatctaacagtattaatctgctaaaaca
    tttgccttttcatag
    CATCGAACAAACGACGGAGATCCTGTGTGCCTCTCAC
    CTGCCGAAGCTGCCAATCTCAAGGAAGGAATCAATTT
    TGTTCGAAATAAGAGCACTGGCAAGGATTACATCTTA
    TTTAAGAATAAGAGCCGCCTGAAGGCATGTAAGAACA
    TGTGCAAGCACCAAGGAGGCCTCTTCATTAAAGACAT
    TGAGGATCTAAATGGAAG
    gtactgagaatcctttgctttctccctggcgatcctt
    tctcccaattaggtttggcaggaaatgtgctcattga
    gaaattttaaatgatccaatcaacatgctatttcccc
    cagcacatgcctaactttttcttaagctcctttacgg
    cagctctctgattttgatttatgaccttgacttaatt
    tcccatcctctctgaagaactattgtttaaaatgtat
    tcctagttgataaacagtgaaacttctaaggcacatg
    tgtgtgtgtgtgtgtgtgtgtgtgtgtttaccagctt
    ttatattcaaagactcaagcctcttttggatttcctt
    tcctgctctctcagaagtgtgtgtgtgaggtgagtgc
    ttgtccaaacactgccctagaacagagagactttccc
    tgatgaaaacccgaaaaatggcagagctctagctgca
    cctggcctcaacagcggctcttctgatcatttcttgg
    aagaacgagtgctggtaccccttttccccagcccctt
    gattaaacctgcatatcgcttgcctccccatctcagg
    agcaattctaggagggagggtgggctttcttttcagg
    attgacaaagctacccagcttgcaaaccagggggatc
    tggggggggggtttgcacctgatgctcccccactgat
    aatgaatgagggattgaccccatcttttcaagctttg
    cttcagcctaacttgactctcgtagtgtttcagccgt
    ttccatattaggcttgtcttccaccgtgtcgtgtcgt
    caatcttatttctcaggtcatctgtgggcagtttagt
    gcgaatggactcagaggtaactggtagctgtccaaga
    gctccctgctctaactgtatagaagatcaccacccaa
    gtctggaatcttcttacactggcccacagacttgcat
    cactgcatacttagcttcagggcccagctcccaggtt
    aagtgctgtcatacctgtagcttgcttggctctgcag
    atagggttgctagattaggcaaatagagggtgcccag
    tcaaatttgcatttcagataaacaacgaatatatttt
    tagttagatatgtttcaggcactgcatgggacatact
    tttggtaggcagcctactctggaagaacctcttggtt
    gtttgctgacagactgcttttgagtcccttgcatctt
    ctgggtggtttcaagttagggagacctcagccatagg
    ttgttctgtcaccaagaagcttctgcaagcacgtgca
    ggccttgaggtcttccgacttgtggcccggggactct
    gctttttctctgtccttttttctccttagtgggccat
    gtcctgtggtgttgtcttagccagttgtttaagggag
    tgttgcagctttatgattaagagcatggtctttcctt
    gcaaactgcttggtttagaagcctggctccaccactt
    agcggctctgtgacctcggacacatttcttagccttt
    ctgggcctcgctcttcttcctcataaagtgaaaatga
    aagtagacaaagccttctctgtctggctactgagagg
    atggagtgatttcatacacataaagcacttaaaataa
    tgtctggcatatgatacatgctcaataaatgtcactt
    acatttgctattattattactctgccatgatcttgtg
    tagcttaagaacagaggtctttacaggaattcaggct
    gttcttgaatctggcttgctcagcttaatatggtaat
    tgctttgccacagactggtcttcctctccttcaccca
    aagccttagggggtgaacgatcccagtttcaacctat
    tctgttggcaggctaacatggagatggcaccatctta
    gctctgctgcaggtggggagccagattcacccagctt
    tgctcccagatacagctccccaagcatttatatgctg
    aaactccatcccaagagcagtctacatggtacactcc
    cccatccatctctccaaatttggctgcttctacttag
    gctctctgtgcagcaattcacctgaaatatctcttcc
    acgatacagtcaagggcagtgacctacctgttccacc
    ttcccttcctcagccatttttcttctttgtacataat
    caagatcaggaactctcataagctgtggtcctcattt
    tgtcaatctaatttcacagcctcttggcacatgaagc
    tgtcctctctctcctttctgcctactgcccatgagca
    gttgtgacactgccacatttctcctttaacgacccag
    cctgctgaatagctgcatttggaatgttttcaatttt
    tgttaatttatttatttcatctttttttttttttttt
    tttttttttttttttagggccgcacccatgggatatg
    gaggttcccaggctagggatccaatgggagctgtagc
    tgctggcctacaccacagccacagcaatgcacaattc
    gagccaatctttgacctacaccagagctcacggcaac
    actggattcttaacccactgattgaggccagggatca
    aactctcgtcctcatagatacgagtcagattcgttaa
    cctctgagccatgatagttgttagttactcattgatg
    agaaaggaagtgtcacaaaatatcctccataagtcga
    agtttgaatatgttttctgccttgttactagaaaaga
    gcattaaaaattcttgattggaatgaagcttggaaaa
    aatcagcatagtttactgatatataagtgaaaataga
    ccttgttagtttaaaccatctgatatttctggtggaa
    gacatatttgtctgtaaaaaaaaaaaatcttgaacct
    gtttaaaaaaaaaacttgactggaaacactaccaaaa
    tatgggagttcctactgggacacagcagaaatgaatc
    taactagtatccatgaggacacaggtttgatgcctgg
    cctcgctaagtgggttaaggatatggtgttgctgcag
    ctccaattcaacccctatcctgggaacccccatatgc
    caccctaaaaagcaaaaagaaaggtgctgccctaaaa
    agcaaaaagaaagaaagaaagacagccagacagacta
    ccaaatatggagaggaaatggaacttttaggccctat
    ctccaactatcacatccctatcaccgtctggtaagaa
    atggaaaaaatattactaagcctcctttgttgctaca
    attaatctgattctcattctgaagcagtgttgccaga
    gttaacaaataaaaatgcaaagctgggtagttaaatt
    tgaattacagataaacaaattttcagtatatgttcaa
    tatcgtgtaagacgttttaaaataattttttatttat
    ctgaaatttatatttttcctgtattttatctggcaac
    catgatcagaaatctttaaacaatcaggaagtctttt
    ttcttagacaaatgaaaatttgagttgatcttaggtt
    tagtacactatactaggggccaagggttatagtgtga
    ctattaaatcacagataatctttattactacattatt
    tccttatactggccccacttggatcttacccagctta
    gctttgtatgagagtcatccttaaagatgactttatt
    ctttaaaaaaaaaaacaaattttaagggctgcaccca
    tagcatatagaagttcctaggctagcggtcaaattag
    agctgcagctgccagcctatgccacagccacagcaat
    gccagatctgagctgcatctgtgacctacactgcagc
    ttgcagcaatgctggatccttaacccattgaacaatg
    ccagggattgaacacacatcctcatggatactgctca
    ggttcctaacctgctgagccacagttggaactccaaa
    gcagactttattctgatggctctgctgatctctaaca
    cgttattttgtgccatggtgtttatcttcactttact
    caagtcagggaaacacgaagagtctcatacaggataa
    acccaaggagaaatgtgcaaagtcacatacaaatcaa
    actgacaaaaatcaaatacaaggaaaaaatatcttca
    ctttcaaaatcacctactgatgatgagtttatatttc
    cttggatatttgaatattagctatttttttcctttca
    tgagttttgtgttcaaccaactacagtcgtttacttt
    gatcacagaataatgcatttaagccttaaatagatta
    atatttattttcaccatttcataaacctaagtacaat
    ttccatccag
    GTCTGTTAAATGCACAAAACACAACTGGAAGTTAGAT
    GTAAGCAGCATGAAGTATATCAATCCTCCTGGAAGCT
    TCTGTCAAGACGAACTGG
    gtaaataccatcaatactgatcaatgttttctgctgt
    tactgtcattggggtccctcttgtcaacttgtttcca
    atctcattagaagccttggatgcattctgattttaaa
    ctgaggtattttaaaagtaaccatcactgaaaattct
    aggcaagttttctctaaaaaatcccttcattcattca
    tttgttcagtaagtatttgatgagaccttaccatgtg
    taaacattgcactaggtattaagaaatacaaagatgg
    ataagatagagtcggcgtaaatgagatgatataatga
    gacgttataatgaaactcacaattccagttgggaaat
    aaagtccttcaaattccatgactctttctggcacacg
    ttagaggctacagcttctgtgtgattctcatgctggc
    tccacttccactttttccttcttcctactcaagaaag
    cctatagaaatatgagtaagaagggcttaatcatagg
    aataaatttgtctctgttctaagtgattaaaaatgtc
    tttatcagtataaaaagttacttgggaagattcttaa
    aactgcttttacacactgttctagaatgactgttata
    taaataaaaaagtagatttgatctaacacaattaaat
    gacctttggaaatattgactaattctcaccttgcccc
    tcaaagggatgcctgaaccatttccttcttttgccag
    aaagcccccaccctttgtctgttgacctagcctagga
    aatcttcagatcacgttgttagcacgaactggttaca
    tgtgctgtacaaatactatttaattcatctgattaaa
    aaaaaagagataagaagcaaaagtttgactatcttaa
    actgtttgcgtaggtgagaggacaattgaccatctac
    tttatgagtatgtaacccagaaacttaaagctcctta
    agggagctaagtcttttggataagacctatagtgaga
    ccttttagcaaaatggttaagactgaatggagctcac
    tagcgtgggttcatatcctgatgctcaaacacgcaat
    taaatgactttaggtgggttagtctctgttccttagt
    ttcctcaatgggagataatattggtagtagcgatttt
    actgggttgttgaaagaacatctgttaaatgttcaga
    acgtgttacgacagagtacagagtaatgatttgcttg
    tatatgtatgactcaaatagtctgccatatgccttgt
    gactgggtcctgtggagcaggaaggagggatttccca
    cccagcagaaagttgggtaaactggaaaatagactga
    ggccaggaaatgatgcaaagcgttgatgttcactgcc
    acggcaggtgaagggcagggccagagttgtcagtagg
    gtcaggggaggactggaaataaccaagacccactgca
    cttttcagcctttgctccagtaaggtaatgttgtgag
    agtagaaaattttgttaacagaacccacttttcagta
    cagtgctaccaaactgtagtgatttcataccacatcc
    caagaaagaaaaagatggctcaatcccatgtgagctg
    agattatttggttttattgttaaataaatagcattgt
    gtggtcatcattaaaaaaggtagatgttaggaaagta
    gaaggaagaagactctcacctacattttcatcactgt
    tttggtatctgccagttgtcaccttggtccccttccc
    cgcctctcccctgcctcctcttcctccttctcctttt
    tttggaatacaattcaggtaccataaaatttaccctt
    ttagagtgtttgactcaatggtttttagtattttcac
    atgttgtgctattactatcactatataattccaggtc
    attcacatcaccccccaaagaaaccttctaactatta
    gcagtccattcccttcttccctcagcccctggcaacc
    actaatctacttactgtctccatggatgttcctatat
    tgaatcaagctagcataaaccccacttgctcatggtc
    ataattcttttttatagtgctaaattacatttgctaa
    tattcaattaaggatttctatgtccatattcataagg
    aatattggtgtgtagttttctctttgtgtgatatctt
    tgtctggttgggggatcagagtaataattactgctct
    catagaatgaattgagaagtgttccctccttttctat
    ttattggaagagtttgtgaagtatattggtattgatt
    cttctttaaacatttggtcagattcaccagtgaagcc
    atctgggccatggctaatctttgtgaaaagttttttg
    attactaattaaatctctttaatttgttatgggtctg
    ctcctcagacgttctagttcttcttgagtcagttttg
    ttcatttgtttcttcctaggactttctccctttcatt
    tggattatttagattgatagtaatatcccccttttaa
    ttcctggctgtagtaatttgggtcttttctctttttt
    cttggtcagtttagctaaaggtttgtaattgtattaa
    tcttttcaaataactaacttttttgttttgtttgttt
    tttgttttttgttttttgttttttgtttttttttgct
    ttttaaggctgcacctgaggcatatggaagttctcag
    gctagaggtctaatcggagctacagctgctggcctat
    accacaaccatagcaatgccagattcaagctgcatct
    gcgacctacaccacaactcggccagggatcacacccg
    caacctcatggttcctagtcggatttgttaaccactg
    tgccacgacgggaactcccgcccattttttttaacac
    ctcatactttaacataaagatgggcttcacatggact
    gatagctcaaatgaggaaggtaagactatgaaagtaa
    tggaagaaatgtagactatttttgtgacctagagatt
    actgatacttcttgacttttcaaacaatacttcaaaa
    gtacagcccaaagggaaaaaagaaagaaaaaagaaac
    acacatatacacaaacctagtgaataagatatcatcg
    atacactacagatttctatgaactggaagaccccatg
    gacaaagttaaagaacatatgatagtttgagtgatta
    ttttgcaatatttacaaccaatgagggaatattatcc
    agcttataggaggaagtaatgcaaatcgacaagaaaa
    agataggaaacccaatataaaaattaagaaaatacaa
    aaattaagaaaggatatgaactagcattttacaaaag
    aaaaatctccaaaagtcaatcagcacatgaaaatatg
    ctcaaacctattaattattagaaaactacagactgaa
    gcaatgaggtgctttactttacatctttttgactgat
    aaaaagttagaaacaaaggtgatatcaaatgtcaggg
    ataaaaggatatagaaatcgtcatgcctgtggtggga
    gtatggccggtgcagtcatgtgggaaggtaatctgac
    agtggttaggcagagcaggtttatgaatacactgtgg
    cccatcaatcccacgcctgtttatgtaccaaagaaat
    cctgttgtggcagaatctatgggtccacccctgggag
    catgaattaataaaatgtggcaccagggtgtgtgaaa
    ctccagctagagatgagatgtccacatggcaacatga
    atgcatcttagaaacatagatttgagtgaaaaagagt
    aagaaacagccgggaaacccaataccatttataaaaa
    ttaaagatgcacacatacaatgtagtaaatattttgc
    atgaactttcaaatggttgcctacagggggggagagt
    aaagaagagtagaaaacaaagataaagggagtaagta
    agtagctctgcctggactgaatataatgtgtcatgaa
    ctgagaaatatggttaacataatcctctaacttgagg
    tcctaaatgaatgaatgagtccactattcatttaccc
    attctttaatgtgtattgcattataatccattttttt
    agaaccaacgaattttgttcccataactactaatcag
    cctgccttttctccctcattcccttatcagctcaggg
    gcattcctagtttttcaaacgttcctcatttgaacca
    aaaatagcatcattgtttaaattatacttgttttcaa
    atacgatgcttatatattccaagtgtgtttgcccatt
    ttcttaggtggtagaaatttttcattctacttttcta
    tctactcagattttcccgttggaattatttccattgc
    tattaaacttagaagtcccccctgtgatatgccattt
    ttttcatactttttaagcacttggttgcttttctttg
    tgtctttaagcacctagaatacttataaccattgcac
    agcactgtgtatcaggcagcccttcctcttccactaa
    tttatggtccttctcttagactatattaaactgttat
    ttaattaggatcctctcttcgtccttatgatttaatt
    attatagttttctaatatgtttttattataattcctc
    ttcattattcctccctattaaaaattttaatgaattc
    catttgtttgttcttctagttaaatattaagtcataa
    tccaaataacttagatgtcattagtttatgtggtcaa
    agtaaggataccacatctttatagatgcaggcagttg
    gcagatgtcatgattttcttcagtgcataaatgcaat
    ttatctttgagcaaggggcataaaaacttttatggta
    ttggctttgaaataatagttaagaactgcagactcag
    tttttcctgcttttcttgaaaaagaacacttctaaag
    aaggaaaatccttaagcatggatatcgatgtaatttt
    ctgaaagtctcctgtaattccttgggatttttgttgt
    tgtttgttggtcggtttttttgggtttttgtttgttt
    gttttgttttgttttgttttgcttttagggctgcacc
    tgtggcatatggaagttcccaggctaggggtccaact
    ggagctacagctgccagcctactccacagccacagca
    acatgggatcctagctgcatctgtgacctaaccacag
    ctcttggtaatgccagattgttaacccactgagcaat
    gccagagatcgaatctgcctcctcatggacactagtc
    agattagtttctgctgagccacaatgggaattcccaa
    ttccttgtatttttgaactggttatgtgctagcatat
    aattttgtttcttgaatctttgtgggttttttttttt
    ttttttttttgtctcttgtctttttaaggctgcaccc
    acagcatatggaggttcccaggctagaggtcaaattg
    gagctacagctgccagcctacacaacaactgcagcaa
    agtggggcccaacttatatgacagttcgtggcaatgc
    cggattcctaacccactgagcagggccagggatcgaa
    cctgagtttccagtcagtttcgttaaccactgagcca
    tgatagtaactcctgtttgttcagtcttgaacctcct
    ttttaattctttattccttgagggtgaaataattgcc
    ataataatactatcatttattacatgccttctctgtg
    ctaggcatagtgacactttaggatttattatatcact
    taatccctacaacaactctgcaaagtatgtatcataa
    tcctatttgacagatcaggaaattgcagcccaggatg
    cagataatatgcatccatcacaagtgactagatatag
    tccctctgctattcagcagggtctcattgcctttcca
    ttccaaatgcaatagtttgcatctattgtatatgtgt
    tttggggtttttttgtctttttttttttttttgtctt
    ttctggggcctcacccttggcataggtaggttcccag
    gctaggggtcaaattgaagctgcagctgccagcctac
    accacagccacagcaactcgggatctgagcctcatct
    gcaacctacaccaaagctcacggcaacaccggatcct
    taacccactgagtgaggccagagatcaaaccggcaac
    ctcatggttcctagtcggattcattaaccactgagcc
    acgatgggaactccctaaatgcaatagtttgctctat
    taaccccaaactcccagtccatcccactccctcctcc
    tccctcttggcaaccacaagtctgttctccatgtcca
    tgattttcttttctggggaaagtttcatttgtgccat
    ttttcattttacgggtaatttttacttcagtttcttc
    cactagcagttgtcttaaagtgagtataattaatatt
    catttggaaaatgtaagcaaaacattttttaaagggc
    catgcccacagcatatgaaagtttctgggccaggggt
    tgaatccaggctccaagttgcagctgtgccctacact
    gcagctgggcaatgctggatcctttaacccactgtgc
    ccggctagggatcaaacctgcatttccacagctaccc
    gagccattgcagttggattcttaacccactgcactac
    agtgggaactcccacaaaacattttttaatgtccttt
    gaataaagtaggaaagtgctcgtctttgagggcaggg
    cggcaatgccatttccacaaggtgctttggcttggga
    cctcatctgctgtcatttagtaatgaataaaattgct
    gacagtaataggattaactgtgtgtggagatagccag
    ggttagagataaaaacactggagaagtcaaataagtt
    gctcgaggtcctctagctaataagctattaagtggga
    gagtgagggctagaaacaggccatctgtctcccaagc
    acatgtccattagtggtttgctgatagccttccagaa
    caacagagaggactctcaaacatggtcttgcctccct
    ccaattgatcccctccatgtgcctcacagcgggtctt
    tctaaaattaagttctgattttaattctcccttgcta
    tagcacttaggtatggctttcagccgtgcaataaaaa
    gcaggcaagagtggctcaatcatataggaggttgttt
    ttcttagatcccaagcaggtaatcctgggcattatgg
    ttgttctgcgtttatcaaggagccaaattctctatca
    cctcctgttctatcctcctcagtatctggctctattc
    ttcagcatctcaagatggcttgtgctcctccaagcat
    ggcagtcaaattccacacaagagggggaaaatgaagg
    gcagacagtgctggtctcctgagctgtccctctttgt
    cggggaaataaatgtattccttcaagtcccgtgagac
    ttctgaagtagacgtctgcttacgtctcacccaccag
    aactatgtaaactgcacatagtgctaggtctacatag
    ccactcataactgccagggggtgggaaatctttaaat
    aggtgtaccaccacacaattaggatgctaatagtaag
    ggagaaggagagaataggttttgcgcaagccaccagc
    atgcctgccacaattgcttaaaattcttcattgaccc
    ctcattgccacaggatgaaatccaaacgccttcttag
    ttgggaatctgacctacctgtctctcccacctggttc
    agacaccattctccttggtcataaaattccagtcatt
    tgtgaacatccagctcccccatgcctccatgcctttg
    cacatgctgttcttttatcttttatgttgtcctttta
    tcttttatccaaaagagatatcccatcatcacatctc
    ttcagcccccaaatactttgtctttcaagttcagctg
    gaggattacctcctatttgaaatcagctttgtctctt
    acaaccaaacaaggttttccttccgagacactcccac
    agcaccttgaactcatctctatcaatcattcatttga
    tgtaatgaagttgttggtggtatgcctgtgtctctga
    cacatctgcgatctcatgagttccttaagtggaatgt
    gaatagcgggatgaacagtattggtcttcagccctca
    tctctgcagatgttgcttgacccaaatgagcgttgcc
    ttttattttgattttgctttgatttgtctactccatg
    tacttgagccatgcatttctgtcttagcgatgctttt
    taaaagtcattttttggttgattatccagatttgtcc
    acctttgcttctag
    TTGTAGAAAAGGATGAAGAAAATGGAGTTTTGCTTCT
    AGAACTAAATCCTCCTAACCCGTGGGATTCAGAACCC
    AGATCTCCTGAAGATTTGGCATTTGGGGAAGTGCAG
    gtaaggaaatgttaaattgcaatattcttaaaaacac
    aaataaagctaacatatcaatttatatatatatatat
    atatatatttttttttttttttacatcttatattacc
    ttgagtattcttggaagtggctagttaggacatataa
    taaagttattctgaagtctttttttttctttttccat
    ggtgagcagtggcttgatgtggatctcagctcccaga
    cgaggcactgaacctgagccgcagtggtgaaagcacc
    aagttctagccactagaccaccagggaactccctatt
    ctaaattcttgagcacattatttaggaacctcaggaa
    cttggcaggattacaggaaatatatctagatttaaaa
    aaaaatcttttaacagaggtcccaaaggagagtcatg
    cacagctatgggaggaagttcagaaactgcccttgct
    accagatcactgtcagataaaatggccagctacatgt
    ttctgcacattgccctaagatctttacaaacttttct
    gtgcatttttccacttttaaaagaaaatttcggggtt
    cctgttgttgctcagtggttaacgaacccaactagta
    tccatggggacaggggttcgagccctggcctcactca
    gtgggttaagaatctggcattgctgtggctgtggcgt
    aggctggcggctacagctcagattggacccctagcct
    gagaacctccatatgccgcaggtatggccctaaaaaa
    aaaaaaaaagagagagagagaatttcctccagaaaaa
    acactttggtagtttgggagaagtaaacaaccaaaaa
    ttaatttttctggagtattcgggaagcttgtaaaaat
    gggctcttacttttttgaggagacaaatgggaaccta
    cccagaagaggcacaatcacctgcatttgatttcttg
    acctctccctaccttctttgctggctttccacatttg
    gatttctgtgaccttatctctgctccttggtgttttc
    atttttcctgtggacgtgccagactatgggaagggag
    taaggcgttgatttagaatcctgtagtctctgcctgt
    ctctagtcattgttttcacccttctcaaaggaccttg
    acatcctgagtgagtccgcaagtaatttaggggagaa
    gccttagaagccagtgcagccaggctacatgactgtg
    tccacccactggaaccagtcatttttatacctattca
    cagcccccctaccatttaaatccccagaggtctgcca
    taacatctgtaactccctttcctggtaaattgtgttc
    taaaagactggtaacaaaagatattctgtggtacaga
    gcataattaaatacctgggagctgatttgagtggggt
    aaatcaactggtttgacccctaaaacccaccatgagc
    atttctgttctaataaagtaatgcccgtgctgggaat
    tgtgttctacggaaatgctcctgctgtgtctttcttg
    agtcctgtgtcattgaacatgcttaggagcaaaggtc
    ccccatgtggcttgtctgctaaccagcccagttcctt
    gttctggctggtaatgatccgatcatctgaatctcac
    tgtcttccaacag
    ATCACGTACCTTACTCACGCCTGCATGGACCTCAAGC
    TGGGGGACAAGAGAATGGTGTTCGACCCTTGGTTAAT
    CGGTCCTGCTTTTGCGCGAGGATGGTGGTTACTACAC
    GAGCCTCCATCTGATTGGCTGGAGAGGCTGAGCCGCG
    CAGACTTAATTTACATCAGTCACATGCACTCAGACCA
    CCTGAG
  • SEQ ID NO. 49 represents contiguous genomic sequences containing Intronic sequence 5′ to Exon 4, Exon 4, Intron 4, Exon 5, Intron 5, Exon 6, Intron 6, Exon 7, Intron 7 and Exon 8 (Table 13). Further, nucleotide sequences that contain at least 1750, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, 4900, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, or 20000 contiguous nucleotides of SEQ ID NO. 49 are provided, as well as nucleotide sequences at least 80, 85, 90, 95, 98, or 99% homologous to SEQ ID NO. 49.
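The fragment-length and percent-homology language above can be illustrated with a minimal Python sketch. It is not part of the disclosed method and rests on stated assumptions: the helper names (`split_by_case`, `is_contiguous_fragment`, `percent_identity`) are hypothetical; the reading of uppercase runs as exonic and lowercase runs as intronic is inferred from the table layout only; and "homologous" is approximated here by ungapped position-by-position identity between equal-length sequences, whereas real comparisons would normally rely on a sequence alignment tool.

```python
from itertools import groupby


def split_by_case(seq):
    """Split a sequence into consecutive uppercase and lowercase runs.

    Assumption: uppercase runs are exonic and lowercase runs intronic,
    as the table layout suggests; the text does not state this rule.
    """
    letters = [ch for ch in seq if ch.isalpha()]  # drop spaces and line breaks
    return [
        ("upper" if is_upper else "lower", "".join(run))
        for is_upper, run in groupby(letters, key=str.isupper)
    ]


def is_contiguous_fragment(query, reference, min_length):
    """True if query has at least min_length letters and occurs as an
    uninterrupted, case-insensitive substring of reference."""
    q, r = query.upper(), reference.upper()
    return len(q) >= min_length and q in r


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences.

    Assumption: a simplified stand-in for "% homologous"; no alignment
    or gap handling is performed.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Short excerpt spanning one lowercase/uppercase boundary in Table 13.
excerpt = "tttgccttttcatagCATCGAAC"
print(split_by_case(excerpt))
print(is_contiguous_fragment("catagcatcg", excerpt, 10))
print(percent_identity("ACGT", "ACGA"))
```

For the thresholds listed above, `min_length` would take values such as 1750 or 2000 and the reference would be the full SEQ ID NO. 49 with whitespace removed.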
  • TABLE 14
    SEQ ID NO. 50
    AGGAATGGAAAGCCCAATTCATTAAAACAGAAAGGAA
    GAAACTCCTGAACTACAAGGCTCGGCTGGTGAAGGAC
    CTACAACCCAGAATTTACTGCCCCTTTCCTGGGTATT
    TCGTGGAATCCCACCCAGCAGACAAGTATGGCTGGAT
    ATTTTATATAACGTGTTTACGCATAAGTTAATATATG
    CTGAATGAGTGATTTAGCTGTGAAACAACATGAAATG
    AGAAAGAATGATTAGTAGGGGTCTGGAGCTTATTTTA
    ACAAGCAGCCTGAAAACAGAGAGTATGAATAAAAAAA
    ATTAAATAC
    gtatggctggatattttatataacgtgtttacgcata
    agttaatatatgctgaatgagtgatttagctgtgaaa
    caacatgaaatgagaaagaatgattagtaggggtctg
    gagcttattttaacaagcagcctgaaaacagagagta
    tgaataaaaaaaattaaatacaagagtgtgctattac
    caattatgtataatagtcttgtacatctaacttcaat
    tccaatcactatatgcttatactaaaaaacgaagtat
    agagtcaaccttctttgactaacagctcttccctagt
    cagggacattagctcaagtatagtctttatttttcct
    ggggtaagaaaagaaggattgggaagtaggaatgcaa
    agaaataaaaaataattctgtcattgttcaaataaga
    atgtcatctgaaaataaactgccttacatgggaatgc
    tcttatttgtcag
    GTATATTAAGGAAACAAACATCAAAAATGACCCAAAT
    GAACTCAACAATCTTATCAAGAAGAATTCTGAGGTGG
    TAACCTGGACCCCAAGACCTGGAGCCACTCTTGATCT
    GGGTAGGATGCTAAAGGACCCAACAGACAG
    gtttgacttgaatatttacagggaacaaaaatgattt
    ctgaattttttcatgtttatgagaaaataaagggcat
    acctatggcctcttggcaggtccctgtttgtaggaat
    attaagtttttcttgactagcatcctgagcttgtcat
    gcattaagatctacacaccaccctttaaagtgggagt
    cttactgtataaaataaactattaaataagtatcttt
    caactctggggtggggggggagactgagttttttcac
    agtcctatataataattttcttatcctataaaataat
    taggagttcccgtagtggctcagcaatagcaaacccg
    actagtatcgatgaggatgcgggttcgattcctggcc
    cccctcagtgggttaaggatctggcattgccgtgagc
    tgtggtgtaggtggcagacacggctcagatcccacgt
    tactgtggctgtggcataggccagcagctccagctct
    gattagacccttagcctgggaacttccatatgctgtg
    ggtgtggccttgaaaaaaaataaataaataagataat
    tactcaaatgttttccttgtctcagaaccttacttca
    ggataaagagtgagaaagttttttttatgaagggcca
    ttattacagctcaaaaataagttgtcttcagcaagta
    gaaagcaataagcctgagagttagtgttcctatcagt
    gtaaatattacctcctcgccaatccccagacagtcca
    tttgaacaattaacggtgccctgggagtacagttcag
    aaacattaatgtggatgttccagacctgtatttttat
    aagtaccttgagccggatggaaccatcattcctcacc
    attatttagaagtggactgtgactctgttggagatca
    gggcacacggttaccaaaagcacacccttctcctggc
    cttacctttgcaaagctggggtctgggacacagtcag
    ctgattatacccttttactaacttcccacagctcaaa
    tctggtcaattctccttcacaaatctcttaaaaatcc
    atcactcacctccagcctcttctgctgtggccttgat
    tcagcctctcacaatttttttttaaccagaattctgg
    cagtggcccctgacttgcctctgtgctcccagccccg
    ctgtcctctgatccatcctccatgccagcctttttca
    atctgctggtcacgattcattgatgggttaggaaatc
    aatggcatcacaactagcatttagaaaaaggaaatag
    gcgttcccgccgtggcacagcagaaataaatccgact
    aggaaccataaggttgcgggttcaacccctggccttg
    ttcagtgggttaaggatccggcattgccgtgggctgt
    tttgtaagtcacagacatggctctgatccggcattgc
    tgtggctctggcgtaggcctgcagcatcagctccaat
    tagacccctatcctgggagcctccatatgctgcaagt
    gcagccctaaaaaaaataaaaaaataaaaaaaaataa
    ataaaagaagtagacaaattgtatagaacaaccctga
    gtatgttgcctgagcacatataacaagggtaagtatt
    atttcaggaaactctggtttcacagatactcttggca
    tatggacccctagagtcctgatgtaaaatatattctt
    cctgggatcttaggcaagaagtttgaaagctccaact
    ctgcactgctgccaaagaaatgatttttaagtgcaaa
    actcttcccgttcccttccctgtataaaattccatag
    gatctctccagtgcctctaggataaaggcagttttca
    ttctctagttcaaggtgagagaagattttaattattt
    cacgttttagtggggaattcaagagtctggcacctga
    catttgctgaactctctccattatccctctctagttc
    cccagacgcatcctatggtagaaattcgcaaactaga
    gtgagcgtcagagtaacccaaggaaactgggtaaatg
    cagctccctgggctctaccccctgagattctgattca
    gtagatctgaagcagagccctggaatatgcatatgca
    tcattgtgtcacaccaagcattctgggtaatgagagt
    tgatgttaggttctcagtagtaagacaagtatagaga
    ttccgggggactgagtgctcagctctgccttggggag
    gagggagagggctaaagagaacaggagatggggacag
    ggaatgctcaacctccaatcttaggcatttgagctat
    gtcttaggggtcaggaggaggttaccaatatagtgat
    taagagattgaggttccagtcagagggatatgctgga
    gaaggggggtgaaaataatgtcataggtttggtgagt
    gcagatactttgagttttttaatatttttattgaaat
    atagttgatttacaatgctcttagtgagtacaattac
    tttgaataagtgcatagatgtatgccattcttccaga
    aatgatttattgagctcctttgggcatcatgctaagt
    acaggggaaacagctgtgaagaggtccttcccttatg
    aagtcattcatccccttcagtaaatgaaggtaaagga
    aaaggatgagacagggacgccgtgttggaccagggtc
    agaaaggccttataagaccttgcctggagggcaagga
    acttgcctgtgagtaaggagagcttgagaaagcgata
    aagcaaagaaggaacattactgcattgtgttttagaa
    aaaccatgtcctggggaagaactcctagagtcagggg
    ggccagttgggagactgtgcttttttccaggaggaga
    taagtgaggctgctggctgagatggagcaaggattta
    gagaagcagatatgagattcatttagaagttagacat
    tttaggatctgacacataatttatcaccaaaaccagt
    gcatctctggctttgggccaccagttttggagaagtg
    gaatgtagggacctaccattacctgccaatctttact
    acacagatgcctatttccctcctcatatttcctttct
    ccagatcacgtcctattctattgccaggactcaagat
    tccaccttgcatgcagtgatccatcttcacactggat
    ggacagctctagggatgtcagagcacactcttgtcca
    tactgctgactgggtctcctgtcagcccatctgtcta
    tcagctgtggtattattagtataataagagggctgta
    tatgagagacacaaaattctaggtgtagctcaaagat
    aggctagagttattcctatgtacaacaaatatttatg
    ggaccccttctgactgtcatggttgctgctttcatca
    tacttgtagtctaatggaggtgggggcagggcaggaa
    taagcggatgtccacaaaatcagtaagaccacttata
    ttcaacattttcataatttagttatttgagcccaaag
    ggtccacatccgtggtattccaacttttttttccccg
    gacatggatctttatctttttttttttttcttttttg
    cggccagacctgcggcatatggaagttcccaggccag
    gggttgaatgggagttgcagctgcctggtctacacca
    cagccacagcaaggtgggatctgagctgcatctgtga
    catacaccgcagctgaggtaacaccagattctgaacc
    cactgaatgaggccagggatggaacccgtctccttat
    gaacactatgtcatgttcttcaccctctgagccacaa
    cgggaactccagacttcgtctttaaatgtattctgac
    ttggagagctatcacactaagcaattaacaggagctg
    acctggtttaggctggggtggggccctactcctcaat
    gttccctgaggcacatctgtgggacccctgggcatca
    tctatctgagcagccttagagctgctcatccagttga
    ctgttgatgtagaagtgcaaacttctgccttccttat
    ttgttgctttcttttttcattgttctctcccctttgt
    gtctttaag
    CAAGGGCATCGTAGAGCCTCCAGAAGGGACTAAGATT
    TACAAGGATTCCTGGGATTTTGGCCCATATTTGAATA
    TCTTGAATGCTGCTATAGGAGATGAAATATTTCGTCA
    CTCATCCTGGATAAAAGAATACTTCACTTGGGCTGGA
    TTTAAGGATTATAACCTGGTGGTCAGG
    gtatgctatgaagttattatttgtttttgttttcttg
    tattacagagctatatgaaaacctcttagtattccag
    ttggtttctcaaaagcattcattgagccttactgact
    gtcagacggagggcgtattggactatgtgctgaaaca
    atcctttgttgaaaatgtagggaatgttgaaaatgta
    gggaatgaaatgtagatccagctctgtttctcttttg
    gaggattctttttcctccatcaccgtgtcttggttct
    tgtttgttttgggtttttgtgggtgttgtattgtgtt
    gtgttggttatggcagtgacagctatttaaactgtga
    aacgggggagttcccgtcgtggcgcagtggttaacga
    atccgactgggaaccatgaggttgcgggttcggtccc
    tgcccttgctcagtgggttaacgatccggcgttgccg
    tgagctgtggtgtaggttgcagacacggctcggatcc
    cgcgttgctgtggctctagcgtaggccagcggctaca
    gctccgattggacccctagcctgggaacctccatatg
    ccgcaggagcggcccaaagaaatagcaaaaagacaaa
    ataaataaataaataaataagtaagtaaaataaactg
    tgaaacggggagttcccttcatggctcagcagttaac
    aaacccagctaggatccatgaggatgtaggttcgatc
    cctggccttgctcagtgggttaagaatccagcgttgc
    tgtgagctgtgatgtaggtcgcagatgcagcccagat
    cctgcattgctgtggctgtggcgtaggctggcagctg
    aagctccgattcaacccctagcctgggaacatccata
    tgctgcaggtgtggccttaagaggcaaaaaaataaaa
    aaataaaaaataaataaattgtgggacagacaggtgg
    ctccactgcagagctggtgtcctgtagcagcctggaa
    gcaggtaaggtaaggactgcagctgggtaaggactga
    attgcaccaactgggaagtaagcctagatctagaact
    taagttagccctgacatagacacacagagctcaccag
    ctaagtggttcagcttataagctggtcactgaaactg
    aggatgtccacaaaagcaaaataagtagcaacaggca
    gcgggatgcaagagaaagaggaggcctaaaatggtct
    gggaatccctgccatacctatattttatcctacttat
    atttagtgcctgaatgtgtgcctggagagcaaagttt
    agggaaagcatcgggaaatgcacagtattcataccct
    taggaacaaagatcagttacctccagggtaaagacta
    tttccaagtttaaatttcaacccctgaacattagtac
    tgggtaccaggcaacacttgccatcctcaaaatcaat
    gaatcctaaaattcaacctgggggtcagtgacagtct
    gtgacaaagtttttgctggtcagtaacgaaataagta
    tgagcaccatctgagtatggtcaccaagatgtcaact
    ctctttcctttggacgaattgtcaltattccaagatt
    aggtcctttctatttttgaggtgtgaaaacatctttc
    ctttcataaaataaaaggatagtaggtggaagaattt
    tttttgttttttggtctttttgctatttctttgggcc
    gcttctgcagcatatggaggttcccaggccaggggtc
    gaatcggagctttagccaccggcccacgccagagcca
    cagcaacacgggatccaagccgcatctgcagcctaca
    ccacagctcacggcaatgccggatcgttaacccactg
    agcaagggcagggaccgaacccgcaacctcatggttc
    ctagtcggattcgttaaccactgcgccacgacgggaa
    ctcctaatgatactcttttatatttagctactatgtg
    atgatgagaaacagtccacattttattattttttagc
    caatttgatatctcattactaagataatgataatttt
    ctctataaattttatttaagttagtgttatgaagtgg
    ttttgctagtgtagaaggctaggatttgaattcagtt
    caagaaagaagagagggagggagggagagggatgggt
    agagggatggggcagtgggagagagcaaagaggagag
    acagtttttgtattaattctgcttcattgctatcatt
    taagggcacttgggtcttgcacattctagaattttct
    aaggaccttgaccgccagattgatatgcttcttccct
    ttaccatgttgtcatttgaacag
    ATGATTGAGACAGATGAGGACTTCAGCCCTTTGCCTG
    GAGGATATGACTATTTGGTTGACTTTCTGGATTTATC
    CTTTCCAAAAGAAAGACCAAGCCGGGAACATCCATAT
    GAGGAA
    gtaagcaggaataccagtggaagtgcccctttcttcc
    ttccttcctaaataaacttttttattttggaacaact
    ttagagttacagaaaagttgcaaagatattatagaca
    gtagtgtttatatatatatataaatttttttttgctt
    tttatgaccacacctgtggcatatggaggttcccagt
    ctaggggttgaattggagctacagctgccagtctgtg
    ccataaccacagcaatgcaggatctgggccacgtctg
    tgacctacaccaaagctcacagctggattcttaaccc
    actgagcaaggccagggattgaacctgcatcctcgtg
    gttcctagttggattcgtttccgctttgccgcaatgg
    gaactccaaattattgttaatatcttactttactggg
    gtacatttgttacaaccaatactctgatactgaaaca
    ttactgttaactccgtacttgcttctttttgagtcat
    ttgcaaagactggcttcttgacctgcttccttccaaa
    cagctggcctgcctatgctgttctcagacctgcaagc
    actgatctctgccccccttgccttctctccagtggtg
    tctccttccccaaacaaacccagtgtggctctggaaa
    gggagttaagtcaacataaaccaacacatattttgtt
    gagctccaattttgagcaaatccctcaccacggcaga
    caggcatgatgttaagaactagggctttggacacaag
    gtcaagaccaagaagggttcctcacccctactgattc
    agataaccaataatgaggctttgaatccctgtccaaa
    ggttgttttttttcccttctattgagcttcttgccac
    cttatcagttttttttatgacagtcaaatgacatgat
    atatgtgagcatacatggtaatttttaattctatata
    aatgaatcactaaataaattaggaggatatatagtcc
    acctttaagcgtattacacgtgtcacatgaatgtgtg
    gcgacttaattgtagaggtttaaatgtagcttcctat
    aatagatgtgttcctaaactacattttaatcattgga
    cttgtatttttatgttagcacttgctgttgaagaaaa
    gcctatgccaaaagttcagtgaaaccaataatccact
    gccagctttctgagttaaaaaaaatccctgggttttc
    acacacaggaacaccctgtgtgaaacactcatttaga
    gcaaaatgcatctgataaggagttcctgttgtgcctc
    aactggttaaggacctgacattctccatgagaatgtg
    agtttgatccccggccccactcgatgggttaaggatc
    tggtgttgccacaaactgcagctccgattcatctcct
    agcctagaaacttccacagcccagaatatgccacaga
    attcggctgtttaaaaaaaaaaagaaaaaaaaaagaa
    tcataaatgtgttggtttgttcaccaaatacatgata
    acttgctcttgccaagctcagcttcataaatattaag
    tcatttaatacagcagccaccttatgaacagatatta
    ctatacttcccatttacagataaggaaaatgccatat
    ttaaccaagagattaaataactttcccgaggtcttat
    agcaagtaaatcatggtgcaggggtttgaccacacgc
    agtctatctccagagtctgtgtatttagccactgttt
    tactttcaaatttaaatttataaaacttctaaattat
    ctgttaaccataatctttggaatttttaaaaccacga
    gttcctataaaatgtttcattgaaagtaagtcacttt
    tccatagcttttgataatacatctgtaggataaagta
    agccacagctctcttgcagacttggtacaccctgggg
    caaagcatcatgcctgtcacgtacatggtggtcctta
    ctttgactctcagtgcttttattgcccaggaattttg
    tgagatttctagttgttgaggtttgtttaaagaggtt
    atgccggtacttggaagagctcttttcttgctacctg
    gagccttctcatatttcctttttgaggagggacatga
    attgcctttcaaactcataaatatattttctagtaca
    caagtctccatcttccttagacgcatggctcctggag
    ttctccatcctcctgctccactttgggtgggctcctc
    tctgggtctgccaccaatctgccacccagagacatcc
    ttgacccacttccagaccccaccatggcttcactttc
    ttcgctttcctcctttgtggaaccttctgcttaagaa
    tctgaggaagaaaatttgcacgtgagctaaactggag
    gtactttcctgcctggtcttgcacgatagcttggctg
    agcccatgatgctgggtggctgttactttccatggac
    acccgaaggcgttgctcctttggcttctagttgcatg
    cagtgttgcttatcccaggctgatctttcttccactg
    taggtgacttttaagaattaagggattaatctatatc
    tacaacaacaacaacaaagaccttttcaagctgaggt
    agggctttctgtatatgtttggagtggttatccagca
    gactttacttgaaggcaggggtcatatcctcaagtgc
    tcataaacggaccacagaaagatctcataattgggtg
    gagctgggtggggaccgtgtcatgtggccaggaaatg
    ccagatgggaagggagtggcccttactgagctccagc
    tgaactctgaattttctagaaaactcagaaatctgga
    tttttcatgtgtaatacccagatttatagatgtggaa
    agctaattctttttttttttaagggactataggcaat
    gaactaagatctaggttgtatttggacaaggggtcat
    cagtttaagctgtgtagttgagcgctcagctattggg
    ctgagggacccctaaatactgagacggggaggtcctt
    gctctggggcatcacaagtacactccctggtctcatt
    caaacacttttcctacaaaattgatcccatttcttca
    gtgcactgtctgaatgcatttggcccagagccgtgct
    gaggcatagggaaggggtccacggtttcatggcatcg
    ttttgtgctgtgtgtccctgctgtcgtccaggatacc
    tacctctcctcctcctgcatctgaatgtccccccaca
    gactctctgggattctacagcctctggcctgttcctc
    agacacctcttacctgccagctttccagattcacatt
    agttagtccaaatctactgccgtcagtgactcacttc
    atttcttcttctccgaggcagttcagcccggtacagt
    tgttttgtcaacacttcagttgagtctggaagatgtg
    catgggttatgcacgagagcggtccatcattttgagc
    tagaagtcctttctcagcccagagacaagtcctcatc
    tcctttacttcctgactcttcttcctctgcatccttc
    caagatatctctttctccagccaccacctaaatctct
    tcttttcccggggttccgtgctcaacccactcttctt
    cttaaatctgtggctgggtgaacgcatctgctggcac
    cacttctctgctaaagactccaaaatccataggtcct
    gcccggcctttgcccacctctctccaacactgtccag
    ctttagatgtagagctaatccccccagagatatcatt
    ccctggatgtctaagtcctttggtatctcactttcag
    cgtgttcaaaatcctcttacaactgttctttctcctt
    ttccatcttgattattggcaacatgccagcctttccc
    ctacccccagcagtgagccaagctagaaacaagggct
    taatcttcaatctttccttctccatccctaaacctaa
    tgagtctccaagcccttcccagtttacaccctaaatg
    ttgctcaaaacatcccctagttcttccacgtgctctc
    ctctatattgaaaggtcaagaaaggccatcttccctc
    cactgtgaggaaatagatcttgatactgcccctgagc
    tgggcagtcctcgacctgacaaactgtgcagtgtttc
    taaatctctactggcaaaatgagagtgcctttgacct
    gtgttgcgatctcagatcacagtggatgtaattgttt
    tataggaatggtgaacgaaaaagaagtaaatccctaa
    tgccaaactcctgatcattctatgtcatttaatagcc
    tgtcatttatgataaagtttcctctactggcattagc
    acaatacttctcaggaaaaaaaaatatgatgccagat
    actgaaaagctcctgggtaaacatgaacatgggtacc
    gataaaatggtgaagccagtccaatcttagagtgact
    tcccttcatgctacttcatgctctttttttttttttt
    ttttaagaaaaaccccttttttttttctcacaccagt
    cacagaggagaccgaggcttagcaaggttaaggtcac
    atgattagtaagtgctgggctgaaactcaaaaccatc
    tctgcttgtctcctaaccctgtgcacctctgactatt
    caacag
    ATCCTGTGTCAGGAGTTGGGATTCTTTGAAG
    gtaagggccttgaccaccgaattaaggtaatcttgct
    ctgtggcaggccttgttttcagtattttaagtacact
    ggctcaggtaatcctcacaacagccccaggaggaatg
    ttctattacctccactgtatagatgaggaacttgagg
    cacagaatggttgccaaggtcacacagctatattggg
    ggttcatacccagccatccaactctgtctgtactctc
    tgccactctgcacccccagctcctgatccacttcctg
    tttccatccctcgatttctgctgcactcaggggcccc
    tctccccctcggcctgtgagatctgcttcagtaggct
    tttctccctgactcctccatccctgtccttacaggca
    gctgcttctctccgggacacgaggggtccatacggac
    actctctactggctgggttgcgcctaactcgtgattc
    ctcctctgtttcag
    ATTCGGAGCCGGGTTGATGTCATCAGACACGTGGTAA
    AGAATGGTCTGCTCTGGGATGACTTGTACATAGGATT
    CCAAACCCGGCTTCAGCGGGATCCTGATATATACCAT
    CATCT
    gtaagtccgaaaatgcctgtcgtgtgtgccttaggct
    gctgcggaggaggccagggctatataagcagagtcag
    tgactgactgtgccctgcagtgttgatggccatggag
    attccaccgttagagcttttttctttgttaaccttga
    aggcaaatctggttaggaagataactttcaaagagtc
    accatctggacattcatgcccatgtgcttcaatcctg
    tatacaagcagtttagagtacagggaagggaaggaca
    ttatgaaagggagagggtgtgtttggatccagcagct
    ccatcctcagaatttatctgaagacactgcaaaatta
    ctaagaatcactatgacaagaatgaggatggggtgat
    atggcaaagttgtgatcctggaagaccttcatctccc
    atgttgcccaactctgaacatgaatttggtgaactag
    ttggttaaggggatgatcctccaagtttctccctggt
    tgagctccaaaaaccatgtaagtttctcatagcaaaa
    ccgtataggtccttagggctttagttggaatatttgt
    gctgaaatgctggaaagccccatttgccatttttgta
    tttgcaaaataatcatcaagaggggagaatgcattct
    ttcatgaccactgaccctctgaaaaggtcaggaattt
    agtctgaagtaggcaagcctcctaccccgcttctgcc
    atgagcttgcacgcacaggcctgtcttgacatttctt
    ctttatagatttctttttgaatatcttgaaattgctt
    taaaaatatttaaagaatgtagaattatataaaataa
    aaaggaaataaccccacacctcccacaaaaccctgtt
    tcctgcctttctccaCccactctccagggtaacactt
    ggtaacagcatagttgtatcaccccaggcctattttt
    gagcatatcagcatttcaagaaatgtattttttctca
    ataaaacatcccttatagttgaggaggggaggttatc
    attcctgggttttgttttttttttttttttaatgtaa
    tcctggtacatcggtaatttgcattttttattcatta
    atatctttggtatttctagtgttgggacacacaggtc
    aacctcagtttttgggtttttttttttgtctttttgt
    ctttctagggccacacctgcagcatatggacgttccc
    aagctaggagtctaatcagagctgtagccaccagcct
    acgtcatagccatagcaacgtcagatccaagccgtgt
    ctgtgacctacaagcacagctcatggcaacaccggat
    ccttaaccactgaacgaggccaggggatcgaacacac
    atcctcatggatcctagtcatgttcattaaccactga
    gtcatgatgggaactccaacttcaactattttaatgt
    ctgtaaaacattccatttggaaaccatttcatttgta
    aagcaaaatgaaaacattttgttcattttcaacagag
    ttcgtagctgacttctgttctggaaaaaaggaaatgg
    agcaaatttgagtgagaaagattcaaagataactttt
    cttttaaaaaaaattatatcttggaaacttctgggct
    attgattctgaagactatttttctatatactgttttg
    atagcaaagttcataaatgtgaaaggatcctgcgatg
    aatcttgggaagcagtcatagcccaatatatctttgt
    tgcttttaaaatgagatttagtttactaaatattttt
    ctgatcataaaaataacacagatctaccgcagaaaat
    ttggaaaaaaaaaaacttttaaattcaaaaaacagtt
    aaaccacaaatgatcccaccatccagagagcaatttg
    tactttggtgtctagttcatctttctttttctgttta
    caagcacatataccacaagcattttttcaaaaaatga
    aaatgggataatactatacatacgtctgtacacctgc
    atagttactgaacagtctttgatctaccctgtaagtt
    tctaacttttcattatttgaaatgatgttttggcaaa
    gaaatatgtaggtgtgtctcgcacactttcataatga
    tttcttaggataaatttcttaggataaattcataatg
    atttcttataataatccatactctgccaactgatctt
    cagggaagccaactcgccttctcagaaataacatata
    acccatttacttgccctctcaccaatactaggtccta
    atgtttttgtgtacagattctatatttttacatacaa
    gaattccttaaagcaaggcatgtcacagaaaaataga
    aggaagacacaattgtcatgtttaaggactgcattct
    gtaccaaaaatgctaagttaaatgaacatctgaaaca
    gtacagaaacgctatctttcagggaaagctgagtacc
    aggtactgaacagattttggcaaatacagcaggcatg
    gatgtttccaaaacatgtttttctactttatctctta
    cag
    GTTTTGGAATCATTTTCAAATAAAACTCCCCCTCACA
    CCACCTGACTGGAAGTCCTTCCTGATGTGCTCTGGGT
    AGAGAGGACCTGAGCTGTCCCAG
    gtaaagcatcctgcaggtctgggagacactcttattc
    tccagcccatcacactgtgtttggcatcagaattaag
    caggcactatgcctatcagaaaacctgacttttgggg
    gaatgaaagaagctaacattacaagaatgtctgtgtt
    taaaaataagtcaataagggagttcccatcgtggctc
    agtggtaacgaaccctactagtatccattgaggacac
    aggttcaatatctggcctcactcagtcggctaaggat
    ccagtgatgccgtgagctgcagtgtaggccacagacg
    tggctcagatctggtgctgctgtggctatggtgtagg
    ccggccccctgtaactccaattcgacccctaggctgg
    gaacctaaaaagaccccaaaaaagtcgctttaatgaa
    tagtgaatacatccagcccaaagtccacagactcttt
    ggtctggttgtggcaaacatacagccagttaacaaac
    aagacaaaaattatcctaggtggtcagtgggggttca
    gagctgaatcctgaacactggaaggaaaacagcaacc
    aaatccaaatactgtatggttttgcttatatgtagaa
    tctaaattcaaagcaaatgagcaaaccaattgaaaca
    gttatggaagacaagcaggtggttgtcaggggggaga
    taaggggaggcaggaaagacctgggcgagggagatta
    agaggtaccaactttcagttgcaaaacaaatgagtca
    ccagtatgaaatgtgcaatgtgggaaatacaggccat
    aactttataatctcttttttttttttgtcttttttgc
    cttttctaaggctgctcccgtggcatatggaggttcc
    caggctaggagtccaaacagagctgtagctgccagcc
    tacaccagagccacagcaacacgggaaccttaacccg
    ctgagcaaggccagggatcgaacccgagtcctcacag
    atgccagtagggttcattaaccactgagccacgacag
    gaattccagggtctgttgtgttcttaaaacacttcca
    ggagagtgagtggtatgtcataagtaaacaataaatg
    ttaaccacaacaagcttatgaaataaacaggaaagcc
    atatgacctacaatcagtcattgggagaatccacaaa
    aggttgagcagaggatcaattccagctcacactccag
    ttttagattctcccctgccttaaagcatcacagacta
    cataatctgagctgaagaataaaaattaaaactcacc
    ccagtgcaaaacagaaatgaaaaagtattaaaacgag
    gttcatactgttgttcattagcaatatcttttattca
    cag
    GGGTGCCCAACAACATGAAAAAATCAAGAATTTATTG
    CTGCTACGTCAAAGCTTATACCAGAGATTATGCCTTA
    TAGACATTAGCAATGGATAATTATATGTTGCACTTGT
    GAAATGTGCACATATCCTGTTTATGAATCACCACATA
    GCCAGATTATCAATATTTTACTTATTTCGTAAAAAAT
    CCACAATTTTCCATAACAGAATCAACGTGTGCAATAG
    GAACAAGATTGCTATGGAAAACGAGGGTAACAGGAGG
    AGATATTAATCCAAGCATAGAAGAAATAGACAAATGA
    GGGGCCATAAGGGGAATATAGGGAAGAGAAAAAAATT
    AAGATGGAATTTTAAAAGGAGAATGTAAAAAATAGAT
    ATTTGTTCCTTAATAGGTTGATTCCTCAAATAGAGCC
    CATGAATATAATCAAATAGGAAGGGTTCATGACTGTT
    TTCAATTTTTCAAAAAGCTTTGTTGAAATCATAGACT
    TGCAAAACAAGGCTGTAGAGGCCACCCTAAAATGGAA
    AATTTCACTGGGACTGAAATTATTTTGATTCAATGAC
    AAAATTTGTTATTACTGCGGATTATAAACTCTAACAA
    ATAGCGATCTCTTTGCTTCATAAAAACATAAACACTA
    GCTAGTAATAAAATGAGTTCTGCAG
  • SEQ ID NO. 50 represents contiguous genomic sequences containing Exon 12, Intron 12, Exon 13, Intron 13, Exon 14, Intron 14, Exon 15, Intron 15, Exon 16, Intron 16, Exon 17, Intron 17, and Exon 18. Nucleotide sequences that contain at least 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000 or 20,000 contiguous nucleotides of SEQ ID NO. 50 are provided, as well as nucleotide sequences at least 80, 85, 90, 95, 98, or 99% homologous to SEQ ID NO. 50.
  • VIII. Oligonucleotide Probes and Primers
  • The present invention further provides oligonucleotide probes and primers which hybridize to the hereinabove-described sequences (SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50). Oligonucleotides are provided that can be homologous to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50, and fragments thereof. Oligonucleotides that hybridize under stringent conditions to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50 and fragments thereof, are also provided. Stringent conditions describe conditions under which hybridization will occur only if there is at least about 85%, about 90%, about 95%, or at least about 98% homology between the sequences. Alternatively, the oligonucleotide can have at least 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 75 or 100 bases which hybridize to SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50, and fragments thereof. Such oligonucleotides can be used as primers and probes to detect the sequences provided herein. The probe or primer can be at least 14 nucleotides in length and, in a preferred embodiment, is at least 15, 20, 25, 28, 30, or 35 nucleotides in length.
  • Given the above sequences, one of ordinary skill in the art using standard algorithms can construct oligonucleotide probes and primers that are complementary to sequences contained in SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49 and 50, and fragments thereof. The rules for complementary pairing are well known: cytosine (“C”) always pairs with guanine (“G”) and thymine (“T”) or uracil (“U”) always pairs with adenine (“A”). It is recognized that it is not necessary for the primer or probe to be 100% complementary to the target nucleic acid sequence, as long as the primer or probe sufficiently hybridizes and can recognize the corresponding complementary sequence. A certain degree of pair mismatch can generally be tolerated.
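  • The pairing rules above can be sketched in Python as follows. This is a minimal illustration only; the names `reverse_complement` and `count_mismatches` are hypothetical, and the sketch assumes DNA sequences containing only A, C, G and T.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions at which `probe` fails to pair with an equal-length target site.

    Both arguments are written 5' to 3'. A perfectly complementary probe
    equals the reverse complement of the target site and yields 0.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)
```

A probe with a small, nonzero mismatch count corresponds to the case described above in which the probe is not 100% complementary but may still hybridize sufficiently.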
  • Oligonucleotide sequences used as the hybridizing region of a primer can also be used as the hybridizing region of a probe. Suitability of a primer sequence for use as a probe depends on the hybridization characteristics of the primer. Similarly, an oligonucleotide used as a probe can be used as a primer.
  • It will be apparent to those skilled in the art that, provided with these specific embodiments, specific primers and probes can be prepared by, for example, the addition of nucleotides to either the 5′ or 3′ ends, which nucleotides are complementary to the target sequence or are not complementary to the target sequence. So long as primer compositions serve as a point of initiation for extension on the target sequences, and so long as the primers and probes comprise at least 14 consecutive nucleotides contained within the above mentioned SEQ ID Nos., such compositions are within the scope of the invention.
  • The probes and primers herein can be selected by the following criteria, which are factors to be considered, but are not exclusive or determinative. The probes and primers are selected from the region of the CMP-Neu5Ac hydroxylase nucleic acid sequence identified in SEQ ID Nos. 1, 3, 5, 7, 9-45, 46, 47, 48, 49, 50, and fragments thereof. The probes and primers lack homology with sequences of other genes that would be expected to compromise the test. The probes or primers lack secondary structure formation in the amplified nucleic acid which can interfere with extension by the amplification enzyme such as E. coli DNA polymerase, preferably that portion of the DNA polymerase referred to as the Klenow fragment. This can be accomplished by employing up to about 15% by weight, preferably 5-10% by weight, dimethyl sulfoxide (DMSO) in the amplification medium and/or increasing the amplification temperatures to 30°-40° C.
  • Preferably, the probes or primers should contain approximately 50% guanine and cytosine nucleotides, as measured by the formula [cytosine (C)+guanine (G)]/[adenine (A)+thymine (T)+cytosine (C)+guanine (G)]. Preferably, the probe or primer does not contain multiple consecutive adenine and thymine residues at the 3′ end of the primer, which can result in less stable hybrids.
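  • The two composition criteria above can be checked with a short Python sketch. It takes GC content as (G+C) divided by the total number of bases and assumes the sequence contains only A, C, G and T; the function names are hypothetical, and no cut-off for an acceptable 3′ A/T run is implied by the specification.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C; about 0.5 is preferred above."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def three_prime_at_run(seq: str) -> int:
    """Length of the run of consecutive A/T residues at the 3' end of `seq`."""
    run = 0
    for base in reversed(seq.upper()):
        if base in "AT":
            run += 1
        else:
            break  # first G or C from the 3' end stops the run
    return run
```

A candidate primer could then be ranked by how close `gc_fraction` is to 0.5 and by how short its 3′ A/T run is.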
  • The probes and primers of the invention can be about 10 to 30 nucleotides long, preferably at least 10, 11, 12, 13, 14, 15, 20, 25, or 28 nucleotides in length, including specifically 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides. The nucleotides as used in the present invention can be ribonucleotides, deoxyribonucleotides and modified nucleotides such as inosine or nucleotides containing modified groups which do not essentially alter their hybridization characteristics. Probe and primer sequences are represented throughout the specification as single stranded DNA oligonucleotides from the 5′ to the 3′ end. Any of the probes can be used as such, or in their complementary form, or in their RNA form (wherein T is replaced by U).
  • The probes and primers according to the invention can be prepared by cloning of recombinant plasmids containing inserts including the corresponding nucleotide sequences, optionally by cleaving the latter out from the cloned plasmids upon using the adequate nucleases and recovering them, e.g. by fractionation according to molecular weight. The probes and primers according to the present invention can also be synthesized chemically, for instance by the conventional phosphotriester or phosphodiester methods or automated embodiments thereof. In one such automated embodiment diethylphosphoramidites are used as starting materials and can be synthesized as described by Beaucage, et al., Tetrahedron Letters 22:1859-1862 (1981). One method of synthesizing oligonucleotides on a modified solid support is described in U.S. Pat. No. 4,458,066. It is also possible to use a probe or primer which has been isolated from a biological source (such as a restriction endonuclease digest).
  • The oligonucleotides used as primers or probes can also comprise nucleotide analogues such as phosphorothioates (Matsukura S., Naibunpi Gakkai Zasshi. 43(6):527-32 (1967)), alkylphosphorothioates (Miller P., et al., Biochemistry 18(23):5134-43 (1979)), peptide nucleic acids (Nielsen P., et al., Science 254(5037):1497-500 (1991); Nielsen P., et al., Nucleic-Acids-Res. 21(2):197-200 (1993)), morpholino nucleic acids, locked nucleic acids, pseudocyclic oligonucleobases, 2′-O,4′-C-ethylene bridged nucleic acids or can contain intercalating agents (Asseline J., et al., Proc. Natl. Acad. Sci. USA 81(11):3297-301 (1984)).
  • For designing probes and primers with desired characteristics, the following useful guidelines known to the person skilled in the art can be applied. Because the extent and specificity of hybridization reactions are affected by a number of factors, manipulation of one or more of those factors will determine the exact sensitivity and specificity of a particular probe, whether perfectly complementary to its target or not. The importance and effect of various assay conditions, explained further herein, are known to those skilled in the art.
  • The stability of the probe and primer to target nucleic acid hybrid should be chosen to be compatible with the assay conditions. This can be accomplished by avoiding long AT-rich sequences, by terminating the hybrids with GC base pairs, and/or by designing the probe with an appropriate Tm. The beginning and end points of the probe should be chosen so that the length and % GC result in a Tm about 2-10° C. higher than the temperature at which the final assay will be performed. The base composition of the probe is significant because G-C base pairs exhibit greater thermal stability compared to A-T base pairs due to additional hydrogen bonding. Thus, hybridization involving complementary nucleic acids of higher G-C content will be stable at higher temperatures. Conditions such as ionic strength and incubation temperature under which the probe will be used should also be taken into account when designing a probe. It is known that hybridization will increase as the ionic strength of the reaction mixture increases, and that the thermal stability of the hybrids will increase with increasing ionic strength. Chemical reagents, such as formamide, urea, DMSO and alcohols, which disrupt hydrogen bonds, will increase the stringency of hybridization. Destabilization of the hydrogen bonds by such reagents can greatly reduce the Tm. In general, optimal hybridization for synthetic oligonucleotide probes of about 10-50 bases in length occurs approximately 5° C. below the melting temperature for a given duplex. Incubation at temperatures below the optimum can allow mismatched base sequences to hybridize and can therefore result in reduced specificity. It is desirable to have probes which hybridize only under conditions of high stringency. Under high stringency conditions only highly complementary nucleic acid hybrids will form; hybrids without a sufficient degree of complementarity will not form. 
Accordingly, the stringency of the assay conditions determines the amount of complementarity needed between two nucleic acid strands forming a hybrid. The degree of stringency is chosen such as to maximize the difference in stability between the hybrid formed with the target and the non-target nucleic acid. In the present case, single base pair changes need to be detected, which requires conditions of very high stringency.
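  • The relationship between base composition, Tm and assay temperature described above can be illustrated with a rough estimate. The specification does not state how Tm is to be calculated; the Python sketch below substitutes the Wallace rule, a common approximation for short oligonucleotides (about 2° C. per A or T and 4° C. per G or C), which ignores ionic strength and denaturants such as formamide or DMSO. The function names and default margins are assumptions made for illustration.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm estimate in degrees C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def tm_margin_ok(seq: str, assay_temp_c: float,
                 low: float = 2.0, high: float = 10.0) -> bool:
    """True if the estimated Tm exceeds the assay temperature by `low` to `high` degrees C."""
    margin = wallace_tm(seq) - assay_temp_c
    return low <= margin <= high
```

For example, a 14-mer with 8 A/T and 6 G/C bases has an estimated Tm of 40° C. under this rule, so it would satisfy the 2-10° C. margin for an assay run at 35° C. but not for one run at 20° C.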
  • The length of the target nucleic acid sequence and, accordingly, the length of the probe sequence can also be important. In some cases, there can be several sequences from a particular region, varying in location and length, which will yield probes and primers with the desired hybridization characteristics. In other cases, one sequence can be significantly better than another which differs merely by a single base.
  • While it is possible for nucleic acids that are not perfectly complementary to hybridize, the longest stretch of perfectly complementary base sequence will normally primarily determine hybrid stability. While oligonucleotide probes and primers of different lengths and base composition can be used, preferred oligonucleotide probes and primers of this invention are between about 14 and 30 bases in length and have a sufficient stretch in the sequence which is perfectly complementary to the target nucleic acid sequence.
  • Regions in the target DNA or RNA which are known to form strong internal structures inhibitory to hybridization are less preferred. Likewise, probes with extensive self-complementarity should be avoided. As explained above, hybridization is the association of two single strands of complementary nucleic acids to form a hydrogen bonded double strand. It is implicit that if one of the two strands is wholly or partially involved in a hybrid, it will be less able to participate in formation of a new hybrid. There can be intramolecular and intermolecular hybrids formed within the molecules of one type of probe if there is sufficient self complementarity. Such structures can be avoided through careful probe design. By designing a probe so that a substantial portion of the sequence of interest is single stranded, the rate and extent of hybridization can be greatly increased. Computer programs are available to search for this type of interaction. However, in certain instances, it may not be possible to avoid this type of interaction.
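  • A simple form of the self-complementarity search mentioned above can be sketched in Python. The sketch finds the longest stretch of a probe that is perfectly complementary to another stretch of the same sequence, which is a crude proxy for hairpin or probe-probe duplex potential. It is not the method of any particular program; the function name is hypothetical, and only A, C, G and T are assumed.

```python
def longest_self_complementary_stretch(seq: str) -> int:
    """Length of the longest substring of `seq` whose reverse complement also occurs in `seq`.

    Implemented as a longest-common-substring search between the sequence
    and its own reverse complement (dynamic programming, O(n^2) time).
    """
    seq = seq.upper()
    rev_comp = seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]
    best = 0
    prev = [0] * (len(rev_comp) + 1)
    for i in range(1, len(seq) + 1):
        cur = [0] * (len(rev_comp) + 1)
        for j in range(1, len(rev_comp) + 1):
            if seq[i - 1] == rev_comp[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the matching stretch
                best = max(best, cur[j])
        prev = cur
    return best
```

Candidate probes with long self-complementary stretches would be less preferred under the guidance above; the acceptable length is a design choice not fixed by the specification.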
  • Specific primers and sequence specific oligonucleotide probes can be used in a polymerase chain reaction that enables amplification and detection of CMP-Neu5Ac hydroxylase nucleic acid sequences.
  • IV. Genetic Targeting of the CMP-Neu5Ac Hydroxylase Gene
  • Gene targeting allows for the selective manipulation of animal cell genomes. Using this technique, a particular DNA sequence can be targeted and modified in a site-specific and precise manner. Different types of DNA sequences can be targeted for modification, including regulatory regions, coding regions and regions of DNA between genes. Examples of regulatory regions include: promoter regions, enhancer regions, terminator regions and introns. By modifying these regulatory regions, the timing and level of expression of a gene can be altered. Coding regions can be modified to alter, enhance or eliminate the protein within a cell. Introns and exons, as well as inter-genic regions, are suitable targets for modification.
  • Modifications of DNA sequences can be of several types, including insertions, deletions, substitutions, or any combination thereof. A specific example of a modification is the inactivation of a gene by site-specific integration of a nucleotide sequence that disrupts expression of the gene product, i.e. a “knock out”. For example, one approach to disrupting the CMP-Neu5Ac hydroxylase gene is to insert a selectable marker into the targeting DNA such that homologous recombination between the targeting DNA and the target DNA can result in insertion of the selectable marker into the coding region of the target gene. For example, see FIGS. 3, 12, and 13. In this way, for example, the CMP-Neu5Ac hydroxylase gene sequence is disrupted, rendering the encoded enzyme nonfunctional.
  • Homologous Recombination
  • Homologous recombination permits site-specific modifications in endogenous genes and thus novel alterations can be engineered into the genome. A primary step in homologous recombination is DNA strand exchange, which involves a pairing of a DNA duplex with at least one DNA strand containing a complementary sequence to form an intermediate recombination structure containing heteroduplex DNA (see, for example Radding, C. M. (1982) Ann. Rev. Genet. 16: 405; U.S. Pat. No. 4,888,274). The heteroduplex DNA can take several forms, including a three DNA strand containing triplex form wherein a single complementary strand invades the DNA duplex (Hsieh, et al., Genes and Development 4: 1951 (1990); Rao, et al., (1991) PNAS 88:2984)) and, when two complementary DNA strands pair with a DNA duplex, a classical Holliday recombination joint or chi structure (Holliday, R., Genet. Res. 5: 282 (1964)) can form, or a double-D loop (“Diagnostic Applications of Double-D Loop Formation” U.S. Ser. No. 07/755,462, filed Sep. 4, 1991). Once formed, a heteroduplex structure can be resolved by strand breakage and exchange, so that all or a portion of an invading DNA strand is spliced into a recipient DNA duplex, adding or replacing a segment of the recipient DNA duplex. Alternatively, a heteroduplex structure can result in gene conversion, wherein a sequence of an invading strand is transferred to a recipient DNA duplex by repair of mismatched bases using the invading strand as a template (Genes, 3rd Ed. (1987) Lewin, B., John Wiley, New York, N.Y.; Lopez, et al., Nucleic Acids Res. 15: 5643 (1987)). Whether by the mechanism of breakage and rejoining or by the mechanism(s) of gene conversion, formation of heteroduplex DNA at homologously paired joints can serve to transfer genetic sequence information from one DNA molecule to another.
  • The ability of homologous recombination (gene conversion and classical strand breakage/rejoining) to transfer genetic sequence information between DNA molecules renders targeted homologous recombination a powerful method in genetic engineering and gene manipulation.
  • In homologous recombination, the incoming DNA interacts with and integrates into a site in the genome that contains a substantially homologous DNA sequence. In non-homologous (“random” or “illicit”) integration, the incoming DNA is not found at a homologous sequence in the genome but integrates elsewhere, at one of a large number of potential locations. In general, studies with higher eukaryotic cells have revealed that the frequency of homologous recombination is far less than the frequency of random integration. The ratio of these frequencies has direct implications for “gene targeting” which depends on integration via homologous recombination (i.e. recombination between the exogenous “targeting DNA” and the corresponding “target DNA” in the genome).
  • A number of papers describe the use of homologous recombination in mammalian cells. Illustrative of these papers are Kucherlapati, et al., Proc. Natl. Acad. Sci. (USA) 81:3153-3157, 1984; Kucherlapati, et al., Mol. Cell. Bio. 5:714-720, 1985; Smithies, et al., Nature 317:230-234, 1985; Wake, et al., Mol. Cell. Bio. 8:2080-2089, 1985; Ayares, et al., Genetics 111:375-388, 1985; Ayares, et al., Mol. Cell. Bio. 7:1656-1662, 1986; Song, et al., Proc. Natl. Acad. Sci. USA 84:6820-6824, 1987; Thomas, et al. Cell 44:419-428, 1986; Thomas and Capecchi, Cell 51: 503-512, 1987; Nandi, et al., Proc. Natl. Acad. Sci. USA 85:3845-3849, 1988; and Mansour, et al., Nature 336:348-352, 1988; Evans and Kaufman, Nature 294:146-154, 1981; Doetschman, et al., Nature 330:576-578, 1987; Thomas and Capecchi, Cell 51:503-512, 1987; Thompson, et al., Cell 56:316-321, 1989.
  • The present invention uses homologous recombination to inactivate the porcine CMP-Neu5Ac hydroxylase gene in cells, such as fibroblasts. The DNA can comprise at least a portion of the gene(s) at the particular locus with introduction of an alteration into at least one, optionally both copies, of the native gene(s), so as to prevent expression of a functional enzyme and production of a Hanganutziu-Deicher antigen molecule. The alteration can be an insertion, deletion, replacement or combination thereof. When the alteration is introduced into only one copy of the gene being inactivated, the cells having a single unmutated copy of the target gene are amplified and can be subjected to a second targeting step, where the alteration can be the same or different from the first alteration, usually different, and where a deletion, or replacement is involved, can be overlapping at least a portion of the alteration originally introduced. In this second targeting step, a targeting vector with the same arms of homology, but containing a different mammalian selectable marker, can be used. The resulting transformants are screened for the absence of a functional target antigen and the DNA of the cell can be further screened to ensure the absence of a wild-type target gene. Alternatively, homozygosity as to a phenotype can be achieved by breeding hosts heterozygous for the mutation.
  • Porcine cells that can be genetically modified can be obtained from a variety of different organs and tissues such as, but not limited to, brain, heart, lungs, glands, brain, eye, stomach, spleen, pancreas, kidneys, liver, intestines, uterus, bladder, skin, hair, nails, ears, nose, mouth, lips, gums, teeth, tongue, salivary glands, tonsils, pharynx, esophagus, large intestine, small intestine, rectum, anus, pylorus, thyroid gland, thymus gland, suprarenal capsule, bones, cartilage, tendons, ligaments, skeletal muscles, smooth muscles, blood vessels, blood, spinal cord, trachea, ureters, urethra, hypothalamus, pituitary, adrenal glands, ovaries, oviducts, uterus, vagina, mammary glands, testes, seminal vesicles, penis, lymph, lymph nodes and lymph vessels. In one embodiment of the invention, porcine cells can be selected from the group consisting of, but not limited to, epithelial cells, fibroblast cells, neural cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T), macrophages, monocytes, mononuclear cells, cardiac muscle cells, other muscle cells, □hosphate cells, cumulus cells, epidermal cells, endothelial cells, Islets of Langerhans cells, blood cells, blood precursor cells, bone cells, bone precursor cells, neuronal stem cells, primordial stem cells, hepatocytes, keratinocytes, umbilical vein endothelial cells, aortic endothelial cells, microvascular endothelial cells, fibroblasts, liver stellate cells, aortic smooth muscle cells, cardiac myocytes, neurons, Kupffer cells, smooth muscle cells, Schwann cells, and epithelial cells, erythrocytes, platelets, neutrophils, lymphocytes, monocytes, eosinophils, basophils, adipocytes, chondrocytes, pancreatic islet cells, thyroid cells, parathyroid cells, parotid cells, tumor cells, glial cells, astrocytes, red blood cells, white blood cells, macrophages, epithelial cells, somatic cells, pituitary cells, adrenal cells, hair cells, bladder cells, kidney cells, retinal cells, rod 
cells, cone cells, heart cells, pacemaker cells, spleen cells, antigen presenting cells, memory cells, T cells, B cells, plasma cells, muscle cells, ovarian cells, uterine cells, prostate cells, vaginal epithelial cells, sperm cells, testicular cells, germ cells, egg cells, leydig cells, peritubular cells, sertoli cells, lutein cells, cervical cells, endometrial cells, mammary cells, follicle cells, mucous cells, ciliated cells, nonkeratinized epithelial cells, keratinized epithelial cells, lung cells, goblet cells, columnar epithelial cells, squamous epithelial cells, osteocytes, osteoblasts, and osteoclasts.
  • In one alternative embodiment, embryonic stem cells can be used. An embryonic stem cell line can be employed or embryonic stem cells can be obtained freshly from a host, such as a porcine animal. The cells can be grown on an appropriate fibroblast-feeder layer or grown in the presence of leukemia inhibiting factor (LIF). In a preferred embodiment, the porcine cells can be fibroblasts; in one specific embodiment, the porcine cells can be fetal fibroblasts. Fibroblast cells are a preferred somatic cell type because they can be obtained from developing fetuses and adult animals in large quantities.
  • These cells can be easily propagated in vitro with a rapid doubling time and can be clonally propagated for use in gene targeting procedures.
  • Targeting Vectors
  • Cells homozygous at a targeted locus can be produced by introducing DNA into the cells, where the DNA has homology to the target locus and includes a marker gene, allowing for selection of cells comprising the integrated construct. The homologous DNA in the target vector will recombine with the chromosomal DNA at the target locus (see, for example, FIGS. 3, 12, and 13). The marker gene can be flanked on both sides by homologous DNA sequences, a 3′ recombination arm and a 5′ recombination arm (See, for example, FIG. 11). Methods for the construction of targeting vectors have been described in the art, see, for example, Dai et al., Nature Biotechnology 20: 251-255, 2002; WO 00/51424.
  • Various constructs can be prepared for homologous recombination at a target locus. Usually, the construct can include at least 50 bp, 100 bp, 500 bp, 1 kbp, 2 kbp, 4 kbp, 5 kbp, 10 kbp, 15 kbp, 20 kbp, or 50 kbp of sequence homologous with the target locus. The sequence can include any contiguous sequence of the porcine CMP-Neu5Ac hydroxylase gene, including at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 550, 600, 700, 750, 800, 850, 900, 1000, 5000 or 10,000 contiguous nucleotides of SEQ ID Nos. 9-45, 46, 47, 48, 49, and 50, or any combination or fragment thereof. Fragments of SEQ ID Nos. 9-45, 46, 47, 48, 49 and 50 can include any contiguous nucleic acid or peptide sequence that includes at least about 10 bp, 15 bp, 17 bp, 20 bp, 50 bp, 100 bp, 500 bp, 1 kbp, 5 kbp or 10 kbp.
  • Various considerations can be involved in determining the extent of homology of target DNA sequences, such as, for example, the size of the target locus, availability of sequences, relative efficiency of double cross-over events at the target locus and the similarity of the target sequence with other sequences.
  • The targeting DNA can include a sequence in which the desired sequence modifications are flanked by DNA that is substantially isogenic with a corresponding target sequence in the genome to be modified. The substantially isogenic sequence can be at least about 95%, 97-98%, 99.0-99.5%, 99.6-99.9%, or 100% identical to the corresponding target sequence (except for the desired sequence modifications). The targeting DNA and the target DNA preferably can share stretches of DNA of at least about 75, 150 or 500 base pairs that are 100% identical. Accordingly, targeting DNA can be derived from cells closely related to the cell line being targeted; or the targeting DNA can be derived from cells of the same cell line or animal as the cells being targeted.
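  • As an illustration only, the identity thresholds described above can be checked computationally. The following minimal sketch assumes the homology arm and the target sequence have already been aligned to equal length without gaps; the function names and default thresholds (95% identity, a 75 bp identical stretch) are hypothetical choices taken from the ranges stated above, not part of the disclosed methods.

```python
def percent_identity(arm: str, target: str) -> float:
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(arm) != len(target) or not arm:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(arm.upper(), target.upper()))
    return 100.0 * matches / len(arm)


def longest_identical_stretch(arm: str, target: str) -> int:
    """Length of the longest run of consecutive matching positions."""
    best = run = 0
    for a, b in zip(arm.upper(), target.upper()):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


def is_substantially_isogenic(arm: str, target: str,
                              min_identity: float = 95.0,
                              min_stretch: int = 75) -> bool:
    """Hypothetical check: overall identity and a minimum perfectly identical stretch."""
    return (percent_identity(arm, target) >= min_identity
            and longest_identical_stretch(arm, target) >= min_stretch)


# Toy example: a 100 bp arm that differs from the target at a single position.
arm = "ACGT" * 25
target = arm[:80] + "C" + arm[81:]  # position 80 is "A" in the arm
```

  • In this toy case the arm is 99% identical to the target and shares an 80 bp perfectly identical stretch, so it would pass the hypothetical thresholds. Real sequences would first require an alignment step that handles insertions and deletions.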
  • The DNA constructs can be designed to modify the endogenous, target CMP-Neu5Ac hydroxylase. The homologous sequence for targeting the construct can have one or more deletions, insertions, substitutions or combinations thereof designed to disrupt the function of the resultant gene product. In one embodiment, the alteration can be the insertion of a selectable marker gene fused in reading frame with the upstream sequence of the target gene.
  • Suitable selectable marker genes include, but are not limited to: genes conferring the ability to grow on certain media substrates, such as the tk gene (thymidine kinase) or the hprt gene (hypoxanthine phosphoribosyltransferase), which confer the ability to grow on HAT medium (hypoxanthine, aminopterin and thymidine); and the bacterial gpt gene (guanine/xanthine phosphoribosyltransferase), which allows growth on MAX medium (mycophenolic acid, adenine, and xanthine). See, for example, Song, K-Y., et al. Proc. Nat'l Acad. Sci. U.S.A. 84:6820-6824 (1987); Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989), Chapter 16. Other examples of selectable markers include: genes conferring resistance to compounds such as antibiotics, genes conferring the ability to grow on selected substrates, and genes encoding proteins that produce detectable signals such as luminescence, for example green fluorescent protein and enhanced green fluorescent protein (eGFP). A wide variety of such markers are known and available, including, for example, antibiotic resistance genes such as the neomycin resistance gene (neo) (Southern, P., and P. Berg, J. Mol. Appl. Genet. 1:327-341 (1982)); and the hygromycin resistance gene (hyg) (Nucleic Acids Research 11:6895-6911 (1983), and Te Riele, H., et al., Nature 348:649-651 (1990)). Other selectable marker genes include: acetohydroxy acid synthase (AHAS), alkaline phosphatase (AP), beta galactosidase (LacZ), beta glucuronidase (GUS), chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP), red fluorescent protein (RFP), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), horseradish peroxidase (HRP), luciferase (Luc), nopaline synthase (NOS), octopine synthase (OCS), and derivatives thereof. Multiple selectable markers are available that confer resistance to ampicillin, bleomycin, chloramphenicol, gentamicin, hygromycin, kanamycin, lincomycin, methotrexate, phosphinothricin, puromycin, and tetracycline.
  • Methods for the incorporation of antibiotic resistance genes and negative selection factors will be familiar to those of ordinary skill in the art (see, e.g., WO 99/15650; U.S. Pat. No. 6,080,576; U.S. Pat. No. 6,136,566; Niwa, et al., J. Biochem. 113:343-349 (1993); and Yoshida, et al., Transgenic Research, 4:277-287 (1995)).
  • Additional selectable marker genes useful in this invention, for example, are described in U.S. Pat. Nos. 6,319,669; 6,316,181; 6,303,373; 6,291,177; 6,284,519; 6,284,496; 6,280,934; 6,274,354; 6,270,958; 6,268,201; 6,265,548; 6,261,760; 6,255,558; 6,255,071; 6,251,677; 6,251,602; 6,251,582; 6,251,384; 6,248,558; 6,248,550; 6,248,543; 6,232,107; 6,228,639; 6,225,082; 6,221,612; 6,218,185; 6,214,567; 6,214,563; 6,210,922; 6,210,910; 6,203,986; 6,197,928; 6,180,343; 6,172,188; 6,153,409; 6,150,176; 6,146,826; 6,140,132; 6,136,539; 6,136,538; 6,133,429; 6,130,313; 6,124,128; 6,110,711; 6,096,865; 6,096,717; 6,093,808; 6,090,919; 6,083,690; 6,077,707; 6,066,476; 6,060,247; 6,054,321; 6,037,133; 6,027,881; 6,025,192; 6,020,192; 6,013,447; 6,001,557; 5,994,077; 5,994,071; 5,993,778; 5,989,808; 5,985,577; 5,968,773; 5,968,738; 5,958,713; 5,952,236; 5,948,889; 5,948,681; 5,942,387; 5,932,435; 5,922,576; 5,919,445; and 5,914,233.
  • Combinations of selectable markers can also be used. For example, to target CMP-Neu5Ac hydroxylase, a neo gene (with or without its own promoter, as discussed above) can be cloned into a DNA sequence which is homologous to the CMP-Neu5Ac hydroxylase gene. To use a combination of markers, the HSV-tk gene can be cloned such that it is outside of the targeting DNA (another selectable marker could be placed on the opposite flank, if desired). After introducing the DNA construct into the cells to be targeted, the cells can be selected on the appropriate selective agents. In this particular example, those cells which are resistant to both G418 and ganciclovir are most likely to have arisen by homologous recombination in which the neo gene has been recombined into the CMP-Neu5Ac hydroxylase gene but the tk gene has been lost because it was located outside the region of the double crossover.
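  • The logic of this positive-negative selection can be sketched as follows. This is a simplified illustration, assuming that G418 resistance requires the neo gene and that ganciclovir sensitivity requires the HSV-tk gene; the class and function names are hypothetical. In practice some random integrants lose or inactivate tk, which is why surviving cells are only "most likely" to be homologous recombinants and must still be confirmed by molecular analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationEvent:
    name: str
    has_neo: bool  # positive marker, cloned inside the homologous DNA
    has_tk: bool   # negative marker, cloned outside the homologous DNA


def survives_double_selection(event: IntegrationEvent) -> bool:
    """A cell survives G418 only with neo, and survives ganciclovir only without tk."""
    return event.has_neo and not event.has_tk


EVENTS = [
    IntegrationEvent("homologous recombination (double crossover)", True, False),
    IntegrationEvent("random integration of the whole construct", True, True),
    IntegrationEvent("no integration", False, False),
]

survivors = [e.name for e in EVENTS if survives_double_selection(e)]
```

  • Under these assumptions only the double-crossover event survives both selective agents, because it retains neo while losing the flanking tk gene.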
  • Deletions can be at least about 50 bp, more usually at least about 100 bp, and generally not more than about 20 kbp, where the deletion can normally include at least a portion of the coding region including a portion of or one or more exons, a portion of or one or more introns, and may or may not include a portion of the flanking non-coding regions, particularly the 5′-non-coding region (transcriptional regulatory region). Thus, the homologous region can extend beyond the coding region into the 5′-non-coding region or alternatively into the 3′-non-coding region. Insertions generally do not exceed 10 kbp and usually do not exceed 5 kbp, generally being at least 50 bp, more usually at least 200 bp.
  • The region(s) of homology can include mutations, where mutations can further inactivate the target gene, in providing for a frame shift, or changing a key amino acid, or the mutation can correct a dysfunctional allele, etc. Usually, the mutation can be a subtle change, not exceeding about 5% of the homologous flanking sequences.
  • The construct can be prepared in accordance with methods known in the art, various fragments can be brought together, introduced into appropriate vectors, cloned, analyzed and then manipulated further until the desired construct has been achieved (see, for example FIGS. 5-11). Various modifications can be made to the sequence, to allow for restriction analysis, excision, identification of probes, etc. Silent mutations can be introduced, as desired. At various stages, restriction analysis, sequencing, amplification with the polymerase chain reaction, primer repair, in vitro mutagenesis, etc. can be employed.
  • The construct can be prepared using a bacterial vector, including a prokaryotic replication system, e.g. an origin recognizable by E. coli. At each stage the construct can be cloned and analyzed. A marker, the same as or different from the marker to be used for insertion, can be employed, which can be removed prior to introduction into the target cell. Once the vector containing the construct has been completed, it can be further manipulated, such as by deletion of the bacterial sequences, linearization, or introduction of a short deletion in the homologous sequence. After final manipulation, the construct can be introduced into the cell.
  • Techniques which can be used to allow the DNA construct entry into the host cell include calcium phosphate/DNA co-precipitation, microinjection of DNA into the nucleus, electroporation, bacterial protoplast fusion with intact cells, transfection, or any other technique known by one skilled in the art. The DNA can be single or double stranded, linear or circular, relaxed or supercoiled DNA. For various techniques for transfecting mammalian cells, see, for example, Keown et al., Methods in Enzymology Vol. 185, pp. 527-537 (1990).
  • The present invention further includes recombinant constructs comprising one or more of the sequences as broadly described above (for example in Tables 9-12). The constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation. The construct can also include regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available. The following vectors are provided by way of example: pBs, pQE-9 (Qiagen), phagescript, PsiX174, pBluescript SK, pBsKS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSv2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPv, pMSG, pSVL (Pharmacia). Also, any other plasmids and vectors can be used as long as they are replicable and viable in the host. Vectors known in the art and those commercially available (and variants or derivatives thereof) can in accordance with the invention be engineered to include one or more recombination sites for use in the methods of the invention. Such vectors can be obtained from, for example, Vector Laboratories Inc., Invitrogen, Promega, Novagen, NEB, Clontech, Boehringer Mannheim, Pharmacia, EpiCenter, OriGene Technologies Inc., Stratagene, PerkinElmer, Pharmingen, and Research Genetics. Other vectors of interest include eukaryotic expression vectors such as pFastBac, pFastBacHT, pFastBacDUAL, pSFV, and pTet-Splice (Invitrogen), pEUK-C1, pPUR, pMAM, pMAMneo, pBI101, pBI121, pDR2, pCMVEBNA, and pYACneo (Clontech), pSVK3, pSVL, pMSG, pCH110, and pKK232-8 (Pharmacia, Inc.), p3′SS, pXT1, pSG5, pPbac, pMbac, pMC1neo, and pOG44 (Stratagene, Inc.), and pYES2, pAC360, pBlueBacHis A, B, and C, pVL1392, pBlueBacIII, pCDM8, pcDNA1, pZeoSV, pcDNA3, pREP4, pCEP4, and pEBVHis (Invitrogen Corp.), and variants or derivatives thereof.
  • Other vectors suitable for use in the invention include pUC18, pUC19, pBlueScript, pSPORT, cosmids, phagemids, YACs (yeast artificial chromosomes), BACs (bacterial artificial chromosomes), P1 (Escherichia coli phage), pQE70, pQE60, pQE9 (Qiagen), pBS vectors, PhageScript vectors, BlueScript vectors, pNH8A, pNH16A, pNH18A, pNH46A (Stratagene), pcDNA3 (Invitrogen), pGEX, pTrxFus, pTrc99A, pET-5, pET-9, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia), pSPORT1, pSPORT2, pCMVSPORT2.0 and pSV-SPORT1 (Invitrogen) and variants or derivatives thereof. Viral vectors can also be used, such as lentiviral vectors (see, for example, WO 03/059923; Tiscornia et al. PNAS 100:1844-1848 (2003)).
  • Additional vectors of interest include pTrxFus, pThioHis, pLEX, pTrcHis, pTrcHis2, pRSET, pBlueBacHis2, pcDNA3.1/His, pcDNA3.1(−)/Myc-His, pSecTag, pEBVHis, pPIC9K, pPIC3.5K, pAO815, pPICZ, pPICZA, pPICZB, pPICZC, pGAPZA, pGAPZB, pGAPZC, pBlueBac4.5, pBlueBacHis2, pMelBac, pSinRep5, pSinHis, pIND, pIND(SP1), pVgRXR, pcDNA2.1, pYES2, pZErO1.1, pZErO-2.1, pCR-Blunt, pSE280, pSE380, pSE420, pVL1392, pVL1393, pCDM8, pcDNA1.1, pcDNA 1.1/Amp, pcDNA3.1, pcDNA3.1/Zeo, pSe, SV2, pRc/CMV2, pRc/RSV, pREP4, pREP7, pREP8, pREP9, pREP 10, pCEP4, pEBVHis, pCR3.1, pCR2.1, pCR3.1-Uni, and pCRBac from Invitrogen; λ ExCell, λ gt11, pTrc99A, pKK223-3, pGEX-1λ T, pGEX-2T, pGEX-2TK, pGEX-4T-1, pGEX-4T-2, pGEX-4T-3, pGEX-3X, pGEX-5X-1, pGEX-5X-2, pGEX-5X-3, pEZZ18, pRIT2T, pMC1871, pSVK3, pSVL, pMSG, pCH110, pKK232-8, pSL1180, pNEO, and pUC4K from Pharmacia; pSCREEN-1b(+), pT7Blue(R), pT7Blue-2, pCITE-4abc(+), pOCUS-2, pTAg, pET-32LIC, pET-30LIC, pBAC-2 cp LIC, pBACgus-2 cp LIC, pT7Blue-2 LIC, pT7Blue-2, λ SCREEN-1, λ BlueSTAR, pET-3abcd, pET-7abc, pET9abcd, pET11abcd, pET12abc, pET-14b, pET-15b, pET-16b, pET-17b-pET-17xb, pET-19b, pET-20b(+), pET-21 abcd(+), pET-22b(+), pET-23abcd(+), pET-24abcd(+), pET-25b(+), pET-26b(+), pET-27b(+), pET-28abc(+), pET-29abc(+), pET-30abc(+), pET-31b(+), pET-32abc(+), pET-33b(+), pBAC-1, pBACgus-1, pBAC4x-1, pBACgus4x-1, pBAC-3 cp, pBACgus-2 cp, pBACsurf-1, pig, Signal pig, pYX, Selecta Vecta-Neo, Selecta Vecta-Hyg, and Selecta Vecta-Gpt from Novagen; pLexA, pB42AD, pGBT9, pAS2-1, pGAD424, pACT2, pGAD GL, pGAD GH, pGAD10, pGilda, pEZM3, pEGFP, pEGFP-1, pEGFP-N, pEGFP-C, pEBFP, pGFPuv, pGFP, p6xHis-GFP, pSEAP2-Basic, pSEAP2-Contral, pSEAP2-Promoter, pSEAP2-Enhancer, pβgal-Basic, pβgal-Control, pβgal-Promoter, pβgal-Enhancer, pCMV, pTet-Off, pTet-On, pTK-Hyg, pRetro-Off, pRetro-On, pIRES1neo, pIRES1hyg, pLXSN, pLNCX, pLAPSN, pMAMneo, pMAMneo-CAT, pMAMneo-LUC, pPUR, pSV2neo, pYEX4T-1/2/3, pYEX-S1, pBacPAK-His, pBacPAK8/9, pAcUW31, BacPAK6, pTriplEx, 
λgt10, λgt11, pWE15, and λTriplEx from Clontech; Lambda ZAP II, pBK-CMV, pBK-RSV, pBluescript II KS +/−, pBluescript II SK +/−, pAD-GAL4, pBD-GAL4 Cam, pSurfscript, Lambda FIX II, Lambda DASH, Lambda EMBL3, Lambda EMBL4, SuperCos, pCR-Script Amp, pCR-Script Cam, pCR-Script Direct, pBS +/−, pBC KS +/−, pBC SK +/−, Phagescript, pCAL-n-EK, pCAL-n, pCAL-c, pCAL-kc, pET-3abcd, pET-11abcd, pSPUTK, pESP-1, pCMVLacI, pOPRSVI/MCS, pOPI3 CAT, pXT1, pSG5, pPbac, pMbac, pMC1neo, pMC1neo Poly A, pOG44, pOG45, pFRTβGAL, pNEOβGAL, pRS403, pRS404, pRS405, pRS406, pRS413, pRS414, pRS415, and pRS416 from Stratagene.
  • Additional vectors include, for example, pPC86, pDBLeu, pDBTrp, pPC97, p2.5, pGAD1-3, pGAD10, pACt, pACT2, pGADGL, pGADGH, pAS2-1, pGAD424, pGBT8, pGBT9, pGAD-GAL4, pLexA, pBD-GAL4, pHISi, pHISi-1, placZi, pB42AD, pDG202, pJK202, pJG4-5, pNLexA, pYESTrp and variants or derivatives thereof.
  • Also, any other plasmids and vectors known in the art can be used as long as they are replicable and viable in the host.
  • Selection of Homologously Recombined Cells
  • Cells that have been homologously recombined to knock-out expression of the porcine CMP-Neu5Ac hydroxylase gene can then be grown in appropriately-selected medium to identify cells providing the appropriate integration. Those cells which show the desired phenotype can then be further analyzed by restriction analysis, electrophoresis, Southern analysis, polymerase chain reaction, or another technique known in the art. By identifying fragments which show the appropriate insertion at the target gene site, cells can be identified in which homologous recombination has occurred to inactivate or otherwise modify the target gene.
  • The presence of the selectable marker gene inserted into the CMP-Neu5Ac hydroxylase gene establishes the integration of the target construct into the host genome. Those cells which show the desired phenotype can then be further analyzed by restriction analysis, electrophoresis, Southern analysis, polymerase chain reaction, monoclonal antibody assays, fluorescence-activated cell sorting (FACS), or any other techniques or methods known in the art to analyze the DNA in order to establish whether homologous or non-homologous recombination occurred. This can be determined by employing probes for the insert and then sequencing the 5′ and 3′ regions flanking the insert for the presence of the CMP-Neu5Ac hydroxylase gene extending beyond the flanking regions of the construct, or by identifying the presence of a deletion, when such a deletion is introduced. Primers can also be used, one of which is complementary to a sequence within the construct and the other of which is complementary to a sequence outside the construct and at the target locus. In this way, DNA duplexes having both of the primers present in the complementary chains can be obtained only if homologous recombination has occurred. By demonstrating the presence of the primer sequences or a product of the expected size, the occurrence of homologous recombination is supported.
  • The polymerase chain reaction used for screening homologous recombination events is described, for example, in Kim and Smithies, Nucleic Acids Res. 16:8887-8903, 1988; and Joyner, et al., Nature 338:153-156, 1989.
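  • The primer placement described above, with one primer inside the construct and one outside it at the target locus, can be illustrated with a simple in silico check. The sketch below assumes exact primer matches on a single strand, and the toy sequences and function names are hypothetical; it is not a substitute for the published screening protocols.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def amplicon_length(template: str, inside_primer: str, outside_primer: str):
    """Return the product length if both primer sites occur in the correct
    orientation on the template (forward primer upstream), otherwise None."""
    template = template.upper()
    start = template.find(inside_primer.upper())
    if start == -1:
        return None
    site = reverse_complement(outside_primer)  # reverse primer site on the top strand
    end = template.find(site, start + len(inside_primer))
    if end == -1:
        return None
    return end + len(site) - start


# Toy sequences (hypothetical): marker, one homology arm, and genomic flank
# lying beyond the arm at the target locus.
marker = "GGATCCGGATCC"
arm = "TTTTAAAACCCC"
flank = "ACGTTGCAAGCT"

targeted_locus = marker + arm + flank                      # homologous recombination
random_integrant = "CACACACA" + marker + arm + "GTGTGTGT"  # construct inserted elsewhere

fwd = marker[:8]                      # primer within the construct
rev = reverse_complement(flank[-8:])  # primer outside the construct, at the target locus
```

  • With these toy sequences a product is predicted only for the targeted locus, since the random integrant lacks the genomic flank that the outside primer recognizes.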
  • An alternative method for screening homologous recombination events includes utilizing monoclonal or polyclonal antibodies specific for porcine CMP-Neu5Ac Hydroxylase and/or Neu5Gc, as described in, for example, Malykh, et al., European Journal of Cell Biology 80, 48-58 (2001), Malykh, et al., Glycoconjugate J. 15, 885-893 (1998).
  • Further characterization of porcine cells lacking expression of functional CMP-Neu5Ac hydroxylase due to homologous recombination events includes, but is not limited to, Southern blot analysis, Northern blot analysis, specific lectin binding assays, and/or sequence analysis, or the use of anti-Neu5Gc or anti-CMP-Neu5Ac hydroxylase antibody assays as described, for example, in Y. Malykh, et al. Biochem J. 370: 601-607 (2003); Y. Malykh, et al. European Journal of Cell Biology 80: 48-58 (2001); Y. Malykh et al. Glycoconjugate J. 15: 885-893 (1998). See generally, for example, A. Sharma, et al. Transplantation 75(4): 430-436 (2003).
  • The cell lines obtained from the first round of targeting are likely to be heterozygous for the targeted allele. Homozygosity, in which both alleles are modified, can be achieved in a number of ways. One approach is to grow up a number of cells in which one copy has been modified and then to subject these cells to another round of targeting of the remaining porcine CMP-Neu5Ac hydroxylase allele using a different selectable marker. Alternatively, homozygotes can be obtained by breeding animals heterozygous for the modified allele, according to traditional Mendelian genetics. In some situations, it can be desirable to have two different modified alleles. This can be achieved by successive rounds of gene targeting or by breeding heterozygotes, each of which carries one of the desired modified alleles.
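  • The expected outcome of breeding heterozygotes under traditional Mendelian genetics can be worked through with a short calculation. In this sketch the allele symbols are hypothetical: "+" denotes the wild-type allele and "-" denotes the targeted allele, and independent assortment with equal gamete frequencies is assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Expected genotype frequencies from a cross, where each parent is given
    as a two-character string of single-character allele symbols."""
    offspring = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(offspring.values())
    return {genotype: Fraction(n, total) for genotype, n in offspring.items()}


# Heterozygote x heterozygote: "+" is wild type, "-" is the targeted allele.
ratios = cross("+-", "+-")

# Two different modified alleles ("a" and "b"), each carried by one heterozygote.
compound = cross("+a", "+b")
```

  • Under these assumptions one quarter of the offspring of two heterozygotes are expected to be homozygous for the targeted allele, and one quarter of the offspring of the second cross are expected to carry both modified alleles.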
  • VIII. Genetic Manipulation of Additional Genes to Overcome Immunologic Barriers of Xenotransplantation
  • In one aspect of the invention, cells homozygous for the nonfunctional CMP-Neu5Ac hydroxylase gene can be subject to further genetic modification. For example, one can introduce additional genetic capability into the homozygotic hosts, where the endogenous CMP-Neu5Ac hydroxylase alleles have been made nonfunctional, to substitute, replace or provide different genetic capability to the host. One can remove the marker gene after homogenization. By introducing a construct comprising substantially the same homologous DNA, possibly with extended sequences, having the marker gene portion of the original construct deleted, one can obtain homologous recombination with the target locus. By using a combination of marker genes for integration, one providing positive selection and the other negative selection, one can select against the cells retaining the marker genes in the removal step.
  • In one embodiment, porcine cells are provided that lack the CMP-Neu5Ac hydroxylase gene and the α(1,3)GT gene. Animals lacking functional CMP-Neu5Ac hydroxylase can be produced according to the present invention, and then cells from this animal can be used to knock out the α(1,3)GT gene. Homozygous α(1,3)GT-negative pigs have recently been reported (Phelps et al., Science 2003; WO 04/028243). Alternatively, cells from these α(1,3)GT knockout animals can be used and further modified to inactivate the CMP-Neu5Ac hydroxylase gene.
  • In another embodiment, porcine cells are also provided that lack the porcine CMP-Neu5Ac hydroxylase gene and produce human complement inhibiting proteins. Animals lacking functional porcine CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be further modified to express human complement inhibiting proteins, such as, but not limited to, CD59 (cDNA reported by Philbrick, W. M., et al. (1990) Eur. J. Immunol. 20:87-92), human decay accelerating factor (DAF) (cDNA reported by Medof, et al. (1987) Proc. Natl. Acad. Sci. USA 84: 2007), and human membrane cofactor protein (MCP) (cDNA reported by Lublin, D., et al. (1988) J. Exp. Med. 168: 181-194).
  • In an alternative embodiment, cells from transgenic pigs producing human complement inhibiting proteins can be used and further modified to inactivate the porcine CMP-Neu5Ac hydroxylase gene. Transgenic pigs producing human complement inhibiting proteins are known in the art (see, for example, U.S. Pat. No. 6,166,288).
  • In a further embodiment, porcine cells are provided that lack the porcine CMP-Neu5Ac hydroxylase gene and the porcine Forssman synthetase (FSM) gene. Animals lacking functional porcine CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be further modified to knockout the porcine FSM synthetase gene, which is involved in the production of gal-α-gal epitopes, and plays a role in xenotransplant rejection. The porcine FSM synthetase gene has recently been identified (see U.S. Application 60/568,922). Alternatively, cells from these FSM synthetase gene knockout animals can be used and further modified to inactivate the porcine CMP-Neu5Ac hydroxylase gene.
  • In a still further embodiment, porcine cells are provided that lack the porcine CMP-Neu5Ac hydroxylase gene and the porcine isogloboside 3 synthase gene. Animals lacking functional porcine CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be used to knockout the porcine iGb3 synthase gene. The porcine iGb3 synthase gene has recently been reported (U.S. Application No. 60/517,524). Alternatively, cells from these porcine iGb3 synthase gene knockout animals can be used and further modified to inactivate the porcine CMP-Neu5Ac hydroxylase gene.
  • In another embodiment, porcine cells are provided that lack the porcine CMP-Neu5Ac hydroxylase gene, the α(1,3)GT gene, the FSM synthetase gene, and the porcine iGb3 synthase gene. Animals lacking functional CMP-Neu5Ac hydroxylase gene can be produced according to the present invention, and then cells from this animal can be used to knock out the α(1,3)GT gene, the FSM synthetase gene, and the porcine iGb3 synthase gene. Homozygous α(1,3)GT-negative pigs have recently been reported (Phelps et al., supra, Science 2003; WO 04/028243). Alternatively, cells from these α(1,3)GT knockout animals can be used and further modified to inactivate the porcine iGb3 synthase gene, the porcine FSM synthetase gene, and the CMP-Neu5Ac hydroxylase gene, and, in addition, to express human complement inhibiting proteins, such as, but not limited to, CD59, human decay accelerating factor (DAF), and human membrane cofactor protein (MCP).
  • VIII. Production of Genetically Modified Animals
  • The present invention provides methods of producing a transgenic pig that lacks expression of CMP-Neu5Ac hydroxylase through the genetic modification of porcine totipotent embryonic cells. In one embodiment, the animals can be produced by: (a) identifying one or more target CMP-Neu5Ac hydroxylase nucleic acid genomic sequences in an animal; (b) preparing one or more homologous recombination vectors targeting the CMP-Neu5Ac hydroxylase nucleic acid genomic sequences; (c) inserting the one or more targeting vectors into the genomes of a plurality of totipotent cells of the animal, thereby producing a plurality of transgenic totipotent cells; (d) obtaining a tetraploid blastocyst of the animal; (e) inserting the plurality of totipotent cells into the tetraploid blastocyst, thereby producing a transgenic embryo; (f) transferring the embryo to a recipient female animal; and (g) allowing the embryo to develop to term in the female animal. The method of transgenic animal production described here by which to generate a transgenic pig is further generally described in U.S. Pat. No. 6,492,575.
  • In another embodiment, the totipotent cells can be embryonic stem (ES) cells. The isolation of ES cells from blastocysts, the establishing of ES cell lines and their subsequent cultivation are carried out by conventional methods as described, for example, by Doetschman et al., J. Embryol. Exp. Morph. 87:27-45 (1985); Li et al., Cell 69:915-926 (1992); Robertson, E. J. “Teratocarcinomas and Embryonic Stem Cells: A Practical Approach,” ed. E. J. Robertson, IRL Press, Oxford, England (1987); Wurst and Joyner, “Gene Targeting: A Practical Approach,” ed. A. L. Joyner, IRL Press, Oxford, England (1993); Hogan et al., “Manipulating the Mouse Embryo: A Laboratory Manual,” eds. Hogan, Beddington, Costantini and Lacy, Cold Spring Harbor Laboratory Press, New York (1994); and Wang, et al., Nature 336:741-744 (1992). For example, after transforming embryonic stem cells with the targeting vector to alter the CMP-Neu5Ac hydroxylase gene, the cells can be plated onto a feeder layer in an appropriate medium, for example, DMEM supplemented with fetal bovine serum. Cells containing the construct can be detected by employing a selective medium, and after sufficient time for colonies to grow, colonies can be picked and analyzed for the occurrence of homologous recombination. Polymerase chain reaction can be used, with one primer within the construct sequence and one outside the construct sequence but at the target locus. Those colonies which show homologous recombination can then be used for embryo manipulation and blastocyst injection. Blastocysts can be obtained from superovulated females. The embryonic stem cells can then be trypsinized and the modified cells added to a droplet containing the blastocysts. At least one of the modified embryonic stem cells can be injected into the blastocoel of the blastocyst. After injection, at least one of the blastocysts can be returned to each uterine horn of pseudopregnant females. Females are then allowed to go to term and the resulting litters screened for mutant cells having the construct. The blastocysts are selected for different parentage from the transformed ES cells. By providing for a different phenotype of the blastocyst and the ES cells, chimeric progeny can be readily detected, and then genotyping can be conducted to probe for the presence of the modified CMP-Neu5Ac hydroxylase gene.
  • In a further embodiment of the invention, the totipotent cells can be embryonic germ (EG) cells. Embryonic germ cells are undifferentiated cells functionally equivalent to ES cells; that is, they can be cultured and transfected in vitro, and can then contribute to the somatic and germ cell lineages of a chimera (Stewart et al., Dev. Biol. 161:626-628 (1994)). EG cells are derived by culture of primordial germ cells, the progenitors of the gametes, with a combination of growth factors: leukemia inhibitory factor, steel factor and basic fibroblast growth factor (Matsui, et al., Cell 70:841-847 (1992); Resnick, et al., Nature 359:550-551 (1992)). The cultivation of EG cells can be carried out using methods known to one skilled in the art, such as described in Donovan et al., “Transgenic Animals, Generation and Use,” Ed. L. M. Houdebine, Harwood Academic Publishers (1997).
  • Tetraploid blastocysts for use in the invention can be obtained by natural zygote production and development, or by known methods such as electrofusion of two-cell embryos followed by culture, as described, for example, by James, et al., Genet. Res. Camb. 60:185-194 (1992); Nagy and Rossant, “Gene Targeting: A Practical Approach,” ed. A. L. Joyner, IRL Press, Oxford, England (1993); or by Kubiak and Tarkowski, Exp. Cell Res. 157:561-566 (1985).
  • The introduction of the ES cells or EG cells into the blastocysts can be carried out by any method known in the art, for example, as described by Wang, et al., EMBO J. 10:2437-2450 (1991).
  • A “plurality” of totipotent cells can encompass any number of cells greater than one. For example, the number of totipotent cells for use in the present invention can be about 2 to about 30 cells, about 5 to about 20 cells, or about 5 to about 10 cells. In one embodiment, about 5-10 ES cells taken from a single cell suspension are injected into a blastocyst immobilized by a holding pipette in a micromanipulation apparatus. Then the embryos are incubated for at least 3 hours, possibly overnight, prior to introduction into a female recipient animal via methods known in the art (see for example Robertson, E. J. “Teratocarcinomas and Embryonic Stem Cells: A Practical Approach” IRL Press, Oxford, England (1987)). The embryo can then be allowed to develop to term in the female animal.
  • Somatic Cell Nuclear Transfer to Produce Cloned, Transgenic Offspring
  • The present invention provides a method for cloning a pig lacking a functional CMP-Neu5Ac hydroxylase gene via somatic cell nuclear transfer. In general, a wide variety of methods to accomplish mammalian cloning are currently being rapidly developed and reported; any method that accomplishes the desired result can be used in the present invention. Nonlimiting examples of such methods are described below. For example, the pig can be produced by a nuclear transfer process comprising the following steps: obtaining desired differentiated pig cells to be used as a source of donor nuclei; obtaining oocytes from a pig; enucleating the oocytes; transferring the desired differentiated cell or cell nucleus into the enucleated oocyte, e.g., by fusion or injection, to form NT units; activating the resultant NT unit; and transferring said cultured NT unit to a host pig such that the NT unit develops into a fetus.
  • Nuclear transfer techniques or nuclear transplantation techniques are known in the art (Campbell et al., Theriogenology, 43:181 (1995); Collas, et al., Mol. Reprod. Dev., 38:264-267 (1994); Keefer et al., Biol. Reprod., 50:935-939 (1994); Sims, et al., Proc. Natl. Acad. Sci., USA, 90:6143-6147 (1993); WO 94/26884; WO 94/24274, and WO 90/03432, U.S. Pat. Nos. 4,994,384 and 5,057,420). In one nonlimiting example, methods are provided such as those described in U.S. Patent Publication No. 2003/0046722 to Collas, et al., which describes methods for cloning mammals that allow the donor chromosomes or donor cells to be reprogrammed prior to insertion into an enucleated oocyte. That publication also describes methods of inserting or fusing chromosomes, nuclei or cells with oocytes.
  • A donor cell nucleus, which has been modified to alter the CMP-Neu5Ac hydroxylase gene, is transferred to a recipient porcine oocyte. The use of this method is not restricted to a particular donor cell type. The donor cell can be as described in Wilmut, et al., Nature 385:810 (1997); Campbell, et al., Nature 380:64-66 (1996); or Cibelli, et al., Science 280:1256-1258 (1998). All cells of normal karyotype, including embryonic, fetal and adult somatic cells, which can be used successfully in nuclear transfer can in principle be employed. Fetal fibroblasts are a particularly useful class of donor cells. Generally suitable methods of nuclear transfer are described in Campbell, et al., Theriogenology 43:181 (1995), Collas, et al., Mol. Reprod. Dev. 38:264-267 (1994), Keefer, et al., Biol. Reprod. 50:935-939 (1994), Sims, et al., Proc. Nat'l. Acad. Sci. USA 90:6143-6147 (1993), WO-A-9426884, WO-A-9424274, WO-A-9807841, WO-A-9003432, U.S. Pat. No. 4,994,384 and U.S. Pat. No. 5,057,420. Differentiated or at least partially differentiated donor cells can also be used. Donor cells can also be, but do not have to be, in culture and can be quiescent. Nuclear donor cells which are quiescent are cells which can be induced to enter quiescence or exist in a quiescent state in vivo. Prior art methods have also used embryonic cell types in cloning procedures (Campbell, et al., Nature 380:64-66 (1996) and Stice, et al., Biol. Reprod. 54:100-110 (1996)).
  • Somatic nuclear donor cells may be obtained from a variety of different organs and tissues such as, but not limited to, skin, mesenchyme, lung, pancreas, heart, intestine, stomach, bladder, blood vessels, kidney, urethra, reproductive organs, and a disaggregated preparation of a whole or part of an embryo, fetus or adult animal. In a suitable embodiment of the invention, nuclear donor cells are selected from the group consisting of epithelial cells, fibroblast cells, neural cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T), macrophages, monocytes, mononuclear cells, cardiac muscle cells, other muscle cells, granulosa cells, cumulus cells, epidermal cells or endothelial cells. In another embodiment, the nuclear donor cell is an embryonic stem cell. In a preferred embodiment, fibroblast cells can be used as donor cells.
  • In another embodiment of the invention, the nuclear donor cells of the invention are germ cells of an animal. Any germ cell of an animal species in the embryonic, fetal, or adult stage may be used as a nuclear donor cell. In a suitable embodiment, the nuclear donor cell is an embryonic germ cell.
  • Nuclear donor cells may be arrested in any phase of the cell cycle (G0, G1, G2, S, M) so as to ensure coordination with the acceptor cell. Any method known in the art may be used to manipulate the cell cycle phase. Methods to control the cell cycle phase include, but are not limited to, G0 quiescence induced by contact inhibition of cultured cells, G0 quiescence induced by removal of serum or other essential nutrient, G0 quiescence induced by senescence, G0 quiescence induced by addition of a specific growth factor; G0 or G1 quiescence induced by physical or chemical means such as heat shock, hyperbaric pressure or other treatment with a chemical, hormone, growth factor or other substance; S-phase control via treatment with a chemical agent which interferes with any point of the replication procedure; M-phase control via selection using fluorescence activated cell sorting, mitotic shake off, treatment with microtubule disrupting agents or any chemical which disrupts progression in mitosis (see also Freshney, R. I., “Culture of Animal Cells: A Manual of Basic Technique,” Alan R. Liss, Inc, New York (1983)).
  • Methods for isolation of oocytes are well known in the art. Essentially, this can comprise isolating oocytes from the ovaries or reproductive tract of a pig. A readily available source of pig oocytes is slaughterhouse materials. For the combination of techniques such as genetic engineering, nuclear transfer and cloning, oocytes must generally be matured in vitro before these cells can be used as recipient cells for nuclear transfer, and before they can be fertilized by the sperm cell to develop into an embryo. This process generally requires collecting immature (prophase I) oocytes from mammalian ovaries, e.g., bovine ovaries obtained at a slaughterhouse, and maturing the oocytes in a maturation medium prior to fertilization or enucleation until the oocyte attains the metaphase II stage, which in the case of bovine oocytes generally occurs about 18-24 hours post-aspiration. This period of time is known as the “maturation period”.
  • A metaphase II stage oocyte can be the recipient oocyte; at this stage it is believed that the oocyte can be or is sufficiently “activated” to treat the introduced nucleus as it does a fertilizing sperm. Metaphase II stage oocytes that have been matured in vivo have been successfully used in nuclear transfer techniques. Essentially, mature metaphase II oocytes can be collected surgically from either non-superovulated or superovulated porcine 35 to 48, or 39-41, hours past the onset of estrus or past the injection of human chorionic gonadotropin (hCG) or similar hormone.
  • After a fixed time maturation period, which ranges from about 10 to 40 hours, and preferably about 16-18 hours, the oocytes can be enucleated. Prior to enucleation the oocytes can be removed and placed in appropriate medium, such as HECM containing 1 milligram per milliliter of hyaluronidase prior to removal of cumulus cells. The stripped oocytes can then be screened for polar bodies, and the selected metaphase II oocytes, as determined by the presence of polar bodies, are then used for nuclear transfer. Enucleation follows.
  • Enucleation can be performed by known methods, such as described in U.S. Pat. No. 4,994,384. For example, metaphase II oocytes can be placed in either HECM, optionally containing 7.5 micrograms per milliliter cytochalasin B, for immediate enucleation, or can be placed in a suitable medium, for example an embryo culture medium such as CR1aa, plus 10% estrus cow serum, and then enucleated later, preferably not more than 24 hours later, and more preferably 16-18 hours later.
  • Enucleation can be accomplished microsurgically using a micropipette to remove the polar body and the adjacent cytoplasm. The oocytes can then be screened to identify those which have been successfully enucleated. One way to screen the oocytes is to stain the oocytes with 1 microgram per milliliter Hoechst 33342 dye in HECM, and then view the oocytes under ultraviolet irradiation for less than 10 seconds. The oocytes that have been successfully enucleated can then be placed in a suitable culture medium, for example, CR1aa plus 10% serum.
  • A single mammalian cell of the same species as the enucleated oocyte can then be transferred into the perivitelline space of the enucleated oocyte used to produce the NT unit. The mammalian cell and the enucleated oocyte can be used to produce NT units according to methods known in the art. For example, the cells can be fused by electrofusion. Electrofusion is accomplished by providing a pulse of electricity that is sufficient to cause a transient breakdown of the plasma membrane. This breakdown of the plasma membrane is very short because the membrane reforms rapidly. Thus, if two adjacent membranes are induced to break down and upon reformation the lipid bilayers intermingle, small channels can open between the two cells. Due to the thermodynamic instability of such a small opening, it enlarges until the two cells become one. See, for example, U.S. Pat. No. 4,994,384 by Prather et al. A variety of electrofusion media can be used including, for example, sucrose, mannitol, sorbitol and phosphate buffered solution. Fusion can also be accomplished using Sendai virus as a fusogenic agent (Graham, Wistar Inst. Symp. Monogr., 9, 19, 1969). Also, the nucleus can be injected directly into the oocyte rather than using electroporation fusion. See, for example, Collas and Barnes, Mol. Reprod. Dev., 38:264-267 (1994). After fusion, the resultant fused NT units are then placed in a suitable medium until activation, for example, CR1aa medium. Typically activation can be effected shortly thereafter, for example less than 24 hours later, or about 4-9 hours later.
  • The NT unit can be activated by any method that accomplishes the desired result. Such methods include, for example, culturing the NT unit at sub-physiological temperature, in essence by applying a cold, or actually cool temperature shock to the NT unit. This can be most conveniently done by culturing the NT unit at room temperature, which is cold relative to the physiological temperature conditions to which embryos are normally exposed. Alternatively, activation can be achieved by application of known activation agents. For example, penetration of oocytes by sperm during fertilization has been shown to activate prefusion oocytes to yield greater numbers of viable pregnancies and multiple genetically identical pigs after nuclear transfer. Also, treatments such as electrical and chemical shock can be used to activate NT embryos after fusion. See, for example, U.S. Pat. No. 5,496,720, to Susko-Parrish, et al. Additionally, activation can be effected by simultaneously or sequentially increasing levels of divalent cations in the oocyte and reducing phosphorylation of cellular proteins in the oocyte. This can generally be effected by introducing divalent cations into the oocyte cytoplasm, e.g., magnesium, strontium, barium or calcium, e.g., in the form of an ionophore. Other methods of increasing divalent cation levels include the use of electric shock, treatment with ethanol and treatment with caged chelators. Phosphorylation can be reduced by known methods, for example, by the addition of kinase inhibitors, e.g., serine-threonine kinase inhibitors, such as 6-dimethylaminopurine, staurosporine, 2-aminopurine, and sphingosine. Alternatively, phosphorylation of cellular proteins can be inhibited by introduction of a phosphatase into the oocyte, e.g., phosphatase 2A and phosphatase 2B.
  • The activated NT units can then be cultured in a suitable in vitro culture medium until the generation of cell colonies. Culture media suitable for culturing and maturation of embryos are well known in the art. Examples of known media, which can be used for embryo culture and maintenance, include Ham's F-10+10% fetal calf serum (FCS), Tissue Culture Medium-199 (TCM-199)+10% fetal calf serum, Tyrodes-Albumin-Lactate-Pyruvate (TALP), Dulbecco's Phosphate Buffered Saline (PBS), Eagle's and Whitten's media.
  • Afterward, the cultured NT unit or units can be washed and then placed in a suitable medium contained in well plates which preferably contain a suitable confluent feeder layer. Suitable feeder layers include, by way of example, fibroblasts and epithelial cells. The NT units are cultured on the feeder layer until the NT units reach a size suitable for transferring to a recipient female, or for obtaining cells which can be used to produce cell colonies. Preferably, these NT units can be cultured until they reach at least about 2 to 400 cells, more preferably about 4 to 128 cells, and most preferably at least about 50 cells.
  • Activated NT units can then be transferred (embryo transfers) to the oviduct of a female pig. In one embodiment, the female pig can be an estrus-synchronized recipient gilt. Crossbred gilts (large white/Duroc/Landrace) (280-400 lbs) can be used. The gilts can be synchronized as recipient animals by oral administration of 18-20 mg Regu-Mate (Altrenogest, Hoechst, Warren, N.J.) mixed into the feed. Regu-Mate can be fed for 14 consecutive days. One thousand units of Human Chorionic Gonadotropin (hCG, Intervet America, Millsboro, Del.) can then be administered i.m. about 105 h after the last Regu-Mate treatment. Embryo transfers can then be performed about 22-26 h after the hCG injection. In one embodiment, the pregnancy can be brought to term and result in the birth of live offspring. In another embodiment, the pregnancy can be terminated early and embryonic cells can be harvested.
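  • The recipient synchronization timing described above can be laid out as a simple calculation. The following Python sketch is illustrative only; it assumes one Regu-Mate feeding per day at the same clock time, and the function name `recipient_schedule` and the example start date are hypothetical.

```python
from datetime import datetime, timedelta

def recipient_schedule(first_feed: datetime) -> dict:
    """Approximate timeline for synchronizing a recipient gilt.

    Assumption: Regu-Mate is fed once daily at the same time, so the
    14th (last) feeding falls 13 days after the first.
    """
    last_feed = first_feed + timedelta(days=13)    # 14 consecutive daily feedings
    hcg = last_feed + timedelta(hours=105)         # hCG about 105 h after last Regu-Mate
    return {
        "last_regu_mate": last_feed,
        "hcg_injection": hcg,
        # Embryo transfer about 22-26 h after the hCG injection
        "transfer_window": (hcg + timedelta(hours=22), hcg + timedelta(hours=26)),
    }

if __name__ == "__main__":
    schedule = recipient_schedule(datetime(2024, 1, 1, 8, 0))  # hypothetical start
    for step, when in schedule.items():
        print(step, when)
```

  • Because every interval in the text is stated as "about", the computed times are nominal targets rather than exact deadlines.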
  • The methods for embryo transfer and recipient animal management in the present invention are standard procedures used in the embryo transfer industry. Synchronous transfers are important for success of the present invention, i.e., the stage of the NT embryo is in synchrony with the estrus cycle of the recipient female. See, for example, Seidel, G. E., Jr. “Critical review of embryo transfer procedures with cattle” in Fertilization and Embryonic Development in Vitro (1981) L. Mastroianni, Jr. and J. D. Biggers, ed., Plenum Press, New York, N.Y., page 323.
  • VIII. Porcine Animals, Organs, Tissues, Cells and Cell Lines
  • The present invention provides viable porcine in which both alleles of the CMP-Neu5Ac hydroxylase gene have been inactivated. The invention also provides organs, tissues, and cells derived from such porcine, which are useful for xenotransplantation.
  • In one embodiment, the invention provides porcine organs, tissues and/or purified or substantially pure cells or cell lines obtained from pigs that lack any expression of functional CMP-Neu5Ac hydroxylase.
  • In one embodiment, the invention provides organs that are useful for xenotransplantation. Any porcine organ can be used, including, but not limited to: brain, heart, lungs, glands, brain, eye, stomach, spleen, pancreas, kidneys, liver, intestines, uterus, bladder, skin, hair, nails, ears, nose, mouth, lips, gums, teeth, tongue, salivary glands, tonsils, pharynx, esophagus, large intestine, small intestine, rectum, anus, pylorus, thyroid gland, thymus gland, suprarenal capsule, bones, cartilage, tendons, ligaments, skeletal muscles, smooth muscles, blood vessels, blood, spinal cord, trachea, ureters, urethra, hypothalamus, pituitary, adrenal glands, ovaries, oviducts, uterus, vagina, mammary glands, testes, seminal vesicles, penis, lymph, lymph nodes and lymph vessels.
  • In another embodiment, the invention provides tissues that are useful for xenotransplantation. Any porcine tissue can be used, including, but not limited to: epithelium, connective tissue, blood, bone, cartilage, muscle, nerve, adenoid, adipose, areolar, bone, brown adipose, cancellous, muscle, cartilaginous, cavernous, chondroid, chromaffin, dartoic, elastic, epithelial, fatty, fibrohyaline, fibrous, Gamgee, gelatinous, granulation, gut-associated lymphoid, Haller's vascular, hard, hemopoietic, indifferent, interstitial, investing, islet, lymphatic, lymphoid, mesenchymal, mesonephric, mucous connective, multilocular adipose, myeloid, nasion soft, nephrogenic, nodal, osseous, osteogenic, osteoid, periapical, reticular, retiform, rubber, skeletal muscle, smooth muscle, and subcutaneous tissue.
  • In a further embodiment, the invention provides cells and cell lines from porcine animals that lack expression of functional alpha1,3GT. In one embodiment, these cells or cell lines can be used for xenotransplantation. Cells from any porcine tissue or organ can be used, including, but not limited to: epithelial cells, fibroblast cells, neural cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T), macrophages, monocytes, mononuclear cells, cardiac muscle cells, other muscle cells, granulosa cells, cumulus cells, epidermal cells, endothelial cells, Islets of Langerhans cells, pancreatic insulin secreting cells, pancreatic alpha-2 cells, pancreatic beta cells, pancreatic alpha-1 cells, blood cells, blood precursor cells, bone cells, bone precursor cells, neuronal stem cells, primordial stem cells, hepatocytes, keratinocytes, umbilical vein endothelial cells, aortic endothelial cells, microvascular endothelial cells, fibroblasts, liver stellate cells, aortic smooth muscle cells, cardiac myocytes, neurons, Kupffer cells, smooth muscle cells, Schwann cells, and epithelial cells, erythrocytes, platelets, neutrophils, lymphocytes, monocytes, eosinophils, basophils, adipocytes, chondrocytes, pancreatic islet cells, thyroid cells, parathyroid cells, parotid cells, tumor cells, glial cells, astrocytes, red blood cells, white blood cells, macrophages, epithelial cells, somatic cells, pituitary cells, adrenal cells, hair cells, bladder cells, kidney cells, retinal cells, rod cells, cone cells, heart cells, pacemaker cells, spleen cells, antigen presenting cells, memory cells, T cells, B cells, plasma cells, muscle cells, ovarian cells, uterine cells, prostate cells, vaginal epithelial cells, sperm cells, testicular cells, germ cells, egg cells, leydig cells, peritubular cells, sertoli cells, lutein cells, cervical cells, endometrial cells, mammary cells, follicle cells, mucous cells, ciliated cells, nonkeratinized epithelial cells, 
keratinized epithelial cells, lung cells, goblet cells, columnar epithelial cells, dopaminergic cells, squamous epithelial cells, osteocytes, osteoblasts, osteoclasts, embryonic stem cells, fibroblasts and fetal fibroblasts. In a specific embodiment, pancreatic cells, including, but not limited to, Islets of Langerhans cells, insulin secreting cells, alpha-2 cells, beta cells, alpha-1 cells from pigs that lack expression of functional alpha-1,3-GT are provided.
  • Nonviable derivatives include tissues stripped of viable cells by enzymatic or chemical treatment. These tissue derivatives can be further processed via crosslinking or other chemical treatments prior to use in transplantation. In a preferred embodiment, the derivatives include extracellular matrix derived from a variety of tissues, including skin, urinary bladder or organ submucosal tissues. Also provided are tendons, joints and bones stripped of viable tissue, as well as heart valves and other nonviable tissues, for use as medical devices.
  • Therapeutic Uses
  • The cells can be administered to a host in a wide variety of ways. Preferred modes of administration are parenteral, intraperitoneal, intravenous, intradermal, epidural, intraspinal, intrasternal, intra-articular, intra-synovial, intrathecal, intra-arterial, intracardiac, intramuscular, intranasal, subcutaneous, intraorbital, intracapsular, topical, transdermal patch, via rectal, vaginal or urethral administration including via suppository, percutaneous, nasal spray, surgical implant, internal surgical paint, infusion pump, or via catheter. In one embodiment, the agent and carrier are administered in a slow release formulation such as a direct tissue injection or bolus, implant, microparticle, microsphere, nanoparticle or nanosphere.
  • Disorders that can be treated by infusion of the disclosed cells include, but are not limited to, diseases resulting from a failure or dysfunction of normal blood cell production and maturation (i.e., aplastic anemia and hypoproliferative stem cell disorders); neoplastic, malignant diseases in the hematopoietic organs (e.g., leukemia and lymphomas); broad spectrum malignant solid tumors of non-hematopoietic origin; autoimmune conditions; and genetic disorders. Such disorders include, but are not limited to, diseases resulting from a failure or dysfunction of normal blood cell production and maturation, hyperproliferative stem cell disorders, including aplastic anemia, pancytopenia, agranulocytosis, thrombocytopenia, red cell aplasia, Blackfan Diamond syndrome, due to drugs, radiation, or infection, idiopathic; hematopoietic malignancies including acute lymphoblastic (lymphocytic) leukemia, chronic lymphocytic leukemia, acute myelogenous leukemia, chronic myelogenous leukemia, acute malignant myelosclerosis, multiple myeloma, polycythemia vera, agnogenic myelometaplasia, Waldenstrom's macroglobulinemia, Hodgkin's lymphoma, non-Hodgkin's lymphoma; immunosuppression in patients with malignant, solid tumors including malignant melanoma, carcinoma of the stomach, ovarian carcinoma, breast carcinoma, small cell lung carcinoma, retinoblastoma, testicular carcinoma, glioblastoma, rhabdomyosarcoma, neuroblastoma, Ewing's sarcoma, lymphoma; autoimmune diseases including rheumatoid arthritis, diabetes type 1, chronic hepatitis, multiple sclerosis, systemic lupus erythematosus; genetic (congenital) disorders including anemias, familial aplastic, Fanconi's syndrome, dihydrofolate reductase deficiencies, formamino transferase deficiency, Lesch-Nyhan syndrome, congenital dyserythropoietic syndrome I-IV, Shwachman-Diamond syndrome, congenital spherocytosis, congenital 
elliptocytosis, congenital stomatocytosis, congenital Rh null disease, paroxysmal nocturnal hemoglobinuria, G6PD (glucose-6-phosphate dehydrogenase) variants 1, 2, 3, pyruvate kinase deficiency, congenital erythropoietin sensitivity, deficiency, sickle cell disease and trait, thalassemia alpha, beta, gamma, met-hemoglobinemia, congenital disorders of immunity, severe combined immunodeficiency disease (SCID), bare lymphocyte syndrome, ionophore-responsive combined immunodeficiency, combined immunodeficiency with a capping abnormality, nucleoside phosphorylase deficiency, granulocyte actin deficiency, infantile agranulocytosis, Gaucher's disease, adenosine deaminase deficiency, Kostmann's syndrome, reticular dysgenesis, congenital leukocyte dysfunction syndromes; and others such as osteoporosis, myelosclerosis, acquired hemolytic anemias, acquired immunodeficiencies, infectious disorders causing primary or secondary immunodeficiencies, bacterial infections (e.g., Brucellosis, Listeriosis, tuberculosis, leprosy), parasitic infections (e.g., malaria, Leishmaniasis), fungal infections, disorders involving disproportions in lymphoid cell sets and impaired immune functions due to aging, phagocyte disorders, Kostmann's agranulocytosis, chronic granulomatous disease, Chediak-Higashi syndrome, neutrophil actin deficiency, neutrophil membrane GP-180 deficiency, metabolic storage diseases, mucopolysaccharidoses, mucolipidoses, miscellaneous disorders involving immune mechanisms, Wiskott-Aldrich Syndrome, alpha 1-antitrypsin deficiency, etc.
  • Diseases or pathologies include neurodegenerative diseases, hepatodegenerative diseases, nephrodegenerative disease, spinal cord injury, head trauma or surgery, viral infections that result in tissue, organ, or gland degeneration, and the like. Such neurodegenerative diseases include but are not limited to, AIDS dementia complex; demyelinating diseases, such as multiple sclerosis and acute transverse myelitis; extrapyramidal and cerebellar disorders, such as lesions of the corticospinal system; disorders of the basal ganglia or cerebellar disorders; hyperkinetic movement disorders, such as Huntington's Chorea and senile chorea; drug-induced movement disorders, such as those induced by drugs that block CNS dopamine receptors; hypokinetic movement disorders, such as Parkinson's disease; progressive supranuclear palsy; structural lesions of the cerebellum; spinocerebellar degenerations, such as spinal ataxia, Friedreich's ataxia, cerebellar cortical degenerations, multiple systems degenerations (Menzel, Dejerine-Thomas, Shy-Drager, and Machado-Joseph), systemic disorders, such as Refsum's disease, abetalipoproteinemia, ataxia telangiectasia; and mitochondrial multisystem disorder; demyelinating core disorders, such as multiple sclerosis, acute transverse myelitis; and disorders of the motor unit, such as neurogenic muscular atrophies (anterior horn cell degeneration, such as amyotrophic lateral sclerosis, infantile spinal muscular atrophy and juvenile spinal muscular atrophy); Alzheimer's disease; Down's Syndrome in middle age; Diffuse Lewy body disease; Senile Dementia of Lewy body type; Parkinson's Disease, Wernicke-Korsakoff syndrome; chronic alcoholism; Creutzfeldt-Jakob disease; Subacute sclerosing panencephalitis; Hallervorden-Spatz disease; and Dementia pugilistica. See, e.g., Berkow et. al., (eds.) (1987), The Merck Manual, (15th ed.), Merck and Co., Rahway, N.J.
  • Industrial Farming Uses
  • The present invention provides viable porcine for purposes of farming applications in which one or both alleles of the CMP-Neu5Ac hydroxylase gene have been inactivated. Inactivation of one or both alleles of the CMP-Neu5Ac hydroxylase gene can reduce the susceptibility of porcine animals to zoonotic diseases and infections in pigs such as, for example, E. coli, pig rotavirus, and pig transmissible gastroenteritis coronavirus, and any other zoonotic or enterotoxigenic organism that utilizes Neu5Gc in a host animal. The reduction in disease susceptibility allows greater economic realization of farming operations due to the ability to harvest more healthy animals, and the reduction of animal death due to enterotoxigenic organisms.
  • The following examples are offered by way of illustration and not by way of limitation.
  • EXAMPLES
  • Isolation of Nucleic Acids
  • A combination strategy of PCR-based methods was employed to identify the porcine CMP-Neu5Ac hydroxylase gene. Such PCR methods are well known in the art and described, for example, in PCR Technology, H. A. Erlich, ed., Stockton Press, London, 1989; PCR Protocols: A Guide to Methods and Applications, M. A. Innis, D. H. Gelfand, J. J. Sninsky, and T. J. White, eds., Academic Press, Inc., New York, 1990.
  • Total RNA was extracted from an adult porcine (Great Yorkshire) spleen using Trizol reagent (Gibco, Grand Island, N.Y.). After treatment with DNase I (Ambion, Inc., Austin, Tex.), poly A+ RNA was separated using the Dynabeads mRNA Purification Kit (Dynal, Oslo, Norway). To identify the 5′- or 3′-end of the porcine CMP-Neu5Ac hydroxylase gene, 5′- or 3′-RACE (rapid amplification of cDNA ends) procedures were performed using the Marathon™ cDNA Amplification kit (Clontech). To identify exon-intron boundaries, or the 5′- or 3′-flanking region of the transcripts, porcine GenomeWalker™ libraries were constructed using the Universal GenomeWalker™ Library kit (Clontech). Gene-specific and nested primer pairs were designed from the partial cDNA sequence provided by GenBank Accession #A59058.
  • Determination of cDNA and Genomic CMP-Neu5Ac Hydroxylase Sequence
  • 5′- or 3′-RACE analysis: To identify the 5′ and 3′ ends of porcine CMP-Neu5Ac hydroxylase gene transcripts, 5′- and 3′-RACE procedures were performed using the Marathon cDNA Amplification Kit (Clontech) with poly A+ RNA isolated from adult porcine spleen as a template. First strand cDNA synthesis from 1 µg of poly A+ RNA was accomplished using 20 U of AMV-RT and 1 pmol of the supplied cDNA Synthesis Primer by incubating at 48° C. for 2 hours. Second strand cDNA synthesis involved incubating the entire first strand reaction with a supplied enzyme cocktail composed of RNase H, E. coli DNA polymerase I, and E. coli DNA ligase at 16° C. for 1.5 hr. After blunting of the double stranded cDNA ends by T4 DNA polymerase, the supplied Marathon cDNA Adapters were ligated to an aliquot of purified, double-stranded cDNA. Dilution of the adapter-ligated product in the 10 mM tricine-KOH/0.1 mM EDTA buffer provided with the kit readied the cDNA for PCR amplification.
  • To obtain the 5′- and 3′-most sequences of the porcine CMP-Neu5Ac hydroxylase gene transcripts, provided Marathon cDNA Amplification primer sets were paired with gene-specific and nested gene-specific primers based on the sequence provided by GenBank accession number A59058. These primer sets are provided for in Table 13. By this method, oligonucleotide primers based on the sequence contained in Genbank accession number A59058 are oriented in the 3′ and 5′ directions and are used to generate overlapping PCR fragments. These overlapping 3′ and 5′ products are combined to produce an intact full-length cDNA. This method is described, for example, in Innis, et al., supra; and Frohman et al., Proc. Natl. Acad. Sci., 85:8998, 1988, and further described, for example, in U.S. Pat. No. 4,683,195.
  • Genome Walking analysis: To identify exon-intron boundaries, or 5′- or 3′-flanking region of the porcine CMP-Neu5Ac hydroxylase transcripts, porcine GenomeWalker™ libraries were constructed using the Universal GenomeWalker™ Library Kit (Clontech, Palo Alto, Calif.).
  • Briefly, five aliquots of porcine genomic DNA were separately digested with a single blunt-cutting restriction endonuclease (DraI, EcoRV, PvuII, ScaI, or StuI). After phenol-chloroform extraction, ethanol precipitation, and resuspension of the restricted fragments, a portion of each digested aliquot was used in a separate ligation reaction with the GenomeWalker adapters provided with the kit. This process created five libraries for use in the PCR based cloning strategy. Primer pairs identified in Table 13 were used in a genome walking strategy. Either eLONGase or TaKaRa LA Taq (Takara Shuzo Co., Ltd., Shiga, Japan) enzyme was used for PCR in all GenomeWalker experiments as well as for direct long PCR of genomic DNA. The thermal cycling conditions recommended by the manufacturer were employed in all GenomeWalker-PCR experiments on a Perkin Elmer Gene Amp System 9600 or 9700 thermocycler.
  • TABLE 13
    Primers Used in PCR Strategies
    Primer Set   PCR Strategy              Sequence
    XA           3′-RACE/Genome Walking    5′-CATGGACCTCAAGCTGGGGGACAAGA-3′
    XB           3′-RACE/Genome Walking    5′-GTGTTCGACCCTTGGTTAATCGGTCCTG-3′
    XM           5′-RACE/Genome Walking    5′-CAGGACCGATTAACCAAGGGTCGAACAC-3′
    XN           5′-RACE/Genome Walking    5′-TCTTGTCCCCCAGCTTGAGGTCCATG-3′
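  • The four primers of Table 13 form two complementary pairs: XN is the reverse complement of XA, and XM is the reverse complement of XB, so the same two sites serve for walking in both the 3′ and 5′ directions. The Python check below is an illustrative sketch and not part of the original disclosure; the helper name `reverse_complement` is hypothetical.

```python
# Primer sequences copied from Table 13 (written 5' to 3').
PRIMERS = {
    "XA": "CATGGACCTCAAGCTGGGGGACAAGA",
    "XB": "GTGTTCGACCCTTGGTTAATCGGTCCTG",
    "XM": "CAGGACCGATTAACCAAGGGTCGAACAC",
    "XN": "TCTTGTCCCCCAGCTTGAGGTCCATG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

if __name__ == "__main__":
    # Compare each 3'-directed primer with its proposed 5'-directed partner.
    for forward, reverse in (("XA", "XN"), ("XB", "XM")):
        match = reverse_complement(PRIMERS[forward]) == PRIMERS[reverse]
        print(f"{reverse} is the reverse complement of {forward}: {match}")
```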
  • Subcloning and sequencing of amplified products: PCR products amplified from genomic DNA, GenomeWalker-PCR (Clontech), and 5′- or 3′-RACE were gel-purified using the Qiagen Gel Extraction Kit (Qiagen, Valencia, Calif.), if necessary, then subcloned into the pCRII vector provided with the Original TA Cloning Kit (Invitrogen, Carlsbad, Calif.). Plasmid DNA minipreps of pCRII-ligated inserts were prepared with the QIAprep Spin Miniprep Kit (Qiagen) as directed. Automated fluorescent sequencing of cloned inserts was performed using an ABI 377 Automated Sequence Analyzer (Applied Biosystems, Inc., Foster City, Calif.) with either the dRhodamine or BigDye Terminator Cycle Sequencing Kits (Applied Biosystems) primed with T7 and SP6 promoter primers or primers designed from internal insert sequences.
  • Primer Synthesis: All oligonucleotides used as primers in the various PCR-based methods were synthesized on an ABI 394 DNA Synthesizer (Applied Biosystems, Inc., Foster City Calif.) using solid phase synthesis and phosphoramidite nucleoside chemistry, unless otherwise stated.
  • Analysis of Transcription Factor Binding Sites
  • Analysis of possible transcription factor binding sites was performed using 228 bp of exon 1 sequence and 601 bp upstream of exon 1. The sequences were screened using “MatInspector” software available at www.genomatix.de. The sequences contain binding sites for the following transcription factors: MZF1, ETSF, SF1, CMYB, MEF2, NMP4, BRN2, AP1, GAT1, SATB1, ATF, USF, WHN, ZF5, NFκB, MOK2, NFY, MYCMAX, ZF5. See FIG. 4.
  • Construction of Porcine CMP-Neu5Ac Hydroxylase Homologous Recombination Targeting Vectors
  • CMP-Neu5Ac hydroxylase knock-out target vector: A vector targeting Exon 6 of the porcine CMP-Neu5Ac Hydroxylase gene for knockout can be constructed. In a first step, a portion of Intron 6 is amplified by PCR for use as a 3′-arm of the targeting vector utilizing primers such as pDH3 (5′-CTCCTGGAAGCTTCTGTCAAGACGAAC-3′) and pDH4 (5′-GCCTGATACACAGTGCTGTGCAATGGT-3′) (see FIG. 5). The amplified PCR product of approximately 3.7 kb can be inserted into the pCRII vector after restriction enzyme digestion utilizing EcoRI and ApaI. See FIG. 6.
  • Following the insertion of the 3′-arm, a portion of Intron 5 can be amplified by PCR for use as a 5′-arm in the targeting vector utilizing primers such as pDH1 (5′-ACCACCCAAGTCTGGAATCTTCTTACACT-3′) and pDH2 (5′-GACTCTCATACAAAAGCTAAGCTGGGTAAG-3′) (see FIG. 5). Following this initial amplification, successive PCR amplifications can be performed to introduce an EcoNI restriction site into the 3′ portion of the 5′-arm utilizing primers such as pDH1 in conjunction with primers such as pDH2a (5′-GACTCTCATACAAAACCTAAGCTGGGTAAG-3′), pDH2b (5′-GACTCTCATACAAAACCTAGGCTGGGTAAG-3′), and pDH2c (5′-GACTCTCATACAAAACCTAGGCTAGGTAAG-3′), respectively (see FIG. 5). The amplified PCR product of approximately 2.6 kb containing the engineered EcoNI site can be restriction enzyme digested using ApaI and EcoNI, and inserted into the pCRII vector containing the previously inserted 3′-arm (See FIG. 7), generating a targeting vector (pDHΔex6) containing an approximate 6.3 kb porcine CMP-Neu5Ac hydroxylase targeting sequence (see FIG. 8).
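  • In the primer series above, pDH2a, pDH2b and pDH2c each differ from the preceding primer by a single base, and only pDH2c carries a complete EcoNI site. The Python sketch below illustrates this. It assumes the commonly cited EcoNI recognition sequence CCTNNNNNAGG, which is not stated in this document, and the helper names are hypothetical.

```python
import re

# Primer sequences copied from the text above (written 5' to 3').
PRIMERS = {
    "pDH2":  "GACTCTCATACAAAAGCTAAGCTGGGTAAG",
    "pDH2a": "GACTCTCATACAAAACCTAAGCTGGGTAAG",
    "pDH2b": "GACTCTCATACAAAACCTAGGCTGGGTAAG",
    "pDH2c": "GACTCTCATACAAAACCTAGGCTAGGTAAG",
}

# Assumption: EcoNI recognizes CCTNNNNNAGG (N = any base).
ECONI_SITE = re.compile(r"CCT[ACGT]{5}AGG")

def mismatches(a: str, b: str) -> list:
    """0-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def has_econi_site(seq: str) -> bool:
    """True if the assumed EcoNI recognition sequence occurs in seq."""
    return ECONI_SITE.search(seq) is not None

if __name__ == "__main__":
    names = list(PRIMERS)
    for prev, curr in zip(names, names[1:]):
        print(prev, "->", curr, "changes at", mismatches(PRIMERS[prev], PRIMERS[curr]))
    for name, seq in PRIMERS.items():
        print(name, "contains EcoNI site:", has_econi_site(seq))
```

  • Introducing one mismatch per amplification round keeps each mutagenic primer close to its template. The stated arm lengths are also self-consistent: approximately 2.6 kb plus approximately 3.7 kb gives the approximately 6.3 kb targeting sequence.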
  • EGFP knock-in target vector: pDHΔex6 can be further modified by an in-frame insertion of an enhanced green fluorescent protein sequence at the terminal 3′ end of Exon 6 of the porcine CMP-Neu5Ac hydroxylase gene. In a first step, a portion of Intron 5 and a portion of Exon 6 of the porcine CMP-Neu5Ac hydroxylase gene can be amplified by PCR utilizing primers such as pDH5 (5′-CCTTATACTGGCCCCAATTGGATCTTAC-3′) and pDH6 (5′-CCTTATACTGGCCCCAATTGGATCTTAC-3′) (see FIG. 9), and inserted into a vector (pIRES-EGFP) containing the EGFP and a poly A tail following restriction enzyme digestion with MunI and EcoRV. Following insertion, PCR amplification can be performed on the pIRES-EGFP vector containing the insertion utilizing primers such as pDH7 (5′-CTTACCTAGCCTAGGTTTTGTATGAGAGTC-3′) and pDH8 (5′-GACAAACCACAATTGGAATGCACTCGAG-3′) (see FIG. 9). The PCR amplified product can be restriction enzyme digested using EcoNI and MunI and inserted into the previously constructed pDHΔex6 targeting vector (see FIG. 10). The resultant targeting vector (pDHΔex6-EGFP) is illustrated in FIG. 11.
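  • The restriction sites used in this cloning step can be located in the primers themselves: pDH7 is the reverse complement of pDH2c and therefore carries the engineered EcoNI site, while pDH5 and pDH8 each carry a MunI site. The Python sketch below is illustrative only. It assumes the commonly cited recognition sequences CCTNNNNNAGG for EcoNI and CAATTG for MunI, neither of which is stated in this document, and the helper names are hypothetical.

```python
import re

# Primer sequences copied from the text above (written 5' to 3').
PDH2C = "GACTCTCATACAAAACCTAGGCTAGGTAAG"
PDH5 = "CCTTATACTGGCCCCAATTGGATCTTAC"
PDH7 = "CTTACCTAGCCTAGGTTTTGTATGAGAGTC"
PDH8 = "GACAAACCACAATTGGAATGCACTCGAG"

# Assumptions: EcoNI recognizes CCTNNNNNAGG and MunI recognizes CAATTG.
SITES = {
    "EcoNI": re.compile(r"CCT[ACGT]{5}AGG"),
    "MunI": re.compile(r"CAATTG"),
}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def sites_in(seq: str) -> list:
    """Names of the enzymes whose assumed recognition sequence occurs in seq."""
    return [name for name, pattern in SITES.items() if pattern.search(seq)]

if __name__ == "__main__":
    print("pDH7 is reverse complement of pDH2c:", reverse_complement(PDH2C) == PDH7)
    for name, seq in (("pDH5", PDH5), ("pDH7", PDH7), ("pDH8", PDH8)):
        print(name, "sites:", sites_in(seq))
```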
  • Production of Porcine CMP-Neu5Ac Hydroxylase Deficient Fetal Fibroblast Cells
  • Fetal fibroblast cells are isolated from 10 fetuses of the same pregnancy at day 33 of gestation. After removing the head and viscera, fetuses are washed with Hanks' balanced salt solution (HBSS; Gibco-BRL, Rockville, Md.), placed in 20 ml of HBSS, and diced with small surgical scissors. The tissue is pelleted and resuspended in 50-ml tubes with 40 ml of DMEM and 100 U/ml collagenase (Gibco-BRL) per fetus. Tubes are incubated for 40 min in a shaking water bath at 37° C. The digested tissue is allowed to settle for 3-4 min, and the cell-rich supernatant is transferred to a new 50-ml tube and pelleted. The cells are then resuspended in 40 ml of DMEM containing 10% fetal calf serum (FCS), 1× nonessential amino acids, 1 mM sodium pyruvate, and 2 ng/ml bFGF, and seeded into 10-cm dishes. For transfections, 10 μg of linearized pDHΔex6-EGFP vector is introduced into 2 million cells using Lipofectamine 2000 (Carlsbad, Calif.) following the manufacturer's guidelines. Forty-eight hours after transfection, the transfected cells are seeded into 48-well plates at a density of 2,000 cells per well and grown to confluence. Following confluence, cells are sorted via fluorescence-activated cell sorting (FACS) (FACSCalibur, Becton Dickinson, San Jose, Calif.), wherein only cells having undergone homologous recombination and expressing EGFP are selected (see, for example, FIG. 13).
  • Selected cells are then reseeded and grown to confluency. Once confluency is reached, several small aliquots are frozen back for future use, and the remainder are utilized for PCR and Southern blot verification of homologous recombination. The putative targeted clones can be screened by PCR across the Exon 6/EGFP insert utilizing a primer complementary to the EGFP sequence and a primer complementary to a sequence outside the vector as the antisense primer. The PCR products can be analyzed by Southern blotting using an EGFP probe to identify the positive clones by the presence of the expected band from the targeted allele.
  • Generation of Cloned Pigs Using Heterozygous CMP-Neu5Ac Hydroxylase Deficient Fetal Fibroblasts as Nuclear Donors
  • Preparation of cells for Nuclear Transfer: Donor cells are genetically manipulated to produce cells heterozygous for porcine CMP-Neu5Ac hydroxylase as described generally above. Nuclear transfer can be performed by methods that are well known in the art (see, e.g., Dai et al., Nature Biotechnology 20:251-255, 2002; and Polejaeva et al., Nature 407:86-90, 2000), using EGFP-selected porcine fibroblasts as nuclear donors that are produced as described in detail hereinabove.
  • Oocytes can be isolated from synchronized, superovulated, sexually mature Large-White X Landrace outcross gilts as described, for example, in I. Polejaeva et al., Nature 407: 505 (2000). Donor cells are synchronized in presumptive G0/G1 by serum starvation (0.5%) for 24 to 120 hours. Oocyte enucleation, nuclear transfer, electrofusion, and electroactivation can be performed essentially as described in, for example, A. C. Boquest et al., Biol. Reproduction 68: 1283 (2002). Reconstructed embryos can be cultured overnight and transferred to the oviducts of asynchronous (−1 day) recipients. Pregnancies can be confirmed and monitored by real-time ultrasound.
  • Breeding of heterozygous CMP-Neu5Ac hydroxylase single knockout (SKO) male and female pigs can be performed to establish a miniherd of double knockout (DKO) pigs.
  • Verification of CMP-Neu5Ac Hydroxylase Deficient Pigs
  • Following breeding of the single knockout male and female pigs, verification of double knockout pigs is performed. Fibroblasts from the offspring are incubated with 1 μg of anti-N-glycolyl GM2 monoclonal antibody MK2-34 (Seikagaku Kogyo, JP) on ice for 30 minutes. FITC-conjugated goat anti-mouse IgG is then added to the cells, and antibody binding, which indicates the presence or absence of Neu5Gc and thus the presence or absence of active CMP-Neu5Ac hydroxylase, is detected by flow cytometry (FACSCalibur, Becton Dickinson, San Jose, Calif.).
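As a rough guide to how many offspring of the SKO × SKO breeding would be expected to be double knockouts before any verification is performed, the cross can be enumerated as a single-locus Punnett square. This is only a sketch under stated assumptions: simple Mendelian segregation, equal viability of all genotypes, and a hypothetical notation in which `+` denotes the wild-type allele and `-` the targeted allele. The function name `cross` is illustrative and does not come from the source.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Expected offspring genotype fractions for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "+-" for a
    heterozygous single knockout (SKO) animal. Assumes Mendelian
    segregation and equal viability of all genotypes.
    """
    # Pair every allele of one parent with every allele of the other and
    # sort each pair so that "+-" and "-+" count as the same genotype.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    for genotype, fraction in sorted(cross("+-", "+-").items()):
        print(genotype, fraction)
```

Under these assumptions, one quarter of the offspring of two heterozygous parents would be expected to carry two targeted alleles (`--`), which is why each litter still has to be screened as described above.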

Claims (20)

1-66. (canceled)
67. A targeting vector for homologous recombination in a somatic cell comprising a marker sequence and a sequence homologous to a porcine genomic CMP-N-acetylneuraminic-acid (CMP-Neu5Ac) hydroxylase sequence, wherein said homologous sequence is sufficient to provide targeted insertion of the selectable marker sequence into a target CMP-Neu5Ac gene in a host.
68. The targeting vector of claim 67 wherein said vector comprises:
i. a first nucleotide sequence comprising at least 17 contiguous nucleotides of the porcine genomic CMP-Neu5Ac hydroxylase sequence;
ii. a second nucleotide sequence comprising at least 17 contiguous nucleotides of the porcine genomic CMP-Neu5Ac hydroxylase sequence; and
iii. a marker sequence;
wherein said first and second nucleotide sequences do not overlap.
69. The targeting vector of claim 68 wherein the selectable marker gene is green fluorescent protein.
70. The targeting vector of claim 68 wherein the first nucleotide sequence represents a 5′ recombination arm.
71. The targeting vector of claim 68 wherein the second nucleotide sequence represents a 3′ recombination arm.
72. The targeting vector of claim 68 wherein the first or second nucleotide sequence is homologous to Intron 5 of porcine genomic CMP-Neu5Ac hydroxylase sequence.
73. The targeting vector of claim 68 wherein the first or second nucleotide sequence is homologous to Intron 6 of the porcine genomic CMP-Neu5Ac hydroxylase sequence.
74. The targeting vector of claim 68 wherein the first or second nucleotide sequence comprises at least 50 contiguous nucleotides of the porcine genomic CMP-Neu5Ac hydroxylase sequence.
75. The targeting vector of claim 68 wherein the first or second nucleotide sequence comprises at least 100 contiguous nucleotides of the porcine genomic CMP-Neu5Ac hydroxylase sequence.
76. The targeting vector of claim 68 wherein the first or second nucleotide sequence comprises at least 150 contiguous nucleotides of the porcine genomic CMP-Neu5Ac hydroxylase sequence.
77. A cell transfected with the targeting vector of claim 67.
78. The cell of claim 77 wherein at least one allele of a porcine CMP-Neu5Ac gene has been rendered inactive via homologous recombination.
79. A porcine animal comprising the cell of claim 78.
80. The animal of claim 79 wherein at least one allele of a porcine CMP-Neu5Ac hydroxylase gene has been rendered inactive via homologous recombination.
81. An organ obtained from the animal of claim 80.
82. A tissue obtained from the animal of claim 80.
83. The organ of claim 81 wherein the organ is selected from the group consisting of heart, lung, kidney and liver.
84. A method to produce genetically modified cells comprising: (a) transfecting a porcine cell with the targeting vector of claim 67; and (b) selecting a transfected cell in which at least one allele of a porcine CMP-N-acetylneuraminic-acid (CMP-Neu5Ac) hydroxylase gene has been rendered inactive.
85. The method of claim 84 further comprising: (c) transferring the nucleus of the selected transfected cell into an enucleated oocyte to produce an embryo; and (d) allowing the embryo to develop into a non-human animal wherein at least one allele of a porcine CMP-Neu5Ac hydroxylase gene has been rendered inactive.
US12/061,351 2003-06-06 2008-04-02 Porcine cmp-n-acetylneuraminic acid hydroxylase gene Abandoned US20090049562A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/061,351 US20090049562A1 (en) 2003-06-06 2008-04-02 Porcine cmp-n-acetylneuraminic acid hydroxylase gene

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US47639603P 2003-06-06 2003-06-06
US10/863,116 US7368284B2 (en) 2003-06-06 2004-06-07 Porcine CMP-N-Acetylneuraminic acid hydroxylase gene
US12/061,351 US20090049562A1 (en) 2003-06-06 2008-04-02 Porcine cmp-n-acetylneuraminic acid hydroxylase gene

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/863,116 Continuation US7368284B2 (en) 2003-06-06 2004-06-07 Porcine CMP-N-Acetylneuraminic acid hydroxylase gene

Publications (1)

Publication Number Publication Date
US20090049562A1 (en) 2009-02-19

Family

ID=33511785

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/863,116 Expired - Fee Related US7368284B2 (en) 2003-06-06 2004-06-07 Porcine CMP-N-Acetylneuraminic acid hydroxylase gene
US12/061,351 Abandoned US20090049562A1 (en) 2003-06-06 2008-04-02 Porcine cmp-n-acetylneuraminic acid hydroxylase gene

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/863,116 Expired - Fee Related US7368284B2 (en) 2003-06-06 2004-06-07 Porcine CMP-N-Acetylneuraminic acid hydroxylase gene

Country Status (6)

Country Link
US (2) US7368284B2 (en)
EP (1) EP1639083A4 (en)
JP (1) JP2007525955A (en)
AU (1) AU2004246022A1 (en)
CA (1) CA2528500A1 (en)
WO (1) WO2004108904A2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050028228A1 (en) * 2003-07-21 2005-02-03 Lifecell Corporation Acellular tissue matrices made from alpa-1,3-galactose-deficient tissue
US20060019292A1 (en) * 2001-01-18 2006-01-26 Farmer Andrew A Sequence specific recombinase-based methods for producing intron containing vectors and compositions for use in practicing the same
WO2014178485A1 (en) * 2013-04-30 2014-11-06 건국대학교 산학협력단 Cmp-acetylneuraminic acid hydroxylase targeting vector, vector-transduced transgenic animal for xenotransplantation, and method for producing same
US9420770B2 (en) 2009-12-01 2016-08-23 Indiana University Research & Technology Corporation Methods of modulating thrombocytopenia and modified transgenic pigs
US10307510B2 (en) 2013-11-04 2019-06-04 Lifecell Corporation Methods of removing alpha-galactose
US10799614B2 (en) 2018-10-05 2020-10-13 Xenotherapeutics, Inc. Xenotransplantation products and methods
US10883084B2 (en) 2018-10-05 2021-01-05 Xenotherapeutics, Inc. Personalized cells, tissues, and organs for transplantation from a humanized, bespoke, designated-pathogen free, (non-human) donor and methods and products relating to same

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060053500A1 (en) * 2004-05-28 2006-03-09 Univ. of Pittsburgh of the Commonwealth System of Higher Education, Office of Technology Management Modification of sugar metabolic processes in transgenic cells, tissues and animals
US20080026457A1 (en) * 2004-10-22 2008-01-31 Kevin Wells Ungulates with genetically modified immune systems
AU2005299413A1 (en) * 2004-10-22 2006-05-04 Revivicor, Inc. Ungulates with genetically modified immune systems
CA2617930A1 (en) 2005-08-09 2007-03-29 Revivicor, Inc. Transgenic ungulates expressing ctla4-ig and uses thereof
ES2548377T3 (en) 2008-10-27 2015-10-16 Revivicor, Inc. Immunosuppressed ungulates
WO2012112586A1 (en) 2011-02-14 2012-08-23 Revivicor, Inc. Genetically modified pigs for xenotransplantation of vascularized xenografts and derivatives thereof
KR101462720B1 (en) * 2012-07-31 2014-11-24 성균관대학교산학협력단 Promoters of pig CMP-N-acetylneuraminic acid hydroxylase gene and pig cell line for xeno-transplantation by suppressing promoters thereof
US20140115728A1 (en) 2012-10-24 2014-04-24 A. Joseph Tector Double knockout (gt/cmah-ko) pigs, organs and tissues
US9616114B1 (en) 2014-09-18 2017-04-11 David Gordon Bermudes Modified bacteria having improved pharmacokinetics and tumor colonization enhancing antitumor activity
US11129906B1 (en) 2016-12-07 2021-09-28 David Gordon Bermudes Chimeric protein toxins for expression by therapeutic bacteria
US11180535B1 (en) 2016-12-07 2021-11-23 David Gordon Bermudes Saccharide binding, tumor penetration, and cytotoxic antitumor chimeric peptides from therapeutic bacteria
CN110408707B (en) * 2019-07-23 2021-02-12 华中农业大学 Molecular marker cloned from InDel fragment and related to pig hair color property
US20220211018A1 (en) 2020-11-20 2022-07-07 Revivicor, Inc. Multi-transgenic pigs with growth hormone receptor knockout for xenotransplantation
US20230255185A1 (en) 2021-09-20 2023-08-17 Revivicor, Inc. Multitransgenic pigs comprising ten genetic modifications for xenotransplantation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5525488A (en) * 1985-10-03 1996-06-11 Genentech, Inc. Nucleic acid encoding the mature α chain of inhibin and method for synthesizing polypeptides using such nucleic acid
US6166288A (en) * 1995-09-27 2000-12-26 Nextran Inc. Method of producing transgenic animals for xenotransplantation expressing both an enzyme masking or reducing the level of the gal epitope and a complement inhibitor

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06113838A (en) 1992-10-08 1994-04-26 Akemi Suzuki Hydroxylase, its production, gene coding hydroxylase and monoclonal antibody specifically reacting with hydroxylase
EP0837942A1 (en) 1995-07-07 1998-04-29 Boehringer Mannheim Gmbh Nucleic acid coding for cmp-n-acetyl-neuraminic acid hydroxylase and its use for the production of modified glycoproteins
EP0752474A1 (en) * 1995-07-07 1997-01-08 Boehringer Mannheim Gmbh Nucleic acid coding for CMP-N-acetyl-neuraminic acid hydroxylase and its use for the production of modified glycoproteins
CA2361943C (en) 1999-03-04 2011-05-17 Ppl Therapeutics (Scotland) Limited Genetic modification of somatic cells and uses thereof
AU2002308533B2 (en) * 2001-04-30 2007-07-19 Rbc Biotechnology, Inc Modified organs and cells for xenotransplantation
JP2005535343A (en) 2002-08-14 2005-11-24 イマージ バイオセラピューティクス,インコーポレーテッド α (1,3) -Galactosyltransferase deficient cells, selection methods, and α (1,3) -galactosyltransferase deficient pigs made therefrom
ES2338111T3 (en) 2002-08-21 2010-05-04 Revivicor, Inc. PIG ANIMALS THAT LACK OF ANY EXPRESSION OF ALPHA 1.3 FUNCTIONAL GALACTOSILTRANSPHERASE.

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060019292A1 (en) * 2001-01-18 2006-01-26 Farmer Andrew A Sequence specific recombinase-based methods for producing intron containing vectors and compositions for use in practicing the same
US20110165629A1 (en) * 2001-01-18 2011-07-07 Life Technologies Corporation Recombinase-Based Methods for Producing Expression Vectors and Compositions for Use in Practicing the Same
US20050028228A1 (en) * 2003-07-21 2005-02-03 Lifecell Corporation Acellular tissue matrices made from alpa-1,3-galactose-deficient tissue
US20110002996A1 (en) * 2003-07-21 2011-01-06 Lifecell Corporation Acellular tissue matrices made from alpha-1,3-galactose-deficient tissue
US8324449B2 (en) 2003-07-21 2012-12-04 Lifecell Corporation Acellular tissue matrices made from alpha-1,3-galactose-deficient tissue
US8802920B2 (en) 2003-07-21 2014-08-12 Lifecell Corporation Acellular tissue matrices made from α-1,3-galactose-deficient tissue
US9420770B2 (en) 2009-12-01 2016-08-23 Indiana University Research & Technology Corporation Methods of modulating thrombocytopenia and modified transgenic pigs
WO2014178485A1 (en) * 2013-04-30 2014-11-06 건국대학교 산학협력단 Cmp-acetylneuraminic acid hydroxylase targeting vector, vector-transduced transgenic animal for xenotransplantation, and method for producing same
US10307510B2 (en) 2013-11-04 2019-06-04 Lifecell Corporation Methods of removing alpha-galactose
US10799614B2 (en) 2018-10-05 2020-10-13 Xenotherapeutics, Inc. Xenotransplantation products and methods
US10883084B2 (en) 2018-10-05 2021-01-05 Xenotherapeutics, Inc. Personalized cells, tissues, and organs for transplantation from a humanized, bespoke, designated-pathogen free, (non-human) donor and methods and products relating to same
US10905799B2 (en) 2018-10-05 2021-02-02 Xenotherapeutics Corporation Xenotransplantation products and methods
US11028371B2 (en) 2018-10-05 2021-06-08 Xenotherapeutics, Inc. Personalized cells, tissues, and organs for transplantation from a humanized, bespoke, designated-pathogen free, (non-human) donor and methods and products relating to same
US11129922B2 (en) 2018-10-05 2021-09-28 Xenotherapeutics, Inc. Xenotransplantation products and methods
US11155788B2 (en) 2018-10-05 2021-10-26 Xenotherapeutics, Inc. Personalized cells, tissues, and organs for transplantation from a humanized, bespoke, designated-pathogen free, (non-human) donor and methods and products relating to same
US11473062B2 (en) 2018-10-05 2022-10-18 Xenotherapeutics, Inc. Personalized cells, tissues, and organs for transplantation from a humanized, bespoke, designated-pathogen free, (non-human) donor and methods and products relating to same
US11833270B2 (en) 2018-10-05 2023-12-05 Xenotherapeutics, Inc. Xenotransplantation products and methods

Also Published As

Publication number Publication date
JP2007525955A (en) 2007-09-13
EP1639083A2 (en) 2006-03-29
EP1639083A4 (en) 2008-05-28
AU2004246022A1 (en) 2004-12-16
CA2528500A1 (en) 2004-12-16
WO2004108904A2 (en) 2004-12-16
US20050223418A1 (en) 2005-10-06
WO2004108904A3 (en) 2007-02-08
US7368284B2 (en) 2008-05-06

Similar Documents

Publication Publication Date Title
US20090049562A1 (en) Porcine cmp-n-acetylneuraminic acid hydroxylase gene
US11172658B2 (en) Porcine animals lacking expression of functional alpha 1, 3 galactosyltransferase
US7732180B2 (en) Porcine Forssman synthetase protein, cDNA, genomic organization, and regulatory region
US20050108783A1 (en) Porcine invariant chain protein, full length cDNA, genomic organization, and regulatory region
US7560538B2 (en) Porcine isogloboside 3 synthase protein, cDNA, genomic organization, and regulatory region

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: NATIONAL INSTITUTES OF HEALTH - DIRECTOR DEITR, MA

Free format text: CONFIRMATORY LICENSE;ASSIGNOR:UNIVERSITY OF PITTSBURGH;REEL/FRAME:037556/0587

Effective date: 20160115