WO1998016633A1 - Alpha-amylase fused to cellulose binding domain, for starch degradation - Google Patents

Alpha-amylase fused to cellulose binding domain, for starch degradation

Publication number
WO1998016633A1
Authority
WO
WIPO (PCT)
Prior art keywords
gly
asp
ala
ser
thr
Application number
PCT/DK1997/000448
Other languages
French (fr)
Inventor
Mads BJØRNVAD
Sven Pedersen
Martin Schulein
Henrik Bisgård-Frantzen
Original Assignee
Novo Nordisk A/S
Application filed by Novo Nordisk A/S
Priority to EP97943797A (EP0950093A2)
Priority to AU45510/97A (AU4551097A)
Publication of WO1998016633A1


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/24: Hydrolases (3) acting on glycosyl compounds (3.2)
    • C12N9/2402: Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S-glycosyl compounds (3.2.1)
    • C12N9/2405: Glucanases
    • C12N9/2408: Glucanases acting on alpha-1,4-glucosidic bonds
    • C12N9/2411: Amylases
    • C12N9/2428: Glucan 1,4-alpha-glucosidase (3.2.1.3), i.e. glucoamylase
    • C12N9/2451: Glucanases acting on alpha-1,6-glucosidic bonds
    • C12N9/2457: Pullulanase (3.2.1.41)
    • C12N9/246: Isoamylase (3.2.1.68)
    • C12Y: ENZYMES
    • C12Y302/00: Hydrolases acting on glycosyl compounds, i.e. glycosylases (3.2)
    • C12Y302/01: Glycosidases, i.e. enzymes hydrolysing O- and S-glycosyl compounds (3.2.1)
    • C12Y302/01001: Alpha-amylase (3.2.1.1)
    • C12Y302/01041: Pullulanase (3.2.1.41)
    • C12Y302/01068: Isoamylase (3.2.1.68)
    • C13: SUGAR INDUSTRY
    • C13K: SACCHARIDES OBTAINED FROM NATURAL SOURCES OR BY HYDROLYSIS OF NATURALLY OCCURRING DISACCHARIDES, OLIGOSACCHARIDES OR POLYSACCHARIDES
    • C13K1/00: Glucose; Glucose-containing syrups
    • C13K1/06: Glucose; Glucose-containing syrups obtained by saccharification of starch or raw materials containing starch
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide

Abstract

The invention relates to a starch conversion method wherein the starch substrate is treated in aqueous medium with an CBD/enzyme hybrid. Further, the invention also relates to an isolated DNA sequence encoding a stable CBD/enzyme hybrid, a DNA construct comprising said DNA sequence of the invention, an expression vector comprising the DNA sequence of the invention, and a CBD/enzyme hybrid.

Description

ALPHA-AMYLASE FUSED TO CELLULOSE BINDING DOMAIN, FOR STARCH DEGRADATION
FIELD OF THE INVENTION
The present invention relates, inter alia, to the use of a hybrid between a carbohydrate-binding domain ("CBD") and an enzyme of a type employed in industrial starch processing [notably starch processing for the production (vide infra) of sweeteners, particularly glucose- and/or fructose-containing syrups], especially an amylolytic enzyme, such as an α-amylase employed in a so-called "starch liquefaction" process (vide infra) in which starch is degraded (often termed "dextrinized") to smaller oligo- and/or polysaccharide fragments, or a debranching enzyme (such as an isoamylase or a pullulanase) employed to debranch amylopectin-derived starch fragments in connection with the so-called "saccharification" process (vide infra) which is normally carried out after the liquefaction stage. The invention also relates to a hybrid enzyme consisting of a CBD-linker-enzyme.
BACKGROUND OF THE INVENTION
As indicated above, the present invention is of particular value in the field of starch processing (starch conversion). Conditions for conventional starch conversion processes and for liquefaction and/or saccharification processes are described in, e.g., US 3,912,590 and in EP 0,252,730 and EP 0,063,909.
Production of sweeteners from starch:
A "traditional" process for the production of glucose- and fructose-containing syrups from starch normally consists of three consecutive enzymatic processes, viz. a liquefaction process followed by a saccharification process and (for production of fructose-containing syrups) an isomerization process. During the liquefaction process, starch (initially in the form of a starch suspension in aqueous medium) is degraded to dextrins (oligo- and polysaccharide fragments of starch) by an α-amylase [EC 3.2.1.1; e.g. Termamyl™ (Bacillus licheniformis α-amylase), available from Novo Nordisk A/S, Bagsvaerd, Denmark], typically at pH values between 5.5 and 6.2 and at temperatures of 95-160°C for a period of approximately 2 hours. In order to ensure optimal enzyme stability under these conditions, approximately 1 mM of calcium (ca. 40 ppm free calcium ions) is typically added to the starch suspension.
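The equivalence between 1 mM calcium and roughly 40 ppm quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes a molar mass of 40.08 g/mol for calcium and treats 1 ppm as 1 mg per litre of a dilute aqueous medium, neither of which is stated in the text, and the function name is hypothetical.

```python
# Molar mass of calcium in g/mol (standard value; an assumption, since the
# text does not state it).
CALCIUM_MOLAR_MASS = 40.08


def calcium_mm_to_ppm(concentration_mm: float) -> float:
    """Convert a calcium concentration in mmol/L to ppm.

    Assumes a dilute aqueous medium, so that 1 ppm is taken as 1 mg/L.
    """
    # mmol/L multiplied by g/mol gives mg/L
    return concentration_mm * CALCIUM_MOLAR_MASS


if __name__ == "__main__":
    print(f"1 mM Ca2+ is about {calcium_mm_to_ppm(1.0):.0f} ppm")
```

Under these assumptions, 1 mM corresponds to about 40 mg/L, in line with the "ca. 40 ppm" figure.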
After the liquefaction process the dextrins are converted into dextrose (D-glucose) by addition of a glucoamylase (amyloglucosidase, EC 3.2.1.3; e.g. AMG™, from Novo Nordisk A/S) and, typically, a debranching enzyme, such as an isoamylase (EC 3.2.1.68) or a pullulanase (EC 3.2.1.41; e.g. Promozyme™, from Novo Nordisk A/S). Before this step the pH of the medium is normally reduced to a value below 4.5 (e.g. pH 4.3), maintaining the high temperature (above 95°C), and the liquefying α-amylase is thereby denatured. The temperature is then normally lowered to 60°C, and glucoamylase and debranching enzyme are added. The saccharification process is normally allowed to proceed for 24-72 hours.
After completion of the saccharification stage, the pH of the medium is increased to a value in the range of 6-8, preferably pH 7.5, and calcium ions are removed by ion exchange. The resulting syrup (dextrose syrup) may then be converted into high fructose syrup using, e.g., an immobilized "glucose isomerase" (xylose isomerase, EC 5.3.1.5; e.g. Sweetzyme™, from Novo Nordisk A/S).
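The three stages and the operating ranges quoted above can be collected in a small data structure, which makes it easy to check a proposed pH and temperature against a stage. This is a minimal sketch: the class, field and function names are hypothetical, only ranges actually given in the text are filled in, and the open-ended inactivation step (pH below 4.5 at above 95°C) is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Range = Tuple[float, float]


@dataclass(frozen=True)
class Stage:
    """One stage of the traditional process, with the ranges quoted in the text."""
    name: str
    ph: Optional[Range] = None      # None where the text gives no pH range
    temp_c: Optional[Range] = None  # None where the text gives no temperature
    hours: Optional[Range] = None   # None where the text gives no duration


# Values come from the description above; the layout itself is hypothetical.
TRADITIONAL_PROCESS = [
    Stage("liquefaction", ph=(5.5, 6.2), temp_c=(95.0, 160.0), hours=(2.0, 2.0)),
    Stage("saccharification", temp_c=(60.0, 60.0), hours=(24.0, 72.0)),
    Stage("isomerization", ph=(6.0, 8.0)),
]


def within(value: float, bounds: Optional[Range]) -> bool:
    """Return True if value lies inside bounds, or if no bounds are given."""
    return bounds is None or bounds[0] <= value <= bounds[1]


def check(stage: Stage, ph: float, temp_c: float) -> bool:
    """Check a pH/temperature setting against one stage's quoted ranges."""
    return within(ph, stage.ph) and within(temp_c, stage.temp_c)


if __name__ == "__main__":
    liquefaction = TRADITIONAL_PROCESS[0]
    print(check(liquefaction, ph=6.0, temp_c=105.0))
    print(check(liquefaction, ph=4.5, temp_c=105.0))
```

The second call reports that pH 4.5 lies outside the quoted liquefaction window of 5.5 to 6.2.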
A number of improvements in the properties of enzymes currently employed in starch conversion processes would be desirable. With respect to starch liquefaction, employing liquefying α-amylases, at least 3 improvements could be envisaged and are outlined below; each of these could be regarded as an individual benefit, although any combination (e.g. 1+2, 1+3, 2+3 or 1+2+3) could advantageously be employed:
Improvement 1.
Reduction of the Ca2+ dependency of the liquefying α-amylase.
Addition of free calcium (calcium ion) is required to ensure adequately high stability of α-amylases currently employed for starch liquefaction, but the presence of calcium ions in the medium at the isomerization stage results in strong inhibition of the activity of the glucose isomerase employed therein. It is therefore necessary either to reduce the calcium ion content of the medium, by means of an expensive unit operation (e.g. ion exchange), to a level below about 3-5 ppm of free calcium, or to minimize the inhibitory effect of calcium in some other manner, e.g. by addition to the medium, after the saccharification stage, of magnesium ions in an amount sufficient to adequately "out-compete" binding of calcium to the glucose isomerase. Significant savings could be achieved if the liquefaction process could be performed without addition of calcium ions, thereby eliminating the need for subsequent, expensive remedial unit operations to remove calcium or minimize the inhibitory effect thereof.
To achieve this, an α-amylolytic enzyme which is stable and highly active at low concentrations of free calcium (< 40 ppm) is required. Such an enzyme should preferably have a pH optimum at a pH in the range of 4.5-6.5, more preferably in the range of 4.5-5.5.
Improvement 2.
Reduction of formation of unwanted Maillard products.
The extent of formation of unwanted Maillard products during the liquefaction process is dependent on the pH. Low pH favours reduced formation of Maillard products. It would thus be desirable to be able to lower the process pH from around pH 6.0 to a value around pH 4.5; unfortunately, none of the commonly known thermostable liquefying α-amylases is very stable at low pH (i.e. pH < 6.0), and their specific activity is generally low.
Achievement of the above-mentioned goal requires the availability of an α-amylolytic enzyme which is stable at a pH in the range of 4.5-5.5, and which preferably maintains a high specific activity.
Improvement 3.
Reduced influence of the liquefying α-amylase on the saccharification process.
It has been reported previously (US patent 5,234,823) that when saccharifying with A. niger glucoamylase and B. acidopullulyticus pullulanase, the presence of residual α-amylase activity remaining after the liquefaction process can lead to lower yields of dextrose if the α-amylase is not inactivated before the saccharification stage. As already mentioned (vide supra), this inactivation is typically carried out by adjusting the pH to below 4.5 at 95°C, before lowering the temperature to 60°C for saccharification.
The cause of this negative effect on dextrose yield is not fully understood, but it is assumed that the liquefying α-amylase preparation employed (e.g. a Termamyl™ product, such as Termamyl™ 120 L) generates "limit dextrins" (which are poor substrates for B. acidopullulyticus pullulanase) by hydrolysing 1,4-alpha-glucosidic linkages close to and on both sides of the branching points in amylopectin. Hydrolysis of these limit dextrins by glucoamylase leads to a build-up of the trisaccharide panose, which is only slowly hydrolysed by glucoamylase.
The development of a thermostable α-amylolytic enzyme which does not suffer from this disadvantage would be a significant process improvement, as no separate inactivation step would be required.
One object of the present invention is to achieve improved performance of α-amylolytic enzymes in relation to starch liquefaction processes - e.g. by achieving one or more of the above-outlined improvements - by changing the affinity of the enzyme for the starch substrate, whereby the modified enzyme comes into more intimate contact with the substrate.
SUMMARY OF THE INVENTION
One aspect of the invention relates to an improved enzymatic process for liquefying starch employing a modified form of a liquefying α-amylase, wherein the α-amylase in question is linked to an amino acid sequence comprising a carbohydrate-binding domain (vide infra). The invention also relates to an improved enzymatic process for liquefying starch in which the starch is treated with a debranching enzyme in addition to the modified α-amylase. The debranching enzyme may be modified by linkage to an amino acid sequence comprising a carbohydrate-binding domain.
Similarly, and also within the scope of the invention, it is envisaged that the use of an analogously modified (i.e. CBD-derivatized) form of a debranching enzyme, such as an isoamylase or a pullulanase, for debranching amylopectin-derived starch fragments (e.g. in connection with the above-outlined saccharification stage of a starch conversion process) will result in enhanced debranching performance, and thereby dextrose yield improvement, in the saccharification procedure.
DETAILED DESCRIPTION OF THE INVENTION
In a first aspect the present invention thus relates to a method for liquefying starch, wherein a starch substrate is treated in aqueous medium with a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of an α-amylase linked (i.e. covalently bound) to an amino acid sequence comprising a carbohydrate-binding domain (CBD). The invention also relates to an improved enzymatic process for liquefying starch in which the starch is treated with a debranching enzyme in addition to the modified α-amylase. The debranching enzyme may be modified by linkage to an amino acid sequence comprising a carbohydrate-binding domain. A further aspect of the present invention relates to a method for saccharifying starch which has been subjected to a liquefaction process, wherein the reaction mixture after liquefaction is treated with a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of an amylopectin-debranching enzyme (e.g. an isoamylase or a pullulanase) linked (i.e. covalently bound) to an amino acid sequence comprising a carbohydrate-binding domain (CBD).
It is to be understood that starch liquefaction processes as referred to in the context of the present invention do not embrace, for example, textile de-sizing processes wherein starch ("size") present in fabrics or textiles (normally cellulosic or cellulose-containing fabrics or textiles) is removed from the fabric or textile by an enzymatic process.
Carbohydrate-binding domains
A carbohydrate-binding domain (CBD) is a polypeptide amino acid sequence which binds preferentially to a poly- or oligosaccharide (carbohydrate), frequently - but not necessarily exclusively - to a water-insoluble (including crystalline) form thereof.
Although a number of types of CBDs have been described in the patent and scientific literature, the majority thereof - many of which derive from cellulolytic enzymes (cellulases) - are commonly referred to as "cellulose-binding domains"; a typical cellulose-binding domain will thus be a CBD which occurs in a cellulase. Likewise, other sub-classes of CBDs would embrace, e.g., chitin-binding domains (CBDs which typically occur in chitinases), xylan-binding domains (CBDs which typically occur in xylanases), mannan-binding domains (CBDs which typically occur in mannanases), starch-binding domains [CBDs which may occur in certain amylolytic enzymes, such as certain glucoamylases, or in enzymes such as cyclodextrin glucanotransferases ("CGTases")], and others.
CBDs are found as integral parts of large polypeptides or proteins consisting of two or more polypeptide amino acid sequence regions, especially in hydrolytic enzymes (hydrolases) which typically comprise a catalytic domain containing the active site for substrate hydrolysis and a carbohydrate-binding domain (CBD) for binding to the carbohydrate substrate in question. Such enzymes can comprise more than one catalytic domain and one, two or three CBDs, and optionally further comprise one or more polypeptide amino acid sequence regions linking the CBD(s) with the catalytic domain(s), a region of the latter type usually being denoted a "linker". Examples of hydrolytic enzymes comprising a CBD - some of which have already been mentioned above - are cellulases, xylanases, mannanases, arabinofuranosidases, acetylesterases and chitinases. CBDs have also been found in algae, e.g. in the red alga Porphyra purpurea in the form of a non-hydrolytic polysaccharide-binding protein [see P. Tomme et al., Cellulose-Binding Domains - Classification and Properties, in Enzymatic Degradation of Insoluble Carbohydrates, John N. Saddler and Michael H. Penner (Eds.), ACS Symposium Series, No. 618 (1996)]. However, most of the known CBDs [which are classified and referred to by P. Tomme et al. (op. cit.) as "cellulose-binding domains"] derive from cellulases and xylanases.
In the present context, the term "cellulose-binding domain" is intended to be understood in the same manner as in the latter reference (P. Tomme et al., op. cit.), and the abbreviation "CBD" as employed herein will thus often be interpretable either in the broader sense (carbohydrate-binding domain) or in the - in principle - narrower sense (cellulose-binding domain). The P. Tomme et al. reference classifies more than 120 "cellulose-binding domains" into 10 families (I-X) which may have different functions or roles in connection with the mechanism of substrate binding. However, it is anticipated that new family representatives and additional CBD families will appear in the future. In proteins/polypeptides in which CBDs occur (e.g. enzymes, typically hydrolytic enzymes), a CBD may be located at the N or C terminus or at an internal position.
That part of a polypeptide or protein (e.g. hydrolytic enzyme) which constitutes a CBD per se typically consists of more than about 30 and less than about 250 amino acid residues. For example: those CBDs listed and classified in Family I in accordance with P. Tomme et al. (op. cit.) consist of 33-37 amino acid residues, those listed and classified in Family IIa consist of 95-108 amino acid residues, those listed and classified in Family VI consist of 85-92 amino acid residues, whilst one CBD (derived from a cellulase from Clostridium thermocellum) listed and classified in Family VII consists of 240 amino acid residues. Accordingly, the molecular weight of an amino acid sequence constituting a CBD per se will typically be in the range of from about 4kD to about 40kD, and usually below about 35kD.
Enzyme hybrids
Enzyme classification numbers (EC numbers) referred to in the present specification with claims are in accordance with the Recommendations (1992) of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology, Academic Press Inc., 1992.
As already indicated to some extent (vide supra), modified enzymes as referred to herein (in the following also denoted "enzyme hybrids") include species comprising an amino acid sequence of an amylolytic enzyme [which in the context of the present invention may, e.g., be an α-amylase (EC 3.2.1.1), an isoamylase (EC 3.2.1.68) or a pullulanase (EC 3.2.1.41)] linked (i.e. covalently bound) to an amino acid sequence comprising a CBD. Other CBD-containing enzyme hybrids of interest in relation to degradation of starch include, e.g., hybrids comprising an amino acid sequence of a glucan 1,4-α-maltohydrolase (EC 3.2.1.133), a β-amylase (EC 3.2.1.2), a glucoamylase (EC 3.2.1.3), or a neopullulanase (EC 3.2.1.135). CBD-containing enzyme hybrids, as well as detailed descriptions of the preparation and purification thereof, are known in the art [see, e.g., WO 90/00609, WO 94/24158 and WO 95/16782, as well as Greenwood et al., Biotechnology and Bioengineering 44 (1994) pp. 1295-1305]. They may, e.g., be prepared by transforming into a host cell a DNA construct comprising at least a fragment of DNA encoding the cellulose-binding domain ligated, with or without a linker, to a DNA sequence encoding the enzyme of interest, and growing the transformed host cell to express the fused gene. The resulting recombinant product (enzyme hybrid) - often referred to in the art as a "fusion protein" - may be described by the following general formula:
A-CBD-MR-X
In the latter formula, A-CBD is the N-terminal or the C-terminal region of an amino acid sequence comprising at least the carbohydrate-binding domain (CBD) per se. MR is the middle region (the "linker"), and X is the sequence of amino acid residues of a polypeptide encoded by a DNA sequence encoding the enzyme (or other protein) to which the CBD is to be linked. The moiety A may either be absent (such that A-CBD is a CBD per se, i.e. comprises no amino acid residues other than those constituting the CBD) or may be a sequence of one or more amino acid residues (functioning as a terminal extension of the CBD per se). The linker (MR) may be a bond, or a short linking group comprising from about 2 to about 100 carbon atoms, in particular of from 2 to 40 carbon atoms. However, MR is preferably a sequence of from about 2 to about 100 amino acid residues, more preferably of from 2 to 40 amino acid residues, such as from 2 to 15 amino acid residues.
The moiety X may constitute either the N-terminal or the C-terminal region of the overall enzyme hybrid.
It will thus be apparent from the above that the CBD in an enzyme hybrid of the type in question may be positioned C-terminally, N-terminally or internally in the enzyme hybrid.
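The general formula A-CBD-MR-X can be pictured as the concatenation of four sequence segments. The sketch below is one possible reading, not a method taken from the specification: the class and field names are hypothetical, the residue strings are toy placeholders rather than real CBD or enzyme sequences, the C-terminal layout X-MR-CBD-A is an assumed mirror image of the formula, and internal positioning of the CBD is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnzymeHybrid:
    """Segments of a hybrid following the general formula A-CBD-MR-X."""
    cbd: str                     # residues of the carbohydrate-binding domain per se
    enzyme: str                  # X: residues of the enzyme linked to the CBD
    extension: str = ""          # A: optional terminal extension (may be absent)
    linker: str = ""             # MR: an empty string models MR being a bond
    cbd_n_terminal: bool = True  # False places the CBD region at the C terminus

    def sequence(self) -> str:
        """Return the full one-letter sequence, read from the N terminus."""
        if self.cbd_n_terminal:
            # A-CBD-MR-X
            return self.extension + self.cbd + self.linker + self.enzyme
        # Assumed mirror layout X-MR-CBD-A, with A as a C-terminal extension
        return self.enzyme + self.linker + self.cbd + self.extension

    def linker_in_preferred_range(self) -> bool:
        """True if MR has 2 to 40 residues (the 'more preferably' range)."""
        return 2 <= len(self.linker) <= 40


if __name__ == "__main__":
    # Toy placeholder segments, chosen only to make the layout visible.
    hybrid = EnzymeHybrid(cbd="WGQCGG", enzyme="ANLNGT", linker="PTSS")
    print(hybrid.sequence())
    print(hybrid.linker_in_preferred_range())
```

Setting `cbd_n_terminal=False` with the same segments gives the enzyme first and the CBD last, which corresponds to the C-terminal positioning mentioned above.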
Cellulases (cellulase genes) useful for preparation of CBDs
Techniques suitable for isolating a cellulase gene are well known in the art. In the present context, the term "cellulase" refers to an enzyme which catalyses the degradation of cellulose to glucose, cellobiose, triose and/or other cello-oligosaccharides.
Preferred cellulases (i.e. cellulases comprising preferred CBDs) in the present context are microbial cellulases, particularly bacterial or fungal cellulases. Endoglucanases (EC 3.2.1.4), particularly mono-component (recombinant) endoglucanases, are a preferred class of cellulases.
Useful examples of bacterial cellulases are cellulases derived from or producible by bacteria from the group consisting of Pseudomonas, Bacillus, Cellulomonas, Clostridium, Microspora, Thermotoga, Caldocellum and Actinomycetes such as Streptomyces, Thermomonospora and Acidothermus, in particular from the group consisting of Pseudomonas cellulolyticus, Bacillus lautus, Bacillus agaradherens, Cellulomonas fimi, Clostridium thermocellum, Clostridium stercorarium, Microspora bispora, Thermomonospora fusca, Thermomonospora cellulolyticum and Acidothermus cellulolyticus. The cellulase may be an acid, a neutral or an alkaline cellulase, i.e. exhibiting maximum cellulolytic activity in the acid, neutral or alkaline range, respectively.
A useful cellulase is an acid cellulase, preferably a fungal acid cellulase, which is derived from or producible by fungi from the group of genera consisting of Trichoderma, Myrothecium, Aspergillus, Phanerochaete, Neurospora, Neocallimastix and Botrytis.
A preferred useful acid cellulase is one derived from or producible by fungi from the group of species consisting of Trichoderma viride, Trichoderma reesei, Trichoderma longibrachiatum, Myrothecium verrucaria, Aspergillus niger, Aspergillus oryzae, Phanerochaete chrysosporium, Neurospora crassa, Neocallimastix patriciarum and Botrytis cinerea.
Another useful cellulase is a neutral or alkaline cellulase, preferably a fungal neutral or alkaline cellulase, which is derived from or producible by fungi from the group of genera consisting of Aspergillus, Penicillium, Myceliophthora, Humicola, Irpex, Fusarium, Stachybotrys, Scopulariopsis, Chaetomium, Mycogone, Verticillium, Myrothecium, Papulospora, Gliocladium, Cephalosporium and Acremonium.
A preferred alkaline cellulase is one derived from or producible by fungi from the group of species consisting of Humicola insolens, Fusarium oxysporum, Myceliophthora thermophila, Penicillium janthinellum and Cephalosporium sp., preferably from the group of species consisting of Humicola insolens DSM 1800, Fusarium oxysporum DSM 2672, Myceliophthora thermophila CBS 117.65, and Cephalosporium sp. RYM-202.
A preferred cellulase is an alkaline endoglucanase which is immunologically reactive with an antibody raised against a highly purified ~43kD endoglucanase derived from Humicola insolens DSM 1800, or which is a derivative of the latter ~43kD endoglucanase and exhibits cellulase activity. Other examples of useful cellulases are variants of parent cellulases of fungal or bacterial origin, e.g. a parent cellulase derivable from a strain of a species within one of the fungal genera Humicola, Trichoderma or Fusarium.
Other proteins (protein genes) useful for preparation of CBDs
Examples of other types of hydrolytic enzymes which comprise a CBD are, as already mentioned, xylanases, mannanases, arabinofuranosidases, acetylesterases and chitinases. As also mentioned previously, CBDs have also been found, for example, in certain algae, e.g. in the red alga Porphyra purpurea in the form of a non-hydrolytic polysaccharide-binding protein. Reference may be made to P. Tomme et al. (op. cit.) for further details concerning sources (organism genera and species) of such CBDs. Further CBDs of interest in relation to the present invention include CBDs deriving from glucoamylases (EC 3.2.1.3) or from CGTases (EC 2.4.1.19).
CBDs deriving from such sources will also generally be suitable for use in the context of the invention. In this connection, techniques suitable for isolating, e.g., xylanase genes, mannanase genes, arabinofuranosidase genes, acetylesterase genes, chitinase genes (and other relevant genes) are well known in the art.
Isolation of a CBD
In order to isolate a cellulose-binding domain of, e.g., a cellulase, several genetic engineering approaches may be used. One method uses restriction enzymes to remove a portion of the gene and then to fuse the remaining gene-vector fragment in frame to obtain a mutated gene that encodes a protein truncated for a particular gene fragment. Another method involves the use of exonucleases such as Bal31 to systematically delete nucleotides either externally from the 5' and the 3' ends of the DNA or internally from a restricted gap within the gene. These gene-deletion methods result in a mutated gene encoding a shortened gene molecule whose expression product may then be evaluated for substrate-binding (e.g. cellulose-binding) ability. Appropriate substrates for evaluating the binding ability include cellulosic materials such as Avicel™ and cotton fibres. Other methods include the use of a selective or specific protease capable of cleaving a CBD, e.g. a terminal CBD, from the remainder of the polypeptide chain of the protein in question.
As already indicated (vide supra), once a nucleotide sequence encoding the substrate-binding (carbohydrate-binding) region has been identified, either as cDNA or chromosomal DNA, it may then be manipulated in a variety of ways to fuse it to a DNA sequence encoding the enzyme of interest. The DNA fragment encoding the carbohydrate-binding amino acid sequence and the DNA encoding the enzyme of interest are then ligated with or without a linker. The resulting ligated DNA may then be manipulated in a variety of ways to achieve expression. Preferred microbial expression hosts include certain Aspergillus species (e.g. A. niger or A. oryzae), Bacillus species, and organisms such as Escherichia coli or Saccharomyces cerevisiae.
Amylolytic enzymes
Amylases (in particular α-amylases) which are appropriate as the basis for CBD/amylase hybrids of the types employed in the context of the present invention include those of bacterial or fungal origin. Chemically or genetically modified mutants of such amylases are included in this connection. Relevant α-amylases include, for example, α-amylases obtainable from Bacillus species, in particular a special strain of B. licheniformis, described in more detail in GB 1296839. Relevant commercially available amylases include Duramyl™, Termamyl™, Fungamyl™ and BAN™ (all available from Novo Nordisk A/S, Bagsvaerd, Denmark), Rapidase™ and Maxamyl P™ (available from Gist-Brocades, Holland), Optitherm™ (available from Solvay), Spezyme AA™ and Spezyme Delta AA™ (available from Genencor), and Keistase™ (available from Daiwa). Other amylases (in particular α-amylases) which are appropriate as the basis for CBD/amylase hybrids of the types employed in the context of the present invention include a hybrid α-amylase consisting of the 1-35 N-terminal amino acids of BAN™ (available from Novo Nordisk) and the 36-483 C-terminal amino acids of Termamyl™ (available from Novo Nordisk) with one or more of the following mutations: H156Y, A181T, N190F, A209V, Q264S; Termamyl™ with one or more of the following mutations: I201E, D207H, E211Q, H205S; or Maxamyl™ (available from Gist-Brocades/Genencor) with one or more of the following mutations: H133Y, N188P,S.
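Mutation codes such as H156Y are conventionally read as original residue, position, replacement residue (here, histidine at position 156 replaced by tyrosine); the text does not spell this out, so that reading is an assumption. The sketch below parses such a code and applies it to a one-letter sequence. The function names are hypothetical, the example sequence is a toy string rather than a real amylase sequence, and combined codes such as "N188P,S" are deliberately rejected instead of being interpreted.

```python
import re
from typing import Tuple

# One original residue, a 1-based position, one replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> Tuple[str, int, str]:
    """Split a code such as 'H156Y' into (original, position, replacement)."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a single substitution: {code!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitution(sequence: str, code: str) -> str:
    """Return a copy of sequence with the substitution applied."""
    original, position, replacement = parse_substitution(code)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != original:
        raise ValueError(f"expected {original} at position {position}, found {found}")
    return sequence[: position - 1] + replacement + sequence[position:]


if __name__ == "__main__":
    print(parse_substitution("H156Y"))
    # Toy four-residue sequence with histidine at position 3.
    print(apply_substitution("MKHA", "H3Y"))
```

Checking the original residue before substituting guards against applying a code to a sequence with a different numbering.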
Starch- or starch-fragment-debranching enzymes
Isoamylases: isoamylases (EC 3.2.1.68) appropriate as the basis for CBD/isoamylase hybrids of the types employed in the context of the present invention include those of bacterial origin. Chemically or genetically modified mutants of such isoamylases are included in this connection. Relevant isoamylases include, for example, isoamylases obtainable from Pseudomonas species (e.g. Pseudomonas sp. SMP1 or P. amyloderamosa SB15), Bacillus species (e.g. B. amyloliquefaciens), Flavobacterium species or Cytophaga (Lysobacter) species.
Pullulanases: pullulanases (EC 3.2.1.41) appropriate as the basis for CBD/pullulanase hybrids of the types employed in the context of the present invention include those of bacterial origin. Chemically or genetically modified mutants of such pullulanases are included in this connection. Relevant pullulanases include, for example, pullulanases obtainable from Bacillus species (e.g. B. acidopullulyticus, such as Promozyme™, from Novo Nordisk A/S).
Plasmids
The preparation of plasmids capable of expressing fusion proteins having amino acid sequences derived from fragments of more than one polypeptide is well known in the art (see, e.g., WO 90/00609 and WO 95/16782). The expression cassette may be included within a replication system for episomal maintenance in an appropriate cellular host or may be provided without a replication system, where it may become integrated into the host genome. The DNA may be introduced into the host in accordance with known techniques such as transformation, microinjection or the like.
Once the fused gene has been introduced into the appropriate host, the host may be grown to express the fused gene. Normally it is desirable additionally to add a signal sequence which provides for secretion of the fused gene. Typical examples of useful fused genes are:
Signal sequence — (pro-peptide) — carbohydrate-binding domain — linker — enzyme of interest, or
Signal sequence — (pro-peptide) — enzyme of interest — linker — carbohydrate-binding domain,
in which the pro-peptide sequence normally contains 5-25 amino acid residues.
The recombinant product may be glycosylated or non-glycosylated.
Determination of α-amylolytic activity (KNU)
The α-amylolytic activity of an enzyme or enzyme hybrid may be determined using potato starch as substrate. This method is based on the break-down (hydrolysis) of modified potato starch, and the reaction is followed by mixing samples of the starch/enzyme or starch/hybrid enzyme solution with an iodine solution. Initially, a blackish-blue colour is formed, but during the break-down of the starch the blue colour becomes weaker and gradually turns to a reddish-brown. The resulting colour is compared with coloured glass calibration standards.
One Kilo Novo α-Amylase Unit (KNU) is defined as the amount of enzyme (enzyme hybrid) which, under standard conditions (i.e. at 37 ± 0.05°C, 0.0003 M Ca2+, pH 5.6), dextrinizes 5.26 g starch dry substance (Merck Amylum solubile).
Test conditions suitable for evaluating the performance of CBD-containing enzyme hybrids in starch processing
Test conditions (e.g. conditions of pH, temperature, calcium concentration etc.) suitable for testing, e.g., CBD/α-amylase, CBD/isoamylase or CBD/pullulanase enzyme hybrids as described herein will suitably be conditions as already described above in connection with industrial starch conversion processes. Assay methods suitable for determining enzymatic activity under various conditions (e.g. pH, temperature, calcium concentration etc., depending on the nature of the enzyme hybrid) are well known in the art for numerous types of enzymes which are appropriate for linkage to a CBD as described herein, and a person of ordinary skill in the art will readily be able to select assay procedures suitable for evaluating the enzymatic performance of enzyme hybrids as employed in the present context.
The invention also relates to an isolated DNA sequence encoding a hybrid enzyme with amylolytic activity comprising:
(a) a DNA sequence encoding an amylolytic activity;
(b) a DNA sequence encoding a CBD; and
(c) a DNA sequence or fragments thereof encoding the linker sequence shown in SEQ ID No. 21.
A common problem with hybrid enzymes comprising an enzyme and a CBD connected via a linker is that they are not very stable, owing to the linker. The inventors have found that when the linker shown in SEQ ID No. 21, or essential parts thereof, is used, the hybrids are very stable.
The isolated DNA sequence of the invention typically encodes an enzyme with amylolytic activity, such as α-amylase activity, in particular a Bacillus α-amylase activity, especially the activity of Termamyl® or a variant thereof, or one of the amylolytic activities mentioned above in the section "Amylolytic enzymes". The CBD may be any CBD, e.g. the CBDs described above in the section "Carbohydrate-binding domains". In a preferred embodiment the CBD is the CBD of the Bacillus agaradherens NCIMB No. 40482 alkaline cellulase Cel5A or the CBD-dimer of Clostridium stercorarium (NCIMB 11754) XynA. In a specific embodiment of the invention the isolated DNA sequence is the Termamyl®-linker-Cel5A-CBD encoded by plasmid pMB492 shown in SEQ ID No. 19.
In a further aspect the invention relates to a DNA construct comprising the isolated DNA sequence of the invention operably linked to one or more control sequences capable of directing the expression of the DNA sequence in a suitable expression host.
The promoter may be any DNA sequence which shows transcriptional activity in the host cell of choice and may be derived from genes encoding proteins either homologous or heterologous to the host cell. Examples of suitable promoters for directing the transcription of the DNA encoding the cellulytic enzyme of the invention in bacterial host cells include the promoter of the Bacillus stearothermophilus maltogenic amylase gene, the Bacillus licheniformis alpha-amylase gene, the Bacillus amyloliquefaciens BAN amylase gene, the Bacillus subtilis alkaline protease gene, or the Bacillus pumilus xylanase or xylosidase gene, the phage Lambda PR or PL promoters, or the E. coli lac, trp or tac promoters.
Examples of suitable promoters for use in yeast host cells include promoters from yeast glycolytic genes (Hitzeman et al. (1980) J. Biol. Chem. 255:12073-12080; Alber and Kawasaki (1982) J. Mol. Appl. Gen. 1:419-434) or alcohol dehydrogenase genes (Young et al. (1982) in Genetic Engineering of Microorganisms for Chemicals (Hollaender et al, eds.), Plenum Press, New York), or the TPI1 (US 4,599,311) or ADH2-4c (Russell et al. (1983) Nature 304:652-654) promoters.
To direct the CBD/enzyme hybrid into the secretory pathway of the host cells, a secretory signal sequence (also known as a leader sequence, prepro sequence or pre sequence) may be provided in the expression vector. The secretory signal sequence is joined to the DNA sequence encoding the enzyme hybrid in the correct reading frame. Secretory signal sequences are commonly positioned 5' to the DNA sequence encoding the amylolytic enzyme. The secretory signal sequence may be that normally associated with the amylolytic enzyme or may be from a gene encoding another secreted protein.
In a preferred embodiment, the expression vector of the invention may comprise a secretory signal sequence substantially identical to the secretory signal encoding sequence of the Bacillus licheniformis α-amylase gene, e.g. as described in WO 86/05812.
Also, measures for amplification of the expression may be taken, e.g. by tandem amplification techniques, involving single or double crossing-over, or by multicopy techniques, e.g. as described in US 4,959,316 or WO 91/09129. Alternatively the expression vector may include a temperature sensitive origin of replication, e.g. as described in EP 283,075.
Procedures for ligating the DNA sequences encoding the cellulytic enzyme, the promoter and optionally the terminator and/or secretory signal sequence, respectively, and for inserting them into suitable vectors containing the information necessary for replication, are well known to persons skilled in the art (cf., for example, Sambrook et al. (1989) supra).
The invention also relates to a recombinant expression vector comprising the DNA construct of the invention, a promoter, and transcriptional and translational stop signals.
It is also an object of the invention to provide a host cell comprising the DNA construct of the invention.
The host cell of the invention, into which the DNA construct or the recombinant expression vector of the invention is to be introduced, may be any cell which is capable of producing the amylolytic enzyme and includes bacteria, yeast, fungi and higher eukaryotic cells.
Examples of bacterial host cells which, on cultivation, are capable of producing the cellulytic enzyme of the invention are gram-positive bacteria such as strains of Bacillus, in particular a strain of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus, B. megatherium, B. pumilus, B. thuringiensis or B. agaradherens, or strains of Streptomyces, in particular a strain of S. lividans or S. murinus, or gram-negative bacteria such as Escherichia coli. The transformation of the bacteria may be effected by protoplast transformation or by using competent cells in a manner known per se (cf. Sambrook et al. (1989) supra). When expressing the CBD/enzyme hybrid in bacteria such as E. coli, the enzyme may be retained in the cytoplasm, typically as insoluble granules (known as inclusion bodies), or may be directed to the periplasmic space by a bacterial secretion sequence. In the former case, the cells are lysed and the granules are recovered and denatured, after which the cellulytic enzyme is refolded by diluting the denaturing agent. In the latter case, the hybrid enzyme may be recovered from the periplasmic space by disrupting the cells, e.g. by sonication or osmotic shock, to release the contents of the periplasmic space and recovering the hybrid enzyme.
The transformed or transfected host cell described above is then cultured in a suitable nutrient medium under conditions permitting the expression of the cellulytic enzyme, after which the resulting cellulytic enzyme is recovered from the culture. The medium used to culture the cells may be any conventional medium suitable for growing the host cells, such as minimal or complex media containing appropriate supplements.
Suitable media are available from commercial suppliers or may be prepared according to published recipes (e.g., in catalogues of the American Type Culture Collection). The cellulytic enzyme produced by the cells may then be recovered from the culture medium by conventional procedures including separating the host cells from the medium by centrifugation or filtration, precipitating the proteinaceous components of the supernatant or filtrate by means of a salt, e.g., ammonium sulphate, purification by a variety of chromatographic procedures, e.g., ion exchange chromatography, gel filtration chromatography, affinity chromatography, or the like, dependent on the type of cellulytic enzyme in question. The present invention also relates to methods for producing a CBD/enzyme hybrid of the present invention comprising (a) cultivating a Bacillus strain to produce a supernatant comprising the polypeptide; and (b) recovering the polypeptide. The present invention also relates to methods for producing a hybrid enzyme of the present invention comprising (a) cultivating a host cell under conditions conducive to expression of the polypeptide; and (b) recovering the polypeptide.
In both methods, the cells are cultivated in a nutrient medium suitable for production of the hybrid enzyme using methods known in the art. For example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art (see, e.g., references for bacteria and yeast; Bennett, J.W. and LaSure, L., eds. (1991) More Gene Manipulations in Fungi, Academic Press, CA). Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it is recovered from cell lysates.
The hybrid enzyme may be detected using methods known in the art that are specific for the hybrid enzymes. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate. For example, an enzyme assay may be used to determine the activity of the enzyme. Procedures for determining amylolytic activity are known in the art and are described below.
The resulting hybrid enzyme may be recovered by methods known in the art. For example, the hybrid enzyme may be recovered from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation. The recovered hybrid enzyme may then be further purified by a variety of chromatographic procedures, e.g. , ion exchange chromatography, gel filtration chromatography, affinity chromatography, or the like.
The hybrid enzyme of the present invention may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing (IEF)), differential solubility (e.g., ammonium sulfate precipitation), or extraction (see, e.g., Protein Purification (Janson and Ryden, eds.), VCH Publishers, New York, 1989).
In a final aspect the invention relates to an isolated and purified CBD/enzyme hybrid encoded by the isolated DNA sequence of the invention, in particular the hybrid shown in SEQ ID No. 20.
MATERIALS AND METHODS
Materials:
Enzymes and enzyme hybrids:
Termamyl®-linker-CBDEGV: Hybrid of Termamyl® and the fungal CBDEGV from Humicola insolens EGV. The construction of the hybrid is described in Example 9.
CBDCenA-Termamyl®: Hybrid of the CBDCenA from Cellulomonas fimi endoglucanase A (CenA) and Termamyl® via a linker. The construction of the hybrid is described in Example 8.
Termamyl® (available from Novo Nordisk A/S)
Plasmids:
pDN1528 (S. Jørgensen et al. (1991) Journal of Bacteriology, vol. 173, No., p. 559-567).
pBluescriptKSII- (Stratagene, USA).
pDN1981 (P.L. Jørgensen, C.K. Hansen, G.B. Poulsen and B. Diderichsen (1990) In vivo genetic engineering: homologous recombination as a tool for plasmid construction, Gene 96:37-41).
pSJ1678: Described in WO 94/19454; pDN1981: Described by Jørgensen et al. (1990) Gene 96:37-41.
Strains: Bacillus AC13 NCIMB 40482 (identical to Bacillus agaradherens DSM 8721) expressing the endoglucanase enzyme encoding DNA sequence of SEQ ID NO: 1. described in Example 1 below
E. coli strain: Cells of E. coli SJ2 (Diderichsen et al. (1990) Cloning of aldB, which encodes alpha-acetolactate decarboxylase, an exoenzyme from Bacillus brevis. J. Bacteriol. 172:4315-4321) were prepared for and transformed by electroporation using a Gene Pulser™ electroporator from BIO-RAD as described by the supplier.
B. subtilis PL2306 was used as the transformation host strain. It is a cellulase-negative strain developed by introducing a disruption in the transcriptional unit of the known Bacillus subtilis cellulase gene in B. subtilis strain DN1885 (Diderichsen, B., Wedsted, U., Hedegaard, L., Jensen, B. R., Sjøholm, C. (1990) Cloning of aldB, which encodes alpha-acetolactate decarboxylase, an exoenzyme from Bacillus brevis. J. Bacteriol. 172:4315-4321). Not only was the cellulase gene of DN1885 disrupted, but two protease-encoding genes were also disrupted, namely the aprE (Stahl, M.L. and E. Ferrari (1984) Replacement of the Bacillus subtilis subtilisin structural gene with an in vitro-derived deletion mutation. J. Bacteriol. 158:411-418) and nprE (Yang, M.Y. et al. (1984) Cloning of the neutral protease gene of Bacillus subtilis and the use of the cloned gene to create an in vitro-derived deletion mutation. J. Bacteriol. 160:16-21) genes.
The disruption was performed essentially as described in Bacillus subtilis and other Gram-Positive Bacteria (A.L. Sonenshein, J.A. Hoch and Richard Losick, eds., American Society for Microbiology, 1993, p. 618).
Bacillus subtilis ToC46 (Diderichsen, B., Wedsted, U., Hedegaard, L., Jensen, B. R., Sjøholm, C. (1990) Cloning of aldB, which encodes alpha-acetolactate decarboxylase, an exoenzyme from Bacillus brevis. J. Bacteriol. 172:4315-4321) was used as a secondary expression host; competent cells were prepared and transformation was performed as described above.
Solutions/Media/Reagents
Waxy maize from Cerestar
Corn Starch Cerestar (89% DS) GL 03406 Batch 624362
TY and LB agar (as described in Ausubel, F. M. et al. (eds.) "Current protocols in Molecular Biology". John Wiley and Sons, 1995) .
SB: 32 g Tryptone, 20 g Yeast Extract, 5 g NaCl and 5 ml 1 N NaOH are mixed in sterile water to a final volume of 1 liter. The solution is sterilised by autoclaving for 20 min at 121 °C.
10% Avicel: 100 g of Avicel (FLUKA, Switzerland) is mixed with sterile water to a final volume of 1 litre, and the 10% Avicel is sterilised by autoclaving for 20 min at 121 °C.
Buffer: 0.05 M potassium phosphate, pH 7.5
Methods
DE determination
DE (dextrose equivalent) is defined as the amount of reducing carbohydrate (measured as dextrose equivalents) in a sample, expressed as w/w% of the total amount of dissolved dry matter. It is measured by the neocuproine assay (Dygert, Li Floridana (1965) Anal. Biochem. No 368). The principle of the neocuproine assay is that CuSO4 is added to the sample, Cu++ is reduced by the reducing sugar, and the neocuproine complex formed is measured at 450 nm.
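The DE definition above reduces to a simple ratio. The following is a minimal sketch, assuming the reducing carbohydrate has already been converted to grams of dextrose equivalents (for example from a neocuproine standard curve); the function name and the example numbers are illustrative assumptions, not values from the disclosure.

```python
def dextrose_equivalent(reducing_sugar_g: float, dry_matter_g: float) -> float:
    """Return DE as w/w% of the dissolved dry matter.

    reducing_sugar_g: reducing carbohydrate, expressed as dextrose (g)
    dry_matter_g: total dissolved dry matter in the same sample (g)
    """
    if dry_matter_g <= 0:
        raise ValueError("dry matter must be positive")
    return 100.0 * reducing_sugar_g / dry_matter_g


# Illustrative numbers only: 1.2 g dextrose equivalents in 10 g dry matter.
example_de = dextrose_equivalent(1.2, 10.0)
```

On this scale pure dextrose has a DE of 100 and unhydrolysed starch a DE close to 0.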
General molecular biology methods:
DNA manipulations and transformations were performed using standard methods of molecular biology (Sambrook et al. (1989) Molecular cloning: A laboratory manual, Cold Spring Harbor lab., Cold Spring Harbor, NY; Ausubel, F. M. et al. (eds.) "Current protocols in Molecular Biology". John Wiley and Sons, 1995; Harwood, C. R. , and Cutting, S. M. (eds.) "Molecular Biological Methods for Bacillus". John Wiley and Sons, 1990).
Enzymes for DNA manipulations were used according to the specifications of the suppliers.
Cellulytic Activity
Cellulytic activity may be measured in cellulase viscosity units (CEVU), determined at pH 9.0 with carboxymethyl cellulose (CMC) as substrate.
Cellulase viscosity units are determined relative to an enzyme standard (< 1% water, kept in N2 atmosphere at -20°C; arch standard at -80°C). The standard used, 17-1187, is 4400 CEVU/g under standard incubation conditions, i.e., pH 9.0, Tris Buffer 0.1 M, CMC Hercules 7 LFD substrate 33.3 g/l, 40.0°C for 30 minutes.
α-amylase-Termamyl Activity
See Novo Nordisk analytical method AF 9/6, available on request.
EXAMPLES
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use various constructs and perform the various methods of the present invention and are not intended to limit the scope of what the inventors regard as their invention. Unless indicated otherwise, parts are parts by weight, temperature is in degrees centigrade, and pressure is at or near atmospheric pressure. Efforts have been made to ensure accuracy with respect to numbers used (e.g., length of DNA sequences, molecular weights, amounts, particular components, etc.), but some deviations should be accounted for.
EXAMPLE 1
Cloning of Bacillus agaradherens Endoglucanase Gene
Genomic DNA Preparation.
The strain NCIMB 40482 (identical to Bacillus agaradherens DSM 8721) was propagated in liquid medium as described in WO 94/01532. After 16 hours of incubation at 30°C and 300 rpm, the cells were harvested, and genomic DNA was isolated by the method described by Pitcher et al. (1989) Lett. Appl. Microbiol. 8:151-156.
Genomic Library Construction.
Genomic DNA was partially digested with restriction enzyme Sau3A and size-fractionated by electrophoresis on a 0.7% agarose gel. Fragments of between 2 and 7 kb in size were isolated by electrophoresis onto DEAE-cellulose paper (Dretzen et al. (1981) Anal. Biochem. 112:295-298). Isolated DNA fragments were ligated to BamHI-digested pSJ1678 plasmid DNA.
PCR Amplification.
In order to obtain the endoglucanase gene as ligated to the pSJ1678 vector, the ligation mixture was used as DNA template in a PCR reaction containing 200 μM of each nucleotide (dATP, dCTP, dGTP and dTTP), 2.5 mM MgCl2, Expand High Fidelity buffer, 2.0 units of Expand High Fidelity PCR system enzyme mix and 300 nM of each of the following primers:
Primer 1 (#9555) :
5'-TCACAGATCCTCGCGAATTGGTGCGGCCGCGTNGTNGARGARCAYGGNC-3' (SEQ ID No. 3).
Primer 1 is a degenerate primer designed to match the amino acid sequence (Val-Val-Glu-Glu-His-Gly-Gln) (SEQ ID No. 4) of the N-terminal amino acid sequence presented in WO 94/01532. The last amino acid is represented only by the first nucleotide of its codon, namely C, which is the 3'-nucleotide of the primer.
Furthermore, a NotI site is included at the 5'-end for cloning purposes; these nucleotides are underlined.
Primer 2 (#9029):
5'-CAGAGCAAGAGATTACGCGC-3' (SEQ ID No. 5).
Primer 2 corresponds to a sequence present in the pSJ1678 vector.
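The match between the degenerate 3' part of Primer 1 and the N-terminal peptide can be checked mechanically. The sketch below assumes the standard genetic code and the IUPAC ambiguity codes R (A/G), Y (C/T) and N (any base); the helper name `encoded_residues` is a hypothetical illustration, not part of the original method.

```python
from itertools import product

# IUPAC codes used in Primer 1 only (R = purine, Y = pyrimidine, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def encoded_residues(degenerate_codon: str) -> set:
    """Return every amino acid encoded by some expansion of the codon."""
    options = [IUPAC[base] for base in degenerate_codon]
    return {CODON_TABLE["".join(codon)] for codon in product(*options)}


# 3' part of Primer 1 downstream of the NotI site, split into codons.
primer_codons = ["GTN", "GTN", "GAR", "GAR", "CAY", "GGN"]
peptide = "VVEEHG"  # Val-Val-Glu-Glu-His-Gly

matches = [encoded_residues(codon) == {residue}
           for codon, residue in zip(primer_codons, peptide)]

# The final C of the primer stands for the first base of the Gln codon.
gln_first_bases = {codon[0] for codon, aa in CODON_TABLE.items() if aa == "Q"}
```

Each degenerate codon expands only to codons for the intended residue, and both glutamine codons begin with C, which is consistent with the primer ending in a single C.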
The PCR cycling was performed in a Hans Landgraf THERMOCYCLER (Hans Landgraf, Germany), following the profile:
1 x (120 seconds at 94°C);
10 x (10 seconds at 94°C; 30 seconds at 55°C; 240 seconds at 72°C);
30 x (10 seconds at 94°C; 30 seconds at 55°C; 180 seconds at 72°C; adding 20 seconds to the keep time at 72°C for each new cycle); and
1 x (300 seconds at 72°C).
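To gauge how long the profile above runs, the hold times can be summed. This is a rough sketch that ignores temperature ramping and assumes the 20-second increment first applies in the second of the 30 cycles; both the function name and that reading of "each new cycle" are assumptions.

```python
def programmed_hold_time(profile):
    """Sum the hold times (in seconds) of a cycling profile.

    profile: list of (cycles, [step seconds, ...], extension_increment);
    the increment is added to the block on every cycle after the first.
    """
    total = 0
    for cycles, steps, increment in profile:
        for cycle_index in range(cycles):
            total += sum(steps) + increment * cycle_index
    return total


profile = [
    (1, [120], 0),            # initial denaturation at 94 °C
    (10, [10, 30, 240], 0),   # 94 °C / 55 °C / 72 °C
    (30, [10, 30, 180], 20),  # extension grows by 20 s per cycle (assumed from cycle 2)
    (1, [300], 0),            # final extension at 72 °C
]
total_seconds = programmed_hold_time(profile)
```

Under these assumptions the programmed holds add up to 18,520 seconds, a little over five hours, most of it spent in the progressively longer extension steps.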
The PCR product was purified by gel electrophoresis in a 0.7% agarose gel, and the relevant fragment (approx. 1.7 kb) was excised from the gel and purified using QIAquick Gel extraction Kit (Qiagen, USA) according to the manufacturer's instructions. The purified DNA was eluted in 50 μl of 10 mM Tris-HCl, pH 8.5.
This DNA was used as a template for a PCR re-amplification using the same primers, mixture and cycle profile as above. The PCR product was purified by gel electrophoresis in a 0.7% agarose gel, and the relevant fragment was excised from the gel and purified using QIAquick Gel extraction Kit. The purified DNA was eluted in 50 μl of 10 mM Tris-HCl, pH 8.5.
The purified DNA was digested with NotI and HindIII, gel purified as above, and ligated to the vector pBluescriptII KS- (Stratagene, USA), also digested with NotI and HindIII, and the ligation mixture was used to transform E. coli SJ2. Cells were plated on LB agar plates containing ampicillin (200 μg/ml) supplemented with X-gal (5-bromo-4-chloro-3-indolyl beta-D-galactopyranoside, 50 μg/ml).
Identification and Characterization of Positive Clones.
The transformed cells were plated on LB agar plates containing ampicillin (200 μg/ml) supplemented with X-gal (5-bromo-4-chloro-3-indolyl beta-D-galactopyranoside, 50 μg/ml), and incubated at 37°C overnight. The next day white colonies were rescued by restreaking them onto fresh LB-ampicillin agar plates and incubating at 37°C overnight. The day after, single colonies of each clone were transferred to liquid LB medium containing ampicillin (200 μg/ml), and incubated overnight at 37°C with shaking at 250 rpm. Plasmids were extracted from the liquid cultures using QIAgen Plasmid Purification mini kit. 5 μl samples of the plasmids were digested with NotI and HindIII. The digestions were checked by gel electrophoresis on a 0.7% agarose gel (NuSieve, FMC). The appearance of a DNA fragment of approximately 1.0 kb indicated a positive clone.
Nucleotide Sequencing the Cloned DNA Fragment.
Qiagen purified plasmid DNA was sequenced with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and the primer "Reverse" or the primer "Forward":
Reverse: 5'-GTTTTCCCAGTCACGAC-3' (SEQ ID No. 6),
Forward: 5'-GCGGATAACAATTTCACACAGG-3' (SEQ ID No. 7).
The DNA was sequenced using an Applied Biosystems 373A automated sequencer according to the manufacturer's instructions. Analysis of the sequence data was performed according to Devereux et al. (1984) Nucleic Acids Res. 12:387-395.
From this sequence new primers could be designed for performing Inverse PCR (cf. McPherson et al. (eds.) in PCR - A practical approach; 1991, IRL Press).
Inverse PCR on Genomic DNA of Strain NCIMB 40482.
Genomic DNA was isolated as described above. 2 μg of pure genomic DNA was digested with EcoRI. The EcoRI was heat inactivated at 65°C for 20 minutes, after which a phenol:chloroform extraction of the DNA was performed. The DNA was finally ethanol precipitated and resuspended in 20 μl TE.
1 μl of EcoRI-digested DNA was ligated with T4 DNA ligase in a 100 μl reaction mixture containing T4 ligase buffer and 1 Unit T4 DNA ligase (Boehringer Mannheim, Germany). After 18 hours of ligation at 14°C, the ligase was heat inactivated at 68°C for 10 minutes. In order to linearize the circularized genomic DNA fragments prior to Inverse PCR, the ligation mixture was supplemented with 10 U of BstEII (a BstEII site was present internally in the DNA sequence obtained above). 50 μl of the BstEII-digested ligation mixture was used as template in a PCR reaction containing 200 μM of each nucleotide (dATP, dCTP, dGTP and dTTP), 2.5 mM MgCl2, Expand High Fidelity buffer, 2.0 units of Expand High Fidelity PCR system enzyme mix, and 300 nM of each of the following primers:
Primer 3 (#19719): 5'-TGACCCGTACGGTCCGTGGG-3' (SEQ ID No. 8), and
Primer 4 (#19720): 5'-GGCTCTTGATTTTGTGTCCACC-3' (SEQ ID No. 9).
The PCR cycling was performed in a Hans Landgraf THERMOCYCLER (Hans Landgraf, Germany), following the profile:
1 x (120 seconds at 94°C);
10 x (10 seconds at 94°C; 30 seconds at 55°C; 240 seconds at 72°C);
30 x (10 seconds at 94°C; 30 seconds at 55°C; 180 seconds at 72°C; adding 20 seconds to the keep time at 72°C for each new cycle); and
1 x (300 seconds at 72°C).
The PCR product was purified by gel electrophoresis in a 0.7% agarose gel, and the relevant fragment (approx. 4-5 kb) was excised from the gel and purified using QIAquick Gel extraction Kit. The purified DNA was eluted in 50 μl of 10 mM Tris-HCl, pH 8.5.
Nucleotide Sequencing the Inverse-PCR DNA Fragment.
Qiagen purified DNA was sequenced with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and primers 1, 3 and 4 described above, using an Applied Biosystems 373A automated sequencer according to the manufacturer's instructions. Analysis of the sequence data was performed according to Devereux et al. (1984) supra. Based upon the obtained sequence, two new primers were designed in order to clone the alkaline endoglucanase presented as SEQ ID No. 12. The primers were #20887 (SEQ ID No. 10) and #100084 (SEQ ID No. 14) as described below.
EXAMPLE 2
Expression of the Alkaline Endoglucanase in Bacillus subtilis
The nucleotide sequence in SEQ ID No. 12 was cloned by PCR for introduction in an expression plasmid pDN1981.
PCR was performed as described below on 500 ng of genomic DNA, using the following two primers containing NdeI and KpnI (the KpnI site is conveniently present in the amplified sequence) restriction sites for introducing the endoglucanase encoding DNA sequence to pDN1981 for expression:
Primer 5 (#20887):
5'-GTA GGC TCA GTC ATA TGT TAC ACA TTG AAA GGG GAG GAG AAT CAT GAA AAA GAT AAC TAC TAT TTT TGT CG-3' (SEQ ID No. 10), and
Primer 7 (#100084):
5'-CCT CGC GAG GTA CCA GCG GCC GCG TAC CAC CAA TTA AGT ATG GTA C-3' (SEQ ID No. 14)
The underlined nucleotides of Primer 5 correspond to the NdeI site, and the underlined nucleotides in Primer 7 are part of the KpnI site present in the sequence.
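Because the underlining is lost in plain text, the restriction sites can be located by scanning the primer sequences for the recognition sequences (NdeI CATATG, KpnI GGTACC, NotI GCGGCCGC). The following is a minimal sketch; the helper name `find_sites` is hypothetical, and only the given strand is searched, which is sufficient here because all three recognition sequences are palindromic.

```python
SITES = {"NdeI": "CATATG", "KpnI": "GGTACC", "NotI": "GCGGCCGC"}


def find_sites(sequence: str, sites: dict = SITES) -> dict:
    """Return 0-based start positions of each recognition site in sequence."""
    seq = sequence.replace(" ", "").upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping hits
        hits[name] = positions
    return hits


# Primer sequences as given in the text (5' to 3').
primer5 = ("GTA GGC TCA GTC ATA TGT TAC ACA TTG AAA GGG GAG GAG AAT CAT GAA "
           "AAA GAT AAC TAC TAT TTT TGT CG")
primer7 = "CCT CGC GAG GTA CCA GCG GCC GCG TAC CAC CAA TTA AGT ATG GTA C"

primer5_hits = find_sites(primer5)
primer7_hits = find_sites(primer7)
```

With these sequences the scan finds one NdeI site in Primer 5 and one KpnI site in Primer 7, immediately followed by a NotI site.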
Using the Expand™ Long Template PCR system (available from Boehringer Mannheim, Germany), amplification was performed using a mixture consisting of Buffer 1 (diluted 10 times), 200 μM of each dNTP, 2.5 units of Enzyme mix (Boehringer Mannheim, Germany) and 500 pmol of each primer.
The PCR reactions were performed using a DNA Thermal Cycler (available from Landgraf, Germany): one incubation at 94°C for 2 minutes, followed by ten cycles of PCR performed using a cycle profile of denaturation at 94°C for 10 seconds, annealing at 55°C for 30 seconds, and extension at 68°C for 4 minutes, followed by 25 cycles of PCR performed using a cycle profile of denaturation at 94°C for 10 seconds, annealing at 55°C for 30 seconds, and extension at 68°C for 3 minutes (the extension time being increased by 20 seconds for each of the 25 cycles).
Aliquots of 10 μl of the amplification product were analysed by electrophoresis in 0.7% agarose gels (NuSieve, FMC) with ReadyLoad 100 bp DNA ladder (GibcoBRL, Denmark) as a size marker.
After PCR cycling, the PCR fragment was purified using QIAquick PCR column Kit (Qiagen, USA) according to the manufacturer's instructions. The purified DNA was eluted in 50 μl of 10 mM Tris-HCl, pH 8.5, digested with NdeI and KpnI, and purified and ligated to digested pDN1981. The ligation mixture was used to transform B. subtilis PL2304.
Competent cells were prepared and transformed as described by Yasbin et al . [ Yasbin R E, Wilson G A & Young F E; Transformation and transfection in lysogenic strains of Bacillus subtilis : evidence for selective induction of prophage in competent cells; J Bacteriol 1975 121 296-304].
Isolation and Test of Bacillus subtilis Transformants
The transformed cells were plated on LB agar plates containing 10 μg/ml Kanamycin, 0.4% glucose, 10 mM KH2PO4 and 0.1% AZCL HE-cellulose (Megazyme, Australia), and incubated at 37°C for 18 hours. Endoglucanase positive colonies were identified as colonies surrounded by a blue halo.
Each of the positive transformants was inoculated in 10 ml TY-medium containing 10 μg/ml Kanamycin. After 1 day of incubation at 37°C with stirring at 250 rpm, 50 μl of supernatant was removed. The endoglucanase activity was identified by adding the 50 μl of supernatant to holes punched in the agar of LB agar plates containing 0.1% AZCL HE-cellulose.
After 16 hours of incubation at 37 °C, blue halos surrounding holes indicated expression of the endoglucanase in Bacillus subtilis .
EXAMPLE 3
Analysis of the Cloned Sequence.
The protein sequence derived from the cloned endoglucanase gene shows an endoglucanase of the following composition:
Amino acid residues 1 to 26 correspond to a signal peptide; amino acid residues 27 to 326 constitute the actual endoglucanase (homologous to other family 5 glycosyl hydrolases); amino acid residues 327 to 354 correspond to a linker; amino acid residues 355 to 400 correspond to a cellulose binding domain (as described in Example 3); amino acid residues 401 to 416 correspond to a linker; and amino acid residues 417 to 462 constitute a second cellulose binding domain (highly homologous to the first one, at amino acid residues 355 to 400).
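The residue ranges above can be written down as a small table and checked for consistency. This is an illustrative sketch only; the region labels and the helper name `region_lengths` are assumptions introduced here, while the boundaries are those stated in the text.

```python
# (label, first residue, last residue), 1-based and inclusive, as listed above.
REGIONS = [
    ("signal peptide", 1, 26),
    ("catalytic domain (family 5)", 27, 326),
    ("linker 1", 327, 354),
    ("CBD 1", 355, 400),
    ("linker 2", 401, 416),
    ("CBD 2", 417, 462),
]


def region_lengths(regions):
    """Return {label: length}, checking that the regions tile the sequence."""
    expected_start = 1
    lengths = {}
    for label, first, last in regions:
        if first != expected_start:
            raise ValueError(f"gap or overlap before {label}")
        lengths[label] = last - first + 1
        expected_start = last + 1
    return lengths


lengths = region_lengths(REGIONS)
total_length = sum(lengths.values())
mature_length = total_length - lengths["signal peptide"]
```

The regions cover residues 1 to 462 without gaps; the two cellulose binding domains are both 46 residues long, and the protein without the signal peptide comprises 436 residues.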
The molar extinction coefficient was determined as 146,370. The molecular weight was approximately 52 kD.
For the protein without the signal sequence the molar extinction coefficient was determined as 146,370. The molecular weight was approximately 49 kD.
The enzyme has no cysteine, and the charged amino acids give a calculated pI of around 4.
EXAMPLE 4
Subcloning of a partial Termamyl® sequence.
The α-amylase gene encoded on pDN1528 was PCR amplified for introduction of a BamHI site in the 3'-end of the coding region. The PCR and the cloning were done as follows. Approximately 10 to 20 ng of plasmid pDN1528 was PCR amplified in HiFidelity PCR buffer (Boehringer Mannheim, Germany) supplemented with 200 μM of each dNTP, 2.6 units of HiFidelity Expand enzyme mix, and 300 pmol of each primer:
Primer 8, #5289
5'-GCT TTA CGC CCG ATT GCT GAC GCT G -3' (SEQ ID No. 15)
Primer 9, #26748
5'-GCG ATG AGA CGC GCG GCC GCC TAT CTT TGA ACA TAA ATT GAA ACG GAT CCG-3' (SEQ ID No. 16)
The BamHI restriction site is underlined.
The PCR reactions was performed using a DNA thermal cycler
15 (Landgraf, Germany) . One incubation at 94°C for 2 minutes, 30 seconds at 60°C and 45 seconds at 72°C followed by ten cycles of PCR performed using a cycle profile of denaturation at 94°C for 30 seconds, annealing at 60°C for 30 seconds, and extension at 72°C for 45 seconds and twenty cycles of denaturation at
20 94°C for 30 seconds, 60 °C for 30 seconds and 72°C for 45 seconds (at this elongation step 20 seconds are added every cycle) . 10 μl aliquots of the amplification product was analysed by electrophoresis in 1.0 % agarose gels (NuSieve, FMC) with ReadyLoad lOObp DNA ladder (GibcoBRL, Denmark) as a
25 size marker.
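The cycling profile above can be laid out as an explicit list of steps, which makes the effect of the growing elongation step easy to see. This is a minimal sketch: the Step class and program_steps function are hypothetical names, and the code assumes that the n-th of the final twenty cycles extends for 45 seconds plus n times 20 seconds, which is one possible reading of "20 seconds are added every cycle".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


def program_steps(extension_increment_s=20):
    """Yield every hold step of the program in order."""
    # Initial incubation: 2 min at 94 C, 30 s at 60 C, 45 s at 72 C.
    yield from (Step(94, 120), Step(60, 30), Step(72, 45))
    # Ten cycles with a fixed profile.
    for _ in range(10):
        yield from (Step(94, 30), Step(60, 30), Step(72, 45))
    # Twenty cycles with a growing elongation step.
    # Assumption: cycle n (n = 1..20) extends for 45 s + n * increment.
    for n in range(1, 21):
        yield from (Step(94, 30), Step(60, 30),
                    Step(72, 45 + n * extension_increment_s))


steps = list(program_steps())
total_s = sum(step.seconds for step in steps)
print(f"{len(steps)} steps, {total_s / 60:.1f} min of programmed hold time")
```

The total counts hold times only; ramping between temperatures depends on the instrument and is not included.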
40 μl aliquots of the PCR product generated as described above were purified using QIAquick PCR purification kit (Qiagen, USA) according to the manufacturer's instructions. The purified DNA was eluted in 50 μl of 10 mM Tris-HCl, pH 8.5. 25 μl of the purified PCR fragment was digested with BamHI and PstI and electrophoresed in 1.0% low gelling temperature agarose (SeaPlaque GTG, FMC) gels; the relevant fragment was excised from the gel and purified using QIAquick Gel extraction Kit (Qiagen, USA) according to the manufacturer's instructions. The isolated DNA fragment was then ligated to BamHI-PstI digested pBluescriptII KS- and the ligation mixture was used to transform E. coli SJ2.
Cells were plated on LB agar plates containing ampicillin (200 μg/ml) and supplemented with X-gal (5-bromo-4-chloro-3-indolyl β-D-galactopyranoside, 50 μg/ml), and incubated at 37°C overnight. The next day white colonies were restreaked onto fresh LB-ampicillin agar plates and incubated at 37°C overnight. The next day single colonies were transferred to liquid LB medium containing ampicillin (200 μg/ml) and incubated overnight at 37°C with shaking at 250 rpm. Plasmids were extracted from the liquid cultures using QIAgen Plasmid Purification mini kit (Qiagen, USA) according to the manufacturer's instructions. 5 μl samples of the plasmids were digested with PstI and BamHI. The digestions were checked by gel electrophoresis on a 1.0% agarose gel (NuSieve, FMC). One positive clone, containing the PstI-BamHI fragment with part of the α-amylase gene, was designated pMB335. This plasmid was then used in the construction of α-amylase-CBD hybrids.
In vitro amplification of the linker and the most C-terminal CBD of Bacillus agaradherens NCIMB No. 40482.
Approximately 100 to 200 ng of chromosomal DNA obtained from Bacillus agaradherens NCIMB No. 40482 (as described in Examples 1 to 3 above) was PCR amplified in HiFidelity PCR buffer (Boehringer Mannheim, Germany) supplemented with 200 μM of each dNTP, 2.6 units of HiFidelity Expand enzyme mix, and 300 pmol of each primer:
Primer 10, #110150A
5'-GCT GCA GGA TCC GTT TCA ATT TAT GTT CAA AGA TCT GAT CCA GAT TCA GGA G -3' (SEQ ID No. 17)
Primer 11, #100084
5'-CCT CGC GAG GTA CCA GCG GCC GCG TAC CAC CAA TTA AGT ATG GTA C-3' (SEQ ID No. 18)
Restriction sites BamHI and NotI are underlined.
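Since the underlining is lost in plain text, the positions of the two restriction sites in Primers 10 and 11 can be recovered by searching for the recognition sequences. The sketch below uses the standard recognition sites of BamHI (GGATCC) and NotI (GCGGCCGC); the function name find_sites is an illustrative choice.

```python
# Recognition sequences of the enzymes named in the text (standard sites).
SITES = {"BamHI": "GGATCC", "NotI": "GCGGCCGC"}

# Primer sequences as printed above; spaces are removed before searching.
PRIMERS = {
    "Primer 10": "GCT GCA GGA TCC GTT TCA ATT TAT GTT CAA AGA TCT "
                 "GAT CCA GAT TCA GGA G",
    "Primer 11": "CCT CGC GAG GTA CCA GCG GCC GCG TAC CAC CAA TTA "
                 "AGT ATG GTA C",
}


def find_sites(primer, sites=SITES):
    """Return {enzyme: [0-based start positions]} for one primer."""
    seq = primer.replace(" ", "").upper()
    hits = {}
    for enzyme, site in sites.items():
        # Both sites are palindromic, so scanning one strand is sufficient.
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq.startswith(site, i)]
        if positions:
            hits[enzyme] = positions
    return hits


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

The same search can be applied to the other primers in this document by adding their sequences to PRIMERS.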
The primers were designed to amplify the linker and the most C-terminal CBD of the endoglucanase encoding gene of Bacillus agaradherens NCIMB No. 40482 (described in the Examples above).
The PCR reaction was performed using a DNA thermal cycler (Landgraf, Germany). One incubation at 94°C for 2 minutes, 30 seconds at 60°C and 45 seconds at 72°C was followed by ten cycles of PCR performed using a cycle profile of denaturation at 94°C for 30 seconds, annealing at 60°C for 30 seconds, and extension at 72°C for 45 seconds, and by twenty cycles of denaturation at 94°C for 30 seconds, 60°C for 30 seconds and 72°C for 45 seconds (at this elongation step 20 seconds are added every cycle). 10 μl aliquots of the amplification product were analysed by electrophoresis in 1.5% agarose gels (NuSieve, FMC) with ReadyLoad 100 bp DNA ladder (GibcoBRL, Denmark) as a size marker.
Cloning by polymerase chain reaction (PCR): Subcloning of PCR fragments.
40 μl aliquots of the PCR products generated as described above were purified using QIAquick PCR purification kit (Qiagen, USA) according to the manufacturer's instructions. The purified DNA was eluted in 50 μl of 10 mM Tris-HCl, pH 8.5. 25 μl of the purified PCR fragment was digested with NotI, partially digested with BamHI, and electrophoresed in 1.5% low gelling temperature agarose (SeaPlaque GTG, FMC) gels; the relevant fragment was excised from the gels and purified using QIAquick Gel extraction Kit (Qiagen, USA) according to the manufacturer's instructions. The isolated DNA fragment was then ligated to BamHI-NotI digested pMB335 and the ligation mixture was used to transform E. coli SJ2.
Identification and characterization of positive clones.
Cells were plated on LB agar plates containing ampicillin (200 μg/ml) and incubated at 37°C overnight. The next day colonies were restreaked onto fresh LB-ampicillin agar plates and incubated at 37°C overnight. The next day single colonies were transferred to liquid LB medium containing ampicillin (200 μg/ml) and incubated overnight at 37°C with shaking at 250 rpm.
Plasmids were extracted from the liquid cultures using QIAgen Plasmid Purification mini kit (Qiagen, USA) according to the manufacturer's instructions. 5 μl samples of the plasmids were digested with BamHI and NotI. The digestions were checked by gel electrophoresis on a 1.5% agarose gel (NuSieve, FMC). The appearance of a DNA fragment of the same size as seen from the PCR amplification indicated a positive clone.
One positive clone, containing the fusion construct of the α-amylase gene and the CBD of Bacillus agaradherens NCIMB No. 40482 alkaline cellulase Cel5A, was designated MBamyC5ANewlink.
Cloning of the fusion construct into a Bacillus based expression vector.
The pDN1528 vector contains the amyL gene of B. licheniformis; this gene is actively expressed in B. subtilis, resulting in the production of active α-amylase appearing in the supernatant. For expression purposes the DNA encoding the fusion protein as constructed above was introduced into pDN1528. This was done by digesting pMBamyC5ANewlink and pDN1528 with SalI-NotI, purifying the fragments and ligating the 4.7 kb pDN1528 SalI-NotI fragment with the 0.5 kb pMBamyC5ANewlink SalI-NotI fragment. This created an in-frame fusion of the hybrid construction with the Termamyl® gene. See sequence for pMB492 (SEQ ID No. 19).
The ligation mixture was used to transform competent cells of PL2306. Cells were plated on LB agar plates containing chloramphenicol (6 μg/ml), 0.4% glucose and 10 mM potassium hydrogen phosphate and incubated at 37°C overnight. The next day colonies were restreaked onto fresh LBPG chloramphenicol agar plates and incubated at 37°C overnight. The next day single colonies of each clone were transferred to liquid LB medium containing chloramphenicol (6 μg/ml) and incubated overnight at 37°C with shaking at 250 rpm.
Plasmids were extracted from the liquid cultures using QIAgen Plasmid Purification mini kit (Qiagen, USA) according to the manufacturer's instructions, except that the resuspension buffer was supplemented with 1 mg/ml of chicken egg white lysozyme (SIGMA, USA) prior to lysing the cells at 37°C for 15 minutes. 5 μl samples of the plasmids were digested with BamHI and NotI. The digestions were checked by gel electrophoresis on a 1.5% agarose gel (NuSieve, FMC). The appearance of a DNA fragment of the same size as seen from the PCR amplification indicated a positive clone. One positive clone was designated MB492.
Expression, secretion and functional analysis of the fusion protein.
The clone MB492 (expressing Termamyl® fused to Bacillus agaradherens Cel5A-linker-CBD) was incubated for 20 hours in SB-medium at 37°C and 250 rpm. 1 ml of cell-free supernatant was mixed with 200 μl of 10% Avicel. The mixture was left for 1 hour of incubation at 0°C. After this binding of CBD to Avicel, the Avicel with bound CBD was spun for 5 minutes at 5000 g. The pellet was resuspended in 100 μl of SDS-PAGE buffer, boiled at 95°C for 5 minutes and spun at 5000 g for 5 minutes, and 25 μl was loaded on a 4-20% Laemmli Tris-Glycine SDS-PAGE NOVEX gel (Novex, USA). The samples were electrophoresed in an Xcell Mini-Cell (NOVEX, USA) as recommended by the manufacturer; all subsequent handling of gels, including staining with Coomassie, destaining and drying, was performed as described by the manufacturer. The appearance of a protein band of approx. 60 kDa indicated expression in B. subtilis of the Termamyl®-linker-CBD fusion encoded on the plasmid pMB492 (SEQ ID No. 19). The expressed protein sequence of the fusion construct of pMB492 is shown in SEQ ID No. 20.
The linker region of interest as described in this example is the specific sequence:
SDPDSGEPDPTPPSDPG (SEQ ID No. 21)
EXAMPLE 5
Isolation of genomic DNA from Clostridium stercorarium NCIMB 11754.
Clostridium stercorarium NCIMB 11754 was grown anaerobically at 60°C in specified media as recommended by The National Collections of Industrial and Marine Bacteria Ltd. (Scotland) . Cells were harvested by centrifugation.
Genomic DNA was isolated as described by Pitcher et al. (Pitcher, D. G., Saunders, N. A., Owen, R. J. (1989). Rapid extraction of bacterial genomic DNA with guanidium thiocyanate. Lett. Appl. Microbiol., 8, 151-156).
In vitro amplification of the CBD-dimer of Clostridium stercorarium (NCIMB 11754) XynA.
Approximately 100 to 200 ng of genomic DNA (isolated as described above) was PCR amplified in HiFidelity PCR buffer (Boehringer Mannheim, Germany) supplemented with 200 μM of each dNTP, 2.6 units of HiFidelity Expand enzyme mix, and 300 pmol of each primer:
Primer 12, #114135
5'-GCT GCA GGA TCC GTT TCA ATT TAT GTT CAA AGA TCT CCA ACT CCT GCC CCA TCT CAA AGC-3' (SEQ ID No. 22)
Primer 13, #110151
5'-GCG ATG AGA CGC GCG GCC GCT ACT ACC AGT CAA CAT TAA CAG GAC CTG AG -3' (SEQ ID No. 23)
Restriction sites BamHI and NotI are underlined.
The primers were designed to amplify the DNA encoding the cellulose binding domain of the XynA encoding gene of Clostridium stercorarium (NCIMB 11754); the DNA sequence was extracted from the GenBank database under the accession number D13325.
The PCR reaction was performed using a DNA thermal cycler (Landgraf, Germany). One incubation at 94°C for 2 minutes, 30 seconds at 60°C and 45 seconds at 72°C was followed by ten cycles of PCR performed using a cycle profile of denaturation at 94°C for 30 seconds, annealing at 60°C for 30 seconds, and extension at 72°C for 45 seconds, and by twenty cycles of denaturation at 94°C for 30 seconds, 60°C for 30 seconds and 72°C for 45 seconds (at this elongation step 20 seconds are added every cycle). 10 μl aliquots of the amplification product were analyzed by electrophoresis in 1.0% agarose gels (NuSieve, FMC) with ReadyLoad 100 bp DNA ladder (GibcoBRL, Denmark) as a size marker.
Cloning by polymerase chain reaction (PCR) : Subcloning of PCR fragments.
40 μl aliquots of the PCR products generated as described above are purified using QIAquick PCR purification kit (Qiagen, USA) according to the manufacturer's instructions. The purified DNA is eluted in 50 μl of 10 mM Tris-HCl, pH 8.5. 25 μl of the purified PCR fragment is digested with BamHI and EagI and electrophoresed in 1.0% low gelling temperature agarose (SeaPlaque GTG, FMC) gels; the relevant fragment is excised from the gels and purified using QIAquick Gel extraction Kit (Qiagen, USA) according to the manufacturer's instructions. The isolated DNA fragment is then ligated to BamHI-NotI digested pMB335 and the ligation mixture is used to transform E. coli SJ2.
The following steps were then performed as described above:
-Identification and characterisation of positive clones.
-Cloning of the fusion construct into a Bacillus based expression vector.
-Expression, secretion and functional analysis of the fusion protein.
The appearance of a protein band of approximately 87 kDa on the Coomassie stained SDS-PAGE gel shows positive expression of the hybrid in Bacillus subtilis.
The resulting hybrid is thus expressed in Bacillus subtilis clone MBXynCBD2 and is encoded by the DNA sequence SEQ ID No. 24, which can be translated to the protein sequence shown in SEQ ID No. 25.
EXAMPLE 6
CBDCel5A-linker-Termamyl® starch processing
It is investigated whether or not CBDCel5A-linker-Termamyl® (i.e. Bacillus agaradherens NCIMB 40482 endoglucanase C-terminal CBD linked to Termamyl® via the linker shown in SEQ ID No. 21, constructed as described in Example 4) gives an improved liquefaction of starch per μg enzyme protein/g dry substance compared to Termamyl® at pH 6.0 and 40 ppm Ca2+. A shaking oil bath is heated to 105°C. Two starch slurries (30% DS with 40 ppm Ca2+) are prepared, and the pH is adjusted to 6.0 with NaOH. CBDCel5A-linker-Termamyl® and Termamyl®, respectively, are well mixed into the slurries.
From each slurry four portions of 10 g each are taken. Each portion is placed in an Erlenmeyer flask with screw cap. The flasks are placed in the oil bath for 8 minutes at 105°C and then 90 minutes at 95°C.
After 7 minutes and 45 seconds in the oil bath, the thermostat of the oil bath is adjusted to 95.4°C and 2 litres of oil at room temperature are added to the oil bath. A clock is started and samples (1 flask of each slurry) are taken after 20, 40, 60, and 90 minutes. 2 drops of 1 N HCl are added to each flask to inactivate the amylase.
The DE-value is then determined as a function of time to compare the starch liquefaction per μg enzyme/g DS of CBDCel5A-linker-Termamyl® with Termamyl®.
EXAMPLE 7
Construction of the CBDCenA expression vector pCBDT001.
The gene fragment encoding the 103 residue CBDCenA from Cellulomonas fimi endoglucanase A (CenA) was cloned in the high expression vector pTugE07K3. Appropriate restriction sites were introduced at the 5' and 3' ends of the CBDCenA gene by PCR. Each PCR mixture (50 ml total volume) contained 25 ng template DNA (pTZ18R-1.6cenA; Damude 1995 Doctoral thesis, University of British Columbia, Canada), 25-50 pmole primers (5'SAENH and 3'SAENH), 10% dimethyl sulfoxide, 0.4 mM 2'-deoxynucleotide 5'-triphosphates, and 1 U Vent DNA polymerase in "Thermopol" buffer (New England BioLabs). Twenty successive cycles of denaturation at 94°C for 30 seconds, followed by annealing at 55°C for 30 seconds, and primer extension at 72°C for 54 seconds were performed. A SpeI site (underlined) was introduced at the 5' end of the CBDCenA gene fragment, using the oligonucleotide (5'SAENH)
Primer 14
5'-AGGTCTACTAGTCCCGGCTGCCGCGTCGAC-3' (SEQ ID No. 27)
as primer. EcoRI (underlined), NheI (in bold) and HindIII (in italics) restriction sites were introduced at the 3' end of the CBDCenA sequence using the oligonucleotide (3'SAENH)
Primer 15
5'-CCGATTAAAGCTTATTAGCTAGCACGGAATTCCGTGGGGCTGGTCGTCGGCAC-3' (SEQ ID No. 28)
as primer. The resulting 0.38 kb PCR fragment was digested with SpeI and HindIII and ligated in frame with the Cex leader peptide at the NheI-HindIII site of pTugE07K3, previously cut with NheI and HindIII to remove the CBDCex gene fragment. The final construct pCBDT001 was verified by restriction and PCR analysis.
2. Construction of the CBD-Termamyl® hybrid expression vector pNAMK 1.0.
The plasmid pSJ3368, a derivative of pDN1528 (S. Jørgensen et al. (1991) Journal of Bacteriology, vol. 173, No., p. 559-567) containing the Termamyl® gene, was isolated from Bacillus by standard methods. Appropriate restriction sites for recloning the Termamyl® gene fragment in the E. coli vector pCBDT001 and for the construction of the hybrids were introduced by PCR. Each PCR reaction mixture (50 ml total volume) contained 15 ng template DNA (pSJ3368), 3 pmol primers (PAM1 and PAM2), 2 mM MgSO4, 10% dimethyl sulfoxide, 0.4 mM 2'-deoxynucleotide 5'-triphosphates and 1 U Vent DNA polymerase in "Thermopol" buffer (New England BioLabs). Thirty successive cycles were performed as follows: denaturation at 95°C for 1 min, annealing at 55°C for 1 min and primer extension at 72°C for 1.54 min. An NheI (underlined) and an NcoI site were introduced at the 5' end of the gene with the oligonucleotide (PAM1)
Primer 16
5'-TCATGAGCCATGGCTAGCGCAAATCTTAATGGGACGCTGATG-3' (SEQ ID No. 29)
as primer. An SpeI (in bold) and a HindIII site (underlined) were introduced at the 3' end of the Termamyl® gene using the oligonucleotide (PAM2)
Primer 17
5'-ATGACTAAGCTTACTTACTTAGTGATGGTGATGGTGATGACTAGTTCTTTGAACATAAATTGAAACCGA-3' (SEQ ID No. 30)
as primer. This also introduced a His6-tag (in italics) for easy purification of the hybrid protein by immobilized metal affinity chromatography (IMAC), and a stop codon immediately preceding the HindIII restriction sequence. The resulting 1.5 kb fragment was digested with NheI and HindIII and cloned in frame with the CBDCenA at the NheI-HindIII site of pCBDT001 to give pNAMK 1.0. The construct was verified by restriction digestion with NheI and HindIII and by automated sequencing.
CBDCenA-PTPTTP-Termamyl® production and purification
Overnight cultures of E. coli JM101, harboring plasmid pNAM1.0, were diluted 500-fold in terrific broth (TB; 12 g tryptone, 24 g yeast extract, 9.8 g K2HPO4, 2.2 g KH2PO4 and 8 g (10 ml) glycerol in 1 l) (Sambrook et al., 1989) (ref: Sambrook, J., Fritsch, E.F., & Maniatis, T. (1989) Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) supplemented with 1.25 mM CaCl2 and 100 mg kanamycin per ml and grown at 30°C to an A600 of 3.0-5.0. Protein production was induced by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.1 mM. The cultures were incubated for an additional 18 hours at 30°C, by which time the CBD-Termamyl® hybrid had leaked into the culture medium. Cells were removed by centrifugation at 4°C for 10 minutes at 13,000 x g. The protein was precipitated from the clarified supernatant with 70% (NH4)2SO4 with stirring overnight at 4°C. Proteins were recovered by centrifugation at 11,000 x g and the pellet was dissolved in 20 mM Tris-HCl, pH 8.0 (binding buffer). After further centrifugation at 15,000 x g, the clarified supernatant was loaded onto a Ni2+ agarose column (Novagen, Markham, ON). The column was washed with 40 mM imidazole, 200 mM NaCl, 20 mM Tris-HCl, pH 8.0 (wash buffer). Bound proteins were eluted with a gradient of imidazole (0-500 mM) in 20 mM Tris-HCl buffer containing 500 mM NaCl. CaCl2 was immediately added to the fractions to a final concentration of 1 mM to stabilize the protein. Fractions were analysed on SDS-PAGE (12%) and by activity measurements. The NAM1.0 nucleotide sequence is shown in SEQ ID No. 31 and can be translated into the amino acid sequence shown in SEQ ID No. 32.
EXAMPLE 8
Termamyl® linker fungal CBD from Humicola insolens EGV. pNAMK6.1 (Termamyl®-linker-CBDEGV)
The Termamyl® vector NAM 2.0 for C-terminal CBD:
Each PCR reaction mixture (50 ml total volume) contained 15 ng template DNA (pSJ3368), 3 pmol primers (5Term2 and 3Term2), 2 mM MgSO4, 10% dimethyl sulfoxide, 0.4 mM 2'-deoxynucleotide 5'-triphosphates and 1 U Vent DNA polymerase in "Thermopol" buffer (New England BioLabs). Thirty successive cycles were performed as follows: denaturation at 95°C for 1 min, annealing at 55°C for 1 min and primer extension at 72°C for 1.54 min. NheI (underlined) and EcoRI (in bold) sites were introduced at the 5' end of the Termamyl® gene with the oligonucleotide (5Term2)
Primer 18
5'-CATATGGCTAGCGAATTCGCAAATCTTAATGGGACGCTG-3' (SEQ ID No. 33)
as primer. StuI (underlined), SpeI (in bold) and HindIII sites (in italics) were introduced at the 3' end of the Termamyl® gene using the oligonucleotide (3Term2)
Primer 19
5'-AAGCTTACTAGTAGGCCTTCTTTGAACATAAATTGAAA-3' (SEQ ID No. 34)
as primer. The construct was verified by restriction digestion and by automated sequencing.
The fungal CBD vector: pCBDT006 was obtained by cloning the gene fragment encoding CBDEGV from Humicola insolens endoglucanase V (WO 91/17243) in pTugE07K3. Appropriate restriction sites were introduced at the 5' and 3' ends of the CBDEGV gene by PCR. Each PCR mixture (50 ml total volume) contained 25 ng template DNA, 25-50 pmole primers (N137 and NlPTcs), 10% dimethyl sulfoxide, 0.4 mM 2'-deoxynucleotide 5'-triphosphates, and 1 U Vent DNA polymerase in "Thermopol" buffer (New England BioLabs). Twenty successive cycles of denaturation at 96°C for 45 seconds, followed by annealing at 50°C for 60 seconds, and primer extension at 72°C for 35 seconds were performed. The last cycle was followed by extension at 72°C for 90 seconds.
NheI (underlined), EcoRI (in bold, underlined) and StuI (in bold) restriction sites were introduced before the artificial linker (in small letters, italics), and SpeI (in italics, underlined) and Eco47III (in small, bold) sites were introduced after the linker at the 5' end of the CBDEGV sequence using the oligonucleotide (5CBDT6)
Primer 20
5'-CCATGGGCTAGCCCTGAATTCAGGCCTccaacccccACTAGTcCGagcgctCCCAGCGGCTGCACTGCTG-3' (SEQ ID No. 35)
as primer. A HindIII (underlined) restriction site was introduced at the 3' end of the CBDEGV sequence using the oligonucleotide (3CBDT6)
Primer 21
5'-AGCCTAAGCTTACAGGCACTGATGGTACCAGT-3' (SEQ ID No. 36)
as primer. The resulting 0.18 kb PCR fragment was digested with NheI and HindIII and ligated in frame with the Cex leader peptide at the NheI-HindIII site of pTugE07K3, previously cut with NheI and HindIII to remove the CBDCex gene fragment. The final construct pCBDT006 was verified by restriction and PCR analysis.
Construction of the hybrid NAMK6.1 (Termamyl®-linker-CBDEGV)
The Termamyl® vector NAM2.0 was digested with NheI and StuI and the resulting 1.48 kb fragment was gel purified using the Gene Clean (Bio101) kit and ligated in frame with the CBDEGV encoding fragment in pCBDT006, previously cut with NheI and StuI, to give pNAMK6.1.
The product has the following characterization: MW 60863; total 537 amino acid residues. First comes the Termamyl® catalytic amylase, then the linker, in one letter code RPPTPTSPSAPS (SEQ ID No. 37), and finally 38 residues from the fungal CBD. The complete nucleotide sequence for pNAMK6.1 (pTugK with Termamyl®-CBDEGV insert) is shown in SEQ ID No. 26.
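The linker sequence RPPTPTSPSAPS can be cross-checked against Primer 20 by translating the primer from the StuI site (AGGCCT) onward with the standard genetic code. In the sketch below, the two OCR-damaged runs of the primer are read as ACTAGT (SpeI) and agcgct (Eco47III); that reading is an assumption, and the function name translate is an illustrative choice.

```python
# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring a trailing partial codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))


# Linker-encoding stretch of Primer 20, starting at the StuI site.
# Assumption: the damaged characters are read as ACTAGT and agcgct.
LINKER_DNA = "AGGCCT" "ccaaccccc" "ACTAGT" "cCG" "agcgct" "CCCAGC"

print(translate(LINKER_DNA))
```

Agreement between the translation and SEQ ID No. 37 supports the reading of the damaged primer characters, although the sequence listing itself remains the authoritative source.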
EXAMPLE 9
Termamyl®-linker-CBDEGV starch processing
It was investigated whether or not the Termamyl®-linker-CBDEGV (Termamyl® linker fungal CBD from Humicola insolens EGV constructed as described in Example 8 above) gives a better liquefaction of starch per μg enzyme protein/g dry substance compared to Termamyl® at pH 6.0 and 40 ppm Ca2+.
A shaking oil bath was heated to 105°C. Three starch slurries (30% DS with 40 ppm Ca++ ) were prepared, the pH was adjusted to 6.0 with NaOH. The enzyme was well mixed into the slurries according to the scheme:
Slurry 1: Termamyl®-linker-CBDEGV 10.9 μg/g DS starch
Slurry 2: Termamyl®-linker-CBDEGV 8.72 μg/g DS starch
Slurry 3: Termamyl® 10.9 μg/g DS starch
From each slurry four portions of 10 g each were taken. Each portion was placed in an Erlenmeyer flask with screw cap. The flasks were placed in the oil bath for 8 minutes at 105°C and then 90 minutes at 95°C.
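The dosing scheme translates directly into an amount of enzyme protein per flask: a 10 g portion of a 30% DS slurry holds 3 g of dry substance. The following sketch works through that arithmetic; the dictionary keys and the function name enzyme_per_flask_ug are illustrative labels, and the figures are those stated in the text.

```python
import math

DS_FRACTION = 0.30   # 30% dry substance (DS) in each slurry
PORTION_G = 10.0     # grams of slurry per flask

# Doses in micrograms of enzyme protein per gram of DS, as listed in the scheme.
DOSES_UG_PER_G_DS = {
    "slurry 1 (hybrid)": 10.9,
    "slurry 2 (hybrid)": 8.72,
    "slurry 3 (Termamyl)": 10.9,
}


def enzyme_per_flask_ug(dose_ug_per_g_ds):
    """Micrograms of enzyme protein in one 10 g portion."""
    return dose_ug_per_g_ds * PORTION_G * DS_FRACTION


per_flask = {name: enzyme_per_flask_ug(dose)
             for name, dose in DOSES_UG_PER_G_DS.items()}
for name, ug in per_flask.items():
    print(f"{name}: {ug:.2f} ug enzyme protein per flask")

# Slurry 2 receives the hybrid at a reduced dose relative to slurry 1.
relative_dose = (DOSES_UG_PER_G_DS["slurry 2 (hybrid)"]
                 / DOSES_UG_PER_G_DS["slurry 1 (hybrid)"])
print(f"slurry 2 dose relative to slurry 1: {relative_dose:.0%}")
```

Slurry 2 therefore tests the hybrid at a lower protein dose than the Termamyl® control in slurry 3.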
After 7 minutes and 45 seconds in the oil bath, the thermostat of the oil bath was adjusted to 95.4°C and 2 litres of oil at room temperature were added to the oil bath. A clock was started and samples (1 flask of each slurry) were taken after 20, 40, 60, and 90 minutes. 2 drops of 1 N HCl were added to each flask to inactivate the amylase.
[Table not reproduced in this text; original image reference imgf000046_0001]
As can be seen from the Table above, the Termamyl®-linker-CBDEGV gives an improved liquefaction per μg enzyme/g DS compared to Termamyl®.
EXAMPLE 10
CBDCenA-Termamyl® starch processing
It was investigated whether or not CBDCenA-Termamyl® (Cellulomonas fimi endoglucanase A CBD and Termamyl® joined via a linker as described in Example 7 above) gives an improved liquefaction of starch per activity unit/g dry substance compared to Termamyl® at pH 6.0 and 40 ppm Ca2+.
A shaking oil bath was heated to 105°C. Two starch slurries (30% DS with 40 ppm Ca++) were prepared, the pH was adjusted to 6.0 with NaOH. The enzyme was well mixed to the slurries according to the scheme:
Slurry 1: CBDCenA-Termamyl® 75 NU/g DS starch
Slurry 2: Termamyl® 75 NU/g DS starch
From each slurry four portions of 10 g each were taken. Each portion was placed in an Erlenmeyer flask with screw cap.
The flasks were placed in the oil bath for 8 minutes at 105°C and then 90 minutes at 95°C.
After 7 minutes and 45 seconds in the oil bath, the thermostat of the oil bath was adjusted to 95.4°C and 2 litres of oil at room temperature were added to the oil bath. A clock was started and samples (1 flask of each slurry) were taken after 20, 40, 60, and 90 minutes. 2 drops of 1 N HCl were added to each flask to inactivate the amylase.
DE-determinations as function of time:
[Table not reproduced in this text; original image reference imgf000047_0001]
As can be seen from the Table above, CBDCenA-Termamyl® gives a better liquefaction per activity unit/g DS compared to Termamyl®.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Novo Nordisk A/S
(B) STREET: Novo Alle
(C) CITY: Bagsvaerd
(E) COUNTRY: Denmark
(F) POSTAL CODE (ZIP): DK-2880
(G) TELEPHONE: +45 4444 8888
(H) TELEFAX: +45 4449 3256
(ii) TITLE OF INVENTION: Hybrid enzymes/Starch processing
(iii) NUMBER OF SEQUENCES: 37
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1203 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Bacillus agaradherens
(B) STRAIN: AC13
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1203
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATG AAA AAG ATA ACT ACT ATT TTT GTC GTA TTG CTT ATG ACA GTG GCG 48 Met Lys Lys Ile Thr Thr Ile Phe Val Val Leu Leu Met Thr Val Ala 1 5 10 15
TTG TTC AGT ATA GGA AAC ACG ACT GCT GCT GAT AAT GAT TCA GTT GTA 96 Leu Phe Ser Ile Gly Asn Thr Thr Ala Ala Asp Asn Asp Ser Val Val 20 25 30
GAA GAA CAT GGG CAA TTA AGT ATT AGT AAC GGT GAA TTA GTC AAT GAA 144 Glu Glu His Gly Gln Leu Ser Ile Ser Asn Gly Glu Leu Val Asn Glu 35 40 45
CGA GGC GAA CAA GTT CAG TTA AAA GGG ATG AGT TCC CAT GGT TTG CAA 192 Arg Gly Glu Gln Val Gln Leu Lys Gly Met Ser Ser His Gly Leu Gln 50 55 60
TGG TAC GGT CAA TTT GTA AAC TAT GAA AGT ATG AAA TGG CTA AGA GAT 240 Trp Tyr Gly Gln Phe Val Asn Tyr Glu Ser Met Lys Trp Leu Arg Asp 65 70 75 80
GAT TGG GGA ATA AAT GTA TTC CGA GCA GCA ATG TAT ACC TCT TCA GGA 288 Asp Trp Gly Ile Asn Val Phe Arg Ala Ala Met Tyr Thr Ser Ser Gly 85 90 95
GGA TAT ATT GAT GAT CCA TCA GTA AAG GAA AAA GTA AAA GAG GCT GTT 336 Gly Tyr Ile Asp Asp Pro Ser Val Lys Glu Lys Val Lys Glu Ala Val 100 105 110
GAA GCT GCG ATA GAC CTT GAT ATA TAT GTG ATC ATT GAT TGG CAT ATC 384 Glu Ala Ala Ile Asp Leu Asp Ile Tyr Val Ile Ile Asp Trp His Ile 115 120 125
CTT TCA GAC AAT GAC CCA AAT ATA TAT AAA GAA GAA GCG AAG GAT TTC 432 Leu Ser Asp Asn Asp Pro Asn Ile Tyr Lys Glu Glu Ala Lys Asp Phe 130 135 140
TTT GAT GAA ATG TCA GAG TTG TAT GGA GAC TAT CCG AAT GTG ATA TAC 480 Phe Asp Glu Met Ser Glu Leu Tyr Gly Asp Tyr Pro Asn Val Ile Tyr 145 150 155 160
GAA ATT GCA AAT GAA CCG AAT GGT AGT GAT GTT ACG TGG GGC AAT CAA 528 Glu Ile Ala Asn Glu Pro Asn Gly Ser Asp Val Thr Trp Gly Asn Gln 165 170 175
ATA AAA CCG TAT GCA GAG GAA GTC ATT CCG ATT ATT CGT AAC AAT GAC 576 Ile Lys Pro Tyr Ala Glu Glu Val Ile Pro Ile Ile Arg Asn Asn Asp 180 185 190
CCT AAT AAC ATT ATT ATT GTA GGT ACA GGT ACA TGG AGT CAG GAT GTC 624 Pro Asn Asn Ile Ile Ile Val Gly Thr Gly Thr Trp Ser Gln Asp Val 195 200 205
CAT CAT GCA GCT GAT AAT CAG CTT GCA GAT CCT AAC GTC ATG TAT GCA 672 His His Ala Ala Asp Asn Gln Leu Ala Asp Pro Asn Val Met Tyr Ala 210 215 220
TTT CAT TTT TAT GCA GGG ACA CAT GGT CAA AAT TTA CGA GAC CAA GTA 720 Phe His Phe Tyr Ala Gly Thr His Gly Gln Asn Leu Arg Asp Gln Val 225 230 235 240
GAT TAT GCA TTA GAT CAA GGA GCA GCG ATA TTT GTT AGT GAA TGG GGA 768 Asp Tyr Ala Leu Asp Gln Gly Ala Ala Ile Phe Val Ser Glu Trp Gly 245 250 255
ACA AGT GCA GCT ACA GGT GAT GGT GGC GTG TTT TTA GAT GAA GCA CAA 816 Thr Ser Ala Ala Thr Gly Asp Gly Gly Val Phe Leu Asp Glu Ala Gln 260 265 270
GTG TGG ATT GAC TTT ATG GAT GAA AGA AAT TTA AGC TGG GCC AAC TGG 864 Val Trp Ile Asp Phe Met Asp Glu Arg Asn Leu Ser Trp Ala Asn Trp 275 280 285
TCT CTA ACG CAT AAA GAT GAG TCA TCT GCA GCG TTA ATG CCA GGT GCA 912 Ser Leu Thr His Lys Asp Glu Ser Ser Ala Ala Leu Met Pro Gly Ala 290 295 300
AAT CCA ACT GGT GGT TGG ACA GAG GCT GAA CTA TCT CCA TCT GGT ACA 960 Asn Pro Thr Gly Gly Trp Thr Glu Ala Glu Leu Ser Pro Ser Gly Thr 305 310 315 320
TTT GTG AGG GAA AAA ATA AGA GAA TCA GCA TCT ATT CCG CCA AGC GAT 1008 Phe Val Arg Glu Lys Ile Arg Glu Ser Ala Ser Ile Pro Pro Ser Asp 325 330 335
CCA ACA CCG CCA TCT GAT CCA GGA GAA CCG GAT CCA ACG CCC CCA AGT 1056 Pro Thr Pro Pro Ser Asp Pro Gly Glu Pro Asp Pro Thr Pro Pro Ser 340 345 350
GAT CCA GGA GAG TAT CCA GCA TGG GAT CCA AAT CAA ATT TAC ACA AAT 1104 Asp Pro Gly Glu Tyr Pro Ala Trp Asp Pro Asn Gln Ile Tyr Thr Asn 355 360 365
GAA ATT GTG TAC CAT AAC GGC CAG CTA TGG CAA GCA AAA TGG TGG ACA 1152 Glu Ile Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr 370 375 380
CAA AAT CAA GAG CCA GGT GAC CCG TAC GGT CCG TGG GAA CCA CTC AAT 1200 Gln Asn Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Asn 385 390 395 400
TAA 1203
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 400 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Lys Lys Ile Thr Thr Ile Phe Val Val Leu Leu Met Thr Val Ala 1 5 10 15
Leu Phe Ser Ile Gly Asn Thr Thr Ala Ala Asp Asn Asp Ser Val Val 20 25 30
Glu Glu His Gly Gln Leu Ser Ile Ser Asn Gly Glu Leu Val Asn Glu 35 40 45
Arg Gly Glu Gln Val Gln Leu Lys Gly Met Ser Ser His Gly Leu Gln 50 55 60
Trp Tyr Gly Gln Phe Val Asn Tyr Glu Ser Met Lys Trp Leu Arg Asp 65 70 75 80
Asp Trp Gly Ile Asn Val Phe Arg Ala Ala Met Tyr Thr Ser Ser Gly 85 90 95
Gly Tyr Ile Asp Asp Pro Ser Val Lys Glu Lys Val Lys Glu Ala Val 100 105 110
Glu Ala Ala Ile Asp Leu Asp Ile Tyr Val Ile Ile Asp Trp His Ile 115 120 125
Leu Ser Asp Asn Asp Pro Asn Ile Tyr Lys Glu Glu Ala Lys Asp Phe 130 135 140
Phe Asp Glu Met Ser Glu Leu Tyr Gly Asp Tyr Pro Asn Val Ile Tyr 145 150 155 160
Glu Ile Ala Asn Glu Pro Asn Gly Ser Asp Val Thr Trp Gly Asn Gln 165 170 175
Ile Lys Pro Tyr Ala Glu Glu Val Ile Pro Ile Ile Arg Asn Asn Asp 180 185 190
Pro Asn Asn Ile Ile Ile Val Gly Thr Gly Thr Trp Ser Gln Asp Val 195 200 205
His His Ala Ala Asp Asn Gln Leu Ala Asp Pro Asn Val Met Tyr Ala 210 215 220
Phe His Phe Tyr Ala Gly Thr His Gly Gln Asn Leu Arg Asp Gln Val 225 230 235 240
Asp Tyr Ala Leu Asp Gln Gly Ala Ala Ile Phe Val Ser Glu Trp Gly 245 250 255
Thr Ser Ala Ala Thr Gly Asp Gly Gly Val Phe Leu Asp Glu Ala Gln 260 265 270
Val Trp Ile Asp Phe Met Asp Glu Arg Asn Leu Ser Trp Ala Asn Trp 275 280 285
Ser Leu Thr His Lys Asp Glu Ser Ser Ala Ala Leu Met Pro Gly Ala 290 295 300
Asn Pro Thr Gly Gly Trp Thr Glu Ala Glu Leu Ser Pro Ser Gly Thr 305 310 315 320
Phe Val Arg Glu Lys Ile Arg Glu Ser Ala Ser Ile Pro Pro Ser Asp 325 330 335
Pro Thr Pro Pro Ser Asp Pro Gly Glu Pro Asp Pro Thr Pro Pro Ser 340 345 350
Asp Pro Gly Glu Tyr Pro Ala Trp Asp Pro Asn Gln Ile Tyr Thr Asn 355 360 365
Glu Ile Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr 370 375 380
Gln Asn Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Asn 385 390 395 400
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature
(B) OTHER INFORMATION: /desc = "Primer 1 (#9555)"
(ix) FEATURE:
(A) NAME/KEY: misc-feature
(B) LOCATION: 33,36,39,42,45,48
(D) OTHER INFORMATION: /Note N = A, G, C or T; R = G or A; Y = C or T
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TCACAGATCC TCGCGAATTG GTGCGGCCGC GTNGTNGARG ARCAYGGNC 49
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Val Val Glu Glu His Gly Gln 5
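The degenerate 3' portion of Primer 1 (SEQ ID NO: 3, positions 31 to 48) can be checked against the peptide of SEQ ID NO: 4 by expanding every ambiguity code and translating each variant. This is a sketch under the ambiguity definitions given in the feature note (N = A, G, C or T; R = G or A; Y = C or T); the helper names expand and translate are illustrative.

```python
from itertools import product

# Ambiguity codes as defined in the feature note of SEQ ID NO: 3.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(degenerate):
    """Yield every concrete sequence covered by a degenerate sequence."""
    for combo in product(*(IUPAC[base] for base in degenerate)):
        yield "".join(combo)


def translate(dna):
    """Translate complete codons in frame 1."""
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))


# Positions 31-48 of SEQ ID NO: 3 (six complete degenerate codons).
DEGENERATE_3PRIME = "GTNGTNGARGARCAYGGN"

variants = list(expand(DEGENERATE_3PRIME))
peptides = {translate(v) for v in variants}
print(len(variants), "variants encode", peptides)
```

The final base of the primer (position 49, C) is the first base of a glutamine codon (CAA or CAG), consistent with the seventh residue of SEQ ID NO: 4.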
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 2"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CAGAGCAAGAG ATTACGCGC 19
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Reverse Primer"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GTTTTCCCAG TCACGAC 17
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Forward Primer"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GCGGATAACA ATTTCACACA GG 22
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 3, #19719"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TGACCCGTAC GGTCCGTGGG 20
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 4, #19720"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGCTCTTGAT TTTGTGTCCA CC 22
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 5, #20887"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GTAGGCTCAG TCATATGTTA CACATTGAAA GGGGAGGAGA ATCATGAAAA AGATAACTAC 60
TATTTTTGTC G 71
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 6"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GTACCTCGCG GGTACCAAGC GGCCGCTTAA TTGAGTGGTT CCCACGGACC G 51
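Primer 6 (SEQ ID NO: 11) appears to be an antisense primer: the reverse complement of its last 25 nucleotides matches the 3' end of the coding sequence given in SEQ ID NO: 19, including the TAA stop codon. A minimal Python sketch of that check follows. The helper name reverse_complement is illustrative, and the reading of the primer's role is an inference from the two sequences, not a statement made in the listing.

```python
# Primer 6 (SEQ ID NO: 11), written without the spacing used in the listing.
PRIMER_6 = "GTACCTCGCGGGTACCAAGCGGCCGCTTAATTGAGTGGTTCCCACGGACCG"

# Last 45 nucleotides (positions 1681-1725) of SEQ ID NO: 19.
SEQ19_3PRIME_END = "CAAGAGCCAGGTGACCCATACGGTCCGTGGGAACCACTCAATTAA"


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Reverse-complement the 3' portion of the primer and compare it with
# the end of the coding strand.
tail_rc = reverse_complement(PRIMER_6[-25:])
print(tail_rc)
print(SEQ19_3PRIME_END.endswith(tail_rc))
```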
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1386 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Bacillus agaradherens
(B) STRAIN: AC13
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1386
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ATG AAA AAG ATA ACT ACT ATT TTT GTC GTA TTG CTT ATG ACA GTG GCG 48 Met Lys Lys Ile Thr Thr Ile Phe Val Val Leu Leu Met Thr Val Ala 1 5 10 15
TTG TTC AGT ATA GGA AAC ACG ACT GCT GCT GAT AAT GAT TCA GTT GTA 96 Leu Phe Ser Ile Gly Asn Thr Thr Ala Ala Asp Asn Asp Ser Val Val 20 25 30
GAA GAA CAT GGG CAA TTA AGT ATT AGT AAC GGT GAA TTA GTC AAT GAA 144 Glu Glu His Gly Gln Leu Ser Ile Ser Asn Gly Glu Leu Val Asn Glu 35 40 45
CGA GGC GAA CAA GTT CAG TTA AAA GGG ATG AGT TCC CAT GGT TTG CAA 192 Arg Gly Glu Gln Val Gln Leu Lys Gly Met Ser Ser His Gly Leu Gln 50 55 60
TGG TAC GGT CAA TTT GTA AAC TAT GAA AGT ATG AAA TGG CTA AGA GAT 240 Trp Tyr Gly Gln Phe Val Asn Tyr Glu Ser Met Lys Trp Leu Arg Asp 65 70 75 80
GAT TGG GGA ATA AAT GTA TTC CGA GCA GCA ATG TAT ACC TCT TCA GGA 288 Asp Trp Gly Ile Asn Val Phe Arg Ala Ala Met Tyr Thr Ser Ser Gly 85 90 95
GGA TAT ATT GAT GAT CCA TCA GTA AAG GAA AAA GTA AAA GAG GCT GTT 336 Gly Tyr Ile Asp Asp Pro Ser Val Lys Glu Lys Val Lys Glu Ala Val 100 105 110
GAA GCT GCG ATA GAC CTT GAT ATA TAT GTG ATC ATT GAT TGG CAT ATC 384 Glu Ala Ala Ile Asp Leu Asp Ile Tyr Val Ile Ile Asp Trp His Ile 115 120 125
CTT TCA GAC AAT GAC CCA AAT ATA TAT AAA GAA GAA GCG AAG GAT TTC 432 Leu Ser Asp Asn Asp Pro Asn Ile Tyr Lys Glu Glu Ala Lys Asp Phe 130 135 140
TTT GAT GAA ATG TCA GAG TTG TAT GGA GAC TAT CCG AAT GTG ATA TAC 480 Phe Asp Glu Met Ser Glu Leu Tyr Gly Asp Tyr Pro Asn Val Ile Tyr 145 150 155 160
GAA ATT GCA AAT GAA CCG AAT GGT AGT GAT GTT ACG TGG GGC AAT CAA 528 Glu Ile Ala Asn Glu Pro Asn Gly Ser Asp Val Thr Trp Gly Asn Gln 165 170 175
ATA AAA CCG TAT GCA GAG GAA GTC ATT CCG ATT ATT CGT AAC AAT GAC 576 Ile Lys Pro Tyr Ala Glu Glu Val Ile Pro Ile Ile Arg Asn Asn Asp 180 185 190
CCT AAT AAC ATT ATT ATT GTA GGT ACA GGT ACA TGG AGT CAG GAT GTC 624 Pro Asn Asn Ile Ile Ile Val Gly Thr Gly Thr Trp Ser Gln Asp Val 195 200 205
CAT CAT GCA GCT GAT AAT CAG CTT GCA GAT CCT AAC GTC ATG TAT GCA 672 His His Ala Ala Asp Asn Gln Leu Ala Asp Pro Asn Val Met Tyr Ala 210 215 220
TTT CAT TTT TAT GCA GGG ACA CAT GGT CAA AAT TTA CGA GAC CAA GTA 720 Phe His Phe Tyr Ala Gly Thr His Gly Gln Asn Leu Arg Asp Gln Val 225 230 235 240
GAT TAT GCA TTA GAT CAA GGA GCA GCG ATA TTT GTT AGT GAA TGG GGA 768 Asp Tyr Ala Leu Asp Gln Gly Ala Ala Ile Phe Val Ser Glu Trp Gly 245 250 255
ACA AGT GCA GCT ACA GGT GAT GGT GGC GTG TTT TTA GAT GAA GCA CAA 816 Thr Ser Ala Ala Thr Gly Asp Gly Gly Val Phe Leu Asp Glu Ala Gln 260 265 270
GTG TGG ATT GAC TTT ATG GAT GAA AGA AAT TTA AGC TGG GCC AAC TGG 864 Val Trp Ile Asp Phe Met Asp Glu Arg Asn Leu Ser Trp Ala Asn Trp 275 280 285
TCT CTA ACG CAT AAA GAT GAG TCA TCT GCA GCG TTA ATG CCA GGT GCA 912 Ser Leu Thr His Lys Asp Glu Ser Ser Ala Ala Leu Met Pro Gly Ala 290 295 300
AAT CCA ACT GGT GGT TGG ACA GAG GCT GAA CTA TCT CCA TCT GGT ACA 960 Asn Pro Thr Gly Gly Trp Thr Glu Ala Glu Leu Ser Pro Ser Gly Thr 305 310 315 320
TTT GTG AGG GAA AAA ATA AGA GAA TCA GCA TCT ATT CCG CCA AGC GAT 1008 Phe Val Arg Glu Lys Ile Arg Glu Ser Ala Ser Ile Pro Pro Ser Asp 325 330 335
CCA ACA CCG CCA TCT GAT CCA GGA GAA CCG GAT CCA ACG CCC CCA AGT 1056 Pro Thr Pro Pro Ser Asp Pro Gly Glu Pro Asp Pro Thr Pro Pro Ser 340 345 350
GAT CCA GGA AAG TAT CCA GCA TGG GAT CCA AAT CAA ATT TAC ACA AAT 1104 Asp Pro Gly Lys Tyr Pro Ala Trp Asp Pro Asn Gln Ile Tyr Thr Asn 355 360 365
GAA ATT GTG TAC CAT AAC GGC CAG CTA TGG CAA GCA AAA TGG TGG ACA 1152 Glu Ile Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr 370 375 380
CAA AAT CAA GAG CCA GGT GAC CCG TAC GGT CCG TGG GAA CCA CTC AAA 1200 Gln Asn Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Lys 385 390 395 400
TCT GAT CCA GAT TCA GGA GAA CCG GAT CCA ACG CCC CCA AGT GAT CCA 1248 Ser Asp Pro Asp Ser Gly Glu Pro Asp Pro Thr Pro Pro Ser Asp Pro 405 410 415
GGA GAA TAT CCA GCA TGG GAC CCA ACG CAA ATT TAC ACA GAT GAA ATT 1296 Gly Glu Tyr Pro Ala Trp Asp Pro Thr Gln Ile Tyr Thr Asp Glu Ile 420 425 430
GTG TAC CAT AAC GGC CAG CTA TGG CAA GCC AAA TGG TGG ACA CAA AAT 1344 Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr Gln Asn 435 440 445
CAA GAG CCA GGT GAC CCA TAC GGT CCG TGG GAA CCA CTC AAT 1386 Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Asn 450 455 460
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 462 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Met Lys Lys Ile Thr Thr Ile Phe Val Val Leu Leu Met Thr Val Ala 1 5 10 15
Leu Phe Ser Ile Gly Asn Thr Thr Ala Ala Asp Asn Asp Ser Val Val 20 25 30
Glu Glu His Gly Gln Leu Ser Ile Ser Asn Gly Glu Leu Val Asn Glu 35 40 45
Arg Gly Glu Gln Val Gln Leu Lys Gly Met Ser Ser His Gly Leu Gln 50 55 60
Trp Tyr Gly Gln Phe Val Asn Tyr Glu Ser Met Lys Trp Leu Arg Asp 65 70 75 80
Asp Trp Gly Ile Asn Val Phe Arg Ala Ala Met Tyr Thr Ser Ser Gly 85 90 95
Gly Tyr Ile Asp Asp Pro Ser Val Lys Glu Lys Val Lys Glu Ala Val 100 105 110
Glu Ala Ala Ile Asp Leu Asp Ile Tyr Val Ile Ile Asp Trp His Ile 115 120 125
Leu Ser Asp Asn Asp Pro Asn Ile Tyr Lys Glu Glu Ala Lys Asp Phe 130 135 140
Phe Asp Glu Met Ser Glu Leu Tyr Gly Asp Tyr Pro Asn Val Ile Tyr 145 150 155 160
Glu Ile Ala Asn Glu Pro Asn Gly Ser Asp Val Thr Trp Gly Asn Gln 165 170 175
Ile Lys Pro Tyr Ala Glu Glu Val Ile Pro Ile Ile Arg Asn Asn Asp 180 185 190
Pro Asn Asn Ile Ile Ile Val Gly Thr Gly Thr Trp Ser Gln Asp Val 195 200 205
His His Ala Ala Asp Asn Gln Leu Ala Asp Pro Asn Val Met Tyr Ala 210 215 220
Phe His Phe Tyr Ala Gly Thr His Gly Gln Asn Leu Arg Asp Gln Val 225 230 235 240
Asp Tyr Ala Leu Asp Gln Gly Ala Ala Ile Phe Val Ser Glu Trp Gly 245 250 255
Thr Ser Ala Ala Thr Gly Asp Gly Gly Val Phe Leu Asp Glu Ala Gln 260 265 270
Val Trp Ile Asp Phe Met Asp Glu Arg Asn Leu Ser Trp Ala Asn Trp 275 280 285
Ser Leu Thr His Lys Asp Glu Ser Ser Ala Ala Leu Met Pro Gly Ala 290 295 300
Asn Pro Thr Gly Gly Trp Thr Glu Ala Glu Leu Ser Pro Ser Gly Thr 305 310 315 320
Phe Val Arg Glu Lys Ile Arg Glu Ser Ala Ser Ile Pro Pro Ser Asp 325 330 335
Pro Thr Pro Pro Ser Asp Pro Gly Glu Pro Asp Pro Thr Pro Pro Ser 340 345 350
Asp Pro Gly Lys Tyr Pro Ala Trp Asp Pro Asn Gln Ile Tyr Thr Asn 355 360 365
Glu Ile Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr 370 375 380
Gln Asn Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Lys 385 390 395 400
Ser Asp Pro Asp Ser Gly Glu Pro Asp Pro Thr Pro Pro Ser Asp Pro 405 410 415
Gly Glu Tyr Pro Ala Trp Asp Pro Thr Gln Ile Tyr Thr Asp Glu Ile 420 425 430
Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr Gln Asn 435 440 445
Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Asn 450 455 460
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 7, #100084"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CCTCGCGAGG TACCAGCGGC CGCGTACCAC CAATTAAGTA TGGTAC 46
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 8, #5289"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GCTTTACGCC CGATTGCTGA CGCTG 35
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 9, #26748"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GCGATGAGAC GCGCGGCCGC CTATCTTTGA ACATAAATTG AAACGGATCC G 51
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 10, #110150A"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GCTGCAGGAT CCGTTTCAAT TTATGTTCAA AGATCTGATC CAGATTCAGG AG 52
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 11, #100084"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CCTCGCGAGG TACCAGCGGC CGCGTACCAC CAATTAAGTA TGGTAC 46
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1725 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Hybrid"
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1725
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ATG AAA CAA CAA AAA CGG CTT TAC GCC CGA TTG CTG ACG CTG TTA TTT 48 Met Lys Gln Gln Lys Arg Leu Tyr Ala Arg Leu Leu Thr Leu Leu Phe 1 5 10 15
GCG CTC ATC TTC TTG CTG CCT CAT TCT GCA GCA GCG GCG GCA AAT CTT 96 Ala Leu Ile Phe Leu Leu Pro His Ser Ala Ala Ala Ala Ala Asn Leu 20 25 30
AAT GGG ACG CTG ATG CAG TAT TTT GAA TGG TAC ATG CCC AAT GAC GGC 144 Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro Asn Asp Gly 35 40 45
CAA CAT TGG AAG CGT TTG CAA AAC GAC TCG GCA TAT TTG GCT GAA CAC 192 Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu Ala Glu His 50 55 60
GGT ATT ACT GCC GTC TGG ATT CCC CCG GCA TAT AAG GGA ACG AGC CAA 240 Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly Thr Ser Gln 65 70 75 80
GCG GAT GTG GGC TAC GGT GCT TAC GAC CTT TAT GAT TTA GGG GAG TTT 288 Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu Gly Glu Phe 85 90 95
CAT CAA AAA GGG ACG GTT CGG ACA AAG TAC GGC ACA AAA GGA GAG CTG 336 His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys Gly Glu Leu 100 105 110
CAA TCT GCG ATC AAA AGT CTT CAT TCC CGC GAC ATT AAC GTT TAC GGG 384 Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn Val Tyr Gly 115 120 125
GAT GTG GTC ATC AAC CAC AAA GGC GGC GCT GAT GCG ACC GAA GAT GTA 432 Asp Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr Glu Asp Val 130 135 140
ACC GCG GTT GAA GTC GAT CCC GCT GAC CGC AAC CGC GTA ATC TCA GGA 480 Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val Ile Ser Gly 145 150 155 160
GAA CAC CTA ATT AAA GCC TGG ACA CAT TTT CAT TTT CCG GGG GCC GGC 528 Glu His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro Gly Ala Gly 165 170 175
AGC ACA TAC AGC GAT TTT AAA TGG CAT TGG TAC CAT TTT GAC GGA ACC 576 Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe Asp Gly Thr 180 185 190
GAT TGG GAC GAG TCC CGA AAG CTG AAC CGC ATC TAT AAG TTT CAA GGA 624 Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys Phe Gln Gly 195 200 205
AAG GCT TGG GAT TGG GAA GTT TCC AAT GAA AAC GGC AAC TAT GAT TAT 672 Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn Tyr Asp Tyr 210 215 220
TTG ATG TAT GCC GAC ATC GAT TAT GAC CAT CCT GAT GTC GCA GCA GAA 720 Leu Met Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val Ala Ala Glu 225 230 235 240
ATT AAG AGA TGG GGC ACT TGG TAT GCC AAT GAA CTG CAA TTG GAC GGA 768 Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln Leu Asp Gly 245 250 255
AAC CGT CTT GAT GCT GTC AAA CAC ATT AAA TTT TCT TTT TTG CGG GAT 816 Asn Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe Leu Arg Asp 260 265 270
TGG GTT AAT CAT GTC AGG GAA AAA ACG GGG AAG GAA ATG TTT ACG GTA 864 Trp Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met Phe Thr Val 275 280 285
GCT GAA TAT TGG CAG AAT GAC TTG GGC GCG CTG GAA AAC TAT TTG AAC 912 Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn Tyr Leu Asn 290 295 300
AAA ACA AAT TTT AAT CAT TCA GTG TTT GAC GTG CCG CTT CAT TAT CAG 960 Lys Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu His Tyr Gln 305 310 315 320
TTC CAT GCT GCA TCG ACA CAG GGA GGC GGC TAT GAT ATG AGG AAA TTG 1008 Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met Arg Lys Leu 325 330 335
CTG AAC GGT ACG GTC GTT TCC AAG CAT CCG TTG AAA TCG GTT ACA TTT 1056 Leu Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser Val Thr Phe 340 345 350
GTC GAT AAC CAT GAT ACA CAG CCG GGG CAA TCG CTT GAG TCG ACT GTC 1104 Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu Ser Thr Val 355 360 365
CAA ACA TGG TTT AAG CCG CTT GCT TAC GCT TTT ATT CTC ACA AGG GAA 1152 Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu Thr Arg Glu 370 375 380
TCT GGA TAC CCT CAG GTT TTC TAC GGG GAT ATG TAC GGG ACG AAA GGA 1200 Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly Thr Lys Gly 385 390 395 400
GAC TCC CAG CGC GAA ATT CCT GCC TTG AAA CAC AAA ATT GAA CCG ATC 1248 Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile Glu Pro Ile 405 410 415
TTA AAA GCG AGA AAA CAG TAT GCG TAC GGA GCA CAG CAT GAT TAT TTC 1296 Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His Asp Tyr Phe 420 425 430
GAC CAC CAT GAC ATT GTC GGC TGG ACA AGG GAA GGC GAC AGC TCG GTT 1344 Asp His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp Ser Ser Val 435 440 445
GCA AAT TCA GGT TTG GCG GCA TTA ATA ACA GAC GGA CCC GGT GGG GCA 1392 Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro Gly Gly Ala 450 455 460
AAG CGA ATG TAT GTC GGC CGG CAA AAC GCC GGT GAG ACA TGG CAT GAC 1440 Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr Trp His Asp 465 470 475 480
ATT ACC GGA AAC CGT TCG GAG CCG GTT GTC ATC AAT TCG GAA GGC TGG 1488 Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser Glu Gly Trp 485 490 495
GGA GAG TTT CAC GTA AAC GGC GGA TCC GTT TCA ATT TAT GTT CAA AGA 1536 Gly Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr Val Gln Arg 500 505 510
TCT GAT CCA GAT TCA GGA GAA CCG GAT CCA ACG CCC CCA AGT GAT CCA 1584 Ser Asp Pro Asp Ser Gly Glu Pro Asp Pro Thr Pro Pro Ser Asp Pro 515 520 525
GGA GAA TAT CCA GCA TGG GAC CCA ACG CAA ATT TAC ACA GAT GAA ATT 1632 Gly Glu Tyr Pro Ala Trp Asp Pro Thr Gln Ile Tyr Thr Asp Glu Ile 530 535 540
GTG TAC CAT AAC GGC CAG CTA TGG CAA GCC AAA TGG TGG ACA CAA AAT 1680 Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr Gln Asn 545 550 555 560
CAA GAG CCA GGT GAC CCA TAC GGT CCG TGG GAA CCA CTC AAT TAA 1725 Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Asn * 565 570 575
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 575 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Met Lys Gln Gln Lys Arg Leu Tyr Ala Arg Leu Leu Thr Leu Leu Phe 1 5 10 15
Ala Leu Ile Phe Leu Leu Pro His Ser Ala Ala Ala Ala Ala Asn Leu 20 25 30
Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro Asn Asp Gly 35 40 45
Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu Ala Glu His 50 55 60
Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly Thr Ser Gln 65 70 75 80
Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu Gly Glu Phe 85 90 95
His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys Gly Glu Leu 100 105 110
Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn Val Tyr Gly 115 120 125
Asp Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr Glu Asp Val 130 135 140
Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val Ile Ser Gly 145 150 155 160
Glu His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro Gly Ala Gly 165 170 175
Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe Asp Gly Thr 180 185 190
Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys Phe Gln Gly 195 200 205
Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn Tyr Asp Tyr 210 215 220
Leu Met Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val Ala Ala Glu 225 230 235 240
Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln Leu Asp Gly 245 250 255
Asn Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe Leu Arg Asp 260 265 270
Trp Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met Phe Thr Val 275 280 285
Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn Tyr Leu Asn 290 295 300
Lys Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu His Tyr Gln 305 310 315 320
Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met Arg Lys Leu 325 330 335
Leu Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser Val Thr Phe 340 345 350
Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu Ser Thr Val 355 360 365
Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu Thr Arg Glu 370 375 380
Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly Thr Lys Gly 385 390 395 400
Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile Glu Pro Ile 405 410 415
Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His Asp Tyr Phe 420 425 430
Asp His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp Ser Ser Val 435 440 445
Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro Gly Gly Ala 450 455 460
Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr Trp His Asp 465 470 475 480
Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser Glu Gly Trp 485 490 495
Gly Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr Val Gln Arg 500 505 510
Ser Asp Pro Asp Ser Gly Glu Pro Asp Pro Thr Pro Pro Ser Asp Pro 515 520 525
Gly Glu Tyr Pro Ala Trp Asp Pro Thr Gln Ile Tyr Thr Asp Glu Ile 530 535 540
Val Tyr His Asn Gly Gln Leu Trp Gln Ala Lys Trp Trp Thr Gln Asn 545 550 555 560
Gln Glu Pro Gly Asp Pro Tyr Gly Pro Trp Glu Pro Leu Asn * 565 570 575
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(ix) FEATURE:
(A) NAME/KEY: misc-feature
(D) OTHER INFORMATION: /desc = "Linker"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
Ser Asp Pro Asp Ser Gly Glu Pro Asp Pro Thr Pro Pro Ser Asp Pro Gly 5 10 15
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 12, #114135"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GCTGCAGGAT CCGTTTCAAT TTATGTTCAA AGATCTCCAA CTCCTGCCCC ATCTCAAAGC 60
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(ix) FEATURE:
(A) NAME/KEY: misc-feature:
(B) OTHER INFORMATION: /desc = "Primer 13, #110151"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GCGATGAGAC GCGCGGCCGC TACTACCAGT CAACATTAAC AGGACCTGAG 50
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2346 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Hybrid"
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2346
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
ATG AAA CAA CAA AAA CGG CTT TAC GCC CGA TTG CTG ACG CTG TTA TTT 48 Met Lys Gln Gln Lys Arg Leu Tyr Ala Arg Leu Leu Thr Leu Leu Phe 1 5 10 15
GCG CTC ATC TTC TTG CTG CCT CAT TCT GCA GCA GCG GCG GCA AAT CTT 96 Ala Leu Ile Phe Leu Leu Pro His Ser Ala Ala Ala Ala Ala Asn Leu 20 25 30
AAT GGG ACG CTG ATG CAG TAT TTT GAA TGG TAC ATG CCC AAT GAC GGC 144 Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro Asn Asp Gly 35 40 45
CAA CAT TGG AAG CGT TTG CAA AAC GAC TCG GCA TAT TTG GCT GAA CAC 192 Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu Ala Glu His 50 55 60
GGT ATT ACT GCC GTC TGG ATT CCC CCG GCA TAT AAG GGA ACG AGC CAA 240 Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly Thr Ser Gln 65 70 75 80
GCG GAT GTG GGC TAC GGT GCT TAC GAC CTT TAT GAT TTA GGG GAG TTT 288 Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu Gly Glu Phe 85 90 95
CAT CAA AAA GGG ACG GTT CGG ACA AAG TAC GGC ACA AAA GGA GAG CTG 336 His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys Gly Glu Leu 100 105 110
CAA TCT GCG ATC AAA AGT CTT CAT TCC CGC GAC ATT AAC GTT TAC GGG 384 Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn Val Tyr Gly 115 120 125
GAT GTG GTC ATC AAC CAC AAA GGC GGC GCT GAT GCG ACC GAA GAT GTA 432 Asp Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr Glu Asp Val 130 135 140
ACC GCG GTT GAA GTC GAT CCC GCT GAC CGC AAC CGC GTA ATC TCA GGA 480 Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val Ile Ser Gly 145 150 155 160
GAA CAC CTA ATT AAA GCC TGG ACA CAT TTT CAT TTT CCG GGG GCC GGC 528 Glu His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro Gly Ala Gly 165 170 175
AGC ACA TAC AGC GAT TTT AAA TGG CAT TGG TAC CAT TTT GAC GGA ACC 576 Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe Asp Gly Thr 180 185 190
GAT TGG GAC GAG TCC CGA AAG CTG AAC CGC ATC TAT AAG TTT CAA GGA 624 Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys Phe Gln Gly 195 200 205
AAG GCT TGG GAT TGG GAA GTT TCC AAT GAA AAC GGC AAC TAT GAT TAT 672 Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn Tyr Asp Tyr 210 215 220
TTG ATG TAT GCC GAC ATC GAT TAT GAC CAT CCT GAT GTC GCA GCA GAA 720 Leu Met Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val Ala Ala Glu 225 230 235 240
ATT AAG AGA TGG GGC ACT TGG TAT GCC AAT GAA CTG CAA TTG GAC GGA 768 Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln Leu Asp Gly 245 250 255
AAC CGT CTT GAT GCT GTC AAA CAC ATT AAA TTT TCT TTT TTG CGG GAT 816 Asn Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe Leu Arg Asp 260 265 270
TGG GTT AAT CAT GTC AGG GAA AAA ACG GGG AAG GAA ATG TTT ACG GTA 864 Trp Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met Phe Thr Val 275 280 285
GCT GAA TAT TGG CAG AAT GAC TTG GGC GCG CTG GAA AAC TAT TTG AAC 912 Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn Tyr Leu Asn 290 295 300
AAA ACA AAT TTT AAT CAT TCA GTG TTT GAC GTG CCG CTT CAT TAT CAG 960 Lys Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu His Tyr Gln 305 310 315 320
TTC CAT GCT GCA TCG ACA CAG GGA GGC GGC TAT GAT ATG AGG AAA TTG 1008 Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met Arg Lys Leu 325 330 335
CTG AAC GGT ACG GTC GTT TCC AAG CAT CCG TTG AAA TCG GTT ACA TTT 1056 Leu Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser Val Thr Phe 340 345 350
GTC GAT AAC CAT GAT ACA CAG CCG GGG CAA TCG CTT GAG TCG ACT GTC 1104 Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu Ser Thr Val 355 360 365
CAA ACA TGG TTT AAG CCG CTT GCT TAC GCT TTT ATT CTC ACA AGG GAA 1152 Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu Thr Arg Glu 370 375 380
TCT GGA TAC CCT CAG GTT TTC TAC GGG GAT ATG TAC GGG ACG AAA GGA 1200 Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly Thr Lys Gly 385 390 395 400
GAC TCC CAG CGC GAA ATT CCT GCC TTG AAA CAC AAA ATT GAA CCG ATC 1248 Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile Glu Pro Ile 405 410 415
TTA AAA GCG AGA AAA CAG TAT GCG TAC GGA GCA CAG CAT GAT TAT TTC 1296 Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His Asp Tyr Phe 420 425 430
GAC CAC CAT GAC ATT GTC GGC TGG ACA AGG GAA GGC GAC AGC TCG GTT 1344 Asp His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp Ser Ser Val 435 440 445
GCA AAT TCA GGT TTG GCG GCA TTA ATA ACA GAC GGA CCC GGT GGG GCA 1392 Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro Gly Gly Ala 450 455 460
AAG CGA ATG TAT GTC GGC CGG CAA AAC GCC GGT GAG ACA TGG CAT GAC 1440 Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr Trp His Asp 465 470 475 480
ATT ACC GGA AAC CGT TCG GAG CCG GTT GTC ATC AAT TCG GAA GGC TGG 1488 Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser Glu Gly Trp 485 490 495
GGA GAG TTT CAC GTA AAC GGC GGA TCC GTT TCA ATT TAT GTT CAA AGA 1536 Gly Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr Val Gln Arg 500 505 510
TCT CCA ACT CCT GCC CCA TCT CAA AGC CCA ATT AGA AGA GAT GCA TTT 1584 Ser Pro Thr Pro Ala Pro Ser Gln Ser Pro Ile Arg Arg Asp Ala Phe 515 520 525
TCA ATA ATC GAA GCG GAA GAA TAT AAC AGC ACA AAT TCC TCC ACT TTA 1632 Ser Ile Ile Glu Ala Glu Glu Tyr Asn Ser Thr Asn Ser Ser Thr Leu 530 535 540
CAA GTG ATT GGA ACG CCA AAT AAT GGC AGA GGA ATT GGT TAT ATT GAA 1680 Gln Val Ile Gly Thr Pro Asn Asn Gly Arg Gly Ile Gly Tyr Ile Glu 545 550 555 560
AAT GGT AAT ACC GTA ACT TAC AGC AAT ATA GAT TTT GGT AGT GGT GCA 1728 Asn Gly Asn Thr Val Thr Tyr Ser Asn Ile Asp Phe Gly Ser Gly Ala 565 570 575
ACA GGG TTC TCT GCA ACT GTT GCA ACG GAG GTT AAT ACC TCA ATT CAA 1776 Thr Gly Phe Ser Ala Thr Val Ala Thr Glu Val Asn Thr Ser Ile Gln 580 585 590
ATC CGT TCT GAC AGT CCT ACC GGA ACT CTA CTT GGT ACC TTA TAT GTA 1824 Ile Arg Ser Asp Ser Pro Thr Gly Thr Leu Leu Gly Thr Leu Tyr Val 595 600 605
AGT TCT ACC GGC AGC TGG AAT ACA TAT CAA ACC GTA TCT ACA AAC ATC 1872 Ser Ser Thr Gly Ser Trp Asn Thr Tyr Gln Thr Val Ser Thr Asn Ile 610 615 620
AGC AAA ATT ACC GGC GTT CAT GAT ATT GTA TTG GTA TTC TCA GGT CCA 1920 Ser Lys Ile Thr Gly Val His Asp Ile Val Leu Val Phe Ser Gly Pro 625 630 635 640
GTC AAT GTG GAC AAC TTC ATA TTT AGC AGA AGT TCA CCA GTG CCT GCA 1968 Val Asn Val Asp Asn Phe Ile Phe Ser Arg Ser Ser Pro Val Pro Ala 645 650 655
CCT GGT GAT AAC ACA AGA GAC GCA TAT TCT ATC ATT CAG GCC GAG GAT 2016 Pro Gly Asp Asn Thr Arg Asp Ala Tyr Ser Ile Ile Gln Ala Glu Asp 660 665 670
TAT GAC AGC AGT TAT GGT CCC AAC CTT CAA ATC TTT AGC TTA CCA GGT 2064 Tyr Asp Ser Ser Tyr Gly Pro Asn Leu Gln Ile Phe Ser Leu Pro Gly 675 680 685
GGT GGC AGC GCC ATT GGC TAT ATT GAA AAT GGT TAT TCC ACT ACC TAT 2112 Gly Gly Ser Ala Ile Gly Tyr Ile Glu Asn Gly Tyr Ser Thr Thr Tyr 690 695 700
AAA AAT ATT GAT TTT GGT GAC GGC GCA ACG TCC GTA ACA GCA AGA GTA 2160 Lys Asn Ile Asp Phe Gly Asp Gly Ala Thr Ser Val Thr Ala Arg Val 705 710 715 720
GCT ACC CAG AAT GCT ACT ACC ATT CAG GTA AGA TTG GGA AGT CCA TCG 2208 Ala Thr Gln Asn Ala Thr Thr Ile Gln Val Arg Leu Gly Ser Pro Ser 725 730 735
GGT ACA TTA CTT GGA ACA ATT TAC GTG GGG TCC ACA GGA AGC TTT GAT 2256 Gly Thr Leu Leu Gly Thr Ile Tyr Val Gly Ser Thr Gly Ser Phe Asp 740 745 750
ACT TAT AGG GAT GTA TCC GCT ACC ATT AGT AAT ACT GCG GGT GTA AAA 2304 Thr Tyr Arg Asp Val Ser Ala Thr Ile Ser Asn Thr Ala Gly Val Lys 755 760 765
GAT ATT GTT CTT GTA TTC TCA GGT CCT GTT AAT GTT GAC TGG 2346 Asp Ile Val Leu Val Phe Ser Gly Pro Val Asn Val Asp Trp 770 775 780
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 782 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Met Lys Gln Gln Lys Arg Leu Tyr Ala Arg Leu Leu Thr Leu Leu Phe 1 5 10 15
Ala Leu Ile Phe Leu Leu Pro His Ser Ala Ala Ala Ala Ala Asn Leu 20 25 30
Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro Asn Asp Gly 35 40 45
Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu Ala Glu His 50 55 60
Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly Thr Ser Gln 65 70 75 80
Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu Gly Glu Phe 85 90 95
His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys Gly Glu Leu 100 105 110
Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn Val Tyr Gly 115 120 125
Asp Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr Glu Asp Val 130 135 140
Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val Ile Ser Gly 145 150 155 160
Glu His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro Gly Ala Gly 165 170 175
Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe Asp Gly Thr 180 185 190
Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys Phe Gln Gly 195 200 205
Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn Tyr Asp Tyr 210 215 220
Leu Met Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val Ala Ala Glu 225 230 235 240
Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln Leu Asp Gly 245 250 255
Asn Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe Leu Arg Asp 260 265 270
Trp Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met Phe Thr Val 275 280 285
Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn Tyr Leu Asn 290 295 300
Lys Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu His Tyr Gln 305 310 315 320
Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met Arg Lys Leu 325 330 335
Leu Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser Val Thr Phe 340 345 350
Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu Ser Thr Val 355 360 365
Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu Thr Arg Glu 370 375 380
Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly Thr Lys Gly 385 390 395 400
Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile Glu Pro Ile 405 410 415
Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His Asp Tyr Phe 420 425 430
Asp His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp Ser Ser Val 435 440 445
Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro Gly Gly Ala 450 455 460
Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr Trp His Asp 465 470 475 480
Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser Glu Gly Trp 485 490 495
Gly Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr Val Gln Arg 500 505 510
Ser Pro Thr Pro Ala Pro Ser Gln Ser Pro Ile Arg Arg Asp Ala Phe 515 520 525
Ser Ile Ile Glu Ala Glu Glu Tyr Asn Ser Thr Asn Ser Ser Thr Leu 530 535 540
Gln Val Ile Gly Thr Pro Asn Asn Gly Arg Gly Ile Gly Tyr Ile Glu 545 550 555 560
Asn Gly Asn Thr Val Thr Tyr Ser Asn Ile Asp Phe Gly Ser Gly Ala 565 570 575
Thr Gly Phe Ser Ala Thr Val Ala Thr Glu Val Asn Thr Ser Ile Gln 580 585 590
Ile Arg Ser Asp Ser Pro Thr Gly Thr Leu Leu Gly Thr Leu Tyr Val 595 600 605
Ser Ser Thr Gly Ser Trp Asn Thr Tyr Gln Thr Val Ser Thr Asn Ile 610 615 620
Ser Lys Ile Thr Gly Val His Asp Ile Val Leu Val Phe Ser Gly Pro 625 630 635 640
Val Asn Val Asp Asn Phe Ile Phe Ser Arg Ser Ser Pro Val Pro Ala 645 650 655
Pro Gly Asp Asn Thr Arg Asp Ala Tyr Ser Ile Ile Gln Ala Glu Asp 660 665 670
Tyr Asp Ser Ser Tyr Gly Pro Asn Leu Gln Ile Phe Ser Leu Pro Gly 675 680 685
Gly Gly Ser Ala Ile Gly Tyr Ile Glu Asn Gly Tyr Ser Thr Thr Tyr 690 695 700
Lys Asn Ile Asp Phe Gly Asp Gly Ala Thr Ser Val Thr Ala Arg Val 705 710 715 720
Ala Thr Gln Asn Ala Thr Thr Ile Gln Val Arg Leu Gly Ser Pro Ser 725 730 735
Gly Thr Leu Leu Gly Thr Ile Tyr Val Gly Ser Thr Gly Ser Phe Asp 740 745 750
Thr Tyr Arg Asp Val Ser Ala Thr Ile Ser Asn Thr Ala Gly Val Lys 755 760 765
Asp Ile Val Leu Val Phe Ser Gly Pro Val Asn Val Asp Trp 770 775 780
(2) INFORMATION FOR SEQ ID NO: 26: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6136 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
TTTGACAGCT TATCATCGAC TGCACGGTGC ACCAATGCTT CTGGCGTCAG GCAGCCATCG 60
GAAGCTGTGG TATGGCTGTG CAGGTCGTAA ATCACTGCAT AATTCGTGTC GCTCAAGGCG 120
CACTCCCGTT CTGGATAATG TTTTTTGCGC CGACATCATA ACGGTTCTGG CAAATATTCT 180
GAAATGAGCT GTTGACAATT AATCATCGGC TCGTATAATG TGTGGAATTG TGAGCGGATA 240
ACAATTTCAC ACAGGAAACA GAATTGATCC ATAACTAACT AATCTAGTAA TAATTTTGTT 300
TAACTTTAAG AAGGAGATAT ATCCATGGAT CCTAGGACCA CGCCCGCACC CGGCCACCCG 360
GCCCGCGGCG CCCGCACCGC TCTGCGCACG ACGCTCGCCG CCGCGGCGGC GACGCTCGTC 420
GTCGGCGCCA CGGTCGTGCT GCCCGCCCAG GCCGCTAGCG AATTCGCAAA TCTTAATGGG 480
ACGCTGATGC AGTATTTTGA ATGGTACATG CCCAATGACG GCCAACATTG GAGGCGTTTG 540
CAAAACGACT CGGCATATTT GGCTGAACAC GGTATTACTG CCGTCTGGAT TCCCCCGGCA 600
TATAAGGGAA CGAGCCAAGC GGATGTGGGC TACGGTGCTT ACGACCTTTA TGATTTAGGG 660
GAGTTTCATC AAAAAGGGAC GGTTCGGACA AAGTACGGCA CAAAAGGAGA GCTGCAATCT 720
GCGATCAAAA GTCTTCATTC CCGCGACATT AACGTTTACG GGGATGTGGT CATCAACCAC 780
AAAGGCGGCG CTGATGCGAC CGAAGATGTA ACCGCGGTTG AAGTCGATCC CGCTGACCGC 840
AACCGCGTAA TCTCAGGAGA ACACCTAATT AAAGCCTGGA CACATTTTCA TTTTCCGGGG 900
CGCGGCAGCA CATACAGCGA TTTTAAATGG CATTGGTACC ATTTTGACGG AACCGATTGG 960
GACGAGTCCC GAAAGCTGAA CCGCATCTAT AAGTTTCAAG GAAAGGCTTG GGATTGGGAA 1020
GTTTCCAATG AAAACGGCAA CTATGATTAT TTGATGTATG CCGACATCGA TTATGACCAT 1080
CCTGATGTCG CAGCAGAAAT TAAGAGATGG GGCACTTGGT ATGCCAATGA ACTGCAATTG 1140
GACGGTTTCC GTCTTGATGC TGTCAAACAC ATTAAATTTT CTTTTTTGCG GGATTGGGTT 1200
AATCATGTCA GGGAAAAAAC GGGGAAGGAA ATGTTTACGG TAGCTGAATA TTGGCAGAAT 1260
GACTTGGGCG CGCTGGAAAA CTATTTGAAC AAAACAAATT TTAATCATTC AGTGTTTGAC 1320
GTGCCGCTTC ATTATCAGTT CCATGCTGCA TCGACACAGG GAGGCGGCTA TGATATGAGG 1380
AAATTGCTGA ACGGTACGGT CGTTTCCAAG CATCCGTTGA AATCGGTTAC ATTTGTCGAT 1440
AACCATGATA CACAGCCGGG GCAATCGCTT GAGTCGACTG TCCAAACATG GTTTAAGCCG 1500
CTTGCTTACG CTTTTATTCT CACAAGGGAA TCTGGATACC CTCAGGTTTT CTACGGGGAT 1560
ATGTACGGGA CGAAAGGAGA CTCCCAGCGC GAAATTCCTG CCTTGAAACA CAAAATTGAA 1620
CCGATCTTAA AAGCGAGAAA ACAGTATGCG TACGGAGCAC AGCATGATTA TTTCGACCAC 1680
CATGACATTG TCGGCTGGAC AAGGGAAGGC GACAGCTCGG TTGCAAATTC AGGTTTGGCG 1740
GCATTAATAA CAGACGGACC CGGTGGGGCA AAGCGAATGT ATGTCGGCCG GCAAAACGCC 1800
GGTGAGACAT GGCATGACAT TACCGGAAAC CGTTCGGAGC CGGTTGTCAT CAATTCGGAA 1860
GGCTGGGGAG AGTTTCACGT AAACGGCGGG TCGGTTTCAA TTTATGTTCA AAGAAGGCCT 1920
CCAACCCCCA CTAGTCCGAG CGCTCCCAGC GGCTGCACTG CTGAGAGGTG GGCTCAGTGC 1980
GGCGGCAATG GCTGGAGCGG CTGCACCACC TGCGTCGCTG GCAGCACTTG CACGAAGATT 2040
AATGACTGGT ACCATCAGTG CCTGTAAGCT TATTATATTA CTAATTAATT GGGGACCCTA 2100
GAGGTCCCCT TTTTTATTTT AGCTTCACGC TGCCGCAAGC ACTCAGGGCG CAAGGGCTGC 2160
TAAAGGAAGC GGAACACGTA GAAAGCCAGT CCGCAGAAAC GGTGCTGACC CCGGATGAAT 2220
GTCAGCTACT GGGCTATCTG GACAAGGGAA AACGCAAGCG CAAAGAGAAA GCAGGTAGCT 2280
TGCAGTGGGC TTACATGGCG ATAGCTAGAC TGGGCGGTTT TATGGACAGC AAGCGAACCG 2340
GAATTGCCAG CTGGGGCGCC CTCTGGTAAG GTTGGGAAGC CCTGCAAAGT AAACTGGATG 2400
GCTTTCTTGC CGCCAAGGAT CTGATGGCGC AGGGGATCAA GATCTGATCA AGAGACAGGA 2460
TGAGGATCGT TTCGCATGAT TGAACAAGAT GGATTGCACG CAGGTTCTCC GGCCGCTTGG 2520
GTGGAGAGGC TATTCGGCTA TGACTGGGCA CAACAGACAA TCGGCTGCTC TGATGCCGCC 2580
GTGTTCCGGC TGTCAGCGCA GGGGCGCCCG GTTCTTTTTG TCAAGACCGA CCTGTCCGGT 2640
GCCCTGAATG AACTGCAGGA CGAGGCAGCG CGGCTATCGT GGCTGGCCAC GACGGGCGTT 2700
CCTTGCGCAG CTGTGCTCGA CGTTGTCACT GAAGCGGGAA GGGACTGGCT GCTATTGGGC 2760
GAAGTGCCGG GGCAGGATCT CCTGTCATCT CACCTTGCTC CTGCCGAGAA AGTATCCATC 2820
ATGGCTGATG CAATGCGGCG GCTGCATACG CTTGATCCGG CTACCTGCCC ATTCGACCAC 2880
CAAGCGAAAC ATCGCATCGA GCGAGCACGT ACTCGGATGG AAGCCGGTCT TGTCGATCAG 2940
GATGATCTGG ACGAAGAGCA TCAGGGGCTC GCGCCAGCCG AACTGTTCGC CAGGCTCAAG 3000
GCGCGCATGC CCGACGGCGA GGATCTCGTC GTGACACATG GCGATGCCTG CTTGCCGAAT 3060
ATCATGGTGG AAAATGGCCG CTTTTCTGGA TTCATCGACT GTGGCCGGCT GGGTGTGGCG 3120
GACCGCTATC AGGACATAGC GTTGGCTACC CGTGATATTG CTGAAGAGCT TGGCGGCGAA 3180
TGGGCTGACC GCTTCCTCGT GCTTTACGGT ATCGCCGCTC CCGATTCGCA GCGCATCGCC 3240
TTCTATCGCC TTCTTGACGA GTTCTTCTGA GCGGGACTCT GGGGTTCGAA ATGACCGACC 3300
AAGCGACGCC CAACCTGCCA TCACGAGATT TCGATTCCAC CGCCGCCTTC TATGAAAGGT 3360
TGGGCTTCGG AATCGTTTTC CGGGACGCCG GCTGGATGAT CCTCCAGCGC GGGGATCTCA 3420
TGCTGGAGTT CTTCGCCCAC CCCAAAAGGA TCTAGGTGAA GATCCTTTTT GATAATCTCA 3480
TGACCAAAAT CCCTTAACGT GAGTTTTCGT TCCACTGAGC GTCAGACCCC GTAGAAAAGA 3540
TCAAAGGATC TTCTTGAGAT CCTTTTTTTC TGCGCGTAAT CTGCTGCTTG CAAACAAAAA 3600
AACCACCGCT ACCAGCGGTG GTTTGTTTGC CGGATCAAGA GCTACCAACT CTTTTTCCGA 3660
AGGTAACTGG CTTCAGCAGA GCGCAGATAC CAAATACTGT CCTTCTAGTG TAGCCGTAGT 3720
TAGGCCACCA CTTCAAGAAC TCTGTAGCAC CGCCTACATA CCTCGCTCTG CTAATCCTGT 3780
TACCAGTGGC TGCTGCCAGT GGCGATAAGT CGTGTCTTAC CGGGTTGGAC TCAAGACGAT 3840
AGTTACCGGA TAAGGCGCAG CGGTCGGGCT GAACGGGGGG TTCGTGCACA CAGCCCAGCT 3900
TGGAGCGAAC GACCTACACC GAACTGAGAT ACCTACAGCG TGAGCTATGA GAAAGCGCCA 3960
CGCTTCCCGA AGGGAGAAAG GCGGACAGGT ATCCGGTAAG CGGCAGGGTC GGAACAGGAG 4020
AGCGCACGAG GGAGCTTCCA GGGGGAAACG CCTGGTATCT TTATAGTCCT GTCGGGTTTC 4080
GCCACCTCTG ACTTGAGCGT CGATTTTTGT GATGCTCGTC AGGGGGGCGG AGCCTATGGA 4140
AAAACGCCAG CAACGCGGCC TTTTTACGGT TCCTGGCCTT TTGCTGGCCT TTTGCTCACA 4200
TGTTCTTTCC TGCGTTATCC CCTGATTCTG TGGATAACCG TATTACCGCC TTTGAGTGAG 4260
CTGATACCGC TCGCCGCAGC CGAACGACCG AGCGCAGCGA GTCAGTGAGC GAGGAAGCGG 4320
AAGAGCGCCT GATGCGGTAT TTTCTCCTTA CGCATCTGTG CGGTATTTCA CACCGCATAT 4380
GCAGATATTT TGTTAAAATT CGCGTTAAAT TTTTGTTAAA TCAGCTCATT TTTTAACCAA 4440
TAGGCCGAAA TCGGCAAAAT CCCTTATAAA TCAAAAGAAT AGACCGAGAT AGGGTTGAGT 4500
GTTGTTCCAG TTTGGAACAA GAGTCCACTA TTAAAGAACG TGGACTCCAA CGTCAAAGGG 4560
CGAAAAACCG TCTATCAGGG CGATGGCCCA CTACGTGAAC CATCACCCTA ATCAAGTTTT 4620
TTGGGGTCGA GGTGCCGTAA AGCACTAAAT CGGAACCCTA AAGGGAGCCC CCGATTTAGA 4680
GCTTGACGGG GAAAGCCGGC GAACGTGGCG AGAAAGGAAG GGAAGAAAGC GAAAGGAGCG 4740
GGCGCTAGGG CGCTGGCAAG TGTAGCGGTC ACGCTGCGCG TAACCACCAC ACCCGCCGCG 4800
CTTAATGCGC CGCTACAGGG CGCGTCAGGT GGCACTTTTC GGGGAAATGT GCGCGGAACC 4860
CCTATTTGTT TATTTTTCTA AATACATTCA AATATGTATC CGCTCATGAG ACAATAACCC 4920
TGCTGCATTT ACGTTGACAC CATCGAATGG TGCAAAACCT TTCGCGGTAT GGCATGATAG 4980
CGCCCGGAAG AGAGTCAATT CAGGGTGGTG AATGTGAAAC CAGTAACGTT ATACGATGTC 5040
GCAGAGTATG CCGGTGTCTC TTATCAGACC GTTTCCCGCG TGGTGAACCA GGCCAGCCAC 5100
GTTTCTGCGA AAACGCGGGA AAAAGTGGAA GCGGCGATGG CGGAGCTGAA TTACATTCCC 5160
AACCGCGTGG CACAACAACT GGCGGGCAAA CAGTCGTTGC TGATTGGCGT TGCCACCTCC 5220
AGTCTGGCCC TGCACGCGCC GTCGCAAATT GTCGCGGCGA TTAAATCTCG CGCCGATCAA 5280
CTGGGTGCCA GCGTGGTGGT GTCGATGGTA GAACGAAGCG GCGTCGAAGC CTGTAAAGCG 5340
GCGGTGCACA ATCTTCTCGC GCAACGCGTC AGTGGGCTGA TCATTAACTA TCCGCTGGAT 5400
GACCAGGATG CCATTGCTGT GGAAGCTGCC TGCACTAATG TTCCGGCGTT ATTTCTTGAT 5460
GTCTCTGACC AGACACCCAT CAACAGTATT ATTTTCTCCC ATGAAGACGG TACGCGACTG 5520
GGCGTGGAGC ATCTGGTCGC ATTGGGTCAC CAGCAAATCG CGCTGTTAGC GGGCCCATTA 5580
AGTTCTGTCT CGGCGCGTCT GCGTCTGGCT GGCTGGCATA AATATCTCAC TCGCAATCAA 5640
ATTCAGCCGA TAGCGGAACG GGAAGGCGAC TGGAGTGCCA TGTCCGGTTT TCAACAAACC 5700
ATGCAAATGC TGAATGAGGG CATCGTTCCC ACTGCGATGC TGGTTGCCAA CGATCAGATG 5760
GCGCTGGGCG CAATGCGCGC CATTACCGAG TCCGGGCTGC GCGTTGGTGC GGATATCTCG 5820
GTAGTGGGAT ACGACGATAC CGAAGACAGC TCATGTTATA TCCCGCCGTT AACCACCATC 5880
AAACAGGATT TTCGCCTGCT GGGGCAAACC AGCGTGGACC GCTTGCTGCA ACTCTCTCAG 5940
GGCCAGGCGG TGAAGGGCAA TCAGCTGTTG CCCGTCTCAC TGGTGAAAAG AAAAACCACC 6000
CTGGCGCCCA ATACGCAAAC CGCCTCTCCC CGCGCGTTGG CCGATTCATT AATGCAGCTG 6060
GCACGACAGG TTTCCCGACT GGAAAGCGGG CAGTGAGCGC AACGCAATTA ATGTGAGTTA 6120
GCGCGAATTG ATCTGG 6136
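As a consistency check on the listing, the reverse complement of Primer 21 (SEQ ID NO: 36, listed further below) can be compared with the stretch of SEQ ID NO: 26 at positions 2041 to 2080. The following is a minimal sketch only: the helper names are hypothetical, and the check says nothing about how the primer was actually used.

```python
# Illustrative check only; helper names are hypothetical, not from the patent.

# SEQ ID NO: 26, positions 2041-2080, copied from the listing above.
TEMPLATE_2041_2080 = "AATGACTGGT ACCATCAGTG CCTGTAAGCT TATTATATTA".replace(" ", "")
# Primer 21 (SEQ ID NO: 36), written 5'->3'.
PRIMER_21 = "AGCCTAAGCT TACAGGCACT GATGGTACCA GT".replace(" ", "")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_matching_prefix(query: str, target: str) -> tuple[int, int]:
    """Return (length, 0-based start) of the longest prefix of query found in target."""
    for n in range(len(query), 0, -1):
        start = target.find(query[:n])
        if start != -1:
            return n, start
    return 0, -1


rc = reverse_complement(PRIMER_21)
match_len, start = longest_matching_prefix(rc, TEMPLATE_2041_2080)
first_pos = 2041 + start  # convert to the 1-based numbering of SEQ ID NO: 26
print(rc)
print(match_len, first_pos, first_pos + match_len - 1)
```

Under these assumptions, the 28 nucleotides at the 3' end of the primer pair with positions 2045 to 2072 of SEQ ID NO: 26, while the four 5'-terminal bases of the primer have no counterpart in that stretch.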
(2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 14" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
AGGTCTACTA GTCCCGGCTG CCGCGTCGAC 30
(2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 53 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 15" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CCGATTAAAG CTTATTAGCT AGCACGGAAT TCCGTGGGGC TGGTCGTCGG CAC 53
(2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 16" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
TCATGAGCCA TGGCTAGCGC AAATCTTAAT GGGACGCTGA TG 42
(2) INFORMATION FOR SEQ ID NO: 30: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 17" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
ATGACTAAGC TTACTTACTT AGTGATGGTG ATGGTGATGA CTAGTTCTTT GAACATAAAT TGAAACCGA 69
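Primers 14 to 17 can be scanned for restriction enzyme recognition sequences with a short script. This is a sketch under stated assumptions: the six recognition sequences are the standard ones for the named enzymes, but this part of the listing does not say which enzymes were used, so the choice of enzymes and the names `SITES` and `find_sites` are illustrative only.

```python
# Standard recognition sequences; the selection of enzymes is an assumption,
# since the sequence listing itself does not name any restriction enzymes.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "SalI": "GTCGAC",
    "SpeI": "ACTAGT",
}

# Primer sequences copied from SEQ ID NOs 27-30 (spaces removed on load).
PRIMERS = {
    "Primer 14": "AGGTCTACTA GTCCCGGCTG CCGCGTCGAC",
    "Primer 15": "CCGATTAAAG CTTATTAGCT AGCACGGAAT TCCGTGGGGC TGGTCGTCGG CAC",
    "Primer 16": "TCATGAGCCA TGGCTAGCGC AAATCTTAAT GGGACGCTGA TG",
    "Primer 17": "ATGACTAAGC TTACTTACTT AGTGATGGTG ATGGTGATGA CTAGTTCTTT "
                 "GAACATAAAT TGAAACCGA",
}
PRIMERS = {name: seq.replace(" ", "") for name, seq in PRIMERS.items()}


def find_sites(seq: str) -> dict[str, int]:
    """Map enzyme name to the 1-based start of its first site in seq."""
    return {name: seq.find(site) + 1 for name, site in SITES.items() if site in seq}


for name, seq in PRIMERS.items():
    print(name, len(seq), find_sites(seq))
```

All six recognition sequences are palindromic, so scanning one strand is enough to detect a site on either strand.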
(2) INFORMATION FOR SEQ ID NO: 31: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1959 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1959
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
ATG GAT CCT AGG ACC ACG CCC GCA CCC GGC CAC CCG GCC CGC GGC GCC 48 Met Asp Pro Arg Thr Thr Pro Ala Pro Gly His Pro Ala Arg Gly Ala 1 5 10 15
CGC ACC GCT CTG CGC ACG ACG CTC GCC GCC GCG GCG GCG ACG CTC GTC 96 Arg Thr Ala Leu Arg Thr Thr Leu Ala Ala Ala Ala Ala Thr Leu Val 20 25 30
GTC GGC GCC ACG GTC GTG CTG CCC GCC CAG GCC GCT AGT CCC GGC TGC 144 Val Gly Ala Thr Val Val Leu Pro Ala Gln Ala Ala Ser Pro Gly Cys 35 40 45
CGC GTC GAC TAC GCC GTC ACC AAC CAG TGG CCC GGC GGC TTC GGC GCC 192 Arg Val Asp Tyr Ala Val Thr Asn Gln Trp Pro Gly Gly Phe Gly Ala 50 55 60
AAC GTC ACG ATC ACC AAC CTC GGC GAC CCC GTC TCG TCG TGG AAG CTC 240 Asn Val Thr Ile Thr Asn Leu Gly Asp Pro Val Ser Ser Trp Lys Leu 65 70 75 80
GAC TGG ACC TAC ACC GCA GGC CAG CGG ATC CAG CAG CTG TGG AAC GGC 288 Asp Trp Thr Tyr Thr Ala Gly Gln Arg Ile Gln Gln Leu Trp Asn Gly 85 90 95
ACC GCG TCG ACC AAC GGC GGC CAG GTC TCC GTC ACC AGC CTG CCC TGG 336 Thr Ala Ser Thr Asn Gly Gly Gln Val Ser Val Thr Ser Leu Pro Trp 100 105 110
AAC GGC AGC ATC CCG ACC GGC GGC ACG GCG TCG TTC GGG TTC AAC GGC 384 Asn Gly Ser Ile Pro Thr Gly Gly Thr Ala Ser Phe Gly Phe Asn Gly 115 120 125
TCG TGG GCC GGG TCC AAC CCG ACG CCG GCG TCG TTC TCG CTC AAC GGC 432 Ser Trp Ala Gly Ser Asn Pro Thr Pro Ala Ser Phe Ser Leu Asn Gly 130 135 140
ACC ACC TGC ACG GGC ACC GTG CCG ACG ACC AGC CCC ACG GAA TTC CGT 480 Thr Thr Cys Thr Gly Thr Val Pro Thr Thr Ser Pro Thr Glu Phe Arg 145 150 155 160
GCT AGC GCA AAT CTT AAT GGG ACG CTG ATG CAG TAT TTT GAA TGG TAC 528 Ala Ser Ala Asn Leu Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr 165 170 175
ATG CCC AAT GAC GGC CAA CAT TGG AAG CGC TTG CAA AAC GAC TCG GCA 576 Met Pro Asn Asp Gly Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala 180 185 190
TAT TTG GCT GAA CAC GGT ATT ACT GCC GTC TGG ATT CCC CCG GCA TAT 624 Tyr Leu Ala Glu His Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr 195 200 205
AAG GGA ACG AGC CAA GCG GAT GTG GGC TAC GGT GCT TAC GAC CTT TAT 672 Lys Gly Thr Ser Gln Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr 210 215 220
GAT TTA GGG GAG TTT CAT CAA AAA GGG ACG GTT CGG ACA AAG TAC GGC 720 Asp Leu Gly Glu Phe His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly 225 230 235 240
ACA AAA GGA GAG CTG CAA TCT GCG ATC AAA AGT CTT CAT TCC CGC GAC 768 Thr Lys Gly Glu Leu Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp 245 250 255
ATT AAC GTT TAC GGG GAT GTG GTC ATC AAC CAC AAA GGC GGC GCT GAT 816 Ile Asn Val Tyr Gly Asp Val Val Ile Asn His Lys Gly Gly Ala Asp 260 265 270
GCG ACC GAA GAT GTA ACC GCG GTT GAA GTC GAT CCC GCT GAC CGC AAC 864 Ala Thr Glu Asp Val Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn 275 280 285
CGC GTA ATT TCA GGA GAA CAC TTA ATT AAA GCC TGG ACA CAT TTT CAT 912 Arg Val Ile Ser Gly Glu His Leu Ile Lys Ala Trp Thr His Phe His 290 295 300
TTT CCG GGG CGC GGC AGC ACA TAC AGC GAT TTT AAA TGG CAT TGG TAC 960 Phe Pro Gly Arg Gly Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr 305 310 315 320
CAT TTT GAC GGA ACC GAT TGG GAC GAG TCC CGA AAG CTG AAC CGC ATC 1008 His Phe Asp Gly Thr Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile 325 330 335
TAT AAG TTT CAA GGA AAG GCT TGG GAT TGG GAA GTT TCC AAT GAA AAC 1056 Tyr Lys Phe Gln Gly Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn 340 345 350
GGC AAC TAT GAT TAT TTG ATG TAT GCC GAC ATC GAT TAT GAT CAT CCT 1104 Gly Asn Tyr Asp Tyr Leu Met Tyr Ala Asp Ile Asp Tyr Asp His Pro 355 360 365
GAT GTC GCA GCA GAA ATT AAG AGA TGG GGC ACT TGG TAT GCC AAT GAA 1152 Asp Val Ala Ala Glu Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu 370 375 380
CTG CAA TTG GAC GGT TTC CGT CTT GAT GCT GTC AAA CAC ATT AAA TTT 1200 Leu Gln Leu Asp Gly Phe Arg Leu Asp Ala Val Lys His Ile Lys Phe 385 390 395 400
TCT TTT TTG CGG GAT TGG GTT AAT CAT GTC AGG GAA AAA ACG GGG AAG 1248 Ser Phe Leu Arg Asp Trp Val Asn His Val Arg Glu Lys Thr Gly Lys 405 410 415
GAA ATG TTT ACG GTA GCT GAA TAT TGG CAG AAT GAC TTG GGC GCG CTG 1296 Glu Met Phe Thr Val Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu 420 425 430
GAA AAC TAT TTG AAC AAA ACA AAT TTT AAT CAT TCA GTG TTT GAC GTG 1344 Glu Asn Tyr Leu Asn Lys Thr Asn Phe Asn His Ser Val Phe Asp Val 435 440 445
CCG CTT CAT TAT CAG TTC CAT GCT GCA TCG ACA CAG GGA GGC GGC TAT 1392 Pro Leu His Tyr Gln Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr 450 455 460
GAT ATG AGG AAA TTG CTG AAC GGT ACG GTC GTT TCC AAG CAT CCG TTG 1440 Asp Met Arg Lys Leu Leu Asn Gly Thr Val Val Ser Lys His Pro Leu 465 470 475 480
AAA GCG GTT ACA TTT GTC GAT AAC CAT GAT ACA CAG CCG GGG CAA TCG 1488 Lys Ala Val Thr Phe Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser 485 490 495
CTT GAG TCG ACT GTC CAA ACA TGG TTT AAG CCG CTT GCT TAC GCT TTT 1536 Leu Glu Ser Thr Val Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe 500 505 510
ATT CTC ACA AGG GAA TCT GGA TAC CCT CAG GTT TTC TAC GGG GAT ATG 1584 Ile Leu Thr Arg Glu Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met 515 520 525
TAC GGG ACG AAA GGA GAC TCC CAG CGC GAA ATT CCT GCC TTG AAA CAC 1632 Tyr Gly Thr Lys Gly Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His 530 535 540
AAA ATT GAA CCG ATC TTA AAA GCG AGA AAA CAG TAT GCG TAC GGA GCA 1680 Lys Ile Glu Pro Ile Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala 545 550 555 560
CAG CAT GAT TAT TTC GAC CAC CAT GAC ATT GTC GGC TGG ACA AGG GAA 1728 Gln His Asp Tyr Phe Asp His His Asp Ile Val Gly Trp Thr Arg Glu 565 570 575
GGC GAC AGC TCG GTT GCA AAT TCA GGT TTG GCG GCA TTA ATA ACA GAC 1776 Gly Asp Ser Ser Val Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp 580 585 590
GGA CCC GGT GGG GCA AAG CGA ATG TAT GTC GGC CGG CAA AAC GCC GGT 1824 Gly Pro Gly Gly Ala Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly 595 600 605
GAG ACA TGG CAT GAC ATT ACC GGA AAC CGT TCG GAG CCG GTT GTC ATC 1872 Glu Thr Trp His Asp Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile 610 615 620
AAT TCG GAA GGC TGG GGA GAG TTT CAC GTA AAC GGC GGG TCG GTT TCA 1920 Asn Ser Glu Gly Trp Gly Glu Phe His Val Asn Gly Gly Ser Val Ser 625 630 635 640
ATT TAT GTT CAA AGA ACT AGT CAT CAC CAT CAC CAT CAC 1959 Ile Tyr Val Gln Arg Thr Ser His His His His His His 645 650
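The stated CDS location (1..1959) can be cross-checked against the layout of the listing and against the run of histidine codons that closes the reading frame. The sketch below is illustrative; the constant and function names are hypothetical, and the row count is read off the cumulative numbering (48, 96, ..., 1920) rather than stated anywhere in the listing.

```python
CDS_LENGTH_BP = 1959   # from "(B) LOCATION: 1..1959"
FULL_ROWS = 40         # rows of 16 codons, numbered 48, 96, ..., 1920 (counted, not stated)
LAST_ROW = "ATT TAT GTT CAA AGA ACT AGT CAT CAC CAT CAC CAT CAC"
HIS_CODONS = {"CAT", "CAC"}  # the two standard histidine codons


def trailing_run(codons: list[str], allowed: set[str]) -> int:
    """Count how many codons at the 3' end belong to the allowed set."""
    run = 0
    for codon in reversed(codons):
        if codon not in allowed:
            break
        run += 1
    return run


last_codons = LAST_ROW.split()
n_codons, leftover = divmod(CDS_LENGTH_BP, 3)

print(n_codons, leftover)                      # codons implied by the CDS length
print(FULL_ROWS * 16 + len(last_codons))       # codons counted from the row layout
print(trailing_run(last_codons, HIS_CODONS))   # consecutive His codons at the end
```

Both counts come to 653 codons, which agrees with the 653 amino acids given for SEQ ID NO: 32, and the final six codons all encode histidine.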
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
Met Asp Pro Arg Thr Thr Pro Ala Pro Gly His Pro Ala Arg Gly Ala 1 5 10 15
Arg Thr Ala Leu Arg Thr Thr Leu Ala Ala Ala Ala Ala Thr Leu Val 20 25 30
Val Gly Ala Thr Val Val Leu Pro Ala Gln Ala Ala Ser Pro Gly Cys 35 40 45
Arg Val Asp Tyr Ala Val Thr Asn Gln Trp Pro Gly Gly Phe Gly Ala 50 55 60
Asn Val Thr Ile Thr Asn Leu Gly Asp Pro Val Ser Ser Trp Lys Leu 65 70 75 80
Asp Trp Thr Tyr Thr Ala Gly Gln Arg Ile Gln Gln Leu Trp Asn Gly 85 90 95
Thr Ala Ser Thr Asn Gly Gly Gln Val Ser Val Thr Ser Leu Pro Trp 100 105 110
Asn Gly Ser Ile Pro Thr Gly Gly Thr Ala Ser Phe Gly Phe Asn Gly 115 120 125
Ser Trp Ala Gly Ser Asn Pro Thr Pro Ala Ser Phe Ser Leu Asn Gly 130 135 140
Thr Thr Cys Thr Gly Thr Val Pro Thr Thr Ser Pro Thr Glu Phe Arg 145 150 155 160
Ala Ser Ala Asn Leu Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr 165 170 175
Met Pro Asn Asp Gly Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala 180 185 190
Tyr Leu Ala Glu His Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr 195 200 205
Lys Gly Thr Ser Gln Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr 210 215 220
Asp Leu Gly Glu Phe His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly 225 230 235 240
Thr Lys Gly Glu Leu Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp 245 250 255
Ile Asn Val Tyr Gly Asp Val Val Ile Asn His Lys Gly Gly Ala Asp 260 265 270
Ala Thr Glu Asp Val Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn 275 280 285
Arg Val Ile Ser Gly Glu His Leu Ile Lys Ala Trp Thr His Phe His 290 295 300
Phe Pro Gly Arg Gly Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr 305 310 315 320
His Phe Asp Gly Thr Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile 325 330 335
Tyr Lys Phe Gln Gly Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn 340 345 350
Gly Asn Tyr Asp Tyr Leu Met Tyr Ala Asp Ile Asp Tyr Asp His Pro 355 360 365
Asp Val Ala Ala Glu Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu 370 375 380
Leu Gln Leu Asp Gly Phe Arg Leu Asp Ala Val Lys His Ile Lys Phe 385 390 395 400
Ser Phe Leu Arg Asp Trp Val Asn His Val Arg Glu Lys Thr Gly Lys 405 410 415
Glu Met Phe Thr Val Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu 420 425 430
Glu Asn Tyr Leu Asn Lys Thr Asn Phe Asn His Ser Val Phe Asp Val 435 440 445
Pro Leu His Tyr Gln Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr 450 455 460
Asp Met Arg Lys Leu Leu Asn Gly Thr Val Val Ser Lys His Pro Leu 465 470 475 480
Lys Ala Val Thr Phe Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser 485 490 495
Leu Glu Ser Thr Val Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe 500 505 510
Ile Leu Thr Arg Glu Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met 515 520 525
Tyr Gly Thr Lys Gly Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His 530 535 540
Lys Ile Glu Pro Ile Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala 545 550 555 560
Gln His Asp Tyr Phe Asp His His Asp Ile Val Gly Trp Thr Arg Glu 565 570 575
Gly Asp Ser Ser Val Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp 580 585 590
Gly Pro Gly Gly Ala Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly 595 600 605
Glu Thr Trp His Asp Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile 610 615 620
Asn Ser Glu Gly Trp Gly Glu Phe His Val Asn Gly Gly Ser Val Ser 625 630 635 640
Ile Tyr Val Gln Arg Thr Ser His His His His His His 645 650
(2) INFORMATION FOR SEQ ID NO: 33: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 18" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CATATGGCTA GCGAATTCGC AAATCTTAAT GGGACGCTG 29
(2) INFORMATION FOR SEQ ID NO: 34: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 19" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
AAGCTTACTA GTAGGCCTTC TTTGAACATA AATTGAAA 28
(2) INFORMATION FOR SEQ ID NO: 35: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 70 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 20" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
CCATGGGCTA GCCCTGAATT CAGGCCTCCA ACCCCCACTA GTCCGAGCGC TCCCAGCGGC TGCACTGCTG 70
(2) INFORMATION FOR SEQ ID NO: 36: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: misc-feature: (B) OTHER INFORMATION: /desc = "Primer 21" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
AGCCTAAGCT TACAGGCACT GATGGTACCA GT 32
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (ix) FEATURE:
(a) NAME/KEY: misc-feature
(d) OTHER INFORMATION: /desc = "Linker" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
Arg Pro Pro Thr Pro Thr Ser Pro Ser Ala Pro Ser 1 5 10
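The 12-residue linker of SEQ ID NO: 37 can be related back to Primer 20 (SEQ ID NO: 35) by translating the primer in all three forward reading frames and searching for the linker. This is a minimal sketch that assumes the standard genetic code; the helper names are hypothetical, and the check is a reading of the listed sequences rather than a statement of how the primer was designed.

```python
from itertools import product

# Standard genetic code in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Only the residues that occur in the linker are needed here.
THREE_TO_ONE = {"Arg": "R", "Pro": "P", "Thr": "T", "Ser": "S", "Ala": "A"}

# Primer 20 (SEQ ID NO: 35) and the linker (SEQ ID NO: 37), copied from the listing.
PRIMER_20 = ("CCATGGGCTA GCCCTGAATT CAGGCCTCCA ACCCCCACTA "
             "GTCCGAGCGC TCCCAGCGGC TGCACTGCTG").replace(" ", "")
LINKER_3 = "Arg Pro Pro Thr Pro Thr Ser Pro Ser Ala Pro Ser"


def translate(dna: str, frame: int = 0) -> str:
    """Translate complete codons of dna starting at the given 0-based frame."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


linker = "".join(THREE_TO_ONE[res] for res in LINKER_3.split())
# For each frame, the residue index at which the linker starts (-1 if absent).
hits = {frame: translate(PRIMER_20, frame).find(linker) for frame in range(3)}

print(linker)
print(translate(PRIMER_20, 0))
print(hits)
```

With these inputs the linker appears only in the first reading frame, starting at the eighth translated residue, which corresponds to nucleotide 22 of the primer.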

Claims

1. A method for liquefying starch, wherein a starch substrate is treated in aqueous medium with a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of an α-amylase linked to an amino acid sequence comprising a carbohydrate-binding domain (CBD).
2. The method for liquefying starch according to claim 1, further comprising a debranching enzyme.
3. The method according to claim 2, wherein the debranching enzyme is a modified debranching enzyme (enzyme hybrid) linked to an amino acid sequence comprising a carbohydrate-binding domain.
4. A method for saccharifying starch which has been subjected to a liquefaction process, wherein the reaction mixture after liquefaction is treated with a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of a debranching enzyme linked to an amino acid sequence comprising a carbohydrate-binding domain (CBD).
5. The method according to claims 2, 3 or 4 wherein said debranching enzyme is an isoamylase or a pullulanase.
6. A method for saccharifying starch which has been subjected to a liquefaction process, wherein the reaction mixture after liquefaction is treated with a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of a glucoamylase linked to an amino acid sequence comprising a carbohydrate-binding domain (CBD).
7. A method according to any one of the preceding claims, wherein said CBD is a CBD deriving from a cellulase, a xylanase, a mannanase, an arabinofuranosidase, an acetylesterase, a chitinase, a glucoamylase or a CGTase.
8. The use of a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of an α-amylase linked to an amino acid sequence comprising a carbohydrate-binding domain (CBD) in a process for liquefying starch.
9. The use of a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of a debranching enzyme linked to an amino acid sequence comprising a carbohydrate-binding domain (CBD) in a process for saccharifying starch which has been subjected to a liquefaction process.
10. The use of a modified enzyme (enzyme hybrid) which comprises an amino acid sequence of a glucoamylase linked to an amino acid sequence comprising a carbohydrate-binding domain (CBD) in a process for saccharifying starch which has been subjected to a liquefaction process.
11. An isolated DNA sequence encoding a hybrid enzyme with amylolytic activity comprising:
(a) a DNA sequence encoding an amylolytic activity;
(b) a DNA sequence encoding a CBD; and
(c) a DNA sequence or fragments thereof encoding the linker sequence shown in SEQ ID No. 21.
12. The isolated DNA sequence according to claim 11, wherein the amylolytic activity is an α-amylase activity, in particular a Bacillus α-amylase, especially the activity of Termamyl® or a variant thereof.
13. The isolated DNA sequence according to claims 11 or 12, wherein the CBD is the CBD of Bacillus agaradherens NCIMB No. 40482 alkaline cellulase Cel5A.
14. The isolated DNA sequence according to claim 13, which encodes the Termamyl®-linker-Cel5A-CBD encoded by plasmid pMB492 shown in SEQ ID No. 19.
15. The isolated DNA sequence according to claims 11 or 12, wherein the CBD is the CBD-dimer of Clostridium stercorarium (NCIMB 11754) XynA.
16. A DNA construct comprising the DNA sequence of any of claims 11 to 15 operably linked to one or more control sequences capable of directing the expression of the DNA sequence in a suitable expression host.
17. The DNA construct of claim 16, comprising a nucleotide sequence encoding the promoter selected from the group consisting of the promoter of the Bacillus stearothermophilus maltogenic amylase gene, the promoter of the Bacillus licheniformis alpha-amylase gene, the promoter of the Bacillus amyloliquefaciens BAN amylase gene, the promoter of the Bacillus subtilis alkaline protease gene, or the promoter of the Bacillus pumilus cellulase or xylosidase gene.
18. A recombinant expression vector comprising the DNA construct of claims 16 or 17, a promoter, and transcriptional and translational stop signals.
19. A host cell comprising the DNA construct of claims 16 or 17.
20. The cell of claim 19, wherein the cell is a Bacillus cell from a strain selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus, B. megatherium, B. pumilus, B. thuringiensis or B. agaradherens.
21. A method of producing a CBD/enzyme hybrid, comprised of culturing the cell of claims 19 or 20 under conditions permitting the production of the enzyme, and recovering the enzyme from the culture.
22. An isolated and purified CBD/enzyme hybrid encoded by the DNA sequence of any of claims 11 to 15.
23. The CBD/enzyme hybrid according to claim 22 being the hybrid enzyme shown in SEQ ID No. 20.
PCT/DK1997/000448 1996-10-11 1997-10-13 Alpha-amylase fused to cellulose binding domain, for starch degradation WO1998016633A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP97943797A EP0950093A2 (en) 1996-10-11 1997-10-13 Alpha-amylase fused to cellulose binding domain, for starch degradation
AU45510/97A AU4551097A (en) 1996-10-11 1997-10-13 Alpha-amylase fused to cellulose binding domain, for starch degradation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DK113096 1996-10-11
DK1130/96 1996-10-11

Publications (1)

Publication Number Publication Date
WO1998016633A1 (en) 1998-04-23

Family

ID=8101361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK1997/000448 WO1998016633A1 (en) 1996-10-11 1997-10-13 Alpha-amylase fused to cellulose binding domain, for starch degradation

Country Status (4)

Country Link
EP (1) EP0950093A2 (en)
CN (1) CN1233286A (en)
AU (1) AU4551097A (en)
WO (1) WO1998016633A1 (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999057252A1 (en) * 1998-05-01 1999-11-11 The Procter & Gamble Company Laundry detergent and/or fabric care compositions comprising a modified enzyme
US6468955B1 (en) 1998-05-01 2002-10-22 The Proctor & Gamble Company Laundry detergent and/or fabric care compositions comprising a modified enzyme
WO2006069290A2 (en) 2004-12-22 2006-06-29 Novozymes A/S Enzymes for starch processing
US7129069B2 (en) 2003-10-28 2006-10-31 Novo Zymes Als Hybrid enzymes
WO2006066596A3 (en) * 2004-12-22 2006-12-07 Novozymes As Hybrid enzymes consisting of an endo-amylase first amino acid sequence and a carbohydrate -binding module as second amino acid sequence
WO2007077244A2 (en) * 2006-01-04 2007-07-12 Novozymes A/S Method for producing soy sauce
WO2008101894A1 (en) * 2007-02-19 2008-08-28 Novozymes A/S Polypeptides with starch debranching activity
US7713723B1 (en) 2000-08-01 2010-05-11 Novozymes A/S Alpha-amylase mutants with altered properties
US7883883B2 (en) * 2003-06-25 2011-02-08 Novozymes A/S Enzymes for starch processing
AU2011203101B2 (en) * 2004-12-22 2012-11-08 Novozymes A/S Enzymes for starch processing
US8440444B2 (en) 2004-12-22 2013-05-14 Novozymes A/S Hybrid enzymes
US8546106B2 (en) 2006-07-21 2013-10-01 Novozymes, Inc. Methods of increasing secretion of polypeptides having biological activity
US8841091B2 (en) 2004-12-22 2014-09-23 Novozymes Als Enzymes for starch processing
EP3095858A1 (en) * 2015-05-19 2016-11-23 Honda Motor Co., Ltd. Thermostable glycoside hydrolase
CN106397601A (en) * 2004-12-22 2017-02-15 诺维信公司 Enzymes for starch processing
WO2022272119A3 (en) * 2021-06-25 2023-02-02 The Board Of Trustees Of The University Of Illinois Synthetic toolkit for plant transformation
WO2023225459A2 (en) 2022-05-14 2023-11-23 Novozymes A/S Compositions and methods for preventing, treating, supressing and/or eliminating phytopathogenic infestations and infections

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2169076B1 (en) * 2001-02-21 2016-11-02 BASF Enzymes LLC Enzymes having alpha amylase activity and methods of use thereof
CN103045565B (en) * 2012-12-14 2014-06-18 南京林业大学 High glucose resistance beta-glucosidase-CBD fusion enzyme, and expression gene and application thereof
CN108949861B (en) * 2018-08-13 2020-12-01 江南大学 Method for preparing slowly digestible dextrin

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1994029460A1 (en) * 1993-06-11 1994-12-22 Midwest Research Institute Active heteroconjugates of cellobiohydrolase and beta-glucosidase
US5496934A (en) * 1993-04-14 1996-03-05 Yissum Research Development Company Of The Hebrew University Of Jerusalem Nucleic acids encoding a cellulose binding domain
WO1996023874A1 (en) * 1995-02-03 1996-08-08 Novo Nordisk A/S A method of designing alpha-amylase mutants with predetermined properties


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ELSEVIER SCIENCE LTD, Volume 12, 1994, EDWARD A. BAYER et al., "The Cellulosome - A Treasuretrove for Biotechnology". *
JOURNAL OF BACTERIOLOGY, Volume 177, No. 18, Sept. 1995, NATHALIE SAUVONNET et al., "Extracellular Secretion of Pullulanase is Unaffected by Minor Sequence Changes But is Usually Prevented by Adding Reporter Proteins to Its N- or C-Terminal End", pages 5241-5243. *

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999057250A1 (en) * 1998-05-01 1999-11-11 The Procter & Gamble Company Laundry detergent and/or fabric care compositions comprising a modified enzyme
US6468955B1 (en) 1998-05-01 2002-10-22 The Proctor & Gamble Company Laundry detergent and/or fabric care compositions comprising a modified enzyme
WO1999057252A1 (en) * 1998-05-01 1999-11-11 The Procter & Gamble Company Laundry detergent and/or fabric care compositions comprising a modified enzyme
US7713723B1 (en) 2000-08-01 2010-05-11 Novozymes A/S Alpha-amylase mutants with altered properties
US8263381B2 (en) 2003-06-25 2012-09-11 Novozyms A/S Enzymes for starch processing
EP2336308A3 (en) * 2003-06-25 2011-11-09 Novozymes A/S Enzymes for starch processing
US7883883B2 (en) * 2003-06-25 2011-02-08 Novozymes A/S Enzymes for starch processing
US7129069B2 (en) 2003-10-28 2006-10-31 Novo Zymes Als Hybrid enzymes
US7749744B2 (en) 2003-10-28 2010-07-06 Novozymes A/S Hybrid enzymes
US7312055B2 (en) 2003-10-28 2007-12-25 Novozymes A/S Hybrid enzymes
AU2005319074B2 (en) * 2004-12-22 2011-03-24 Novozymes A/S Enzymes for starch processing
JP2008525036A (en) * 2004-12-22 2008-07-17 ノボザイムス アクティーゼルスカブ Starch processing method
US9777304B2 (en) 2004-12-22 2017-10-03 Novozymes North America, Inc. Enzymes for starch processing
EP1831385A4 (en) * 2004-12-22 2009-02-18 Novozymes North America Inc Enzymes for starch processing
US9719120B2 (en) 2004-12-22 2017-08-01 Novozymes A/S Enzymes for starch processing
JP2008523830A (en) * 2004-12-22 2008-07-10 ノボザイムス アクティーゼルスカブ Hybrid enzyme
EP1831385A2 (en) * 2004-12-22 2007-09-12 Novozymes North America, Inc. Enzymes for starch processing
US8512986B2 (en) 2004-12-22 2013-08-20 Novozymes A/S Enzymes for starch processing
CN106397601A (en) * 2004-12-22 2017-02-15 诺维信公司 Enzymes for starch processing
US8841091B2 (en) 2004-12-22 2014-09-23 Novozymes A/S Enzymes for starch processing
WO2006066596A3 (en) * 2004-12-22 2006-12-07 Novozymes As Hybrid enzymes consisting of an endo-amylase first amino acid sequence and a carbohydrate-binding module as second amino acid sequence
EP2365068A3 (en) * 2004-12-22 2011-12-21 Novozymes A/S Enzymes for starch processing
WO2006069290A2 (en) 2004-12-22 2006-06-29 Novozymes A/S Enzymes for starch processing
AU2011203101B2 (en) * 2004-12-22 2012-11-08 Novozymes A/S Enzymes for starch processing
US8440444B2 (en) 2004-12-22 2013-05-14 Novozymes A/S Hybrid enzymes
WO2007077244A3 (en) * 2006-01-04 2007-08-30 Novozymes As Method for producing soy sauce
WO2007077244A2 (en) * 2006-01-04 2007-07-12 Novozymes A/S Method for producing soy sauce
JP2009521943A (en) * 2006-01-04 2009-06-11 ノボザイムス アクティーゼルスカブ Soy sauce production method
US8546106B2 (en) 2006-07-21 2013-10-01 Novozymes, Inc. Methods of increasing secretion of polypeptides having biological activity
US8735549B2 (en) 2006-07-21 2014-05-27 Novozymes, Inc. Methods of increasing secretion of polypeptides having biological activity
US8871486B2 (en) 2006-07-21 2014-10-28 Novozymes, Inc. Methods of increasing secretion of polypeptides having biological activity
US9006409B2 (en) 2006-07-21 2015-04-14 Novozymes, Inc. Methods of increasing secretion of polypeptides having biological activity
US8021863B2 (en) 2007-02-19 2011-09-20 Novozymes A/S Polypeptides with starch debranching activity
WO2008101894A1 (en) * 2007-02-19 2008-08-28 Novozymes A/S Polypeptides with starch debranching activity
EP3095858A1 (en) * 2015-05-19 2016-11-23 Honda Motor Co., Ltd. Thermostable glycoside hydrolase
US10006013B2 (en) 2015-05-19 2018-06-26 Honda Motor Co., Ltd. Thermostable glycoside hydrolase
WO2022272119A3 (en) * 2021-06-25 2023-02-02 The Board Of Trustees Of The University Of Illinois Synthetic toolkit for plant transformation
WO2023225459A2 (en) 2022-05-14 2023-11-23 Novozymes A/S Compositions and methods for preventing, treating, suppressing and/or eliminating phytopathogenic infestations and infections

Also Published As

Publication number Publication date
CN1233286A (en) 1999-10-27
AU4551097A (en) 1998-05-11
EP0950093A2 (en) 1999-10-20

Similar Documents

Publication Publication Date Title
EP0950093A2 (en) Alpha-amylase fused to cellulose binding domain, for starch degradation
US6017751A (en) Process and composition for desizing cellulosic fabric with an enzyme hybrid
Iefuji et al. Raw-starch-digesting and thermostable α-amylase from the yeast Cryptococcus sp. S-2: purification, characterization, cloning and sequencing
Meinke et al. Cellulose-binding polypeptides from Cellulomonas fimi: endoglucanase D (CenD), a family A beta-1,4-glucanase
FI103285B (en) Isolated amylase mutants with improved heat, acid and/or alkali resistance
CA2112028C (en) Pullulanase, microorganisms producing same, process for preparing this pullulanase and uses thereof
WO2014007921A1 (en) Variant alpha amylases with enhanced activity on starch polymers
US20070256197A1 (en) Thermostable cellulase and methods of use
WO1992002614A1 (en) Novel thermostable pullulanases
JPWO2014157492A1 (en) Thermostable cellobiohydrolase
Kim et al. Biochemical confirmation and characterization of the family-57-like α-amylase of Methanococcus jannaschii
US8778649B2 (en) Use of acidothermus cellulolyticus xylanase for hydrolyzing lignocellulose
US9593319B2 (en) Temperature-stable β-pyranosidase
JP3025625B2 (en) Alkaline pullulanase gene having alkaline α-amylase activity
JP4228073B2 (en) Highly active fusion enzyme
Satoh et al. Characterization of an α-glucosidase, HdAgl, from the digestive fluid of Haliotis discus hannai
KR101530078B1 (en) Recombinant Vector and Recombinant Microorganism Comprising Chimeric kappa-Carrageenase Gene and Chimeric lamda-Carrageenase Gene
EP0713916B1 (en) A recombinant beta-amylase
JPH11318441A (en) Ultra heat-resistant and ultra acid-resistant amylopullulanase
CN107236720B (en) Thermostable cellobiohydrolase
EP2436698A1 (en) Cellulolytic Clostridium acetobutylicum
JPH10150986A (en) Super thermostable 4-alpha-glucanotransferase
JP4257979B2 (en) Improved heat-resistant endoglucanase and its gene
WO2020213604A1 (en) NOVEL β-AMYLASE AND METHOD FOR UTILIZATION AND PRODUCTION THEREOF
JP5062730B2 (en) Improved thermostable cellulase

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 97198640.1

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WR Later publication of a revised version of an international search report
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1997943797

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 09280763

Country of ref document: US

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1997943797

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: CA

WWW Wipo information: withdrawn in national office

Ref document number: 1997943797

Country of ref document: EP