WO1992007935A1 - Glycosaminoglycan-targeted fusion proteins, their design, construction and compositions - Google Patents

Glycosaminoglycan-targeted fusion proteins, their design, construction and compositions

Info

Publication number
WO1992007935A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
residues
polypeptide
integer
sequences
Prior art date
Application number
PCT/US1991/008105
Other languages
French (fr)
Inventor
John A. Tainer
Leslie Kuhn
Maurice Boissinot
Cindy Fisher
Hans E. Parge
John H. Griffin
Guy T. Mullenbach
Robert A. Hallewell
Original Assignee
The Scripps Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Scripps Research Institute
Publication of WO1992007935A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0089 Oxidoreductases (1.) acting on superoxide as acceptor (1.15)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/81 Protease inhibitors
    • C07K14/8107 Endopeptidase (E.C. 3.4.21-99) inhibitors
    • C07K14/811 Serine protease (E.C. 3.4.21) inhibitors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/02 Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/20 Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/31 Fusion polypeptide fusions, other than Fc, for prolonged plasma life, e.g. albumin

Definitions

  • The present invention relates to glycosaminoglycan-binding fusion proteins and methods for designing and constructing the fusion proteins. More particularly, the invention relates to methods and compositions for extending in vivo lifetimes of biologically active compounds and targeting them to specific cell surfaces or substrates.
  • Fusion proteins may be comprised of homologous or heterologous sources of polypeptides so long as the polypeptides being fused are not typically associated together. Particularly interesting are fusion proteins comprised of polypeptides derived from independently folding structural regions (domains) of proteins that contain biological function.
  • fusion proteins have been formed using the techniques of genetic engineering. For example, a non-excretable protein can be fused to a β-lactamase moiety to give an excretable fused protein. At the genetic level, this fusion is
  • the fused protein is generated. See, e.g., Freifelder. Molecular
  • a DNA oligonucleotide can be prepared which codes for the hormone attached to a methionine group.
  • the synthetic oligonucleotide can be ligated to a cleaved vector adjacent to the lac Z gene for β-galactosidase in E. coli.
  • the enzyme region of the expressed protein can subsequently be removed by reaction with CNBr which cleaves the expressed protein at the methionine group.
  • the biologically active form of a peptide may be released by enzymatically removing the undesired protein fragment, e.g., with trypsin.
  • Many other variations can be envisioned. See, Maniatis et al. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY pp.422-433 (1982).
  • SOD superoxide dismutase
  • HSOD human intracellular SOD
  • SOD enzymes have also been implicated in preventing alloxan diabetes [Grankvist et al., Nature, 294:158 (1981)] and in preventing metastasis of certain forms of cancer (EPO Application No. 0332464).
  • Extracellular SOD (EC-SOD) is the major SOD enzyme in extracellular fluids.
  • Heparin is a glycosaminoglycan (GAG) that
  • GAG polysaccharide chains are covalently linked to polypeptide backbones to form proteoglycans. Seven different groups of GAGs are distinguished by the types of sugar residues, the type of linkage between the sugar residues, and the number and location of sulfate groups. The presence of sulfate groups as well as carboxyl groups gives GAGs a highly negative charge. The various forms of GAGs are distributed throughout the body in such areas as connective tissues, skin, cartilage, cornea, bone, blood vessels, lung, liver, cell surfaces, extracellular matrix and the like. In these areas, GAGs adopt an extended, random-coil conformation. GAGs are hydrophilic, forming hydrated gels at low concentrations. The negative charge of the chains attracts water as well as osmotically active cations. See, Lindahl et al., Annu. Rev. Biochem., 47:385-417 (1978); and Chakrabarti et al., CRC Crit. Rev. Biochem., 8:225-313 (1980).
  • EC-SOD is shown to be heterogeneous with regard to heparin binding. Marklund et al., Proc. Natl.
  • EC-SOD anti-inflammatory agent
  • recombinant EC-SOD has been expressed successfully only in mammalian cell cultures making recombinant EC-SOD very expensive to produce. Tibell et al., Proc. Natl. Acad. Sci. USA 84:6634-8 (1987).
  • recombinant EC-SOD is formed as a heterogeneous mixture of SOD enzymes due to variations in carbohydrate content and extent of proteolysis.
  • a recombinant EC-SOD also has been described by Marklund et al. (WO 8701387).
  • HSOD Intracellular human SOD
  • Chemical approaches include conjugating HSOD to polyethylene glycol [White et al., Superoxide and Superoxide Dismutase in
  • Native HSOD is a CuZn dimer having a molecular weight of 32,000 Daltons.
  • recombinant HSOD analog differs from HSOD in that it is not N-acetylated.
  • BTG's HSOD shows pharmacological activity in preclinical studies that is indistinguishable from the natural protein.
  • Recombinant HSOD has also been expressed in yeast
  • the long-lived variants of proposed pharmaceutical agents will preferably be non-immunogenic, i.e., not trigger an immune response and therefore be suitable for repeated therapeutic use in a particular host animal.
  • the long-lived variants of proposed pharmaceutical agents will preferably include functionalities that minimize the costs and complexities associated with employing such variants, e.g., by facilitating their purification from reaction mixtures.
  • a class of glycosaminoglycan (GAG)-binding moieties has been identified in the present invention that can be operatively linked to a preselected protein to form a fusion protein and thereby increase the stability, plasma half-life and ease of
  • the present invention contemplates a fusion polypeptide having a minimum of two independently folding protein moieties operatively linked into a single polypeptide.
  • a first moiety is a glycosaminoglycan (GAG)-binding moiety that provides a targeting function and introduces GAG-binding activity into the fusion protein.
  • the second moiety is a polypeptide having biological activity.
  • the present invention also affords a systematic method for identifying optimal configurations of fusion proteins having independently folding
  • amino acid residue sequences provide desired functionalities for the fused protein, and preferably contains a third amino acid residue sequence that serves to covalently link the first two sequences.
  • an amino acid residue sequence, which has superoxide dismutase activity, corresponding to that for HSOD is linked to a second sequence having glycosaminoglycan-binding activity.
  • a preferred linking sequence is Gly-Pro-Gly, which links the HSOD unit to the
  • compositions are also contemplated in which the compositions comprise a therapeutically effective amount of the fusion protein.
  • pharmacologically active compound in an animal's bloodstream is extended by administering the heparin-binding fused protein to the animal.
  • the fused protein will comprise the heparin-binding moiety.
  • FIG. 1 illustrates the nucleotide sequence of a DNA segment that codes for a GAG-binding fusion protein, shown from left-to-right and in the direction of 5'-terminus to 3'-terminus using the single letter nucleotide base code.
  • the structural gene for the mature fusion protein begins at base 67 and ends at base 579, with the position number of every tenth base residue in each row indicated above the row showing the sequence.
  • amino acid residue sequence for the fusion protein is indicated by the single letter code below the nucleotide base sequence, with the position number for the first residue in each row indicated to the left of the row showing the amino acid residue
  • the reading frame is indicated by placement of the deduced amino acid residue sequence below the nucleotide sequence such that the single letter that represents each amino acid is located below the first base in the corresponding codon.
  • the mature fusion protein amino acid residue sequence begins at residue 1 and ends at residue 171.
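The FIG. 1 coordinates can be cross-checked with a little arithmetic: bases 67 through 579 span 513 bases, which is exactly 171 codons, matching the stated 171 residues. The following is a minimal sketch of that check together with a translation helper that follows the reading-frame convention described above. The function names and the toy DNA string are illustrative assumptions; the actual FIG. 1 sequence is not reproduced in this text, and the standard genetic code is assumed.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_count(first_base, last_base):
    """Number of codons in a 1-based, inclusive base range."""
    n_bases = last_base - first_base + 1
    if n_bases <= 0 or n_bases % 3:
        raise ValueError("range is not a whole number of codons")
    return n_bases // 3


def translate(dna, first_base, last_base):
    """Translate a 1-based, inclusive base range into single-letter code."""
    codon_count(first_base, last_base)  # validates the range
    gene = dna[first_base - 1:last_base].upper()
    return "".join(CODON_TABLE[gene[i:i + 3]] for i in range(0, len(gene), 3))


# Coordinates stated for FIG. 1: structural gene from base 67 to base 579.
print(codon_count(67, 579))  # 171

# Hypothetical toy sequence (not from FIG. 1): three leader bases, then Gly-Pro-Gly.
print(translate("AAAGGTCCTGGT", 4, 12))  # GPG
```

Because 513 divides evenly into 171 codons, the stated base range appears to cover the coding region without a stop codon.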
  • N-terminal A+ helix crosses the lower third of the picture, with the N-terminus, residue 5 (α1-antitrypsin numbering), at far right and
  • residue 15 turns from the end of the helix, at center.
  • the H helix in the upper portion of the picture, starts at residue 269 (one residue to the right of the labeled residue, 270), and extends to residue 277 at upper left.
  • the positive charges on this helix are augmented by positive charges in residues 280-282, which have extended conformation.
  • the highly positive electrostatic potential (dots indicating a surface potential of ⁇ 3 kcal/mol) is generated by the many positive charges on these helices and constitutes the most favorable region on the PCI surface for binding of (negatively-charged) glycosaminoglycans. This arrangement of helices and their spacing is similar to the two-helix motif found for heparin binding in the crystallographic structure of platelet factor 4, a protein not homologous to PCI.
  • Figure 3 illustrates a model of the carboxy-terminal helices from the platelet factor 4 (PF4) dimer structure attached to the carboxy termini of the human superoxide dismutase (HSOD) dimer, comprising an optimal two-helix motif for glycosaminoglycan binding.
  • PF4 platelet factor 4
  • HSOD human superoxide dismutase
  • Residues 75-85 and 175-185 are the C-terminal helices of the two monomers in the PF4 crystal structure dimer.
  • the chains of the two dimers of HSOD are numbered 1-153 and 201-353.
  • FIG. 4 is a block diagram of the system of the present invention.
  • Amino Acid Residue: An amino acid formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages.
  • the amino acid residues described herein are preferably in the "L" isomeric form. However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide.
  • NH 2 refers to the free amino group present at the amino terminus of a polypeptide.
  • COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide.
  • amino acid residue is broadly defined to include the amino acids listed in the Table of
  • Base Pair: a partnership of adenine (A) with thymine (T), or of cytosine (C) with guanine (G) in a double-stranded DNA molecule.
  • A adenine
  • T thymine
  • C cytosine
  • G guanine
  • U uracil
  • Base pairs are said to be "complementary" when their component bases pair up normally when a DNA or RNA molecule adopts a double-stranded configuration.
  • Complementary Nucleotide Sequence: a sequence of nucleotides in a single-stranded molecule of DNA or RNA that is sufficiently complementary to another single strand to specifically (non-randomly) hybridize to it with consequent hydrogen bonding.
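The base-pairing rule above (A with T, C with G) can be illustrated with a short sketch that builds the exact complement of a DNA strand, read 5' to 3'. This is a simplification: the definition above only requires strands to be "sufficiently" complementary to hybridize, whereas this hypothetical helper tests for an exact complement only.

```python
# Watson-Crick pairs for DNA, as listed in the Base Pair definition.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the exact complementary strand, written 5' to 3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(strand.upper()))


def is_exact_complement(strand_a, strand_b):
    """True if strand_b is the exact complement of strand_a (both given 5' to 3')."""
    return reverse_complement(strand_a) == strand_b.upper()


print(reverse_complement("ATGC"))           # GCAT
print(is_exact_complement("ATGC", "GCAT"))  # True
```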
  • nucleotide sequence is conserved with respect to a preselected (reference) sequence if it non-randomly hybridizes to an exact complement of the preselected sequence.
  • Duplex DNA: a double-stranded nucleic acid molecule comprising two strands of substantially complementary polynucleotides held together by one or more hydrogen bonds between each of the complementary bases present in a base pair of the duplex. Because the nucleotides that form a base pair can be either a ribonucleotide base or a deoxyribonucleotide base, the phrase "duplex DNA" refers to either a DNA-DNA duplex comprising two DNA strands (ds DNA), or an RNA-DNA duplex comprising one DNA and one RNA strand.
  • Fusion Protein: A protein comprised of at least two polypeptides. In some cases, a linking sequence is present to operatively link the two polypeptides into one continuous polypeptide (i.e., fusion
  • At least one, and preferably two, of the polypeptides comprising a fusion protein is
  • the two polypeptides linked in a fusion protein are typically derived from two
  • a fusion protein comprises two linked polypeptides not normally found linked in nature.
  • Gene: a nucleic acid whose nucleotide sequence codes for an RNA, DNA or polypeptide molecule. Genes may be uninterrupted sequences of nucleotides or they may include such intervening segments as introns, promoter regions, splicing sites and repetitive sequences. A gene can be either RNA or DNA.
  • Hybridization: the pairing of complementary nucleotide sequences (strands of nucleic acid) to form a duplex, heteroduplex, or complex containing more than two single-stranded nucleic acids, by
  • Hybridization is a
  • Linking Sequence: an amino acid residue sequence comprising zero to seven amino acid residues.
  • a linking sequence serves to chemically link two
  • Nucleotide: a monomeric unit of DNA or RNA consisting of a sugar moiety (pentose), a phosphate group, and a nitrogenous heterocyclic base.
  • the base is linked to the sugar moiety via the glycosidic carbon (1' carbon of the pentose) and that combination of base and sugar is a nucleoside.
  • nucleoside contains a phosphate group bonded to the 3' or 5' position of the pentose, it is referred to as a nucleotide.
  • nucleotides is typically referred to herein as a "base sequence" or "nucleotide sequence", and their
  • Nucleotide Analog: a purine or pyrimidine nucleotide that differs structurally from an A, T, G, C, or U base, but is sufficiently similar to
  • Inosine (I) is a nucleotide analog that can hydrogen bond with any of the other nucleotides, A, T, G, C, or U. In addition, methylated bases are known that can participate in nucleic acid hybridization.
  • Polynucleotide: a polymer of single or double stranded nucleotides.
  • polynucleotide and its grammatical equivalents will include the full range of nucleic acids.
  • a polynucleotide will typically refer to a nucleic acid molecule comprised of a linear strand of two or more deoxyribonucleotides and/or ribonucleotides. The exact size will depend on many factors, which in turn depend on the ultimate conditions of use, as is well-known in the art.
  • the polynucleotides of the present invention include primers, probes, RNA/DNA segments, oligonucleotides or "oligos" (relatively short polynucleotides), genes, vectors, plasmids, and the like.
  • Polypeptide or Peptide or Protein: a linear series of at least two amino acid residues in which adjacent residues are connected by peptide bonds between the alpha-amino group of one residue and the alpha-carboxy group of an adjacent residue.
  • Recombinant DNA (rDNA) molecule: a DNA molecule produced by operatively linking a nucleic acid
  • a recombinant DNA molecule is a hybrid DNA molecule comprising at least two nucleotide sequences not normally found together in nature. rDNAs not having a common biological origin, i.e., evolutionarily different, are said to be "heterologous".
  • Vector: a DNA molecule capable of autonomous replication in a cell and to which a DNA segment, e.g., gene or polynucleotide, can be operatively linked so as to bring about replication of the attached segment.
  • Vectors capable of directing the expression of genes encoding one or more proteins are referred to herein as "expression vectors".
  • a glycosaminoglycan (GAG)-binding fusion protein i.e., a GAG-targeted fusion protein, is a protein comprising two functional elements defined by two independently folding polypeptides that are
  • the first functional element is a GAG-binding moiety comprised of a first polypeptide that independently folds into a functional three-dimensional protein structural domain having GAG-binding activity.
  • the second functional element is a biologically active moiety comprised of a second polypeptide that independently folds into a functional three-dimensional protein structural domain having a preselected biological activity.
  • GAGs Glycosaminoglycans
  • A GAG-binding moiety for use in the present invention is a polypeptide sequence that has an affinity for binding with a GAG, and is therefore useful in a fusion protein of the present invention to target the fusion protein to the vicinity of a GAG molecule.
  • a GAG-binding moiety for use in a fusion protein includes the following structural parameters as defined by the teachings of the present invention:
  • the GAG-binding moiety is a polypeptide of 6-20 amino acid residues in length
  • a helix-promoting residue is one of the following: leucine, alanine, glutamic acid, phenylalanine, threonine, isoleucine, serine, tyrosine, valine, asparagine, lysine, arginine, and aminoisobutyric acid;
  • polypeptide comprising the GAG-binding moiety exhibits amphipathic character when modeled as an α-helix
  • the GAG-binding moiety contains no more than two helix-breaking residues (e.g., glycine or proline).
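Two of the structural parameters listed above (a length of 6-20 residues and no more than two helix-breaking residues) are simple to test directly. The sketch below does that, and adds one illustrative measure of how tightly the positive charges cluster on one face of an idealized α-helix, using the textbook spacing of 100° per residue (3.6 residues per turn). That score is an assumption introduced here for illustration; it is not the source's definition of amphipathic character, and all function names are hypothetical.

```python
import math

HELIX_BREAKERS = set("GP")  # glycine and proline, per the text
POSITIVE = set("RK")        # arginine and lysine


def basic_checks(seq):
    """Check the length and helix-breaker limits stated in the text."""
    seq = seq.upper()
    return {
        "length_6_to_20": 6 <= len(seq) <= 20,
        "at_most_two_breakers": sum(r in HELIX_BREAKERS for r in seq) <= 2,
    }


def positive_face_score(seq, degrees_per_residue=100.0):
    """Mean resultant length (0 to 1) of R/K positions on a helical wheel.

    Values near 1 mean the positive residues point in a similar direction;
    values near 0 mean they are spread around the helix axis.
    """
    angles = [
        math.radians(i * degrees_per_residue)
        for i, residue in enumerate(seq.upper())
        if residue in POSITIVE
    ]
    if not angles:
        return 0.0
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(x, y)


# PF4 C-terminal helix sequence quoted elsewhere in this document.
print(basic_checks("YKKIIKKLLES"))
print(round(positive_face_score("YKKIIKKLLES"), 2))
```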
  • a general formula for a GAG-binding moiety for use in the present invention is a polypeptide
  • $X_g[+]_hX_i[+]_jX_k[+]_lX_m[+]_nX_o$ where [+] and X are amino acid residues (designated in single letter code); [+] is R or K; X is L, A, E, F, T, I, S, Y, V, N, K, R or aminoisobutyric acid; g is an integer from 0-9, h is an integer from 1-3, i is an integer from 1-5, j is an integer from 1-3, k is an integer from 1-7, l is an integer from 0-7, m is an integer from 0-7, n is an integer from 0-2, and o is an integer from 0-2; and with the proviso that g+h+i+j+k+l+m+n+o is equal to or less than 20.
  • X can be H, Q, M, C, W, D, G or P and X contains zero to two of the helix-breaking residues selected from the group consisting of G and P.
  • Polypeptide sequences having an amino acid residue sequence according to the above formula represent a GAG-binding moiety for use in the present invention and can be designed de novo or can be identified from known protein sequences.
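The general formula above maps naturally onto a regular expression with bounded repeats, which gives one way to screen candidate sequences. The following is a minimal sketch under stated assumptions: only the primary residue set for X is used (the alternative residues H, Q, M, C, W, D, G and P are not), aminoisobutyric acid is omitted because it has no standard single-letter code, and the function name is hypothetical.

```python
import re

X = "[LAEFTISYVNKR]"  # helix-promoting residues listed for X (Aib omitted)
PLUS = "[RK]"         # the [+] residues

# (residue class, minimum, maximum) for g, h, i, j, k, l, m, n, o in order.
RANGES = [
    (X, 0, 9), (PLUS, 1, 3), (X, 1, 5), (PLUS, 1, 3), (X, 1, 7),
    (PLUS, 0, 7), (X, 0, 7), (PLUS, 0, 2), (X, 0, 2),
]
GENERAL_FORMULA = re.compile(
    "".join(f"{cls}{{{lo},{hi}}}" for cls, lo, hi in RANGES)
)


def fits_general_formula(seq):
    """True if seq can be parsed by the formula and has at most 20 residues."""
    seq = seq.upper()
    return len(seq) <= 20 and GENERAL_FORMULA.fullmatch(seq) is not None


# PF4 C-terminal helix: Y / KK / II / KK / LLES fits with g=1, h=2, i=2, j=2, k=4.
print(fits_general_formula("YKKIIKKLLES"))  # True
print(fits_general_formula("LLLLLL"))       # False, no R or K present
```

Because K and R belong to both residue classes, a sequence may fit the formula in more than one way; the regular expression only reports whether at least one valid assignment exists.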
  • the polypeptide sequence is selected from a protein having known GAG-binding activity. Proteins of known sequence that have known GAG-binding activity have been described extensively in the literature and are summarized in Table 1.
  • Neural cell adhesion protein (Mouse); Neural cell adhesion protein (Rat)
  • Preferred GAG-binding moieties are polypeptides including the sequences having the formula: $[+]_2X_2[+]_2$ where [+] and X are as described above. Particularly preferred in this embodiment is the polypeptide having the sequence YKKIIKKLLES, which is derived from the platelet factor 4 (PF4) C-terminal helix.
  • GAG-binding moiety is the polypeptide including the formula: $[+]X_3[+]X_2[+]_3$, wherein the [+] and X are as defined above.
  • this embodiment is the polypeptide having the sequence HRHHPREMKKRVEDL, derived from the amino terminus of protein C inhibitor and corresponding to the A+ helix as described herein.
  • GAG-binding moiety is the polypeptide including the formula: $[+]X_2[+]_2X[+]$, wherein the [+] and X are as defined above.
  • preferred polypeptide according to this embodiment is an internal sequence corresponding to a section of C-terminal end of the D helix of antithrombin III having the sequence KLNCRLYRKANK.
  • polypeptide according to the above formula is the internal H helix of protein C inhibitor having the sequence EKTLRKWLK.
  • GAG-binding moiety for use in the present invention is a polypeptide including the formula: $[+]_2X_5[+]X_3[+]$ where [+] and X are as defined above.
  • a preferred polypeptide according to this embodiment is a section of the N-terminal end of the internal A helix of antithrombin III having the sequence RRVWELSKANSR.
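The shorter preferred formulas can be located inside a longer sequence by pattern search. The sketch below checks three formula and sequence pairs quoted in this document (the PF4 C-terminal helix, the protein C inhibitor A+ helix, and the antithrombin III A helix section). It assumes X may be any residue from either X list given in the text, which together cover the 20 standard amino acids; the dictionary keys and function name are illustrative only.

```python
import re

ANY_X = "[LAEFTISYVNKRHQMCWDGP]"  # both X lists from the text combined
PLUS = "[RK]"                     # the [+] residues

PREFERRED_MOTIFS = {
    "[+]2 X2 [+]2": re.compile(f"{PLUS}{{2}}{ANY_X}{{2}}{PLUS}{{2}}"),
    "[+] X3 [+] X2 [+]3": re.compile(
        f"{PLUS}{ANY_X}{{3}}{PLUS}{ANY_X}{{2}}{PLUS}{{3}}"
    ),
    "[+]2 X5 [+] X3 [+]": re.compile(
        f"{PLUS}{{2}}{ANY_X}{{5}}{PLUS}{ANY_X}{{3}}{PLUS}"
    ),
}


def find_motif(formula, seq):
    """Return (1-based start position, matched segment), or None if absent."""
    match = PREFERRED_MOTIFS[formula].search(seq.upper())
    return None if match is None else (match.start() + 1, match.group())


print(find_motif("[+]2 X2 [+]2", "YKKIIKKLLES"))            # (2, 'KKIIKK')
print(find_motif("[+] X3 [+] X2 [+]3", "HRHHPREMKKRVEDL"))  # (2, 'RHHPREMKKR')
print(find_motif("[+]2 X5 [+] X3 [+]", "RRVWELSKANSR"))     # (1, 'RRVWELSKANSR')
```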
  • GAG-binding moiety for use in a fusion protein is the PCI A+ helix identified above and utilized in the HSOD-A+ fusion protein described in Example 5.
  • GAG-binding moiety for use in the present invention is a polypeptide having the sequence
  • a linker or linking means for use in the present invention to connect a glycosaminoglycan (GAG)-binding moiety to a biologically active moiety in the fusion protein of the present invention has a structure that depends on the amino acid sequence of the two moieties being linked. Considerations for selection of a linker are discussed in detail herein and in the discussion of identifying linkers using SEARCHWILD.
  • a linking means operatively links two polypeptide portions of a fusion protein through peptide bonds and in one embodiment is comprised of zero or more amino acid residues, i.e., a linker sequence, and is typically less than 20 amino acid residues, preferably less than seven residues, and more preferably is three residues.
  • Although a fusion protein can simply have a peptide bond as the operative linkage (linking means) between two polypeptide domains of the fusion protein, it is more typical that the linking means in a fusion protein is one or more residues in a linker sequence for operatively connecting the protein's independently folding polypeptide domains.
  • parvalbumin B (PDB code 3cpv).
  • linker sequence REACG corresponding to residues 113-117 of p-hydroxybenzoate hydroxylase ternary complex (PDB 1phh); the linker VE
  • linkers can be included in a fusion protein of the present invention involving a reverse turn of 4 residues in length.
  • Reverse turns are preferred because they are surface-exposed, well-defined structures stabilized by internal hydrogen bonds between residues within the turn [Richardson et al., Adv. Prot. Chem., 34:168-364 (1981)] and because the preferred residue types in the four positions of type I and type II turns, the most common reverse turns, are known. Wilmot et al., J. Mol. Biol.,
  • polypeptide sequences NDSG, NSSG, NSRG, and NSDG.
  • a preferred linker includes the amino acid residues Gly-Pro-Gly.
  • An important consideration in the linker is to provide a short extension away from one polypeptide structure into another polypeptide structure that confers a sharp turn such that the two polypeptides can lie against one another.
  • the three residues Gly-Pro-Gly provide an appropriate length for such a turn between the two structural elements provided by many such polypeptide moieties of a fusion protein.
  • Glycine has a high degree of conformational flexibility and thus allows the two structural elements which are joined to interact in an optimal way.
  • the rigid, kink-forming residue proline has high propensity to form turns.
  • the Gly-Pro-Gly structure forms turns with rotational flexibility at the ends.
  • both glycine and proline have small sidechains and are less likely to cause packing problems between the structural elements of the two polypeptide moieties.
  • Gly-Pro-Gly is particularly preferred as a linker and is utilized in the HSOD-A+ fusion protein
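The linker guidance above can be expressed as a small data structure that joins a biologically active moiety, a linker and a GAG-binding moiety into one sequence. This is a minimal sketch with several assumptions: the class and function names are hypothetical, the HSOD sequence is replaced by a 153-residue placeholder (this text gives only the chain numbering 1-153, not the sequence), and placing the GAG-binding moiety at the carboxy terminus is an assumption modeled on the Figure 3 description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionDesign:
    active_moiety: str       # biologically active polypeptide
    linker: str              # linker sequence, may be empty
    gag_binding_moiety: str  # GAG-binding polypeptide

    def sequence(self):
        """Concatenate the parts from amino terminus to carboxy terminus."""
        return self.active_moiety + self.linker + self.gag_binding_moiety


def linker_preference(linker):
    """Classify a linker by the length preferences stated in the text."""
    n = len(linker)
    if n == 3:
        return "more preferred (three residues)"
    if n < 7:
        return "preferred (fewer than seven residues)"
    if n < 20:
        return "typical (fewer than 20 residues)"
    return "outside the typical range"


HSOD_PLACEHOLDER = "X" * 153  # placeholder only, not the real HSOD sequence
A_PLUS_HELIX = "HRHHPREMKKRVEDL"  # protein C inhibitor A+ helix from the text

design = FusionDesign(HSOD_PLACEHOLDER, "GPG", A_PLUS_HELIX)
print(len(design.sequence()))              # 171
print(linker_preference(design.linker))    # more preferred (three residues)
```

The total of 153 + 3 + 15 = 171 residues equals the 171 residues stated for the mature fusion protein of FIG. 1, although this text does not say explicitly that FIG. 1 depicts this particular construct.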
  • Biologically active polypeptide moieties for use in a fusion protein of this invention can be derived from any number of proteins of known primary amino acid residue sequence that provide therapeutic
  • At least four classes of proteins of known structure have potential therapeutic applications that can benefit from having glycosaminoglycan-binding properties either to
  • proteins include the following: 1) serine proteases; 2) protease inhibitors including serine protease inhibitors, which are also called serpins; 3) antioxidant enzymes; and 4) receptors and immunoglobulins.
  • Tissue-plasminogen activator (tPA), urokinase, and single-chain urokinase-like plasminogen activator (scuPA) [Haber et al., Science, 243:51-56 (1989)] are representative proteases in the serine protease class of proteins. These proteases are used for the
  • tPA and urokinase can benefit from the addition of a glycosaminoglycan-binding domain to their structure.
  • protease consists of a protease and a binding domain, the latter of which promotes binding to heparin.
  • Upon binding to heparin, however, the proteases naturally undergo a cleavage resulting in a separation of the protease domain from the binding domain.
  • The tPA protease domain is more active in dissolving blood clots than the full-length form.
  • the tPA protease domain, although more active, lacks heparin-binding capacity.
  • the construction of an expression vector in which a glycosaminoglycan-binding domain is incorporated into the serine protease domain of these molecules can correct this deficiency.
  • the resulting fusion protein can then have improved stability and clot targeting capacity compared to a tPA protease domain alone.
  • Protease inhibitors, categorized as anti-proteases, include alpha-1-antitrypsin, acid-stable proteinase inhibitor and human secretory leukocyte protease inhibitor.
  • serpin is synonymous with serine protease inhibitors. The prototypical serine protease inhibitor or serpin is alpha-1-antitrypsin. This protein is the principal natural inhibitor of the protease leukocyte elastase, which is known to cause major tissue damage in lung inflammatory diseases like emphysema. Elastase is also known to be closely associated with various glycosaminoglycans found in tissue. Travis et al., Am. J. Medicine, 84 (sup. 6A): 37-42 (1988). Heparan sulfate, a glycosaminoglycan, is a major component of the alveolar interstitial tissue where the protease damage occurs. Crystal et al., In: Pulmonary Diseases and Disorders. 2nd ed., Fishman, Ed., McGraw-Hill Book Company (1987). Alpha-1-antitrypsin does not
  • alpha-1-antitrypsin with a glycosaminoglycan-binding moiety should be more readily targeted to the site of action in the alveolar interstitial tissue and have an increased half-life.
  • SLPI human secretory leukocyte protease inhibitor
  • SLPI serine protease inhibitor in that it can reversibly inhibit a broad range of proteases involved in tissue damage during inflammation and also has a unique ability to gain access to proteolytic
  • SLPI sequestered microenvironments that are inaccessible to other fluid-phase inhibitors.
  • Because SLPI has a low molecular weight (11,700 D), it is expected to be rapidly cleared by kidney filtration.
  • the addition of a glycosaminoglycan-binding peptide to SLPI can increase its half-life and can result in direct cell surface targeting.
  • antioxidants which function as anti-inflammatory agents.
  • the medically important antioxidant enzymes of known structures are superoxide dismutase, catalase and glutathione peroxidase. These enzymes are involved in the prevention of
  • Targeting of immunoglobulins is normally an intrinsic function of these proteins. Antibodies to circulatory antigens and antibodies that have a catalytic function, however, can benefit from a heparin-binding moiety. Since immunoglobulins have a beta-barrel fold like that of SOD, they will behave like SOD upon fusion with a glycosaminoglycan-binding domain. Catalytic antibodies that are more stable and have enhanced tissue-targeting ability can provide a more efficient therapeutic agent. Alternatively, for industrial or purification processes, a glycosaminoglycan-binding domain added to a catalytic or standard variable domain antibody, or to a single-chain antibody, can enable the antibody to be immobilized via the glycosaminoglycan-binding domain on a solid support such as a heparin-sepharose column. Recovery of the free catalyst or antibody can be easily accomplished by a gentle salt gradient elution as described for HSOD-A+.
  • many receptors are known to have an immunoglobulin-like structure.
  • the CD4 soluble receptor and especially the V1 domain are contemplated as a potential drug against HIV. Ashkenazi et al., Proc. Natl. Acad. Sci., 87:7175-7154 (1990). These molecules can have enhanced tissue targeting and stability via glycosaminoglycan binding.
  • proteins can benefit from an increased circulatory half-life and tissue targeting through an attached glycosaminoglycan-binding function.
  • Solubilized domains from medically important receptors represent a major potential application.
  • transmembrane domain of the full-length molecule have been removed without affecting the activity.
  • glycosaminoglycan-binding peptide should not be
  • glycosaminoglycan-binding peptide can prove to be a general strategy for the surface targeting of soluble forms of receptors.
  • subtilisin carlsberg (subtilopeptidase A)
  • subtilisin BPN' (e.c.3.4.21.14)
  • subtilisin carlsberg e.c.3.4.21.14
  • n-acetyl eglin-c 1sic subtilisin /bpn(prime)
  • streptomyces subtilisin inhibitor e.c.3.4.21.14
  • subtilisin novo e.c.3.4.21.14 complex with chymotrypsin inhibitor 2 (CI-2)
  • 3sgb proteinase b from streptomyces griseus
  • beta-trypsin orthorhombic at pH 5.0
  • beta-trypsin (e.c.3.4.21.4) complex with
  • alpha-lytic protease e.c.3.4.21.12
  • alpha chymotrypsin a tosylated (e.c.3.4.21.1)
  • 4cha alpha-chymotrypsin (e.c.3.4.21.1)
  • alpha chymotrypsin a (e.c.3.4.21.1)
  • alpha chymotrypsin a (e.c.3.4.21.1) complex
  • PEBA phenylethane boronic acid
  • alpha-chymotrypsin e.c.3.4.21.1
  • turkey ovomucoid third domain OMTKY3
  • subtilisin carlsberg e.c.3.4.21.14
  • hne human neutrophil elastase (hne) (e.c.3.4.21.37)
  • 1ntp modified beta trypsin (monoisopropylphosphoryl inhibited) (e.c.3.4.21.4) (neutron data) 1p01 : alpha-lytic protease (e.c.3.4.21.12) complex with boc-*ala-*pro-*valine boronic acid
  • alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*alanine boronic acid
  • alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*valine boronic acid
  • alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*isoleucine boronic acid
  • alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*norleucine boronic acid
  • trypsin orthorhombic, 2.4 m ammonium sulfate
  • 3ptn trypsin (trigonal, 2.4 m ammonium sulfate)
  • 3rp2 rat mast cell protease ii (rmcpii)
  • beta-trypsin (e.c.3.4.21.4) complex with
  • actinidin sulfhydryl proteinase
  • Acid Proteinases and Their Inhibitors
  • 4ape : acid proteinase (e.c.3.4.23.10), endothiapepsin
  • 2app : acid proteinase (e.c.3.4.23.7), penicillopepsin
  • 2apr : acid proteinase (rhizopuspepsin) (e.c.3.4.23.6)
  • 3apr : acid proteinase (rhizopuspepsin) (e.c.3.4.23.6) complex with reduced peptide inhibitor
  • pepsin e.c.3.4.23.1
  • pepsin e.c.3.4.23.1
  • pepsin e.c.3.4.23.1
  • 2fb4 : immunoglobulin fab
  • 1fbj : IgA fab fragment (j539) (galactan-binding)
  • 1fc1 : fc fragment (IgG1 class)
  • antioxidant enzymes of the superoxide dismutase (SOD) class are particularly preferred.
  • cauliflower SOD [Steffens et al., Biol. Chem. Hoppe-Seyler, 367:1007-1016 (1986)]; cabbage SOD [Steffens et al., Physiol. Chem., 367:1007-1016 (1986)]; maize SOD [Cannon et al., Proc. Natl. Acad. Sci. USA,
  • SOD fusion proteins are particularly preferred and an exemplary embodiment using HSOD has been prepared in Example 6.
  • a SOD-containing-fusion protein is also referred to as a SOD-GAG-binding protein.
  • This embodiment, designated HSOD-A+ fusion protein, has an amino acid residue sequence for the mature, expressed protein as shown in Figure 1 from residue 1 to residue 171.
  • linker sequences and GAG-binding moieties identified herein as preferred are also contemplated.
  • a fusion protein has a polypeptide sequence that corresponds, and preferably is identical, to the formula A+-L-HSOD, where A+ is at the amino terminus and HSOD is at the carboxy terminus, A+ corresponds to the PCI A+ GAG-binding helix having the formula HRHHPREMKKRVED, HSOD is a polypeptide having an amino acid residue sequence that corresponds, and preferably is identical, to the sequence in Figure 1 from residue 1 to residue 153, and L represents an operative linkage between A+ and HSOD in the form of either a peptide bond, i.e., no intervening amino acid residues, or one of the linker polypeptides YYK or SMD.
  • a fusion protein has a polypeptide sequence that corresponds, and preferably is identical, to the formula HSOD-L-PF4+, where HSOD is at the amino terminus and PF4+ is at the carboxy terminus, HSOD has a sequence as defined above, PF4+ is a GAG-binding helix having the formula YKKIIKKLLES, and L is either a peptide bond or is one of the linker polypeptides DEDG or IGVMP.
  • Another embodiment contemplates a fusion protein having a polypeptide sequence that corresponds, and preferably is identical, to the formula PF4+-L-HSOD, where PF4+ is at the amino terminus and HSOD is at the carboxy terminus, PF4+ is the polypeptide defined above, HSOD has a sequence as defined above, and L is one of the linker polypeptides REACG, VE or VMAS.
  • fusion proteins that contain a single GAG-binding moiety operatively linked to a single independently folding protein domain (i.e., a biologically active polypeptide moiety), where the GAG-binding moiety is linked at either its carboxy or amino terminus.
  • a fusion protein is also contemplated where more than one GAG-binding moiety is linked, for example, one at the carboxy and one at the amino terminus of the biologically active polypeptide moiety.
  • a fusion protein having a polypeptide sequence that corresponds, and preferably is identical, to the formula A+-L1-HSOD-L2-PF4+, where A+ is at the amino terminus and PF4+ is at the carboxy terminus, A+ corresponds to a polypeptide of the formula HRHHPREMKKRVED, PF4+ is the polypeptide as defined above, HSOD has a sequence as defined above, L1 is either a peptide bond or is one of the linker polypeptides YYKK or SMD, and L2 is either a peptide bond or is one of the linker polypeptides DEDG or IGVMP.
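The fusion formulas listed above amount to joining moieties and linkers from the amino terminus to the carboxy terminus. The following is a minimal sketch of that bookkeeping, not the construction method of the invention. The helper `fuse` is hypothetical, and `HSOD_PLACEHOLDER` is a dummy string of 153 residues standing in for the HSOD sequence of Figure 1, which is not reproduced in this text.

```python
# GAG-binding helices quoted in the text (one-letter amino acid codes).
A_PLUS = "HRHHPREMKKRVED"    # PCI A+ GAG-binding helix
PF4_PLUS = "YKKIIKKLLES"     # PF4+ GAG-binding helix

# Placeholder only: the real HSOD sequence is residues 1-153 of Figure 1.
HSOD_PLACEHOLDER = "X" * 153


def fuse(*parts: str) -> str:
    """Join moieties and linkers in N- to C-terminal order.

    An empty string stands for a direct peptide bond (no linker residues).
    """
    for part in parts:
        if part and not part.isalpha():
            raise ValueError(f"not a one-letter residue sequence: {part!r}")
    return "".join(part.upper() for part in parts)


# A+-L-HSOD with the linker SMD
a_plus_hsod = fuse(A_PLUS, "SMD", HSOD_PLACEHOLDER)

# HSOD-L-PF4+ with the linker DEDG
hsod_pf4 = fuse(HSOD_PLACEHOLDER, "DEDG", PF4_PLUS)

# A+-L1-HSOD-L2-PF4+ with L1 = SMD and L2 = IGVMP
double_fusion = fuse(A_PLUS, "SMD", HSOD_PLACEHOLDER, "IGVMP", PF4_PLUS)
```

Passing `""` as a linker models the peptide-bond case, so the same helper covers every variant of L named in the text.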
  • GAG-binding affinity is optimized by choosing linkers that encourage the GAG-binding helices to adopt the arrangement found in PF4 dimers (where each monomer contributes one helix) and in PCI; namely, two amphipathic, positively charged α-helices that lie roughly in a plane, that are aligned side-by-side, and that have parallel or anti-parallel axes separated by 10-14 angstroms.
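The geometric criterion above (parallel or anti-parallel helix axes separated by 10-14 angstroms) can be checked on a model with a few lines of vector arithmetic. This is an illustrative sketch under stated assumptions: each helix axis is approximated by a straight segment between two points, the separation is taken as the distance between segment midpoints, and the 30 degree tolerance for "roughly parallel" is a hypothetical choice that does not come from the text.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def helix_pair_geometry(start1, end1, start2, end2):
    """Return (angle between axes in degrees, midpoint separation in angstroms)."""
    axis1 = _sub(end1, start1)
    axis2 = _sub(end2, start2)
    cos_angle = _dot(axis1, axis2) / (_norm(axis1) * _norm(axis2))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle = math.degrees(math.acos(cos_angle))
    mid1 = tuple((a + b) / 2 for a, b in zip(start1, end1))
    mid2 = tuple((a + b) / 2 for a, b in zip(start2, end2))
    return angle, _norm(_sub(mid1, mid2))


def matches_target_arrangement(angle, separation, angle_tol=30.0):
    """True if axes are near-parallel or near-anti-parallel and 10-14 angstroms apart."""
    roughly_parallel = angle <= angle_tol or angle >= 180.0 - angle_tol
    return roughly_parallel and 10.0 <= separation <= 14.0
```

For a real model the axis endpoints would have to be derived from atomic coordinates, for example from averaged backbone positions at each end of the helix; that step is outside this sketch.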
  • the PF4 dimer helices have been superimposed by molecular graphics on the HSOD dimer, such that the two fold symmetry axes of the HSOD and PF4 dimers are coincident and the N-termini of the PF4 helices are as close to the C-termini of the HSOD monomers as
  • a linker of at least five residues in length is preferred.
  • the extra two to three residues for the PF4 linker relative to the A+ linker are required because the PF4 helix is one turn shorter than the A+ helix-containing peptide.
  • the fusion protein may contain two GAG-binding moieties, one attached at each end of the biologically active polypeptide.
  • a fusion protein can contain more than one GAG-binding moiety in tandem at a terminus of a biologically active polypeptide, for example, according to the general formula: -Y-L-Zn- or -Zn-L-Y-, where Y is a biologically active
  • L is a linking means
  • Z is a GAG-binding moiety
  • n is an integer of about 1-5, preferably about 2.
  • multiple GAG-binding moieties can be positioned according to the formula: -(Z-L)n-Y- or -Y-(L-Z)n-, where L, Z and Y are as defined above and n is an integer from 1 to 5, and preferably is 1, 2 or 3.
  • Y is HSOD, preferably
  • L is methionine (M)
  • n is 1, 2 or 3.
  • a GAG-binding protein comprises a polypeptide including the formula -Y-L-Z-, where Y and Z are amino acid residue sequences, L is a linking means, Y comprises a polypeptide having biological activity as described herein, and Z is a GAG-binding moiety according to the general formula described above, with the proviso that when L is -GPG-, Z is not -LWERQ-; and the proviso that when L is -PLY-, Z is not -YKKII-.
  • the GAG-binding protein comprises a polypeptide including the formula -Z-L-Y-, where Z, L, and Y are as described above, with the proviso that when L is -HVG-, Z is not -RVEDL-.
  • Another embodiment contemplates inclusion of multiple GAG-binding moieties associated with a biologically active polypeptide moiety according to the formula -Ub-(Z-L)a-Y- or -Y-(L-Z)a-Ub-, where Z, L and Y are as defined before, U is an amino acid, a is an integer from 1 to 10, and b is an integer from 0 to 1.
  • fusion protein where L is methionine, Z is a polypeptide according to the sequence -RVPRESGKKRKRKRLKPS-, Y is a polypeptide having an amino acid residue sequence that corresponds to the sequence shown in Figure 1 from residue 1 to residue 171, a is 1, 2, or 3 and b is 1.
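The repeat formulas above, with a GAG-binding moiety Z and linker L repeated a times and an optional single residue U, can be expanded mechanically. The sketch below is only an illustration of how the formula unfolds into a sequence. The function name `tandem_fusion` and its argument names are hypothetical, and the identity of U is left to the caller because the text says only that U is an amino acid.

```python
def tandem_fusion(y: str, z: str, linker: str, a: int,
                  u: str = "", n_terminal: bool = True) -> str:
    """Expand -Ub-(Z-L)a-Y- (n_terminal=True) or -Y-(L-Z)a-Ub- (n_terminal=False).

    y      : biologically active polypeptide (Y)
    z      : GAG-binding moiety (Z)
    linker : linking residues (L); "" for a direct peptide bond
    a      : number of (Z-L) repeats, 1 to 10
    u      : "" when b = 0, or a single residue when b = 1
    """
    if not 1 <= a <= 10:
        raise ValueError("a must be an integer from 1 to 10")
    if len(u) > 1:
        raise ValueError("b is 0 or 1, so U is at most one residue")
    if n_terminal:
        return u + (z + linker) * a + y
    return y + (linker + z) * a + u


# Z and L as quoted in the text; Y here is a short dummy stand-in for the
# Figure 1 sequence, and U = "M" is an arbitrary choice for illustration.
Z = "RVPRESGKKRKRKRLKPS"
example = tandem_fusion(y="XXXXX", z=Z, linker="M", a=2, u="M")
```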
  • Fusion proteins comprising a preselected biologically active polypeptide moiety operatively linked to a glycosaminoglycan (GAG)-binding moiety are particularly useful due to the properties that the GAG-binding moiety imparts on the fusion protein.
  • GAG glycosaminoglycan
  • Therapeutic proteins administered to the blood are cleared from the blood. Addition of a GAG-binding moiety imparts a targeting function that directs the fusion protein to GAGs in the blood vessel wall and into the tissues rather than into the general
  • the targeting function takes the fusion protein away from free circulation, thereby increasing the fusion protein's effective half-life and
  • addition of a GAG-binding moiety imparts a means to more readily isolate a fusion protein from the expression medium in which it was synthesized or the fluid in which it is
  • the preparation of a fusion protein of this invention involves a combination of molecular
  • design considerations are resolved in the present invention by computer modeling methods that determine the regions of independently folding protein domains and particularly that design suitable linkers to combine two or more biologically active polypeptide moieties to form the fusion protein.
  • A detailed description of the modeling of the protein C inhibitor is provided in Example 1. The methods generally involve a series of computer graphics and computer modeling manipulations based on the primary amino acid residue sequence of the
  • polypeptide to be modeled is exemplary of the methods used to solve protein structures in general.
  • glycosaminoglycan-binding fusion protein involves the following steps:
  • a GAG-binding moiety is selected according to the formula presented earlier.
  • a biologically active protein moiety is at least an independently folding protein domain that contains by its structure an identifiable biological activity, when assayed by standard biochemical methods for the presence of the identifiable biological assay.
  • a biologically active protein moiety is a complete protein, although there is no requirement that the protein be complete. For example, Fab fragments of immunoglobulins, or the single chain antigen binding protein described by Bird et al.
  • The final amino acid residue sequence of the fusion protein is defined by the sum of the three parts, namely first polypeptide, linker and second polypeptide operatively linked into a single polypeptide.
  • Modeling a polypeptide can be accomplished by a variety of methods. Preferred are the homology modeling
  • SEARCHWILD scans a database of protein sequences for all occurrences of a specified sequence pattern. This pattern may include "linker" sequences (of a specified range of lengths) for which no sequence preference is specified. SEARCHWILD can be used to identify sequences forming natural (and thus
  • SEARCHWILD will identify all sequences of the protein structural database that are similar to the C- and N-terminal sequences separated by a linker of 0 or more residues. In doing so, SEARCHWILD successfully identifies linkers that provide favorable structures for linking structural units in a fusion protein.
  • An exemplary and preferred protein structure database is the Protein Data Bank available from Brookhaven National
  • SEARCHWILD is attached hereto as Appendix 1 to provide detailed description of the logic for completing a SEARCHWILD computer analysis.
  • SEARCHWILD can be run on any computer using a
  • UNIX operating system such as a SUN SPARCstation 1 or SLC, a SUN 3 or 4, a Convex 1 or 240, or a Stardent GS 1000 or Titan.
  • the executable SEARCHWILD code (the compiled and linked code in Appendix 1) is run on a (Unix operating system) computer by typing the
  • the command line includes symbols which mean the following: "pdbsearchwild" invokes the program
  • Execution of the described command initiates the program that passes parameters into SEARCHWILD, sorts the matches found in the sequence database (for example, the sequences corresponding to structural coordinates in the Protein Data Bank (PDB)), and lists the sequence matches found by the search in order from most similar to least similar to the input sequences. On each line containing a C-terminal and N-terminal sequence match is the sequence of the identified linker between them.
  • the SEARCHWILD parameters required at the command line include certain default values which are
  • the first parameter is the carboxy-terminal 7 residues of the polypeptide to precede the linker.
  • the second parameter is the amino-terminal 7 residues of the polypeptide to follow the linker.
  • the third parameter identifies the minimum linker length (in residues) between the two polypeptides to be linked, with a minimum value of zero, and is referred to as "minlinkerlen" in pdbsearchwild.
  • the fourth parameter is the maximum linker length between the two polypeptide regions, is specified as 7 residues and is referred to as "maxlinkerlen" in pdbsearchwild.
  • the choice of 7 residues for the lengths of the amino and carboxy termini and for the linker length in the described SEARCHWILD program was made because 7 residues is sufficient to form any of the preferred types of protein structure for a linker in the present invention, namely reverse turns, helical turns, and open turns or loops having internal hydrogen bonds.
  • the fifth parameter in SEARCHWILD is used to measure the similarity between the input sequence and the database sequence, and gives a value for each substitution of one residue type for another. Higher matrix values indicate more similar residues.
  • the preferred matrix, best.matrix (E.D. Getzoff and J.A. Tainer), is a weighted combination of 7 individual matrices
  • the last parameter, matrix tolerance, is a value equaling the smallest value in the amino acid substitution matrix for a substitution of one residue by another
  • this is set to some value greater than the smallest value in the matrix (to prevent all sequences in the database from being printed out with their scores, since clearly most sequences are not similar) and less than the value at which statistically significant scores are produced (as described below; thus at least all the significant matches will be printed out).
  • matrix tolerance is a residue-selection criterion. This parameter is referred to as "mat_tol" in pdbsearchwild, and an appropriate value is zero for most choices of input sequence when using best.matrix.
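The parameters described above (two 7-residue flanking sequences, a minimum and maximum linker length, a substitution matrix and a matrix tolerance) suggest a windowed scan over each database sequence. The sketch below is not the SEARCHWILD code of Appendix 1. It is a simplified illustration under several assumptions: `toy_matrix` is a crude stand-in for best.matrix, whose values are not given here; `mat_tol` is read as the minimum matrix value that every aligned residue pair must reach; and the function and field names are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

# Crude, hypothetical residue classes used only by the toy matrix below.
GROUPS = ["AVLIMFWC", "KRH", "DE", "STNQYGP"]


def toy_matrix(a: str, b: str) -> int:
    """Stand-in substitution matrix: 2 identical, 1 same class, -1 otherwise."""
    if a == b:
        return 2
    if any(a in g and b in g for g in GROUPS):
        return 1
    return -1


class Match(NamedTuple):
    score: int
    name: str
    linker: str


def window_score(query: str, target: str,
                 matrix: Callable[[str, str], int], mat_tol: int) -> Optional[int]:
    """Sum matrix values over aligned residues; None if any pair is below mat_tol."""
    total = 0
    for q, t in zip(query, target):
        value = matrix(q, t)
        if value < mat_tol:
            return None
        total += value
    return total


def search_linkers(database: Dict[str, str], cterm: str, nterm: str,
                   minlinkerlen: int = 0, maxlinkerlen: int = 7,
                   matrix: Callable[[str, str], int] = toy_matrix,
                   mat_tol: int = 0) -> List[Match]:
    """Find database segments resembling cterm + linker + nterm, best score first."""
    matches = []
    for name, seq in database.items():
        for i in range(len(seq) - len(cterm) + 1):
            left = window_score(cterm, seq[i:i + len(cterm)], matrix, mat_tol)
            if left is None:
                continue
            for gap in range(minlinkerlen, maxlinkerlen + 1):
                j = i + len(cterm) + gap
                target = seq[j:j + len(nterm)]
                if len(target) < len(nterm):
                    break  # ran off the end of this sequence
                right = window_score(nterm, target, matrix, mat_tol)
                if right is not None:
                    matches.append(Match(left + right, name, seq[i + len(cterm):j]))
    return sorted(matches, key=lambda m: m.score, reverse=True)
```

Each returned `Match` carries the candidate linker found between the two flanking matches, which is the information the text says is listed on each output line.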
  • sequence database file to be searched by SEARCHWILD referred to as "pdbseq.asc"
  • PDB Brookhaven Protein Data Bank
  • amino acid equivalence matrices can be used with SEARCHWILD in place of best.matrix described herein, so long as the matrix provides for residue substitutions.
  • Typical factors involved in designing a rational substitution matrix include the following: hydrophobicity, evolutionary occurrence, sidechain charge and polarity, turn, strand or helix preference characteristics, size and the like.
  • Schirmer (supra) methodology is applied to best.matrix rather than the matrix of relative substitution frequencies described by Schulz and Schirmer (since the level of statistical significance depends on the values in the substitution matrix).
  • This methodology determines the mean and standard deviation of the distribution of scores for the sequence matches produced by SEARCHWILD.
  • a best.matrix score greater than three standard deviations above the mean score shows significant relatedness at a confidence level of more than 99.7%. This is a restrictive criterion since it gives a frequency of 0.005 for all 5-residue peptides and 0.0014 for all 13-residue peptides occurring in 2222 known protein sequences.
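The three-standard-deviation criterion above can be applied to a list of match scores as follows. This is a minimal sketch: the function name is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether a population or sample estimate was used.

```python
import statistics
from typing import List, Sequence, Tuple


def significant_scores(scores: Sequence[float],
                       n_sd: float = 3.0) -> Tuple[float, List[float]]:
    """Return the cutoff (mean + n_sd * SD) and the scores that exceed it."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD; an assumption, see lead-in
    cutoff = mean + n_sd * sd
    return cutoff, [s for s in scores if s > cutoff]
```

Because the cutoff depends on the values in the substitution matrix, scores from different matrices are not comparable and each matrix needs its own score distribution.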
  • matchextractpdb (incorporating pdbresrange and pdbchain programs), extracts from the protein database (PDB) the three-dimensional coordinates of the linker residues
  • the selected sequence represents a potential linker sequence that must be evaluated by structural appropriateness criteria in order to be positively selected for use as a linker in a fusion protein.
  • the identified linkers are evaluated for structural appropriateness of the identified sequence in the context of the two polypeptide moieties to be linked.
  • linker sequences identified by SEARCHWILD have structures that are highly dependent on adjacent structures (an undesirable feature)
  • packing and hydrogen bonding within the linker structure in the PDB are evaluated using the tiny probe program of E.D. Getzoff (Chapter 8, Ph.D. Thesis, Duke University, 1982).
  • Preferred structures for linker residues to be included in a fusion protein of present invention are reverse turns, open turns, helical turns, and short loops having local hydrogen bonds and packing
  • linkers are selected in which the linker structure generates a favorable globular fold between the protein and the GAG-binding moiety as measured by: 1) exposing the GAG-binding sidechains at the solvent-accessible surface of the fusion protein; 2) producing buried surface (as measured by MS with a 1.4 Å probe) between the protein and the GAG-binding moiety without producing undue cavities or interpenetrations; and 3) absence of steric collisions that cannot be resolved by single bond rotations.
  • polypeptide moiety to a glycosaminoglycan-binding moiety as follows:
  • Representative modeling methods for obtaining a structural model include the homology modeling approach described by Summers et al. [J. Mol. Biol., 210:785-811 (1989)], and the related approach exemplified herein at Example 1.
  • A system of the present invention for identifying linker sequences is shown in Figure 4.
  • the system comprises an input device 11 such as a keyboard for entering commands and data, a ROM or RAM (read-only memory or random access memory) 13 with a stored program (SEARCHWILD), a computer processor 15
  • RAM random-access-memory
  • auxiliary storage device 17 for storing entered data and predetermined sequence data.
  • the system may include a CRT
  • the invention also contemplates a method of determining an amino acid residue sequence suitable for linking selected molecules, the method comprising the steps of:
  • the invention contemplates a system for determining an amino-acid residue sequence
  • amino acid residue sequence of a protein or polypeptide is directly related via the genetic code to the deoxyribonucleic acid (DNA) sequence of the structural gene that codes for the protein.
  • DNA deoxyribonucleic acid
  • a structural gene can be defined in terms of the amino acid residue sequence, i.e., protein or polypeptide, for which it codes.
  • a DNA sequence (i.e., DNA segment) of the present invention comprises a structural gene. Usually, the DNA sequence is present as an uninterrupted linear series of codons where each codon codes for an amino acid residue, i.e., the DNA sequence contains no introns.
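The relationship described above, an uninterrupted series of codons with each codon coding for one amino acid residue, can be illustrated by translating a DNA string with the standard genetic code. This is a generic sketch and not part of the disclosed method; the function name `translate` is chosen for illustration, and the test sequences in the usage note are arbitrary codon choices, not the nucleotide sequence of Figure 1.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an intron-free coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    residues = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        residues.append(aa)
    return "".join(residues)
```

For example, `translate("CATCGTCATCAT")` yields the four residues HRHH, the first four residues of the A+ helix quoted earlier, using one of several possible codon choices.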
  • any desired target fragment such as a nucleic acid having an intervening sequence, a promoter, a
  • a DNA segment of this invention defines a
  • the DNA segment includes a nucleotide base sequence according to the sequence in Figure 1 from nucleotide base 535 to base 579.
  • the DNA segment is no more than about 5,000 and preferably no more than 2,500 nucleotides (bases) in length.
  • a DNA segment of the present invention can easily be synthesized by chemical techniques, for example, via the phosphotriester method of Matteucci et al. [J. Am. Chem. Soc., 103:3185 (1981)] or using
  • duplex DNA molecules typically are duplex DNA molecules having cohesive termini, i.e., "overhanging" single-stranded portions that extend beyond the double-stranded portion of the molecule.
  • the presence of cohesive termini on the DNA molecules of the present invention is generally preferred.
  • oligonucleotides in the form of a "cassette", i.e., having convenient restriction enzyme site-defined cohesive termini, can easily be prepared by ligating smaller oligonucleotides.
  • single-stranded oligonucleotides of between 40-75 nucleotide bases in length are prepared with
  • ds DNA double stranded
  • RNA ribonucleic acid
  • the present invention further contemplates a recombinant DNA (rDNA) that includes a DNA segment of the present invention operatively linked to a vector for replication and/or expression.
  • rDNA recombinant DNA
  • a preferred rDNA is characterized as being capable of directly expressing, in a compatible host, a GAG-binding fusion protein of the present invention.
  • directly expressing is meant that the mature polypeptide chain of the expressed fusion protein is formed by
  • An exemplary and preferred rDNA of the present invention is the rDNA molecule pPHSODI q HPCI4 described in Example 6.
  • a rDNA molecule of the present invention can be produced by operatively linking a vector to a DNA segment of the present invention.
  • vector refers to a nucleic acid molecule capable of transporting between different genetic environments another nucleic acid to which it has been operatively linked.
  • Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are operatively linked are referred to herein as
  • expression vector and can be expressed in a suitable host cell.
  • GAG-binding fusion protein encoding DNA segment of the present invention is operatively linked depends upon the functional properties desired, e.g., protein expression, and upon the host cell to be transformed. These limitations are inherent in the art of constructing recombinant DNA molecules.
  • invention is at least capable of directing the replication, and preferably also expression, of a gene operatively linked to the vector.
  • a vector contemplated by the present invention includes a procaryotic replicon, i.e., a DNA sequence having the ability to direct autonomous replication and maintenance of the recombinant DNA molecule extrachromosomally in a procaryotic host cell, such as a bacterial host cell, transformed therewith.
  • Typical bacterial drug resistance genes are those that confer resistance to ampicillin, tetracycline, or kanamycin.
  • Those vectors that include a procaryotic replicon may also include a procaryotic promoter capable of directing the expression (transcription and
  • a promoter is an expression control element formed by a DNA sequence that permits binding of RNA polymerase and transcription to occur.
  • Promoter sequences compatible with bacterial hosts are typically provided in plasmid vectors containing convenient restriction sites for insertion of a DNA segment of the present invention. Bacterial expression systems, and choice and use of vectors in those systems is described in detail in "Gene Expression Technology", [Meth.
  • Typical of such vector plasmids are pUC8, pUC9, pBR322 and pBR329 available from Bio-Rad
  • Expression vectors compatible with eucaryotic cells can also be used to form the recombinant DNA molecules of the present invention.
  • Eucaryotic cell expression vectors are well-known in the art and are available from several commercial sources. Typically, such vectors are provided containing convenient restriction sites for insertion of the desired gene. Typical of such vectors are pSVL and pKSV-10
  • the eucaryotic cell expression vectors used to construct the recombinant DNA molecules of the present invention include a selectable phenotypic marker that is effective in a eucaryotic cell, such as a drug resistance selection marker or selective marker based on nutrient
  • a preferred drug resistance marker is the gene whose expression results in neomycin resistance, i.e., the neomycin phosphotransferase (neo) gene.
  • retroviral expression vector refers to a DNA molecule that includes a promoter sequence derived from the long terminal repeat (LTR) region of a retrovirus genome.
  • the expression vector is typically a retroviral expression vector that is preferably replication-incompetent in eucaryotic cells.
  • The construction and use of retroviral vectors has been described by Sorge et al., Mol. Cell. Biol., 4:1730-37 (1984).
  • virus-based expression systems can be used, as is well-known, including systems based on SV-40, Epstein-Barr, Vaccinia, and the like. See, for example, "Gene Expression Technology", (Supra), at pp.485-569.
  • for yeast, a variety of vectors are known in the art, in particular the vector pCl/1 described by Brake et al., Proc. Natl. Acad.
  • the ribosome-binding site in E. coli includes an initiation codon (AUG) and a sequence 3-9 nucleotides long located 3-11 nucleotides upstream from the initiation codon (the Shine-Dalgarno sequence). See, Shine et al., Nature, 254:34 (1975). Methods for including a ribosome-binding site in mRNAs corresponding to the expressed proteins are described by Maniatis, et al. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY pp. 412-417 (1982). Ribosome binding sites can be modified to produce optimum configuration relative to the structural gene for maximal expression of the structural gene. [Hallewell et al., Nucl. Acid Res., 13:2017-2034 (1985)].
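The spacing rule above (a Shine-Dalgarno sequence located 3-11 nucleotides upstream of the AUG) can be turned into a simple scan of an mRNA sequence. The sketch below rests on assumptions that do not come from the text: it uses AGGAGG as the Shine-Dalgarno consensus, accepts any contiguous fragment of that consensus at least 3 nucleotides long, and measures spacing as the number of nucleotides between the end of the fragment and the A of AUG. Because the assumed consensus is 6 nucleotides long, it cannot find the longer 7-9 nucleotide sites the text allows. The function name is hypothetical.

```python
from typing import List, Tuple

SD_CONSENSUS = "AGGAGG"  # assumed consensus; not stated in the text


def find_ribosome_binding_sites(mrna: str, min_sd_len: int = 3,
                                min_spacing: int = 3,
                                max_spacing: int = 11) -> List[Tuple[int, str, int]]:
    """Return (AUG index, best Shine-Dalgarno-like fragment, spacing) for each hit."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA or RNA spelling
    fragments = {SD_CONSENSUS[i:j]
                 for i in range(len(SD_CONSENSUS))
                 for j in range(i + min_sd_len, len(SD_CONSENSUS) + 1)}
    hits = []
    start = mrna.find("AUG")
    while start != -1:
        best = None
        for spacing in range(min_spacing, max_spacing + 1):
            sd_end = start - spacing
            # Try the longest fragment first at this spacing.
            for length in range(len(SD_CONSENSUS), min_sd_len - 1, -1):
                sd_start = sd_end - length
                if sd_start < 0:
                    continue
                candidate = mrna[sd_start:sd_end]
                if candidate in fragments:
                    if best is None or length > len(best[0]):
                        best = (candidate, spacing)
                    break
        if best is not None:
            hits.append((start, best[0], best[1]))
        start = mrna.find("AUG", start + 1)
    return hits
```

Short fragments such as AGG occur often by chance, so a scan like this is only a first filter before the placement is tuned experimentally as the text describes.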
  • the vectors employed herein will contain restriction sites in all three reading frames of the DNA sequences.
  • other vectors will be suitable in which synthetic linkers are inserted to allow the fusion protein gene to be inserted in-frame. Synthetic linkers containing a variety of restriction sites are commercially available from a number of sources including
  • RNA sequences including the removable fragments and/or the linking sequences may also be prepared by direct synthesis techniques. Also contemplated by the present invention are RNA
  • the nucleic acids are combined with linear DNA molecules in an admixture thereof and a ligase will be added to effect ligation of the components.
  • Any ligase available commercially is contemplated to perform the ligation reaction effectively using methods and conditions well-known to those skilled in the art.
  • a preferred ligase is T4 DNA ligase.
  • Volume exclusion agents may also be used to accelerate the ligation reaction. However, such agents may cause excessive intramolecular
  • the recombinant DNA molecules of the present invention are introduced into host cells via a
  • the host cell can be either procaryotic or eucaryotic. Bacterial cells are preferred
  • procaryotic host cells typically are a strain of E. coli such as, for example, the MC1061 or JM109 strains.
  • Preferred eucaryotic host cells include yeast and mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey or human fibroblastic cell line.
  • Preferred eucaryotic host cells include Chinese hamster ovary (CHO) cells available from the ATCC as CCL61 and NIH Swiss mouse embryo cells NIH/3T3 available from the ATCC as CRL 1658.
  • One preferred means of effecting transformation is electroporation.
  • Transformation of appropriate host cells with a recombinant DNA molecule of the present invention is accomplished by well-known methods that typically depend on the type of vector used. With regard to transformation of procaryotic host cells, see, for example, Cohen et al. [Proc. Natl. Acad. Sci. USA, 69:2110 (1972)] and Maniatis et al. [Molecular
  • cells resulting from the introduction of an rDNA of the present invention can be cloned to produce monoclonal colonies. Cells from those colonies can be harvested, lysed and their DNA content examined for the presence of the rDNA using a method such as that described by Southern, J. Mol. Biol., 98:503 (1975) or Berent et al., Biotech., 3:208 (1985).
  • expression vector produce a polypeptide displaying a characteristic antigenicity.
  • Samples of a culture containing cells suspected of being transformed are harvested and assayed for a subject polypeptide using antibodies specific for that polypeptide antigen, such as those produced by an appropriate hybridoma.
  • a suitable plasmid, e.g., pLG. Since the plasmid lacks a promoter and the Shine-Dalgarno sequence, no β-galactosidase is synthesized. However, when a portable promoter fragment is properly
  • plasmids are used to construct a fusion protein having β-galactosidase activity.
  • Plasmids having optimally placed promoter fragments are thereby recognized. These plasmids can then be used to reconstitute the fusion protein gene which is expressed at high levels.
  • cultures of the cells are contemplated as within the present invention.
  • the cultures include monoclonal (clonally homogeneous) cultures, or
  • a "serum-free" medium is preferably used.
  • the present method entails culturing a nutrient medium containing host cells transformed with a recombinant DNA molecule of the present invention that is capable of expressing a gene encoding a subject polypeptide.
  • the culture is maintained for a time period sufficient for the transformed cells to express the subject polypeptide.
  • the expressed polypeptide is then recovered from the culture.
  • the plasmid selected will have additional cloning sites which allow one to score for insertion of the gene assembly. See,
  • Bacterial cultures transformed with the plasmids are grown for a few hours to increase plasmid copy number, e.g., to more than 1000 copies per cell.
  • Induction may be performed in some cases by elevated temperature and in other cases by addition of an inactivating agent to a repressor. Very large increases in cloned fusion proteins can potentially be obtained in this way.
  • Methods for recovering an expressed polypeptide from a culture include fractionation of the polypeptide-containing portion of the culture using well-known biochemical techniques. For instance, the methods of gel filtration, gel chromatography, ultrafiltration, electrophoresis, ion exchange, affinity chromatography, and the like, can be used to isolate the expressed proteins found in the culture. In addition, immunochemical methods, such as immunoaffinity, immunoabsorption, and the like, can be performed using well-known methods.
  • a preferred method for isolating a fusion protein in this invention is by affinity chromatography.
  • Isolation and purification of an expressed fusion protein containing a GAG-binding domain can be
  • a preferred affinity chromatography column in this invention is heparin immobilized to Affi-gel as shown in Example 7. After the lysate is applied to the column, the GAG-binding domain of the fusion protein binds to the heparin. After washing the column to remove non-bound proteins, the fusion protein can be specifically eluted with an increasing ionic strength salt
  • fractions containing the purified fusion protein are collected and tested for activity in an appropriate assay, preferably in a gel activity assay.
  • the fractions containing the highest activity of the fusion proteins are thereafter pooled.
  • Affinity chromatography purification of fusion proteins by these means can result in greater than 95% purity.
  • ul means microliter
  • ug means microgram
  • compositions of the present invention contain a physiologically tolerable carrier together with a GAG-binding fusion protein, as described herein, dissolved or dispersed therein as an active ingredient.
  • therapeutic composition is not immunogenic when administered to a mammal or human patient for
  • compositions, carriers, diluents and reagents are used interchangeably and represent that the materials are capable of administration to or upon a mammal without the production of undesirable physiological effects such as nausea, dizziness, gastric upset and the like.
  • compositions are prepared as
  • injectables either as liquid solutions or suspensions, however, solid forms suitable for solution, or
  • suspensions in liquid prior to use can also be prepared.
  • the preparation can also be emulsified.
  • the active ingredient can be mixed with
  • excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol or the like and combinations
  • composition can contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents and the like which enhance the effectiveness of the active ingredient.
  • the therapeutic composition of the present invention can include pharmaceutically acceptable salts of the components therein.
  • Pharmaceutically acceptable salts include the acid addition salts
  • salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium or ferric hydroxides, and such organic bases as
  • Physiologically tolerable carriers are well-known in the art.
  • Exemplary of liquid carriers are sterile aqueous solutions that contain no materials in
  • aqueous carriers can contain more than one buffer salt, as well as salts such as sodium and potassium chlorides, dextrose, polyethylene glycol and other solutes.
  • Liquid compositions can also contain liquid phases in addition to and to the exclusion of water. Exemplary of such additional liquid phases are
  • glycerin, vegetable oils such as cottonseed oil, and water-oil emulsions.
  • Methods for reducing tissue damage caused by oxygen free radical (superoxide) in vivo or in vitro are contemplated by the present invention, using a HSOD-GAG-binding fusion protein.
  • SOD SOD-oxide anion
  • tendonitis, tendovaginitis, bursitis, epicondylitis, periarthritis
  • tissue-targeted SOD should help alleviate the toxic secondary effects of anti-cancer radiotherapy and chemotherapy.
  • Drug (antibiotic and anticancer) induced nephritis also can be reduced by a more potent SOD.
  • the present invention contemplates a method of in vivo scavenging superoxide radicals in a mammal that comprises administering a therapeutically effective amount of a physiologically tolerable composition containing an HSOD-GAG-binding fusion protein to a mammal in a predetermined amount calculated to achieve the desired effect.
  • the HSOD-GAG-binding fusion protein is administered in an amount sufficient to deliver 1 to 50 milligrams (mg),
  • a preferred dosage can alternatively be stated as an amount sufficient to achieve a plasma concentration of from about 0.1 ug/ml to about 100 ug/ml, preferably from about 1.0 ug/ml to about 50 ug/ml, more preferably at least about 2 ug/ml and usually 5 to 10 ug/ml.
  • GAG-binding fusion proteins having superoxide dismutase (SOD) activity for use in a therapeutic composition typically have about 200 to 5000 units (U) of enzyme activity per mg of protein.
  • Enzyme assays for SOD activity are well-known, and a preferred assay to standardize the SOD activity in a fusion protein is that described by McCord et al., J. Biol. Chem.,
  • a dosage of about 1 to 20 mg, preferably about 4 to 8 mg is administered intra-articularly per week per human adult. In certain cases, as much as 20 mg can be administered per kilogram (kg) of patient body weight.
  • a dosage of 5 mg per kg of body weight is preferred to be administered intravenously.
  • the therapeutic compositions containing a GAG-binding fusion protein are conventionally administered intravenously, or intra-articularly (ia) in the case of arthritis, as by injection of a unit dose, for example.
  • unit dose when used in reference to a therapeutic composition of the present invention refers to physically discrete units suitable as a unitary dosage for the subject, each unit containing a predetermined quantity of active material calculated to produce the desired therapeutic effect in
  • diluent i.e., carrier, or vehicle.
  • compositions are administered in a manner compatible with the dosage formulation, and in a therapeutically effective amount.
  • the quantity to be administered depends on the subject to be treated, capacity of the subject's immune system to utilize the active ingredient, and degree of therapeutic effect desired. Precise amounts of active ingredient
  • Suitable regimes for initial administration and booster shots are also variable, but are typified by an initial administration followed by repeated doses at one or more hour intervals by a subsequent injection or other administration.
  • PCI plasma serpin protein C inhibitor
  • PCI residues 20-391 (α1AT numbering) was built by sidechain substitution with the molecular editor Moledt (Biosym Technologies, Inc.), using the x-ray structure of α1AT available in the Brookhaven Protein Data Bank [Bernstein et al., J. Mol. Biol., 112:535-42 (1977)]; entry 6API as a template and following the original sidechain torsion angles. Sidechain collisions were corrected using new torsion angles from a rotamer library [Ponder et al., J. Mol. Biol., 193:775-91 (1987)], and small
  • the peptide bond between residue 19 (the C-terminus of the amino-terminal segment) and residue 20 (the N-terminus of residues 20-391 of PCI) was made by optimally orienting the N-terminal segment, then making minimal changes in the backbone torsion angles of residues 19 and 20 and their nearest neighbors (using Moledt) in order to align the carbonyl C of residue 19 with the amino N of residue 20.
  • Plausible models of PCI including the N-terminus were energy-minimized (using the methods described above) to alleviate unfavorable residue contacts and to improve the conformations of the residue 19-20 turn and the N-terminal segment.
  • sequences of surface α-helices are often amphipathic with a period of ≈3.6 residues.
  • glycosaminoglycans GAGs
  • Electrostatic potentials were calculated at all points on the solvent-accessible surfaces of the energy-minimized PCI models using programs ESPOT and ESSURF [Getzoff et al., Nature, 306:287-90 (1983);
  • PCI exhibited an overall electrostatic dipole, with a highly positive region including the H helix opposed by a weakly negative region centered on Asp 121.
  • electrostatic potential surfaces In the electrostatic potential surfaces
  • the models including the A+ helix had a single, highly positive (≥3 kcal/mol) surface region centered on Arg 10 and Lys 274, which protrude from the A+ and H helices.
  • Other central positive residues were Lys 14 and Lys 270 in model I and Arg 6, Lys 277 and Lys 280 in model II.
  • the positive region in both models formed a single face of the protein and has an area (1365 Å² in model I and 1705 Å² in model II) consistent with other protein interfaces [Janin et al., J. Mol.
  • Residual APC activity was determined by the rate of change in absorbance at 405 nm, compared to controls without added PCI. Pseudo-first order rate constants were calculated from initial slopes in plots of the natural log (ln) of APC activity versus time and k2 values were obtained based on the concentration of PCI and are shown in Table 3.
  • PCI (4 µg) was incubated for 60 minutes at 22°C with the anti-PCI antibodies API39 (48 µg), API60 (48 µg), or buffer, in 200 µl of 0.01 M Tris and 0.14 M NaCl at pH 7.4. The sample was adjusted to 0.1 M NaCl in a final volume of 400 µl and loaded onto a 0.6 ml column of heparin-agarose (Sigma). Using an FPLC liquid chromatography gradient programmer (Pharmacia), PCI was eluted (0.1 ml fractions) with a linear gradient from 0.1 to 0.6 M NaCl. The elution profiles were determined using an ELISA assay for PCI antigen as described by España et al., Thromb. Res. 55:671-82 (1989).
  • an anti-PCI monoclonal antibody neutralizes heparin stimulation of APC inhibition by PCI [Meijers et al., Blood, 72:1401-3 (1988)], and by ELISA and peptide competition assays binds specifically a peptide corresponding to the A+ helix.
  • Antibody API39 prevented PCI from binding to a heparin-agarose column.
  • a control antibody that binds to PCI but not to peptides from the A+ or H helix regions does not affect heparin stimulation nor prevent PCI from binding to a heparin-agarose column.
  • the strikingly positive helix pair that forms the heparin recognition surface of PCI identified by the studies in Examples 1-3 is similar to the twin helical motif thought to bind heparin in dimers of platelet factor 4, a nonhomologous protein whose structure has recently been determined. St. Charles et al., J. Biol. Chem., 264: 2092-2099 (1989).
  • GAG recognition in ATIII may be a variation on this common theme, involving positive residues in both the D helix [Carrell et al., Thrombosis and Haemostasis 1987, Verstraete et al., eds., Leuven University, pp.1-15 (1987)], and the N-terminal region.
  • a fusion protein was constructed to contain the heparin binding region of PCI, namely the A+
  • SOD-A+ contains three subunits: a first region comprised of a polypeptide having the amino acid residue sequence of HSOD, a second region comprised of a polypeptide linker to connect the first and third regions, and a third region comprised of a polypeptide having the amino acid residue sequence of the A+ helix of PCI.
  • the amino acid residue sequence of SOD-A+ is shown in Figure 1, including the first SOD region defined by residues 1-153, the second linker region defined by residues 154-156, and the third A+ region defined by residues 157-171.
  • SOD-A+ was done in the pPHSODlacI vector from Chiron Corporation (Emeryville, CA). This vector contains the SalI-EcoRI fragment from pBR322, coding for the β-lactamase and the origin of replication.
  • the lacI gene was
  • the HSOD protein encoded by the synthetic HSOD gene differs from wild type HSOD in that it contains alanine and serine in place of the cysteines at amino acid residue positions 6 and 111, respectively. All experiments were carried out using E. coli MC1061 (araD139, delta (araleu)7696, delta (lac) 174, galU, galK, hsdR, strA) [Huynh et al., DNA Cloning, vol.1.
  • the HSOD be produced in yeast to obtain amino terminal acetylation like wild type HSOD protein found in humans.
  • yeast expression system for HSOD is described in Hallewell et al., J. Biol. Chem., 264:5260-5268 (1989) and also in
  • Applied Biosystems DNA synthesizer model 380B. To add the Gly-Pro-Gly linker and the A+ helix to the carboxy terminus of HSOD, two oligonucleotides corresponding to the HSOD sequence from the BamHI site of the synthetic gene to the end of the amino acid coding sequence were designed.
  • the coding strand was
  • the complementary strand was extended by a glycine
  • GCC anticodon
  • the XmaI site in the linker sequence allows further modifications of the linker sequence if needed.
  • Oligonucleotide HUCLI corresponds to the sequence of nucleotides 488 to 523.
  • Oligonucleotide HUCLIZ is the complement of nucleotides 492 to 528.
  • Oligonucleotide PCIHEPBI corresponds to the nucleotide sequence 524 to 583.
  • Oligonucleotide PCIHEPBZ is the complement of
  • oligonucleotides were hybridized pairwise, HUCLI with HUCLIZ and PCIHEPBI with PCIHEPBZ, 10 µg of each in 100 ml water for 1 minute at 90°C followed by cooling down to room temperature for 5 minutes.
  • the hybridized oligos, HUCLI with HUCLIZ and PCIHEPBI with PCIHEPBZ, were ligated with T4 DNA ligase (New England Biolabs) according to the manufacturer's instructions.
  • the resulting BamHI-SalI cassette was substituted for the BamHI-SalI fragment of the HSOD synthetic gene.
  • the BamHI-SalI cassette was ligated into the
  • Clones having a larger insert after NcoI-SalI digestion and including a sequence shown in Figure 1 from nucleotide base 1 to base 588 were selected and designated as containing the plasmid pPHSODI q HPCI4.
  • the plasmid pPHSODI q HPCI4 has been deposited with the American Type Culture Collection (ATCC; Bethesda, MD) in the form of a transformed E. coli containing the plasmid on November 1, 1990, by the depositor Chiron Corporation (Emeryville, CA) and has been assigned a deposit accession number that is available from the ATCC.
  • Alternate expression vectors capable of producing HSOD-A+ fusion protein can be prepared from the deposited plasmid material using well-known methodologies. For general methods of molecular biology, see “Gene Expression Technologies” in Meth. Enzymol.
  • An exemplary alternate expression system can be prepared as follows.
  • the approximately 30 base pair (bp) NcoI-PstI polylinker is first isolated from the pPROK-1 vector available from Clontech Laboratories (Palo Alto, CA).
  • the SalI site of pKK233-2 available from Clontech is disabled by first digesting pKK233-2 with SalI, filling in the cohesive SalI termini, then religating the resulting blunt ends to form a circular pKK233-2 plasmid with a disabled SalI site.
  • pKK233-2 is digested with NcoI and PstI, and the 30 bp NcoI-PstI polylinker is ligated into pKK233-2 to provide a
  • NcoI-SalI site. Deposited pPHSODI q HPCI4 is digested with NcoI and SalI to remove the HSOD-A+ fusion protein encoding gene cassette, and the cassette is inserted into the NcoI and SalI site of the above-modified pKK233-2 vector. Thereafter, the pKK233-2 vector having the HSOD-A+ protein encoding gene can be introduced into a suitable lacI q strain of E. coli
  • IPTG isopropylthio- ⁇ -D-galactoside
  • the periplasmic fraction of the bacterial cells was extracted by a modification of the osmotic shock procedure of Koshland et al., Cell, 20:749-760 (1980). The cells were centrifuged down into two one liter bottles (3.5k rpm for 15 minutes in a Beckman J-6B centrifuge maintained at 4°C). Each pellet was
  • the periplasmic fraction was estimated to contain 5 mg per ml of HSOD-A+ as determined by Coomassie blue staining of SDS-polyacrylamide gel [Laemmli, UK,
  • HSOD-A+ was then further isolated from the periplasmic fraction first purified by heparin
  • periplasmic fraction was loaded onto a 40 ml Affi-Gel heparin column.
  • the column was eluted at a flow rate of 1 ml per minute with 200 ml of a 0.2 M Tris pH 7.0 buffer generating a linear gradient from 0.03 M to 0.4 M NaCl.
  • Fractions of 5 ml were collected and tested by SDS-polyacrylamide gel electrophoresis.
  • HSOD-A+ eluted in fractions number 18 to 28 corresponding to elution buffer containing around 0.2 M salt. After that purification step, HSOD-A+ was estimated to be more than 95% pure and fully active based on the above gel activity assay.
  • heparin binding property of HSOD-A+ was demonstrated in vitro by using a heparin binding assay that measures retention of HSOD-A+ on the heparin column described above.
  • co-elution was conducted and compared using equivalent amounts of crude HSOD-A+ and of recombinant purified HSOD made in yeast [Hallewell et al., Biotechnology, 5:363-366 (1987)].
  • the HSOD was all eluted before the gradient reached 0.1 M salt while SOD-A+ eluted at about 0.2 M, indicating that the addition of a GAG-binding moiety to HSOD significantly increased the GAG-binding capacity of the SOD-A+ fusion protein.
  • mice were injected with 2 mg of HSOD-A+ or with recombinant HSOD [Hallewell et al., Biotech., 5:363-366 (1987)] for control.
  • the proteins were
  • the half-life was estimated by the SOD gel activity assay as described above.
  • the recombinant HSOD has a half-life of less than 13 minutes, most likely between 7 and 10 minutes.
  • the HSOD-A+ half-life can be estimated to be around 15 minutes.
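
The half-life estimates quoted above can be compared on a common footing with a simple clearance model. The following is a minimal sketch, assuming single-exponential (first-order) decay of enzyme activity, which the text does not state; the function name `fraction_remaining` and the 30-minute time point are illustrative choices.

```python
def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of activity left after t_min minutes, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (t_min / half_life_min)

# Half-lives (minutes) as quoted in the text; the decay model itself is an assumption.
half_lives = {
    "HSOD (lower estimate)": 7.0,
    "HSOD (upper estimate)": 10.0,
    "HSOD-A+": 15.0,
}

for name, t_half in half_lives.items():
    left = fraction_remaining(30.0, t_half)
    print(f"{name}: {left:.3f} of activity left at 30 min")
```

Under this assumed model, a longer half-life translates directly into a larger fraction of activity surviving at any fixed time point.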

Abstract

Methods are described for designing and constructing fusion proteins, i.e., proteins that comprise at least two distinct structural units each providing a desired functionality. In a preferred aspect of the invention, recombinant DNA molecules coding for proteins having fused glycosaminoglycan-binding and superoxide dismutase (SOD) polypeptides are designed and expressed. The resulting fusion protein retains the activities of the glycosaminoglycan (GAG)-binding polypeptide and the SOD enzymes. The lifetime of intracellular human SOD in the bloodstream can be prolonged due to attachment of the glycosaminoglycan-binding group, and this GAG-binding function also targets the enzyme to cell surfaces.

Description

GLYCOSAMINOGLYCAN-TARGETED FUSION PROTEINS, THEIR DESIGN, CONSTRUCTION AND COMPOSITIONS
DESCRIPTION
Technical Field
The present invention relates to glycosaminoglycan-binding fusion proteins, and methods for designing and constructing the fusion proteins. More particularly, the invention relates to methods and compositions for extending in vivo lifetimes of biologically active compounds and targeting them to specific cell surfaces or substrates.
Background
Detailed knowledge of the fundamental processes involved in transcription and translation of the genetic code has led to a variety of genetic engineering techniques for expressing a desired protein. These engineering techniques have even afforded practical methods for synthesizing proteins that do not occur in nature. The synthetic proteins can potentially combine in one protein molecule activities normally associated with two distinct proteins. Thus, molecular splicing techniques are now available to impart a wide array of desired properties to a protein that otherwise lacks such properties. The new protein molecule, which comprises two or more protein subunits having different properties, is called a "fusion protein". Fusion proteins may be comprised of homologous or heterologous sources of polypeptides so long as the polypeptides being fused are not typically associated together. Particularly interesting are fusion proteins comprised of polypeptides derived from independently folding structural regions (domains) of proteins that contain biological function.
A number of useful fusion proteins have been formed using the techniques of genetic engineering. For example, a non-excretable protein can be fused to a β-lactamase moiety to give an excretable fused protein. At the genetic level, this fusion is accomplished by inserting the gene for the non-excretable protein into the amp gene of pBR322, which encodes β-lactamase. Upon transformation and expression in a suitable host cell, the fused protein is generated. See, e.g., Freifelder, Molecular Biology, Science Books Intl., 828:833 (1983); Talmadge et al., Proc. Natl. Acad. Sci. 77:3369 (1980).
The basic approach to synthesizing fusion proteins outlined above can be modified in many ways. For instance, when a target protein is very small, e.g., the hormone somatostatin, a DNA oligonucleotide can be prepared which codes for the hormone attached to a methionine group. The synthetic oligonucleotide can be ligated to a cleaved vector adjacent to the lac Z gene for β-galactosidase in E. coli. The enzyme region of the expressed protein can subsequently be removed by reaction with CNBr which cleaves the expressed protein at the methionine group. In other cases, the biologically active form of a peptide may be released by enzymatically removing the undesired protein fragment, e.g., with trypsin. Many other variations can be envisioned. See, Maniatis et al. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY pp.422-433 (1982).
Computer-based approaches to designing fusion proteins from two naturally aggregated proteins have been described by Ladner in U.S. Patent No. 4,704,692.
One class of biologically active proteins that has attracted much attention in recent years is the superoxide dismutase (SOD) class of enzymes. The SOD enzymes, which catalyze the conversion of superoxide radical to molecular oxygen and hydrogen peroxide, are ubiquitous in organisms that utilize oxygen.
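The conversion named above corresponds to the standard dismutation stoichiometry, written out here for reference; the protons are part of the balanced reaction but are not spelled out in the text:

```latex
2\,\mathrm{O_2^{\bullet -}} + 2\,\mathrm{H^+} \;\longrightarrow\; \mathrm{O_2} + \mathrm{H_2O_2}
```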
The dismutation reaction of SOD enzymes is believed to be important in preventing tissue damage by free radicals. Indeed, the effectiveness of human intracellular SOD (HSOD) in relieving inflammatory disorders including osteoarthritis has been demonstrated by clinical studies in humans. See, Wilsmann, Superoxide and Superoxide Dismutase in Chemistry, Biology and Medicine, Elsevier, 500-5 (1986). Additionally, animal studies have suggested that SOD enzymes have therapeutic potential for viral infections. See, Oda et al., Science, 244:974-6 (1989). SOD enzymes have also been implicated in preventing alloxan diabetes [Grankvist et al., Nature, 294:158 (1981)] and in preventing metastasis of certain forms of cancer (EPO Application No. 0332464).
In humans, SOD enzymes exist as distinct intracellular and extracellular species. Extracellular SOD (EC-SOD) is the major SOD enzyme in extracellular fluids.
Heparin is a glycosaminoglycan (GAG) that consists of long, unbranched polysaccharide chains composed of repeating disaccharide units. The polysaccharide chains are covalently linked to polypeptide backbones to form proteoglycans. Seven different groups of GAGs are distinguished by the types of sugar residues, the type of linkage between the sugar residues, and the number and location of sulfate groups. The presence of sulfate groups as well as carboxyl groups gives GAGs a highly negative charge. The various forms of GAGs are distributed throughout the body in such areas as connective tissues, skin, cartilage, cornea, bone, blood vessels, lung, liver, cell surfaces, extracellular matrix and the like. In these areas, GAGs adopt an extended, random-coil conformation. GAGs are hydrophilic, forming hydrated gels at low concentrations. The negative charge of the chains attracts water as well as osmotically active cations. See, Lindahl et al., Annu. Rev. Biochem., 47:385-417 (1978); and Chakrabarti et al., CRC Crit. Rev. Biochem., 8:225-313 (1980).
EC-SOD is shown to be heterogeneous with regard to heparin binding. Marklund et al., Proc. Natl. Acad. Sci. USA, 79:7634-8 (1982). The attraction of EC-SOD to heparin is proposed to be due to a sequence of positively-charged residues in the C-terminal end of the enzyme, which occurs in a C-terminal extension to a region of the protein that is homologous to intracellular human SOD (HSOD). A positively-charged polypeptide segment is implicated as the receptor since glycosaminoglycans are negatively charged. See, Hjalmarsson et al., Proc. Natl. Acad. Sci. USA, 84:6340-4 (1987). Use of EC-SOD as an anti-inflammatory agent, etc., is unsatisfactory, however, for several reasons. For example, recombinant EC-SOD has been expressed successfully only in mammalian cell cultures making recombinant EC-SOD very expensive to produce. Tibell et al., Proc. Natl. Acad. Sci. USA 84:6634-8 (1987). Also, recombinant EC-SOD is formed as a heterogeneous mixture of SOD enzymes due to variations in carbohydrate content and extent of proteolysis. A recombinant EC-SOD also has been described by Marklund et al. (WO 8701387).
Intracellular human SOD (HSOD) has also been proposed as a therapeutic agent; however, HSOD has a very short half-life of about 7 minutes in vivo. Several attempts have been made to extend the half-life of HSOD in the bloodstream. Chemical approaches include conjugating HSOD to polyethylene glycol [White et al., Superoxide and Superoxide Dismutase in Chemistry, Biology and Medicine, Elsevier, 524-7 (1986)] and to a pyran copolymer. Oda et al., Science, 244:974-6 (1989). Native HSOD is a CuZn dimer having a molecular weight of 32,000 Daltons.
Recombinant methods have been employed in efforts to generate long-lived HSOD compounds. Proteins comprising two directly linked subunits of HSOD have been expressed in E. coli. Hallewell et al., J. Biol. Chem., 264:5260-8 (1989). However, with this approach the aggregation state is impossible to control. Also, a human intracellular SOD (HSOD) analog has been cloned and expressed in E. coli by Bio-Technology General Corporation (BTG). The recombinant HSOD analog differs from HSOD in that it is not N-acetylated. However, BTG's HSOD shows pharmacological activity in preclinical studies that is indistinguishable from the natural protein. Recombinant HSOD has also been expressed in yeast [Hallewell et al., Biotechnology, 5:363-6 (1987)] and crystallographically characterized. Parge et al., J. Biol. Chem., 261:16215-8 (1986). Unfortunately, these engineered HSODs have limited clinical potential as anti-inflammatory agents due to their short serum lifetimes.
As with HSOD, many otherwise satisfactory pharmaceutical agents are expected to find limited therapeutic use due to their short lifetimes in vivo and lack of specific targeting. Thus, a convenient method for extending the useful lifetimes of proposed pharmaceutical agents is desired. The method should allow preparation of new variants of the proposed pharmaceutical agents that at least have biological activities comparable to those for the unaltered agent. Additionally, the long-lived variants of proposed pharmaceutical agents will preferably be non-immunogenic, i.e., not trigger an immune response and therefore be suitable for repeated therapeutic use in a particular host animal. Also, the long-lived variants of proposed pharmaceutical agents will preferably include functionalities that minimize the costs and complexities associated with employing such variants, e.g., by facilitating their purification from reaction mixtures.
Brief Summary of the Invention
A class of glycosaminoglycan (GAG)-binding moieties has been identified in the present invention that can be operatively linked to a preselected protein to form a fusion protein and thereby increase the stability, plasma half-life and ease of purification of the preselected protein. In addition, methods to operatively link the GAG-binding moiety to the preselected polypeptide are presented that form a functional fusion protein having both the GAG-binding activity and the preselected polypeptide biological activity.
Thus, the present invention contemplates a fusion polypeptide having a minimum of two independently folding protein moieties operatively linked into a single polypeptide. A first moiety is a glycosaminoglycan (GAG)-binding moiety that provides a targeting function and introduces GAG-binding activity into the fusion protein. The second moiety is a polypeptide having biological activity.
The present invention also affords a systematic method for identifying optimal configurations of fusion proteins having independently folding polypeptide subunits. The fusion proteins are characterized as comprising at least two amino acid residue sequences. The amino acid residue sequences provide desired functionalities for the fused protein, and the fused protein preferably contains a third amino acid residue sequence that serves to covalently link the first two sequences. The instant methods are generally applicable to designing any fusion protein having at least two independently folding protein domains.
Recombinant DNA molecules coding for the instant fusion proteins are also contemplated.
In a preferred embodiment of the invention, an amino acid residue sequence, which has superoxide dismutase activity, corresponding to that for HSOD is linked to a second sequence having glycosaminoglycan-binding activity. A preferred linking sequence is Gly-Pro-Gly, which links the HSOD unit to the glycosaminoglycan-binding unit. Pharmaceutical compositions are also contemplated in which the compositions comprise a therapeutically effective amount of the fusion protein. In a preferred embodiment of the invention, a fusion protein comprised of a heparin-binding moiety linked to a pharmacologically active compound is prepared using recombinant DNA methods. The lifetime of the pharmacologically active compound in an animal's bloodstream is extended by administering the heparin-binding fused protein to the animal. Most preferably, the fused protein will comprise the heparin-binding moiety.
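The three-part arrangement of the preferred embodiment (enzyme, linker, GAG-binding unit) can be sketched as a list of segments joined from amino to carboxy terminus. This is an illustrative sketch only: the segment lengths follow the residue ranges this document gives for SOD-A+ (1-153, 154-156, 157-171), while the names `Segment` and `layout` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    name: str
    length: int  # number of amino acid residues

def layout(segments):
    """Return (name, first_residue, last_residue) for segments joined N- to C-terminus."""
    spans, start = [], 1
    for seg in segments:
        end = start + seg.length - 1
        spans.append((seg.name, start, end))
        start = end + 1
    return spans

# Lengths taken from the residue ranges stated for SOD-A+ in this document.
SOD_A_PLUS = [
    Segment("HSOD", 153),
    Segment("Gly-Pro-Gly linker", 3),
    Segment("A+ helix (GAG-binding)", 15),
]

for name, first, last in layout(SOD_A_PLUS):
    print(f"{name}: residues {first}-{last}")
```

Keeping the linker as its own segment mirrors the text's point that the linking sequence is a distinct, replaceable third sequence.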
Brief Description of the Drawings
In the drawings, forming a portion of this disclosure:
Figure 1, in three panels 1A, 1B, and 1C, illustrates the nucleotide sequence of a DNA segment that codes for a GAG-binding fusion protein, shown from left-to-right and in the direction of 5'-terminus to 3'-terminus using the single letter nucleotide base code. The structural gene for the mature fusion protein begins at base 67 and ends at base 579, with the position number of every tenth base residue in each row indicated above the row showing the sequence.
The amino acid residue sequence for the fusion protein is indicated by the single letter code below the nucleotide base sequence, with the position number for the first residue in each row indicated to the left of the row showing the amino acid residue sequence and the position for the last residue indicated to the right of the row. The reading frame is indicated by placement of the deduced amino acid residue sequence below the nucleotide sequence such that the single letter that represents each amino acid is located below the first base in the corresponding codon. The mature fusion protein amino acid residue sequence begins at residue 1 and ends at residue 171.
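The coordinates quoted for Figure 1 are mutually consistent, which a short calculation makes explicit: bases 67 through 579 span 513 bases, or 171 codons, matching residues 1-171. The sketch below assumes 1-based numbering and a contiguous reading frame with no stop codon counted inside the range; the helper name `codon_bases` is illustrative.

```python
GENE_START = 67    # first base of the mature coding region (Figure 1 description)
GENE_END = 579     # last base of the mature coding region
N_RESIDUES = 171   # last residue of the mature fusion protein

def codon_bases(residue: int) -> tuple[int, int, int]:
    """1-based base positions of the codon encoding a given 1-based residue number."""
    if not 1 <= residue <= N_RESIDUES:
        raise ValueError("residue outside the mature protein")
    first = GENE_START + 3 * (residue - 1)
    return (first, first + 1, first + 2)

coding_length = GENE_END - GENE_START + 1
# The coding region must be a whole number of codons, one per residue.
assert coding_length % 3 == 0
assert coding_length // 3 == N_RESIDUES

print(codon_bases(1))
print(codon_bases(N_RESIDUES))
```

The same mapping explains the figure layout: each amino acid letter sits under the first base returned by `codon_bases` for that residue.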
Figure 2 illustrates the orientation of the N-terminal A+ helix and the H helix as a parallel two-helix motif on the surface of the energy-minimized model I of PCI, and the positive electrostatic potential generated by the association of these helices. The N-terminal A+ helix crosses the lower third of the picture, with the N-terminus, residue 5 (α1-antitrypsin numbering), at far right and residue 15, ≈2 turns from the end of the helix, at center. The H helix, in the upper portion of the picture, starts at residue 269 (one residue to the right of the labeled residue, 270), and extends to residue 277 at upper left. The positive charges on this helix are augmented by positive charges in residues 280-282, which have extended conformation. The highly positive electrostatic potential (dots indicating a surface potential of ≥3 kcal/mol) is generated by the many positive charges on these helices and constitutes the most favorable region on the PCI surface for binding of (negatively-charged) glycosaminoglycans. This arrangement of helices and their spacing is similar to the two-helix motif found for heparin binding in the crystallographic structure of platelet factor 4, a protein not homologous to PCI.
Figure 3 illustrates a model of the carboxy-terminal helices from the platelet factor 4 (PF4) dimer structure attached to the carboxy termini of the human superoxide dismutase (HSOD) dimer, comprising an optimal two-helix motif for glycosaminoglycan binding. In order to determine an optimal linkage between the PF4 helices and HSOD that allows these helices to form a glycosaminoglycan-binding motif emulating that found in intact PF4, the twofold symmetry axes of the HSOD and PF4 dimers have been superimposed, then the helical pair was rotated to minimize the distances between the N-termini of the PF4 helices and the C-termini of the HSOD monomers. This optimal arrangement implies that a linker between these two moieties should be at least 5 residues in length. Residues 75-85 and 175-185 are the C-terminal helices of the two monomers in the PF4 crystal structure dimer. The chains of the two dimers of HSOD are numbered 1-153 and 201-353.
Figure 4 is a block diagram of the system of the present invention.
Detailed Description of the Invention
A. Definitions
Amino Acid Residue: An amino acid formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages. The amino acid residues described herein are preferably in the "L" isomeric form. However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide. NH2 refers to the free amino group present at the amino terminus of a polypeptide. COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide. In keeping with standard polypeptide nomenclature (described in J. Biol. Chem., 243:3552-59 (1969) and adopted at 37 C.F.R. 1.822(b)(2)), abbreviations for amino acid residues are shown in the following Table of Correspondence:
Figure imgf000012_0001
Figure imgf000013_0001
It should be noted that all amino acid residue sequences represented herein by formulae have a left-to-right orientation in the conventional direction of amino terminus to carboxy terminus. In addition, the phrase "amino acid residue" is broadly defined to include the amino acids listed in the Table of Correspondence and modified and unusual amino acids, such as those listed in 37 C.F.R. 1.822(b)(4), and incorporated herein by reference. Furthermore, it should be noted that a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino acid residues or a covalent bond to an amino-terminal group such as NH2 or acetyl or to a carboxy-terminal group such as COOH.
Base Pair: a partnership of adenine (A) with thymine (T), or of cytosine (C) with guanine (G) in a double-stranded DNA molecule. In RNA, uracil (U) is substituted for thymine. Base pairs are said to be "complementary" when their component bases pair up normally when a DNA or RNA molecule adopts a double-stranded configuration.
Complementary Nucleotide Sequence: a sequence of nucleotides in a single-stranded molecule of DNA or RNA that is sufficiently complementary to another single strand to specifically (non-randomly) hybridize to it with consequent hydrogen bonding.
Conserved: a nucleotide sequence is conserved with respect to a preselected (reference) sequence if it non-randomly hybridizes to an exact complement of the preselected sequence.
Duplex DNA: a double-stranded nucleic acid molecule comprising two strands of substantially complementary polynucleotides held together by one or more hydrogen bonds between each of the complementary bases present in a base pair of the duplex. Because the nucleotides that form a base pair can be either a ribonucleotide base or a deoxyribonucleotide base, the phrase "duplex DNA" refers to either a DNA-DNA duplex comprising two DNA strands (ds DNA), or an RNA-DNA duplex comprising one DNA and one RNA strand.
Fusion Protein: A protein comprised of at least two polypeptides. In some cases, a linking sequence is present to operatively link the two polypeptides into one continuous polypeptide (i.e., fusion protein). At least one, and preferably two, of the polypeptides comprising a fusion protein is biologically active. The two polypeptides linked in a fusion protein are typically derived from two independent sources, and therefore a fusion protein comprises two linked polypeptides not normally found linked in nature.
Gene: a nucleic acid whose nucleotide sequence codes for an RNA, DNA or polypeptide molecule. Genes may be uninterrupted sequences of nucleotides or they may include such intervening segments as introns, promoter regions, splicing sites and repetitive sequences. A gene can be either RNA or DNA.
Hybridization: the pairing of complementary nucleotide sequences (strands of nucleic acid) to form a duplex, heteroduplex, or complex containing more than two single-stranded nucleic acids, by establishing hydrogen bonds between/among complementary base pairs. Hybridization is a specific, i.e., non-random, interaction between/among complementary polynucleotides that can be competitively inhibited.
Linking Sequence: an amino acid residue sequence comprising zero to seven amino acid residues. A linking sequence serves to chemically link two disparate polypeptides via a peptide bond between the linking sequence and each of the polypeptides.
Nucleotide: a monomeric unit of DNA or RNA consisting of a sugar moiety (pentose), a phosphate group, and a nitrogenous heterocyclic base. The base is linked to the sugar moiety via the glycosidic carbon (1' carbon of the pentose) and that combination of base and sugar is a nucleoside. When the nucleoside contains a phosphate group bonded to the 3' or 5' position of the pentose, it is referred to as a nucleotide. A sequence of operatively linked nucleotides is typically referred to herein as a "base sequence" or "nucleotide sequence", and their grammatical equivalents, and is represented herein by a formula whose left-to-right orientation is in the conventional direction of 5'-terminus to 3'-terminus.
Nucleotide Analog: a purine or pyrimidine nucleotide that differs structurally from an A, T, G, C, or U base, but is sufficiently similar to substitute for the normal nucleotide in a nucleic acid molecule. Inosine (I) is a nucleotide analog that can hydrogen bond with any of the other nucleotides, A, T, G, C, or U. In addition, methylated bases are known that can participate in nucleic acid hybridization.
Polynucleotide: a polymer of single or double stranded nucleotides. As used herein "polynucleotide" and its grammatical equivalents will include the full range of nucleic acids. A polynucleotide will typically refer to a nucleic acid molecule comprised of a linear strand of two or more deoxyribonucleotides and/or ribonucleotides. The exact size will depend on many factors, which in turn depend on the ultimate conditions of use, as is well-known in the art. The polynucleotides of the present invention include primers, probes, RNA/DNA segments, oligonucleotides or "oligos" (relatively short polynucleotides), genes, vectors, plasmids, and the like.
Polypeptide or Peptide or Protein: a linear series of at least two amino acid residues in which adjacent residues are connected by peptide bonds between the alpha-amino group of one residue and the alpha-carboxy group of an adjacent residue.
Recombinant DNA (rDNA) molecule: a DNA molecule produced by operatively linking a nucleic acid sequence, such as a gene, to a DNA molecule sequence of the present invention. Thus, a recombinant DNA molecule is a hybrid DNA molecule comprising at least two nucleotide sequences not normally found together in nature. rDNAs not having a common biological origin, i.e., evolutionarily different, are said to be "heterologous".
Vector: a DNA molecule capable of autonomous replication in a cell and to which a DNA segment, e.g., gene or polynucleotide, can be operatively linked so as to bring about replication of the attached segment. Vectors capable of directing the expression of genes encoding one or more proteins are referred to herein as "expression vectors". Particularly important vectors allow cloning of cDNA (complementary DNA) from mRNAs produced using reverse transcriptase.
B. Glycosaminoglycan Targeted Fusion Proteins
A glycosaminoglycan (GAG)-binding fusion protein, i.e., a GAG-targeted fusion protein, is a protein comprising two functional elements defined by two independently folding polypeptides that are operatively linked by a linking means. The first functional element is a GAG-binding moiety comprised of a first polypeptide that independently folds into a functional three-dimensional protein structural domain having GAG-binding activity. The second functional element is a biologically active moiety comprised of a second polypeptide that independently folds into a functional three-dimensional protein structural domain having a preselected biological activity.
Glycosaminoglycan-Binding Moiety
Glycosaminoglycans (GAGs) are a class of negatively charged biopolymers in the extracellular matrix, are the polysaccharide portion of the proteoglycans, and include heparin, heparan sulfate, chondroitin sulfate, dermatan sulfate, hyaluronic acid, and keratan sulfate. Non-covalent associations of GAGs with proteins mediate a variety of biological processes, including cell attachment, growth and differentiation, and anticoagulation. A GAG-binding moiety for use in the present invention is a polypeptide sequence that has an affinity for binding with a GAG, and is therefore useful in a fusion protein of the present invention to target the fusion protein to the vicinity of a GAG molecule.
A GAG-binding moiety for use in a fusion protein includes the following structural parameters as defined by the teachings of the present invention:
(1) The GAG-binding moiety is a polypeptide of 6-20 amino acid residues in length;
(2) The GAG-binding moiety has 4-10 positively charged residues, each of which is arginine or lysine;
(3) Two of the positively charged residues are separated by three amino acid residues, preferably helix-promoting residues. A helix-promoting residue is one of the following: leucine, alanine, glutamic acid, phenylalanine, threonine, isoleucine, serine, tyrosine, valine, asparagine, lysine, arginine, and aminoisobutyric acid;
(4) The polypeptide comprising the GAG-binding moiety exhibits amphipathic character when modeled as an α-helix; and
(5) The GAG-binding moiety contains no more than two helix-breaking residues (e.g., glycine or proline).
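The purely sequence-based parameters above, (1), (2), (3) and (5), can be screened mechanically. The following is a minimal sketch in Python, not part of the described system; the function name and the result keys are hypothetical, and parameter (4), amphipathic character, is not evaluated because it requires a helical model. The two test sequences are the PF4 and PCI A+ helix sequences quoted elsewhere in this specification.

```python
POSITIVE = set("RK")          # parameter (2): arginine or lysine
HELIX_BREAKERS = set("GP")    # parameter (5): glycine or proline


def check_gag_parameters(seq):
    """Return pass/fail flags for parameters (1), (2), (3) and (5).

    Hypothetical helper for illustration; parameter (4) is not checked.
    """
    seq = seq.upper()
    positives = [i for i, aa in enumerate(seq) if aa in POSITIVE]
    return {
        "length_6_to_20": 6 <= len(seq) <= 20,
        "positives_4_to_10": 4 <= len(positives) <= 10,
        # Two positive residues separated by exactly three residues,
        # i.e. located at positions i and i + 4.
        "pair_spaced_by_three": any(i + 4 in positives for i in positives),
        "at_most_two_breakers": sum(aa in HELIX_BREAKERS for aa in seq) <= 2,
    }


if __name__ == "__main__":
    for name, seq in [("PF4 C-terminal helix", "YKKIIKKLLES"),
                      ("PCI A+ helix", "HRHHPREMKKRVEDL")]:
        print(name, check_gag_parameters(seq))
```

A sequence passing all four flags would still need to be modeled as an α-helix to assess parameter (4).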
A general formula for a GAG-binding moiety for use in the present invention is a polypeptide including the formula: Xg[+]hXi[+]j-Xk[+]lXm[+]nXo, where [+] and X are amino acid residues (designated in single letter code), [+] is R or K; X is L, A, E, F, T, I, S, Y, V, N, K, R or aminoisobutyric acid; g is an integer from 0-9, h is an integer from 1-3, i is an integer from 1-5, j is an integer from 1-3, k is an integer from 1-7, l is an integer from 0-7, m is an integer from 0-7, n is an integer from 0-2, and o is an integer from 0-2; and with the proviso that g+h+i+j+k+l+m+n+o is equal to or less than 20. In preferred embodiments g=0, 1, 4 or 9; i=1, 2, 3 or 5; k=1, 2, 3, 4, 6 or 7; l=0, 1, 2, 3 or 6; and m=0, 1, 2, 4, 6 or 7. In a less preferred embodiment X can be H, Q, M, C, W, D, G or P and X contains zero to two of the helix-breaking residues selected from the group consisting of G and P.
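The general formula can be read as a pattern of nine consecutive runs of residues, each with its own length range. As an illustrative sketch only (assumptions: Python's standard `re` module, the preferred X residues only, aminoisobutyric acid left out because it has no standard single-letter code, and the whole sequence required to fit the formula rather than merely include it), the formula and its length proviso might be tested as follows. The names `TERMS` and `matches_general_formula` are hypothetical.

```python
import re

X = "LAEFTISYVNKR"   # preferred X residues (aminoisobutyric acid omitted)
P = "RK"             # [+] residues

# One (residue class, minimum run, maximum run) triple per term, in the
# order Xg [+]h Xi [+]j Xk [+]l Xm [+]n Xo.
TERMS = [(X, 0, 9), (P, 1, 3), (X, 1, 5), (P, 1, 3), (X, 1, 7),
         (P, 0, 7), (X, 0, 7), (P, 0, 2), (X, 0, 2)]

GENERAL_FORMULA = re.compile(
    "".join("[%s]{%d,%d}" % term for term in TERMS))


def matches_general_formula(seq):
    """True if seq fits the formula and has at most 20 residues."""
    seq = seq.upper()
    # The length test enforces the proviso g+h+i+j+k+l+m+n+o <= 20.
    return len(seq) <= 20 and GENERAL_FORMULA.fullmatch(seq) is not None


if __name__ == "__main__":
    print(matches_general_formula("YKKIIKKLLES"))      # PF4 C-terminal helix
    print(matches_general_formula("HRHHPREMKKRVEDL"))  # contains H, P, M, D
```

Because K and R belong to both X and [+], a given sequence can fit the formula in more than one way; the regular expression engine only reports whether at least one assignment exists. Sequences containing the less preferred residues would need a wider X class.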
Polypeptide sequences having an amino acid residue sequence according to the above formula represent GAG-binding moieties for use in the present invention and can be designed de novo or can be identified from known protein sequences. Preferably the polypeptide sequence is selected from a protein having known GAG-binding activity. Proteins of known sequence that have known GAG-binding activity have been described extensively in the literature and are summarized in Table 1.
Table 1
GAG-binding Proteins of Known Sequence
PROTEIN SPECIES
Antithrombin III Human
Apolipoprotein B-100 Human
Apolipoprotein B Chick
Apolipoprotein E Human
Apolipoprotein E Baboon
Apolipoprotein E Macaque
Apolipoprotein J Human
Antistasin A and B Leech
Connective tissue activating peptide Human
Elastase Human
Elastase 1 Pig
Elastase 2 Pig
Extracellular superoxide dismutase Human
Fibronectin Human
Fibronectin Bovine
Fibronectin Mouse
Fibronectin Rat
Fibronectin Chick
Ghilanten Leech
Glia-derived Nexin Human
Glia-derived Nexin Rat
β-2-Glycoprotein I Human
Heparin-binding growth factor 1 Human
Heparin-binding growth factor 1 Bovine
Heparin-binding growth factor 1 Rat
Heparin-binding growth factor 2 Human
Heparin-binding growth factor 2 Bovine
Heparin-binding growth factor 2 Rat
Heparin-binding growth factor 2 Frog
Heparin-binding growth factor 8 Bovine
Heparin cofactor II Human
Hepatopoietin A Rabbit
Histidine-rich glycoprotein Human
Lipoprotein lipase Human
Lipoprotein lipase Bovine
Lipoprotein lipase Mouse
Lipoprotein lipase Guinea pig
Lipoprotein lipase Chick
Leucocyte Elastase Human
Neural cell adhesion protein Mouse
Neural cell adhesion protein Rat
Platelet-derived growth factor Human
Platelet-derived growth factor Frog
Platelet-derived growth factor Cat
Platelet factor 4 Human
Platelet factor 4 Bovine
Platelet factor 4 Rat
Protein C Inhibitor Human
β-Thromboglobulin Human
Thrombospondin Human
Triacylglycerol lipase Human
Triacylglycerol lipase Rat
Type IV collagen Human
Type IV collagen Mouse
Type IV collagen Fly
Type V collagen Human
Vascular endothelial growth factor Human
Vascular endothelial growth factor Bovine
Vascular permeability factor Human
Type IX collagen Human
Vitronectin Human
Preferred GAG-binding moieties are polypeptides including the sequences having the formula: [+]2X2[+]2 where [+] and X are as described above. Particularly preferred in this embodiment is the polypeptide having the sequence YKKIIKKLLES, which is derived from the platelet factor 4 (PF4) C-terminal helix.
Also preferred as a GAG-binding moiety is the polypeptide including the formula: [+]X3[+]X2[+]3, wherein the [+] and X are as defined above. Particularly preferred in this embodiment is the polypeptide having the sequence HRHHPREMKKRVEDL, derived from the amino terminus of protein C inhibitor and corresponding to the A+ helix as described herein.
Another preferred GAG-binding moiety is the polypeptide including the formula: [+]X2[+]2X[+], wherein the [+] and X are as defined above. A preferred polypeptide according to this embodiment is an internal sequence corresponding to a section of the C-terminal end of the D helix of antithrombin III having the sequence KLNCRLYRKANK. Another specific polypeptide according to the above formula is the internal H helix of protein C inhibitor having the sequence EKTLRKWLK.
Another preferred GAG-binding moiety for use in the present invention is a polypeptide including the formula: [+]2X5[+]X3[+] where [+] and X are as defined above. A preferred polypeptide according to this embodiment is a section of the N-terminal end of the internal A helix of antithrombin III having the sequence RRVWELSKANSR.
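The shorter motif formulas can be located within a candidate sequence by reading each number as a repeat count for the symbol that precedes it. The sketch below is an illustration under stated assumptions, not part of the described system: the function names are hypothetical, and X is allowed to be any of the twenty standard residues so that the less preferred residues are also accepted.

```python
import re

ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"   # assumption: X may be any standard residue
POSITIVE = "RK"                        # [+] is R or K


def motif_to_regex(motif):
    """Translate a motif such as '[+]2X2[+]2' into a compiled pattern."""
    parts = []
    for symbol, count in re.findall(r"(\[\+\]|X)(\d*)", motif):
        residues = POSITIVE if symbol == "[+]" else ANY_RESIDUE
        # A symbol without a trailing number stands for a single residue.
        parts.append("[%s]{%d}" % (residues, int(count or 1)))
    return re.compile("".join(parts))


def find_motif(motif, seq):
    """Return the first stretch of seq that fits the motif, or None."""
    hit = motif_to_regex(motif).search(seq.upper())
    return hit.group(0) if hit else None


if __name__ == "__main__":
    print(find_motif("[+]2X2[+]2", "YKKIIKKLLES"))
    print(find_motif("[+]X3[+]X2[+]3", "HRHHPREMKKRVEDL"))
    print(find_motif("[+]2X5[+]X3[+]", "RRVWELSKANSR"))
```

Using `search` rather than a full match reflects the wording "including the formula": the motif only has to occur somewhere within the polypeptide.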
Exemplary of a GAG-binding moiety for use in a fusion protein is the PCI A+ helix identified above and utilized in the HSOD-A+ fusion protein described in Example 5.
Another GAG-binding moiety for use in the present invention is a polypeptide having the sequence RVPRESGKKRKRKRLKPS.
Linker Sequences
A linker or linking means for use in the present invention to connect a glycosaminoglycan (GAG)-binding moiety to a biologically active moiety in the fusion protein of the present invention has a structure that depends on the amino acid sequence of the two moieties being linked. Considerations for selection of a linker are discussed in detail herein and in the discussion of identifying linkers using SEARCHWILD.
A linking means operatively links two polypeptide portions of a fusion protein through peptide bonds and in one embodiment is comprised of zero or more amino acid residues, i.e., a linker sequence, and is typically less than 20 amino acid residues, preferably less than seven residues, and more preferably is three residues. Although a fusion protein can simply have a peptide bond as the operative linkage (linking means) between two polypeptide domains of the fusion protein, it is more typical that the linking means in a fusion protein is one or more residues in a linker sequence for operatively connecting the protein's independently folding polypeptide domains.
Specific Linkers Identified by SEARCHWILD
Several particularly preferred linker sequences have been identified by applying SEARCHWILD to particular independently folding protein moieties. For example, when SEARCHWILD was applied with the parameters for minimum linker length "minlinkerlen" and maximum linker length "maxlinkerlen" of zero and 7 residues, respectively, to the C-terminal sequence from HSOD (having the amino acid sequence GVIGIAQ), and the N-terminal sequence from the A+-helix of PCI (having the amino acid sequence of HRHHPRE), none of the five most similar sequences had linkers meeting all five criteria described in Section C "Design of Glycosaminoglycan-Targeted Fusion Proteins".
These criteria were considered again in a search using SEARCHWILD where the C-terminal sequence was specified as the PCI A+ helix amino acid sequence MKKRVED, and the N-terminal sequence was the HSOD sequence ATKAVCV. The top five sequence matches produced the following three linkers meeting the five criteria: YYK from adenylate kinase (PDB code 3adk) residues 153-155, SMD derived from L-arabinose binding protein (PDB code 1abp) residues 155-157, and a linker of zero length (i.e., no residues in the linker) between residues 42 and 43 of calcium-binding parvalbumin B (PDB code 3cpv).
When SEARCHWILD was run using the C-terminal residues from HSOD (having the sequence GVIGIAQ) and the N-terminal residues of the platelet factor 4 (PF4) helix (having the sequence YKKIIKK), the following specific linkers were identified from the top five matches: a linker of zero length was found between residues 104 and 105 of calmodulin (PDB 3cln); DEDG was found in residues 18 to 21 of the protein tyrosyl-transfer RNA synthetase (PDB 2ts1); and the linker IGVMP was derived from residues 206-210 of the protein L-lactate dehydrogenase (PDB 2ldb).
When SEARCHWILD was applied to search for linkers to join the C-terminal residues of the PF4 helix polypeptide IKKLLES and the N-terminal residues of HSOD having the polypeptide sequence ATKAVCV, the following linker peptides were identified amongst the top five sequence matches: linker sequence REACG corresponding to residues 113-117 of p-hydroxybenzoate hydroxylase ternary complex (PDB 1phh); the linker VE corresponding to residues 233-234 in the protein carbonic anhydrase form B (PDB 2cab); and the linker VMAS corresponding to residues 124-127 in apo-L-lactate dehydrogenase (PDB 1ldb).
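The kind of search described above, a C-terminal query segment and an N-terminal query segment separated by a gap of bounded length, can be illustrated with a toy scan. This sketch is not the SEARCHWILD algorithm, whose scoring and selection criteria are described elsewhere in this specification; it assumes a simple identity count as the similarity score, a hypothetical in-memory database of one-letter sequences, and hypothetical function names. The parameter names merely echo the minimum and maximum linker length parameters mentioned above.

```python
def identity(a, b):
    """Count positions at which two equal-length segments agree."""
    return sum(x == y for x, y in zip(a, b))


def find_linker_candidates(database, c_term, n_term,
                           minlinkerlen=0, maxlinkerlen=7, top=5):
    """Rank (score, protein name, linker) triples by summed identity.

    Toy scoring only: each flanking segment is compared to its query
    by counting identical residues.
    """
    hits = []
    for name, seq in database.items():
        for start in range(len(seq) - len(c_term) + 1):
            left_end = start + len(c_term)
            for gap in range(minlinkerlen, maxlinkerlen + 1):
                right_start = left_end + gap
                right = seq[right_start:right_start + len(n_term)]
                if len(right) < len(n_term):
                    break  # longer gaps would also run off the end
                score = (identity(seq[start:left_end], c_term)
                         + identity(right, n_term))
                hits.append((score, name, seq[left_end:right_start]))
    hits.sort(key=lambda hit: -hit[0])
    return hits[:top]


if __name__ == "__main__":
    # Invented sequence for demonstration; not taken from any database.
    toy_database = {"toy_protein": "AAGVIGIAQGPGHRHHPREAA"}
    print(find_linker_candidates(toy_database, "GVIGIAQ", "HRHHPRE")[0])
```

In a realistic setting the candidates returned by such a scan would still have to be judged against the structural criteria for linkers, which this sketch does not attempt.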
Reverse Turn Linker Sequences
Another class of linkers that can be included in a fusion protein of the present invention involves a reverse turn of 4 residues in length. Reverse turns are preferred because they are surface-exposed, well-defined structures stabilized by internal hydrogen bonds between residues within the turn [Richardson et al., Adv. Prot. Chem., 34:168-364 (1981)] and because the preferred residue types in the four positions of type I and type II turns, the most common reverse turns, are known. Wilmot et al., J. Mol. Biol., 203:221-232 (1988). Preferred reverse turn sequences for use as linkers are one of the following polypeptide sequences: NDSG, NSSG, NSRG, and NSDG.
Gly-Pro-Gly Linkers
In another embodiment, a preferred linker includes the amino acid residues Gly-Pro-Gly. An important consideration in the linker is to provide a short extension away from one polypeptide structure into another polypeptide structure that confers a sharp turn such that the two polypeptides can lie against one another. The three residues Gly-Pro-Gly provide an appropriate length for such a turn between the two structural elements provided by many such polypeptide moieties of a fusion protein. Glycine has a high degree of conformational flexibility and thus allows the two structural elements which are joined to interact in an optimal way. The rigid, kink-forming residue proline has high propensity to form turns. Thus, the Gly-Pro-Gly structure forms turns with rotational flexibility at the ends. Of eleven instances of Gly-Pro-Gly in nonhomologous protein structures, eight fold as reverse turns, two fold as open turns, and one is in extended conformation. Additionally, both glycine and proline have small sidechains and are less likely to cause packing problems between the structural elements of the two polypeptide moieties.
Gly-Pro-Gly is particularly preferred as a linker and is utilized in the HSOD-A+ fusion protein described in Example 5.
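At the sequence level, joining two moieties through one of the linkers named above is a simple concatenation. The following sketch is illustrative only: the function name is hypothetical, the seven-residue limit follows the definition of a linking sequence given in the Definitions, and the example joins only the short terminal fragments quoted in this specification (the HSOD C-terminal residues GVIGIAQ and the PCI A+ helix), not full-length proteins.

```python
REVERSE_TURN_LINKERS = ("NDSG", "NSSG", "NSRG", "NSDG")
GLY_PRO_GLY = "GPG"


def build_fusion(first, linker, second, max_linker_len=7):
    """Join two polypeptide sequences (one-letter code) through a linker.

    An empty linker corresponds to a direct peptide bond between the moieties.
    """
    if len(linker) > max_linker_len:
        raise ValueError("linker longer than %d residues" % max_linker_len)
    return first + linker + second


if __name__ == "__main__":
    hsod_c_terminus = "GVIGIAQ"        # fragment only, for illustration
    a_plus_helix = "HRHHPREMKKRVEDL"
    print(build_fusion(hsod_c_terminus, GLY_PRO_GLY, a_plus_helix))
    for turn in REVERSE_TURN_LINKERS:
        print(build_fusion(hsod_c_terminus, turn, a_plus_helix))
```

Whether a given linker lets the two domains fold and pack as intended is a structural question that concatenation alone cannot answer.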
Biologically Active Polypeptides
Biologically active polypeptide moieties for use in a fusion protein of this invention can be derived from any number of proteins of known primary amino acid residue sequence that provide therapeutic applications. At least four classes of proteins of known structure (or proteins whose structure can be modeled on proteins of known structure) have potential therapeutic applications that can benefit from having glycosaminoglycan-binding properties either to increase half-life or to target them to their site of action in the body. These classes of proteins include the following: 1) serine proteases; 2) protease inhibitors including serine protease inhibitors, which are also called serpins; 3) antioxidant enzymes; and 4) receptors and immunoglobulins.
Tissue-plasminogen activator (tPA), urokinase, and single-chain urokinase-like plasminogen activator (scuPA) [Haber et al., Science, 243:51-56 (1989)] are representative proteases in the serine protease class of proteins. These proteases are used for the treatment of fibrinolytic disorders. The function of tPA and urokinase can benefit from the addition of a glycosaminoglycan-binding domain to their structure. The full-length form of both tPA and urokinase consists of a protease and a binding domain, the latter of which promotes binding to heparin. Upon binding to heparin, however, the proteases naturally undergo a cleavage resulting in a separation of the protease domain from the binding domain. The separated tPA protease domain is more active in dissolving blood clots than the full-length form. Handin et al., Heart Disease, A Textbook of Cardiovascular Medicine, Braunwald, Ed., W.B. Saunders Company (1988). The tPA protease domain, although more active, lacks heparin-binding capacity. The construction of an expression vector in which a glycosaminoglycan-binding domain is incorporated into the serine protease domain of these molecules can correct this deficiency. The resulting fusion protein can then have improved stability and clot targeting capacity compared to a tPA protease domain alone.
Protease inhibitors, categorized as anti-proteases, include alpha-1-antitrypsin, acid-stable proteinase inhibitor and human secretory leukocyte protease inhibitor. The term serpin is synonymous with serine protease inhibitors. The prototypical serine protease inhibitor or serpin is alpha-1-antitrypsin. This protein is the principal natural inhibitor of the protease, leukocyte elastase, which is known to cause major tissue damage in lung inflammatory diseases like emphysema. Elastase is also known to be closely associated with various glycosaminoglycans found in tissue. Travis et al., Am. J. Medicine, 84 (sup. 6A):37-42 (1988). Heparan sulfate, a glycosaminoglycan, is a major component of the alveolar interstitial tissue where the protease damage occurs. Crystal et al., In: Pulmonary Diseases and Disorders, 2nd ed., Fishman, Ed., McGraw-Hill Book Company (1987). Alpha-1-antitrypsin does not naturally bind glycosaminoglycans. An alpha-1-antitrypsin with a glycosaminoglycan-binding moiety should be more readily targeted to the site of action in the alveolar interstitial tissue and have an increased half-life.
Another protease inhibitor of known structure that can gain medical advantages from the addition of a glycosaminoglycan-binding moiety is the human secretory leukocyte protease inhibitor (SLPI). Grütter et al., EMBO J., 7:345-351 (1988). SLPI is an unusual serine protease inhibitor in that it can reversibly inhibit a broad range of proteases involved in tissue damage during inflammation and also has a unique ability to gain access to proteolytic sequestered microenvironments that are inaccessible to other fluid-phase inhibitors. Rice et al., Science, 249:178-181 (1990). Because SLPI has a low molecular weight (11,700 D), it is expected to be rapidly cleared by kidney filtration. The addition of a glycosaminoglycan-binding peptide to SLPI can increase its half-life and can result in direct cell surface targeting.
An additional category of biologically active polypeptide moieties is antioxidants, which function as anti-inflammatory agents. The medically important antioxidant enzymes of known structures are superoxide dismutase, catalase and glutathione peroxidase. These enzymes are involved in the prevention of post-ischemic injuries and the control of inflammatory disorders. Wilsman et al., In: Superoxide and Superoxide Dismutase in Chemistry, Biology and Medicine, Rotilio, Ed., Elsevier Science Publishers, Amsterdam (1986). As demonstrated in the case of SOD, these enzymes would benefit from an increased circulatory half-life and from tissue targeting through an added glycosaminoglycan-binding function.
Targeting of immunoglobulins is normally an intrinsic function of these proteins. Antibodies to circulatory antigens and antibodies that have a catalytic function, however, can benefit from a heparin-binding moiety. Since immunoglobulins have a beta-barrel fold like that of SOD, they will behave like SOD upon fusion with a glycosaminoglycan-binding domain. Catalytic antibodies that are more stable and have enhanced tissue-targeting ability can provide a more efficient therapeutic agent. Alternatively, for industrial or purification processes, a glycosaminoglycan-binding domain added to a catalytic antibody, a standard variable domain antibody, or a single-chain antibody can enable the antibody to be immobilized via the glycosaminoglycan-binding domain on a solid support such as a heparin-sepharose column. Recovery of the free catalyst or antibody can be easily accomplished by a gentle salt gradient elution as described for HSOD-A+. Furthermore, many receptors are known to have an immunoglobulin-like structure. The CD4 soluble receptor and especially the V1 domain are contemplated as a potential drug against HIV. Ashkenazi et al., Proc. Natl. Acad. Sci., 87:7175-7154 (1990). These molecules can have enhanced tissue targeting and stability via glycosaminoglycan binding.
Other proteins can benefit from an increased circulatory half-life and tissue targeting through an attached glycosaminoglycan-binding function. Solubilized domains from medically important receptors represent a major potential application. The anti-inflammatory soluble human complement receptor is a powerful inhibitor of complement activation at the C3 and C5 steps. Weisman et al., Science, 249:146-151 (1990). Post-ischemic myocardial inflammation and necrosis are suppressed by the action of the soluble complement receptor. Although the atomic structure of this protein is unknown, electron microscopy has revealed a flexible filamentous structure. Sixty-seven residues forming the transmembrane domain of the full-length molecule have been removed without affecting the activity. Replacing these residues with a glycosaminoglycan-binding peptide should not significantly disturb the protein. The replacement of the transmembrane domain of a receptor by a glycosaminoglycan-binding peptide can prove to be a general strategy for the surface targeting of soluble forms of receptors.
The complete three-dimensional structures for many biologically active protein molecules having independently folding functional domains suitable for use in this invention are available from the Brookhaven Protein Data Bank, Brookhaven National Laboratories, Upton, NY. Exemplary proteins with their respective Protein Data Bank Codes (PDB numbers) that are included in the database are listed in Table 2.
Table 2
Serine Proteases and Serine Protease Inhibitors
1sbc : subtilisin carlsberg (subtilopeptidase *a) (e.c.3.4.21.14)
1sbt : subtilisin BPN* (e.c.3.4.21.14)
2sbt : subtilisin novo (e.c.3.4.21.14)
2sec : subtilisin carlsberg (e.c.3.4.21.14) complex with genetically-engineered n-acetyl eglin-c
1sic : subtilisin /bpn(prime) (e.c.3.4.21.14) complex with streptomyces subtilisin inhibitor
2ssi : streptomyces subtilisin inhibitor
2sni : subtilisin novo (e.c.3.4.21.14) complex with chymotrypsin inhibitor 2 (CI-2)
2sga : proteinase a (component of the extracellular filtrate pronase) (SGPA) (e.c. number not assigned)
3sgb : proteinase b from streptomyces griseus
1sgc : proteinase *A complex with chymostatin
1sgt : trypsin (sgt) (e.c.3.4.21.4)
1tec : thermitase (e.c.3.4.21.14) complex with eglin-c
2tga : trypsinogen (2.4 M magnesium sulfate)
1tgb : trypsinogen-ca from peg
1tgc : trypsinogen (0.50 methanol, 0.50 water)
2tgd : trypsinogen, diisopropylphosphoryl inhibited
1tgn : trypsinogen
1tgt : trypsinogen (173 degrees K, 0.70 methanol, 0.30 water)
2tgt : trypsinogen (103 degrees K, 0.70 methanol, 0.30 water)
2tgp : trypsinogen complex with pancreatic trypsin inhibitor
1tgs : trypsinogen complex with porcine pancreatic secretory trypsin inhibitor
1tpa : anhydro-trypsin (e.c.3.4.21.4) complex with pancreatic trypsin inhibitor
2tpi : trypsinogen - pancreatic trypsin inhibitor - ile-val complex (2.4 M magnesium sulfate)
3tpi : trypsinogen complex with pancreatic trypsin inhibitor and ile-val
4tpi : trypsinogen complex with the ARG-15 analogue of pancreatic trypsin inhibitor and val-val
1tld : beta-trypsin (orthorhombic) at pH 5.3 (e.c.3.4.21.4)
1ton : tonin (e.c. number not assigned)
1tpo : beta-trypsin (orthorhombic) at pH 5.0 (e.c.3.4.21.4)
1tpp : beta-trypsin (e.c.3.4.21.4) complex with p-amidino-phenyl-pyruvate (appa)
1trm : Asn-102 trypsin (e.c.3.4.21.4) (mutant with asp 102 replaced by asn) (d102n) complex with benzamidine at pH 6 (anionic isozyme)
2trm : Asn-102 trypsin (e.c.3.4.21.4) (mutant with asp 102 replaced by asn) (d102n) complex with benzamidine at pH 8 (anionic isozyme)
2alp : alpha-lytic protease (e.c.3.4.21.12)
2cha : alpha chymotrypsin a (tosylated) (e.c.3.4.21.1)
4cha : alpha-chymotrypsin (e.c.3.4.21.1)
5cha : alpha chymotrypsin a (e.c.3.4.21.1)
6cha : alpha chymotrypsin a (e.c.3.4.21.1) complex with phenylethane boronic acid (PEBA)
1chg : chymotrypsinogen a
1cho : alpha-chymotrypsin (e.c.3.4.21.1) complex with turkey ovomucoid third domain (OMTKY3)
1cse : subtilisin carlsberg (e.c.3.4.21.14) (commercial product from serra, heidelberg called subtilisin nagarse) complex with eglin-c
1est : tosyl-elastase (e.c.3.4.21.11)
2est : elastase (e.c.3.4.21.11) complex with trifluoroacetyl-*l-lysyl-*l-alanyl-p-trifluoromethylphenylanilide (tfap)
3est : native elastase (e.c.3.4.21.11)
2gch : gamma chymotrypsin a (e.c.3.4.21.1)
1hne : human neutrophil elastase (hne) (e.c.3.4.21.37) (also referred to as human leucocyte elastase (hle)) complex with methoxysuccinyl-*ala-*ala-*pro-*ala chloromethyl ketone (msack)
1ntp : modified beta trypsin (monoisopropylphosphoryl inhibited) (e.c.3.4.21.4) (neutron data)
1p01 : alpha-lytic protease (e.c.3.4.21.12) complex with boc-*ala-*pro-*valine boronic acid
1p02 : alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*alanine boronic acid
1p03 : alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*valine boronic acid
1p04 : alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*isoleucine boronic acid
1p05 : alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*norleucine boronic acid
1p06 : alpha-lytic protease (e.c.3.4.21.12) complex with methoxysuccinyl-*ala-*ala-*pro-*phenylalanine boronic acid
1p07 : alpha-lytic protease (e.c.3.4.21.12) (mutant with met 192 replaced by ala) (m192a)
1p08 : alpha-lytic protease (e.c.3.4.21.12) (mutant with met 192 replaced by ala) (m192a) complex with methoxysuccinyl-*ala-*ala-*pro-*phenylalanine boronic acid
1p09 : alpha-lytic protease (e.c.3.4.21.12) (mutant with met 213 replaced by ala) (m213a)
1p10 : alpha-lytic protease (e.c.3.4.21.12) (mutant with met 213 replaced by ala) (m213a) complex with methoxysuccinyl-*ala-*ala-*pro-*valine boronic acid
2pka : kallikrein a (e.c.3.4.21.8)
2prk : proteinase k (e.c.3.4.21.14)
3ptb : beta-trypsin (benzamidine inhibited) at p*h7 (e.c.3.4.21.4)
2ptn : trypsin (orthorhombic, 2.4 m ammonium sulfate)
3ptn : trypsin (trigonal, 2.4 m ammonium sulfate) (e.c.3.4.21.4)
4ptp : beta trypsin, diisopropylphosphoryl inhibited
3rp2 : rat mast cell protease ii (rmcpii)
5api : modified alpha=1=-*antitrypsin (modified
alpha=1=-*proteinase inhibitor)
6api : modified alpha=1=-*antitrypsin (modified
alpha=1=-*proteinase inhibitor)
2ci2 : chymotrypsin inhibitor 2 (ci-2)
2ptc : beta-trypsin (e.c.3.4.21.4) complex with
pancreatic trypsin inhibitor
4pti : trypsin inhibitor
5pti : trypsin inhibitor (crystal form ii)
6pti : bovine pancreatic trypsin inhibitor (bpti, crystal form iii)
Aspartyl and Related Proteases and Their Inhibitors
1hvp : hiv-1 protease complex with substrate (model)
2hvp : hiv-1 protease
3hvp : (aba==67, 95==)-hiv-1 protease (sf2 isolate)
4hvp : hiv-1 protease (hiv-1 pr) complex with inhibitor n-acetyl-*thr-*ile-*nle-psi(ch2-nh)-*nle-*gln-*arg amide (mvt-101)
2rsp : rous sarcoma virus protease (rsv pr)
1ovo : ovomucoid third domain
2ovo : ovomucoid third domain
2hir : hirudin (wild-type) (nmr,32 simulated annealing structures)
4hir : hirudin (mutant with lys 47 replaced by glu) (k47e) (nmr,32 simulated annealing structures)
5hir : hirudin (wild-type)
6hir : hirudin (mutant with lys 47 replaced by glu) (k47e) (nmr, minimized mean structure)
Sulfhydryl Proteases and Their Inhibitors
2act : actinidin (sulfhydryl proteinase) (e.c. number not assigned)
1pad : papain (e.c.3.4.22.2) -acetyl-alanyl-alanyl-phenylalanyl-methylenylalanyl derivative of cysteine-25 (acaapack)
2pad : papain (e.c.3.4.22.2) -cysteinyl derivative of cysteine-25 (papsscys)
4pad : papain (e.c.3.4.22.2) -tosyl-methylenyllysyl derivative of cysteine-25 (tlck)
5pad : papain (e.c.3.4.22.2) -benzyloxycarbonyl-glycyl-phenylalanyl-methylenylglycyl derivative (zgpgck)
6pad : papain (e.c.3.4.22.2) -benzyloxycarbonyl-phenylalanyl-methylenylalanyl derivative (zpack)
9pap : papain (e.c.3.4.22.2) cys-25 oxidized
1ppd : 2-hydroxyethylthiopapain (e.c.3.4.22.2) - crystal form d
Acid Proteinases and Their Inhibitors
4ape : acid proteinase (e.c.3.4.23.10), endothiapepsin
2app : acid proteinase (e.c.3.4.23.7), penicillopepsin
2apr : acid proteinase (rhizopuspepsin) (e.c.3.4.23.6)
3apr : acid proteinase (rhizopuspepsin) (e.c.3.4.23.6) complex with reduced peptide inhibitor (==5==psi==6==, (ch2-nh))-d-his-pro-phe-his-phe-phe-val-tyr
1cms : chymosin b (formerly known as rennin) (e.c.3.4.23.4)
2pep : pepsin (e.c.3.4.23.1)
3pep : pepsin (e.c.3.4.23.1)
4pep : pepsin (e.c.3.4.23.1)
1psg : pepsinogen
C-terminal Peptidases and Their Inhibitors
4cpa : carboxypeptidase a=alpha= (cox) (e.c.3.4.17.1) complex with potato carboxypeptidase a inhibitor
Anti-Oxidants
2cyp : cytochrome c peroxidase (e.c.1.11.1.5)
(ferrocytochrome c (colon) h2*o2 reductase)
1gp1 : glutathione peroxidase (e.c.1.11.1.9)
4cat : catalase (e.c.1.11.1.6)
7cat : catalase (e.c.1.11.1.6)
8cat : catalase (e.c.1.11.1.6)
2sod : cu, zn superoxide dismutase (e.c.1.15.1.1)
Immunoglobulins, Immunoglobulin Fragments, and Receptors
1f19 : r19.9 (ig*g2b=k=, cri==-===a=) fab fragment
3fab : lambda immunoglobulin fab(prime)
2fb4 : immunoglobulin fab
1fbj : ig*a fab fragment (j539) (galactan-binding)
1fc1 : fc fragment (igg1 class)
1fc2 : immunoglobulin fc and fragment b of protein a complex
1fvb : ig*a fv fragment (19.1.2, anti-alpha(1->6)
dextran) (model)
2fvb : ig*a fv fragment (19.1.2, anti-alpha(1->6)
dextran) (energy minimized model)
1fvw : ig*a fv fragment (W3129, anti-alpha(1->6)
dextran) (model)
2fvw : ig*a fv fragment (w3129, anti-alpha(1->6)
dextran) (energy-minimized model)
1hfm : ig*g1 fv fragment (hy/hel-10) (model)
2ig2 : immunoglobulin g1g2 : immunoglobulin g1
1ige : fc fragment (ig*e(prime) cl) (model)
1mcg : immunoglobulin, lambda-*type bence-*jones dimer mcg immunoglobulin fab fragment (mc/pc603)
2mcp : immunoglobulin mc/pc603 fab-phosphocholine complex
1pfc : p/fc(prime) fragment of an ig*g1
1rei : bence-*jones immunoglobulin rei variable
portion
2rhe : bence-*jones protein (lambda, variable domain)
2hfl : ig*g1 fab fragment (hy/hel-5) and lysozyme (e.c.3.2.1.17)
Particularly preferred are the antioxidant enzymes of the superoxide dismutase (SOD) class.
Because of the wide distribution of SOD enzymes in aerobic organisms, many isolates of SOD have been reported in many species. A recent literature search revealed descriptions of the sequences of 26 different SOD enzymes in mammals, non-mammals, bacteria, yeast and plants, including human EC-SOD [Hjalmarsson et al., Proc. Natl. Acad. Sci. USA, 84:6340-6344 (1987)]; human SOD [Sherman et al., Proc. Natl. Acad. Sci. USA, 80:5465-5469]; bovine SOD [Steinman et al., J. Biol. Chem., 249:7326-7338 (1974)]; equine SOD [Lerch et al., J. Biol. Chem., 256:11545-11551 (1981)]; murine SOD [Getzoff et al., Proteins: Struct. Func. Genet., 5:322-336 (1989)]; porcine SOD [Schinina et al., FEBS Lett., 186:267-270 (1985)]; rabbit SOD [Reinecke et al., Biol. Chem., 369:715-725 (1988)]; ovine SOD
[Schinina et al., FEBS Lett., 207:7-10 (1986)]; rat SOD [Steffens et al., Z. Physiol. Chem., 367:1017-1024 (1986)]; drosophila SOD [Nucleic Acids Res., 17:2133- 2133 (1989)]; xenopus SOD [Eur. J. Biochem. (1989)]; brucella SOD [Beck et al., Biochemistry, 29:372-376 (1990)]; caulobacter SOD [Steinman et al., J.
Bacteriol. (1988)]; neurospora SOD [Lerch, J. Biol. Chem., 260:9559-9566 (1985)]; photobacterium SOD
[Steffens et al., Z. Physiol. Chem., 364:675-690
(1983)]; schistosoma SOD [Simorda et al., Exp.
Parasitol., 67:73-84 (1988)]; yeast SOD [Steinman et al., J. Biol. Chem., 255:6758-6765 (1980)];
cauliflower SOD [Steffens et al., Biol. Chem. Hoppe- Seyler, 367:1007-1016 (1986)]; cabbage SOD [Steffens et al., Physiol. Chem., 367:1007-1016 (1986)]; maize SOD [Cannon et al., Proc. Natl. Acad. Sci. USA,
84:179-183 (1987)]; pea SOD [Scioli et al., Proc.
Natl. Acad. Sci. USA, 85:7661-7665 (1988)]; spinach SOD [Kitagawa et al., J. Biochem., 99:1289-1298
(1986)]; and tomato SOD [Plant Mol. Biol., 11:609-623 (1988)].
Specific SOD fusion proteins are particularly preferred, and an exemplary embodiment using HSOD has been prepared in Example 6. A SOD-containing fusion protein is also referred to as a SOD-GAG-binding protein. This embodiment, designated HSOD-A+ fusion protein, has an amino acid residue sequence for the mature, expressed protein as shown in Figure 1 from residue 1 to residue 171.
Using the linker sequences and GAG-binding moieties identified herein as preferred, specific preferred fusion proteins including HSOD are also contemplated.
For example, in one embodiment, a fusion protein has a polypeptide sequence that corresponds, and preferably is identical, to the formula A+-L-HSOD, where A+ is at the amino terminus and HSOD is at the carboxy terminus, A+ corresponds to the PCI A+ GAG-binding helix having the formula HRHHPREMKKRVED, HSOD is a polypeptide having an amino acid residue sequence that corresponds, and preferably is identical, to the sequence in Figure 1 from residue 1 to residue 153, and L represents an operative linkage between A+ and HSOD in the form of either a peptide bond, i.e., no intervening amino acid residues, or one of the linker polypeptides YYK or SMD.
In another embodiment a fusion protein has a polypeptide sequence that corresponds, and preferably is identical, to the formula HSOD-L-PF4+, where HSOD is at the amino terminus and PF4+ is at the carboxy terminus, HSOD has a sequence as defined above, PF4+ is a GAG-binding helix having the formula YKKIIKKLLES, and L is either a peptide bond or is one of the linker polypeptides DEDG or IGVMP.
Another embodiment contemplates a fusion protein having a polypeptide sequence that corresponds, and preferably is identical, to the formula PF4+-L-HSOD, where PF4+ is at the amino terminus and HSOD is at the carboxy terminus, PF4+ is the polypeptide defined above, HSOD has a sequence as defined above, and L is one of the linker polypeptides REACG, VE or VMAS.
The above embodiments describe fusion proteins that contain a single GAG-binding moiety operatively linked to a single independently folding protein domain (i.e., a biologically active polypeptide moiety), where the GAG-binding moiety is linked at either its carboxy or amino terminus. However, the invention is not to be construed as so limited. A fusion protein is also contemplated where more than one GAG-binding moiety is linked, for example, one at the carboxy terminus and one at the amino terminus of the biologically active polypeptide moiety.
For example, a fusion protein is contemplated having a polypeptide sequence that corresponds, and preferably is identical, to the formula A+-L1-HSOD-L2-PF4+, where A+ is at the amino terminus and PF4+ is at the carboxy terminus, A+ corresponds to a polypeptide of the formula HRHHPREMKKRVED, PF4+ is the polypeptide as defined above, HSOD has a sequence as defined above, L1 is either a peptide bond or is one of the linker polypeptides YYKK or SMD, and L2 is either a peptide bond or is one of the linker polypeptides DEDG or IGVMP.
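The formulas recited above can be made concrete with a short sketch. The Python below is illustrative only: the function name `assemble` and the placeholder string `HSOD` are assumptions, since the Figure 1 sequence (residues 1 to 153) is not reproduced in this section. The A+ and PF4+ helices and the linkers are those recited in the text.

```python
# Illustrative sketch: concatenate moieties and linkers from amino to carboxy terminus.
A_PLUS = "HRHHPREMKKRVED"   # PCI A+ GAG-binding helix (from the text)
PF4_PLUS = "YKKIIKKLLES"    # PF4+ GAG-binding helix (from the text)
HSOD = "<HSOD:1-153>"       # placeholder (assumption); substitute residues 1-153 of Figure 1


def assemble(*parts: str) -> str:
    """Join the parts in order; an empty linker "" denotes a direct peptide bond."""
    return "".join(parts)


a_hsod = assemble(A_PLUS, "SMD", HSOD)                    # A+-L-HSOD
hsod_pf4 = assemble(HSOD, "DEDG", PF4_PLUS)               # HSOD-L-PF4+
two_ended = assemble(A_PLUS, "", HSOD, "IGVMP", PF4_PLUS)  # A+-L1-HSOD-L2-PF4+
```

Treating a peptide bond as the empty linker keeps every construct expressible with the same call.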
Two-Helix Motif for GAG Recognition
The structural modeling and electrostatics analysis of protein C inhibitor (of Model I) described herein (Figure 2), the model of antithrombin III, and the structure of platelet factor 4 all point to a two-helix motif for binding GAGs. In the HSOD-A+, A+-HSOD, HSOD-PF4, and PF4-HSOD constructs described herein, GAG-binding affinity is optimized by choosing linkers that encourage the GAG-binding helices to adopt the arrangement found in PF4 dimers (where each monomer contributes one helix) and in PCI; namely, two amphipathic, positively charged α-helices that lie roughly in a plane, that are aligned side-by-side, and that have parallel or anti-parallel axes separated by 10-14 angstroms.
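As a rough illustration of the geometric criterion just stated, the following Python sketch tests whether two helix axes are near-parallel or near-anti-parallel and 10-14 angstroms apart. The function names and the 30-degree angular tolerance are assumptions introduced here; the text gives only the 10-14 angstrom separation and the parallel or anti-parallel requirement.

```python
import math


def helix_pair_geometry(p1, d1, p2, d2):
    """Return (angle between axes in degrees, distance in angstroms from p2 to axis 1).

    p1, p2 are points on each helix axis; d1, d2 are axis direction vectors.
    """
    def unit(v):
        n = math.sqrt(sum(c * c for c in v))
        return [c / n for c in v]

    u1, u2 = unit(d1), unit(d2)
    cos_angle = max(-1.0, min(1.0, sum(a * b for a, b in zip(u1, u2))))
    angle = math.degrees(math.acos(cos_angle))
    # Perpendicular distance from p2 to the line through p1 along u1.
    w = [b - a for a, b in zip(p1, p2)]
    along = sum(a * b for a, b in zip(w, u1))
    perp = [wc - along * uc for wc, uc in zip(w, u1)]
    return angle, math.sqrt(sum(c * c for c in perp))


def is_gag_helix_pair(angle, distance, angle_tol=30.0):
    """angle_tol is an assumed tolerance for 'parallel or anti-parallel'."""
    near_parallel = angle <= angle_tol or angle >= 180.0 - angle_tol
    return near_parallel and 10.0 <= distance <= 14.0
```

For exactly parallel axes the point-to-line distance equals the axis separation; for merely near-parallel axes it is only an approximation.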
To evaluate the requirements for such linkers, the PF4 dimer helices have been superimposed by molecular graphics on the HSOD dimer, such that the two-fold symmetry axes of the HSOD and PF4 dimers are coincident and the N-termini of the PF4 helices are as close to the C-termini of the HSOD monomers as possible. The result of this superposition is shown in Figure 3 and points to the following requirements for linkers between the C-terminus of HSOD and the A+ or PF4 helices, if they are to align as a helix pair in the dimer:
1) the linker forms a reverse turn in the first three residues;
2) a linker of at least three residues in
length is preferred to link the A+ helix to the C-terminus of HSOD; and
3) for linking the PF4 helix to the C-terminus of HSOD, a linker of at least five residues in length is preferred. The extra two to three residues for the PF4 linker relative to the A+ linker are required because the PF4 helix is one turn shorter than the A+ helix-containing peptide.
Thus, in one embodiment, the fusion protein may contain two GAG-binding moieties, one attached at each end of the biologically active polypeptide.
In a related embodiment, a fusion protein can contain more than one GAG-binding moiety in tandem at a terminus of a biologically active polypeptide, for example, according to the general formula: -Y-L-Zn- or -Zn-L-Y-, where Y is a biologically active
polypeptide, L is a linking means, Z is a GAG-binding moiety, and n is an integer of about 1-5, preferably about 2. In another embodiment, multiple GAG-binding moieties can be positioned according to the formula: -(Z-L)n-Y- or -Y-(L-Z)n-, where L, Z and Y are as defined above and n is an integer from 1 to 5, and preferably is 1, 2 or 3. Particularly preferred is the fusion protein where Y is HSOD, preferably
residues 1-153 of Figure 1, L is methionine (M), and n is 1, 2 or 3.
In general terms, a GAG-binding protein comprises a polypeptide including the formula -Y-L-Z-, where Y and Z are amino acid residue sequences, L is a linking means, Y comprises a polypeptide having biological activity as described herein, and Z is a GAG-binding moiety according to the general formula described above, with the proviso that when L is -GPG-, Z is not -LWERQ-; and the proviso that when L is -PLY-, Z is not -YKKII-.
In another embodiment, the GAG-binding protein comprises a polypeptide including the formula -Z-L-Y-, where Z, L, and Y are as described above, with the proviso that when L is -HVG-, Z is not -RVEDL-.
Another embodiment contemplates inclusion of multiple GAG-binding moieties associated with a biologically active polypeptide moiety according to the formula -Ub-(Z-L)a-Y- or -Y-(L-Z)a-Ub-, where Z, L and Y are as defined before, U is an amino acid, a is an integer from 1 to 10, and b is an integer from 0 to 1.
Particularly preferred is the fusion protein where L is methionine, Z is a polypeptide according to the sequence -RVPRESGKKRKRKRLKPS-, Y is a polypeptide having an amino acid residue sequence that corresponds to the sequence shown in Figure 1 from residue 1 to residue 171, a is 1, 2, or 3 and b is 1.
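The general formulas -Ub-(Z-L)a-Y- and -Y-(L-Z)a-Ub- can be sketched as a small builder. The function `build_fusion` and its argument names are hypothetical, and the strings used for Y and U in the usage below are placeholders, since neither the Figure 1 sequence nor the identity of U is given in this section.

```python
def build_fusion(z: str, l: str, y: str, a: int = 1, u: str = "", z_first: bool = True) -> str:
    """Build -Ub-(Z-L)a-Y- (z_first=True) or -Y-(L-Z)a-Ub- (z_first=False).

    u is either "" (b = 0) or a single residue (b = 1); a must be 1 to 10.
    """
    if not 1 <= a <= 10:
        raise ValueError("a must be an integer from 1 to 10")
    if len(u) > 1:
        raise ValueError("b is 0 or 1, so u holds at most one residue")
    if z_first:
        return u + (z + l) * a + y
    return y + (l + z) * a + u


Z = "RVPRESGKKRKRKRLKPS"   # GAG-binding moiety recited in the text
Y = "<FIG1:1-171>"         # placeholder (assumption) for the Figure 1 sequence
fusion = build_fusion(Z, "M", Y, a=2, u="X")  # "X" stands in for the unspecified residue U
```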
Fusion proteins comprising a preselected biologically active polypeptide moiety operatively linked to a glycosaminoglycan (GAG)-binding moiety are particularly useful due to the properties that the GAG-binding moiety imparts on the fusion protein.
Therapeutic proteins administered to the blood are cleared from the blood. Addition of a GAG-binding moiety imparts a targeting function that directs the fusion protein to GAGs in the blood vessel wall and into the tissues rather than into the general
circulation where it is more dilute and available for clearance. The targeting function takes the fusion protein away from free circulation, thereby increasing the fusion protein's effective half-life and
localizing protection to needed areas.
In a related embodiment, addition of a GAG-binding moiety imparts a means to more readily isolate a fusion protein from the expression medium in which it was synthesized or the fluid in which it is present. By using affinity-based separation techniques directed to the GAG-binding capacity, namely using immobilized GAG to adsorb fusion protein from solution, one can more easily isolate a GAG-binding fusion protein. Further description of a typical method can be found in Example 6.
C. Design of Glycosaminoglycan-Targeted Fusion Proteins
The preparation of a fusion protein of this invention involves a combination of molecular
biological methodologies to assemble a gene that codes for the fusion protein, the introduction of the gene into a suitable protein expression vector and the expression and purification of the fusion protein. These methods are now generally well-known procedures as long as the amino acid residue sequence of the protein to be expressed is known. (See, for example, Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, NY (1989); and "Gene Expression Technology" in Meth. Enzymol., volume 185 (1990) for description of the present state of the art).
However, the selection of a primary amino acid residue sequence of a fusion protein involves design considerations to ensure that the independently folding protein domains containing biological functions can be combined to form a functional fusion protein. The design considerations are resolved in the present invention by computer modeling methods that determine the regions of independently folding protein domains and particularly that design suitable linkers to combine two or more biologically active polypeptide moieties to form the fusion protein.
A comprehensive review of protein structure, protein folding including independently folding protein domains, and computer modeling approaches to solving three-dimensional structures of proteins where only primary amino acid sequence information is available can be found in "Prediction of Protein
Structure and the Principles of Protein Conformation", Ed. G.D. Fasman, Plenum Press, NY (1989). Exemplary predictive conformation modeling programs and
molecular graphic display software and apparatus are described in Appendices 1 through 5 of the above review by G.D. Fasman at pages 303-316. The
references cited therein, and cited throughout the present invention disclosure, are hereby incorporated by reference.
A detailed description of the modeling of the protein C inhibitor is provided in Example 1. The methods generally involve a series of computer graphics and computer modeling manipulations based on the primary amino acid residue sequence of the polypeptide to be modeled and are exemplary of the methods used to solve protein structures in general.
Overall, the procedure for designing a
glycosaminoglycan-binding fusion protein involves the following steps:
1) A GAG-binding moiety is selected according to the formula presented earlier.
2) A protein or a portion of a protein comprising an independently folding protein domain that contains a biologically active protein moiety is selected. A biologically active protein moiety is at least an independently folding protein domain that contains by its structure an identifiable biological activity, when assayed by standard biochemical methods for the presence of that activity. Typically a biologically active protein moiety is a complete protein, although there is no requirement that the protein be complete. For example, Fab fragments of immunoglobulins, or the single chain antigen binding protein described by Bird et al. [Science, 242:423-426 (1988)], are cases where a functional protein moiety is comprised of less than the intact protein.
3) The two preselected polypeptides, namely the GAG-binding moiety and the biologically active protein moiety, are then aligned in either of the two possible configurations as if the carboxy terminus of one polypeptide were to be linked by a peptide bond to the amino terminus of the other polypeptide and the residues of the termini to be operatively linked are then analyzed as described herein for linker design to identify a suitable linker for joining the two
polypeptides. The final amino acid residue sequence of the fusion protein is defined by the sum of the three parts, namely first polypeptide, linker and second polypeptide operatively linked into a single polypeptide.
Where the three-dimensional structure or
structural model is available for a biologically active protein moiety, it is preferred to include a molecular graphics display step to evaluate the reasonableness of the choice of linker in terms of packing of the two linked moieties. The best linker choice, if more than one possible linker sequence is presented through use of the above procedure, is one that presents the most stereochemically favorable and compact association of the linked moiety when
evaluated by molecular graphics.
Where the three-dimensional structure of either polypeptide to be linked is not known, it is desirable to model the structure so that the above evaluation for reasonable packing can be considered. Modeling a polypeptide can be accomplished by a variety of methods. Preferred are the homology modeling
approaches described by Summers et al. [J. Mol. Biol., 210:785-811 (1989)], and by the present invention at Example 1.
SEARCHWILD scans a database of protein sequences for all occurrences of a specified sequence pattern. This pattern may include "linker" sequences (of a specified range of lengths) for which no sequence preference is specified. SEARCHWILD can be used to identify sequences forming natural (and thus
evolutionarily favored) linker structures between two polypeptide sequence patterns by applying SEARCHWILD to a sequence database containing the sequences of known protein structures. In particular, by
specifying the carboxy- and amino-terminal sequences of the two polypeptides to be linked, SEARCHWILD will identify all sequences of the protein structural database that are similar to the C- and N-terminal sequences separated by a linker of 0 or more residues. In doing so, SEARCHWILD successfully identifies linkers that provide favorable structures for linking structural units in a fusion protein. An exemplary and preferred protein structure database is the Protein Data Bank available from Brookhaven National
Laboratory.
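The kind of scan described above can be sketched in a few lines of Python. This is not the Appendix 1 code: the function `search_wild` is a hypothetical, simplified stand-in that requires exact matches of the two terminal patterns (the similarity-based matching is described separately), and the database entries in the usage are invented for illustration.

```python
def search_wild(database, cterm, nterm, min_len, max_len):
    """Return (entry_id, linker) for each occurrence of cterm + linker + nterm.

    database maps entry identifiers to sequences; the linker may be any
    residues, with a length between min_len and max_len inclusive.
    """
    hits = []
    for entry_id, seq in database.items():
        start = seq.find(cterm)
        while start != -1:
            linker_start = start + len(cterm)
            for n in range(min_len, max_len + 1):
                if seq.startswith(nterm, linker_start + n):
                    hits.append((entry_id, seq[linker_start:linker_start + n]))
            start = seq.find(cterm, start + 1)
    return hits


# Invented toy database; real input would be sequences of known structures.
toy_db = {"demo1": "aaacdegggklmaaa", "demo2": "cdeklm", "demo3": "cdegggggggggklm"}
hits = search_wild(toy_db, "cde", "klm", 0, 7)
```

Here `demo3` yields no hit because its nine-residue spacer exceeds the maximum linker length of 7.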
The complete source code for the program
SEARCHWILD is attached hereto as Appendix 1 to provide detailed description of the logic for completing a SEARCHWILD computer analysis.
SEARCHWILD can be run on any computer using a
UNIX operating system, such as a SUN SPARCstation 1 or SLC, a SUN 3 or 4, a Convex 1 or 240, or a Stardent GS 1000 or Titan. The executable SEARCHWILD code (the compiled and linked code in Appendix 1) is run on a (Unix operating system) computer by typing the
following command line: "pdbsearchwild a b c d e f".
The command line includes symbols which mean the following: "pdbsearchwild" invokes the program
SEARCHWILD; "a" is the C-terminal amino acid sequence in single letter code corresponding to the C-terminal residues of a polypeptide to be linked in a fusion protein; "b" represents an amino acid sequence in single letter code corresponding to the N-terminal residues of a polypeptide to be linked in a fusion protein; "c" and "d" represent the minimum and maximum linker lengths, respectively, in a proposed linker sequence that will be matched against the protein structural database; "e" is an amino acid residue substitution matrix; and "f" represents the matrix tolerance level as described herein below. Execution of the described command initiates the program that passes parameters into SEARCHWILD, sorts the matches found in the sequence database (for example, the sequences corresponding to structural coordinates in the Protein Data Bank (PDB)), and lists the sequence matches found by the search in order from most similar to least similar to the input sequences. On each line containing a C-terminal and N-terminal sequence match is the sequence of the identified linker between them.
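To make the six positional arguments concrete, the sketch below parses them into named fields. It is an assumption-laden illustration rather than the actual pdbsearchwild wrapper: the field names follow the parameter names quoted in this description, while `SearchArgs`, `parse_args`, the numeric type chosen for the tolerance, and the example termini are hypothetical.

```python
from typing import NamedTuple, Sequence


class SearchArgs(NamedTuple):
    cterminus: str      # "a": C-terminal residues, lower case
    nterminus: str      # "b": N-terminal residues, lower case
    minlinkerlen: int   # "c": minimum linker length
    maxlinkerlen: int   # "d": maximum linker length
    mat_file: str       # "e": substitution matrix file
    mat_tol: float      # "f": matrix tolerance (float is an assumption)


def parse_args(argv: Sequence[str]) -> SearchArgs:
    """Parse the six arguments that follow the program name."""
    if len(argv) != 6:
        raise ValueError("expected six arguments: a b c d e f")
    a, b, c, d, e, f = argv
    args = SearchArgs(a, b, int(c), int(d), e, float(f))
    if not (a.islower() and b.islower()):
        raise ValueError("terminal sequences must be given in lower case")
    if not 0 <= args.minlinkerlen <= args.maxlinkerlen:
        raise ValueError("linker lengths must satisfy 0 <= min <= max")
    return args


# Hypothetical invocation: pdbsearchwild acdefgh iklmnpq 0 7 best.matrix 0
example = parse_args(["acdefgh", "iklmnpq", "0", "7", "best.matrix", "0"])
```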
The SEARCHWILD parameters required at the command line include certain default values which are
typically set according to the following
considerations for the purpose of identifying a linker sequence between 2 functional moieties (although they can be varied to accomplish other purposes).
The first parameter is the carboxy-terminal 7 residues of the polypeptide to precede the linker.
This must be specified in lower case and is referred to as "cterminus" in pdbsearchwild.
The second parameter is the amino-terminal 7 residues of the polypeptide to follow the linker.
This must be specified in lower case and is referred to as "nterminus" in pdbsearchwild.
The third parameter identifies the minimum linker length (in residues) between the two polypeptides to be linked, with a minimum value of zero, and is referred to as "minlinkerlen" in pdbsearchwild.
The fourth parameter is the maximum linker length between the two polypeptide regions, is specified as 7 residues and is referred to as "maxlinkerlen" in pdbsearchwild. The choice of 7 residues for the lengths of the amino and carboxy termini and for the linker length in the described SEARCHWILD program was made because 7 residues is sufficient to form any of the preferred types of protein structure for a linker in the present invention, namely reverse turns, helical turns, and open turns or loops having internal hydrogen bonds.
The fifth parameter in SEARCHWILD, the amino acid residue substitution matrix, is used to measure the similarity between the input sequence and the database sequence, and gives a value for each substitution of one residue type for another. Higher matrix values indicate more similar residues. The preferred matrix, best.matrix (E.D. Getzoff and J.A. Tainer), is a weighted combination of 7 individual matrices representing the replaceability of one residue type for another with respect to the following properties: 1) hydrophobicity, 2) evolutionary occurrence, 3) sidechain size, 4) charge and polarity, and preference for 5) turn, 6) α-helix, and 7) β-strand secondary structure. The parameter for best.matrix is identified by the term "mat_file" in pdbsearchwild.
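A weighted combination of property matrices can be sketched as follows. The weights and matrix values actually used in best.matrix are not given in this section, so the function `combine_matrices`, the two toy matrices, and the equal weights below are assumptions made purely for illustration.

```python
def combine_matrices(matrices, weights):
    """Return the weighted sum of substitution matrices.

    Each matrix is a dict mapping (residue_a, residue_b) to a value.
    """
    if len(matrices) != len(weights):
        raise ValueError("one weight is required per matrix")
    combined = {}
    for matrix, weight in zip(matrices, weights):
        for pair, value in matrix.items():
            combined[pair] = combined.get(pair, 0.0) + weight * value
    return combined


# Invented toy values, not taken from best.matrix.
hydrophobicity = {("i", "l"): 2.0, ("i", "d"): -2.0}
charge_polarity = {("i", "l"): 1.0, ("i", "d"): -1.0}
toy_matrix = combine_matrices([hydrophobicity, charge_polarity], [0.5, 0.5])
```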
The last parameter, matrix tolerance, is a value equaling the smallest value in the amino acid
substitution matrix (for a substitution of one residue by another) that is to be considered a match between two residues. In practice, this is set to some value greater than the smallest value in the matrix (to prevent all sequences in the database from being printed out with their scores, since clearly most sequences are not similar) and less than the value at which statistically significant scores are produced (as described below; thus at least all the significant matches will be printed out). Thus, matrix tolerance is a residue-selection criterion. This parameter is referred to as "mat_tol" in pdbsearchwild, and an appropriate value is zero for most choices of input sequence when using best.matrix. The sequence database file to be searched by SEARCHWILD, referred to as "pdbseq.asc", is a set of sequences in SEARCHWILD-readable form containing the sequences for all residues with coordinates in the current version of the Brookhaven Protein Data Bank (PDB) structural database.
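The role of the tolerance as a residue-selection criterion can be illustrated with a minimal Python sketch. It assumes that a database residue matches a query residue when the matrix value is at least the tolerance, and that the similarity score of a segment is the sum of the per-position matrix values; the summation, the function names, and the toy matrix are assumptions not stated in the text.

```python
def substitutable(residue, matrix, mat_tol, alphabet):
    """Residues whose matrix value against `residue` is at least mat_tol."""
    return {r for r in alphabet if matrix.get((residue, r), float("-inf")) >= mat_tol}


def match_score(query, segment, matrix, mat_tol):
    """Sum of matrix values if every position meets mat_tol, otherwise None."""
    if len(query) != len(segment):
        return None
    total = 0.0
    for q, s in zip(query, segment):
        value = matrix.get((q, s))
        if value is None or value < mat_tol:
            return None
        total += value
    return total


# Invented toy matrix; real values would come from the substitution matrix file.
toy = {("k", "k"): 3, ("k", "r"): 2, ("k", "d"): -2, ("l", "l"): 3, ("l", "i"): 2}
allowed_for_k = substitutable("k", toy, 0, "dikrl")
```

With a tolerance of zero, as suggested above for best.matrix, only substitutions with non-negative values count as matches.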
Although the SEARCHWILD parameters as described have been chosen for identification of linker
sequences between two independently folding domains, the parameters can be altered for other applications.
Other amino acid equivalence matrices can be used with SEARCHWILD in place of best.matrix described herein, so long as the matrix provides for residue substitutions. Typical factors involved in designing a rational substitution matrix include the following: hydrophobicity, evolutionary occurrence, sidechain charge and polarity, turn, strand or helix preference characteristics, size and the like.
Given the sequence matches produced by SEARCHWILD in order from most similar to least similar, the matches are evaluated for the statistical significance of their similarity to the input sequence according to the methodology described by Schulz and Schirmer, in section 9.6 of Principles of Protein Structure,
Springer-Verlag, NY (1979). In the analysis of statistical significance, the question is whether the sequence identified in the database is sufficiently similar to the input sequences such that the
intervening linker sequence can be said to be in a similar sequence (and structural) context to that in the fusion protein. To that end, the Schulz and
Schirmer (supra) methodology is applied to best.matrix rather than the matrix of relative substitution frequencies described by Schulz and Schirmer (since the level of statistical significance depends on the values in the substitution matrix). This methodology determines the mean and standard deviation of the distribution of scores for the sequence matches produced by SEARCHWILD. A best.matrix score greater than three standard deviations above the mean score shows significant relatedness at a confidence level of more than 99.7%. This is a restrictive criterion since it gives a frequency of 0.005 for all 5-residue peptides and 0.0014 for all 13-residue peptides occurring in 2222 known protein sequences. To identify a significantly similar sequence, one selects those sequences produced by SEARCHWILD that have a similarity score more than three standard deviations above the mean.
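The three-standard-deviation cutoff can be sketched with the Python standard library. The function `significant_matches` is hypothetical, the use of the population standard deviation is an assumption, and the scores in the usage are invented.

```python
import statistics


def significant_matches(scored, n_sd=3.0):
    """Keep (match, score) pairs scoring more than n_sd deviations above the mean."""
    scores = [score for _, score in scored]
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD; the choice is an assumption
    cutoff = mean + n_sd * sd
    return [(match, score) for match, score in scored if score > cutoff]


# Invented scores: 19 background matches and one strong match.
scored = [("bg%d" % i, 10.0) for i in range(19)] + [("hit", 110.0)]
kept = significant_matches(scored)
```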
For each statistically significant sequence match identified by SEARCHWILD, the program, matchextractpdb (incorporating pdbresrange and pdbchain programs), extracts from the protein database (PDB) the three- dimensional coordinates of the linker residues
identified by the program and also extracts the coordinates of a specifiable number of residues surrounding the linker. Only those linker sequences having full heavy atom (non-hydrogen) coordinates and not coming from a model structure (which have
insufficient precision for analyzing hydrogen bonding) are retained. The selected sequence represents a potential linker sequence that must be evaluated by structural appropriateness criteria in order to be positively selected for use as a linker in a fusion protein.
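The extraction step can be pictured with the sketch below, which pulls ATOM records for a residue range (plus optional flanking residues) out of PDB-format lines. It is not the matchextractpdb program; the function names are hypothetical, and the completeness check covers only the backbone atoms N, CA, C and O, a simplification of the full heavy-atom requirement stated above.

```python
def extract_linker_atoms(pdb_lines, chain, first_res, last_res, flank=0):
    """Return ATOM records for residues first_res-flank to last_res+flank of one chain.

    Relies on the fixed-column PDB layout: chain identifier in column 22 and
    residue sequence number in columns 23-26.
    """
    lo, hi = first_res - flank, last_res + flank
    selected = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[21] != chain:
            continue
        if lo <= int(line[22:26]) <= hi:
            selected.append(line)
    return selected


def has_backbone_heavy_atoms(atom_lines):
    """True if every residue present has N, CA, C and O (backbone only)."""
    seen = {}
    for line in atom_lines:
        # Atom name occupies columns 13-16.
        seen.setdefault(int(line[22:26]), set()).add(line[12:16].strip())
    return bool(seen) and all({"N", "CA", "C", "O"} <= names for names in seen.values())
```

A fuller check would compare each residue's atom names against the expected heavy atoms for its residue type.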
Thereafter, the identified linkers are evaluated for structural appropriateness of the identified sequence in the context of the two polypeptide moieties to be linked. To analyze whether the linker sequences identified by SEARCHWILD have structures that are highly dependent on adjacent structures (an undesirable feature), packing and hydrogen bonding within the linker structure in the PDB are evaluated using the tiny probe program of E.D. Getzoff (Chapter 8, Ph.D. Thesis, Duke University, 1982). This
program, a variant of the solvent-accessible molecular surface program of M. Connolly (MS; available through QCPE, University of Indiana, Bloomington, IN), uses a 0.5 Å probe rolling over the van der Waals surface to identify close interactions within proteins and peptides. The MS program is then used with a 1.4 Å (H2O-sized) probe to determine whether the linker residues are surface-exposed in the PDB structure. This is significant, since in a fusion protein the linker is attached at protein surfaces. Linkers without local interactions or without surface exposure in the context of the PDB protein are excluded.
Preferred structures for linker residues to be included in a fusion protein of the present invention are reverse turns, open turns, helical turns, and short loops having local hydrogen bonds and packing interactions that place the GAG-binding moiety against the protein surface. If three-dimensional structures are available or if structural models can be made (as exemplified by the PCI model in Example 1) for the protein and the GAG-binding polypeptide termini, linkers are selected in which the linker structure generates a favorable globular fold between the protein and the GAG-binding moiety as measured by: 1) exposing the GAG-binding sidechains at the solvent-accessible surface of the fusion protein; 2) producing buried surface (as measured by MS with a 1.4 Å probe) between the protein and the GAG-binding moiety without producing undue cavities or interpenetrations; and 3) absence of steric collisions that cannot be resolved by single bond rotations. By all these criteria, a linker is identified that is suitable for combining a biologically active polypeptide moiety to a glycosaminoglycan-binding moiety. The procedure is summarized as follows:
1) Using SEARCHWILD, identify sequences in the PDB that are similar to the C-terminal and N-terminal residues of the moieties to be linked.
2) Determine which of these PDB sequences are
(statistically) significantly similar to the specified termini, and only retain these matches.
3) Retain only those matches for which full, non- model coordinates are available for the linker
residues.
4) Retain only those linker structures having local interactions and compact packing.
5) If structures or structural models are obtainable for the biologically active moiety and GAG-binding moiety, only retain those linkers that generate a stereochemically favorable, compact association when viewed using standard molecular graphic displays. If a structure is not obtainable, the linker structure (sequence) obtained after step 4 is the linker
sequence used to produce an amino acid residue
sequence corresponding to the designed GAG-binding fusion protein.
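Steps 2 through 4 of the summary above amount to a cascade of filters, which can be sketched as follows. The `LinkerCandidate` fields and the function `select_linkers` are hypothetical; in practice each flag would be derived from the analyses described earlier (statistical scoring, coordinate extraction, and the probe-based packing and exposure tests), and step 5 remains a visual evaluation by molecular graphics.

```python
from dataclasses import dataclass


@dataclass
class LinkerCandidate:
    linker: str
    score: float              # similarity score of the flanking match
    full_coordinates: bool    # all heavy-atom coordinates present
    from_model: bool          # structure is a model rather than experimental
    local_interactions: bool  # local hydrogen bonds and packing found
    surface_exposed: bool     # linker residues are solvent-exposed


def select_linkers(candidates, cutoff):
    """Apply steps 2-4 in order and return the surviving candidates."""
    kept = [c for c in candidates if c.score > cutoff]                      # step 2
    kept = [c for c in kept if c.full_coordinates and not c.from_model]     # step 3
    kept = [c for c in kept if c.local_interactions and c.surface_exposed]  # step 4
    return kept
```

Keeping the stages separate makes it easy to report how many candidates each criterion removes.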
Representative modeling methods for obtaining a structural model, if not already solved, include the homology modeling approach described by Summers et al. [J. Mol. Biol., 210:785-811 (1989)], and the related approach exemplified herein at Example 1.
A system of the present invention for identifying linker sequences is shown in Figure 4. The system comprises an input device 11 such as a keyboard for entering commands and data, a ROM or RAM (read-only memory or random-access memory) 13 with a stored program (SEARCHWILD), a computer processor 15 operating under control of the stored program, and a RAM (random-access memory) or auxiliary storage device 17 for storing entered data and predetermined sequence data. Optionally the system may include a CRT (cathode ray tube) display unit 19 for displaying data, and a printer 21 for printing output data.
Thus, the invention also contemplates a method of determining an amino acid residue sequence suitable for linking selected molecules, the method comprising the steps of:
specifying a first sequence of residues corresponding to a first terminus of a first selected molecule;
specifying a second sequence of residues corresponding to a second terminus of a second selected molecule;
specifying a minimum and a maximum length, in number of residues, of the linking sequence;
providing a matrix of numeric values indicating the relative substitutability of one or more residues for a selected residue;
specifying a residue-selection criterion;
determining, according to said criterion and according to said substitutability of residues, a set of residues substitutable for each of the residues of said first and second sequences; and
identifying, from among said substitutable residues and from known predetermined sequences, a set of sequences equivalent to said first and second sequences, each of which equivalent sequences having an intervening linking sequence length similar to the specified length, said intervening linker sequences being candidates for linking said molecules.
In addition, the invention contemplates a system for determining an amino-acid residue sequence suitable for linking selected molecules, the system comprising:
means for specifying a first sequence of residues corresponding to a first terminus of a first selected molecule;
means for specifying a second sequence of residues corresponding to a second terminus of a second selected molecule;
means for specifying a minimum and a maximum length, in number of residues, of the linking sequence;
means for providing a matrix of numeric values indicating the relative substitutability of one or more residues for a selected residue;
means for specifying a residue-selection criterion;
means for determining, according to said criterion and according to said substitutability of residues, a set of residues substitutable for each of the residues of said first and second sequences; and
means for identifying, from among said substitutable residues and from known predetermined sequences, a set of sequences equivalent to said first and second sequences, each of which equivalent sequences having an intervening linker sequence length similar to the specified length, said intervening linker sequences being candidates for linking said molecules.
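The search steps recited above can be illustrated with a minimal Python sketch. It is not the SEARCHWILD program, whose implementation is not given here. The substitutability matrix `SUBST`, the score threshold used as the residue-selection criterion, the function names, and the toy "known sequences" are all hypothetical and chosen only to make the example self-contained.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, tiny substitutability matrix (NOT a published matrix).
# SUBST[a][b] is an illustrative score for replacing residue a with residue b.
SUBST: Dict[str, Dict[str, int]] = {
    "K": {"K": 5, "R": 3, "Q": 1},
    "R": {"R": 5, "K": 3, "Q": 1},
    "E": {"E": 5, "D": 3, "Q": 2},
    "D": {"D": 5, "E": 3, "N": 2},
    "L": {"L": 5, "I": 3, "V": 2, "M": 2},
}


def substitutable(residue: str, threshold: int) -> Set[str]:
    """Residues whose score against `residue` meets the selection criterion."""
    row = SUBST.get(residue, {})
    allowed = {b for b, score in row.items() if score >= threshold}
    allowed.add(residue)  # a residue can always stand in for itself
    return allowed


def matches(segment: str, allowed: List[Set[str]]) -> bool:
    """True if each position of `segment` falls in the allowed set for it."""
    return len(segment) == len(allowed) and all(
        c in s for c, s in zip(segment, allowed)
    )


def find_linkers(
    first_terminus: str,
    second_terminus: str,
    min_len: int,
    max_len: int,
    known_sequences: Dict[str, str],
    threshold: int = 3,
) -> List[Tuple[str, int, str]]:
    """Return (sequence name, start index, candidate linker) for every place
    where an equivalent of the first terminus is followed, after min_len to
    max_len residues, by an equivalent of the second terminus."""
    allowed1 = [substitutable(r, threshold) for r in first_terminus]
    allowed2 = [substitutable(r, threshold) for r in second_terminus]
    n1, n2 = len(allowed1), len(allowed2)
    hits = []
    for name, seq in known_sequences.items():
        for i in range(len(seq) - n1 + 1):
            if not matches(seq[i:i + n1], allowed1):
                continue
            for length in range(min_len, max_len + 1):
                j = i + n1 + length  # where the second terminus would start
                if j + n2 > len(seq):
                    break
                if matches(seq[j:j + n2], allowed2):
                    hits.append((name, i, seq[i + n1:j]))
    return hits


# Toy "known predetermined sequences" (hypothetical, for illustration only).
known = {"toy1": "MAKLGSGSGEDT", "toy2": "RIAAAADEKL"}
print(find_linkers("KL", "ED", 3, 6, known))
```

In this sketch the residue-selection criterion is a single score threshold; a real search would also need the statistical-significance and structural filters of steps 2 through 5, which are not modeled here.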
D. DNA Segments
In living organisms, the amino acid residue sequence of a protein or polypeptide is directly related via the genetic code to the deoxyribonucleic acid (DNA) sequence of the structural gene that codes for the protein. Thus, a structural gene can be defined in terms of the amino acid residue sequence, i.e., protein or polypeptide, for which it codes.
An important and well-known feature of the genetic code is its redundancy. That is, for most of the amino acids used to make proteins, more than one coding nucleotide triplet (codon) can code for or designate a particular amino acid residue. Therefore, a number of different nucleotide sequences can code for a particular amino acid residue sequence. Such nucleotide sequences are considered functionally equivalent since they can result in the production of the same amino acid residue sequence in all organisms. Occasionally, a methylated variant of a purine or pyrimidine may be incorporated into a given nucleotide sequence. However, such methylations do not affect the coding relationship in any way. A DNA sequence (i.e., DNA segment) of the present invention comprises a structural gene. Usually, the DNA sequence is present as an uninterrupted linear series of codons where each codon codes for an amino acid residue, i.e., the DNA sequence contains no introns.
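The redundancy of the genetic code described above can be made concrete with a short Python sketch that uses the standard genetic code. It counts how many distinct DNA sequences encode a given peptide and shows two different DNA sequences translating to the same residues. The function names and the example peptide are illustrative assumptions, not part of the specification.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order
# (first base varies slowest, third base fastest); "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons available for each amino acid residue.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def translate(dna: str) -> str:
    """Translate an uninterrupted series of codons (no introns)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Two different nucleotide sequences, one amino acid residue sequence.
print(translate("ATGAAACGT"), translate("ATGAAGAGA"))
print(count_coding_sequences("MKR"))
```

Because the count is a product over residues, the number of functionally equivalent coding sequences grows very quickly with peptide length.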
However, also contemplated within the invention is any desired target fragment, such as a nucleic acid having an intervening sequence, a promoter, a
regulatory sequence, a repetitive sequence, a flanking sequence, or a synthetic nucleic acid.
A DNA segment of this invention defines a
structural gene coding for a GAG-binding fusion protein of this invention. Preferably the DNA segment includes a nucleotide base sequence according to the sequence in Figure 1 from nucleotide base 535 to base 579.
Particularly preferred is a DNA sequence having the base sequence shown in Figure 1 from base 1 to base 579.
Typically the DNA segment is no more than about 5,000 and preferably no more than 2,500 nucleotides (bases) in length.
A DNA segment of the present invention can easily be synthesized by chemical techniques, for example, via the phosphotriester method of Matteucci et al. [J. Am. Chem. Soc., 103:3185 (1981)] or using
phosphoramidite chemistry according to Beaucage et al. [Tetrahedron Letters, 22:1859-1862 (1982)]. Of course, by chemically synthesizing the coding
sequence, any desired modifications can be made simply by substituting the appropriate bases for those encoding the native amino acid residue sequence.
The DNA segments of the present invention
typically are duplex DNA molecules having cohesive termini, i.e., "overhanging" single-stranded portions that extend beyond the double-stranded portion of the molecule. The presence of cohesive termini on the DNA molecules of the present invention is generally preferred.
Larger DNA segments corresponding to, for example, complete structural genes in the form of a "cassette", i.e., having convenient restriction enzyme site-defined cohesive termini, can easily be prepared by ligating smaller oligonucleotides. Typically, single-stranded oligonucleotides of 40-75 nucleotide bases in length are prepared with overlapping complementary ends to form the complete cassette DNA segment. The oligonucleotides are then annealed and ligated to form a complete double-stranded DNA (dsDNA) molecule. [See, for example, Urdea et al., Proc. Natl. Acad. Sci. USA, 80:7461-7465 (1983); and Hallewell et al., J. Biol. Chem., 264:5260-5268 (1989).] Also contemplated as within the present invention are ribonucleic acid (RNA) equivalents of the above described DNA segments.
E. Recombinant DNA Molecules
The present invention further contemplates a recombinant DNA (rDNA) that includes a DNA segment of the present invention operatively linked to a vector for replication and/or expression. A preferred rDNA is characterized as being capable of directly
expressing, in a compatible host, a GAG-binding fusion protein of the present invention. By "directly expressing" is meant that the mature polypeptide chain of the expressed fusion protein is formed by
translation alone as opposed to proteolytic cleavage of two or more terminal amino acid residues from a larger translated precursor protein. An exemplary and preferred rDNA of the present invention is the rDNA molecule pPHSODIqHPCI4 described in Example 6.
A rDNA molecule of the present invention can be produced by operatively linking a vector to a DNA segment of the present invention.
As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting between different genetic environments another nucleic acid to which it has been operatively linked. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are operatively linked. Vectors capable of directing the expression of nucleic acids to which they are operatively linked are referred to herein as "expression vectors". As used herein, the term "operatively linked", in reference to DNA segments, means that the nucleotide sequence is joined to the vector so that the sequence is under the transcriptional and translational control of the expression vector and can be expressed in a suitable host cell.
As is well-known in the art, the choice of vector to which a GAG-binding fusion protein encoding DNA segment of the present invention is operatively linked depends upon the functional properties desired, e.g., protein expression, and upon the host cell to be transformed. These limitations are inherent in the art of constructing recombinant DNA molecules.
However, a vector contemplated by the present
invention is at least capable of directing the
replication, and preferably also expression, of a gene operatively linked to the vector.
In preferred embodiments, a vector contemplated by the present invention includes a procaryotic replicon, i.e., a DNA sequence having the ability to direct autonomous replication and maintenance of the recombinant DNA molecule extrachromosomally in a procaryotic host cell, such as a bacterial host cell, transformed therewith. Such replicons are well-known in the art. In addition, those embodiments that include a procaryotic replicon may also include a gene whose expression confers a selective advantage, such as complementation of an amino acid auxotrophy or drug resistance, on a bacterial host transformed therewith, as is well-known, in order to allow selection of transformed clones. Typical bacterial drug resistance genes are those that confer resistance to ampicillin, tetracycline, or kanamycin.
Those vectors that include a procaryotic replicon may also include a procaryotic promoter capable of directing the expression (transcription and translation) of the gene transformed therewith. A promoter is an expression control element formed by a DNA sequence that permits binding of RNA polymerase and transcription to occur. Promoter sequences compatible with bacterial hosts are typically provided in plasmid vectors containing convenient restriction sites for insertion of a DNA segment of the present invention. Bacterial expression systems, and the choice and use of vectors in those systems, are described in detail in "Gene Expression Technology" [Meth. Enzymol., Vol. 185, Goeddel, Ed., Academic Press, NY (1990)]. Typical of such vector plasmids are pUC8, pUC9, pBR322 and pBR329, available from Bio-Rad Laboratories (Richmond, CA), and pPL and pKK233-2, available from Pharmacia (Piscataway, NJ) or Clontech (Palo Alto, CA).
Expression vectors compatible with eucaryotic cells, preferably those compatible with vertebrate cells, can also be used to form the recombinant DNA molecules of the present invention. Eucaryotic cell expression vectors are well-known in the art and are available from several commercial sources. Typically, such vectors are provided containing convenient restriction sites for insertion of the desired gene. Typical of such vectors are pSVL and pKSV-10 (Pharmacia), pBPV-1/pML2d (International Biotechnologies, Inc.), and pTDT1 (ATCC, #31255).
In preferred embodiments, the eucaryotic cell expression vectors used to construct the recombinant DNA molecules of the present invention include a selectable phenotypic marker that is effective in a eucaryotic cell, such as a drug resistance selection marker or selective marker based on nutrient dependency. A preferred drug resistance marker is the gene whose expression results in neomycin resistance, i.e., the neomycin phosphotransferase (neo) gene [Southern et al., J. Mol. Appl. Genet., 1:327-341 (1982)].
The use of retroviral expression vectors to form the rDNAs of the present invention is also
contemplated. As used herein, the term "retroviral expression vector" refers to a DNA molecule that includes a promoter sequence derived from the long terminal repeat (LTR) region of a retrovirus genome.
In preferred embodiments, the expression vector is typically a retroviral expression vector that is preferably replication-incompetent in eucaryotic cells. The construction and use of retroviral vectors has been described by Sorge et al., Mol. Cell. Biol., 4:1730-37 (1984).
Other virus-based expression systems can be used, as is well-known, including systems based on SV-40, Epstein-Barr, Vaccinia, and the like. See, for example, "Gene Expression Technology", (Supra), at pp. 485-569. For expression in yeast, a variety of vectors are known in the art, in particular the vector pC1/1 described by Brake et al., Proc. Natl. Acad. Sci. USA, 81:4642-4647 (1984); and Hallewell et al., Biotechnology, 5:363-366 (1987). Other vectors are described in "Gene Expression Technology", (Supra).
In addition to using strong promoter sequences to generate large quantities of mRNA coding for the expressed fusion proteins of the present invention, it is desirable to provide ribosome-binding sites in the mRNA to ensure efficient translation. The ribosome-binding site in E. coli includes an initiation codon (AUG) and a sequence 3-9 nucleotides long located 3-11 nucleotides upstream from the initiation codon (the Shine-Dalgarno sequence). See, Shine et al., Nature, 254:34 (1975). Methods for including a ribosome-binding site in mRNAs corresponding to the expressed proteins are described by Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY pp. 412-417 (1982). Ribosome binding sites can be modified to produce optimum configuration relative to the structural gene for maximal expression of the structural gene. [Hallewell et al., Nucl. Acid Res., 13:2017-2034 (1985)].
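The geometry of the E. coli ribosome-binding site described above can be illustrated with a small Python sketch that scans an mRNA for a Shine-Dalgarno-like stretch upstream of each AUG. Several choices here are assumptions rather than statements from the text: the 9-nucleotide consensus `UAAGGAGGU` (the complement of the 3' end of 16S rRNA), the minimum match length of 4 (the text allows 3-9), and the reading of "3-11 nucleotides upstream" as the number of nucleotides between the end of the motif and the A of AUG.

```python
from typing import List, Tuple

# Assumed consensus: complement of the 3' end of E. coli 16S rRNA.
SD_CONSENSUS = "UAAGGAGGU"


def find_rbs(
    mrna: str, min_match: int = 4, min_gap: int = 3, max_gap: int = 11
) -> List[Tuple[int, str, int]]:
    """For each AUG, report (AUG index, best motif, gap) where `motif` is the
    longest exact substring of the consensus that ends `gap` nucleotides
    upstream of the AUG. AUGs with no such motif are skipped."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input too
    hits = []
    start = mrna.find("AUG")
    while start != -1:
        best = None
        for gap in range(min_gap, max_gap + 1):
            end = start - gap  # index one past the motif's last base
            # Try the longest candidate first, then progressively shorter ones.
            for length in range(len(SD_CONSENSUS), min_match - 1, -1):
                begin = end - length
                if begin < 0:
                    continue
                candidate = mrna[begin:end]
                if candidate in SD_CONSENSUS:
                    if best is None or length > len(best[0]):
                        best = (candidate, gap)
                    break
        if best is not None:
            hits.append((start, best[0], best[1]))
        start = mrna.find("AUG", start + 1)
    return hits


# Hypothetical leader sequence: AGGAGG, six spacer nucleotides, then the start codon.
print(find_rbs("CCAGGAGGAAACATATGGCT"))
```

Exact substring matching is a deliberate simplification; real ribosome-binding sites tolerate mismatches, and the sketch says nothing about translation efficiency.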
Preferably the vectors employed herein will contain restriction sites in all three reading frames of the DNA sequences. However, it is contemplated that other vectors will be suitable in which synthetic linkers are inserted to allow the fusion protein gene to be inserted in-frame. Synthetic linkers containing a variety of restriction sites are commercially available from a number of sources including
International Biotechnologies, Inc. (New Haven, CT). Instructions for their use can be obtained from the supplier. Polynucleotide sequences, including the removable fragments and/or the linking sequences may also be prepared by direct synthesis techniques. Also contemplated by the present invention are RNA
equivalents of the above-described recombinant DNA molecules.
The nucleic acids are combined with linear DNA molecules in an admixture thereof and a ligase is added to effect ligation of the components. Any ligase available commercially is contemplated to perform the ligation reaction effectively using methods and conditions well-known to those skilled in the art. A preferred ligase is T4 DNA ligase.
Volume exclusion agents may also be used to accelerate the ligation reaction. However, such agents may cause excessive intramolecular
circularizations in some cases.
F. Transformation of Cells and Fusion Protein Expression
The recombinant DNA molecules of the present invention are introduced into host cells via a
procedure commonly known as transformation or
transfection. The host cell can be either procaryotic or eucaryotic. Bacterial cells are preferred
procaryotic host cells and typically are a strain of E. coli such as, for example, the MC1061 or JM109 strains. Preferred eucaryotic host cells include yeast and mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey or human fibroblastic cell line. Preferred eucaryotic host cells include Chinese hamster ovary (CHO) cells available from the ATCC as CCL61 and NIH Swiss mouse embryo cells NIH/3T3 available from the ATCC as CRL 1658. One preferred means of effecting transformation is electroporation.
Transformation of appropriate host cells with a recombinant DNA molecule of the present invention is accomplished by well-known methods that typically depend on the type of vector used. With regard to transformation of procaryotic host cells, see, for example, Cohen et al. [Proc. Natl. Acad. Sci. USA, 69:2110 (1972)] and Maniatis et al. [Molecular
Cloning, A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY (1982)]. With regard to transformation of vertebrate cells with retroviral vectors containing rDNAs, see, for example, Sorge et al., Mol. Cell. Biol., 4:1730-37 (1984);
Graham et al., Virol., 52:456 (1973); and Wigler et al., Proc. Natl. Acad. Sci. USA, 76:1373-76 (1979).
Successfully transformed cells, i.e., cells that contain a recombinant DNA (rDNA) molecule of the present invention, are usually monitored by an
appropriate immunological or functional assay. For example, cells resulting from the introduction of an rDNA of the present invention can be cloned to produce monoclonal colonies. Cells from those colonies can be harvested, lysed and their DNA content examined for the presence of the rDNA using a method such as that described by Southern, J. Mol. Biol., 98:503 (1975) or Berent et al., Biotech., 3:208 (1985).
In addition to directly assaying for the presence of rDNA, successful transformation can be confirmed by well-known immunological methods when the rDNA is capable of directing the expression of a subject polypeptide. For example, cells successfully
transformed with a subject rDNA containing an
expression vector produce a polypeptide displaying a characteristic antigenicity. Samples of a culture containing cells suspected of being transformed are harvested and assayed for a subject polypeptide using antibodies specific for that polypeptide antigen, such as those produced by an appropriate hybridoma.
A particularly convenient assay technique involves fusing the foreign fusion protein DNA to a Lac Z gene in a suitable plasmid, e.g., pLG. Since the plasmid lacks a promoter and the Shine-Dalgarno sequence, no β-galactosidase is synthesized. However, when a portable promoter fragment is properly positioned in front of the fused gene, high levels of a fusion protein having β-galactosidase activity should be expressed. The plasmids are used to transform Lac- bacteria which are scored for β-galactosidase activity on lactose indicator plates. Plasmids having optimally placed promoter fragments are thereby recognized. These plasmids can then be used to reconstitute the fusion protein gene which is expressed at high levels.
Thus, in addition to the transformed host cells themselves, cultures of the cells are contemplated as within the present invention. The cultures include monoclonal (clonally homogeneous) cultures, or
cultures derived from a monoclonal culture, in a nutrient medium. Nutrient media useful for culturing transformed host cells are well-known in the art and can be obtained from several commercial sources. In embodiments wherein the host cell is mammalian, a "serum-free" medium is preferably used.
The present method entails culturing a nutrient medium containing host cells transformed with a recombinant DNA molecule of the present invention that is capable of expressing a gene encoding a subject polypeptide. The culture is maintained for a time period sufficient for the transformed cells to express the subject polypeptide. The expressed polypeptide is then recovered from the culture.
Once a gene has been expressed in high levels, a DNA fragment containing the entire expression assembly (e.g., promoter, ribosome-binding site, and fusion protein gene) may be transferred to a plasmid that can attain very high copy numbers. For instance, the temperature-inducible "runaway replication" vector pKN402 may be used. Preferably, the plasmid selected will have additional cloning sites which allow one to score for insertion of the gene assembly. See, Bittner et al. Gene, 15:31 (1981). Bacterial cultures transformed with the plasmids are grown for a few hours to increase plasmid copy number, e.g., to more than 1000 copies per cell. Induction may be performed in some cases by elevated temperature and in other cases by addition of an inactivating agent to a repressor. Very large increases in cloned fusion proteins can potentially be obtained in this way.
G. Purification of Fusion Proteins
Methods for recovering an expressed polypeptide from a culture are well-known in the art and include fractionation of the polypeptide-containing portion of the culture using well-known biochemical techniques. For instance, the methods of gel filtration, gel chromatography, ultrafiltration, electrophoresis, ion exchange, affinity chromatography, and the like, can be used to isolate the expressed proteins found in the culture. In addition, immunochemical methods, such as immunoaffinity, immunoabsorption, and the like, can be performed using well-known methods.
A preferred method for isolating a fusion protein in this invention is by affinity chromatography.
Isolation and purification of an expressed fusion protein containing a GAG-binding domain can be accomplished by affinity chromatography of the impure proteins in, for example, a cell lysate. A preferred affinity chromatography column in this invention is heparin immobilized to Affi-gel as shown in Example 7. After the lysate is applied to the column, the GAG-binding domain of the fusion protein binds to the heparin. After washing the column to remove non-bound proteins, the fusion protein can be specifically eluted with an increasing ionic strength salt gradient, preferably around 0.1 to 0.5 M NaCl. The fractions containing the purified fusion protein are collected and tested for activity in an appropriate assay, preferably in a gel activity assay. The fractions containing the highest activity of the fusion proteins are thereafter pooled. Affinity chromatography purification of fusion proteins by these means can result in greater than 95% purity.
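The elution and pooling steps described above can be sketched numerically. The following Python example assumes a linear gradient and a pooling rule that keeps fractions at or above half the peak activity. Both choices, together with the fraction count and the activity values, are hypothetical illustrations and are not taken from Example 7.

```python
from typing import List


def linear_gradient(start_molar: float, end_molar: float, n_fractions: int) -> List[float]:
    """NaCl concentration (M) assigned to each fraction of a linear gradient."""
    if n_fractions < 2:
        raise ValueError("a gradient needs at least two fractions")
    step = (end_molar - start_molar) / (n_fractions - 1)
    return [start_molar + i * step for i in range(n_fractions)]


def pool_fractions(activities: List[float], fraction_of_peak: float = 0.5) -> List[int]:
    """Indices of fractions whose activity is at least `fraction_of_peak`
    times the highest measured activity (an assumed pooling rule)."""
    peak = max(activities)
    return [i for i, a in enumerate(activities) if a >= fraction_of_peak * peak]


# Hypothetical run: six fractions across the preferred 0.1 to 0.5 M NaCl range.
salt = linear_gradient(0.1, 0.5, 6)
activity = [0.0, 5.0, 40.0, 100.0, 60.0, 10.0]  # arbitrary activity units
for i in pool_fractions(activity):
    print(f"pool fraction {i}: {salt[i]:.2f} M NaCl, activity {activity[i]}")
```

The half-of-peak threshold trades yield against purity; a stricter threshold would pool fewer fractions.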
Additional purification to remove endotoxins can also be conducted, and is desirable for therapeutic compositions containing a GAG-binding fusion protein, using endotoxin removal procedures, or the methods described in U.S. Patent No. 4,808,314, which reduce the endotoxin in the composition so as not to produce fever in rabbits in a standard pyrogen assay.
H. Therapeutic Compositions and Methods
Many of the compounds and groups involved in the instant specification (e.g., amino acid residues) have a number of forms, particularly variably protonated forms, in equilibrium with each other. As the skilled practitioner will understand, representation herein of one form of a compound or group is intended to include all forms thereof that are in equilibrium with each other.
In the present specification, "uM" means
micromolar, "ul" means microliter, and "ug" means microgram.
Therapeutic compositions of the present invention contain a physiologically tolerable carrier together with a GAG-binding fusion protein, as described herein, dissolved or dispersed therein as an active ingredient. In a preferred embodiment, the
therapeutic composition is not immunogenic when administered to a mammal or human patient for
therapeutic purposes.
As used herein, the terms "pharmaceutically acceptable", "physiologically tolerable" and
grammatical variations thereof, as they refer to compositions, carriers, diluents and reagents, are used interchangeably and represent that the materials are capable of administration to or upon a mammal without the production of undesirable physiological effects such as nausea, dizziness, gastric upset and the like.
The preparation of a pharmacological composition that contains active ingredients dissolved or dispersed therein is well-understood in the art.
Typically such compositions are prepared as injectables, either as liquid solutions or suspensions; however, solid forms suitable for solution, or suspension, in liquid prior to use can also be prepared. The preparation can also be emulsified.
The active ingredient can be mixed with
excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol or the like and combinations
thereof. In addition, if desired, the composition can contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents and the like which enhance the effectiveness of the active ingredient.
The therapeutic composition of the present invention can include pharmaceutically acceptable salts of the components therein. Pharmaceutically acceptable salts include the acid addition salts (formed with the free amino groups of the polypeptide) that are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, tartaric, mandelic and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine and the like.
Physiologically tolerable carriers are well-known in the art. Exemplary of liquid carriers are sterile aqueous solutions that contain no materials in
addition to the active ingredients and water, or contain a buffer such as sodium phosphate at physiological pH value, physiological saline or both, such as phosphate-buffered saline. Still further, aqueous carriers can contain more than one buffer salt, as well as salts such as sodium and potassium chlorides, dextrose, polyethylene glycol and other solutes.
Liquid compositions can also contain liquid phases in addition to and to the exclusion of water. Exemplary of such additional liquid phases are
glycerin, vegetable oils such as cottonseed oil, and water-oil emulsions.
1. Methods of Treatment to Reduce Tissue Damage Due to Superoxide Radicals
Methods for reducing tissue damage caused by oxygen free radical (superoxide) in vivo or in vitro are contemplated by the present invention, using a HSOD-GAG-binding fusion protein.
Human recombinant SOD can protect ischemic tissue in experimental models when injected into the circulation just prior to reperfusion (Ambrosio et al. Circulation 75:282. 1987). Injury to the endothelium, a tissue covered with glycosaminoglycans, is a major consequence of ischemia/reperfusion injury. This causes edema formation due to the loss of barrier function and favors platelet adhesion to the endothelium. The protective action of SOD is due to its scavenging of superoxide anion. SOD also protects the endothelium "in vivo" by preventing the formation of peroxynitrite, which is toxic due to its decomposition to form potent, cytotoxic oxidants (Beckman et al. Proc. Natl. Acad. Sci. USA 87:1620-1624. 1990). Postischemic injury involving the superoxide anion has been observed in the heart, intestine, liver, pancreas, skin, skeletal muscle, and kidney, and perhaps occurs in other organs (McCord Fed. Proc. 46:2402-2406. 1987). Chemical linking of SOD to albumin increases the "in vivo" half-life of SOD and has been proven to be superior to native SOD in inhibiting postischemic reperfusion arrhythmias (Watanabe et al. Biochem. Pharmacol. 38:3477-3483. 1989). The surface targeting and the increased half-life of HSODA+ will be useful in the prevention of this postischemic damage.
SOD has proven to be effective in several inflammatory diseases like osteoarthritis and rheumatoid arthritis. Local infiltration of SOD in extra-articular inflammatory processes (e.g., tendonitis, tendovaginitis, bursitis, epicondylitis, periarthritis) has also proven to be effective. Improvement upon SOD administration has also been observed in Peyronie's disease and Dupuytren's contracture (Wilsman. In Rotilio Ed. Superoxide and Superoxide Dismutase in Chemistry, Biology and Medicine. Elsevier. 1986). For these inflammatory disorders as well as for respiratory distress syndromes a cell-surface-targeted SOD with increased half-life will be a useful drug. Organ transport and organ transplant also can benefit from such an improved SOD. In addition, tissue-targeted SOD should help alleviate the toxic secondary effects of anti-cancer radiotherapy and chemotherapy. Drug (antibiotic and anticancer) induced nephritis also can be reduced by a more potent SOD.
Thus, the present invention contemplates a method of in vivo scavenging superoxide radicals in a mammal that comprises administering a therapeutically effective amount of a physiologically tolerable composition containing a HSOD-GAG-binding fusion protein to a mammal in a predetermined amount calculated to achieve the desired effect.
For instance, when used as an agent for scavenging superoxide radicals, such as in a human patient displaying the symptoms of inflammation-induced tissue damage such as during an autoimmune disease, osteoarthritis and the like, or during a reperfusion procedure to reintroduce blood or plasma into ischemic tissue such as during or after surgical procedures, trauma, in thrombi, or in transplant organs, or after episodes of infection causing massive cell death and release of oxidants, the HSOD-GAG-binding fusion protein is administered in an amount sufficient to deliver 1 to 50 milligrams (mg), preferably about 5 to 20 mg, per human adult, when the SOD fusion protein has a specific activity of about 3000 U per mg. A preferred dosage can alternatively be stated as an amount sufficient to achieve a plasma concentration of from about 0.1 ug/ml to about 100 ug/ml, preferably from about 1.0 ug/ml to about 50 ug/ml, more preferably at least about 2 ug/ml and usually 5 to 10 ug/ml.
GAG-binding fusion proteins having superoxide dismutase (SOD) activity for use in a therapeutic composition typically have about 200 to 5000 units (U) of enzyme activity per mg of protein. Enzyme assays for SOD activity are well-known, and a preferred assay to standardize the SOD activity in a fusion protein is that described by McCord et al., J. Biol. Chem., 244:6049 (1969).
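Because the mass ranges above are stated for a protein with a specific activity of about 3000 U per mg, the relationship between mass and enzyme units can be shown as simple arithmetic. The Python sketch below is illustrative only. The idea of rescaling mass to hold the delivered units constant for a preparation with a different specific activity is an assumption of this example, not a dosing rule stated in the text, and the function names are hypothetical.

```python
REFERENCE_SPECIFIC_ACTIVITY = 3000.0  # U per mg, the reference value given in the text


def dose_in_units(mass_mg: float, specific_activity: float = REFERENCE_SPECIFIC_ACTIVITY) -> float:
    """Enzyme units delivered by `mass_mg` of protein."""
    return mass_mg * specific_activity


def equivalent_mass_mg(reference_mass_mg: float, actual_specific_activity: float) -> float:
    """Mass of a preparation with a different specific activity that delivers
    the same units as `reference_mass_mg` at 3000 U/mg (assumed scaling)."""
    return reference_mass_mg * REFERENCE_SPECIFIC_ACTIVITY / actual_specific_activity


# The preferred 5 to 20 mg range expressed in units at 3000 U/mg.
print(dose_in_units(5.0), dose_in_units(20.0))
# A hypothetical 1500 U/mg preparation needs twice the mass for the same units.
print(equivalent_mass_mg(10.0, 1500.0))
```

At the reference specific activity, the preferred 5 to 20 mg range corresponds to 15,000 to 60,000 U.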
For treating arthritic conditions such as rheumatoid arthritis, tendonitis, bursitis or the like, a dosage of about 1 to 20 mg, preferably about 4 to 8 mg is administered intra-articularly per week per human adult. In certain cases, as much as 20 mg can be administered per kilogram (kg) of patient body weight.
For treating reperfusion or myocardial injuries, a dosage of 5 mg per kg of body weight is preferably administered intravenously.
The therapeutic compositions containing a GAG-binding fusion protein are conventionally administered intravenously, or intra-articularly (ia) in the case of arthritis, as by injection of a unit dose, for example. The term "unit dose" when used in reference to a therapeutic composition of the present invention refers to physically discrete units suitable as a unitary dosage for the subject, each unit containing a predetermined quantity of active material calculated to produce the desired therapeutic effect in association with the required diluent, i.e., carrier, or vehicle.
The compositions are administered in a manner compatible with the dosage formulation, and in a therapeutically effective amount. The quantity to be administered depends on the subject to be treated, capacity of the subject's immune system to utilize the active ingredient, and degree of therapeutic effect desired. Precise amounts of active ingredient
required to be administered depend on the judgment of the practitioner and are peculiar to each individual. However, suitable dosage ranges for systemic
application are disclosed herein and depend on the route of administration. Suitable regimes for initial administration and booster shots are also variable, but are typified by an initial administration followed by repeated doses at one or more hour intervals by a subsequent injection or other administration.
Alternatively, continuous intravenous infusion sufficient to maintain concentrations in the blood in the ranges specified for in vivo therapies is contemplated.
The present invention is more fully understood by the specific examples described hereinbelow, and by the appended claims.
Examples
The following examples are intended to
illustrate, but not limit, the present invention.
1. Computer Modeling of A+ Helix
a. Comparison of PCI and α1 AT Sequence
To analyze the molecular basis for glycosaminoglycan recognition and binding by the plasma serpin protein C inhibitor (PCI), computer modeling was used to construct a complete, energy-minimized, three-dimensional model of PCI. The model used as a template the structure of α1-antitrypsin (α1AT), for which a high-resolution crystallographic structure has been determined [Loebermann et al., J. Mol. Biol., 177:531-56 (1984)], based on the sequence homology of PCI to α1AT. The sequence of human PCI determined by Suzuki et al., J. Biol. Chem., 262:611-6 (1987) was aligned with the human α1AT sequence solved in three dimensions by Carrell et al., Nature, 298:329-34 (1982) using two methods: manually, using the criterion that chemically similar (i.e., hydrophobic, hydrophilic, positively charged, negatively charged) residues be optimally aligned without introducing gaps in either sequence, and automatically, using the program ALIGN [Dayhoff et al., Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3, National Biomedical Research Foundation, Washington, D.C., 345-62 (1979)] with both mutation data and identity matrices, and gap penalties ranging from 2 to 10.
The structure of PCI residues 20-391 (α1AT numbering) was built by sidechain substitution with the molecular editor Moledt (Biosym Technologies, Inc.), using the x-ray structure of α1AT available in the Brookhaven Protein Data Bank [Bernstein et al., J. Mol. Biol., 112:535-42 (1977)], entry 6API, as a template and following the original sidechain torsion angles. Sidechain collisions were corrected using new torsion angles from a rotamer library [Ponder et al., J. Mol. Biol., 193:775-91 (1987)], and small adjustments were made where necessary using the program Insight (Biosym Technologies, Inc.) on an Evans & Sutherland PS390 graphics workstation. α1AT and sidechain-substituted PCI structures were then energy-minimized using the program Discover (Biosym Technologies, Inc.) with an all-atom force field. Steepest descents minimization was applied while forcing heavy atom (non-H) positions to a template of the starting model, decreasing the force constant stepwise from 1000 to 10 kcal/Å until the potential energy derivative was less than 10 kcal mol-1 Å-1 for each atom. The energies were further minimized using the conjugate gradients algorithm [Fletcher et al., Computer J., 7:149-54 (1964)] with a force constant of 10 kcal Å-1 applied only to mainchain atoms until the maximum potential energy derivative was less than 10 kcal mol-1 Å-1. This methodology may be summarized by the steps of sidechain substitution, followed by choice of new sidechain rotamers to remove van der Waals conflicts, followed by energy minimization.
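The restrained minimization schedule described above can be illustrated with a deliberately small one-dimensional sketch. Everything in it is an assumption made for illustration only: the quadratic toy potential, the intermediate force constant of 100 (the text gives only the end points 1000 and 10), the step-size rule, and the tight convergence tolerance. It is not the Discover force field or its implementation.

```python
def model_gradient(x, x_min=1.0, stiffness=5.0):
    """Gradient of a toy quadratic 'force field' whose minimum lies at x_min."""
    return stiffness * (x - x_min)


def restrained_descent(x_start, x_template, force_constants,
                       stiffness=5.0, tol=1e-6):
    """Steepest descent with a harmonic restraint to a template position.

    The restraint energy is 0.5 * k * (x - x_template)**2; k is lowered
    stepwise so the coordinate is released gradually from the template.
    """
    x = x_start
    for k in force_constants:
        step = 0.5 / (stiffness + k)  # stable step size for this quadratic toy
        while True:
            grad = model_gradient(x, stiffness=stiffness) + k * (x - x_template)
            if abs(grad) < tol:
                break
            x -= step * grad
    return x


# Hypothetical schedule: only the values 1000 and 10 come from the text.
final_x = restrained_descent(0.0, 0.0, [1000.0, 100.0, 10.0])
print(round(final_x, 4))
```

With a stiff restraint the coordinate stays near the template, and as the force constant is lowered it relaxes toward the minimum of the toy potential, which is the qualitative behavior the stepwise schedule is meant to achieve.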
The resulting sequence alignment, based on manual alignment of chemically similar residues, involved no insertions or deletions. Alignments made with the ALIGN program were not used because in this case it introduced many gaps that only marginally improved the sequence homology, even when a large gap penalty was used. The high degree and uniformity of the PCI-α1AT sequence homology suggested that the structure of PCI is largely similar to that of α1AT. Of the residues of PCI modeled on α1AT (20-391 in the α1AT numbering system), 44.4% were identical and 71.8% were conserved within hydrophobicity categories. Rose et al., Science, 229:834-838 (1985). The two proteins were homologous along the entire length of PCI except for the 27 N-terminal residues and residues 351-382, which surround the reactive center cleavage site at residue 358. Realignment of this segment would have involved introducing two multi-residue gaps without significantly improving the number of identical residues in the alignment. A study of the relationship between atomic coordinates in modeled and actual structures of a number of proteins [Blundell et al., Nature, 326:347-352 (1987)] has shown that the expected rms deviation between atomic positions in proteins having ~45% sequence identity is 1.0 Å.
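The percent-identity and percent-conservation figures quoted above are simple counts over a gapless alignment. The following is a minimal sketch of that bookkeeping. The three residue classes are an assumption chosen for illustration (the categories of Rose et al. are cited in the text but not listed there), and the two short sequences are hypothetical, not PCI or α1AT.

```python
# Assumed residue classes, for illustration only.
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "neutral": "GPSTYH",
    "hydrophilic": "DENQKR",
}
RESIDUE_CLASS = {res: name for name, group in CLASSES.items() for res in group}


def alignment_stats(seq_a, seq_b):
    """Return (% identical, % conserved within class) for a gapless alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("a gapless alignment requires equal-length sequences")
    n = len(seq_a)
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    conserved = sum(RESIDUE_CLASS[a] == RESIDUE_CLASS[b]
                    for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / n, 100.0 * conserved / n


# Hypothetical five-residue example.
identity, conservation = alignment_stats("ALKDG", "AIRDL")
print(identity, conservation)
```

Identical positions are counted as conserved as well, so the conservation percentage is always at least as large as the identity percentage.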
b. Models of the Amino Terminus of PCI
Since the 15 N-terminal residues (residues 5-19 in α1AT numbering) of PCI are not sequence-homologous to α1AT and the N-terminus of α1AT is not visible in the electron density map, the structure of these residues in PCI was modeled independently. α-Helical, β-strand, and loop models were built using the molecular editor Moledt and analyzed in several orientations on the surface of the minimized PCI model using the graphics program Insight. To evaluate shape complementarity, the solvent-accessible molecular surfaces of the N-terminus and PCI models were calculated using the program MS [Connolly et al., Science, 221:709-13 (1983)] with a 1.6 Å probe sphere and surface dot density of 2 dots/Å2. For the most plausible orientations of the N-terminus on the surface of PCI, the surface buried between the N-terminus and PCI was calculated using MS and analyzed using graphics programs GRAMPS [O'Donnell et al., Comput. Graph. 15:133-142 (1981)], GRANNY [Connolly et al., Comput. Chem., 9:1-6 (1985)], and Insight on Evans & Sutherland MPS and PS390 graphics workstations. These N-terminus models were then evaluated for hydrogen bonding and charge complementarity with the PCI surface. For the best conformations and orientations of the amino-terminal residues, the peptide bond between residue 19 (the C-terminus of the amino-terminal segment) and residue 20 (the N-terminus of residues 20-391 of PCI) was made by optimally orienting the N-terminal segment, then making minimal changes in the backbone torsion angles of residues 19 and 20 and their nearest neighbors (using Moledt) in order to align the carbonyl C of residue 19 with the amino N of residue 20. Plausible models of PCI including the N-terminus were energy-minimized (using the methods described above) to alleviate unfavorable residue contacts and to improve the conformations of the residue 19-20 turn and the N-terminal segment.
Analysis of the α-helix, β-strand, and loop models of the 15 N-terminal residues showed that an α-helical structure was most likely. Possible β-strand and loop structures were deemed implausible due to their length, lack of favorable sidechain interactions, and the absence of suitable mainchain hydrogen bonds. The helicity of these residues in the PCI model was further supported by two secondary structure algorithms [Garnier et al., J. Mol. Biol., 120:97-120 (1978); Novotny et al., Nucleic Acids Res., 12:243-255 (1984); Chou et al., Adv. Enzymol. Relat. Areas Mol. Biol., 47:45-148 (1978)] which both predicted that the first nine residues of PCI (residues 9-17 in α1AT numbering) fold as an α-helix. When represented on an α-helical wheel, residues 8-18 formed a highly amphipathic α-helical structure with only hydrophobic residues on one side and only charged residues on the other. Studies of known protein structures show that sequence amphipathicity correlates with structural periodicity, and in particular that the sequences of surface α-helices are often amphipathic with a period of ~3.6 residues. Eisenberg et al., Nature, 299:371-374 (1982); Eisenberg et al., Proc. Natl. Acad. Sci. USA, 81:140-144 (1984).
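The helical-wheel argument above can be made quantitative with a hydrophobic moment, the vector sum of residue hydrophobicities when successive residues are rotated by about 100 degrees (3.6 residues per turn). The sketch below is illustrative only. The crude +1/0/-1 scale is an assumption standing in for a published hydrophobicity scale, and the test peptides are hypothetical constructs, not the A+ helix sequence.

```python
import math

# Crude illustrative scale (an assumption, not a published scale):
# +1 for hydrophobic residues, -1 for charged residues, 0 otherwise.
HYDROPHOBIC = set("AVLIMFWC")
CHARGED = set("DEKR")


def hydrophobicity(residue):
    if residue in HYDROPHOBIC:
        return 1.0
    if residue in CHARGED:
        return -1.0
    return 0.0


def hydrophobic_moment(sequence, angle_deg=100.0):
    """Magnitude of the summed hydrophobicity vectors on a helical wheel."""
    delta = math.radians(angle_deg)
    sum_cos = sum(hydrophobicity(r) * math.cos(i * delta)
                  for i, r in enumerate(sequence))
    sum_sin = sum(hydrophobicity(r) * math.sin(i * delta)
                  for i, r in enumerate(sequence))
    return math.hypot(sum_cos, sum_sin)


def ideal_amphipathic(length, angle_deg=100.0):
    """Hypothetical peptide: Leu on one face of the wheel, Lys on the other."""
    delta = math.radians(angle_deg)
    return "".join("L" if math.cos(i * delta) > 0 else "K"
                   for i in range(length))


amphipathic = ideal_amphipathic(18)
alternating = "LK" * 9  # same residue types, but with a period of 2, not 3.6
print(amphipathic, round(hydrophobic_moment(amphipathic), 2))
print(alternating, round(hydrophobic_moment(alternating), 2))
```

A sequence whose hydrophobic and charged residues segregate to opposite faces of the wheel gives a large moment, whereas the strictly alternating sequence gives a moment near zero at the α-helical angle.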
The N-terminal α-helix region formed by residues 5-19, designated A+ helix, was found to be complementary in shape to two surface grooves extending from residue 20, and in each groove the α-helix directed its charged sidechains toward the solvent while burying its hydrophobic sidechains against hydrophobic residues of PCI. Detailed analyses were made of the two structures resulting from energy-minimization of PCI modeled with the A+ helix attached in both orientations: alongside the A helix at an angle of 25 degrees to the H helix (model I) and antiparallel to the H helix at an angle of 40 degrees (model II).
c. Evaluation of the Complete PCI Model
Changes in heavy atom positions during energy minimization of α1AT and PCI were moderate and distributed throughout the proteins' structures. The rms deviation between heavy atom positions before and after energy minimization, calculated using the program RMSN [Roberts et al., Isr. J. Chem., 27:198-210 (1986)], was 1.16 Å for PCI without the A+ helix (0.84 Å for Cα), and 1.17 Å for α1AT (0.77 Å for Cα). Superposition of the energy-minimized structure of PCI without the 15 N-terminal residues on that of α1AT showed their close similarity with an overall rms deviation of 0.83 Å between post-minimization Cα positions in PCI and α1AT (residues 20-391). The most significant deviations in Cα positions between PCI and α1AT were in the H helix and apparently resulted from the large number of repulsive positive charges in that area of PCI. The rms deviation between heavy atom positions in pre- and post-minimized structures of PCI including the A+ helix was 1.21 Å (0.87 Å for Cα) for model I and 1.15 Å (0.83 Å for Cα) for model II. After minimization, the N-terminus of the A+ helix was less tightly coiled, which is a common feature in known structures. Minimization also improved the geometry of the turn at residues 19 and 20, with the rest of the PCI structures being substantially the same as that of PCI minimized without the A+ helix. The energy-minimized models I and II were judged reasonable based on the moderate rms deviations, conservation of hydrophobic core residues and secondary structure, the absence of buried charged residues, and the appropriateness of turn residues.
d. Characteristics of the Region of the PCI Model Responsible for Electrostatic Recognition of Heparin
To identify positive regions of PCI that could bind the negatively charged groups of glycosaminoglycans (GAGs), the electrostatic potential at the surface of modelled PCI was analyzed. Electrostatic potentials were calculated at all points on the solvent-accessible surfaces of the energy-minimized PCI models using programs ESPOT and ESSURF [Getzoff et al., Nature, 306:287-90 (1983); Getzoff, et al. Biophys J. (1986)] with two different dielectric models: the dielectric constant of bulk water, ε=80, and a linear distance-dependent dielectric, ε=4r. The electrostatic potential value for each molecular surface point was color-coded on the surface and displayed using GRAMPS and GRANNY.
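The two dielectric models named above differ only in how the Coulomb term is screened. The following sketch shows the distinction for point charges. It is a simplified illustration under stated assumptions, not the ESPOT or ESSURF programs: the function name, the tuple layout of the charges, and the single test charge are all hypothetical, and the factor 332.06 is the usual conversion that yields kcal mol-1 for unit charges and distances in Å.

```python
import math

COULOMB_KCAL = 332.06  # kcal Å mol^-1 e^-2 (standard conversion factor)


def potential(point, charges, dielectric="constant"):
    """Coulomb potential (kcal/mol per unit charge) at a surface point.

    charges is a list of (x, y, z, q) tuples with coordinates in Å.
    dielectric is "constant" (epsilon = 80) or "distance" (epsilon = 4r).
    The point must not coincide with a charge (r would be zero).
    """
    if dielectric not in ("constant", "distance"):
        raise ValueError("dielectric must be 'constant' or 'distance'")
    total = 0.0
    for x, y, z, q in charges:
        r = math.dist(point, (x, y, z))
        eps = 80.0 if dielectric == "constant" else 4.0 * r
        total += COULOMB_KCAL * q / (eps * r)
    return total


# Hypothetical example: one positive unit charge 5 Å from the surface point.
charge = [(0.0, 0.0, 0.0, 1.0)]
print(potential((5.0, 0.0, 0.0), charge, "constant"))
print(potential((5.0, 0.0, 0.0), charge, "distance"))
```

At short range the distance-dependent model screens far less than bulk water, so absolute potentials differ between the two models even when they identify the same positive region.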
PCI exhibited an overall electrostatic dipole, with a highly positive region including the H helix opposed by a weakly negative region centered on Asp 121. In the electrostatic potential surfaces calculated with a dielectric constant of 80, the models including the A+ helix had a single, highly positive (≥3 kcal mol-1) surface region centered on Arg 10 and Lys 274, which protrude from the A+ and H helices. Other central positive residues were Lys 14 and Lys 270 in model I and Arg 6, Lys 277 and Lys 280 in model II. The positive region in both models formed a single face of the protein, had an area (1365 Å2 in model I and 1705 Å2 in model II) consistent with other protein interfaces [Janin et al., J. Mol. Biol., 204:155-164 (1988)], and was large enough to bind an extended heparin octasaccharide chain, a minimal unit for heparin binding in ATIII. Thunberg et al., FEBS Lett., 117:203-206 (1980). Positive residues contributing at least 30 Å2 to the ≥3 kcal mol-1 surface in model I were residues 5-8, 10, 13-15, 86, 270, 273-274, and 277, and in model II were residues 6-7, 10, 13, 270, 273-274, 277, and 280-282. All of these except Lys 86 were either in the A+ or H helices or immediately followed the H helix. The region with potential ≥3 kcal mol-1 constituted 9% of PCI's total surface in model I and 11% in model II.
Electrostatic calculations done on PCI surfaces either without the A+ helix or with the H helix charges neutralized indicated that neither the A+ helix nor the H helix alone generated a strongly positive surface; in both cases there was less than 19 Å2 of surface with potential ≥3 kcal mol-1. There was also no significant positive surface associated with the D helix, which is implicated in heparin binding to ATIII. Carrell et al., Thrombosis and Haemostasis 1987. Verstraete et al., eds. pp. 1-15 (1987). These results suggest that in PCI both helices A+ and H, but not helix D, are essential for GAG binding. The distance-dependent dielectric model gave qualitatively similar results, with the only significantly positive surface region of PCI centered on the A+ and H helices. Thus, choice of dielectric model did not appear to affect the ability to identify the highly positive A+ and H helix region as forming the single likely surface for heparin recognition.
2. Assessing the Electrostatic Interaction of Heparin with PCI
The role of electrostatics in the interaction of heparin with PCI was assessed experimentally by measuring the ionic strength effect on heparin stimulation of the formation of PCI complexes with activated protein C (APC), since heparin interaction with PCI accelerates complex formation. The second-order rate constant, k2, for the inhibition of APC by PCI was determined as described by Espana et al., Thromb. Res. 55:369-84 (1989). PCI (2-1600 nM) in 0.01 M Tris HCl, 1% BSA, and 0.02% NaN3 at pH 7.4, with NaCl added to give ionic strengths between about 0.09 and 0.25, was incubated for 5 min at 37°C with heparin at 0, 0.8 and 1.6 units/ml, after which APC (2 nM) was added. At timed intervals, aliquots were withdrawn and diluted 30-fold with 1 mM of the chromogenic substrate S-2366 (Kabi Vitrum, Stockholm, Sweden) in 0.05 M Tris HCl, 0.1 M NaCl, 4 mM CaCl2, and 0.02% NaN3 at pH 8.2. Residual APC activity was determined by the rate of change in absorbance at 405 nm, compared to controls without added PCI. Pseudo-first order rate constants were calculated from initial slopes in plots of the natural log (ln) of APC activity versus time and k2 values were obtained based on the concentration of PCI and are shown in Table 3.
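The rate-constant calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of Espana et al.: it assumes that PCI is in sufficient excess over APC for pseudo-first-order kinetics, that k2 is obtained by dividing the pseudo-first-order constant by the PCI concentration, and it uses synthetic activity values, not data from the patent.

```python
import math


def pseudo_first_order_k(times, activities):
    """Negative least-squares slope of ln(residual activity) versus time."""
    logs = [math.log(a) for a in activities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx


def second_order_k2(k_obs, inhibitor_conc_molar):
    """k2 = k_obs / [PCI], valid when PCI is in excess over APC (assumption)."""
    return k_obs / inhibitor_conc_molar


# Synthetic example: activity decaying with k_obs = 0.1 per minute.
times_min = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [math.exp(-0.1 * t) for t in times_min]
k_obs = pseudo_first_order_k(times_min, activity)
k2 = second_order_k2(k_obs, 100e-9)  # hypothetical 100 nM PCI
print(k_obs, k2)
```

The units of k2 follow from the inputs; with time in minutes and concentration in mol/L, k2 is in M-1 min-1.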
Figure imgf000080_0001
Examination of the ionic strength dependence of heparin-stimulated inhibition of APC by PCI, as measured by the rate constant (k2) of PCI:APC complex formation shown in Table 1, indicates that heparin stimulation fell sharply above physiological ionic strength. This confirmed that electrostatic shielding impairs PCI-heparin recognition and supported the role of multiple complementary electrostatic interactions in the heparin-binding process indicated by the models. The heparin concentrations were chosen for optimal stimulation of complex formation, and the effect of ionic strength in the absence of heparin was measured for comparison.
3. Localizing the Heparin Binding Site to the A+ Region of PCI
The specific role of the A+ helix of the modeled PCI in heparin binding was verified by immunochemical experiments. For heparin-binding studies, murine monoclonal antibodies were raised against intact purified PCI. Meijers et al. Blood, 72:1401-3 (1988). IgG was purified by Protein A-Sepharose chromatography (Pharmacia Laboratories, Uppsala, Sweden) as recommended by the manufacturer. PCI (4 μg) was incubated for 60 minutes at 22°C with the anti-PCI antibodies API39 (48 μg), API60 (48 μg), or buffer, in 200 μl of 0.01 M Tris and 0.14 M NaCl at pH 7.4. The sample was adjusted to 0.1 M NaCl in a final volume of 400 μl and loaded onto a 0.6 ml column of heparin-agarose (Sigma). Using an FPLC liquid chromatography gradient programmer (Pharmacia), PCI was eluted (0.1 ml fractions) with a linear gradient from 0.1 to 0.6 M NaCl. The elution profiles were determined using an ELISA assay for PCI antigen as described by Espana et al., Thromb. Res. 55:671-82 (1989).
As shown by the above immunochemical experiments, an anti-PCI monoclonal antibody (API39) neutralizes heparin stimulation of APC inhibition by PCI [Meijers et al. Blood, 72:1401-3 (1988)], and by ELISA and peptide competition assays binds specifically to a peptide corresponding to the A+ helix. Antibody API39 prevented PCI from binding to a heparin-agarose column. A control antibody (API60) that binds to PCI but not to peptides from the A+ or H helix regions neither affects heparin stimulation nor prevents PCI from binding to a heparin-agarose column. These antibody binding experiments indicate that the A+ helix is in the heparin-binding site of PCI.
4. Identification of a Two-Helix Motif for Glycosaminoglycan Recognition
The strikingly positive helix pair that forms the heparin recognition surface of PCI identified by the studies in Examples 1-3 is similar to the twin helical motif thought to bind heparin in dimers of platelet factor 4, a nonhomologous protein whose structure has recently been determined. St. Charles et al., J. Biol. Chem., 264:2092-2099 (1989). GAG recognition in ATIII may be a variation on this common theme, involving positive residues in both the D helix [Carrell et al., Thrombosis and Haemostasis 1987, Verstraete et al., eds., Leuven University, pp. 1-15 (1987)], and the N-terminal region. Electrostatic surface calculations on ATIII modeled without the 44 N-terminal residues indicate that the D helix alone does not generate a large, positive surface, as only 64 Å2 of the ATIII surface had an electrostatic potential ≥3 kcal mol-1. Thus, residues in the unmodeled N-terminus of ATIII are likely required for electrostatic recognition of heparin and may provide an amphipathic helix that plays a role analogous to that of the A+ helix residues in PCI.
Sequential helix pairs in the helix-turn-helix DNA recognition motif are structurally conserved amongst DNA-binding proteins. Richardson et al., Proteins: Struct. Funct. Genet., 4:229-239 (1988). For GAG-binding proteins, non-sequential amphipathic helix pairs perform an analogous role, with the separation of the rigid, independent helices in serpins allowing conformational changes that affect GAG-binding function by optimizing shape and electrostatic complementarity. Beyond elucidating heparin-PCI interactions, the detailed structural model of PCI presented here will facilitate the design of inhibitor-resistant antithrombotic agents based on APC, as has recently been done with tissue-type plasminogen activator. Madison et al., Nature, 339:721-724 (1989).
5. Preparation of a Vector for Expressing the SOD-A+ Fusion Protein
A fusion protein was constructed to contain the heparin binding region of PCI, namely the A+ amphipathic α helix of PCI modelled in Example 1, and a protein of considerable biological interest, namely human superoxide dismutase (HSOD). The resulting fusion protein, designated as SOD-A+, contains three regions: a first region comprised of a polypeptide having the amino acid residue sequence of HSOD, a second region comprised of a polypeptide linker to connect the first and third regions, and a third region comprised of a polypeptide having the amino acid residue sequence of the A+ helix of PCI. The amino acid residue sequence of SOD-A+ is shown in Figure 1, including the first SOD region defined by residues 1-153, the second linker region defined by residues 154-156, and the third A+ region defined by residues 157-171.
The construction and expression of SOD-A+ was done in the pPHSODlacI vector from Chiron Corporation (Emeryville, CA). This vector contains the SalI-EcoRI fragment from pBR322, coding for the β-lactamase and the origin of replication. The lacIq gene was inserted as an EcoRI cassette. Downstream from lacIq, a tac promoter as described by Amann et al., Gene, 25:167 (1983), was added as an EcoRI-NcoI fragment. To maximize expression of SOD, a stop-start mini gene was added from the NcoI site. This optimized the ribosome binding site sequence as described by Hallewell et al., Nucl. Acids Res., 13:2017-2034 (1985). A leader sequence from Photobacterium leiognathi described by Steinman, J. Biol. Chem., 262:1882-1887 (1987) and corresponding to residues -21 to 0 in Figure 1 was also put into the construct to allow secretion of SOD and ease protein purification later on. The synthetic HSOD gene described in Hallewell et al., J. Biol. Chem., 264:5260-5268 (1989) was inserted downstream from the leader sequence as a HindIII-SalI fragment. The HindIII site was created in the leader sequence. The HSOD protein encoded by the synthetic HSOD gene differs from wild type HSOD in that it contains alanine and serine in place of the cysteines at amino acid residue positions 6 and 111, respectively. All experiments were carried out using E. coli MC1061 (araD139, delta (araleu)7696, delta (lac) 174, galU, galK, hsdR, strA) [Huynh et al., DNA Cloning, vol. 1, Glover, Ed., IRL Press Ltd., Oxford, Eng., pp. 56-110 (1985)]. However, for therapeutic use in humans it is preferred that the HSOD be produced in yeast to obtain amino terminal acetylation like wild type HSOD protein found in humans. A yeast expression system for HSOD is described in Hallewell et al., J. Biol. Chem., 264:5260-5268 (1989) and also in Hallewell et al., Biotechnology, 5:363-366 (1987).
6. Synthetic Oligonucleotides
Oligonucleotides were synthesized by phosphoramidite chemistry [Beaucage et al., Tetrahedron Letters, 22:1859-1862 (1982)] on an Applied Biosystems DNA synthesizer model 380B. To add the Gly-Pro-Gly linker and the A+-helix to the carboxy terminus of HSOD, two oligonucleotides corresponding to the HSOD sequence from the BamHI site of the synthetic gene to the end of the amino acid coding sequence were designed. The coding strand was shortened by two nucleotides at the 3' end to allow overlapping complementary ends needed for assembly with the Gly-Pro-Gly A+-helix coding sequence. The complementary strand was extended by a glycine anticodon (GCC) of the Gly-Pro-Gly linker to provide a 5' overhang for the assembly as noted above. Such a construct allows HSOD carboxy terminal fusion with any polypeptide beginning with glycine. Another set of oligos was synthesized to encode the remaining parts of the Gly-Pro-Gly linker and the A+-helix. The coding sequence was generated by reverse translation of the Gly-Pro-Gly A+-helix peptide sequence and addition of a stop codon (TAA) followed by a SalI cleavage site. An XmaI restriction site was also introduced in the linker sequence, which the degeneracy of the genetic code permits. The XmaI site in the linker sequence allows further modifications of the linker sequence if needed.
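The way codon degeneracy allows a restriction site to be placed in the linker can be illustrated by enumerating every DNA sequence that encodes Gly-Pro-Gly and keeping those that contain the XmaI recognition sequence CCCGGG. This is a sketch for illustration only; the function and variable names are hypothetical, and the codon actually chosen in the construct is given by Figure 1, not by this enumeration.

```python
from itertools import product

# Standard genetic code: glycine is GGN and proline is CCN.
CODONS = {
    "G": ["GGA", "GGC", "GGG", "GGT"],
    "P": ["CCA", "CCC", "CCG", "CCT"],
}
XMAI_SITE = "CCCGGG"


def codings_with_site(peptide, site):
    """All DNA codings of the peptide that contain the recognition site."""
    hits = []
    for combo in product(*(CODONS[aa] for aa in peptide)):
        dna = "".join(combo)
        if site in dna:
            hits.append(dna)
    return hits


hits = codings_with_site("GPG", XMAI_SITE)
for dna in hits:
    print(dna)
```

Among the codings found, those beginning with GGC are the ones compatible with the glycine anticodon GCC carried by the complementary HSOD-end oligonucleotide described above.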
The sequence of the four oligonucleotides used can be found in Figure 1. Oligonucleotide HUCLI corresponds to the sequence of nucleotides 488 to 523. Oligonucleotide HUCLIZ is the complement of nucleotides 492 to 528. Oligonucleotide PCIHEPBI corresponds to the nucleotide sequence 524 to 583. Oligonucleotide PCIHEPBZ is the complement of nucleotides 529 to 587.
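The coordinates given above determine the single-stranded ends of each annealed pair. The sketch below does that arithmetic. It is an illustration only: the function name is hypothetical, and it assumes each pair anneals in register with the Figure 1 numbering.

```python
def duplex_ends(coding, complement):
    """Describe an annealed pair from (start, end) coordinates on the coding strand.

    Returns (left, paired, right): left > 0 means the coding oligo is unpaired
    at the upstream end; right > 0 means the complementary oligo is unpaired
    at the downstream end.
    """
    left = complement[0] - coding[0]
    right = complement[1] - coding[1]
    paired = min(coding[1], complement[1]) - max(coding[0], complement[0]) + 1
    return left, paired, right


# Coordinates as stated in the text (Figure 1 numbering).
hsod_end = duplex_ends((488, 523), (492, 528))   # HUCLI / HUCLIZ
a_helix = duplex_ends((524, 583), (529, 587))    # PCIHEPBI / PCIHEPBZ
print(hsod_end, a_helix)
```

Under these assumptions the two duplexes share a five-nucleotide junction (nucleotides 524 to 528), which agrees with a coding strand shortened by two nucleotides plus a three-nucleotide glycine anticodon on the complementary strand, and each outer end carries a four-nucleotide single-stranded extension, as would be expected for BamHI and SalI cohesive ends.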
After synthesis, the oligos were deprotected and precipitated as in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Second Ed. (1989). To avoid the formation of multimers, only the oligos that did not have a palindrome at the 5' ends were kinased. The oligos were then gel purified on 10% or 12% acrylamide-urea gels (Sambrook et al., Supra). The purified oligonucleotides were hybridized pairwise, HUCLI with HUCLIZ and PCIHEPBI with PCIHEPBZ, 10 μg of each in 100 ml water for 1 minute at 90°C followed by cooling down to room temperature for 5 minutes. The hybridized oligos, HUCLI with HUCLIZ and PCIHEPBI with PCIHEPBZ, were ligated with T4 DNA ligase (New England Biolabs) according to the manufacturer's instructions. The resulting BamHI-SalI cassette was substituted for the BamHI-SalI fragment of the HSOD synthetic gene. The BamHI-SalI cassette was ligated into the dephosphorylated vector and transformed into E. coli MC1061 by the CaCl2 technique (Sambrook et al., Supra). The bacteria were plated and grown at 37°C overnight on Luria-Bertani (LB) agar containing 200 μg per ml ampicillin. Recombinants were screened by DNA alkaline mini-preps (Sambrook et al., Supra) and assayed by NcoI-SalI restriction enzyme digests for the presence of an insert larger than the wild type HSOD gene. Final confirmation of the clones' identity was done by dideoxy nucleotide sequencing [Sanger et al., Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977)] of the NcoI-SalI fragment in the single strand phage M13. Clones having a larger insert after NcoI-SalI digestion and including a sequence shown in Figure 1 from nucleotide base 1 to base 588 were selected and designated as containing the plasmid pPHSODIqHPCI4.
The plasmid pPHSODIqHPCI4 has been deposited with the American Type Culture Collection (ATCC; Bethesda, MD) in the form of a transformed E. coli containing the plasmid on November 1, 1990, by the depositor Chiron Corporation (Emeryville, CA) and has been assigned a deposit accession number that is available from the ATCC.
Alternate expression vectors capable of producing HSOD-A+ fusion protein can be prepared from the deposited plasmid material using well-known methodologies. For general methods of molecular biology, see "Gene Expression Technologies" in Meth. Enzymol. Volume 185, Ed. by Goeddel, Academic Press (1990).
An exemplary alternate expression system can be prepared as follows. The approximately 30 base pair (bp) NcoI-PstI polylinker is first isolated from the pPROK-1 vector available from Clontech Laboratories (Palo Alto, CA). The SalI site of pKK233-2 available from Clontech is disabled by first digesting pKK233-2 with SalI, filling in the cohesive SalI termini, then religating the resulting blunt ends to form a circular pKK233-2 plasmid with a disabled SalI site. pKK233-2 is digested with NcoI and PstI, and the 30 bp NcoI-PstI polylinker is ligated into pKK233-2 to provide a NcoI-SalI site. Deposited pPHSODIqHPCI4 is digested with NcoI and SalI to remove the HSOD-A+ fusion protein encoding gene cassette, and the cassette is inserted into the NcoI and SalI site of the above-modified pKK233-2 vector. Thereafter, the pKK233-2 vector having the HSOD-A+ protein encoding gene can be introduced into a suitable lacIq strain of E. coli (e.g., JM109) for expression of the HSOD-A+ protein.
7. Expression and Purification of SOD-A+ Fusion Protein
E. coli cells containing the pPHSODIqHPCI4 plasmid were grown overnight at 37°C as a starter culture of 100 ml Luria-Bertani broth containing 200 μg per ml ampicillin. Nine liters (L) of the same culture media were inoculated with the starter culture. Five ml of SAG-471 antifoaming agent were added to the 9L culture and air was bubbled through the media during growth. The cells were grown at 30-35°C until mid-log phase (approximately 6 hours). Induction of HSOD-A+ protein synthesis was then initiated by addition of isopropylthio-β-D-galactoside (IPTG) to a final concentration of 0.2 mM. At the same time, the culture was also supplemented with 250 μM CuSO4 to provide a source of copper for SOD. The culture was then maintained overnight. One hour before harvest the media was supplemented with CuSO4 to a final concentration of 1 mM. Zn++ is ubiquitous in LB media reagents but must also be present for proper HSOD-A+ assembly during protein expression.
The periplasmic fraction of the bacterial cells was extracted by a modification of the osmotic shock procedure of Koshland et al., Cell, 20:749-760 (1980). The cells were centrifuged down into two one liter bottles (3.5k rpm for 15 minutes in a Beckman J-6B centrifuge maintained at 4°C). Each pellet was resuspended with 250 ml of cold (4°C) 20 mM Tris-HCl pH 7.5. All subsequent manipulations were carried out at 4°C. 250 ml of cold 40% sucrose solution and 30 ml of 250 mM EDTA were added to each suspension. The suspensions were shaken on a platform rotary shaker at 100 rpm for 20 minutes. The cells were repelleted at 4k rpm for 20 minutes and resuspended in a total of 200 ml cold distilled water. The pooled bacterial suspension was shaken again as previously described. The cells were finally pelleted at 8k rpm for 20 minutes. The supernatant, containing the periplasmic fraction, was collected and stored at -70°C until purification.
The periplasmic fraction was estimated to contain 5 mg per ml of HSOD-A+ as determined by coomassie blue staining of SDS-polyacrylamide gel [Laemmli, UK, Nature, 227:680-685 (1970)]. The superoxide dismutase activity of HSOD-A+ was assayed on native 10% polyacrylamide gels (non-denaturing gels) as described by Beauchamp et al., Anal. Biochem., 44:276-287 (1971). Briefly, a sample of periplasmic fraction containing approximately 0.5 μg of HSOD-A+ was loaded on the gel and run at 60 volts for 6 to 7 hours. The gel was then soaked in the dark for 20 minutes in a 2 mg per ml solution of nitroblue tetrazolium in water. After rinsing in water, the gel was then soaked in the dark for 20 minutes in a 0.036 M KPO4 buffer at pH 7.8 containing 0.28 M TEMED and 2.8 x 10-5 M riboflavin. The gel was then exposed to light until enough contrast developed to show white bands on a purple background, typically after about 15 minutes at 25°C.
HSOD-A+ was then further isolated from the periplasmic fraction, first by heparin affinity chromatography on an Affi-Gel heparin column (BioRad, Richmond, CA). Twenty five ml of the periplasmic fraction was loaded onto a 40 ml Affi-Gel heparin column. The column was eluted at a flow rate of 1 ml per minute with 200 ml of a 0.2 M Tris pH 7.0 buffer generating a linear gradient from 0.03 M to 0.4 M NaCl. Fractions of 5 ml were collected and tested by SDS-polyacrylamide gel electrophoresis. HSOD-A+ eluted in fractions number 18 to 28 corresponding to elution buffer containing around 0.2 M salt. After that purification step, HSOD-A+ was estimated to be more than 95% pure and fully active based on the above gel activity assay. The heparin binding property of HSOD-A+ was demonstrated in vitro by using a heparin binding assay that measures retention of HSOD-A+ on the heparin column described above. In the assay, co-elution was conducted and compared using equivalent amounts of crude HSOD-A+ and of recombinant purified HSOD made in yeast [Hallewell et al., Biotechnology, 5:363-366 (1987)]. With the Affi-Gel heparin column described above, the HSOD was all eluted before the gradient reached 0.1 M salt while SOD-A+ eluted at about 0.2 M, indicating that the addition of a GAG-binding moiety to HSOD significantly increased the GAG-binding capacity of the SOD-A+ fusion protein.
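The nominal salt concentration of a given fraction in a linear gradient follows directly from the gradient parameters stated above. The sketch below is a rough illustration under stated assumptions: fraction numbering is taken to start with the gradient, column dead volume and mixing delay are ignored, and the function names are hypothetical. Actual concentrations at the column outlet will lag these nominal values.

```python
def gradient_conc(volume_ml, start_m=0.03, end_m=0.4, total_ml=200.0):
    """Nominal NaCl molarity after volume_ml of a linear gradient."""
    frac = min(max(volume_ml / total_ml, 0.0), 1.0)
    return start_m + (end_m - start_m) * frac


def fraction_conc(fraction_number, fraction_ml=5.0):
    """Nominal NaCl molarity at the midpoint of a numbered fraction."""
    midpoint_ml = (fraction_number - 0.5) * fraction_ml
    return gradient_conc(midpoint_ml)


# Nominal concentrations for the first and last fractions reported in the text.
for number in (18, 28):
    print(number, round(fraction_conc(number), 3))
```

Such a calculation is useful mainly for comparing elution positions of two proteins run on the same column and gradient, as in the HSOD versus SOD-A+ comparison above.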
Before animal testing the pooled heparin column fractions containing SOD-A+ were dialyzed against 20 mM Tris-HCl pH 8.0 and loaded onto a 1.5 x 75 cm DEAE-Sepharose CL-6B chromatography column (Pharmacia LKB Biotechnology Inc.). The column was washed with the same buffer, eluted with a gradient up to 0.15 M NaCl in the Tris-Cl buffer, and fractions containing SOD-A+ were pooled. To remove possible traces of endotoxin the purified SOD-A+ was submitted to octyl-Sepharose chromatography.
8. In Vivo Animal Testing for HSOD-A+ Half-Life in Blood
Mice were injected with 2 mg of HSOD-A+ or with recombinant HSOD [Hallewell et al., Biotech., 5:363-366 (1987)] for control. The proteins were resuspended in physiologic saline. Sera were taken at 0, 2, 15, 60 and 240 minutes after injection. After 240 minutes a bolus of heparin was injected and sera were taken 10 minutes thereafter, at 250 minutes. The half-life was estimated by the SOD gel activity assay as described above. The recombinant HSOD had a half-life of less than 13 minutes, most likely between 7 and 10 minutes. The HSOD-A+ half-life can be estimated to be around 15 minutes.
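A half-life can be estimated from activity measured at two time points if clearance is assumed to follow a single exponential. The sketch below shows that calculation. The single-exponential model is an assumption not stated in the text, and the activity values are hypothetical, not measurements from the experiment above.

```python
import math


def half_life(t1, activity1, t2, activity2):
    """Half-life assuming single-exponential decay between two time points."""
    rate = math.log(activity1 / activity2) / (t2 - t1)
    return math.log(2.0) / rate


# Hypothetical example: activity falls to one quarter between 2 and 32 minutes.
estimate = half_life(2.0, 100.0, 32.0, 25.0)
print(round(estimate, 2))
```

Gel-based activity readings are semi-quantitative, so an estimate of this kind is best treated as a range rather than a single value.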
The foregoing specification, including the specific embodiments and examples, is intended to be illustrative of the present invention and is not to be taken as limiting. Numerous other variations and modifications can be effected without departing from the true spirit and scope of the present invention.
# La Jolla, CA 92037
# Telephone: (619) 5
#
cterminus=$1; shift
len_c=`expr length
nterminus=$1; shift
len_n=`expr length
min_linker_len=$1; sh
max_linker_len=$1; sh
mat_file=$1; shift
mat_tol=$1; shift
# generate unique te
defns_file="/tmp/pdb
match_file="/tmp/pdb
# generate custom re
gendefs $mat_file $m
# run searchwild, pl
searchwild -d $defn
$cterminus . \ {\\$min_
EOF
# searchwild will ha
# so extract the tw
# rank them
matchextracttermini
| tr 'a-z' 'A-Z' |
| averagelinebylin
| paste searchwil
# clean up temporary
/bin/rm -f $match_fi

Claims

What Is Claimed Is:
1. A fusion protein comprising a polypeptide having the formula -U_b-(Z-L)_a-Y- or -Y-(L-Z)_a-U_b, where Y and Z are amino acid residue sequences, L represents a linking means, Y comprises a polypeptide having biological activity, and Z is a glycosaminoglycan-binding polypeptide according to the formula
X_g[+]_hX_i[+]_jX_k[+]_lX_m[+]_nX_o, where [+] is R or K; X is L, A, E, F, T, I, S, Y, V, N, K, R or aminoisobutyric acid; g is an integer from 0 to 9, h is an integer from 1 to 3, i is an integer from 1 to 5, j is an integer from 1 to 3, k is an integer from 1 to 7, l is an integer from 0 to 7, m is an integer from 0 to 7; n is an integer from 0 to 2, and o is an integer from 0 to 2; and with the proviso that the sum of g+h+i+j+k+l+m+n+o is equal to or less than 20; U is an amino acid residue; a is an integer from 2 to 10; and b is an integer from 0 to 1.
2. The fusion protein of claim 1 wherein said linking means is a polypeptide of 1 to 7 amino acid residues in length.
3. The fusion protein of claim 1 wherein said linking means is methionine or has the sequence -GPG-.
4. The fusion protein of claim 1 wherein said Z is a polypeptide having an amino acid residue sequence that corresponds to a sequence according to the formula: -RVPRESGKKRKRKRLKPS-.
5. The fusion protein of claim 5 wherein said Y is a polypeptide having superoxide dismutase activity.
6. The protein of claim 1 wherein said polypeptide has an amino acid residue sequence that corresponds to the sequence shown in Figure 1 from residue 1 to 153.
7. The fusion protein of claim 1 wherein said Z is a polypeptide having an amino acid residue sequence that corresponds to a sequence according to the formula: -RVPRESGKKRKRKRLKPS-, L is methionine, Y is a polypeptide having an amino acid residue sequence that corresponds to the sequence shown in Figure 1 from residue 1 to residue 171, U is methionine, a is 1, 2 or 3, and b is 1.
8. The protein of claim 1 wherein said Z is a polypeptide having an amino acid residue sequence that corresponds to a sequence according to the formula:
-HRHHPREMKKRVEDL-,
-YKKIIKKLLES-,
-KLNCRLYRKANK-, or
-EKTLRKWLK-.
9. The fusion protein of claim 1 having an amino acid residue sequence according to the sequence shown in Figure 1 from residue 1 to residue 171.
10. A composition comprising a fusion protein according to claim 1 in a pharmaceutically acceptable carrier.
11. A DNA segment defining a structural gene coding for a fusion protein comprising a polypeptide having the formula -U_b-(Z-L)_a-Y- or -Y-(L-Z)_a-U_b, where Y and Z are amino acid residue sequences, L represents a linking means, Y comprises a polypeptide having biological activity, and Z is a glycosaminoglycan-binding polypeptide according to the formula
X_g[+]_hX_i[+]_jX_k[+]_lX_m[+]_nX_o, where [+] is R or K; X is L, A, E, F, T, I, S, Y, V, N, K, R or aminoisobutyric acid; g is an integer from 0 to 9, h is an integer from 1 to 3, i is an integer from 1 to 5, j is an integer from 1 to 3, k is an integer from 1 to 7, l is an integer from 0 to 7, m is an integer from 0 to 7; n is an integer from 0 to 2, and o is an integer from 0 to 2; and with the proviso that the sum of g+h+i+j+k+l+m+n+o is equal to or less than 20; U is an amino acid residue; a is an integer from 2 to 10; and b is an integer from 0 to 1.
12. The DNA segment of claim 11 wherein said segment includes a nucleotide sequence according to the sequence in Figure 1 from nucleotide base 535 to nucleotide base 579.
13. The DNA segment of claim 11 wherein said segment includes a nucleotide sequence according to the sequence in Figure 1 from nucleotide base 1 to nucleotide base 579.
14. A recombinant DNA molecule comprising a vector operatively linked to a DNA segment according to claim 11.
15. The recombinant DNA molecule of claim 14 wherein said DNA segment has a nucleotide sequence according to the sequence in Figure 1 from nucleotide base 1 to nucleotide base 579.
16. A method for therapeutic treatment to reduce tissue damage in an animal by superoxide radicals, which method comprises administering to said animal an effective amount of a GAG-binding fusion protein according to claim 5.
17. A therapeutic treatment for inflammation-induced tissue damage in an animal induced by superoxide radicals, which method comprises administering to said animal an effective amount of a GAG-binding fusion protein according to claim 5.
18. A therapeutic treatment for post ischemic tissue damage in ischemic tissue of an animal, which method comprises administering to said animal an effective amount of a GAG-binding fusion protein according to claim 5.
19. The method of claim 18 wherein said administering is conducted substantially concurrently with a procedure for reperfusing said ischemic tissue.
20. A method of determining an amino-acid residue sequence suitable for linking selected molecules, the method comprising the steps of:
specifying a first sequence of residues corresponding to a first terminus of a first selected molecule;
specifying a second sequence of residues corresponding to a second terminus of a second selected molecule;
specifying a minimum and a maximum length, in number of residues, of the linking sequence;
providing a matrix of numeric values indicating the relative substitutability of one or more residues for a selected residue;
specifying a residue-selection criterion;
determining, according to said criterion and according to said substitutability of residues, a set of residues substitutable for each of the residues of said first and second sequences; and
identifying, from among said substitutable residues and from known predetermined sequences, a set of sequences equivalent to said first and second sequences, each of which equivalent sequences having a length similar to the specified length, said equivalent sequences being candidates for linking said molecules.
21. The method of claim 20 where said numeric values are weighted, representing degrees-of-relatedness between respective residues, and the step of specifying a residue-selection criterion includes the step of specifying one of said weighted numeric values.
22. The method of claim 21 where the step of selecting includes the step of establishing a database of known amino acid residue sequences, showing the equivalence of one or more residues to a given residue.
23. The method of claim 22 where the step of identifying includes the step of sorting the identified sequences according to similarity of each identified sequence to the first and second sequences, similarity being represented by the weighted numeric values of the residues of each sequence.
24. The method of claim 23 where the step of identifying includes the step of determining a maximum match value for each identified sequence, said maximum match value representing the sum of the weighted numeric values for the residues of the identified sequence.
25. The method of claim 24 where the step of identifying further includes the step of determining an average maximum match value for each identified sequence.
26. The method of claim 25 where the step of identifying also includes the step of determining an alignment score for each identified sequence, said alignment score representing the difference between the maximum match value and the average maximum match value.
27. A system for determining an amino-acid residue sequence suitable for linking selected molecules, the system comprising:
means for specifying a first sequence of residues corresponding to a first terminus of a first selected molecule;
means for specifying a second sequence of residues corresponding to a second terminus of a second selected molecule;
means for specifying a minimum and a maximum length, in number of residues, of the linking sequence;
means for providing a matrix of numeric values indicating the relative substitutability of one or more residues for a selected residue;
means for specifying a residue-selection criterion;
means for determining, according to said criterion and according to said substitutability of residues, a set of residues substitutable for each of the residues of said first and second sequences; and
means for identifying, from among said substitutable residues and from known predetermined sequences, a set of sequences equivalent to said first and second sequences, each of which equivalent sequences having a length similar to the specified length, said equivalent sequences being candidates for linking said molecules.
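The linker-determination method recited in claims 20 to 26 can be illustrated with a minimal Python sketch. It is one interpretation and not the listing of the specification: the function names, the dictionary form of the substitution matrix, the toy matrix values and the two database strings are all hypothetical. The residue-selection criterion is taken here to be a minimum matrix score, the match value is the sum of matrix scores over the aligned terminus residues, and the alignment score is the match value minus the mean match value over all hits, which is only one possible reading of claims 24 to 26.

```python
from statistics import mean

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def toy_matrix():
    """Hypothetical substitution matrix: 2 for identity, 1 for a few pairs."""
    m = {(a, b): (2 if a == b else 0) for a in ALPHABET for b in ALPHABET}
    for a, b in [("K", "R"), ("I", "L"), ("D", "E")]:
        m[(a, b)] = m[(b, a)] = 1
    return m


def substitutable(residue, matrix, tol):
    """Residues whose substitution score for `residue` is at least `tol`."""
    return {b for (a, b), s in matrix.items() if a == residue and s >= tol}


def fits(segment, allowed):
    """True if `segment` has full length and each residue is allowed."""
    return len(segment) == len(allowed) and all(
        x in ok for x, ok in zip(segment, allowed))


def find_linkers(c_term, n_term, min_len, max_len, matrix, tol, database):
    """Return (linker, match value, alignment score), best match first."""
    allowed_c = [substitutable(r, matrix, tol) for r in c_term]
    allowed_n = [substitutable(r, matrix, tol) for r in n_term]
    hits = []
    for seq in database:
        for start in range(len(seq)):
            link_start = start + len(c_term)
            head = seq[start:link_start]
            if not fits(head, allowed_c):
                continue
            for link_len in range(min_len, max_len + 1):
                tail_start = link_start + link_len
                tail = seq[tail_start:tail_start + len(n_term)]
                if not fits(tail, allowed_n):
                    continue
                match = sum(matrix[(a, b)]
                            for a, b in zip(c_term + n_term, head + tail))
                hits.append((seq[link_start:tail_start], match))
    if not hits:
        return []
    average = mean(m for _, m in hits)
    scored = [(linker, m, m - average) for linker, m in hits]
    return sorted(scored, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical termini and database entries, for illustration only.
    for hit in find_linkers("KL", "DE", 3, 4, toy_matrix(), 1,
                            ["AAKLGPGDEAA", "AARIGGGGDEAA"]):
        print(hit)
```

With these toy inputs the exact-match flanks around GPG rank ahead of the substituted flanks around GGGG. A real search would use a published substitution matrix and a structural database in place of the hypothetical values.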
PCT/US1991/008105 1990-11-01 1991-11-01 Glycosaminoglycan-targeted fusion proteins, their design, construction and compositions WO1992007935A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US60853990A 1990-11-01 1990-11-01
US608,539 1990-11-01
US60856990A 1990-11-02 1990-11-02
US608,569 1990-11-02

Publications (1)

Publication Number Publication Date
WO1992007935A1 (en) 1992-05-14

Family

ID=27085796

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1991/008105 WO1992007935A1 (en) 1990-11-01 1991-11-01 Glycosaminoglycan-targeted fusion proteins, their design, construction and compositions

Country Status (2)

Country Link
AU (1) AU8947791A (en)
WO (1) WO1992007935A1 (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0723398A1 (en) * 1993-10-15 1996-07-31 Duke University Superoxide dismutase and mimetics thereof
US5866402A (en) * 1995-05-05 1999-02-02 Chiron Corporation Chimeric MCP and DAF proteins with cell surface localizing domain
WO1999023215A2 (en) * 1997-10-31 1999-05-14 University Of Florida Materials and methods for preventing cellular injury in humans and animals
US5994339A (en) * 1993-10-15 1999-11-30 University Of Alabama At Birmingham Research Foundation Oxidant scavengers
US6103714A (en) * 1994-09-20 2000-08-15 Duke University Oxidoreductase activity of manganic porphyrins
EP1158046A1 (en) * 1994-04-11 2001-11-28 Human Genome Sciences, Inc. Superoxide Dismutase-4
US6479477B1 (en) 1998-04-24 2002-11-12 Duke University Substituted porphyrins
US6544975B1 (en) 1999-01-25 2003-04-08 National Jewish Medical And Research Center Substituted porphyrins
US6583132B1 (en) 1993-10-15 2003-06-24 Duke University Oxidant scavengers
EP1456239A2 (en) * 2001-07-31 2004-09-15 Wayne State University Hybrid proteins with neuregulin heparin-binding domain for targeting to heparan sulfate proteoglycans
WO2005054285A1 (en) * 2003-12-04 2005-06-16 Protaffin Biotechnologie Ag Gag binding proteins
US6916799B2 (en) 1997-11-03 2005-07-12 Duke University Substituted porphyrins
WO2007038943A1 (en) * 2005-09-21 2007-04-12 7Tm Pharma A/S Y2 selective receptor agonists for therapeutic interventions
WO2007038942A1 (en) * 2005-09-21 2007-04-12 7Tm Pharma A/S Y4 selective receptor agonists for therapeutic interventions
US7485721B2 (en) 2002-06-07 2009-02-03 Duke University Substituted porphyrins
WO2010071190A1 (en) * 2008-12-19 2010-06-24 国立大学法人 新潟大学 Heparin affinity erythropoietin
US8470808B2 (en) 1999-01-25 2013-06-25 Jon D. Piganelli Oxidant scavengers for treatment of type I diabetes or type II diabetes
WO2021014220A1 (en) * 2019-07-23 2021-01-28 Csts Health Care Inc. Platelet-facilitated delivery of therapeutic compounds
CN114703154A (en) * 2022-03-30 2022-07-05 云南大学 Polypeptide, protein containing polypeptide and application of polypeptide
US11382895B2 (en) 2008-05-23 2022-07-12 National Jewish Health Methods for treating injury associated with exposure to an alkylating species

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1990010694A1 (en) * 1989-03-06 1990-09-20 Suntory Limited New superoxide dismutase
US5013653A (en) * 1987-03-20 1991-05-07 Creative Biomolecules, Inc. Product and process for introduction of a hinge region into a fusion protein to facilitate cleavage

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BIOCHEMISTRY, Vol. 25, issued 1986, PABO et al., "Computer-Aided Model Building Strategies for Protein Design", pp. 5987-5991. *
PROC. NATL. ACAD. SCI., Vol. 88, issued November 1991, NAKAZONO et al., "Does superoxide underlie the pathogenesis of Hypertension", pp. 10045-10048. *
THE JOUR. OF BIOL. CHEM., Vol. 266, No. 25, issued 1991, INOUE et al., "Expression of a Hybrid Cu/Zn-type Superoxide Dismutase which has a high affinity for Heparin-like Proteoglycans on Vascular Endothelial Cells", pp. 16409-16414. *

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0723398A4 (en) * 1993-10-15 1999-03-24 Univ Duke Superoxide dismutase and mimetics thereof
US5994339A (en) * 1993-10-15 1999-11-30 University Of Alabama At Birmingham Research Foundation Oxidant scavengers
EP0723398A1 (en) * 1993-10-15 1996-07-31 Duke University Superoxide dismutase and mimetics thereof
US6583132B1 (en) 1993-10-15 2003-06-24 Duke University Oxidant scavengers
EP1158046A1 (en) * 1994-04-11 2001-11-28 Human Genome Sciences, Inc. Superoxide Dismutase-4
US6635252B2 (en) 1994-04-11 2003-10-21 Human Genome Sciences, Inc. Antibodies to superoxide dismutase-4
US6103714A (en) * 1994-09-20 2000-08-15 Duke University Oxidoreductase activity of manganic porphyrins
US5866402A (en) * 1995-05-05 1999-02-02 Chiron Corporation Chimeric MCP and DAF proteins with cell surface localizing domain
WO1999023215A2 (en) * 1997-10-31 1999-05-14 University Of Florida Materials and methods for preventing cellular injury in humans and animals
WO1999023215A3 (en) * 1997-10-31 1999-07-15 Univ Florida Materials and methods for preventing cellular injury in humans and animals
US6916799B2 (en) 1997-11-03 2005-07-12 Duke University Substituted porphyrins
US6479477B1 (en) 1998-04-24 2002-11-12 Duke University Substituted porphyrins
US8546562B2 (en) 1999-01-25 2013-10-01 James D. Crapo Substituted porphyrins
US9289434B2 (en) 1999-01-25 2016-03-22 Aeolus Sciences, Inc. Substituted porphyrins
US8946202B2 (en) 1999-01-25 2015-02-03 Aeolus Sciences, Inc. Substituted porphyrins
US7189707B2 (en) 1999-01-25 2007-03-13 National Jewish Medical Research Center Substituted porphyrins
US8470808B2 (en) 1999-01-25 2013-06-25 Jon D. Piganelli Oxidant scavengers for treatment of type I diabetes or type II diabetes
US8217026B2 (en) 1999-01-25 2012-07-10 Aeolus Sciences, Inc. Substituted porphyrins
US6544975B1 (en) 1999-01-25 2003-04-08 National Jewish Medical And Research Center Substituted porphyrins
US7820644B2 (en) 1999-01-25 2010-10-26 Aelous Pharmaceuticals, Inc. Substituted porphyrins
EP1456239A4 (en) * 2001-07-31 2005-04-27 Univ Wayne State Hybrid proteins with neuregulin heparin-binding domain for targeting to heparan sulfate proteoglycans
EP1456239A2 (en) * 2001-07-31 2004-09-15 Wayne State University Hybrid proteins with neuregulin heparin-binding domain for targeting to heparan sulfate proteoglycans
US7527794B2 (en) 2001-07-31 2009-05-05 Wayne State University Hybrid proteins with neuregulin heparin-binding domain for targeting to heparan sulfate proteoglycans
AU2002322762B2 (en) * 2001-07-31 2008-10-16 Wayne State University Hybrid proteins with neuregulin heparin-binding domain for targeting to heparan sulfate proteoglycans
US7485721B2 (en) 2002-06-07 2009-02-03 Duke University Substituted porphyrins
JP2007536906A (en) * 2003-12-04 2007-12-20 プロタフィン・ビオテヒノロギー・アクチェンゲゼルシャフト GAG binding protein
WO2005054285A1 (en) * 2003-12-04 2005-06-16 Protaffin Biotechnologie Ag Gag binding proteins
AU2004295104B2 (en) * 2003-12-04 2010-07-22 Protaffin Biotechnologie Ag GAG binding proteins
US7807413B2 (en) 2003-12-04 2010-10-05 Protaffin Biotechnologie Ag GAG binding protein
EP1752470A1 (en) 2003-12-04 2007-02-14 Protaffin Biotechnologie AG Gag binding proteins
US7585937B2 (en) * 2003-12-04 2009-09-08 Protaffin Biotechnologie Ag GAG binding proteins
EP2270038A2 (en) 2003-12-04 2011-01-05 Protaffin Biotechnologie AG GAG binding proteins
EP2270038A3 (en) * 2003-12-04 2011-02-23 Protaffin Biotechnologie AG GAG binding proteins
EP2311866A1 (en) 2003-12-04 2011-04-20 Protaffin Biotechnologie AG GAG binding proteins
EP2363411A1 (en) 2003-12-04 2011-09-07 Protaffin Biotechnologie AG GAG binding proteins
KR101278459B1 (en) 2003-12-04 2013-07-01 프로타핀 바이오테크놀로기 아게 Gag binding proteins
US7851590B2 (en) 2005-09-21 2010-12-14 7Tm Pharma A/S Y2 selective receptor agonists for therapeutic interventions
WO2007038942A1 (en) * 2005-09-21 2007-04-12 7Tm Pharma A/S Y4 selective receptor agonists for therapeutic interventions
US8022035B2 (en) 2005-09-21 2011-09-20 7Tm Pharma A/S Y4 selective receptor agonists for therapeutic interventions
WO2007038943A1 (en) * 2005-09-21 2007-04-12 7Tm Pharma A/S Y2 selective receptor agonists for therapeutic interventions
JP2009508885A (en) * 2005-09-21 2009-03-05 7ティーエム ファーマ エイ/エス Y4 selective receptor agonists for therapeutic intervention
US11382895B2 (en) 2008-05-23 2022-07-12 National Jewish Health Methods for treating injury associated with exposure to an alkylating species
JP5799409B2 (en) * 2008-12-19 2015-10-28 国立大学法人 新潟大学 Heparin affinity erythropoietin
WO2010071190A1 (en) * 2008-12-19 2010-06-24 国立大学法人 新潟大学 Heparin affinity erythropoietin
WO2021014220A1 (en) * 2019-07-23 2021-01-28 Csts Health Care Inc. Platelet-facilitated delivery of therapeutic compounds
CN114703154A (en) * 2022-03-30 2022-07-05 云南大学 Polypeptide, protein containing polypeptide and application of polypeptide
CN114703154B (en) * 2022-03-30 2024-01-09 云南大学 Polypeptide, protein containing same and application

Also Published As

Publication number Publication date
AU8947791A (en) 1992-05-26

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP KR US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA