WO2002048174A9 - Dimeric fluorescent polypeptides - Google Patents

Dimeric fluorescent polypeptides

Publication number
WO2002048174A9
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
fluorescent
cell
recombinant fusion
fusion polypeptide
Prior art date
Application number
PCT/US2001/048690
Other languages
French (fr)
Other versions
WO2002048174A2 (en)
WO2002048174A3 (en)
Inventor
Ronald W Davis
Peter Vaillancourt
Original Assignee
Stratagene Inc
Ronald W Davis
Peter Vaillancourt
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Stratagene Inc, Ronald W Davis, Peter Vaillancourt
Priority to AU2002230920A (patent AU2002230920A1)
Priority to EP01991178A (patent EP1349867A4)
Priority to CA002432782A (patent CA2432782A1)
Publication of WO2002048174A2
Publication of WO2002048174A9
Publication of WO2002048174A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/43504 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates
    • C07K14/43595 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from coelenteratae, e.g. medusae
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Definitions

  • Fluorescent proteins are widely used in the fields of biochemistry, molecular and cell biology, medical diagnostics and drug screening methodologies (Chalfie et al., 1994, Science 263: 802-805; Tsien, 1998, Ann. Rev. Biochem. 67: 509-544).
  • One property shared by the most useful fluorescent proteins is that they require no host-encoded co-factors or substrates for fluorescence. The proteins therefore retain their fluorescent properties both in isolation from their native organism, and when expressed in the cells of other organisms. This property makes them particularly well suited for a variety of in vivo and in vitro applications.
  • Another advantage of fluorescent proteins for use in biological systems is that they are indeed proteins, which permits their synthesis within cells or organisms of interest, avoiding a host of problems relating to the attachment of the label to a protein of interest and/or delivery of labeled proteins into a cell. Not only can the proteins be made within the desired cell or organism, but they also retain their fluorescent properties when expressed as fusions with other proteins of interest, which greatly enhances their utility both in vivo and in vitro.
  • Fluorescent proteins have been used as reporter molecules to study gene expression in culture as well as in transgenic animals by insertion of fluorescent protein coding sequences downstream of an appropriate promoter. They have also been used to study the subcellular localization of proteins by direct fusion of test proteins to fluorescent proteins, and fluorescent proteins have become the reporter of choice for monitoring the infection efficiency of viral vectors both in cell culture and in animals. Variants of fluorescent proteins exhibiting spectral shifts in response to changes in the cellular environment (e.g., changes in pH, ion flux, or the redox status of the cell) are also used to monitor such changes (see, for example, Inouye & Tsuji, 1994, FEBS Lett.
  • FRET fluorescence resonance energy transfer
  • the prototypical fluorescent protein is the Aequorea victoria green fluorescent protein (GFP), which was the first green fluorescent protein cloned (Prasher et al., 1992, Gene 111: 229-233).
  • Purified A. victoria GFP is a monomeric protein of about 27 kDa that absorbs blue light with an excitation wavelength maximum of 395 nm, with a minor peak at 470 nm, and emits green fluorescence with an emission wavelength of about 510 nm and a minor peak near 540 nm (Ward et al., 1979, Photochem. Photobiol. Rev. 4: 1-57).
  • the polypeptide has several drawbacks, including relatively broad excitation and emission spectra, low quantum yield, and low expression in cells of higher eukaryotes. Mutants with improved spectral characteristics and higher quantum yield have been identified, and expression in higher eukaryotes has been improved by "humanizing" the nucleic acid sequences to encode codons optimized for human or mammalian expression.
  • Additional fluorescent proteins include, but are not limited to, those expressed by Discosoma sp. and Phialidium gregarum (Ward et al., 1982, Photochem. Photobiol. 35: 803-808; Levine et al., 1982, Comp. Biochem. Physiol. 72B:77-85). Also, Vibrio fischeri strain Y1 expresses a yellow fluorescent protein that requires flavins as a co-factor for its fluorescence (Baldwin et al., 1990, Biochemistry 29: 5509-5515).
  • Additional cloned fluorescent proteins include, for example, the green fluorescent proteins from the sea pansy, Renilla mullerei (WO/99/49019) and from Renilla reniformis (see SEQ ID NO: 1; Figure 1). Each of these fluorescent proteins and others are useful for a variety of in vivo and in vitro uses.
  • the R. reniformis GFP (rGFP) clone is particularly important, since rGFP is seen as the benchmark protein among known naturally-occurring fluorescent proteins. rGFP has 3 to 6-fold higher quantum yield than A. victoria GFP, and the excitation and emission spectra are narrower, making rGFP more suitable for applications involving, for example, FRET.
  • GFPs from A. victoria, R. mullerei and R. reniformis are dimeric.
  • the proteins exist as homodimers.
  • heterodimers can form if the dimerization interfaces for the different fluorescent proteins are complementary. Heterodimerization interferes with the usefulness of fluorescent proteins for several reasons.
  • heterodimerization is undesirable when fluorescent proteins are used in energy transfer-based analyses because heterodimerization raises the background of acceptor fluorescence without a real interaction between the proteins or protein domains of interest.
  • In FRET-based assays, donor and acceptor fluorescent fusion proteins are often expressed in the same cell or otherwise mixed.
  • the excitation of the donor fluorophore leads to emission by the acceptor fluorophore only if the two fusion proteins are in close apposition.
  • If heterodimerization occurs between the differing fluorescent proteins (e.g., between a wild-type rGFP and an rGFP variant that is a fluorescence donor to the wild-type GFP), excitation of the donor will result in emission by the acceptor regardless of the interaction between the fused polypeptides being examined for interaction. This generates an unacceptably high background fluorescence from the acceptor fluorophore.
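
The dependence on "close apposition" described above can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). This relation is not stated in this document and is added here only as a minimal sketch. The function name and the 5 nm Förster radius used in the usage note are illustrative assumptions, not measured values for any GFP pair described here.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Return the FRET efficiency predicted by the Forster relation.

    distance_nm: donor-acceptor separation r, in nanometres.
    forster_radius_nm: R0, the separation at which efficiency is 50%.
    Both arguments are hypothetical inputs chosen by the caller.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    assumed_r0 = 5.0  # nm; assumed value for illustration only
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, assumed_r0):.3f}")
```

Because efficiency falls off with the sixth power of distance, two fluorescent proteins held together by their own dimerization interfaces would be expected to give a strong acceptor signal whether or not the fused polypeptides of interest interact, which is the background problem described above.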
  • Another problem caused by heterodimerization is that the dimerization interfaces between the proteins can serve to artifactually bring fusion polypeptides linked to the fluorescent protein monomers into close contact.
  • the inappropriate recruitment of proteins into close apposition can have biological consequences that make data interpretation difficult. For example, some cell surface receptors gain the ability to initiate an intracellular signaling cascade following ligand-induced dimerization. If the dimerization interfaces of the fluorescent proteins inappropriately recruit the fused receptor monomers into close contact, the signaling cascade can be inappropriately initiated in the absence of ligand.
  • U.S. Patent No. 5,981,200 (Tsien et al.) teaches donor and acceptor fluorescent proteins linked by a peptide linker.
  • the linked donor and acceptor proteins, referred to as “tandem fluorescent proteins,” are taught to be useful for assaying enzymes capable of cleaving the linker peptide sequence.
  • When linked, the tandem fluorescent proteins exhibit either no fluorescence (e.g., when one protein quenches the fluorescence of the other) or fluorescence characteristic of the acceptor. Following cleavage, the fluorescence emitted is that characteristic of the individual fluorescent proteins. Assays using this arrangement will not work unless the tandem fluorescent proteins are related as donor and acceptor.
  • the invention encompasses a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore.
  • the first polypeptide and the second polypeptide are peptide bonded to each other via a linker sequence.
  • the recombinant fusion polypeptide further comprises a third polypeptide peptide bonded to the recombinant fusion polypeptide.
  • the third polypeptide can be peptide bonded to the recombinant fusion polypeptide either directly or through a peptide linker sequence.
  • a recombinant fusion polypeptide of this embodiment is referred to in this summary as a "fluorescent polypeptide fusion.”
  • the third polypeptide is fused to the amino terminus of the first polypeptide.
  • the third polypeptide is fused to the carboxy terminus of the second polypeptide sequence.
  • the third polypeptide is a member of a specific binding pair.
  • one or both of the first and second polypeptides is a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
  • both of the first and second polypeptides are a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
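
The domain arrangements described above (first polypeptide, optional linker, second polypeptide, with an optional third polypeptide at either terminus) can be sketched as simple string concatenation of one-letter amino acid sequences. This is only an illustration of the ordering. The function name, parameter names and the toy sequences in the usage note are hypothetical and do not correspond to any SEQ ID NO in this document.

```python
from typing import Optional


def assemble_fusion(
    first: str,
    second: str,
    linker: str = "",
    third: Optional[str] = None,
    third_position: str = "N",
    third_linker: str = "",
) -> str:
    """Concatenate fusion parts in amino-to-carboxy order.

    first, second: the two monomer sequences joined into one chain.
    linker: optional peptide linker between the two monomers.
    third: optional additional polypeptide (e.g. a protein of interest).
    third_position: "N" fuses it before the first monomer,
                    "C" fuses it after the second monomer.
    third_linker: optional linker between the third polypeptide and the core.
    """
    core = first + linker + second
    if third is None:
        return core
    if third_position == "N":
        return third + third_linker + core
    if third_position == "C":
        return core + third_linker + third
    raise ValueError("third_position must be 'N' or 'C'")


if __name__ == "__main__":
    # Toy placeholder sequences, not real GFP monomers.
    monomer = "MAAAA"
    print(assemble_fusion(monomer, monomer, linker="GSGSGS"))
    print(assemble_fusion(monomer, monomer, linker="GSGSGS", third="MKKK", third_position="N"))
```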
  • the invention further encompasses a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore.
  • the first polypeptide and the second polypeptide encoded by the polynucleotide are peptide bonded to each other via a linker sequence.
  • the linker sequence encoded by the polynucleotide is from 5 to 50 amino acids long.
  • the linker sequence comprises one or more iterations of a peptide, for example the peptide RARDPRVPVAT (i.e., Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr).
  • the linker sequence is selected from the group consisting of (Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr)n, (Gly-Ser)n, (Thr-Ser-Pro)n, (Gly-Gly-Gly)n, and (Glu-Lys)n, wherein n is 1 to 15.
  • the polynucleotide further encodes a third polypeptide peptide bonded to the recombinant fusion polypeptide.
  • the third polypeptide encoded by the polynucleotide may be joined directly or via an encoded peptide linker.
  • the third polypeptide encoded by the polynucleotide is a member of a specific binding pair. It is alternatively preferred that the third encoded polypeptide is fused to the amino terminus of the first polypeptide. It is additionally preferred that the third encoded polypeptide is fused to the carboxy terminus of the second polypeptide.
  • one or both of the first and second polypeptides is a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
  • both of the first and second polypeptides are a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
  • the invention further encompasses a vector comprising a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore.
  • the invention further encompasses a cell comprising a vector comprising a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore.
  • the cell is a bacterial cell.
  • the cell is a eukaryotic cell.
  • the eukaryotic cell is a yeast cell, an insect cell, or a mammalian cell.
  • the invention further encompasses a pair of polypeptides comprising a polypeptide labeled with a fluorescent dye and a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, wherein the fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, and wherein the fluorescent dye and the recombinant fusion polypeptide are fluorescent donor and acceptor to each other.
  • the invention further encompasses a pair of recombinant fusion polypeptides comprising (a) a first fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the first fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, and (b) a second fusion polypeptide comprising a third polypeptide peptide bonded to a fourth polypeptide, wherein the third and fourth polypeptides are found in nature as monomers of a multimeric protein, and wherein the second fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, wherein the first fusion polypeptide and the second fusion polypeptide are fluorescent donor and acceptor to each other.
  • each of the first and second fusion polypeptides further comprises an additional fused (third) polypeptide, wherein the additional fused polypeptide of the first fusion polypeptide comprises a sequence which is different from the additional fused polypeptide of the second fusion polypeptide.
  • the invention further encompasses a method of producing a fluorescently labeled recombinant fusion polypeptide, the method comprising the steps of introducing to a cell a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, and culturing the cell under conditions that permit the synthesis of the recombinant fusion polypeptide, whereby the recombinant fusion polypeptide is produced.
  • the invention further encompasses a method of labeling a cell with a fluorescent recombinant fusion polypeptide, the method comprising the steps of: a) introducing to a cell a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore; and b) culturing the cell under conditions that permit the synthesis of the recombinant fusion polypeptide, whereby the cell is labeled with the fluorescent recombinant fusion polypeptide.
  • the polynucleotide introduced to the cell further comprises a sequence encoding a third polypeptide fused in frame to the sequence en
  • the invention further encompasses a method of monitoring the interaction of two polypeptides of interest, the method comprising the steps of: a) contacting a fluorescent polypeptide fusion, as described above, and a second polypeptide wherein: i) the fluorescent polypeptide fusion comprises a first polypeptide of interest; ii) the second polypeptide comprises a second polypeptide of interest and is fluorescently labeled; and iii) the fluorophores comprised by the fluorescent polypeptide fusion and the second polypeptide are fluorescent donor and fluorescent acceptor to each other; b) exciting the donor fluorophore; and c) detecting fluorescent emission from the fluorescent acceptor, wherein the emission is indicative of the interaction of the first and the second polypeptides of interest.
  • the second polypeptide comprises a second fluorescent polypeptide fusion, as described above, wherein the polypeptide of interest of the second fluorescent polypeptide fusion is different from the polypeptide of interest of the first fluorescent polypeptide fusion.
  • the contacting step is performed in vitro.
  • the contacting step is performed in a cell.
  • the contacting comprises the step of introducing nucleic acid encoding the polypeptides to a cell.
  • the invention further encompasses a method of screening for a compound that modulates the interaction of a first and a second member of a specific binding pair, the method comprising the steps of: a) contacting a first polypeptide and a second polypeptide in the presence and absence of a candidate modulator wherein: i) the first polypeptide is a fluorescent polypeptide fusion, as described above, wherein the third polypeptide is the first member of a specific binding pair; ii) the second polypeptide is fluorescently labeled and comprises the second member of a specific binding pair; and iii) the fluorophores comprised by the first and second polypeptides are fluorescent donor and acceptor to each other; b) exciting the donor fluorophore; and c) detecting the fluorescence of the acceptor fluorophore, wherein emission of the spectrum characteristic of the fluorescent acceptor indicates the interaction of the first and the second members of the specific binding pair, and wherein a change in the interaction in the presence of the candidate modulator indicates that
  • the second polypeptide is a fluorescent polypeptide fusion, as described above, which comprises the second member of a specific binding pair.
  • Figure 1 shows the polynucleotide sequence of R. reniformis GFP (SEQ ID NO: 1).
  • Figure 2 shows the amino acid sequence of R. reniformis GFP (SEQ ID NO: 2).
  • Figure 3 shows the polynucleotide and amino acid sequences for hrGFP, a humanized R. reniformis GFP.
  • the polynucleotide sequence is SEQ ID NO: 3
  • the amino acid sequence is SEQ ID NO: 4.
  • Figure 4 shows a schematic diagram of a construct encoding an IDFP of the invention.
  • CMV refers to the cytomegalovirus promoter
  • MCS refers to a multiple cloning sequence
  • pA refers to a poly(A) addition site sequence.
  • hrGFP represents one monomer of the humanized R. reniformis GFP
  • linker refers to a peptide or polypeptide linker sequence.
  • A, B, and C show examples of linker peptide sequences.
  • Figure 5 shows relationships between emission and excitation peaks for donor and acceptor fluorophores capable of FRET.
  • a recombinant fusion polypeptide is "fluorescent when excited”.
  • excited refers to a fluorophore that is exposed to light of an excitation wavelength or to an acceptor fluorophore that is interactive with an excited donor fluorophore.
  • fluorescent when excited means that when the recombinant fusion polypeptide is exposed to light of an excitation wavelength or when the polypeptide interacts with an excited donor fluorophore, the polypeptide fluoresces.
  • Exposed to light of an excitation wavelength means irradiated with light (electromagnetic radiation) within a given spectrum of wavelengths that is absorbed by the polypeptide such that the polypeptide emits light having a different spectrum of wavelengths, and thus fluoresces. Fluorescent emission occurs at a longer wavelength than does excitation.
  • a recombinant fusion polypeptide according to the invention has three properties: 1) it must emit light upon irradiation with light of a given wavelength or wavelengths; 2) it must have the capacity to form an intramolecular homodimer as defined herein above; and 3) the first and second polypeptide monomers that constitute the fusion polypeptide cannot function as fluorescent donor and fluorescent acceptor, respectively, in the context of fluorescence resonance energy transfer.
  • the term "light of an excitation wavelength” refers to those wavelengths of light that are absorbed by and excite a given fluorophore to emit fluorescence. These wavelengths are described in detail herein below. Light of an appropriate portion of the spectrum is synonymous with light within the excitation spectrum of a given fluorophore.
  • excited donor fluorophore refers to a fluorophore which has absorbed energy within its excitation spectrum. An excited donor fluorophore can transmit energy sufficient to excite an acceptor fluorophore.
  • fluorescent dye refers to a non-polypeptide chemical moiety that, upon absorption of light energy of a particular wavelength or wavelengths, emits light at another wavelength or that emits light when paired with an appropriate excited donor fluorophore.
  • fluorescence donor i.e., fluorescent dyes or polypeptides
  • fluorescence acceptor the member of the pair that emits in response to excitation by the fluorescence donor
  • fluorescence donor and fluorescence acceptor polypeptides are not linked by peptide bonds.
  • either the fluorescence donor or the fluorescence acceptor, but not both, may be a non-polypeptide fluorescent dye (also not covalently linked to each other).
  • fluorescently labeled means, when referring to a polypeptide, that the polypeptide is covalently attached to a fluorescent moiety.
  • a polypeptide may be fluorescently labeled by covalent attachment to a non-polypeptide fluorescent dye, or alternatively, by expression as a fusion protein with a fluorescent polypeptide.
  • a fluorescent polypeptide is distinguished from a luminescent polypeptide in that a fluorescent polypeptide requires an input of electromagnetic energy in order to emit light, while a luminescent polypeptide emits light in response to release of chemical energy.
  • a luminescent polypeptide may serve as a donor of excitation energy for a fluorescent polypeptide (in fact, this is exactly what happens in nature when, for example, Renilla luciferase emits energy that excites Renilla GFP).
  • a fusion polypeptide according to the invention may or may not be luminescent.
  • recombinant refers to a polynucleotide that has been isolated from its natural environment using recombinant DNA techniques, or synthesized, or to a polypeptide expressed from such a polynucleotide.
  • a recombinant polypeptide may be identical to or different from a naturally occurring polypeptide, as long as it is expressed from a recombinant polynucleotide.
  • the term "monomer” refers to a single polypeptide molecule that exists as a subunit of a dimer, heterodimer or other multimer (e.g., a trimer, tetramer, pentamer, etc.) in a multimeric protein.
  • a “monomer” interacts with another monomer, e.g., in a dimer, via a specific sequence referred to herein by the equivalent terms “interaction domain” and “interaction interface”.
  • the appropriate equivalent terms for the sequences that mediate the interaction are "dimerization domain” and "dimerization interface.”
  • a monomer of a fluorescent polypeptide may be full length, for example, as the polypeptide occurs in nature, or it may be longer or shorter than the naturally occurring polypeptide, so long as it retains the two requisite properties.
  • a recombinant fusion polypeptide according to the invention may comprise first and second polypeptides which exist in nature as non-peptide-bonded monomers of a multimeric protein.
  • these first and second polypeptides are peptide bonded and form a single chain polypeptide.
  • the peptide-bonded first and second polypeptides retain the ability, independently, to interact with a donor or acceptor fluorophore and fluoresce. This is believed to be a result of the intramolecular interaction of the monomers and the ability of the intramolecular dimer thus formed to be excited at an excitation wavelength of light and to act as a fluorescent donor or acceptor.
  • linker sequence refers to a sequence of peptide bonded amino acids that joins or links by peptide bonds two amino acid sequences or polypeptide domains that are not joined by peptide bonds in nature.
  • a linker sequence is encoded in frame on a polynucleotide between the sequences encoding the two polypeptide domains joined by the linker.
  • a linker is preferably 5 to 50 amino acids in length, more preferably 10 to 20 amino acids in length.
  • Examples of linkers useful in the invention are the Gly-Ala linkers taught by Huston et al., U.S. Patent No. 5,258,498, incorporated herein by reference.
  • Additional useful linkers include, but are not limited to, (Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr)1-5 (Xu et al., 1999, Proc. Natl. Acad. Sci. U.S.A. 96: 151-156), (Gly-Ser)n (Shao et al., 2000, Bioconjug. Chem. 11: 822-826), (Thr-Ser-Pro)n (Kroon et al., 2000, Eur. J. Biochem.
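
The repeat-based linkers and the length preferences given above (5 to 50 amino acids, more preferably 10 to 20) can be combined in a short sketch. The function names and the classification labels are assumptions made for illustration. The repeat units are taken from the text, written in one-letter code, and the upper bound of 15 repeats follows the "n is 1 to 15" range stated for the linker group.

```python
# Repeat units named in the text, in one-letter amino acid code.
RARDPRVPVAT = "RARDPRVPVAT"  # Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr
GLY_SER = "GS"
THR_SER_PRO = "TSP"
GLY_GLY_GLY = "GGG"
GLU_LYS = "EK"


def build_linker(unit: str, n: int) -> str:
    """Return the linker made of n tandem copies of unit (n from 1 to 15)."""
    if not 1 <= n <= 15:
        raise ValueError("n must be between 1 and 15")
    return unit * n


def length_class(linker: str) -> str:
    """Classify a linker by the length preferences stated in the text."""
    length = len(linker)
    if 10 <= length <= 20:
        return "more preferred"      # 10 to 20 amino acids
    if 5 <= length <= 50:
        return "preferred"           # 5 to 50 amino acids
    return "outside preferred range"


if __name__ == "__main__":
    for unit, n in ((GLY_SER, 1), (GLY_SER, 5), (THR_SER_PRO, 10), (RARDPRVPVAT, 5)):
        linker = build_linker(unit, n)
        print(unit, n, len(linker), length_class(linker))
```

Note that not every allowed repeat count falls inside the preferred length window: a single Gly-Ser unit is only 2 residues long, and five copies of the 11-residue unit give 55 residues.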
  • specific binding pair refers to a pair of polypeptides that physically interact in a specific manner that gives rise to a biological activity, that is, to the substantial exclusion of other polypeptides. Members of a specific binding pair interact through complementary interaction domains, such that they interact to the substantial exclusion of proteins that do not have a complementary interaction domain.
  • specific binding pairs include antibody-antigen pairs, enzyme-substrate pairs, dimeric transcription factors (e.g., AP-1, composed of Fos specifically bound to Jun via a leucine zipper interaction domain) and receptor-ligand pairs.
  • amino terminus refers to the last amino acid at the amino end of a polypeptide, where the last amino acid is not peptide bonded to another amino acid.
  • carboxy terminus refers to the last amino acid at the carboxyl end of a polypeptide, where the last amino acid is not peptide bonded to another amino acid.
  • labeling a cell refers to the expression of a fluorescent polypeptide in a cell, such that the cell is detectable by irradiating the cell with light within the excitation spectrum of the fluorescent polypeptide and monitoring or detecting emission within the emission spectrum of the polypeptide.
  • a cell may be labeled by expression of a fluorescent polypeptide that localizes anywhere in the cell, including, but not limited to the cell surface, the cytoplasm, the nucleus or to particular organelles such as mitochondria, lysosomes, endosomes, golgi apparatus, endoplasmic reticulum or other specific sub-cellular locale.
  • introducing a nucleic acid into a cell or "introducing a polynucleotide into a cell” refers to the process whereby a recombinant polynucleotide is put into a cell.
  • Methods for introducing a nucleic acid to a cell will vary with the nature of the cell and the nature of the chosen vector, but one of skill in the art may readily select and employ a known method appropriate for a given cell type and vector.
  • the term "culturing a cell under conditions that permit the synthesis of a recombinant polypeptide” refers to the maintenance of cells comprising a polynucleotide encoding a recombinant polypeptide in growth medium and under environmental conditions (e.g., temperature, pH, redox and osmotic conditions, O2 and CO2 concentrations and presence or absence of an effective concentration of an appropriate expression-modulating agent such as IPTG or tetracycline) conducive to the synthesis of the recombinant polypeptide.
  • monitoring the interaction refers to the process whereby the physical association of two polypeptides or a polypeptide and another entity are measured. As relates to the invention, the term refers most frequently to detection or measurement of association or interaction using FRET.
  • intramolecular dimer refers to a dimer formed by the covalent peptide linkage of two polypeptide monomers.
  • An "intramolecular dimer fluorescent protein” (IDFP) is an intramolecular dimer in which the linked polypeptides which exist in nature as monomers of a multimeric protein are fluorescent polypeptides. According to the invention, the linked monomers of an IDFP are not fluorescent donor and acceptor to each other.
  • An “IDFP fusion protein” is an IDFP which is fused to a protein of interest or to a fragment of a protein of interest.
  • protein of interest refers to a polypeptide, or a domain (fragment) of a polypeptide, that is selected to be fused to an IDFP. Any polypeptide or fragment of a polypeptide for which a polynucleotide sequence is known can be fused to an IDFP by standard techniques known in the art.
  • a protein of interest according to the invention either does not alter the fluorescence characteristics of the fused IDFP, or, if it does alter those characteristics, the alteration is such that the alteration does not interfere with the intended use of the IDFP fusion protein.
  • detecting fluorescence refers to the process whereby the fluorescent emission by a fluorescent polypeptide is measured or determined. Fluorescence detection methods include quantitative and qualitative methods adapted for standard or confocal microscopy, FACS analysis, and those adapted for high throughput methods involving multiwell plates, arrays or microarrays. One of skill in the art can select appropriate filter sets and excitation energy sources for the detection of fluorescent emission from a given fluorescent polypeptide or dye.
  • candidate modulator refers to an agent being evaluated for its effect on the function of a polypeptide or the interaction of members of a specific binding pair. Exemplary sources and types of candidate modulators useful according to the invention are described herein below.
  • the term "change in interaction” or “modulation of interaction” refers to an increase or decrease in the level of interaction detected between members of a specific binding pair.
  • the level of interaction is considered increased if the detected interaction goes up by at least 10%, and preferably by 20%, 35%, 50%, 75%, or more, up to and including 2-fold, 5-fold, 10-fold, 20-fold, 50-fold or more relative to a standard.
  • the level of interaction is considered decreased if the detected interaction goes down by at least 10%, and preferably by 20%, 35%, 50%, 75%, 90%, 95%, 98%, 99% or more, up to and including 100% (no interaction) relative to a standard.
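  • By way of non-limiting illustration, the 10% thresholds stated above for an increased or decreased level of interaction can be sketched as a simple comparison against a standard. The function name, its signature and the "unchanged" label are hypothetical and are not part of the disclosure.

```python
def classify_interaction_change(measured: float, standard: float,
                                threshold: float = 0.10) -> str:
    """Classify a detected interaction level relative to a standard.

    threshold is the minimum fractional change (0.10 = 10%) taken from the
    definitions above; the name and signature are illustrative assumptions.
    """
    if standard <= 0:
        raise ValueError("standard must be a positive signal level")
    # Fractional change of the measured signal relative to the standard.
    fractional_change = (measured - standard) / standard
    if fractional_change >= threshold:
        return "increased"
    if fractional_change <= -threshold:
        return "decreased"
    return "unchanged"
```

  • For example, a signal of 150 units against a standard of 100 units would be classified as increased, while 105 units would fall below the 10% threshold.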
  • single polypeptide chain refers to a polypeptide chain in which all amino acids are linked sequentially by peptide bonds.
  • a “single polypeptide chain” is one generated by translation of a single mRNA template and may encompass one or more polypeptide domains, including one or more repeats of the sequence comprising one polypeptide or polypeptide domain.
  • polypeptide domain refers to a sequence of amino acids that exhibits one or more discrete binding or functional properties.
  • binding or functional properties include binding to one or more polypeptides, modulation of the binding of one or more polypeptides, recognition by an antibody or antigen binding fragment thereof, binding to a coenzyme, ion, or other ligand, catalytic activity or inhibition of catalytic activity, fluorescence and luminescence.
  • non-limiting examples of polypeptide domains include a DNA binding domain and a kinase domain.
  • homodimer refers to a protein complex composed of two identical copies of the same monomer.
  • the term "interact" means that two molecular species physically associate with each other.
  • the association that is characterized as an interaction can involve charge- charge interactions, charge-dipole interactions, dipole-dipole interactions, van der Waals forces, hydrogen bonding and/or hydrophobic forces.
  • specific binding means the specific recognition of one of two different molecules for the other compared to substantially less recognition of other molecules.
  • Members of a specific binding pair have a particular affinity for each other that gives rise to a biological activity.
  • the molecules have areas on their surfaces or in cavities giving rise to specific recognition between the two molecules.
  • Examples of specific binding are antibody-antigen interactions, enzyme-substrate interactions, polynucleotide interactions, and so forth.
  • the term “specifically dimerize” means that two monomers useful in the invention interact via an interaction domain present on each monomer, to the substantial exclusion of polypeptides lacking that interaction domain.
  • “Specifically homodimerize” means that the monomers that interact via a shared interaction domain, to the substantial exclusion of polypeptides lacking that interaction domain, form a homodimer as defined herein.
  • “Substantial exclusion” means that at a given time in a sample, less than 0.1% of the monomers, and preferably less than 0.01% or 0.001% of the monomers, are physically associated with polypeptides that do not have a complementary interaction domain.
  • variant refers to a polypeptide that differs in amino acid sequence from a parent polypeptide yet retains the function of the parent polypeptide.
  • a variant fluorescent polypeptide may, for example, have one or more amino acid insertions, deletions or substitutions that do not alter the ability of the polypeptide to emit fluorescence upon excitation or interaction with a donor or acceptor fluorophore.
  • a variant fluorescent polypeptide according to the invention has the ability to form an intramolecular homodimer as defined herein.
  • a fluorescent polypeptide can be derived from a wild-type fluorescent polypeptide (i.e., a reference polypeptide) by random or site-directed mutagenesis, including insertions, deletions or truncations or fusions.
  • a fluorescent polypeptide derived from a wild-type polypeptide can have different fluorescence characteristics than the wild-type polypeptide.
  • fluorescent characteristic refers to a property of the excitation or emission by a fluorescent polypeptide. Fluorescence characteristics include, for example, the wavelength(s) at which a fluorescent polypeptide is excited or at which it emits (including the breadth and amplitudes of the spectra for each), the extinction coefficient or intensity of the emission, quantum yield or the efficiency of emission, and resistance or susceptibility to photobleaching. Table 2 provides examples of excitation maxima, emission maxima, extinction coefficient and quantum yield for a variety of fluorescent polypeptides.
  • the term "spectrum characteristic of a fluorescent acceptor” refers to the emission spectrum of a given fluorophore that is being used as the fluorescence acceptor in an acceptor/donor pair.
  • the invention relates to dimeric fluorescent proteins that avoid the problems caused by heterodimerization.
  • heterodimerization is avoided by fusing two monomers of the fluorescent polypeptide using a linker peptide.
  • the close spatial relationship of the fused monomers strongly favors the formation of a dimer between the two fused monomers, to the essential exclusion of other monomers sharing a similar dimerization interface.
  • the interaction of the fused monomers via their respective dimerization interfaces is referred to herein as “intramolecular dimerization”.
  • An intramolecular dimer fluorescent protein (IDFP) does not comprise fluorescent monomers that are related to each other as donor and acceptor. That is, the monomers that are linked in an IDFP cannot undergo FRET between them. IDFPs may be co-expressed within the same cell or otherwise mixed with distinct fluorescent proteins comprising the same fluorescent protein dimerization interfaces without encountering the problems caused by heterodimer formation.
  • the nucleic acid encoding a monomer of a fluorescent protein is joined in frame at its 3' end to a sequence encoding a peptide linker, which is itself joined in frame to another copy ofthe nucleic acid encoding the monomer.
  • This sequence may and often will be additionally linked in frame to a sequence encoding a polypeptide of interest, for example, a polypeptide being investigated for interaction with another protein.
  • Translation of the mRNA encoded by such a nucleic acid construct generates the fluorescent monomers in such close proximity to each other that intramolecular homodimerization of the monomers is very strongly favored over intermolecular heterodimerization.
  • the resulting polypeptide therefore comprises an intramolecular homodimer of the fluorescent protein monomers, fused to a protein of interest.
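  • The in-frame arrangement described above (monomer, peptide linker, second copy of the monomer, and optionally a polypeptide of interest) can be illustrated with the following minimal sketch. It assumes each part is supplied as a coding sequence without a stop codon and appends a single TAA stop codon at the end; the function names and the toy sequences in the usage note are hypothetical placeholders and are not the sequences of SEQ ID NO: 1 or 3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_in_frame(part: str, name: str) -> str:
    """Return the upper-cased part if it is a whole number of sense codons."""
    seq = part.upper()
    if len(seq) % 3 != 0:
        raise ValueError(f"{name} length is not a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError(f"{name} contains an in-frame stop codon")
    return seq


def assemble_idfp_orf(monomer: str, linker: str,
                      gene_of_interest: str = "") -> str:
    """Join monomer-linker-monomer(-gene of interest) in one reading frame."""
    parts = [("monomer", monomer), ("linker", linker), ("monomer", monomer)]
    if gene_of_interest:
        parts.append(("gene of interest", gene_of_interest))
    orf = "".join(check_in_frame(seq, name) for name, seq in parts)
    # Assumption: a single stop codon terminates the whole fusion.
    return orf + "TAA"
```

  • For instance, assemble_idfp_orf("ATGGTGAGC", "GGTGGCGGTGGCTCT", "ATGAAA") would return one open reading frame containing two copies of the toy monomer separated by the toy linker, whereas a part whose length is not a multiple of three would be rejected because it would shift the downstream reading frame.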
  • any fluorescent protein that homodimerizes in a cell can be useful in generating an IDFP of the invention.
  • GFPs from Aequorea victoria, Renilla reniformis and Renilla mulleri, among others, are homodimers as they exist in nature. Any of these proteins, and any mutants or engineered versions of these proteins that retain the ability to homodimerize, may be used to generate an IDFP of the invention.
  • the fluorescent protein or the natural protein it was derived from, e.g., R. reniformis GFP
  • a biochemical approach is to fractionate samples of purified proteins by size selection gel chromatography under denaturing versus non-denaturing conditions and analyze fractions for the fluorescent protein by fluorescence. If the fluorescent protein migrates at a larger size (approximately twice as large) under non-denaturing conditions relative to denaturing conditions, it is an indication that the protein is a dimer under native conditions.
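  • The size comparison described above can be sketched as a ratio test, in which an apparent size under non-denaturing conditions of approximately twice the size under denaturing conditions is taken as an indication of a dimer. The tolerance of 0.25 on the ratio and the function names are assumptions chosen for illustration only; the disclosure states only "approximately twice as large".

```python
def native_to_denatured_ratio(native_kda: float, denatured_kda: float) -> float:
    """Ratio of apparent size under native versus denaturing conditions."""
    if native_kda <= 0 or denatured_kda <= 0:
        raise ValueError("apparent sizes must be positive")
    return native_kda / denatured_kda


def suggests_dimer(native_kda: float, denatured_kda: float,
                   tolerance: float = 0.25) -> bool:
    """True if the native size is approximately twice the denatured size.

    tolerance is a hypothetical allowance on the ratio (here 1.75 to 2.25).
    """
    ratio = native_to_denatured_ratio(native_kda, denatured_kda)
    return abs(ratio - 2.0) <= tolerance
```

  • With hypothetical apparent sizes of 54 kDa (non-denaturing) and 27 kDa (denaturing), the ratio is 2.0 and the test indicates a dimer; equal sizes under both conditions would not.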
  • Examples of commonly used matrices include Sephadex (G10-G200), Bio-Gel (P-2 to P-300) and Sepharose (2B, 4B, etc.) matrices.
  • this method can indicate whether or not a polypeptide homodimerizes. If the method is applied to non-purified protein, for example, to protein extracts, the assay only indicates that a dimer forms with some polypeptide, and further analysis is required to determine if the dimer is a homodimer.
  • Another biochemical method of investigating dimer formation is to generate a truncated or elongated form of the protein and mix it, either by co-expression or by mixing of isolated proteins, with the wild-type protein. If homodimers can form, there will be three distinctly sized bands following native gel electrophoresis: 1) a homodimer of the wild-type; 2) a homodimer of the elongated or truncated form; and 3) an intermediate-migrating diagnostic heterodimer complex of the wild-type and the elongated or truncated form. In the absence of dimerization, only bands (1) and (2) will form.
  • homodimer formation is detected by the method of analytical ultracentrifugation (Baird et al., 2000, Proc Natl Acad Sci U S A., 22:11984-9).
  • SEQ ID NO: 1 (Figure 1) is the nucleotide sequence encoding wild-type rGFP.
  • SEQ ID NO: 2 (Figure 2) is the amino acid sequence of wild-type rGFP.
  • a preferred embodiment of the IDFP comprises two copies of the wild-type rGFP polypeptide, linked by a peptide linker sequence.
  • Another embodiment encompasses the same rGFP IDFP additionally fused in frame to a protein of interest. Any protein derived from the rGFP of SEQ ID NO: 2 can be used to generate an IDFP of the invention as long as it retains the ability to homodimerize.
  • the polynucleotide sequence encoding a fluorescent polypeptide is a humanized polynucleotide rGFP coding sequence, also referred to herein as hrGFP.
  • Figure 3 shows a humanized polynucleotide sequence (hrGFP) and the rGFP sequence it encodes (SEQ ID Nos: 3 and 4, respectively).
  • amino acid and nucleotide sequences of A. victoria GFP are known in the art
  • A. victoria-derived GFPs are also known and are frequently commercially available.
  • Heim et al. (1995, Nature 373: 663-664) teaches mutations at S65 of A. victoria GFP that enhance the fluorescence intensity of the polypeptide.
  • the mutant containing the S65T mutation is particularly important, since its fluorescence is approximately 35 times as intense as that of wild-type A. victoria GFP, and its emission spectrum is shifted to the red, making it more amenable to standard rhodamine optics (excitation and emission maxima at 489 nm and 508 nm, respectively).
  • An S65T mutant encoded by a construct comprising humanized codons is known as EGFP, or "enhanced GFP" (available from CLONTECH; see GenBank Accession No. U43284).
  • the EGFP mutant is the cornerstone of a series of commercially-available GFP mutants that have differing emission spectra and other useful engineered properties (Cormack et al., 1996, Gene 173: 33-38; Yang et al., 1996, Nucl. Acids Res. 24: 4592-4593; Crameri et al., 1996, Nature Biotechnol. 14: 315-319)
  • Each protein in the series contains mutations, in addition to the S65T and humanizing mutations, that alter the emission characteristics of the proteins.
  • the cyan fluorescent protein known as ECFP contains six mutations that shift the emission to cyan light (excitation and emission maxima at 434 nm and 477 nm, respectively; see GenBank Accession No.
  • the blue fluorescent protein known as EBFP contains four mutations that shift the emission spectrum to blue (excitation and emission maxima at 380 nm and 440 nm, respectively).
  • the yellow fluorescent protein known as EYFP (see Ormo et al., Science 273: 1392-1395, clone GFP-10C) contains mutations shifting the emission to yellow-green (excitation and emission maxima at 514 nm and 527 nm, respectively).
  • EGFP, ECFP, EYFP and EBFP are all available from CLONTECH.
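  • The excitation and emission maxima recited above can be collected into a small lookup table to help select a variant for a given excitation source. In the following sketch the numerical values are those stated above, while the function names and the default 10 nm matching window are hypothetical choices made for illustration.

```python
# (excitation maximum, emission maximum) in nm, as recited in the text above.
SPECTRAL_MAXIMA_NM = {
    "S65T": (489, 508),
    "ECFP": (434, 477),
    "EBFP": (380, 440),
    "EYFP": (514, 527),
}


def variants_excitable_at(source_nm: float, window_nm: float = 10.0) -> list:
    """Names of variants whose excitation maximum lies within the window."""
    return sorted(
        name
        for name, (excitation, _emission) in SPECTRAL_MAXIMA_NM.items()
        if abs(excitation - source_nm) <= window_nm
    )


def stokes_shift_nm(name: str) -> int:
    """Difference between the emission and excitation maxima of a variant."""
    excitation, emission = SPECTRAL_MAXIMA_NM[name]
    return emission - excitation
```

  • Under these assumptions, a 488 nm excitation source would match only the S65T mutant within the 10 nm window. A practical filter selection would also consider the breadth of each spectrum, which this sketch does not model.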
  • the S65 site has received considerable scrutiny for its role in determining the fluorescence characteristics of the A. victoria GFP molecule.
  • Additional mutants at S65 include, for example, S65A, S65C and S65L, each of which have excitation and emission maxima that differ from wild-type A. victoria GFP (see Table 2).
  • the nucleotide sequence encoding an S65A mutant is available as GenBank Accession No. U56996.
  • One skilled in the art can introduce mutations necessary to alter S65 to any desired amino acid.
  • the additional point mutations detailed in Table 2 can be generated by one of skill in the art.
  • fluorescent proteins useful according to the invention include, for example, A. victoria-derived GFPs that are optimized for expression in plants (GenBank Accession No. U87625 and WO 96/27675), are less thermosensitive (GenBank Accession No. U87973), or are more soluble and emit blue fluorescence (GenBank Accession No. U70497).
  • A. victoria GFPs targeted to specific organelles have also been described, such as those targeted to the mitochondria and the nucleus (Rizzuto et al., 1996, Curr. Biol. 6: 183-188). This listing is by no means exhaustive. There are, for example, a number of fluorescent protein variants, both derived from A.
  • the red fluorescent protein from the Indo-Pacific sea anemone of the Discosoma species is also a candidate for IDFP generation according to the invention (see Matz et al., 1999, Nature Biotechnol. 17: 969-973).
  • the sequence encoding the protein known as “DsRed” is available at GenBank Accession No. AF272711, and vectors encoding the protein are commercially available (CLONTECH).
Linker Sequences Useful According to the Invention
  • Linker sequences useful according to the invention serve to join monomers in the dimeric fluorescent polypeptides of the invention.
  • a linker is preferably about 5 to about 50 amino acids in length, and more preferably about 10 to about 20 amino acids in length.
  • Examples of linkers useful in the invention are the Gly-Ala linkers taught by Huston et al., U.S. Patent No. 5,258,498, incorporated herein by reference. Additional useful linkers include, but are not limited to, (Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr)1-5 (Xu et al., 1999, Proc. Natl. Acad. Sci. U.S.A.
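  • The stated linker length ranges (about 5 to about 50 amino acids, more preferably about 10 to about 20) can be sketched as a length check on a candidate linker written in one-letter amino acid code. Because the ranges are qualified by "about", the hard cut-offs below are a simplifying assumption, and the function name and labels are hypothetical.

```python
# One-letter code for Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr.
REPEAT_UNIT = "RARDPRVPVAT"


def linker_length_class(linker: str) -> str:
    """Classify a linker by length against the ranges stated above.

    The ranges are treated as hard limits here, which is an assumption;
    the text qualifies each bound with "about".
    """
    length = len(linker)
    if 10 <= length <= 20:
        return "more preferred"
    if 5 <= length <= 50:
        return "preferred"
    return "outside stated range"
```

  • A single copy of the 11-residue repeat unit falls in the more preferred range, and three copies (33 residues) fall in the broader preferred range.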
  • the protein of interest can be any protein for which the nucleic acid sequence is known and for which that sequence or at least a relevant part of that sequence can be cloned into a vector encoding an IDFP.
  • by “relevant part” is meant a domain of interest within a protein, for example, a domain being evaluated for protein-protein interactions or a domain with catalytic activity.
  • protein of interest or “domain of interest” refers to any polypeptide or protein, or polypeptide or protein domain, that one wishes to fuse to an IDFP molecule of the invention.
  • the fusion of an IDFP with a polypeptide of interest may be through linkage of the IDFP sequence to either the N or C terminus of the fusion partner.
  • Fusions comprising IDFP polypeptides of the invention need not comprise only a single polypeptide or domain in addition to the IDFP. Rather, any number of domains of interest may be linked in any way as long as the IDFP coding region retains its reading frame and the encoded polypeptide retains fluorescence activity under at least one set of conditions.
  • physiological salt concentration (i.e., about 90 mM)
  • pH near neutral
  • proteins of interest include, but are not limited to receptors (transmembrane and intracellular) and cell surface proteins, growth factors, signal transduction proteins, transcription factors, structural proteins (e.g., cytoskeletal proteins, nuclear matrix proteins, histones, etc.), extracellular matrix proteins, immunoglobulins, bacterial proteins, plant proteins, viral or phage proteins, enzymes, therapeutic proteins, phosphoproteins, glycoproteins, and lipoproteins.
  • Expression of IDFPs from recombinant vectors may be effected in a number of ways known to those skilled in the art.
  • plasmids, bacteriophage or viral vectors may be introduced to prokaryotic or eukaryotic cells by any of a number of ways known to those skilled in the art. Examples of useful vectors, cells, methods of introducing vectors to cells and methods of detecting and isolating GFP polypeptides and variants thereof are also described herein below.
  • There is a wide array of vectors known and available in the art that are useful for the expression of IDFPs according to the invention.
  • the selection of a particular vector clearly depends upon the intended use of the polypeptide.
  • the selected vector must be capable of driving expression of the polypeptide in the desired cell type, whether that cell type be prokaryotic or eukaryotic.
  • Many vectors comprise sequences allowing both prokaryotic vector replication and eukaryotic expression of operably linked gene sequences.
  • Vectors useful according to the invention may be autonomously replicating, that is, the vector, for example, a plasmid, exists extrachromosomally and its replication is not necessarily directly linked to the replication of the host cell's genome.
  • the replication of the vector may be linked to the replication of the host's chromosomal DNA, for example, the vector may be integrated into the chromosome of the host cell as achieved by retroviral vectors.
  • Vectors useful according to the invention preferably comprise sequences operably linked to the IDFP coding sequences that permit the transcription and translation of the IDFP sequence. Sequences that permit the transcription of the linked IDFP sequence include a promoter and optionally also include an enhancer element or elements permitting the strong expression of the linked sequences.
  • transcriptional regulatory sequences refers to the combination of a promoter and any additional sequences conferring desired expression characteristics (e.g., high level expression, inducible expression, tissue- or cell-type-specific expression, or a combination of these) on an operably linked nucleic acid sequence.
  • the selected promoter may be any DNA sequence that exhibits transcriptional activity in the selected host cell, and may be derived from a gene normally expressed in the host cell or from a gene normally expressed in other cells or organisms.
  • promoters include, but are not limited to the following: A) prokaryotic promoters - E. coli lac, tac, or trp promoters, lambda phage P R or P L promoters, bacteriophage T7, T3, Sp6 promoters, B. subtilis alkaline protease promoter, and the B.
  • eukaryotic promoters - yeast promoters such as GAL1, GAL4 and other glycolytic gene promoters (see for example, Hitzeman et al., 1980, J. Biol. Chem. 255: 12073-12080; Alber & Kawasaki, 1982, J. Mol. Appl. Gen. 1: 419-434), LEU2 promoter (Martinez-Garcia et al., 1989, Mol Gen Genet.
  • alcohol dehydrogenase gene promoters (Young et al., 1982, in Genetic Engineering of Microorganisms for Chemicals, Hollaender et al., eds., Plenum Press, NY)
  • TPI1 promoter (U.S. Pat. No. 4,599,311)
  • insect promoters such as the polyhedrin promoter (U.S. Pat. No. 4,745,051; Vasuvedan et al., 1992, FEBS Lett. 311: 7-11)
  • the P10 promoter Vlak et al., 1988, J. Gen. Virol.
  • the Autographa californica polyhedrosis virus basic protein promoter (EP 397485), the baculovirus immediate-early gene 1 promoter (U.S. Pat. Nos. 5,155,037 and 5,162,222), the baculovirus 39K delayed-early gene promoter (also U.S. Pat. Nos. 5,155,037 and 5,162,222) and the OpMNPV immediate early promoter 2; mammalian promoters - the SV40 promoter (Subramani et al., 1981, Mol. Cell. Biol.
  • metallothionein promoter (MT-1; Palmiter et al., 1983, Science 222: 809-814)
  • adenovirus 2 major late promoter (Yu et al., 1984, Nucl. Acids Res. 12: 9309-21)
  • CMV (cytomegalovirus)
  • other viral promoter (Teong et al., 1998, Anticancer Res. 18: 719-725)
  • a selected promoter may also be linked to sequences rendering it inducible or tissue-specific.
  • the addition of a tissue-specific enhancer element upstream of a selected promoter may render the promoter more active in a given tissue or cell type.
  • inducible expression may be achieved by linking the promoter to any of a number of sequence elements permitting induction by, for example, thermal changes (temperature sensitive), chemical treatment (for example, metal ion- or IPTG-inducible), or the addition of an antibiotic inducing agent (for example, tetracycline).
  • Regulatable expression is achieved using, for example, expression systems that are drug inducible (e.g., tetracycline, rapamycin or hormone-inducible).
  • Drug-regulatable promoters that are particularly well suited for use in mammalian cells include the tetracycline regulatable promoters, and glucocorticoid steroid-, sex hormone steroid-, ecdysone-, lipopolysaccharide (LPS)- and isopropylthiogalactoside (IPTG)-regulatable promoters.
  • a regulatable expression system for use in mammalian cells should ideally, but not necessarily, involve a transcriptional regulator that binds (or fails to bind) nonmammalian DNA motifs in response to a regulatory agent, and a regulatory sequence that is responsive only to this transcriptional regulator.
  • tissue-specific promoters may also be used to advantage with IDFP-encoding constructs.
  • A wide variety of tissue-specific promoters is known.
  • tissue-specific means that a given promoter is transcriptionally active (i.e., directs the expression of linked sequences sufficient to permit detection of the polypeptide product of the promoter) in less than all cells or tissues of an organism.
  • a tissue specific promoter is preferably active in only one cell type, but may, for example, be active in a particular class or lineage of cell types (e.g., hematopoietic cells).
  • a tissue specific promoter useful according to the invention comprises those sequences necessary and sufficient for the expression of an operably linked nucleic acid sequence in a manner or pattern that is essentially the same as the manner or pattern of expression of the gene linked to that promoter in nature. Any tissue specific transcriptional regulatory sequence known in the art may be used to advantage with a vector encoding an IDFP.
  • vectors useful according to the invention may further comprise a suitable terminator.
  • suitable terminators include, for example, the human growth hormone terminator (Palmiter et al., 1983, supra), or, for yeast or fungal hosts, the TPI1 (Alber & Kawasaki, 1982, supra) or ADH3 terminator (McKnight et al., 1985, EMBO J. 4: 2093-2099).
  • Vectors useful according to the invention may also comprise polyadenylation sequences (e.g., the SV40 or Ad5 E1b poly(A) sequence), and translational enhancer sequences (e.g., those from Adenovirus VA RNAs). Further, a vector useful according to the invention may encode a signal sequence directing the recombinant polypeptide to a particular cellular compartment or, alternatively, may encode a signal directing secretion of the recombinant polypeptide. A vector useful according to the invention may also comprise a selectable marker allowing identification of a cell that has received a functional copy of the IDFP-encoding gene construct.
  • the IDFP sequence itself, linked to a chosen promoter may be considered a selectable marker, in that illumination of cells or cell lysates with the proper wavelength of light and measurement of emitted fluorescence at the expected wavelength allows detection of cells that express the IDFP construct.
  • the selectable marker may comprise an antibiotic resistance gene, such as the neomycin, bleomycin, zeocin or phleomycin resistance genes, or it may comprise a gene whose product complements a defect in a host cell, such as the gene encoding dihydrofolate reductase (DHFR), or, for example, in yeast, the Leu2 gene.
  • the selectable marker may, in some cases, be a luciferase gene or a chromogenic substrate-converting enzyme gene such as the β-galactosidase gene.
  • IDFP-encoding sequences according to the invention may be expressed either as freestanding polypeptides or as fusions with other polypeptides. It is assumed that one of skill in the art can, given an IDFP nucleic acid sequence, readily construct a gene comprising a sequence encoding the IDFP fused in frame to one or more polypeptides or polypeptide domains of interest. References teaching methods to do so include Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, and Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology on CD-ROM, John Wiley & Sons, New York, NY.
  • A schematic diagram of a vector encoding the transcription unit of one possible embodiment of the invention is shown in Figure 4.
  • an intramolecular dimer humanized R. reniformis GFP (hrGFP)
  • MCS (multi-cloning site)
  • a gene of interest is fused at the C-terminus of the hrGFP dimer by insertion in frame into the MCS.
  • a polyadenylation site sequence is included 3' of the MCS to enhance the stability and processing of the transcript generated.
  • the (Gly Ser) 2-4 linkers shown represent three examples of a linker peptide sequence useful according to the invention and are not meant to be limiting.
  • a. Plasmid vectors.
  • Any plasmid vector that allows expression of an IDFP coding sequence of the invention in a selected host cell type is acceptable for use according to the invention.
  • a plasmid vector useful in the invention may have any or all of the above-noted characteristics of vectors useful according to the invention.
  • Plasmid vectors useful according to the invention include, but are not limited to, the following examples: Bacterial - pQE70, pQE60, pQE-9 (Qiagen); pBs, phagescript, psiX174, pBluescript SK, pBsKS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, and pRIT5 (Pharmacia); Eukaryotic - pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, and pSVL (Pharmacia). However, any other plasmid or vector may be used as long as it is replicable in the host.
  • b. Bacteriophage vectors.
  • A number of bacteriophage-derived vectors are useful according to the invention. Foremost among these are the lambda-based vectors, such as Lambda Zap II or Lambda-Zap Express vectors (Stratagene) that allow inducible expression of the polypeptide encoded by the insert. Others include filamentous bacteriophage such as the M13-based family of vectors.
  • c. Viral vectors.
  • Viral vectors useful according to the invention include, but are not limited to, retroviral vectors, adenoviral vectors, adeno-associated viral vectors, herpesviral vectors, and Semliki forest viral (alphaviral) vectors.
  • Defective retroviruses are well characterized for use in gene transfer (for a review see Miller, A.D. (1990) Blood 76:271). Protocols for producing recombinant retroviruses and for infecting cells in vitro or in vivo with such viruses can be found in Ausubel et al. (eds.), 1993, supra, and other standard laboratory manuals.
  • adenoviruses can be manipulated such that they encode and express a gene product of interest but are inactivated in terms of their ability to replicate in a normal lytic viral life cycle (see for example Berkner et al., 1988, BioTechniques 6:616; Rosenfeld et al., 1991, Science 252:431-434; and Rosenfeld et al., 1992, Cell 68:143-155).
  • Suitable adenoviral vectors derived from the adenovirus strain Ad type 5 dl324 or other strains of adenovirus are well known to those skilled in the art.
  • Adeno-associated virus is a naturally occurring defective virus that requires another virus, such as an adenovirus or a herpes virus, as a helper virus for efficient replication and a productive life cycle.
  • An AAV vector such as that described in Tratschin et al. (1985, Mol. Cell. Biol. 5:3251-3260) can be used to introduce nucleic acid into cells.
  • AAV vectors are useful for the introduction of nucleic acid sequences into a variety of different cell types (see, for example, Hermonat et al., 1984, Proc. Natl. Acad. Sci. USA 81: 6466-6470; and Tratschin et al., 1984, Mol. Cell. Biol. 4: 2072-2081).
  • Insect cells are useful hosts because high level expression may be obtained, the culture conditions are simple relative to mammalian cell culture, and the post-translational modifications made by insect cells closely resemble those made by mammalian cells.
  • insect cells such as Drosophila S2 cells
  • infection with baculovirus vectors is widely used.
  • Other insect vector systems include, for example, the expression plasmid pIZ/V5-His (InVitrogen) and other variants of the pIZ/V5 vectors encoding other tags and selectable markers.
  • Insect cells are readily transfectable using lipofection reagents, and there are lipid-based transfection products specifically optimized for the transfection of insect cells (for example, from PanVera).
  • any cell into which a recombinant vector carrying an IDFP sequence may be introduced and wherein the vector is permitted to drive the expression of the IDFP is useful according to the invention. That is, because of the wide variety of uses for the IDFP molecules of the invention, any cell in which an IDFP molecule of the invention may be expressed and preferably detected is a suitable host.
  • Vectors suitable for the introduction of IDFP-encoding sequences to host cells from a variety of different organisms, both prokaryotic and eukaryotic, are described herein above or known to those skilled in the art.
  • Host cells may be prokaryotic, such as any of a number of bacterial strains, or may be eukaryotic, such as yeast or other fungal cells, insect or amphibian cells, or mammalian cells including, for example, rodent, simian or human cells. Host cells may also be plant cells. Cells expressing IDFPs may be primary cultured cells, for example, primary human fibroblasts or keratinocytes, or may be an established cell line, such as NIH3T3, 293T or CHO cells. Further, mammalian cells useful for expression of IDFPs may be phenotypically normal or oncogenically transformed. It is assumed that one skilled in the art can readily establish and maintain a chosen host cell type in culture.
  • IDFP-encoding vectors may be introduced to selected host cells by any of a number of suitable methods known to those skilled in the art.
  • IDFP constructs may be introduced to appropriate bacterial cells by infection, in the case of E. coli bacteriophage vector particles such as lambda or M13, or by any of a number of transformation methods for plasmid vectors or for bacteriophage DNA.
  • For example, standard calcium-chloride-mediated bacterial transformation is still commonly used to introduce naked DNA to bacteria (Sambrook et al., 1989, supra), but electroporation may also be used (Ausubel et al. (eds.), supra, 1993).
  • For the introduction of IDFP-encoding constructs to yeast or other fungal cells, chemical transformation methods are generally used (e.g., as described by Rose et al., 1990, Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).
  • For transformation of S. cerevisiae, for example, the cells are treated with lithium acetate to achieve
  • Transformed cells are then isolated on selective media appropriate to the selectable marker used.
  • plates or filters lifted from plates may be scanned for IDFP fluorescence to identify transformed clones.
  • For the introduction of IDFP-encoding vectors to mammalian cells, the method used will depend upon the form of the vector.
  • DNA encoding an IDFP may be introduced by any of a number of transfection methods, including, for example, lipid-mediated transfection ("lipofection"), DEAE-dextran-mediated transfection, electroporation or calcium phosphate precipitation. These methods are detailed, for example, in Ausubel et al. (eds.), 1993, supra.
  • Lipofection reagents and methods suitable for transient transfection of a wide variety of transformed and non-transformed or primary cells are widely available, making lipofection an attractive method of introducing constructs to eukaryotic, and particularly mammalian cells in culture.
  • Examples include LipofectAMINE™ (Life Technologies) and LipoTaxi™ (Stratagene) kits.
  • Other companies offering reagents and methods for lipofection include Bio-Rad Laboratories, CLONTECH, Glen Research, Invitrogen, JBL Scientific, MBI Fermentas, PanVera, Promega, Quantum Biotechnologies, Sigma-Aldrich, and Wako Chemicals USA.
  • For insect cells, liposome-mediated transfection is commonly used, as is baculovirus infection.
  • Cells such as Schneider-2 cells (Drosophila melanogaster), Sf-9 and Sf-21 cells (Spodoptera frugiperda) or High Five™ cells (Trichoplusia ni) may be transfected using any of a number of commercially available liposome transfection reagents optimized for use with insect cells.
  • Reagents include, for example, TransIT-Insecta™ (PanVera), FuGENE™-6 (Roche), Insectin™-Plus (Invitrogen) and Tfx™-20 (Promega).
  • Each of these reagents permits the introduction of nucleic acid vectors encoding an IDFP to insect cells.
  • Expression vectors optimized for insect cell expression are widely known and are commercially available from, for example, Clontech and Invitrogen. These include both plasmid-based vectors and baculovirus vectors. Insect cell expression vectors are described in detail in "Baculovirus Expression Vectors", D.R. O'Reilly, L.K. Miller & V.A. Luckow (1992, W.H. Freeman Co., New York).
  • eukaryotic (preferably, but not necessarily mammalian) cells successfully incorporating the construct may be selected, as noted above, by either treatment of the transfected population with a selection agent, such as an antibiotic whose resistance gene is encoded by the vector, or by direct screening using, for example, FACS of the cell population or fluorescence scanning of adherent cultures. Frequently, both types of screening may be used, wherein a negative selection is used to enrich for cells taking up the construct and FACS or fluorescence scanning is used to further enrich for cells expressing IDFPs or to identify specific clones of cells, respectively.
  • a negative selection with the neomycin analog G418 may be used to identify cells that have received the vector, and fluorescence scanning may be used to identify those cells or clones of cells that express an IDFP to the greatest extent.
  • codons in the table are arranged from left to right in descending order of relative use in human genes. In particular, those codons underlined in the table are almost never used in known human genes and, if found in a sequence to be humanized, would therefore represent the most important codons to modify for enhanced expression efficiency in mammalian or human cells.
  • a sequence is considered "humanized" if the codon for one or more amino acids has been changed from the native codon sequence to a codon sequence more favored for translation in human or mammalian cells, preferably without altering the polypeptide coding sequence.
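The definition of a humanized sequence can be illustrated with a minimal sketch. The function names and the short `PREFERRED` map are assumptions made for illustration only; they do not reproduce Table 1, and a real design would draw on a full human codon-usage table.

```python
from itertools import product

# Standard genetic code, with codons enumerated using bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative preferred codons (an assumption for this sketch, not Table 1).
PREFERRED = {"L": "CTG", "V": "GTG", "K": "AAG", "E": "GAG"}


def translate(cds):
    """Translate a coding sequence into one-letter amino acids ('*' marks a stop)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def humanize(cds):
    """Swap codons for preferred synonyms without changing the encoded protein."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Keep the native codon when no preferred synonym is listed.
        out.append(PREFERRED.get(amino_acid, codon))
    return "".join(out)


native = "ATGTTAGTAAAAGAATAA"  # toy sequence encoding M-L-V-K-E-stop
humanized = humanize(native)
```

Because every replacement is a synonymous codon, `translate(native)` and `translate(humanized)` give the same polypeptide, which is the property the definition asks for.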
  • Site-directed mutagenesis is well known in the art and is often performed using commercially available kits, such as the EXSITE™ (Catalog No. 200502), QUIKCHANGE™ (Catalog No. 200518) or CHAMELEON® mutagenesis kits (Catalog No. 200509), available from Stratagene. TABLE 1
  • Recombinant fluorescent proteins can be purified from bacteria as follows. Bacteria transformed with a recombinant IDFP-encoding vector of the invention are grown in Luria-Bertani medium containing the appropriate selective antibiotic (e.g., ampicillin at 50 µg/ml).
  • Where the vector permits, recombinant polypeptide expression is induced by the addition of the appropriate inducer (e.g., IPTG at 1 mM).
  • Bacteria are harvested by centrifugation and lysed by freeze-thaw of the cell pellet. Debris is removed by centrifugation at 14,000 x g, and the supernatant is loaded onto a Sephadex G-75 (Pharmacia, Piscataway, NJ) column equilibrated with 10 mM phosphate buffered saline, pH 7.0. Fractions containing IDFP are identified by fluorescence emission at the expected wavelength when excited by light at the excitation wavelength.
  • IDFPs can be isolated from eukaryotic cells by methods well known to those skilled in the art. It is also contemplated that IDFPs will include a marker or affinity tag sequence to permit affinity purification. Examples include 6X-His, glutathione S transferase (GST), or epitope tags such as Flag or the Myc tag. These tags are useful for both bacterial and eukaryotic cell expression and purification of IDFPs.
  • a candidate modulator or candidate agent being evaluated for a modulatory function on a given interaction or biological process may be a synthetic compound, a mixture of compounds, or may be a natural product (e.g. a plant extract or culture supernatant).
  • Candidate agents from large libraries of synthetic or natural compounds can be screened. Numerous means are currently used for random and directed synthesis of saccharide, peptide, and nucleic acid based compounds. Synthetic compound libraries are commercially available from a number of companies including Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex (Princeton, NJ), Brandon Associates (Merrimack, NH), and Microsource (New Milford, CT). A rare chemical library is available from Aldrich (Milwaukee, WI). Combinatorial libraries are available and can be prepared.
  • libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available from e.g., Pan Laboratories (Bothell, WA) or MycoSearch (NC), or are readily producible by methods well known in the art. Additionally, natural and synthetically produced libraries and compounds are readily modified through conventional chemical, physical, and biochemical means.
  • Useful candidate compounds may be found within numerous chemical classes. Such compounds may be organic compounds, or small organic compounds. Small organic compounds have a molecular weight of more than 50 yet less than about 2,500 daltons, preferably less than about 750 daltons, more preferably less than about 350 daltons. Exemplary classes include heterocycles, peptides, saccharides, steroids, and the like. The compounds may be modified to enhance efficacy, stability, pharmaceutical compatibility, and the like. Structural identification of an agent may be used to identify, generate, or screen additional agents.
  • peptide agents may be modified in a variety of ways to enhance their stability, such as using an unnatural amino acid, such as a D-amino acid, particularly D-alanine, by functionalizing the amino or carboxylic terminus, e.g. for the amino group, acylation or alkylation, and for the carboxyl group, esterification or amidification, or the like.
  • Candidate agents will be effective at varying concentrations, depending on the nature of the agent and on the nature of its interaction with the polypeptide or polypeptide fragment of interest. Therefore, candidate agents should be screened at varying concentrations. Generally, concentrations from about 10 mM to about 1 fM are preferred for screening.
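As a small worked example of the screening range given above, the following sketch lists tenfold concentration steps from about 10 mM (10^-2 M) down to about 1 fM (10^-15 M). The tenfold spacing and the function name are assumptions; the text states only the end points of the range.

```python
def tenfold_series(high_exp=-2, low_exp=-15):
    """Concentrations in mol/L as powers of ten, from 10**high_exp down to 10**low_exp."""
    return [10.0 ** exponent for exponent in range(high_exp, low_exp - 1, -1)]


series = tenfold_series()
for molar in series:
    print(f"{molar:.0e} M")
```

Covering the full range at tenfold spacing takes 14 concentrations per candidate agent; a narrower window or finer spacing may suit a particular agent better.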
  • the association constants of agents that bind polypeptides or fragments thereof will generally be in the range of
  • IDFPs can be used in any application for which fluorescent proteins are suited.
  • IDFPs can be used as reporter genes to monitor the activity of promoter sequences, to investigate the cellular localization of fusion proteins, to mark cellular proteins for FACS analyses of cell populations, to monitor viral vector infection, to monitor transgene expression in vivo or in culture, and to monitor protein:protein interactions both in vivo and in vitro.
  • IDFPs comprising fluorescent proteins whose spectral characteristics are sensitive to intracellular or extracellular environmental changes (e.g., pH, redox status, phosphorylation of the fluorescent protein, etc.) will continue to be sensitive to those changes in the context of an IDFP.
  • Because IDFPs do not heterodimerize, they are particularly well suited for multiple-labeling studies involving the co-expression of IDFP-fusion proteins with differing spectral characteristics.
  • Techniques useful for the detection of IDFP fusion proteins include, for example, standard fluorescent microscopy, confocal microscopy, flow cytometry and fluorescence activated cell sorting (FACS).
  • IDFPs are particularly well suited to applications that rely on FRET.
  • the lack of heterodimerization between IDFPs with differing spectral characteristics that permit FRET but that share the same dimerization interfaces is a major improvement over previous methods using fluorescent proteins that could heterodimerize, since it removes a significant source of FRET background.
  • two different IDFPs that have overlapping emission and excitation spectra i.e., they are donor and acceptor to each other
  • a specific interaction of the fusion partners will result in a change in the detected emission spectrum from that of the donor to that of the acceptor when a mixture of the two IDFP fusion proteins is irradiated with light that excites the donor fluorophore.
  • This type of assay is readily adapted to a screening format, in which known interactors are exposed to candidate compounds. Detection of a change from the acceptor's emission profile to the donor's emission profile indicates that a candidate compound has disrupted the interaction between the fusion partners. Either of these assays can be performed in vivo or in vitro.
  • An example of a donor/acceptor fluorescent protein pair is P4-3 and S65C or S65T (Table 2; U.S. Pat. No. 5,981,200).
  • Other examples of donor/acceptor pairs of fluorescent polypeptides include, but are
  • T203Y/S65G, V68L or Q69K excitation ⁇ 515 nm, emission ⁇ 527 nm
  • acceptor See Tsien et al., WO 97/28261.
  • Each of these proteins shares the dimerization interface of A. victoria GFP. Their expression as IDFPs would allow their co-expression without heterodimerization.
  • a pair of fluorescent proteins that are useful according to the invention function as fluorescent donor and fluorescent acceptor, respectively, in the context of fluorescence resonance energy transfer.
  • the ability of a pair of fluorescent proteins to function as fluorescent donor and fluorescent acceptor, respectively, in the context of fluorescence resonance energy transfer is determined experimentally and is influenced by a number of factors including donor/acceptor peaks, emission/excitation peaks, peak widths, the efficiency of energy transfer within a fluorescent moiety and peak overlap.
  • 1) the donor excitation peak, A, Figure 5, will overlap minimally with the acceptor excitation peak, C, Figure 5, such that excitation of the donor does not excite the acceptor; 2) the donor excitation peak, A, Figure 5, and the donor emission peak, B, Figure 5, have sufficient overlap to permit efficient energy transfer; 3) the donor emission peak, B, Figure 5, and the acceptor excitation peak, C, Figure 5, have sufficient overlap to permit efficient FRET energy transfer; 4) the donor emission peak, B, Figure 5, and the acceptor emission peak, D, Figure 5, have sufficient overlap to allow for differentiation between FRET and non-FRET energy transfer; and 5) the donor excitation peak, A, Figure 5, and the acceptor emission peak, D, Figure 5, have sufficient overlap to allow for differentiation between FRET and non-FRET energy transfer.
  • an acceptable donor/acceptor pair exhibits > 50% quenching of donor emission at a chromophore distance of > 10 Å. This is based on the Förster radius, R₀, which is the distance at which 50% of excited donors are deactivated by FRET (i.e., the distance at which energy transfer is 50% efficient). The value of R₀ is dependent on the spectral properties of the
  • R₀ = [8.8 × 10²³ · κ² · n⁻⁴ · QY_D ·
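The statement that energy transfer is 50% efficient at the Förster radius follows from the standard distance dependence of FRET, E = 1 / (1 + (r / R₀)^6). The sketch below evaluates that relation; the function name and the example radius of 50 Å are illustrative assumptions, not values taken from this document.

```python
def fret_efficiency(distance, forster_radius):
    """Return E = 1 / (1 + (r / R0)**6); both arguments use the same length unit."""
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius must be > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


R0 = 50.0  # hypothetical Forster radius in angstroms
for r in (25.0, 50.0, 100.0):
    print(f"r = {r:5.1f} A  E = {fret_efficiency(r, R0):.3f}")
```

The sixth-power term makes the efficiency fall steeply around R₀, which is why FRET reports only on fluorophores brought into close proximity.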
  • the advantage resulting from the forced intramolecular homodimer formation is most apparent when, for example, fluorescent proteins with different emission characteristics, derived from the same parent fluorescent protein, are expressed in a single cell. For example, if two variants of R. reniformis GFP have spectral characteristics that permit FRET between the variants, both of these proteins will have the same dimerization interfaces. Without the forced homodimerization occurring in an IDFP, the background level of acceptor fluorescence upon irradiation within the donor's excitation spectrum will be higher than if IDFP versions of the same fluorescent proteins are used.
  • heterodimerization between two fluorescent fusion proteins via that interface can be a problem.
  • heterodimerization can reduce the sensitivity of sub-cellular localization studies using two labels.
  • Heterodimerization will segregate the labeled proteins into three populations: homodimers of the first fusion protein, homodimers of the second fusion protein, and heterodimers comprising both.
  • heterodimer formation will reduce the amount of either homodimer available to segregate to a given location in the cell. This will result in decreased sensitivity in the assay. Therefore, the use of IDFPs in such a situation will improve upon detection sensitivity even if one is not relying upon FRET for detection.
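To give a feel for the size of this effect, the sketch below estimates the three dimer populations under a simplifying assumption that is not stated in the text: monomers of the two fusion proteins pair at random with equal affinity. The function name is hypothetical.

```python
def dimer_fractions(fraction_first):
    """Return (first homodimer, heterodimer, second homodimer) fractions of all dimers.

    Assumes random pairing of monomers with equal affinity (illustrative only).
    """
    if not 0.0 <= fraction_first <= 1.0:
        raise ValueError("fraction_first must be between 0 and 1")
    p = fraction_first
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


first_homo, hetero, second_homo = dimer_fractions(0.5)
print(first_homo, hetero, second_homo)
```

Under this assumption, equal expression of the two fusion proteins would place half of all dimers in the heterodimer population, leaving only a quarter as each homodimer.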
  • IDFPs are well suited for applications that monitor the association of fusion polypeptides using energy transfer.
  • In order to monitor the association of two polypeptides of interest using IDFPs, one must first select a pair of fluorescent polypeptides that are donor and acceptor to each other. Each polypeptide in the pair must be capable of homodimerization.
  • Another pair useful according to the invention is P4-3 and R. reniformis GFP (hrGFP).
  • the nucleic acid sequence encoding the fluorescence donor polypeptide is used to generate a construct encoding, in order, a copy of the donor polypeptide (e.g., P4-3), a linker, a second copy of the fluorescence donor polypeptide and one of the polypeptides of interest (alternatively, the sequence encoding the protein of interest may be placed upstream of and in frame with the sequences encoding the IDFP).
  • sequence encoding the fluorescence acceptor polypeptide is used to generate a construct encoding, in order, a copy of the acceptor polypeptide (e.g., S65T), a linker, a second copy of the fluorescence acceptor polypeptide and the second polypeptide of interest (alternatively, the sequence encoding the protein of interest may be placed upstream of and in frame with the sequences encoding the acceptor IDFP).
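The construct order described above (monomer, linker, second monomer, polypeptide of interest, all in one reading frame) can be sketched as a simple in-frame assembly check. The helper names and the toy sequences are hypothetical placeholders, not the actual P4-3, S65T or linker coding sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a DNA string into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(parts):
    """Join coding sequences in order, dropping the stop codon of every part but the last."""
    fused = []
    for index, part in enumerate(parts):
        part = part.upper()
        if len(part) % 3 != 0:
            raise ValueError(f"part {index} is not a whole number of codons")
        triplets = split_codons(part)
        is_last = index == len(parts) - 1
        if not is_last and triplets and triplets[-1] in STOP_CODONS:
            triplets = triplets[:-1]  # remove the stop so translation reads through
        # Any stop codon other than the final codon of the last part is premature.
        internal = triplets[:-1] if is_last else triplets
        if any(t in STOP_CODONS for t in internal):
            raise ValueError(f"part {index} contains a premature stop codon")
        fused.extend(triplets)
    return "".join(fused)


monomer = "ATGGCTTAA"   # toy stand-in for a fluorescent monomer coding sequence
linker = "GGATCC"       # toy Gly-Ser linker
partner = "ATGAAATGA"   # toy stand-in for the polypeptide of interest
construct = fuse_in_frame([monomer, linker, monomer, partner])
```

Placing `partner` first in the list instead would model the alternative arrangement in which the protein of interest lies upstream of the IDFP.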
  • a pair of proteins of interest is the Ras proto-oncogene product and the Raf-1 kinase.
  • the G-Protein Ras binds to Raf-1 in response to signals originating at receptor tyrosine kinases.
  • a human c-Ha-Ras cDNA sequence is available at GenBank Accession No. J00277
  • a human Raf-1 kinase sequence is available at GenBank Accession No. NM_002880.
  • Ras coding sequences may be ligated in frame to the donor P4-3 IDFP construct
  • the Raf-1 coding sequences may be ligated in frame to the acceptor S65T IDFP construct.
  • Constructs encoding the two IDFP-fusion proteins are transfected, either simultaneously or sequentially, into cells in which the protein:protein interaction is to be studied (e.g., HeLa cells, NIH3T3 cells, or another specific cell type of interest) using methods well known in the art (e.g., lipofection, electroporation, calcium phosphate precipitation, or even retroviral infection following generation of recombinant retroviral vector particles as known in the art).
  • the interaction of the proteins of interest is measured by detection of fluorescent emission upon irradiation with light that excites the donor fluorophore, in this instance P4-3, but not the acceptor fluorophore, S65T. If the fused Ras and Raf-1 domains interact, excitation with 381 nm light will result in energy transfer between the P4-3 and S65T fluorophores and emission of light with a maximum at about 511 nm. In contrast, if the domains do not interact, the emission maximum upon excitation at 381 nm will be at about 445 nm, the emission maximum of P4-3. This therefore allows the monitoring of the interaction of the two domains in response to stimuli, such as the addition of growth factor, growth factor analogs, or candidate modulators of the signal transduction pathway.
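The readout described above can be sketched as a check on where the emission maximum falls after excitation at 381 nm. The peak wavelengths come from the text; the nearest-peak decision rule, the function names and the example spectra are assumptions for illustration.

```python
DONOR_PEAK_NM = 445.0     # P4-3 emission maximum, as given in the text
ACCEPTOR_PEAK_NM = 511.0  # S65T emission maximum, as given in the text


def emission_maximum(spectrum):
    """Return the wavelength (nm) with the highest intensity in {wavelength: intensity}."""
    return max(spectrum, key=spectrum.get)


def interaction_detected(spectrum):
    """True when the emission maximum lies closer to the acceptor peak than the donor peak."""
    peak = emission_maximum(spectrum)
    return abs(peak - ACCEPTOR_PEAK_NM) < abs(peak - DONOR_PEAK_NM)


# Hypothetical emission spectra recorded with 381 nm excitation.
no_binding = {445.0: 100.0, 480.0: 40.0, 511.0: 20.0}
binding = {445.0: 30.0, 480.0: 45.0, 511.0: 90.0}
print(interaction_detected(no_binding), interaction_detected(binding))
```

In practice a ratio of acceptor to donor emission is often more robust than a single peak position, since partial interaction yields a mixture of both signals.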
  • the protein:protein interaction assay using IDFP fusion proteins described above may also be performed in vitro with isolated or purified IDFP fusion proteins.
  • This type of assay, or even the cell-based assay described above may be readily adapted to a high-throughput format by placing the transfected cells or protein samples in a multiwell container and monitoring fluorescence output of samples exposed to various candidate modulators. Further, by performing the interaction assay in the presence or absence of a candidate modulator, one may adapt the method for screening of candidate modulator compounds to identify compounds that either increase or decrease the measured interaction. A change in the interaction in the presence of a candidate modulator relative to the interaction in its absence is indicative of a modulatory effect.
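A multiwell screen of this kind can be sketched as a comparison of each treated well with an untreated control. The acceptor/donor emission ratio, the 1.5-fold threshold, the well names and the readings below are all illustrative assumptions rather than parameters from this document.

```python
def fret_ratio(acceptor_emission, donor_emission):
    """Ratio of acceptor to donor emission measured under donor excitation."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    return acceptor_emission / donor_emission


def classify_wells(control, wells, fold_threshold=1.5):
    """Label each well 'increase', 'decrease' or 'no change' relative to the control."""
    baseline = fret_ratio(*control)
    calls = {}
    for name, (acceptor, donor) in wells.items():
        fold = fret_ratio(acceptor, donor) / baseline
        if fold >= fold_threshold:
            calls[name] = "increase"
        elif fold <= 1.0 / fold_threshold:
            calls[name] = "decrease"
        else:
            calls[name] = "no change"
    return calls


control = (200.0, 100.0)  # (acceptor, donor) emission without candidate modulator
wells = {"A1": (400.0, 100.0), "A2": (100.0, 100.0), "A3": (210.0, 100.0)}
calls = classify_wells(control, wells)
print(calls)
```

A well called "decrease" would correspond to a candidate that disrupts the interaction, and "increase" to one that enhances it.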
  • Example 2. Labeling a cell with an IDFP.
  • IDFPs according to the invention can be used in any application in which fluorescent polypeptides are useful.
  • cells can be labeled by expression of IDFPs to monitor the uptake and expression of transgene constructs, including plasmid-based and retroviral constructs. Cells may also be labeled to facilitate subsequent FACS analysis in a mixed population.
  • an IDFP-encoding construct is introduced to cells by standard methods appropriate to that cell type.
  • selection for cells receiving the construct can either be performed by standard positive or negative selection based on additional selectable marker sequences (e.g., antibiotic resistance genes), by sorting or selection by FACS, or by allowing cells to form colonies and isolating those colonies that fluoresce when irradiated with light within the excitation spectrum of the IDFP. Maintaining the cells under conditions permitting the expression of the IDFP will permit the detection of the cells by fluorescence.
  • Fluorescently labeled proteins are often used to examine the sub-cellular localization of proteins of interest. Frequently, it is useful to monitor the localization of two or more proteins or protein domains simultaneously, for example, as a means of identifying relationships between the proteins.
  • the sensitivity of the localization assay can be adversely affected by heterodimerization between the fluorescent polypeptides.
  • proteins to be monitored for localization include proteins that are recruited to the vicinity of the plasma membrane upon a stimulus such as growth factor engagement of a receptor (e.g., G-proteins, Protein Kinase A, SH2-domain containing proteins, etc.), proteins that localize to the nucleus in response to a stimulus (e.g., steroid hormone receptor), or proteins that localize to the Golgi, mitochondria, nuclear pores or any other subcellular locale.
  • two IDFP fusion constructs, each comprising sequences encoding one of the proteins of interest, are introduced to cells, either simultaneously or sequentially, using standard methods appropriate for that cell type.
  • the localization of the IDFP-tagged proteins is monitored by fluorescence microscopy using excitation wavelengths and filter sets appropriate for the different fluorophores. While not wishing to exclude the possibility, it is generally not necessary that the two IDFPs be fluorescent donor and acceptor to each other. More frequently, unless one is assaying for direct interaction of the proteins, it is preferred that the fluorescent proteins are not related to each other in this manner.
  • the IDFP fusion protein constructs are made using standard methods well known in the art. Examples of pairs of fluorescent polypeptides that are well suited for simultaneous monitoring of localization include, but are not limited to any of S72A,

Abstract

The invention relates to proteins or polypeptides that comprise intramolecular dimers of fluorescent protein monomers. More specifically, the invention relates to recombinant polypeptides comprising a monomer of a fluorescent polypeptide, a linker peptide, and a second monomer of that fluorescent polypeptide, where the monomers form an intramolecular dimer. The invention also relates to nucleic acids encoding Intramolecular Dimer Fluorescent Proteins (IDFPs) and vectors comprising such nucleic acids. The invention further relates to methods of making IDFPs and methods of using them. IDFPs are useful in any application suited for fluorescent proteins and are particularly useful in applications in which more than one fluorescent protein sharing complementary dimerization interfaces is present in the same mixture or is expressed in the same cell, because IDFPs do not form heterodimers.

Description

DIMERIC FLUORESCENT POLYPEPTIDES
BACKGROUND OF THE INVENTION
Fluorescent proteins are widely used in the fields of biochemistry, molecular and cell biology, medical diagnostics and drug screening methodologies (Chalfie et al., 1994, Science 263: 802-805; Tsien, 1998, Ann. Rev. Biochem. 67: 509-544). One property shared by the most useful fluorescent proteins is that they require no host-encoded co-factors or substrates for fluorescence. The proteins therefore retain their fluorescent properties both in isolation from their native organism, and when expressed in the cells of other organisms. This property makes them particularly well suited for a variety of in vivo and in vitro applications. Another major advantage of fluorescent proteins for use in biological systems is that they are indeed proteins, which permits their synthesis within cells or organisms of interest, avoiding a host of problems relating to the attachment of the label to a protein of interest and/or delivery of labeled proteins into a cell. Not only can the proteins be made within the desired cell or organism, but they also retain their fluorescent properties when expressed as fusions with other proteins of interest, which greatly enhances their utility both in vivo and in vitro.
Fluorescent proteins have been used as reporter molecules to study gene expression in culture as well as in transgenic animals by insertion of fluorescent protein coding sequences downstream of an appropriate promoter. They have also been used to study the subcellular localization of proteins by direct fusion of test proteins to fluorescent proteins, and fluorescent proteins have become the reporter of choice for monitoring the infection efficiency of viral vectors both in cell culture and in animals. Variants of fluorescent proteins exhibiting spectral shifts in response to changes in the cellular environment (e.g., changes in pH, ion flux, or the redox status of the cell) are also used to monitor such changes (see, for example, Inouye & Tsuji, 1994, FEBS Lett. 351: 211-214; Miyawaki et al., 1997, Nature 388: 882-887). Perhaps the most promising role for fluorescent proteins as biochemical markers is their application to methods that exploit fluorescence resonance energy transfer (FRET). FRET occurs with fluorophores for which the emission spectrum of one fluorophore overlaps with the excitation spectrum of a second fluorophore. When such fluorophores are brought into close proximity, excitation of the "donor" fluorophore results in emission from the "acceptor". Pairs of such fluorophores are thus useful for monitoring molecular interactions. Fluorescent proteins are useful for the analysis of protein:protein molecular interactions in vivo or in vitro if their respective fluorescent emission and excitation spectra overlap to allow FRET. The donor and acceptor fluorescent proteins may be produced as fusions with the proteins one wishes to analyze for interactions. These types of applications of fluorescent proteins are particularly appealing for high throughput analyses, since the readout is direct and independent of subcellular localization.
The prototypical fluorescent protein is the Aequorea victoria green fluorescent protein (GFP), which was the first green fluorescent protein cloned (Prasher et al., 1992, Gene 111: 229-233). Purified A. victoria GFP is a monomeric protein of about 27 kDa that absorbs blue light with an excitation wavelength maximum of 395 nm, with a minor peak at 470 nm, and emits green fluorescence with an emission wavelength of about 510 nm and a minor peak near 540 nm (Ward et al., 1979, Photochem. Photobiol. Rev. 4: 1-57). The polypeptide has several drawbacks, including relatively broad excitation and emission spectra, low quantum yield, and low expression in cells of higher eukaryotes. Mutants with improved spectral characteristics and higher quantum yield have been identified, and expression in higher eukaryotes has been improved by "humanizing" the nucleic acid sequences to encode codons optimized for human or mammalian expression.
Additional fluorescent proteins include, but are not limited to, those expressed by Discosoma sp. and Phialidium gregarum (Ward et al., 1982, Photochem. Photobiol. 35: 803-808; Levine et al., 1982, Comp. Biochem. Physiol. 72B:77-85). Also, Vibrio fischeri strain Y1 expresses a yellow fluorescent protein that requires flavins as a co-factor for its fluorescence (Baldwin et al., 1990, Biochemistry 29: 5509-5515).
Additional cloned fluorescent proteins include, for example, the green fluorescent proteins from the sea pansy, Renilla mullerei (WO/99/49019) and from Renilla reniformis (see SEQ ID NO: 1; Figure 1). Each of these fluorescent proteins and others are useful for a variety of in vivo and in vitro uses. The R. reniformis GFP (rGFP) clone is particularly important, since rGFP is seen as the benchmark protein among known naturally-occurring fluorescent proteins. rGFP has 3 to 6-fold higher quantum yield than A. victoria GFP, and the excitation and emission spectra are narrower, making rGFP more suitable for applications involving, for example, FRET.
One major drawback shared by the GFPs from A. victoria, R. mullerei and R. reniformis, as well as by all known variants of those proteins, is that they are dimeric. Generally, the proteins exist as homodimers. However, when more than one form of a given GFP is expressed in a single cell or is mixed in vitro, heterodimers can form if the dimerization interfaces for the different fluorescent proteins are complementary. Heterodimerization interferes with the usefulness of fluorescent proteins for several reasons.
First, heterodimerization is undesirable when fluorescent proteins are used in energy transfer-based analyses because heterodimerization raises the background of acceptor fluorescence without a real interaction between the proteins or protein domains of interest. When FRET is used, for example to monitor protein:protein interactions, donor and acceptor fluorescent fusion proteins are often expressed in the same cell or otherwise mixed. In the absence of heterodimerization, the excitation of the donor fluorophore leads to emission by the acceptor fluorophore only if the two fusion proteins are in close apposition. However, if heterodimerization occurs between the differing fluorescent proteins (e.g., between a wild-type rGFP and an rGFP variant that is a fluorescence donor to the wild-type GFP), excitation of the donor will result in emission by the acceptor regardless of the interaction between the fused polypeptides being examined for interaction. This generates an unacceptably high background fluorescence from the acceptor fluorophore.
Another problem caused by heterodimerization is that the dimerization interfaces between the proteins can serve to artifactually bring fusion polypeptides linked to the fluorescent protein monomers into close contact. The inappropriate recruitment of proteins into close apposition can have biological consequences that make data interpretation difficult. For example, some cell surface receptors gain the ability to initiate an intracellular signaling cascade following ligand-induced dimerization. If the dimerization interfaces of the fluorescent proteins inappropriately recruit the fused receptor monomers into close contact, the signaling cascade can be inappropriately initiated in the absence of ligand. There is a need in the art for fluorescent proteins that do not heterodimerize.
U.S. Patent No. 5,981,200 (Tsien et al.) teaches donor and acceptor fluorescent proteins linked by a peptide linker. The linked donor and acceptor proteins, referred to as "tandem fluorescent proteins," are taught to be useful for assaying enzymes capable of cleaving the linker peptide sequence. When linked, the tandem fluorescent proteins exhibit either no fluorescence (e.g., when one protein quenches the fluorescence of the other) or fluorescence characteristic of the acceptor. Following cleavage, the fluorescence emitted is that characteristic of the individual fluorescent proteins. Assays using this arrangement will not work unless the tandem fluorescent proteins are related as donor and acceptor.
SUMMARY OF THE INVENTION
The invention encompasses a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore.
In one embodiment, the first polypeptide and the second polypeptide are peptide bonded to each other via a linker sequence.
In another embodiment, the recombinant fusion polypeptide further comprises a third polypeptide peptide bonded to the recombinant fusion polypeptide. The third polypeptide can be peptide bonded to the recombinant fusion polypeptide either directly or through a peptide linker sequence. A recombinant fusion polypeptide of this embodiment is referred to in this summary as a "fluorescent polypeptide fusion." In a preferred embodiment, the third polypeptide is fused to the amino terminus of the first polypeptide. In another preferred embodiment, the third polypeptide is fused to the carboxy terminus of the second polypeptide sequence.
In an additional preferred embodiment, the third polypeptide is a member of a specific binding pair.
In another embodiment, one or both of the first and second polypeptides is a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
In another embodiment, both of the first and second polypeptides are a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
The invention further encompasses a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore. In one embodiment, the first polypeptide and the second polypeptide encoded by the polynucleotide are peptide bonded to each other via a linker sequence. In a preferred embodiment, the linker sequence encoded by the polynucleotide is from 5 to 50 amino acids long. In a further preferred embodiment, the linker sequence comprises one or more iterations of a peptide, for example the peptide RARDPRVPVAT (i.e., Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr). In a further preferred embodiment, the linker sequence is selected from the group consisting of (Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr)n, (Gly-Ser)n, (Thr-Ser-Pro)n, (Gly-Gly-Gly)n, and (Glu-Lys)n, wherein n is 1 to 15.
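By way of illustration only, the repeat-unit linkers recited above can be enumerated and checked against the preferred 5 to 50 amino acid length range. The following is a minimal Python sketch; the function names and the use of one-letter amino acid codes are assumptions made for the example and are not part of the disclosure.

```python
# Illustrative sketch only; helper names are hypothetical.
# Repeat units from the text, written in one-letter amino acid code:
# RARDPRVPVAT, Gly-Ser, Thr-Ser-Pro, Gly-Gly-Gly, Glu-Lys.
REPEAT_UNITS = ("RARDPRVPVAT", "GS", "TSP", "GGG", "EK")


def build_linker(unit, n):
    """Return the linker (unit)n for one of the recited repeat units, n = 1 to 15."""
    if unit not in REPEAT_UNITS:
        raise ValueError("unknown repeat unit: %r" % unit)
    if not 1 <= n <= 15:
        raise ValueError("n must be from 1 to 15")
    return unit * n


def in_preferred_length_range(linker, shortest=5, longest=50):
    """True if the linker length falls within the preferred 5 to 50 residue range."""
    return shortest <= len(linker) <= longest


def preferred_linkers():
    """Enumerate every (unit, n) combination whose length is 5 to 50 residues."""
    return {
        (unit, n): build_linker(unit, n)
        for unit in REPEAT_UNITS
        for n in range(1, 16)
        if in_preferred_length_range(build_linker(unit, n))
    }
```

Under these assumptions, (Gly-Ser)3 is six residues long and falls within the preferred range, whereas five iterations of the eleven-residue RARDPRVPVAT peptide give 55 residues and fall outside it.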
In another embodiment, the polynucleotide further encodes a third polypeptide peptide bonded to the recombinant fusion polypeptide. The third polypeptide encoded by the polynucleotide may be joined directly or via an encoded peptide linker.
In a preferred embodiment, the third polypeptide encoded by the polynucleotide is a member of a specific binding pair. It is alternatively preferred that the third encoded polypeptide is fused to the amino terminus of the first polypeptide. It is additionally preferred that the third encoded polypeptide is fused to the carboxy terminus of the second polypeptide.
In another preferred embodiment, one or both of the first and second polypeptides is a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
In another preferred embodiment, both of the first and second polypeptides are a monomer of one of R. reniformis GFP, R. mulleri GFP or A. victoria GFP.
The invention further encompasses a vector comprising a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore. The invention further encompasses a cell comprising a vector comprising a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore.
In one embodiment, the cell is a bacterial cell.
In another embodiment, the cell is a eukaryotic cell. In a preferred embodiment, the eukaryotic cell is a yeast cell, an insect cell, or a mammalian cell.
The invention further encompasses a pair of polypeptides comprising a polypeptide labeled with a fluorescent dye and a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, wherein the fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, and wherein the fluorescent dye and the recombinant fusion polypeptide are fluorescent donor and acceptor to each other.
The invention further encompasses a pair of recombinant fusion polypeptides comprising (a) a first fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the first fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, and (b) a second fusion polypeptide comprising a third polypeptide peptide bonded to a fourth polypeptide, wherein the third and fourth polypeptides are found in nature as monomers of a multimeric protein, and wherein the second fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, wherein the first fusion polypeptide and the second fusion polypeptide are fluorescent donor and acceptor to each other.
In one embodiment, each of the first and second fusion polypeptides further comprises an additional fused (third) polypeptide, wherein the additional fused polypeptide of the first fusion polypeptide comprises a sequence which is different from the additional fused polypeptide of the second fusion polypeptide.
The invention further encompasses a method of producing a fluorescently labeled recombinant fusion polypeptide, the method comprising the steps of introducing to a cell a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore, and culturing the cell under conditions that permit the synthesis of the recombinant fusion polypeptide, whereby the recombinant fusion polypeptide is produced.
The invention further encompasses a method of labeling a cell with a fluorescent recombinant fusion polypeptide, the method comprising the steps of: a) introducing to a cell a polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein the first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein the recombinant fusion polypeptide is fluorescent when exposed to light of an excitation wavelength or when interactive with an excited donor fluorophore; and b) culturing the cell under conditions that permit the synthesis of the recombinant fusion polypeptide, whereby the cell is labeled with the fluorescent recombinant fusion polypeptide. In a preferred embodiment, in the introducing step (a), the polynucleotide introduced to the cell further comprises a sequence encoding a third polypeptide fused in frame to the sequence encoding the recombinant fusion polypeptide.
The invention further encompasses a method of monitoring the interaction of two polypeptides of interest, the method comprising the steps of: a) contacting a fluorescent polypeptide fusion, as described above, and a second polypeptide wherein: i) the fluorescent polypeptide fusion comprises a first polypeptide of interest; ii) the second polypeptide comprises a second polypeptide of interest and is fluorescently labeled; and iii) the fluorophores comprised by the fluorescent polypeptide fusion and the second polypeptide are fluorescent donor and fluorescent acceptor to each other; b) exciting the donor fluorophore; and c) detecting fluorescent emission from the fluorescent acceptor, wherein the emission is indicative of the interaction of the first and the second polypeptides of interest.
In one embodiment, the second polypeptide comprises a second fluorescent polypeptide fusion, as described above, wherein the polypeptide of interest of the second fluorescent polypeptide fusion is different from the polypeptide of interest of the first fluorescent polypeptide fusion.
In one embodiment, the contacting step is performed in vitro.
In another embodiment, the contacting step is performed in a cell. In a preferred embodiment, the contacting comprises the step of introducing nucleic acid encoding the polypeptides to a cell.
The invention further encompasses a method of screening for a compound that modulates the interaction of a first and a second member of a specific binding pair, the method comprising the steps of: a) contacting a first polypeptide and a second polypeptide in the presence and absence of a candidate modulator wherein: i) the first polypeptide is a fluorescent polypeptide fusion, as described above, wherein the third polypeptide is the first member of a specific binding pair; ii) the second polypeptide is fluorescently labeled and comprises the second member of a specific binding pair; and iii) the fluorophores comprised by the first and second polypeptides are fluorescent donor and acceptor to each other; b) exciting the donor fluorophore; and c) detecting the fluorescence of the acceptor fluorophore, wherein emission of the spectrum characteristic of the fluorescent acceptor indicates the interaction of the first and the second members of the specific binding pair, and wherein a change in the interaction in the presence of the candidate modulator indicates that the candidate modulator modulates the interaction of the members of the specific binding pair.
In one embodiment, the second polypeptide is a fluorescent polypeptide fusion, as described above, which comprises the second member of a specific binding pair.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 shows the polynucleotide sequence of R. reniformis GFP (SEQ ID NO: 1).
Figure 2 shows the amino acid sequence of R. reniformis GFP (SEQ ID NO: 2).
Figure 3 shows the polynucleotide and amino acid sequences for hrGFP, a humanized R. reniformis GFP. The polynucleotide sequence is SEQ ID NO: 3, and the amino acid sequence is SEQ ID NO: 4.
Figure 4 shows a schematic diagram of a construct encoding an IDFP of the invention. "CMV" refers to the cytomegalovirus promoter, "MCS" refers to a multiple cloning sequence and "pA" refers to a poly(A) addition site sequence. "hrGFP" represents one monomer of the humanized R. reniformis GFP, and "linker" refers to a peptide or polypeptide linker sequence. A, B, and C show examples of linker peptide sequences.
Figure 5 shows relationships between emission and excitation peaks for donor and acceptor fluorophores capable of FRET.
DETAILED DESCRIPTION OF THE INVENTION
All patents and patent applications, both U.S. and international, and all literature publications referred to herein are hereby incorporated in their entirety within this document by reference.
Definitions
As used herein, a recombinant fusion polypeptide is "fluorescent when excited".
As used herein, the term "excited" refers to a fluorophore that is exposed to light of an excitation wavelength or to an acceptor fluorophore that is interactive with an excited donor fluorophore.
The phrase "fluorescent when excited" means that when the recombinant fusion polypeptide is exposed to light of an excitation wavelength or when the polypeptide interacts with an excited donor fluorophore, the polypeptide fluoresces. "Exposed to light of an excitation wavelength" means irradiated with light (electromagnetic radiation) within a given spectrum of wavelengths that is absorbed by the polypeptide such that the polypeptide emits light having a different spectrum of wavelengths, and thus fluoresces. Fluorescent emission occurs at a longer wavelength than does excitation.
A recombinant fusion polypeptide according to the invention has three properties: 1) it must emit light upon irradiation with light of a given wavelength or wavelengths; 2) it must have the capacity to form an intramolecular homodimer as defined herein above; and 3) the first and second polypeptide monomers that constitute the fusion polypeptide cannot function as fluorescent donor and fluorescent acceptor, respectively, in the context of fluorescence resonance energy transfer.
As used herein, the term "light of an excitation wavelength" refers to those wavelengths of light that are absorbed by and excite a given fluorophore to emit fluorescence. These wavelengths are described in detail herein below. Light of an appropriate portion of the spectrum is synonymous with light within the excitation spectrum of a given fluorophore.
As used herein, the term "excited donor fluorophore" refers to a fluorophore which has absorbed energy within its excitation spectrum. An excited donor fluorophore can transmit energy sufficient to excite an acceptor fluorophore.
As used herein, the term "fluorescent dye" refers to a non-polypeptide chemical moiety that, upon absorption of light energy of a particular wavelength or wavelengths, emits light at another wavelength or that emits light when paired with an appropriate excited donor fluorophore.
When referring to members of a pair of fluorophores (i.e., fluorescent dyes or polypeptides) that can undergo fluorescence resonance energy transfer (FRET), the fluorophore that emits at a wavelength or spectrum of wavelengths that excites the other member of the pair is referred to as the "fluorescent donor" or "fluorescence donor". Conversely, the member of the pair that emits in response to excitation by the fluorescence donor is termed the "fluorescent acceptor" or "fluorescence acceptor". The members of such a pair are said to be "fluorescent donor and acceptor to each other." According to the invention, the fluorescence donor and fluorescence acceptor polypeptides are not linked by peptide bonds. In one embodiment of the invention, either of the fluorescence donor or acceptor, but not both, may be a non-polypeptide fluorescent dye (also not covalently linked to each other).
As used herein, the term "fluorescently labeled" means, when referring to a polypeptide, that the polypeptide is covalently attached to a fluorescent moiety. A polypeptide may be fluorescently labeled by covalent attachment to a non-polypeptide fluorescent dye, or alternatively, by expression as a fusion protein with a fluorescent polypeptide. In nature and as used herein, a fluorescent polypeptide is distinguished from a luminescent polypeptide in that a fluorescent polypeptide requires an input of electromagnetic energy in order to emit light, while a luminescent polypeptide emits light in response to release of chemical energy. A luminescent polypeptide may serve as a donor of excitation energy for a fluorescent polypeptide (in fact, this is exactly what happens in nature when, for example, Renilla luciferase emits energy that excites Renilla GFP). A fusion polypeptide according to the invention may or may not be luminescent.
As used herein, the term "recombinant" refers to a polynucleotide that has been isolated from its natural environment using recombinant DNA techniques, or synthesized, or to a polypeptide expressed from such a polynucleotide. A recombinant polypeptide may be identical to or different from a naturally occurring polypeptide, as long as it is expressed from a recombinant polynucleotide.
As used herein, the term "monomer" refers to a single polypeptide molecule that exists as a dimer or heterodimer or other multimer (e.g., a trimer, tetramer, pentamer, etc.) in a multimeric protein. A "monomer" interacts with another monomer, e.g., in a dimer, via a specific sequence referred to herein by the equivalent terms "interaction domain" and "interaction interface". In a "dimer" the appropriate equivalent terms for the sequences that mediate the interaction are "dimerization domain" and "dimerization interface."
A monomer of a fluorescent polypeptide may be full length, for example, as the polypeptide occurs in nature, or it may be longer or shorter than the naturally occurring polypeptide, so long as it retains the two requisite properties.
A recombinant fusion polypeptide according to the invention may comprise first and second polypeptides which exist in nature as non-peptide-bonded monomers of a multimeric protein. Thus, the term "monomer" is used with respect to what is found in nature. In a fusion polypeptide according to the invention, these first and second polypeptides are peptide bonded and form a single chain polypeptide. However, the peptide-bonded first and second polypeptides retain the ability, independently, to interact with a donor or acceptor fluorophore and fluoresce. This is believed to be a result of the intramolecular interaction of the monomers and the ability of the intramolecular dimer thus formed to be excited at an excitation wavelength of light and to act as a fluorescent donor or acceptor.
As used herein, the term "linker sequence" refers to a sequence of peptide bonded amino acids that joins or links by peptide bonds two amino acid sequences or polypeptide domains that are not joined by peptide bonds in nature. A linker sequence is encoded in frame on a polynucleotide between the sequences encoding the two polypeptide domains joined by the linker. A linker is preferably 5 to 50 amino acids in length, more preferably 10 to 20 amino acids in length. Examples of linkers useful in the invention are the Gly-Ala linkers taught by Huston et al., U.S. Patent No. 5,258,498, incorporated herein by reference. Additional useful linkers include, but are not limited to, (Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr)1-5 (Xu et al., 1999, Proc. Natl. Acad. Sci. U.S.A. 96: 151-156), (Gly-Ser)n (Shao et al., 2000, Bioconjug. Chem. 11: 822-826), (Thr-Ser-Pro)n (Kroon et al., 2000, Eur. J. Biochem. 267: 6740-6752), (Gly-Gly-Gly)n (Kluczyk et al., 2000, Peptides 21: 1411-1420), and (Glu-Lys)n (Kluczyk et al., 2000, supra), wherein n is 1 to 15.
As used herein, the term "specific binding pair" refers to a pair of polypeptides that physically interact in a specific manner that gives rise to a biological activity, that is, to the substantial exclusion of other polypeptides. Members of a specific binding pair interact through complementary interaction domains, such that they interact to the substantial exclusion of proteins that do not have a complementary interaction domain. Non-limiting examples of specific binding pairs include antibody-antigen pairs, enzyme-substrate pairs, dimeric transcription factors (e.g., AP-1, composed of Fos specifically bound to Jun via a leucine zipper interaction domain) and receptor-ligand pairs.
As used herein, the term "amino terminus" refers to the last amino acid at the amino end of a polypeptide, where the last amino acid is not peptide bonded to another amino acid.
As used herein, the term "carboxy terminus" refers to the last amino acid at the carboxyl end of a polypeptide, where the last amino acid is not peptide bonded to another amino acid.
As used herein, the term "labeling a cell" refers to the expression of a fluorescent polypeptide in a cell, such that the cell is detectable by irradiating the cell with light within the excitation spectrum of the fluorescent polypeptide and monitoring or detecting emission within the emission spectrum of the polypeptide. A cell may be labeled by expression of a fluorescent polypeptide that localizes anywhere in the cell, including, but not limited to, the cell surface, the cytoplasm, the nucleus or particular organelles such as mitochondria, lysosomes, endosomes, Golgi apparatus, endoplasmic reticulum or other specific sub-cellular locale.
As used herein, the term "introducing a nucleic acid into a cell" or "introducing a polynucleotide into a cell" refers to the process whereby a recombinant polynucleotide is put into a cell. Methods for introducing a nucleic acid to a cell will vary with the nature of the cell and the nature of the chosen vector, but one of skill in the art may readily select and employ a known method appropriate for a given cell type and vector.
As used herein, the term "culturing a cell under conditions that permit the synthesis of a recombinant polypeptide" refers to the maintenance of cells comprising a polynucleotide encoding a recombinant polypeptide in growth medium and under environmental conditions (e.g., temperature, pH, redox and osmotic conditions, O2 and CO2 concentrations and presence or absence of an effective concentration of an appropriate expression-modulating agent such as IPTG or tetracycline) conducive to the synthesis of the recombinant polypeptide. One of skill in the art is assumed to be capable of maintaining yeast, insect, mammalian or other cells under conditions that permit the synthesis of a recombinant polypeptide according to the invention.
As used herein, the term "monitoring the interaction" refers to the process whereby the physical association of two polypeptides, or of a polypeptide and another entity, is measured. As relates to the invention, the term refers most frequently to detection or measurement of association or interaction using FRET.
As used herein, the term "intramolecular dimer" refers to a dimer formed by the covalent peptide linkage of two polypeptide monomers. An "intramolecular dimer fluorescent protein" (IDFP) is an intramolecular dimer in which the linked polypeptides which exist in nature as monomers of a multimeric protein are fluorescent polypeptides. According to the invention, the linked monomers of an IDFP are not fluorescent donor and acceptor to each other. An "IDFP fusion protein" is an IDFP which is fused to a protein of interest or to a fragment of a protein of interest.
As used herein, the term "protein of interest" refers to a polypeptide, or a domain (fragment) of a polypeptide, that is selected to be fused to an IDFP. Any polypeptide or fragment of a polypeptide for which a polynucleotide sequence is known can be fused to an IDFP by standard techniques known in the art. A protein of interest according to the invention either does not alter the fluorescence characteristics of the fused IDFP, or, if it does alter those characteristics, the alteration is such that the alteration does not interfere with the intended use of the IDFP fusion protein.
As used herein, the term "detecting fluorescence" refers to the process whereby the fluorescent emission by a fluorescent polypeptide is measured or determined. Fluorescence detection methods include quantitative and qualitative methods adapted for standard or confocal microscopy, FACS analysis, and those adapted for high throughput methods involving multiwell plates, arrays or microarrays. One of skill in the art can select appropriate filter sets and excitation energy sources for the detection of fluorescent emission from a given fluorescent polypeptide or dye.
As used herein, the term "candidate modulator" refers to an agent being evaluated for its effect on the function of a polypeptide or the interaction of members of a specific binding pair. Exemplary sources and types of candidate modulators useful according to the invention are described herein below.
As used herein, the term "change in interaction" or "modulation of interaction" refers to an increase or decrease in the level of interaction detected between members of a specific binding pair. As used herein, the level of interaction is considered increased if the detected interaction goes up by at least 10%, and preferably by 20%, 35%, 50%, 75%, or more, up to and including 2-fold, 5-fold, 10-fold, 20-fold, 50-fold or more relative to a standard. As used herein, the level of interaction is considered decreased if the detected interaction goes down by at least 10%, and preferably by 20%, 35%, 50%, 75%, 90%, 95%, 98%, 99% or more, up to and including 100% (no interaction) relative to a standard.
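The thresholds in the preceding definition can be expressed as a simple comparison against a standard. The following Python sketch is illustrative only: the function name, the use of a single scalar signal as the measure of interaction, and the treatment of the 10% figure as an inclusive cutoff are assumptions made for the example.

```python
# Illustrative sketch only; the function name and scalar-signal model are hypothetical.
def classify_interaction_change(signal, standard, threshold=0.10):
    """Classify a detected interaction level relative to a standard.

    The level is treated as increased or decreased when it differs from the
    standard by at least the threshold fraction (10% by default).
    """
    if standard <= 0:
        raise ValueError("standard must be a positive signal level")
    fractional_change = (signal - standard) / standard
    if fractional_change >= threshold:
        return "increased"
    if fractional_change <= -threshold:
        return "decreased"
    return "unchanged"
```

In a screening context, the standard would typically be the signal measured in the absence of the candidate modulator, and the signal would be the measurement made in its presence.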
As used herein, the term "single polypeptide chain" refers to a polypeptide chain in which all amino acids are linked sequentially by peptide bonds. A "single polypeptide chain" is one generated by translation of a single mRNA template and may encompass one or more polypeptide domains, including one or more repeats of the sequence comprising one polypeptide or polypeptide domain.
As used herein, the term "polypeptide domain" refers to a sequence of amino acids that exhibits one or more discrete binding or functional properties. As used herein, binding or functional properties include binding to one or more polypeptides, modulation of the binding of one or more polypeptides, recognition by an antibody or antigen binding fragment thereof, binding to a coenzyme, ion, or other ligand, catalytic activity or inhibition of catalytic activity, fluorescence and luminescence. In this context, non-limiting examples of polypeptide domains include a DNA binding domain and a kinase domain.
As used herein, the term "homodimer" refers to a protein complex comprised of two identical copies of the same monomer.
As used herein, the term "interact" means that two molecular species physically associate with each other. The association that is characterized as an interaction can involve charge-charge interactions, charge-dipole interactions, dipole-dipole interactions, van der Waals forces, hydrogen bonding and/or hydrophobic forces.
As used herein, the term "specific binding" means the specific recognition of one of two different molecules for the other compared to substantially less recognition of other molecules. Members of a specific binding pair have a particular affinity for each other that gives rise to a biological activity. Generally, the molecules have areas on their surfaces or in cavities giving rise to specific recognition between the two molecules. Exemplary of specific binding are antibody-antigen interactions, enzyme-substrate interactions, polynucleotide interactions, and so forth.
As used herein, the term "specifically dimerize" means that two monomers useful in the invention interact via an interaction domain present on each monomer, to the substantial exclusion of polypeptides lacking that interaction domain. "Specifically homodimerize" means that the monomers that interact via a shared interaction domain, to the substantial exclusion of polypeptides lacking that interaction domain, form a homodimer as defined herein. "Substantial exclusion" means that at a given time in a sample, less than 0.1% of the monomers, and preferably less than 0.01%, 0.001% or fewer monomers are physically associated with polypeptides that do not have a complementary interaction domain.
As used herein, the term "variant" refers to a polypeptide that differs in amino acid sequence from a parent polypeptide yet retains the function of the parent polypeptide. A variant fluorescent polypeptide may, for example, have one or more amino acid insertions, deletions or substitutions that do not alter the ability of the polypeptide to emit fluorescence upon excitation or interaction with a donor or acceptor fluorophore. A variant fluorescent polypeptide according to the invention has the ability to form an intramolecular homodimer as defined herein.
As used herein, the term "derived from" refers to a polypeptide that differs in amino acid sequence from a reference polypeptide used as the template or starting sequence for generating or deriving the differing sequence. For example, a fluorescent polypeptide can be derived from a wild-type fluorescent polypeptide (i.e., a reference polypeptide) by random or site-directed mutagenesis, including insertions, deletions or truncations or fusions. A fluorescent polypeptide derived from a wild-type polypeptide can have different fluorescence characteristics than the wild-type polypeptide.
As used herein, the term "fluorescence characteristic" refers to a property of the excitation or emission by a fluorescent polypeptide. Fluorescence characteristics include, for example, the wavelength(s) at which a fluorescent polypeptide is excited or at which it emits (including the breadth and amplitudes of the spectra for each), the extinction coefficient or intensity of the emission, quantum yield or the efficiency of emission, and resistance or susceptibility to photobleaching. Table 2 provides examples of excitation maxima, emission maxima, extinction coefficient and quantum yield for a variety of fluorescent polypeptides.
As used herein, the term "spectrum characteristic of a fluorescent acceptor" refers to the emission spectrum of a given fluorophore that is being used as the fluorescence acceptor in an acceptor/donor pair.
Detailed Description of the Invention
In one aspect, the invention relates to dimeric fluorescent proteins that avoid the problems caused by heterodimerization. In this aspect, heterodimerization is avoided by fusing two monomers of the fluorescent polypeptide using a linker peptide. The close spatial relationship of the fused monomers strongly favors the formation of a dimer between the two fused monomers, to the essential exclusion of other monomers sharing a similar dimerization interface. The interaction of the fused monomers via their respective dimerization interfaces is referred to herein as "intramolecular dimerization". An intramolecular dimer fluorescent protein (IDFP) does not comprise fluorescent monomers that are related to each other as donor and acceptor. That is, the monomers that are linked in an IDFP cannot undergo FRET between them. IDFPs may be co-expressed within the same cell or otherwise mixed with distinct fluorescent proteins comprising the same fluorescent protein dimerization interfaces without encountering the problems caused by heterodimer formation.
In order to make an IDFP, the nucleic acid encoding a monomer of a fluorescent protein is joined in frame at its 3' end to a sequence encoding a peptide linker, which is itself joined in frame to another copy of the nucleic acid encoding the monomer. This sequence may and often will be additionally linked in frame to a sequence encoding a polypeptide of interest, for example, a polypeptide being investigated for interaction with another protein. Translation of the mRNA encoded by such a nucleic acid construct generates the fluorescent monomers in such close proximity to each other that intramolecular homodimerization of the monomers is very strongly favored over intermolecular heterodimerization. The resulting polypeptide therefore comprises an intramolecular homodimer of the fluorescent protein monomers, fused to a protein of interest.
Fluorescent Proteins Useful According to the Invention
Any fluorescent protein that homodimerizes in a cell can be useful in generating an IDFP of the invention. GFPs from Aequorea victoria, Renilla reniformis and Renilla mulleri, among others, are homodimers as they exist in nature. Any of these proteins, and any mutants or engineered versions of these proteins that retain the ability to homodimerize may be used to generate an IDFP of the invention. In order to generate an IDFP according to the invention, the fluorescent protein or the natural protein it was derived from (e.g., R. reniformis GFP) must form homodimers when expressed in a monomeric form. It is generally known in the field whether a given protein exists as a homo- or heterodimer in vivo or if it has the capacity to homodimerize. In the event that such knowledge is not available, there are a number of ways in which one of skill in the art may determine whether a particular fluorescent protein homodimerizes. First, biophysical methods such as X-ray crystallography, nuclear magnetic resonance, radiation target analysis or mass spectrometry can be used to determine whether a polypeptide dimerizes.
A biochemical approach is to fractionate samples of purified proteins by size selection gel chromatography under denaturing versus non-denaturing conditions and analyze fractions for the fluorescent protein by fluorescence. If the fluorescent protein migrates at a larger size (approximately twice as large) under non-denaturing conditions relative to denaturing conditions, it is an indication that the protein is a dimer under native conditions. Examples of commonly used matrices include, for example, Sephadex (G10-G200), Bio-Gel (P-2 - P-300) and Sepharose (2B, 4B, etc.) matrices. One of skill in the art may readily select a size separation matrix appropriate for such analyses. If performed with purified protein this method can indicate whether or not a polypeptide homodimerizes. If the method is applied to non-purified protein, for example, to protein extracts, the assay only indicates that a dimer forms with some polypeptide, and further analysis is required to determine if the dimer is a homodimer.
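The size comparison described above reduces to checking whether the apparent size under non-denaturing conditions is roughly twice the size under denaturing conditions. The following Python sketch illustrates that comparison; the function name, the use of apparent molecular mass in kDa as the size measure, and the 25% tolerance around a two-fold ratio are assumptions made for the example rather than values taken from the text.

```python
# Illustrative sketch only; names and the tolerance value are hypothetical.
def looks_dimeric(native_kda, denatured_kda, tolerance=0.25):
    """Return True if the native apparent mass is about twice the denatured mass.

    'About twice' is taken here to mean a native/denatured ratio within
    the given fractional tolerance of 2.0.
    """
    if native_kda <= 0 or denatured_kda <= 0:
        raise ValueError("apparent masses must be positive")
    ratio = native_kda / denatured_kda
    return abs(ratio - 2.0) <= tolerance * 2.0
```

As noted in the text, a positive result with non-purified protein indicates only that some dimer forms; it does not by itself establish that the dimer is a homodimer.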
Another biochemical method of investigating dimer formation is to generate a truncated or elongated form of the protein and mix it, either by co-expression or by mixing of isolated proteins, with the wild-type protein. If homodimers can form, there will be three distinctly sized bands following native gel electrophoresis: 1) a homodimer of the wild-type; 2) a homodimer of the elongated or truncated form; and 3) an intermediate-migrating diagnostic heterodimer complex of the wild-type and the truncated or elongated form. In the absence of dimerization, only two bands will form, one for each form of the protein, and no intermediate band will appear.
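The band pattern expected from this mixing assay can be enumerated with a short Python sketch. The function name and the example masses are hypothetical, and the sketch assumes that migration tracks mass alone, whereas native gel mobility also depends on charge and shape.

```python
def expected_native_bands(wt_kda, variant_kda, dimerizes):
    """List expected species as (label, mass in kDa), sorted by mass.

    wt_kda and variant_kda are monomer masses of the wild-type protein and
    of the truncated or elongated form (illustrative inputs).
    """
    if dimerizes:
        bands = [
            ("wild-type homodimer", 2 * wt_kda),
            ("variant homodimer", 2 * variant_kda),
            ("wild-type/variant heterodimer", wt_kda + variant_kda),
        ]
    else:
        bands = [
            ("wild-type monomer", wt_kda),
            ("variant monomer", variant_kda),
        ]
    return sorted(bands, key=lambda band: band[1])


# Hypothetical masses: a 27 kDa wild-type and a 20 kDa truncated form.
with_dimers = expected_native_bands(27.0, 20.0, dimerizes=True)
without_dimers = expected_native_bands(27.0, 20.0, dimerizes=False)
```

The diagnostic feature is the third, intermediate species: its mass is the sum of the two monomer masses, so it falls between the two homodimers.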
Additionally, homodimer formation can be detected by analytical ultracentrifugation (Baird et al., 2000, Proc. Natl. Acad. Sci. U.S.A. 97(22): 11984-11989).
Examples of known fluorescent proteins that can be expressed as intramolecular dimers are as follows. SEQ ID NO: 1 (Figure 1) is the nucleotide sequence encoding wild-type rGFP, and SEQ ID NO: 2 (Figure 2) is the amino acid sequence of wild-type rGFP. A preferred embodiment of the IDFP comprises two copies of the wild-type rGFP polypeptide, linked by a peptide linker sequence. Another embodiment encompasses the same rGFP IDFP additionally fused in frame to a protein of interest. Any protein derived from the rGFP of SEQ ID NO: 2 can be used to generate an IDFP of the invention as long as it retains the ability to homodimerize. In a preferred embodiment, the polynucleotide sequence encoding a fluorescent polypeptide (e.g., rGFP of SEQ ID NO: 2) is a humanized polynucleotide rGFP coding sequence, also referred to herein as hrGFP. Figure 3 shows a humanized polynucleotide sequence (hrGFP) and the rGFP sequence it encodes (SEQ ID NOs: 3 and 4, respectively).
The amino acid and nucleotide sequences of A. victoria GFP are known in the art (Prasher et al., 1992, supra) and vectors encoding a variety of mutant A. victoria-derived GFPs are also known and are frequently commercially available. For example, Heim et al. (1995, Nature 373: 663-664) teaches mutations at S65 of A. victoria GFP that enhance the fluorescence intensity of the polypeptide. The mutant containing the S65T mutation is particularly important, since its fluorescence is approximately 35 times as intense as wild-type A. victoria GFP, and its emission spectrum is shifted to the red, making it more amenable to standard rhodamine optics (excitation and emission maxima at 489 nm and 508 nm, respectively). An S65T mutant encoded by a construct comprising humanized codons is known as EGFP, or "enhanced GFP" (available from CLONTECH; see GenBank Accession No. U43284).
The EGFP mutant is the cornerstone of a series of commercially available GFP mutants that have differing emission spectra and other useful engineered properties (Cormack et al., 1996, Gene 173: 33-38; Yang et al., 1996, Nucl. Acids Res. 24: 4592-4593; Crameri et al., 1996, Nature Biotechnol. 14: 315-319). Each protein in the series contains mutations, in addition to the S65T and humanizing mutations, that alter the emission characteristics of the proteins. For example, the cyan fluorescent protein known as ECFP contains six mutations that shift the emission to cyan light (excitation and emission maxima at 434 nm and 477 nm, respectively; see GenBank Accession No. AB041904 and Sawano et al., 2000, Nucl. Acids Res. 28: e78). The blue fluorescent protein known as EBFP contains four mutations that shift the emission spectrum to blue (excitation and emission maxima at 380 nm and 440 nm, respectively). The yellow fluorescent protein known as EYFP (see Ormo et al., 1996, Science 273: 1392-1395, clone GFP-10C) contains mutations shifting the emission to yellow-green (excitation and emission maxima at 514 nm and 527 nm, respectively). EGFP, ECFP, EYFP and EBFP are all available from CLONTECH.
The S65 site has received considerable scrutiny for its role in determining the fluorescence characteristics of the A. victoria GFP molecule. Additional mutants at S65 include, for example, S65A, S65C and S65L, each of which has excitation and emission maxima that differ from wild-type A. victoria GFP (see Table 2). The nucleotide sequence encoding an S65A mutant is available as GenBank Accession No. U56996. One skilled in the art can introduce mutations necessary to alter S65 to any desired amino acid. Similarly, the additional point mutations detailed in Table 2 can be generated by one of skill in the art.
Other fluorescent proteins useful according to the invention include, for example, A. victoria-derived GFPs that are optimized for expression in plants (GenBank Accession No. U87625 and WO 96/27675), are less thermosensitive (GenBank Accession No. U87973), or are more soluble and emit blue fluorescence (GenBank Accession No. U70497). A. victoria GFPs targeted to specific organelles have also been described, such as those targeted to the mitochondria and the nucleus (Rizzuto et al., 1996, Curr. Biol. 6: 183-188). This listing is by no means exhaustive. There are, for example, a number of fluorescent protein variants, both derived from A. victoria and from other sources, that have been reported in or are the subject of U.S. and international patents and patent applications, for example, U.S. Patent Nos. 6,124,128, 6,066,476, 6,020,192, 5,804,387, 5,874,304, 5,968,738, 5,625,048, and 5,777,079, and PCT Application Nos. WO 98/21355, WO 98/06737, WO 97/20078, WO 97/42320 and WO 97/11094. Fluorescent protein variants are also described in a number of additional publications in the scientific literature, including, for example, Ehrig et al., 1995, FEBS Lett. 367: 163-166; Surpin et al., 1987, Photochem. Photobiol. 45 (Suppl): 95S; and Delagrave et al., 1995, BioTechnology 13: 151-154. Any and all of the fluorescent proteins taught in these sources and elsewhere are candidates for the generation of IDFPs of the invention, provided that they homodimerize and the sequences encoding them are known.
The red fluorescent protein from the Indo-Pacific sea anemone of the Discosoma species is also a candidate for IDFP generation according to the invention (see Matz et al., 1999, Nature Biotechnol. 17: 969-973). The sequence encoding the protein, known as "DsRed", is available at GenBank Accession No. AF272711, and vectors encoding the protein are commercially available (CLONTECH).
Linker Sequences Useful According to the Invention
Linker sequences useful according to the invention serve to join monomers in the dimeric fluorescent polypeptides of the invention. A linker is preferably about 5 to about 50 amino acids in length, and more preferably about 10 to about 20 amino acids in length. Examples of linkers useful in the invention are the Gly-Ala linkers taught by Huston et al., U.S. Patent No. 5,258,498, incorporated herein by reference. Additional useful linkers include, but are not limited to, (Arg-Ala-Arg-Asp-Pro-Arg-Val-Pro-Val-Ala-Thr)1-5 (Xu et al., 1999, Proc. Natl. Acad. Sci. U.S.A. 96: 151-156), (Gly-Ser)n (Shao et al., 2000, Bioconjug. Chem. 11: 822-826), (Thr-Ser-Pro)n (Kroon et al., 2000, Eur. J. Biochem. 267: 6740-6752), (Gly-Gly-Gly)n (Kluczyk et al., 2000, Peptides 21: 1411-1420), and (Glu-Lys)n (Kluczyk et al., 2000, supra), wherein n is 1 to 15 (each of the preceding references is also incorporated herein by reference).
Proteins of Interest
Frequently it will be advantageous to express an IDFP of the invention as a fusion with a protein of interest. The protein of interest can be any protein for which the nucleic acid sequence is known and for which that sequence or at least a relevant part of that sequence can be cloned into a vector encoding an IDFP. By relevant part is meant a domain of interest within a protein, for example, a domain being evaluated for protein-protein interactions or a domain with catalytic activity. As used herein, the term "protein of interest" or "domain of interest" refers to any polypeptide or protein, or polypeptide or protein domain, that one wishes to fuse to an IDFP molecule of the invention. The fusion of an IDFP with a polypeptide of interest may be through linkage of the IDFP sequence to either the N or C terminus of the fusion partner. Fusions comprising IDFP polypeptides of the invention need not comprise only a single polypeptide or domain in addition to the IDFP. Rather, any number of domains of interest may be linked in any way as long as the IDFP coding region retains its reading frame and the encoded polypeptide retains fluorescence activity under at least one set of conditions. One non-limiting example of such conditions includes physiological salt concentration (i.e., about 90 mM), pH near neutral and 37°C.
Examples of proteins of interest include, but are not limited to, receptors (transmembrane and intracellular) and cell surface proteins, growth factors, signal transduction proteins, transcription factors, structural proteins (e.g., cytoskeletal proteins, nuclear matrix proteins, histones, etc.), extracellular matrix proteins, immunoglobulins, bacterial proteins, plant proteins, viral or phage proteins, enzymes, therapeutic proteins, phosphoproteins, glycoproteins, and lipoproteins.
Production of Intramolecular Dimer Fluorescent Proteins
The production of IDFPs from recombinant vectors may be effected in a number of ways known to those skilled in the art. For example, plasmids, bacteriophage or viral vectors may be introduced to prokaryotic or eukaryotic cells by any of a number of ways known to those skilled in the art. Examples of useful vectors, cells, methods of introducing vectors to cells and methods of detecting and isolating GFP polypeptides and variants thereof are also described herein below.
1. Vectors Useful According to the Invention.
There is a wide array of vectors known and available in the art that are useful for the expression of IDFPs according to the invention. The selection of a particular vector clearly depends upon the intended use of the polypeptide. For example, the selected vector must be capable of driving expression of the polypeptide in the desired cell type, whether that cell type be prokaryotic or eukaryotic. Many vectors comprise sequences allowing both prokaryotic vector replication and eukaryotic expression of operably linked gene sequences.
Vectors useful according to the invention may be autonomously replicating, that is, the vector, for example, a plasmid, exists extrachromosomally and its replication is not necessarily directly linked to the replication of the host cell's genome. Alternatively, the replication of the vector may be linked to the replication of the host's chromosomal DNA, for example, the vector may be integrated into the chromosome of the host cell as achieved by retroviral vectors.
Vectors useful according to the invention preferably comprise sequences operably linked to the IDFP coding sequences that permit the transcription and translation of the IDFP sequence. Sequences that permit the transcription of the linked IDFP sequence include a promoter and optionally also include an enhancer element or elements permitting the strong expression of the linked sequences. The term "transcriptional regulatory sequences" refers to the combination of a promoter and any additional sequences conferring desired expression characteristics (e.g., high level expression, inducible expression, tissue- or cell-type-specific expression, or a combination of these) on an operably linked nucleic acid sequence.
The selected promoter may be any DNA sequence that exhibits transcriptional activity in the selected host cell, and may be derived from a gene normally expressed in the host cell or from a gene normally expressed in other cells or organisms. Examples of promoters include, but are not limited to, the following: A) prokaryotic promoters - E. coli lac, tac, or trp promoters, lambda phage PR or PL promoters, bacteriophage T7, T3, Sp6 promoters, B. subtilis alkaline protease promoter, and the B. stearothermophilus maltogenic amylase promoter, etc.; B) eukaryotic promoters - yeast promoters, such as GAL1, GAL4 and other glycolytic gene promoters (see for example, Hitzeman et al., 1980, J. Biol. Chem. 255: 12073-12080; Alber & Kawasaki, 1982, J. Mol. Appl. Gen. 1: 419-434), LEU2 promoter (Martinez-Garcia et al., 1989, Mol. Gen. Genet. 217: 464-470), alcohol dehydrogenase gene promoters (Young et al., 1982, in Genetic Engineering of Microorganisms for Chemicals, Hollaender et al., eds., Plenum Press, NY), or the TPI1 promoter (U.S. Pat. No. 4,599,311); insect promoters, such as the polyhedrin promoter (U.S. Pat. No. 4,745,051; Vasuvedan et al., 1992, FEBS Lett. 311: 7-11), the P10 promoter (Vlak et al., 1988, J. Gen. Virol. 69: 765-776), the Autographa californica polyhedrosis virus basic protein promoter (EP 397485), the baculovirus immediate-early gene 1 promoter (U.S. Pat. Nos. 5,155,037 and 5,162,222), the baculovirus 39K delayed-early gene promoter (also U.S. Pat. Nos. 5,155,037 and 5,162,222) and the OpMNPV immediate early promoter 2; mammalian promoters - the SV40 promoter (Subramani et al., 1981, Mol. Cell. Biol. 1: 854-864), metallothionein promoter (MT-1; Palmiter et al., 1983, Science 222: 809-814), adenovirus 2 major late promoter (Yu et al., 1984, Nucl. Acids Res. 12: 9309-21), cytomegalovirus (CMV) or other viral promoter (Tong et al., 1998, Anticancer Res. 18: 719-725), or even the endogenous promoter of a gene of interest in a particular cell type.
A selected promoter may also be linked to sequences rendering it inducible or tissue-specific. For example, the addition of a tissue-specific enhancer element upstream of a selected promoter may render the promoter more active in a given tissue or cell type. Alternatively, or in addition, inducible expression may be achieved by linking the promoter to any of a number of sequence elements permitting induction by, for example, thermal changes (temperature sensitive), chemical treatment (for example, metal ion- or IPTG-inducible), or the addition of an antibiotic inducing agent (for example, tetracycline).
Regulatable expression is achieved using, for example, expression systems that are drug inducible (e.g., tetracycline, rapamycin or hormone-inducible). Drug-regulatable promoters that are particularly well suited for use in mammalian cells include the tetracycline regulatable promoters, and glucocorticoid steroid-, sex hormone steroid-, ecdysone-, lipopolysaccharide (LPS)- and isopropylthiogalactoside (IPTG)-regulatable promoters. A regulatable expression system for use in mammalian cells should ideally, but not necessarily, involve a transcriptional regulator that binds (or fails to bind) nonmammalian DNA motifs in response to a regulatory agent, and a regulatory sequence that is responsive only to this transcriptional regulator.
Tissue-specific promoters may also be used to advantage with IDFP-encoding constructs. A wide variety of tissue-specific promoters is known. As used herein, the term "tissue-specific" means that a given promoter is transcriptionally active (i.e., directs the expression of linked sequences sufficient to permit detection of the polypeptide product of the promoter) in less than all cells or tissues of an organism. A tissue-specific promoter is preferably active in only one cell type, but may, for example, be active in a particular class or lineage of cell types (e.g., hematopoietic cells). A tissue-specific promoter useful according to the invention comprises those sequences necessary and sufficient for the expression of an operably linked nucleic acid sequence in a manner or pattern that is essentially the same as the manner or pattern of expression of the gene linked to that promoter in nature. Any tissue-specific transcriptional regulatory sequence known in the art may be used to advantage with a vector encoding an IDFP.
In addition to promoter/enhancer elements, vectors useful according to the invention may further comprise a suitable terminator. Such terminators include, for example, the human growth hormone terminator (Palmiter et al., 1983, supra), or, for yeast or fungal hosts, the TPI1 (Alber & Kawasaki, 1982, supra) or ADH3 terminator (McKnight et al., 1985, EMBO J. 4: 2093-2099).
Vectors useful according to the invention may also comprise polyadenylation sequences (e.g., the SV40 or Ad5 E1b poly(A) sequence), and translational enhancer sequences (e.g., those from Adenovirus VA RNAs). Further, a vector useful according to the invention may encode a signal sequence directing the recombinant polypeptide to a particular cellular compartment or, alternatively, may encode a signal directing secretion of the recombinant polypeptide. A vector useful according to the invention may also comprise a selectable marker allowing identification of a cell that has received a functional copy of the IDFP-encoding gene construct. In its simplest form, the IDFP sequence itself, linked to a chosen promoter, may be considered a selectable marker, in that illumination of cells or cell lysates with the proper wavelength of light and measurement of emitted fluorescence at the expected wavelength allows detection of cells that express the IDFP construct. In other forms, the selectable marker may comprise an antibiotic resistance gene, such as the neomycin, bleomycin, zeocin or phleomycin resistance genes, or it may comprise a gene whose product complements a defect in a host cell, such as the gene encoding dihydrofolate reductase (DHFR), or, for example, in yeast, the Leu2 gene. Alternatively, the selectable marker may, in some cases, be a luciferase gene or a chromogenic substrate-converting enzyme gene such as the β-galactosidase gene.
IDFP-encoding sequences according to the invention may be expressed either as freestanding polypeptides or as fusions with other polypeptides. It is assumed that one of skill in the art can, given an IDFP nucleic acid sequence, readily construct a gene comprising a sequence encoding the IDFP fused in frame to one or more polypeptides or polypeptide domains of interest. References teaching methods to do so include Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, and Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology on CD-ROM, John Wiley & Sons, New York, NY.
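A quick in silico check that a planned monomer-linker-monomer-insert fusion stays in frame can be sketched in Python as follows. This is a minimal sketch, not the cloning method of the specification: the function name `assemble_fusion_orf` and the toy sequences are hypothetical placeholders (they are not SEQ ID NOs), and the sketch assumes each part is supplied as a coding sequence already trimmed to whole codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_fusion_orf(monomer_cds, linker_cds, insert_cds):
    """Join monomer-linker-monomer-insert and verify the reading frame.

    Each part must be a whole number of codons. A stop codon is tolerated
    only as the final codon of the assembled open reading frame.
    """
    parts = [
        ("first monomer", monomer_cds),
        ("linker", linker_cds),
        ("second monomer", monomer_cds),
        ("insert", insert_cds),
    ]
    for name, seq in parts:
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length {len(seq)} is not a multiple of 3")
    orf = "".join(seq.upper() for _, seq in parts)
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            raise ValueError(f"premature stop codon {codon} at codon {index}")
    return orf


# Toy placeholder sequences: Met-Val-Ser, a (Gly-Ser)2 linker, Lys-stop.
orf = assemble_fusion_orf("ATGGTGAGC", "GGCAGCGGCAGC", "AAGTAA")
```

Note that the monomer copies must be supplied without their own stop codons; otherwise the check rejects the assembly because translation would terminate before the insert.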
A schematic diagram of a vector encoding the transcription unit of one possible embodiment of the invention is shown in Figure 4. In this embodiment, an intramolecular dimer humanized R. reniformis GFP (hrGFP) is encoded on a construct driven by the strong CMV promoter and containing a multi-cloning site (MCS) downstream of the second, or C-terminal, copy of hrGFP. A gene of interest is fused at the C-terminus of the hrGFP dimer by insertion in frame into the MCS. A polyadenylation site sequence is included 3' of the MCS to enhance the stability and processing of the transcript generated. The (Gly Ser)2-4 linkers shown represent three examples of a linker peptide sequence useful according to the invention and are not meant to be limiting.
a. Plasmid vectors.
Any plasmid vector that allows expression of an IDFP coding sequence of the invention in a selected host cell type is acceptable for use according to the invention. A plasmid vector useful in the invention may have any or all of the above-noted characteristics of vectors useful according to the invention. Plasmid vectors useful according to the invention include, but are not limited to, the following examples: Bacterial - pQE70, pQE60, pQE-9 (Qiagen); pBs, phagescript, psiX174, pBluescript SK, pBsKS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, and pRIT5 (Pharmacia); Eukaryotic - pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, and pSVL (Pharmacia). However, any other plasmid or vector may be used as long as it is replicable in the host.
b. Bacteriophage vectors.
There are a number of well known bacteriophage-derived vectors useful according to the invention. Foremost among these are the lambda-based vectors, such as Lambda Zap II or Lambda-Zap Express vectors (Stratagene) that allow inducible expression of the polypeptide encoded by the insert. Others include filamentous bacteriophage such as the M13-based family of vectors.
c. Viral vectors.
A number of different viral vectors are useful according to the invention, and any viral vector that permits the introduction and expression of sequences encoding an IDFP in cells is acceptable for use in the methods of the invention. Viral vectors that can be used to deliver foreign nucleic acid into cells include, but are not limited to, retroviral vectors, adenoviral vectors, adeno-associated viral vectors, herpesviral vectors, and Semliki forest viral (alphaviral) vectors. Defective retroviruses are well characterized for use in gene transfer (for a review see Miller, A.D. (1990) Blood 76:271). Protocols for producing recombinant retroviruses and for infecting cells in vitro or in vivo with such viruses can be found in Ausubel et al. (eds.), 1993, supra, and other standard laboratory manuals.
In addition to retroviral vectors, adenoviruses can be manipulated such that they encode and express a gene product of interest but are inactivated in terms of their ability to replicate in a normal lytic viral life cycle (see for example Berkner et al., 1988, BioTechniques 6:616; Rosenfeld et al., 1991, Science 252:431-434; and Rosenfeld et al., 1992, Cell 68:143-155). Suitable adenoviral vectors derived from the adenovirus strain Ad type 5 dl324 or other strains of adenovirus (e.g., Ad2, Ad3, Ad7 etc.) are well known to those skilled in the art. Adeno-associated virus (AAV) is a naturally occurring defective virus that requires another virus, such as an adenovirus or a herpes virus, as a helper virus for efficient replication and a productive life cycle (for a review see Muzyczka et al., 1992, Curr. Topics in Micro. and Immunol. 158:97-129). An AAV vector such as that described in Traschin et al. (1985, Mol. Cell. Biol. 5:3251-3260) can be used to introduce nucleic acid into cells. AAV vectors are useful for the introduction of nucleic acid sequences into a variety of different cell types (see, for example, Hermonat et al., 1984, Proc. Natl. Acad. Sci. USA 81: 6466-6470; and Traschin et al., 1985, Mol. Cell. Biol. 4: 2072-2081).
Finally, the introduction and expression of foreign genes is often desired in insect cells because high level expression may be obtained, the culture conditions are simple relative to mammalian cell culture, and the post-translational modifications made by insect cells closely resemble those made by mammalian cells. For the introduction of foreign DNA to insect cells, such as Drosophila S2 cells, infection with baculovirus vectors is widely used. Other insect vector systems include, for example, the expression plasmid pIZ/V5-His (InVitrogen) and other variants of the pIZ/V5 vectors encoding other tags and selectable markers. Insect cells are readily transfectable using lipofection reagents, and there are lipid-based transfection products specifically optimized for the transfection of insect cells (for example, from PanVera).
2. Host Cells Useful According to the Invention.
Any cell into which a recombinant vector carrying an IDFP sequence may be introduced and wherein the vector is permitted to drive the expression of the IDFP is useful according to the invention. That is, because of the wide variety of uses for the IDFP molecules of the invention, any cell in which an IDFP molecule of the invention may be expressed and preferably detected is a suitable host. Vectors suitable for the introduction of IDFP-encoding sequences to host cells from a variety of different organisms, both prokaryotic and eukaryotic, are described herein above or known to those skilled in the art.
Host cells may be prokaryotic, such as any of a number of bacterial strains, or may be eukaryotic, such as yeast or other fungal cells, insect or amphibian cells, or mammalian cells including, for example, rodent, simian or human cells. Host cells may also be plant cells. Cells expressing IDFPs may be primary cultured cells, for example, primary human fibroblasts or keratinocytes, or may be an established cell line, such as NIH3T3, 293T or CHO cells. Further, mammalian cells useful for expression of IDFPs may be phenotypically normal or oncogenically transformed. It is assumed that one skilled in the art can readily establish and maintain a chosen host cell type in culture.
3. Introduction of IDFP-Encoding Vectors to Host Cells.
IDFP-encoding vectors may be introduced to selected host cells by any of a number of suitable methods known to those skilled in the art. For example, IDFP constructs may be introduced to appropriate bacterial cells by infection, in the case of E. coli bacteriophage vector particles such as lambda or M13, or by any of a number of transformation methods for plasmid vectors or for bacteriophage DNA. For example, standard calcium-chloride-mediated bacterial transformation is still commonly used to introduce naked DNA to bacteria (Sambrook et al., 1989, supra), but electroporation may also be used (Ausubel et al. (eds.), supra, 1993).
For the introduction of IDFP-encoding constructs to yeast or other fungal cells, chemical transformation methods are generally used (e.g., as described by Rose et al., 1990, Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY). For transformation of S. cerevisiae, for example, the cells are treated with lithium acetate to achieve transformation efficiencies of approximately 10^4 colony-forming units (transformed cells)/μg of DNA. Transformed cells are then isolated on selective media appropriate to the selectable marker used. Alternatively, or in addition, plates or filters lifted from plates may be scanned for IDFP fluorescence to identify transformed clones.
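For reference, a transformation efficiency of the kind quoted above is conventionally computed as colonies counted divided by the amount of DNA actually represented on the plate. The Python sketch below illustrates that arithmetic; the function name, the `fraction_plated` parameter and the example numbers are illustrative assumptions, not values from the specification.

```python
def transformation_efficiency(colonies, dna_ug, fraction_plated=1.0):
    """Return colony-forming units per microgram of transforming DNA.

    fraction_plated is the fraction of the transformation mix spread on
    the counted plate (1.0 if the whole mix was plated).
    """
    if dna_ug <= 0:
        raise ValueError("DNA amount must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    return colonies / (dna_ug * fraction_plated)


# Hypothetical plate: 2500 colonies from one quarter of a 1 microgram mix.
efficiency = transformation_efficiency(2500, 1.0, fraction_plated=0.25)
```

With these hypothetical inputs the result is 10,000 colony-forming units per microgram, the same order of magnitude as the figure quoted for lithium acetate transformation.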
For the introduction of IDFP-encoding vectors to mammalian cells, the method used will depend upon the form of the vector. For plasmid vectors, DNA encoding an IDFP may be introduced by any of a number of transfection methods, including, for example, lipid-mediated transfection ("lipofection"), DEAE-dextran-mediated transfection, electroporation or calcium phosphate precipitation. These methods are detailed, for example, in Ausubel et al. (eds.), 1993, supra.
Lipofection reagents and methods suitable for transient transfection of a wide variety of transformed and non-transformed or primary cells are widely available, making lipofection an attractive method of introducing constructs to eukaryotic, and particularly mammalian, cells in culture. For example, LipofectAMINE™ (Life Technologies) or LipoTaxi™ (Stratagene) kits are available. Other companies offering reagents and methods for lipofection include Bio-Rad Laboratories, CLONTECH, Glen Research, InVitrogen, JBL Scientific, MBI Fermentas, PanVera, Promega, Quantum Biotechnologies, Sigma-Aldrich, and Wako Chemicals USA.
For the introduction of IDFP-encoding vectors to insect cells, liposome-mediated transfection is commonly used, as is baculovirus infection. Cells such as Schneider-2 cells (Drosophila melanogaster), Sf-9 and Sf-21 cells (Spodoptera frugiperda) or High Five™ cells (Trichoplusia ni) may be transfected using any of a number of commercially available liposome transfection reagents optimized for use with insect cells. Reagents include, for example, TransIT-Insecta™ (PanVera), FuGENE™-6 (Roche), Insectin™-Plus (InVitrogen) and Tfx™-20 (Promega). Each of these reagents, used according to the vendor's instructions, permits the introduction of nucleic acid vectors encoding an IDFP to insect cells. Expression vectors optimized for insect cell expression are widely known and are commercially available from, for example, Clontech and InVitrogen. These include both plasmid-based vectors and baculovirus vectors. Insect cell expression vectors are described in detail in "Baculovirus Expression Vectors", D.R. O'Reilly, L.K. Miller & V.A. Luckow (1992, W.H. Freeman Co., New York).
Following transfection with an IDFP-encoding vector of the invention, eukaryotic (preferably, but not necessarily, mammalian) cells successfully incorporating the construct (intra- or extrachromosomally) may be selected, as noted above, by either treatment of the transfected population with a selection agent, such as an antibiotic whose resistance gene is encoded by the vector, or by direct screening using, for example, FACS of the cell population or fluorescence scanning of adherent cultures. Frequently, both types of screening may be used, wherein a negative selection is used to enrich for cells taking up the construct and FACS or fluorescence scanning is used to further enrich for cells expressing IDFPs or to identify specific clones of cells, respectively. For example, a negative selection with the neomycin analog G418 (Life Technologies, Inc.) may be used to identify cells that have received the vector, and fluorescence scanning may be used to identify those cells or clones of cells that express an IDFP to the greatest extent.
4. Modification of nucleotide sequences to enhance translation of IDFPs.
In many applications it will be advantageous to enhance the expression of fluorescent proteins derived from marine invertebrates or bacteria by modifying the codons in the coding sequences to make them more compatible with codon usage in higher eukaryotes, such as mammals and humans. The methods for this so-called "humanizing" are known in the art and, as noted above, have been applied to A. victoria GFP and mutants thereof (U.S. Patent Nos. 6,020,192 and 5,874,304). Humanization is accomplished by site-directed mutagenesis of the less favored codons to more highly favored codons for the same amino acid, as described herein or as known in the art. The preferred codons for human gene expression are listed in Table 1. The codons in the table are arranged from left to right in descending order of relative use in human genes. In particular, those codons underlined in the table are almost never used in known human genes and, if found in a sequence to be humanized, would therefore represent the most important codons to modify for enhanced expression efficiency in mammalian or human cells. A sequence is considered "humanized" if the codon for one or more amino acids has been changed from the native codon sequence to a codon sequence more favored for translation in human or mammalian cells, preferably without altering the polypeptide coding sequence. Site-directed mutagenesis is well known in the art and is often performed using commercially available kits, such as the EXSITE™ (Catalog No. 200502), QUIKCHANGE™ (Catalog No. 200518) or CHAMELEON® mutagenesis kits (Catalog No. 200509), available from Stratagene.
TABLE 1
PREFERRED DNA CODONS FOR HUMAN USE
Amino Acids                   Codons Preferred in Human Genes
Alanine        Ala  A         GCC GCT GCA GCG
Cysteine       Cys  C         TGC TGT
Aspartic acid  Asp  D         GAC GAT
Glutamic acid  Glu  E         GAG GAA
Phenylalanine  Phe  F         TTC TTT
Glycine        Gly  G         GGC GGG GGA GGT
Histidine      His  H         CAC CAT
Isoleucine     Ile  I         ATC ATT ATA
Lysine         Lys  K         AAG AAA
Leucine        Leu  L         CTG TTG CTT CTA TTA
Methionine     Met  M         ATG
Asparagine     Asn  N         AAC AAT
Proline        Pro  P         CCC CCT CCA CCG
Glutamine      Gln  Q         CAG CAA
Arginine       Arg  R         CGC AGG CGG AGA CGA CGT
Serine         Ser  S         AGC TCC TCT AGT TCA TCG
Threonine      Thr  T         ACC ACA ACT ACG
Valine         Val  V         GTG GTC GTT GTA
Tryptophan     Trp  W         TGG
Tyrosine       Tyr  Y         TAC TAT
The codons at the left represent those most preferred for use in human genes, with human usage decreasing towards the right. Underlined codons are almost never used in human genes.
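The codon substitution that humanization aims at can be previewed in silico before any mutagenesis is designed. The Python sketch below replaces each sense codon with the left-most (most preferred) codon of Table 1 for the same amino acid. It is a design aid under stated assumptions, not the mutagenesis procedure itself: the function names and the example sequence are hypothetical, and TGC (Cys) and TAC (Tyr) are assumed as the first-column entries because those are the only readings consistent with the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Left-most codon for each amino acid in Table 1 (Cys and Tyr assumed).
PREFERRED_HUMAN_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}


def translate(cds):
    """Translate a coding sequence; stop codons appear as '*'."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def humanize(cds):
    """Swap every sense codon for the preferred codon of the same residue."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TO_AA[codon]
        # Stop codons are left untouched.
        out.append(codon if aa == "*" else PREFERRED_HUMAN_CODON[aa])
    return "".join(out)


# Hypothetical fragment encoding Met-Val-Ser-Leu-stop with rare codons.
native = "ATGGTATCGTTATAA"
humanized = humanize(native)
```

Because every replacement codon encodes the same amino acid, `translate(humanized)` equals `translate(native)`; a sequence also counts as humanized under the definition above when only some codons are changed, so a real design might substitute only the rarest codons.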
5. Purification of intramolecular dimer fluorescent proteins.
Recombinant fluorescent proteins can be purified from bacteria as follows. Bacteria transformed with a recombinant IDFP-encoding vector of the invention are grown in Luria-Bertani medium containing the appropriate selective antibiotic (e.g., ampicillin at 50 μg/ml). If the vector permits, recombinant polypeptide expression is induced by the addition of the appropriate inducer (e.g., IPTG at 1 mM). Bacteria are harvested by centrifugation and lysed by freeze-thaw of the cell pellet. Debris is removed by centrifugation at 14,000 x g, and the supernatant is loaded onto a Sephadex G-75 (Pharmacia, Piscataway, NJ) column equilibrated with 10 mM phosphate buffered saline, pH 7.0. Fractions containing IDFP are identified by fluorescence emission at the expected wavelength when excited by light at the excitation wavelength.
If necessary, IDFPs can be isolated from eukaryotic cells by methods well known to those skilled in the art. It is also contemplated that IDFPs will include a marker or affinity tag sequence to permit affinity purification. Examples include 6X-His, glutathione S transferase (GST), or epitope tags such as Flag or the Myc tag. These tags are useful for both bacterial and eukaryotic cell expression and purification of IDFPs.
6. Candidate modulators.
A candidate modulator or candidate agent being evaluated for a modulatory function on a given interaction or biological process may be a synthetic compound, a mixture of compounds, or may be a natural product (e.g. a plant extract or culture supernatant).
Candidate agents from large libraries of synthetic or natural compounds can be screened. Numerous means are currently used for random and directed synthesis of saccharide, peptide, and nucleic acid based compounds. Synthetic compound libraries are commercially available from a number of companies including Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex (Princeton, NJ), Brandon Associates (Merrimack, NH), and Microsource (New Milford, CT). A rare chemical library is available from Aldrich (Milwaukee, WI). Combinatorial libraries are available and can be prepared. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available from e.g., Pan Laboratories (Bothell, WA) or MycoSearch (NC), or are readily produceable by methods well known in the art. Additionally, natural and synthetically produced libraries and compounds are readily modified through conventional chemical, physical, and biochemical means.
Useful candidate compounds may be found within numerous chemical classes. Such compounds may be organic compounds, or small organic compounds. Small organic compounds have a molecular weight of more than 50 yet less than about 2,500 Daltons, preferably less than about 750, more preferably less than about 350 daltons. Exemplary classes include heterocycles, peptides, saccharides, steroids, and the like. The compounds may be modified to enhance efficacy, stability, pharmaceutical compatibility, and the like. Structural identification of an agent may be used to identify, generate, or screen additional agents. For example, where peptide agents are identified, they may be modified in a variety of ways to enhance their stability, such as using an unnatural amino acid, such as a D-amino acid, particularly D-alanine, by functionalizing the amino or carboxylic terminus, e.g. for the amino group, acylation or alkylation, and for the carboxyl group, esterification or amidification, or the like.
Candidate agents will be effective at varying concentrations, depending on the nature of the agent and on the nature of its interaction with the polypeptide or polypeptide fragment of interest. Therefore, candidate agents should be screened at varying concentrations. Generally, concentrations from about 10 mM to about 1 fM are preferred for screening. The association constants of agents that bind polypeptides or fragments thereof will generally be in the range of about 1 mM to about 1 fM, and optimally in the range of about 1 μM to about 1 pM or less.
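One simple way to cover the 10 mM to 1 fM range mentioned above is a serial dilution. The sketch below is only an illustration: the tenfold step and the function name `dilution_series` are assumptions, not something the text prescribes.

```python
import math


def dilution_series(top_molar=1e-2, bottom_molar=1e-15, fold=10.0):
    """Return concentrations (in mol/L) from top to bottom in fold-wise steps."""
    if not (top_molar > bottom_molar > 0 and fold > 1):
        raise ValueError("need top > bottom > 0 and fold > 1")
    # Number of dilution steps needed to get from top to bottom.
    steps = round(math.log(top_molar / bottom_molar) / math.log(fold))
    return [top_molar / fold ** i for i in range(steps + 1)]


series = dilution_series()   # 10 mM down to 1 fM in tenfold steps
```

With the default arguments the series has 14 points, one per decade of concentration.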
Uses of Intramolecular Dimer Fluorescent Proteins According to the Invention
IDFPs can be used in any application for which fluorescent proteins are suited. For example, IDFPs can be used as reporter genes to monitor the activity of promoter sequences, to investigate the cellular localization of fusion proteins, to mark cellular proteins for FACS analyses of cell populations, to monitor viral vector infection, to monitor transgene expression in vivo or in culture, and to monitor protein:protein interactions both in vivo and in vitro. It is also expected that IDFPs comprising fluorescent proteins whose spectral characteristics are sensitive to intracellular or extracellular environmental changes (e.g., pH, redox status, phosphorylation of the fluorescent protein, etc.) will continue to be sensitive to those changes in the context of an IDFP. Because IDFPs do not heterodimerize, they are particularly well suited for multiple-labeling studies involving the co-expression of IDFP-fusion proteins with differing spectral characteristics. Techniques useful for the detection of IDFP fusion proteins include, for example, standard fluorescent microscopy, confocal microscopy, flow cytometry and fluorescence activated cell sorting (FACS).
As noted, IDFPs are particularly well suited to applications that rely on FRET. The lack of heterodimerization between IDFPs with differing spectral characteristics that permit FRET but that share the same dimerization interfaces is a major improvement over previous methods using fluorescent proteins that could heterodimerize, since it removes a significant source of FRET background. In one embodiment, two different IDFPs that have overlapping emission and excitation spectra (i.e., they are donor and acceptor to each other) are used to generate fusions with two different cellular (or viral) proteins or protein domains being investigated for their ability to interact. A specific interaction of the fusion partners will result in a change in the detected emission spectrum from that of the donor to that of the acceptor when a mixture of the two IDFP fusion proteins is irradiated with light that excites the donor fluorophore. This type of assay is readily adapted to a screening format, in which known interactors are exposed to candidate compounds. Detection of a change from the acceptor's emission profile to the donor's emission profile indicates that a candidate compound has disrupted the interaction between the fusion partners. Either of these assays can be performed in vivo or in vitro. An example of a donor/acceptor fluorescent protein pair is P4-3 and S65C or S65T (Table 2; U.S. Pat. No. 5,981,200). Other examples of donor/acceptor pairs of fluorescent polypeptides include, but are not limited to, any of S72A, K79R, Y145F, M153A and T203I (excitation λ 395 nm, emission λ 511 nm) as donor, and any of S65G, T203Y (excitation λ 514 nm, emission λ 527 nm) or T203Y/S65G, V68L or Q69K (excitation λ 515 nm, emission λ 527 nm) as acceptor (See Tsien et al., WO 97/28261). Each of these proteins shares the dimerization interface of A. victoria GFP. Their expression as IDFPs would allow their co-expression without heterodimerization.
A pair of fluorescent proteins that are useful according to the invention function as fluorescent donor and fluorescent acceptor, respectively, in the context of fluorescence resonance energy transfer. The ability of a pair of fluorescent proteins to function as fluorescent donor and fluorescent acceptor, respectively, in the context of fluorescence resonance energy transfer, is determined experimentally and is influenced by a number of factors including donor/acceptor peaks, emission/excitation peaks, peak widths, the efficiency of energy transfer within a fluorescent moiety and peak overlap. Preferably, 1) the donor excitation peak, A, figure 5, will overlap minimally with the acceptor excitation peak, C, figure 5, such that excitation of the donor does not excite the acceptor; 2) the donor excitation peak, A, figure 5, and the donor emission peak, B, figure 5, have sufficient overlap to permit efficient energy transfer; 3) the donor emission peak, B, figure 5, and the acceptor excitation peak, C, figure 5, have sufficient overlap to permit efficient FRET energy transfer; 4) the donor emission peak, B, figure 5 and the acceptor emission peak, D, figure 5 have sufficient overlap to allow for differentiation between FRET and non-FRET energy transfer; and 5) the donor excitation peak, A, figure 5 and the acceptor emission peak, D, figure 5 have sufficient overlap to allow for differentiation between FRET and non-FRET energy transfer.
Generally, an acceptable donor/acceptor pair exhibits > 50% quenching of donor emission at a chromophore distance of > 10 Å. This is based on the Förster radius, $R_0$, which is the distance at which 50% of excited donors are deactivated by FRET (i.e., the distance at which energy transfer is 50% efficient). The value of $R_0$ is dependent on the spectral properties of the donor and acceptor fluorophores, with a general formula: $R_0 = [8.8 \times 10^{23} \cdot \kappa^2 \cdot n^{-4} \cdot QY_D \cdot J(\lambda)]^{1/6}$ Å, where: $\kappa^2$ = dipole orientation factor (0-4; $\kappa^2$ = 2/3 for randomly oriented donors and acceptors); $QY_D$ = the quantum yield of the donor in the absence of the acceptor; $n$ = the refractive index; and $J(\lambda)$ = the spectral overlap integral (the shared area under the overlapping donor emission and acceptor excitation peaks).
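The general formula can be evaluated numerically as in the sketch below. Several inputs are assumptions made for illustration and do not come from the text: the overlap integral is taken to be in units of M⁻¹cm³ (the units for which the 8.8 × 10²³ prefactor yields ångströms), its value of 1e-13 is made up, and n = 1.4 is a commonly used estimate for aqueous protein samples. The donor quantum yield of 0.38 is the Table 2 value for P4-3. The distance dependence E = 1 / (1 + (r/R0)^6) is the standard Förster relation, added here to show what "50% efficient at R0" means.

```python
def forster_radius_angstrom(kappa2, n, qy_donor, overlap_j):
    """Forster radius R0 in angstroms.

    kappa2    -- dipole orientation factor (2/3 for random orientation)
    n         -- refractive index of the medium
    qy_donor  -- donor quantum yield in the absence of acceptor
    overlap_j -- spectral overlap integral, assumed to be in M^-1 cm^3
    """
    return (8.8e23 * kappa2 * n ** -4 * qy_donor * overlap_j) ** (1.0 / 6.0)


def fret_efficiency(distance, r0):
    """Standard Forster distance dependence; distance and r0 in the same unit."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


# Illustrative inputs only: overlap_j is a made-up value.
r0 = forster_radius_angstrom(kappa2=2.0 / 3.0, n=1.4, qy_donor=0.38, overlap_j=1e-13)
half = fret_efficiency(r0, r0)        # efficiency at r = R0
near = fret_efficiency(10.0, r0)      # efficiency at a 10 angstrom separation
```

With these inputs R0 comes out at a few tens of ångströms, and the efficiency at r = R0 is one half by construction, which matches the definition given above.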
The advantage resulting from the forced intramolecular homodimer formation is most apparent when, for example, fluorescent proteins with different emission characteristics, derived from the same parent fluorescent protein, are expressed in a single cell. For example, if two variants of R. reniformis GFP have spectral characteristics that permit FRET between the variants, both of these proteins will have the same dimerization interfaces. Without the forced homodimerization occurring in an IDFP, the background level of acceptor fluorescence upon irradiation within the donor's excitation spectrum will be higher than if IDFP versions of the same fluorescent proteins are used.
Even if FRET does not occur between differing fluorescent proteins that share the same dimerization interface, heterodimerization between two fluorescent fusion proteins via that interface can be a problem. For example, such heterodimerization can reduce the sensitivity of sub-cellular localization studies using two labels. Heterodimerization will segregate the labeled proteins into three populations: homodimers of the first fusion protein, homodimers of the second fusion protein, and heterodimers comprising both. Even if one assumes, strictly for the sake of argument, that the heterodimerization will not affect the intracellular localization of the proteins, heterodimer formation will reduce the amount of either homodimer available to segregate to a given location in the cell. This will result in decreased sensitivity in the assay. Therefore, the use of IDFPs in such a situation will improve upon detection sensitivity even if one is not relying upon FRET for detection.
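The size of the three populations can be estimated with a back-of-the-envelope model. The sketch below assumes, purely for illustration, that all monomers are dimerized and that the two fusion proteins pair at random with equal affinity through the shared interface; neither assumption is stated in the text, and the function name `dimer_fractions` is hypothetical.

```python
def dimer_fractions(fraction_first):
    """Fraction of dimers of each type under random pairing.

    fraction_first -- fraction of all monomers that carry the first label.
    """
    if not 0.0 <= fraction_first <= 1.0:
        raise ValueError("fraction_first must be between 0 and 1")
    p = fraction_first
    q = 1.0 - p
    return {
        "first_homodimer": p * p,
        "second_homodimer": q * q,
        "heterodimer": 2.0 * p * q,
    }


equal_mix = dimer_fractions(0.5)
```

Under these assumptions an equal mixture of the two fusion proteins puts half of all dimers into the heterodimer population, leaving a quarter for each homodimer.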
EXAMPLES
Example 1. Monitoring the interaction of polypeptides using IDFPs.
IDFPs are well suited for applications that monitor the association of fusion polypeptides using energy transfer. In order to monitor the association of two polypeptides of interest using IDFPs, one must first select a pair of fluorescent polypeptides that are donor and acceptor to each other. Each polypeptide in the pair must be capable of homodimerization. An example of such a pair is P4-3 and S65T. Another pair useful according to the invention is P4-3 and R. reniformis GFP (hrGFP). The nucleic acid sequence encoding the fluorescence donor polypeptide is used to generate a construct encoding, in order, a copy of the donor polypeptide (e.g., P4-3), a linker, a second copy of the fluorescence donor polypeptide and one of the polypeptides of interest (alternatively, the sequence encoding the protein of interest may be placed upstream of and in frame with the sequences encoding the IDFP). Similarly, the sequence encoding the fluorescence acceptor polypeptide is used to generate a construct encoding, in order, a copy of the acceptor polypeptide (e.g., S65T), a linker, a second copy of the fluorescence acceptor polypeptide and the second polypeptide of interest (alternatively, the sequence encoding the protein of interest may be placed upstream of and in frame with the sequences encoding the acceptor IDFP). An example of a pair of proteins of interest is the Ras proto-oncogene product and the Raf-1 kinase. The G-protein Ras binds to Raf-1 in response to signals originating at receptor tyrosine kinases. A human c-Ha-Ras cDNA sequence is available at GenBank Accession No. J00277, and a human Raf-1 kinase sequence is available at GenBank Accession No. NM_002880. As an example, Ras coding sequences may be ligated in frame to the donor P4-3 IDFP construct, and the Raf-1 coding sequences may be ligated in frame to the acceptor S65T IDFP construct.
Constructs encoding the two IDFP-fusion proteins are transfected, either simultaneously or sequentially, into cells in which the protein:protein interaction is to be studied (e.g., HeLa cells, NIH3T3 cells, or another specific cell type of interest) using methods well known in the art (e.g., lipofection, electroporation, calcium phosphate precipitation, or even retroviral infection following generation of recombinant retroviral vector particles as known in the art).
After selection of cells incorporating and expressing the constructs by standard methods, the interaction of the proteins of interest is measured by detection of fluorescent emission upon irradiation with light that excites the donor fluorophore, in this instance P4-3, but not the acceptor fluorophore, S65T. If the fused Ras and Raf-1 domains interact, excitation with 381 nm light will result in energy transfer between the P4-3 and S65T fluorophores and emission of light with a maximum at about 511 nm. In contrast, if the domains do not interact, the emission maximum upon excitation at 381 nm will be at about 445 nm, the emission maximum of P4-3. This therefore allows the monitoring of the interaction of the two domains in response to stimuli, such as the addition of growth factor, growth factor analogs, or candidate modulators of the signal transduction pathway.
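A common way to turn such a readout into a number is the ratio of acceptor-channel to donor-channel emission under donor excitation. The sketch below is a hypothetical illustration of that idea for the P4-3/S65T example: the function names, the intensity values and the decision threshold are all assumptions, and real measurements would need background and spectral bleed-through corrections that are omitted here.

```python
DONOR_EMISSION_NM = 445      # P4-3 emission maximum
ACCEPTOR_EMISSION_NM = 511   # S65T emission maximum


def fret_ratio(acceptor_signal, donor_signal):
    """Ratio of emission at ~511 nm to emission at ~445 nm (381 nm excitation)."""
    if donor_signal <= 0:
        raise ValueError("donor_signal must be positive")
    return acceptor_signal / donor_signal


def interaction_detected(ratio, threshold):
    """True if the ratio exceeds a user-chosen threshold (hypothetical)."""
    return ratio > threshold


def modulator_effect(ratio_without, ratio_with):
    """Fractional change in the ratio caused by adding a stimulus or modulator."""
    return (ratio_with - ratio_without) / ratio_without


# Made-up intensities in arbitrary units.
baseline = fret_ratio(acceptor_signal=800.0, donor_signal=200.0)
treated = fret_ratio(acceptor_signal=300.0, donor_signal=300.0)
change = modulator_effect(baseline, treated)
```

A negative `change` would suggest that the added compound reduced the interaction, and a positive value that it enhanced it; the threshold for calling a hit would have to be set from controls.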
The protein:protein interaction assay using IDFP fusion proteins described above may also be performed in vitro with isolated or purified IDFP fusion proteins. This type of assay, or even the cell-based assay described above, may be readily adapted to a high-throughput format by placing the transfected cells or protein samples in a multiwell container and monitoring fluorescence output of samples exposed to various candidate modulators. Further, by performing the interaction assay in the presence or absence of a candidate modulator, one may adapt the method for screening of candidate modulator compounds to identify compounds that either increase or decrease the measured interaction. A change in the interaction in the presence of a candidate modulator relative to the interaction in its absence is indicative of a modulatory effect.
Example 2. Labeling a cell with an IDFP.
IDFPs according to the invention can be used in any application in which fluorescent polypeptides are useful. For example, cells can be labeled by expression of IDFPs to monitor the uptake and expression of transgene constructs, including plasmid-based and retroviral constructs. Cells may also be labeled to facilitate subsequent FACS analysis in a mixed population. To label a cell, an IDFP-encoding construct is introduced to cells by standard methods appropriate to that cell type. Following introduction, selection for cells receiving the construct can be performed by standard positive or negative selection based on additional selectable marker sequences (e.g., antibiotic resistance genes), by sorting or selection by FACS, or by allowing cells to form colonies and isolating those colonies that fluoresce when irradiated with light within the excitation spectrum of the IDFP. Maintaining the cells under conditions permitting the expression of the IDFP will permit the detection of the cells by fluorescence.
Example 3. Double label monitoring of protein localization.
Fluorescently labeled proteins are often used to examine the sub-cellular localization of proteins of interest. Frequently, it is useful to monitor the localization of two or more proteins or protein domains simultaneously, for example, as a means of identifying relationships between the proteins. When two or more proteins are labeled with fluorescent polypeptides that have the capacity to heterodimerize, the sensitivity of the localization assay can be adversely affected by heterodimerization between the fluorescent polypeptides.
Examples of proteins to be monitored for localization include proteins that are recruited to the vicinity of the plasma membrane upon a stimulus such as growth factor engagement of a receptor (e.g., G-proteins, Protein Kinase A, SH2-domain containing proteins, etc.), proteins that localize to the nucleus in response to a stimulus (e.g., steroid hormone receptor), or proteins that localize to the Golgi, mitochondria, nuclear pores or any other subcellular locale.
In order to simultaneously monitor the localization of two proteins of interest, two IDFP fusion constructs, each comprising sequences encoding one of the proteins of interest, are introduced to cells, either simultaneously or sequentially, using standard methods appropriate for that cell type. The localization of the IDFP-tagged proteins is monitored by fluorescence microscopy using excitation wavelengths and filter sets appropriate for the different fluorophores. While not wishing to exclude the possibility, it is generally not necessary that the two IDFPs be fluorescent donor and acceptor to each other. More frequently, unless one is assaying for direct interaction of the proteins, it is preferred that the fluorescent proteins are not related to each other in this manner. The IDFP fusion protein constructs are made using standard methods well known in the art. Examples of pairs of fluorescent polypeptides that are well suited for simultaneous monitoring of localization include, but are not limited to, any of S72A, K79R, Y145F, M153A and T203I (excitation λ 395 nm, emission λ 511 nm) as donor, and any of S65G, T203Y (excitation λ 514 nm, emission λ 527 nm) or T203Y/S65G, V68L or Q69K (excitation λ 515 nm, emission λ 527 nm) as acceptor (See Tsien et al., WO 97/28261). See also Table 2.
TABLE 2
FLUORESCENCE CHARACTERISTICS OF VARIOUS GFP MUTANTS
Clone      Mutation(s)                                      Excitation max (nm)  Emission max (nm)  Extinction coefficient (M⁻¹cm⁻¹)  Quantum yield
Wild type  None                                             393 (475)            508                21,000 (7,150)                    0.77
P4         Y66H                                             383                  447                13,500                            0.21
P4-3       Y66H, Y145F                                      381                  445                14,000                            0.38
W7         Y66W, N146L, M153T, V163A, N212K                 433 (453)            475 (501)          18,000 (17,100)                   0.67
W2         Y66W, I123V, Y145H, H148R, M153T, V163A, N212K   432 (453)            480                10,000 (9,600)                    0.72
S65T       S65T                                             489                  511                39,200                            0.68
P4-I       S65T, M153A, K238E                               504 (396)            514                14,500 (8,600)                    0.53
S65A       S65A                                             471                  504
S65C       S65C                                             479                  507
S65L       S65L                                             484                  510
Y66F       Y66F                                             360                  442
Y66W       Y66W                                             458                  480
10c        S65G, V68L, V72A, T203Y                          513                  527
W1B        F64L, S65T, Y66W, N146I, M153T, V163A, N212K     432 (453)            476 (503)
Emerald    S65T, S72A, N149K, M153T, I167T                  487                  508
Sapphire   S72A, Y145F, T203I                               395                  511

Claims

1. A recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein said first and second polypeptides are found in nature as monomers of a multimeric protein and said first and second polypeptides are not fluorescent donor and acceptor to each other, and wherein said recombinant fusion polypeptide is fluorescent when excited.
2. The recombinant fusion polypeptide of claim 1 wherein said first polypeptide and said second polypeptide are peptide bonded to each other via a linker sequence.
3. The recombinant fusion polypeptide of claim 2 wherein said linker sequence is from 5 to 50 amino acids long.
4. The recombinant fusion polypeptide of claim 1, further comprising a third polypeptide peptide bonded to said recombinant fusion polypeptide.
5. The recombinant fusion polypeptide of claim 4 wherein said third polypeptide is a member of a specific binding pair.
6. The recombinant fusion polypeptide of claim 4 wherein said third polypeptide is fused to the amino terminus of said first polypeptide.
7. The recombinant fusion polypeptide of claim 4 wherein said third polypeptide is fused to the carboxy terminus of said second polypeptide sequence.
8. The recombinant fusion polypeptide of claim 1 wherein each of said first and said second polypeptide, independently, is a monomer of a multimeric protein selected from the group consisting of: R. reniformis GFP, R. mulleri GFP and A. victoria GFP.
9. The recombinant fusion polypeptide of claim 1 wherein both of said first and second polypeptides are monomers of a single multimeric protein selected from the group consisting of R. reniformis GFP, R. mulleri GFP and A. victoria GFP.
10. A polynucleotide encoding a recombinant fusion polypeptide comprising a first polypeptide peptide bonded to a second polypeptide, wherein said first and second polypeptides are found in nature as monomers of a multimeric protein, and wherein said recombinant fusion polypeptide is fluorescent when excited.
11. The polynucleotide of claim 10 wherein said first polypeptide and said second polypeptide are peptide bonded to each other via a linker sequence.
12. The polynucleotide of claim 11 wherein said linker sequence is from 5 to 50 amino acids long.
13. The polynucleotide of claim 10, wherein said polynucleotide further encodes a third polypeptide peptide bonded to said recombinant fusion polypeptide.
14. The polynucleotide of claim 13 wherein said third polypeptide is a member of a specific binding pair.
15. The polynucleotide of claim 13 wherein said third polypeptide is fused to the amino terminus of said first polypeptide.
16. The polynucleotide of claim 13 wherein said third polypeptide is fused to the carboxy terminus of said second polypeptide.
17. The polynucleotide of claim 10 wherein each of said first and said second polypeptide, independently, is a monomer of a multimeric protein selected from the group consisting of R. reniformis GFP, R. mulleri GFP, and A. victoria GFP.
18. The polynucleotide of claim 10 wherein both of said first and second polypeptides are monomers of a single multimeric protein selected from the group consisting of R. reniformis GFP, R. mulleri GFP, and A. victoria GFP.
19. A vector comprising the polynucleotide of claim 10.
20. A cell comprising the vector of claim 19.
21. The cell of claim 20, said cell being a bacterial cell.
22. The cell of claim 20, said cell being a eukaryotic cell.
23. The eukaryotic cell of claim 22, wherein said cell is a yeast cell, an insect cell, or a mammalian cell.
24. A pair of polypeptides comprising a polypeptide labeled with a fluorescent dye and a recombinant fusion polypeptide of claim 1 wherein said fluorescent dye and said recombinant fusion polypeptide are fluorescent donor and acceptor to each other.
25. A pair of recombinant fusion polypeptides comprising a first fusion polypeptide as claimed in claim 1 and a second fusion polypeptide as claimed in claim 1 wherein said first fusion polypeptide and said second fusion polypeptide are fluorescent donor and acceptor to each other.
26. The pair of recombinant fusion polypeptides of claim 25 wherein each of said first and second fusion polypeptides further comprises a third polypeptide, and wherein said third polypeptide of said first fusion polypeptide comprises a sequence which is different from said third polypeptide of said second fusion polypeptide.
27. A method of producing a fluorescently labeled recombinant fusion polypeptide, said method comprising the steps of introducing a polynucleotide of claim 10 to a cell, and culturing said cell under conditions that permit the synthesis of said recombinant fusion polypeptide, whereby said recombinant fusion polypeptide is produced.
28. A method of labeling a cell with a fluorescent recombinant fusion polypeptide, said method comprising the steps of:
(a) introducing a polynucleotide of claim 10 to a cell; and
(b) culturing said cell under conditions that permit the synthesis of said recombinant fusion polypeptide, whereby said cell is labeled with said fluorescent recombinant fusion polypeptide.
29. The method of claim 26 wherein, in said introducing step (a), said polynucleotide introduced to said cell further comprises a sequence encoding a third polypeptide fused in frame to the sequence encoding said recombinant fusion polypeptide.
30. A method of monitoring the interaction of two polypeptides of interest, said method comprising the steps of:
(a) contacting a first polypeptide and a second polypeptide wherein:
(i) said first polypeptide is a recombinant fusion polypeptide of claim 4 wherein said third polypeptide is a first polypeptide of interest;
(ii) said second polypeptide comprises a second polypeptide of interest and is fluorescently labeled; and
(iii) the fluorophores comprised by said first and second polypeptides are fluorescent donor and fluorescent acceptor to each other;
(b) exciting said donor fluorophore; and
(c) detecting fluorescent emission from said fluorescent acceptor, wherein said emission is indicative of the interaction of said first and said second polypeptides of interest.
31. The method of claim 30 wherein said second polypeptide comprises a fusion polypeptide of claim 5, wherein said third polypeptide of said second fusion polypeptide is different from said third polypeptide of said first fusion polypeptide.
32. The method of claim 30 wherein said contacting step is performed in vitro.
33. The method of claim 30 wherein said contacting step is performed in a cell.
34. The method of claim 33 wherein said contacting comprises the step of introducing nucleic acid encoding said first and said second polypeptides to a cell.
35. A method of screening for a compound that modulates the interaction of a first and a second member of a specific binding pair, said method comprising the steps of:
(a) contacting a first polypeptide and a second polypeptide in the presence and absence of a candidate modulator wherein:
(i) said first polypeptide is a recombinant fusion polypeptide of claim 5, wherein said member of a specific binding pair is said first member of a specific binding pair;
(ii) said second polypeptide is fluorescently labeled and comprises said second member of a specific binding pair; and
(iii) the fluorophores comprised by said first and second polypeptides are fluorescent donor and acceptor to each other;
(b) exciting said donor fluorophore; and
(c) detecting the fluorescence of said acceptor fluorophore, wherein emission of the spectrum characteristic of said fluorescent acceptor indicates the interaction of said first and said second members of said specific binding pair, and wherein a change in said interaction in the presence of said candidate modulator indicates that said candidate modulator modulates the interaction of the members of said specific binding pair.
36. The method of claim 35 wherein said second polypeptide is a recombinant fusion polypeptide of claim 5 and said member of a specific binding pair comprised by said second polypeptide is said second member of a specific binding pair.
PCT/US2001/048690 2000-12-15 2001-12-13 Dimeric fluorescent polypeptides WO2002048174A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU2002230920A AU2002230920A1 (en) 2000-12-15 2001-12-13 Dimeric fluorescent polypeptides
EP01991178A EP1349867A4 (en) 2000-12-15 2001-12-13 Dimeric fluorescent polypeptides
CA002432782A CA2432782A1 (en) 2000-12-15 2001-12-13 Dimeric fluorescent polypeptides

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US25612100P 2000-12-15 2000-12-15
US60/256,121 2000-12-15

Publications (3)

Publication Number Publication Date
WO2002048174A2 WO2002048174A2 (en) 2002-06-20
WO2002048174A9 true WO2002048174A9 (en) 2003-04-24
WO2002048174A3 WO2002048174A3 (en) 2003-08-14

Family

ID=22971170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/048690 WO2002048174A2 (en) 2000-12-15 2001-12-13 Dimeric fluorescent polypeptides

Country Status (5)

Country Link
US (2) US6936428B2 (en)
EP (1) EP1349867A4 (en)
AU (1) AU2002230920A1 (en)
CA (1) CA2432782A1 (en)
WO (1) WO2002048174A2 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6803188B1 (en) * 1996-01-31 2004-10-12 The Regents Of The University Of California Tandem fluorescent protein constructs
US6376257B1 (en) * 1997-04-24 2002-04-23 University Of Rochester Detection by fret changes of ligand binding by GFP fusion proteins
EP1064360B1 (en) 1998-03-27 2008-03-05 Prolume, Ltd. Luciferases, gfp fluorescent proteins, their nucleic acids and the use thereof in diagnostics
US5985577A (en) * 1998-10-14 1999-11-16 The Trustees Of Columbia University In The City Of New York Protein conjugates containing multimers of green fluorescent protein

Also Published As

Publication number Publication date
AU2002230920A1 (en) 2002-06-24
EP1349867A4 (en) 2004-10-27
WO2002048174A2 (en) 2002-06-20
CA2432782A1 (en) 2002-06-20
US6936428B2 (en) 2005-08-30
US20030108566A1 (en) 2003-06-12
US20060068451A1 (en) 2006-03-30
WO2002048174A3 (en) 2003-08-14
EP1349867A2 (en) 2003-10-08

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
COP Corrected version of pamphlet

Free format text: PAGES 1/5-5/5, DRAWINGS, REPLACED BY NEW PAGES 1/6-6/6; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

WWE Wipo information: entry into national phase

Ref document number: 2002230920

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 2432782

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2001991178

Country of ref document: EP

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWP Wipo information: published in national office

Ref document number: 2001991178

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP