WO2014078652A1 - Heavy atom labeled nucleosides, nucleotides, and nucleic acid polymers, and uses thereof - Google Patents

Heavy atom labeled nucleosides, nucleotides, and nucleic acid polymers, and uses thereof

Info

Publication number
WO2014078652A1
WO2014078652A1 PCT/US2013/070299 US2013070299W WO2014078652A1 WO 2014078652 A1 WO2014078652 A1 WO 2014078652A1 US 2013070299 W US2013070299 W US 2013070299W WO 2014078652 A1 WO2014078652 A1 WO 2014078652A1
Authority
WO
WIPO (PCT)
Prior art keywords
substituted
unsubstituted
instance
certain embodiments
atom
Prior art date
Application number
PCT/US2013/070299
Other languages
French (fr)
Inventor
Suhaib M. Siddiqi
William Roy Glover
Lawrence Scipioni
Katelyn Marie Murtagh
Mallikarjuna Reddy Putta
Original Assignee
Zs Genetics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zs Genetics, Inc.
Publication of WO2014078652A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04 Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/06 Pyrimidine radicals
    • C07H19/073 Pyrimidine radicals with 2-deoxyribosyl as the saccharide radical
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04 Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/06 Pyrimidine radicals
    • C07H19/10 Pyrimidine radicals with the saccharide radical esterified by phosphoric or polyphosphoric acids
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04 Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/16 Purine radicals
    • C07H19/173 Purine radicals with 2-deoxyribosyl as the saccharide radical
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02 Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04 Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/16 Purine radicals
    • C07H19/20 Purine radicals with the saccharide radical esterified by phosphoric or polyphosphoric acids
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07H SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00 Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C07H21/04 Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with deoxyribosyl as saccharide radical
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing

Definitions

  • ADF-STEM was the method Crewe and co-workers originally chose to image single heavy atoms, in anticipation that the method might be used for sequencing DNA (Crewe, 1970; Crewe et al., 1970). Recent STEM improvements now allow atomic-level and single-atom imaging studies (Batson et al., 2002; Voyles et al., 2002; Jia et al., 2003).
  • a very small electron beam is raster-scanned across the sample. Most of the electrons pass through the sample with only subtle changes of energy, direction, and/or phase. However, some electrons scatter at a high angle.
  • the present disclosure provides compositions and methods to sequence nucleic acid molecules including improving sequencing read length by directly visualizing DNA as long, intact molecules using electron microscopy, such as high-resolution scanning transmission electron microscopy (STEM).
  • STEM high-resolution scanning transmission electron microscopy
  • template-directed polymerase enzymes are used to incorporate heavy-atom labeled bases directly into a long DNA molecule.
  • ADF-STEM annular dark-field imaging
  • the methods disclosed also simplify the challenge of making the labeling reactions sequence-specific because polymerase reactions are intrinsically sequence specific.
  • inventive heavy-atom labeled compounds optionally for use in the inventive methods as described herein.
  • exemplary heavy-atom labeled compounds include compounds of Formula (I):
  • nucleic acid polymers comprising one or more heavy-atom labeled units of Formula (II'):
  • each instance of G is independently -O-, -S-, -Se-, -CH2-, or -NH-; each instance of G^2 is independently hydrogen, halogen, -OR^A, -SR^A, -N(R^A)2, -SHg, -SO2SHg, -SHgR^D, -SeR^D or -TeR^D;
  • each instance of R^A is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two R^A groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
  • each instance of M^1 is independently -O-, -S-, -NH-, -Se-, or -C(R^M)2-, wherein each instance of R^M is independently hydrogen or halogen;
  • each instance of G^3 is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
  • each instance of M is independently -O-, -S-, or -Se-;
  • each instance of R^1, R^2, R^4, and R^5 is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -OR^B, or -SR^B, wherein each instance of R^B is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substitute
  • each instance of R is independently substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or
  • R^C is hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom;
  • each instance of L^1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20 alkylene, substituted or unsubstituted C2-20 alkenylene, substituted or unsubstituted C2-20 alkynylene, substituted or unsubstituted heteroC1-20 alkylene, substituted or unsubstituted heteroC2-20 alkenylene, substituted or unsubstituted heteroC2-20 alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
  • each instance of R^D is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
  • each instance of M^3 and M^4 is independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2;
  • n is 1 to 200,000, inclusive
  • the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
  • the invention provides compositions of heavy atom labeled nucleic acids, as well as systems and methods of identifying, sequencing and/or detecting nucleic acid polymers, as well as related components (e.g., substrates, software and the like).
  • methods of determining the sequence of a nucleic acid polymer labeled with heavy atoms are provided. The methods include forming a complementary strand of the nucleic acid polymer comprising one or more heavy-atom labeled compounds as described herein and identifying a sequence of nucleotides in the nucleic acid polymer and/or in the complementary strand using a particle beam.
  • the nucleic acid polymer and/or the complementary strand is DNA or RNA.
  • the nucleic acid polymer and/or its complementary strand is formed by a nucleic acid polymerase enzyme, such as using polymerase chain reaction (PCR).
  • the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels comprising one or more heavy-atom labeled compounds as described herein.
  • the labels are specific for each type of nucleotide.
  • at least two types of nucleotides are labeled with the same type of heavy-atom label.
  • one type of nucleotide is labeled, two types of nucleotides are labeled, three types of nucleotides are labeled, or four types of nucleotides are labeled.
  • substantially all of the nucleotides of the nucleic acid polymer and/or complementary strand are modified such that all nucleotides are labeled.
  • nucleotide specific labels are incorporated in the nucleic acid polymer and/or the complementary strand during formation of the nucleic acid polymer and/or the complementary strand.
  • the nucleic acid polymer and/or the complementary strand are affixed to a substrate, and prior to the step of identification the nucleotides of the nucleic acid polymer and/or its complementary strand are substantially removed from the substrate, leaving the labels of the labeled nucleotides affixed to the substrate.
  • the step of identifying a sequence of nucleotides includes generating a particle beam, exposing the nucleic acid polymer and/or the
  • the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels as described herein, and more preferably the step of identifying the nucleotides includes detecting characteristic changes to the particle beam due to the heavy atom label(s) on the nucleotides.
  • the particle beam is a lepton beam; more preferably the lepton beam is an electron beam.
  • the particle beam is an ion beam; more preferably a helium ion beam or a gallium ion beam.
  • the nucleic acid polymer and/or the complementary strand are affixed to a substrate.
  • the nucleic acid polymer and/or the complementary strand can be affixed to a substrate at one end of the nucleic acid polymer and/or the complementary strand, at both ends of the nucleic acid polymer and/or the complementary strand, and/or at a plurality of locations along the length of the nucleic acid polymer and/or the complementary strand.
  • the nucleic acid polymer and/or the complementary strand are substantially straightened prior to identifying the sequence.
  • the nucleic acid polymer and/or the complementary strand are straightened by fluid flow, and more preferably the fluid flow includes molecular combing.
  • the fluid can include one or more liquids, gases, phases or a combination thereof.
  • the nucleic acid polymer and/or the complementary strand are attached to a substrate and straightened by hybridization in the fluid flow to oligonucleotides that are attached to the substrate.
  • the step of identifying the nucleotides in the nucleic acid polymer and/or its complementary strand includes interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the nucleotides in the nucleic acid polymer and/or its complementary strand, whereby the sequence of the nucleic acid polymer is determined.
  • the nucleotides are labeled as described herein.
  • the changes in the particle beam include changes in absorbance, reflection, deflection, energy or direction.
  • the changes in the particle beam also can be changes in a spatial pattern, for example, a one dimensional pattern, a two dimensional pattern or a three dimensional pattern.
  • the method also includes attaching the complementary strand and/or the nucleic acid polymer to a substrate.
  • the attachment is by nucleic acid sequence-specific molecules, which preferably are oligonucleotides.
  • the substrate is derivatized to provide attachment points that are sequence non-specific.
  • the complementary strand and optionally the nucleic acid polymer can be attached to the substrate in a grid pattern.
  • the substrate includes a carbon thin film.
  • the step of identifying the sequence of nucleotides includes performing a plurality of scans of the nucleic acid polymer and/or the
  • nucleic acid polymer. Preferably at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more nucleotides are identified in each scan.
  • methods of determining the sequence of a nucleic acid polymer include synthesizing the nucleic acid polymer and/or its complementary strand using labeled ribonucleotide and/or deoxyribonucleotide triphosphates as described herein, and identifying labeled ribonucleotides and/or deoxyribonucleotides in the nucleic acid polymer and/or its complementary strand using a particle beam wherein the labeled ribonucleotides and/or deoxyribonucleotides, when incorporated in the nucleic acid polymer and/or its complementary strand are identifiable using the particle beam.
  • the nucleic acid polymer and/or the complementary strand is DNA or RNA.
  • the nucleic acid polymer and/or its complementary strand is synthesized by a nucleic acid polymerase enzyme, such as using polymerase chain reaction (PCR).
  • PCR polymerase chain reaction
  • the labels are specific for each type of nucleotide. However, in some embodiments, at least two types of nucleotides are labeled with the same type of heavy-atom label. In other embodiments, one type of nucleotide is labeled, two types of nucleotides are labeled, three types of nucleotides are labeled, or four types of nucleotides are labeled. In some embodiments, substantially all of the nucleotides of the nucleic acid polymer and/or complementary strand are modified such that all nucleotides are labeled.
  • the labels are incorporated in the ribonucleotide and/or deoxyribonucleotide triphosphates used in synthesis of the nucleic acid polymer and/or the complementary strand.
  • the step of identifying the labeled ribonucleotides and/or deoxyribonucleotides includes generating a particle beam, exposing the nucleic acid polymer and the complementary strand to the particle beam, and identifying the ribonucleotides and/or deoxyribonucleotides due to characteristic changes to the particle beam due to the heavy atom label(s) on the nucleotides.
  • the step of detecting the ribonucleotides and/or deoxyribonucleotides includes detecting characteristic changes to the particle beam.
  • the particle beam is a lepton beam; more preferably the lepton beam is an electron beam.
  • the particle beam is an ion beam; more preferably a helium ion beam or a gallium ion beam.
  • the nucleic acid polymer and/or the complementary strand are affixed to a substrate.
  • the ribonucleotides and/or deoxyribonucleotides of the nucleic acid polymer and/or its complementary strand are substantially removed from the substrate, leaving the labels of the labeled ribonucleotides and/or deoxyribonucleotides affixed to the substrate.
  • the nucleic acid polymer and/or the complementary strand can be affixed to a substrate at one end of the nucleic acid polymer and/or the complementary strand, at both ends of the nucleic acid polymer and/or the complementary strand, and/or at a plurality of locations along the length of the nucleic acid polymer and/or the complementary strand.
  • the nucleic acid polymer and/or the complementary strand are substantially straightened prior to identifying the labeled ribonucleotides and/or deoxyribonucleotides.
  • the nucleic acid polymer and/or the complementary strand are straightened by fluid flow, and more preferably the fluid flow includes molecular combing.
  • the fluid can include one or more liquids, gases, phases or a combination thereof.
  • the nucleic acid polymer and/or the complementary strand are attached to a substrate and straightened by hybridization in the fluid flow to oligonucleotides that are attached to the substrate.
  • the step of identifying the nucleotides in the nucleic acid polymer and/or its complementary strand includes interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the ribonucleotides and/or deoxyribonucleotides in the nucleic acid polymer and/or its complementary strand, whereby the sequence of the nucleic acid polymer is determined.
  • the nucleotides are labeled as described herein.
  • the changes in the particle beam include changes in absorbance, reflection, deflection, energy or direction.
  • the changes in the particle beam also can be changes in a spatial pattern, for example, a one dimensional pattern, a two dimensional pattern or a three dimensional pattern.
  • the method also includes attaching the complementary strand and/or the nucleic acid polymer to a substrate.
  • the attachment is by nucleic acid sequence-specific molecules, which preferably are oligonucleotides.
  • the substrate is derivatized to provide attachment points that are sequence non-specific.
  • the complementary strand and optionally the nucleic acid polymer can be attached to the substrate in a grid pattern.
  • the substrate includes a carbon thin film.
  • the step of identifying the sequence of nucleotides includes performing a plurality of scans of the nucleic acid polymer and/or the
  • nucleic acid polymer. Preferably at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more nucleotides are identified in each scan.
  • According to another aspect of the invention, methods of determining the sequence of a nucleic acid polymer are provided.
  • the methods include synthesizing a complementary strand of the nucleic acid polymer using labeled ribonucleotide triphosphates or deoxyribonucleotide triphosphates as described herein, attaching the nucleic acid polymer and/or the complementary strand to a substrate, substantially straightening the nucleic acid polymer and/or the complementary strand using molecular combing, generating a particle beam, exposing the nucleic acid polymer and the complementary strand to the particle beam through the complementary strand on the substrate, and interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the labeled nucleotides in the complementary strand, whereby the sequence of a nucleic acid polymer is determined.
  • methods of detecting the presence and/or identifying a nucleic acid polymer include forming a complementary strand of the nucleic acid polymer, attaching the complementary strand and, optionally, the nucleic acid polymer to a substrate, and detecting the presence and/or identifying the complementary strand and/or the nucleic acid polymer using a particle beam.
  • the step of identifying includes measuring the length or determining at least a partial sequence of the complementary strand and/or the nucleic acid polymer.
  • the nucleic acid polymer and/or its complementary strand is DNA or RNA.
  • the nucleic acid polymer and/or its complementary strand is formed by a nucleic acid polymerase enzyme, e.g., using polymerase chain reaction (PCR); preferably the nucleic acid polymerase enzyme is a DNA-dependent DNA polymerase, an RNA-dependent DNA polymerase or an RNA-dependent RNA polymerase.
  • the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels as described herein.
  • the labels are specific for each type of nucleotide.
  • at least two types of nucleotides are labeled with the same type of heavy-atom label.
  • one type of nucleotide is labeled, two types of nucleotides are labeled, three types of nucleotides are labeled, or four types of nucleotides are labeled.
  • substantially all of the nucleotides of the nucleic acid polymer and/or complementary strand are modified such that all nucleotides are labeled.
  • nucleotide specific labels are incorporated in the nucleic acid polymer and/or the complementary strand during formation of the nucleic acid polymer and/or the complementary strand.
  • the step of detecting the presence and/or identifying of the complementary strand and/or the nucleic acid polymer using a particle beam includes generating a particle beam, exposing the nucleic acid polymer and/or the complementary strand to the particle beam, and detecting the nucleotides of the complementary strand and/or the nucleic acid polymer due to characteristic changes to the particle beam.
  • the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels as described herein.
  • the step of detecting the ribonucleotides and/or deoxyribonucleotides includes detecting characteristic changes to the particle beam due to the heavy atom label(s) on the nucleotides.
  • the particle beam is a lepton beam; more preferably the lepton beam is an electron beam.
  • the particle beam is an ion beam; more preferably a helium ion beam or a gallium ion beam.
  • the nucleic acid polymer and/or the complementary strand are substantially straightened prior to identifying the sequence.
  • the nucleic acid polymer and/or the complementary strand are straightened by fluid flow, and more preferably the fluid flow includes molecular combing.
  • the fluid can include one or more liquids, gases, phases or a combination thereof.
  • the nucleic acid polymer and/or the complementary strand are attached to a substrate and straightened by hybridization in the fluid flow to oligonucleotides that are attached to the substrate.
  • the step of identifying the nucleotides in the nucleic acid polymer and/or its complementary strand includes interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the nucleotides in the nucleic acid polymer and/or its complementary strand, whereby the presence of the nucleic acid polymer is determined and/or the nucleic acid polymer is identified.
  • the nucleotides are labeled as described herein and the characteristic changes to the particle beam are due to the heavy atom label(s) on the nucleotides.
  • the changes in the particle beam include changes in absorbance, reflection, deflection, energy or direction.
  • the changes in the particle beam also can be changes in a spatial pattern, for example, a one dimensional pattern, a two dimensional pattern or a three dimensional pattern.
  • the method also includes attaching the complementary strand and/or the nucleic acid polymer to a substrate.
  • the attachment is by nucleic acid sequence-specific molecules, which preferably are oligonucleotides.
  • the substrate is derivatized to provide attachment points that are sequence non-specific.
  • the complementary strand and optionally the nucleic acid polymer can be attached to the substrate in a grid pattern.
  • the substrate includes a carbon thin film.
  • the method also includes quantifying the amount of the complementary strand and/or the nucleic acid polymer.
  • a device that includes a substrate that is substantially transparent to a particle beam, and nucleic acid polymer binding sites on a surface of the substrate.
  • the substrate is substantially transparent to an electron beam.
  • the substrate includes a carbon thin film.
  • the device also includes a support that is substantially transparent to a particle beam.
  • the substrate is less than 5 nm thick, more preferably less than 2 nm thick, still more preferably less than 1.5 nm thick, and yet more preferably less than 1.1 nm thick.
  • the nucleic acid polymer binding sites are formed at predetermined positions on the surface of the substrate, preferably in a grid pattern.
  • the nucleic acid polymer binding sites are sequence specific, preferably oligonucleotides. In other embodiments, the nucleic acid polymer binding sites are not sequence specific.
  • the device also includes one or more nucleic acid polymers affixed to the nucleic acid polymer binding sites.
  • the one or more nucleic acid polymers are modified to include labels.
  • methods for making a device include obtaining a substrate that is substantially transparent to a particle beam, and forming nucleic acid polymer binding sites on a surface of the substrate.
  • the substrate is substantially transparent to an electron beam.
  • the substrate includes a carbon thin film.
  • the nucleic acid polymer binding sites are formed at predetermined positions on the surface of the substrate, preferably in a grid pattern.
  • the method also includes attaching to the substrate a support that is substantially transparent to a particle beam.
  • the substrate is less than 5 nm thick, more preferably less than 2 nm thick, still more preferably less than 1.5 nm thick, and yet more preferably less than 1.1 nm thick.
  • the nucleic acid polymer binding sites are sequence specific, preferably oligonucleotides. In other embodiments, the nucleic acid polymer binding sites are not sequence specific.
  • the methods also include affixing one or more nucleic acid polymers to the nucleic acid polymer binding sites.
  • the one or more nucleic acid polymers are modified to include labels.
  • systems designed to detect the presence of, determine the sequence of and/or identify a nucleic acid polymer include: a sample chamber; a particle beam generator associated with the chamber; a sample comprising a labeled complementary strand of a nucleic acid polymer, wherein the sample, when positioned in the chamber, is exposed to a particle beam generated by the particle beam generator resulting in an interaction between the particle beam and the complementary strand; and a detector constructed and arranged to collect particle beam species after the interaction.
  • the system also includes a data analysis module operative to receive and analyze signals from the detector.
  • the data analysis module is operative to analyze signals related to absorbance, reflection, deflection, energy or direction.
  • the data analysis module is operative to use pattern recognition techniques to analyze the signals.
  • the system also includes a user interface operative to control a display of information received and/or generated by the data analysis module.
  • the particle beam generator is an electron beam generator.
  • the system in other embodiments also includes a feedback module designed to calibrate the system based on nucleic acid polymer data.
  • systems designed to detect the presence of, determine the sequence of and/or identify a nucleic acid polymer include: a sample chamber; a particle beam generator associated with the chamber; a detector constructed and arranged to collect particle beam species after interaction between the particle beam and a sample comprising the nucleic acid polymer and/or a complementary strand of the nucleic acid polymer; a data analysis module designed to analyze signals related to the particle beam species to determine information related to the nucleic acid polymer; and a feedback module designed to calibrate the system based on the information.
  • the sample includes a labeled complementary strand of a nucleic acid polymer.
  • the feedback module is designed to calibrate the system based on a base-base distance of the nucleic acid polymer. In other embodiments, the feedback module is designed to calibrate the system based on known geometries of the nucleic acid polymer.
  • the methods include acquiring data related to a nucleic acid polymer; and calibrating the instrument based on the data.
  • the data is related to a base-base distance of the nucleic acid polymer.
  • the calibrating includes calibrating the instrument based on known geometries of the nucleic acid polymer.
  • systems for detecting, sequencing and/or identifying a nucleic acid polymer based on particle beam species detected by a detector, the particle beam species resulting from exposure of a sample comprising a nucleic acid polymer and/or its complementary strand to a particle beam.
  • the systems include a data analysis module operative to receive one or more signals from the detector, the one or more signals representing the particle beam species, and to detect, sequence and/or identify the nucleic acid polymer and/or its complementary strand comprised in the sample based at least in part on the received one or more signals.
  • the nucleic acid polymer and/or its complementary strand is labeled.
  • the particle beam species has one or more of the following properties: absorbance, reflection, deflection, energy and direction, and the data analysis module is operative to analyze the one or more signals to determine values of the one or more properties.
  • the data analysis module is operative to access a data resource comprising nucleic acid polymer information, the data resource including a data structure having a plurality of entries, each entry specifying information about a respective nucleic acid polymer sequence.
  • the data analysis module is operative to partially sequence the nucleic acid polymer based on the one or more signals, the data analysis module further comprising: a combining module to combine the partial sequence with sequencing information of the nucleic acid polymer accessed from the data resource.
  • the data analysis module includes a comparison module operative to compare information determined from the one or more signals to the information specified by one or more of the data structure entries.
  • the comparison module is operative to use pattern recognition techniques to compare the information determined from the one or more signals to the information specified by the one or more data structure entries.
  • the data analysis module includes a user interface module to display information received and/or generated by the data analysis module to a user.
  • the particle beam to which the sample is exposed is generated by a particle beam generator.
  • the data analysis module includes a feedback module operative to provide one or more feedback signals to the particle beam generator and/or the detector, the one or more feedback signals specifying information determined at least in part from the one or more signals received from the detector.
  • the one or more feedback signals include information for calibrating the particle beam generator.
  • the feedback module is operative to generate the one or more feedback signals based at least in part on known geometries of the nucleic acid polymer.
  • the data analysis module preferably includes a storage module operative to store information received and/or generated by the data analysis module on a computer-readable medium.
  • the sample includes a plurality of molecules of a same nucleic acid polymer and/or its complementary strand, and a plurality of particle beam species results from exposure of the plurality of molecules of the sample to the particle beam, the one or more signals representing the plurality of particle beam species, wherein the data analysis module is operative to partially sequence the nucleic acid polymer based on a first of the plurality of molecules to produce a first partial sequence, and to partially sequence the nucleic acid polymer based on a second of the plurality of molecules to produce a second partial sequence, and wherein the data processing module further includes a combining module to combine the first and second partial sequences.
  • a computer-readable medium having computer-readable signals stored thereon that define instructions that, as a result of being executed by a computer, control the computer to perform a process of detecting, sequencing and/or identifying a nucleic acid polymer based on particle beam species detected by a detector, the particle beam species resulting from exposure of a sample comprising a nucleic acid polymer and/or its complementary strand to a particle beam.
  • the process includes: receiving one or more signals from the detector, the one or more signals representing the particle beam species; and detecting, sequencing and/or identifying the nucleic acid polymer and/or its complementary strand comprised in the sample based at least in part on the received one or more signals.
  • the nucleic acid polymer and/or its complementary strand is labeled.
  • the particle beam species has one or more of the following properties: absorbance, reflection, deflection, energy and direction, and the act of detecting, sequencing and/or identifying includes analyzing the one or more signals to determine values of the one or more properties.
  • the act of detecting, sequencing and/or identifying includes accessing a data resource comprising nucleic acid polymer information, the data resource including a data structure having a plurality of entries, each entry specifying information about a respective nucleic acid polymer sequence.
  • the act of detecting, sequencing and/or identifying includes partially sequencing the nucleic acid polymer based on the one or more signals to produce a partial sequence; accessing partial sequence information of the nucleic acid polymer from the data resource; and combining the partial sequence with the partial sequence information.
  • the act of detecting, sequencing and/or identifying includes comparing information determined from the one or more signals to the information specified by one or more of the entries.
  • the act of detecting, sequencing and/or identifying preferably includes using pattern recognition techniques to compare the information determined from the one or more signals to the information specified by the one or more entries.
  • the process further includes displaying information determined from the one or more received signals to a user.
  • the particle beam to which the sample is exposed is generated by a particle beam generator.
  • the process further includes providing one or more feedback signals to the particle beam generator and/or the detector, the one or more feedback signals specifying information determined at least in part from the one or more signals received from the detector.
  • the act of providing includes providing one or more feedback signals that include information for calibrating the particle beam generator.
  • the process further includes generating the one or more feedback signals based at least in part on known geometries of the nucleic acid polymer.
  • the process further includes storing information determined from the one or more signals on a computer-readable medium.
  • the sample includes a plurality of molecules of a same nucleic acid polymer and/or its complementary strand, and a plurality of particle beam species results from exposure of the plurality of molecules of the sample to the particle beam, the one or more signals representing the plurality of particle beam species, and the act of detecting, sequencing and/or identifying includes partially sequencing the nucleic acid polymer based on a first of the plurality of molecules to produce a first partial sequence; partially sequencing the nucleic acid polymer based on a second of the plurality of molecules to produce a second partial sequence; and combining the first and second partial sequences.
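  • The combining of partial sequences described above can be sketched as follows. This is a minimal illustration, assuming that each partial sequence is already aligned to the same coordinates, has the same length, and marks undetermined positions with the placeholder "N". The representation and the function name are hypothetical and are not specified in the disclosure.

```python
def combine_partial_sequences(first, second, unknown="N"):
    """Merge two aligned partial sequences position by position.

    A position is resolved when either input resolves it. Positions
    that both inputs resolve must agree; otherwise an error is raised.
    """
    if len(first) != len(second):
        raise ValueError("partial sequences must be aligned to the same length")
    combined = []
    for index, (a, b) in enumerate(zip(first, second)):
        if a == unknown:
            combined.append(b)  # may still be unknown if b is also unknown
        elif b == unknown or a == b:
            combined.append(a)
        else:
            raise ValueError(f"conflicting calls at position {index}: {a!r} vs {b!r}")
    return "".join(combined)


# One molecule resolves the T positions, another the A positions.
merged = combine_partial_sequences("TNNTN", "NANNA")
```

Here the first partial sequence contributes the thymine positions and the second contributes the adenine positions. Positions that neither molecule resolves remain marked as unknown.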
  • Figure 1 depicts a non-limiting example of heavy-atom labels detected within DNA molecules.
  • (A) Schematic showing heavy atoms deflecting a portion of the raster-scanned electron beam. Highly deflected electrons are detected on the annular dark-field (ADF) detector.
  • (B) Unlabeled DNA bases scatter fewer electrons than the heavy-atom-labeled bases.
  • Figure 2 depicts a non-limiting example of a heavy-atom-labeling strategy.
  • a single stranded template is primed with a complementary oligonucleotide primer.
  • the lengths of the primer and the template have been shortened.
  • the template directs the synthesis of a complementary strand.
  • Thymine deoxyribose nucleotide triphosphates in the primer extension reaction have been completely replaced with a heavy-atom-modified analog. Consequently, the resulting double-stranded DNA molecule is modified with heavy atoms on the thymine bases of the synthetic strand. These heavy atoms provide signal to the dark-field detector of a STEM system.
  • Figure 3 depicts DNA alignment of a prepared and linearized DNA molecule on a thin amorphous carbon substrate.
  • (A) Bright-field TEM image of multiple DNA molecules linearized on an amorphous carbon surface.
  • (B) Dark-field STEM image of a linearized DNA molecule on a thin amorphous carbon substrate.
  • Figure 4 depicts heavy-atom locations and contrast distribution.
  • Figure 5 depicts a schematic of a repeating "test pattern" molecule. Heavy atoms are attached to thymine/uridine bases of one strand of double-stranded DNA molecules. The labels nearest one another are separated by one unlabeled base pair; the theoretical pitch between the heavy atoms is 0.7 to 1.2 nm. These doublets repeat every 12 base pairs, for a theoretical pitch of 4.1 to 7.3 nm. Actual spacing of both patterns depends on local stretching, predicted to be 0% to 80%.
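  • The pitch ranges quoted for Figure 5 can be reproduced with a short calculation. The sketch below assumes a rise of 0.34 nm per base pair for unstretched B-form DNA, a standard textbook value that is not stated in the text above, and treats the two labels of a doublet as two base-pair steps apart. The function name is illustrative.

```python
# Assumed rise per base pair for unstretched B-form DNA (not from the source).
RISE_NM = 0.34


def pitch_range_nm(bp_steps, max_stretch=0.80, rise_nm=RISE_NM):
    """Return (relaxed, fully stretched) label spacing in nanometres.

    bp_steps    -- number of base-pair steps between two labels
    max_stretch -- largest fractional stretching considered (0.80 = 80%)
    """
    relaxed = bp_steps * rise_nm
    return relaxed, relaxed * (1.0 + max_stretch)


# Labels separated by one unlabeled base pair are two steps apart.
doublet = pitch_range_nm(2)
# The doublets repeat every 12 base pairs.
repeat = pitch_range_nm(12)

print([round(value, 1) for value in doublet])  # [0.7, 1.2]
print([round(value, 1) for value in repeat])   # [4.1, 7.3]
```

Under these assumptions the rounded values match the 0.7 to 1.2 nm and 4.1 to 7.3 nm ranges given for 0% to 80% stretching.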
  • Figure 6 depicts sequence data from a repeating "test pattern" molecule.
  • (A) Partial sequence of a DNA molecule. Yellow lines (starred, *) show heavy atoms in predicted large-scale test pattern positions, where distances to neighbors in both directions match the large-scale test pattern. White circles show pairs of atoms matching the small-scale pattern. Red lines (indicated by arrows) show atoms of the large-scale pattern in positions predicted by spacing with one rather than two neighbors.
  • Compounds described herein can comprise one or more asymmetric centers, and thus can exist in various stereoisomeric forms, e.g., enantiomers and/or diastereomers.
  • the compounds described herein can be in the form of an individual enantiomer, diastereomer or geometric isomer, or can be in the form of a mixture of stereoisomers, including racemic mixtures and mixtures enriched in one or more stereoisomers.
  • Isomers can be isolated from mixtures by methods known to those skilled in the art, including chiral high pressure liquid chromatography (HPLC) and the formation and crystallization of chiral salts; or preferred isomers can be prepared by asymmetric syntheses.
  • alkyl refers to a radical of a straight-chain or branched saturated hydrocarbon group having from 1 to 20 carbon atoms ("C1-20 alkyl"). In some embodiments, an alkyl group has 1 to 10 carbon atoms ("C1-10 alkyl"). In some embodiments, an alkyl group has 1 to 9 carbon atoms ("C1-9 alkyl"). In some embodiments, an alkyl group has 1 to 8 carbon atoms ("C1-8 alkyl"). In some embodiments, an alkyl group has 1 to 7 carbon atoms ("C1-7 alkyl"). In some embodiments, an alkyl group has 1 to 6 carbon atoms ("C1-6 alkyl").
  • an alkyl group has 1 to 5 carbon atoms ("C1-5 alkyl"). In some embodiments, an alkyl group has 1 to 4 carbon atoms ("C1-4 alkyl"). In some embodiments, an alkyl group has 1 to 3 carbon atoms ("C1-3 alkyl"). In some embodiments, an alkyl group has 1 to 2 carbon atoms ("C1-2 alkyl"). In some embodiments, an alkyl group has 1 carbon atom ("C1 alkyl"). In some embodiments, an alkyl group has 2 to 6 carbon atoms ("C2-6 alkyl").
  • Examples of C1-6 alkyl groups include methyl (C1), ethyl (C2), n-propyl (C3), isopropyl (C3), n-butyl (C4), tert-butyl (C4), sec-butyl (C4), iso-butyl (C4), n-pentyl (C5), 3-pentanyl (C5), amyl (C5), neopentyl (C5), 3-methyl-2-butanyl (C5), tertiary amyl (C5), and n-hexyl (C6).
  • Additional examples of alkyl groups include n-heptyl (C7), n-octyl (C8) and the like. Unless otherwise specified, each instance of an alkyl group is independently unsubstituted (an "unsubstituted alkyl") or substituted (a "substituted alkyl") with one or more substituents.
  • the alkyl group is an unsubstituted C1-20 alkyl (e.g., -CH3). In certain embodiments, the alkyl group is a substituted C1-20 alkyl.
  • haloalkyl is an alkyl group as defined herein wherein one or more of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo.
  • Perhaloalkyl is a subset of haloalkyl, and refers to an alkyl group wherein all of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo.
  • the haloalkyl moiety has 1 to 20 carbon atoms ("C1-20 haloalkyl").
  • the haloalkyl moiety has 1 to 10 carbon atoms ("C1-10 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 8 carbon atoms ("C1-8 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 6 carbon atoms ("C1-6 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 4 carbon atoms ("C1-4 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 3 carbon atoms ("C1-3 haloalkyl").
  • the haloalkyl moiety has 1 to 2 carbon atoms ("C1-2 haloalkyl").
  • all of the haloalkyl hydrogen atoms are replaced with fluoro to provide a "perfluoroalkyl" group.
  • all of the haloalkyl hydrogen atoms are replaced with chloro to provide a "perchloroalkyl" group.
  • Examples of haloalkyl groups include -CF3, -CF2CF3, -CF2CF2CF3, -CCl3, -CFCl2, -CF2Cl, and the like.
  • Haloalkenyl, haloalkynyl, halocarbocyclyl, haloheterocyclyl, haloaryl, and haloheteroaryl follow the definition of haloalkyl, and refer to an alkenyl, alkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl group, as defined herein, wherein one or more of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo.
  • Perhaloalkenyl, perhaloalkynyl, perhalocarbocyclyl, perhaloheterocyclyl, perhaloaryl, and perhaloheteroaryl follow the definition of perhaloalkyl, and refer to an alkenyl, alkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl group, as defined herein, wherein all of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo.
  • heteroalkyl refers to an alkyl group as defined herein which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain.
  • a heteroalkyl group refers to a saturated group having from 1 to 10 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-10 alkyl").
  • a heteroalkyl group is a saturated group having 1 to 9 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-9 alkyl").
  • a heteroalkyl group is a saturated group having 1 to 8 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-8 alkyl").
  • a heteroalkyl group is a saturated group having 1 to 7 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-7 alkyl").
  • a heteroalkyl group is a saturated group having 1 to 6 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-6 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 5 carbon atoms and 1 or 2 heteroatoms within the parent chain ("heteroC1-5 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 4 carbon atoms and 1 or 2 heteroatoms within the parent chain ("heteroC1-4 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 3 carbon atoms and 1 heteroatom within the parent chain ("heteroC1-3 alkyl").
  • a heteroalkyl group is a saturated group having 1 to 2 carbon atoms and 1 heteroatom within the parent chain ("heteroC1-2 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 carbon atom and 1 heteroatom ("heteroC1 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 2 to 6 carbon atoms and 1 or 2 heteroatoms within the parent chain ("heteroC2-6 alkyl").
  • each instance of a heteroalkyl group is independently unsubstituted (an "unsubstituted heteroalkyl") or substituted (a "substituted heteroalkyl") with one or more substituents.
  • the heteroalkyl group is an unsubstituted heteroC1-20 alkyl.
  • the heteroalkyl group is a substituted heteroC1-20 alkyl.
  • alkenyl refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms and one or more carbon-carbon double bonds (e.g., 1, 2, 3, or 4 double bonds).
  • an alkenyl group has 2 to 10 carbon atoms ("C2-10 alkenyl").
  • an alkenyl group has 2 to 9 carbon atoms ("C2-9 alkenyl").
  • an alkenyl group has 2 to 8 carbon atoms ("C2-8 alkenyl").
  • an alkenyl group has 2 to 7 carbon atoms ("C2-7 alkenyl").
  • an alkenyl group has 2 to 6 carbon atoms ("C2-6 alkenyl"). In some embodiments, an alkenyl group has 2 to 5 carbon atoms ("C2-5 alkenyl"). In some embodiments, an alkenyl group has 2 to 4 carbon atoms ("C2-4 alkenyl"). In some embodiments, an alkenyl group has 2 to 3 carbon atoms ("C2-3 alkenyl"). In some embodiments, an alkenyl group has 2 carbon atoms ("C2 alkenyl"). The one or more carbon-carbon double bonds can be internal (such as in 2-butenyl) or terminal (such as in 1-butenyl).
  • Examples of C2-4 alkenyl groups include ethenyl (C2), 1-propenyl (C3), 2-propenyl (C3), 1-butenyl (C4), 2-butenyl (C4), butadienyl (C4), and the like.
  • Examples of C2-6 alkenyl groups include the aforementioned C2-4 alkenyl groups as well as pentenyl (C5), pentadienyl (C5), hexenyl (C6), and the like. Additional examples of alkenyl include heptenyl (C7), octenyl (C8), octatrienyl (C8), and the like.
  • each instance of an alkenyl group is independently unsubstituted (an "unsubstituted alkenyl") or substituted (a "substituted alkenyl") with one or more substituents.
  • the alkenyl group is an unsubstituted C2-20 alkenyl.
  • the alkenyl group is a substituted C2-20 alkenyl.
  • heteroalkenyl refers to an alkenyl group as defined herein which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain.
  • a heteroalkenyl group refers to a group having from 2 to 20 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-20 alkenyl").
  • a heteroalkenyl group refers to a group having from 2 to 10 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-10 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 9 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-9 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 8 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-8 alkenyl").
  • a heteroalkenyl group has 2 to 7 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-7 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 6 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-6 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 5 carbon atoms, at least one double bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-5 alkenyl").
  • a heteroalkenyl group has 2 to 4 carbon atoms, at least one double bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-4 alkenyl").
  • a heteroalkenyl group has 2 to 3 carbon atoms, at least one double bond, and 1 heteroatom within the parent chain ("heteroC2-3 alkenyl").
  • a heteroalkenyl group has 2 to 6 carbon atoms, at least one double bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-6 alkenyl"). Unless otherwise specified, each instance of a heteroalkenyl group is independently unsubstituted (an "unsubstituted heteroalkenyl") or substituted (a "substituted heteroalkenyl") with one or more substituents.
  • the heteroalkenyl group is an unsubstituted heteroC2-20 alkenyl. In certain embodiments, the heteroalkenyl group is a substituted heteroC2-20 alkenyl.
  • alkynyl refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms and one or more carbon-carbon triple bonds (e.g., 1, 2, 3, or 4 triple bonds) ("C2-20 alkynyl").
  • an alkynyl group has 2 to 10 carbon atoms ("C2-10 alkynyl").
  • an alkynyl group has 2 to 9 carbon atoms ("C2-9 alkynyl").
  • an alkynyl group has 2 to 8 carbon atoms ("C2-8 alkynyl").
  • an alkynyl group has 2 to 7 carbon atoms ("C2-7 alkynyl"). In some embodiments, an alkynyl group has 2 to 6 carbon atoms ("C2-6 alkynyl"). In some embodiments, an alkynyl group has 2 to 5 carbon atoms ("C2-5 alkynyl"). In some embodiments, an alkynyl group has 2 to 4 carbon atoms ("C2-4 alkynyl"). In some embodiments, an alkynyl group has 2 to 3 carbon atoms ("C2-3 alkynyl"). In some embodiments, an alkynyl group has 2 carbon atoms ("C2 alkynyl").
  • the one or more carbon-carbon triple bonds can be internal (such as in 2-butynyl) or terminal (such as in 1-butynyl).
  • Examples of C2-4 alkynyl groups include, without limitation, ethynyl (C2), 1-propynyl (C3), 2-propynyl (C3), 1-butynyl (C4), 2-butynyl (C4), and the like.
  • Examples of C2-6 alkynyl groups include the aforementioned C2-4 alkynyl groups as well as pentynyl (C5), hexynyl (C6), and the like.
  • Additional examples of alkynyl include heptynyl (C7), octynyl (C8), and the like. Unless otherwise specified, each instance of an alkynyl group is independently unsubstituted (an "unsubstituted alkynyl") or substituted (a "substituted alkynyl") with one or more substituents. In certain embodiments, the alkynyl group is an unsubstituted C2-20 alkynyl. In certain embodiments, the alkynyl group is a substituted C2-20 alkynyl.
  • heteroalkynyl refers to an alkynyl group as defined herein which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain.
  • a heteroalkynyl group refers to a group having from 2 to 20 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-20 alkynyl").
  • a heteroalkynyl group refers to a group having from 2 to 10 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-10 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 9 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-9 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 8 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-8 alkynyl").
  • a heteroalkynyl group has 2 to 7 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-7 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 6 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-6 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 5 carbon atoms, at least one triple bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-5 alkynyl").
  • a heteroalkynyl group has 2 to 4 carbon atoms, at least one triple bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-4 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 3 carbon atoms, at least one triple bond, and 1 heteroatom within the parent chain ("heteroC2-3 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 6 carbon atoms, at least one triple bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-6 alkynyl").
  • each instance of a heteroalkynyl group is independently unsubstituted (an "unsubstituted heteroalkynyl") or substituted (a "substituted heteroalkynyl") with one or more substituents.
  • the heteroalkynyl group is an unsubstituted heteroC2-20 alkynyl.
  • the heteroalkynyl group is a substituted heteroC2-20 alkynyl.
  • Carbocyclyl or "carbocyclic" refers to a radical of a non-aromatic cyclic hydrocarbon group having from 3 to 10 ring carbon atoms ("C3-10 carbocyclyl") and zero heteroatoms in the non-aromatic ring system.
  • a carbocyclyl group has 3 to 8 ring carbon atoms ("C3-8 carbocyclyl").
  • a carbocyclyl group has 3 to 7 ring carbon atoms ("C3-7 carbocyclyl"). In some embodiments, a carbocyclyl group has 3 to 6 ring carbon atoms ("C3-6 carbocyclyl"). In some embodiments, a carbocyclyl group has 4 to 6 ring carbon atoms ("C4-6 carbocyclyl"). In some embodiments, a carbocyclyl group has 5 to 6 ring carbon atoms ("C5-6 carbocyclyl"). In some embodiments, a carbocyclyl group has 5 to 10 ring carbon atoms ("C5-10 carbocyclyl"). Exemplary C3-6 carbocyclyl groups include, without limitation, cyclopropyl (C3), cyclopropenyl (C3), cyclobutyl (C4), cyclobutenyl (C4), cyclopentyl (C5), cyclopentenyl (C5), cyclohexyl (C6), cyclohexenyl (C6), cyclohexadienyl (C6), and the like.
  • Exemplary C3-8 carbocyclyl groups include, without limitation, the aforementioned C3-6 carbocyclyl groups as well as cycloheptyl (C7), cycloheptenyl (C7), cycloheptadienyl (C7), cycloheptatrienyl (C7), cyclooctyl (C8), cyclooctenyl (C8), bicyclo[2.2.1]heptanyl (C7), bicyclo[2.2.2]octanyl (C8), and the like.
  • Exemplary C3-10 carbocyclyl groups include, without limitation, the aforementioned C3-8 carbocyclyl groups as well as cyclononyl (C9), cyclononenyl (C9), cyclodecyl (C10), cyclodecenyl (C10), octahydro-1H-indenyl (C9), decahydronaphthalenyl (C10), spiro[4.5]decanyl (C10), and the like.
  • the carbocyclyl group is either monocyclic (“monocyclic carbocyclyl”) or polycyclic (e.g., containing a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic carbocyclyl”) or tricyclic system (“tricyclic carbocyclyl”)) and can be saturated or can contain one or more carbon-carbon double or triple bonds.
  • Carbocyclyl also includes ring systems wherein the carbocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups wherein the point of attachment is on the carbocyclyl ring, and in such instances, the number of carbons continues to designate the number of carbons in the carbocyclic ring system.
  • each instance of a carbocyclyl group is independently unsubstituted (an "unsubstituted carbocyclyl") or substituted (a "substituted carbocyclyl”) with one or more substituents.
  • the carbocyclyl group is an unsubstituted C3-10 carbocyclyl.
  • the carbocyclyl group is a substituted C3-10 carbocyclyl.
  • "carbocyclyl" is a monocyclic, saturated carbocyclyl group having from 3 to 10 ring carbon atoms ("C3-10 cycloalkyl"). In some embodiments, a cycloalkyl group has 3 to 8 ring carbon atoms ("C3-8 cycloalkyl"). In some embodiments, a cycloalkyl group has 3 to 6 ring carbon atoms ("C3-6 cycloalkyl"). In some embodiments, a cycloalkyl group has 4 to 6 ring carbon atoms ("C4-6 cycloalkyl").
  • a cycloalkyl group has 5 to 6 ring carbon atoms ("C5-6 cycloalkyl"). In some embodiments, a cycloalkyl group has 5 to 10 ring carbon atoms ("C5-10 cycloalkyl"). Examples of C5-6 cycloalkyl groups include cyclopentyl (C5) and cyclohexyl (C6). Examples of C3-6 cycloalkyl groups include the aforementioned C5-6 cycloalkyl groups as well as cyclopropyl (C3) and cyclobutyl (C4).
  • Examples of C3-8 cycloalkyl groups include the aforementioned C3-6 cycloalkyl groups as well as cycloheptyl (C7) and cyclooctyl (C8). Unless otherwise specified, each instance of a cycloalkyl group is independently unsubstituted (an "unsubstituted cycloalkyl") or substituted (a "substituted cycloalkyl") with one or more substituents.
  • the cycloalkyl group is an unsubstituted C3-10 cycloalkyl. In certain embodiments, the cycloalkyl group is a substituted C3-10 cycloalkyl.
  • heterocyclyl or “heterocyclic” refers to a radical of a 3- to 14- membered non-aromatic ring system having ring carbon atoms and 1 to 4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("3-14 membered heterocyclyl").
  • the point of attachment can be a carbon or nitrogen atom, as valency permits.
  • a heterocyclyl group can either be monocyclic ("monocyclic heterocyclyl") or polycyclic (e.g., a fused, bridged or spiro ring system such as a bicyclic system ("bicyclic heterocyclyl") or tricyclic system ("tricyclic heterocyclyl")), and can be saturated or can contain one or more carbon-carbon double or triple bonds.
  • Heterocyclyl polycyclic ring systems can include one or more heteroatoms in one or both rings.
  • Heterocyclyl also includes ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more carbocyclyl groups wherein the point of attachment is either on the carbocyclyl or heterocyclyl ring, or ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups, wherein the point of attachment is on the heterocyclyl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heterocyclyl ring system.
  • each instance of heterocyclyl is independently unsubstituted (an "unsubstituted heterocyclyl") or substituted (a "substituted heterocyclyl") with one or more substituents.
  • the heterocyclyl group is an unsubstituted 3-14 membered heterocyclyl. In certain embodiments, the heterocyclyl group is a substituted 3-14 membered heterocyclyl.
  • a heterocyclyl group is a 5-10 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-10 membered heterocyclyl").
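  • a heterocyclyl group is a 5-8 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-8 membered heterocyclyl").
  • a heterocyclyl group is a 5-6 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-6 membered heterocyclyl").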
  • the 5-6 membered heterocyclyl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur.
  • Exemplary 3-membered heterocyclyl groups containing 1 heteroatom include, without limitation, aziridinyl, oxiranyl, and thiiranyl.
  • Exemplary 4-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azetidinyl, oxetanyl and thietanyl.
  • Exemplary 5-membered heterocyclyl groups containing 1 heteroatom include, without limitation, tetrahydrofuranyl, dihydrofuranyl, tetrahydrothiophenyl,
  • Exemplary 5-membered heterocyclyl groups containing 2 heteroatoms include, without limitation, dioxolanyl, oxathiolanyl and dithiolanyl.
  • Exemplary 5-membered heterocyclyl groups containing 3 heteroatoms include, without limitation, triazolinyl, oxadiazolinyl, and thiadiazolinyl.
  • Exemplary 6-membered heterocyclyl groups containing 1 heteroatom include, without limitation, piperidinyl, tetrahydropyranyl, dihydropyridinyl, and thianyl.
  • Exemplary 6-membered heterocyclyl groups containing 2 heteroatoms include, without limitation, piperazinyl, morpholinyl, dithianyl, and dioxanyl.
  • Exemplary 6-membered heterocyclyl groups containing 3 heteroatoms include, without limitation, triazinanyl.
  • Exemplary 7-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azepanyl, oxepanyl and thiepanyl.
  • Exemplary 8-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azocanyl, oxocanyl and thiocanyl.
  • Exemplary bicyclic heterocyclyl groups include, without limitation, indolinyl, isoindolinyl, dihydrobenzofuranyl, dihydrobenzothienyl, tetrahydrobenzothienyl, tetrahydrobenzofuranyl, tetrahydroindolyl, tetrahydroquinolinyl, tetrahydroisoquinolinyl, decahydroquinolinyl, decahydroisoquinolinyl, octahydrochromenyl, octahydroisochromenyl,
  • decahydronaphthyridinyl, decahydro-1,8-naphthyridinyl, octahydropyrrolo[3,2-b]pyrrole, indolinyl, phthalimidyl, naphthalimidyl, chromanyl, chromenyl, 1H-benzo[e][1,4]diazepinyl, 1,4,5,7-tetrahydropyrano[3,4-b]pyrrolyl, 5,6-dihydro-4H-furo[3,2-b]pyrrolyl, 6,7-dihydro-5H-furo[3,2-b]pyranyl, 5,7-dihydro-4H-thieno[2,3-c]pyranyl, 2,3-dihydro-1H-pyrrolo[2,3-b]pyridinyl, 2,3-dihydrofuro[2,3-b]pyridinyl, 4,5
  • aryl refers to a radical of a monocyclic or polycyclic (e.g., bicyclic or tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having 6-14 ring carbon atoms and zero heteroatoms provided in the aromatic ring system ("C6-14 aryl").
  • an aryl group has 6 ring carbon atoms ("C6 aryl"; e.g., phenyl).
  • an aryl group has 10 ring carbon atoms ("C10 aryl"; e.g., naphthyl such as 1-naphthyl and 2-naphthyl).
  • an aryl group has 14 ring carbon atoms ("C14 aryl"; e.g., anthracyl).
  • Aryl also includes ring systems wherein the aryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the radical or point of attachment is on the aryl ring, and in such instances, the number of carbon atoms continues to designate the number of carbon atoms in the aryl ring system.
  • each instance of an aryl group is independently unsubstituted (an "unsubstituted aryl") or substituted (a "substituted aryl”) with one or more substituents.
  • the aryl group is an unsubstituted C6-14 aryl.
  • the aryl group is a substituted C6-14 aryl.
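  • The 4n+2 electron count used in the aryl definition can be checked with a few lines of Python. This is only an illustrative sketch: the function name `satisfies_huckel` is hypothetical, and the electron count alone does not establish aromaticity (the ring system must also be cyclic, planar, and fully conjugated).

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the count has the form 4n + 2 for some integer n >= 0.

    Illustrative helper only; it checks the arithmetic of the 4n+2 rule,
    not planarity or conjugation of an actual ring system.
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# The counts named in the definition (6, 10, 14) correspond to n = 1, 2, 3.
for count in (6, 10, 14):
    print(count, satisfies_huckel(count))
```

  • Counts such as 4 or 8 fail the check, which is consistent with the definition listing only 6, 10, and 14 π electrons as examples.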
  • heteroaryl refers to a radical of a 5-14 membered monocyclic or polycyclic (e.g., bicyclic, tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen and sulfur ("5-14 membered heteroaryl").
  • the point of attachment can be a carbon or nitrogen atom, as valency permits.
  • Heteroaryl polycyclic ring systems can include one or more heteroatoms in one or both rings.
  • Heteroaryl includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the point of attachment is on the heteroaryl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heteroaryl ring system.
  • Heteroaryl also includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more aryl groups wherein the point of attachment is either on the aryl or heteroaryl ring, and in such instances, the number of ring members designates the number of ring members in the fused polycyclic (aryl/heteroaryl) ring system.
  • For polycyclic heteroaryl groups wherein one ring does not contain a heteroatom (e.g., indolyl, quinolinyl, carbazolyl, and the like), the point of attachment can be on either ring, i.e., either the ring bearing a heteroatom (e.g., 2-indolyl) or the ring that does not contain a heteroatom (e.g., 5-indolyl).
  • a heteroaryl group is a 5-10 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-10 membered heteroaryl").
  • a heteroaryl group is a 5-8 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-8 membered heteroaryl").
  • a heteroaryl group is a 5-6 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-6 membered heteroaryl").
  • the 5-6 membered heteroaryl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur.
  • the 5-6 membered heteroaryl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur.
  • the 5-6 membered heteroaryl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur.
  • each instance of a heteroaryl group is independently unsubstituted (an "unsubstituted heteroaryl") or substituted (a "substituted heteroaryl") with one or more substituents.
  • the heteroaryl group is an unsubstituted 5-14 membered heteroaryl.
  • the heteroaryl group is a substituted 5-14 membered heteroaryl.
  • Exemplary 5-membered heteroaryl groups containing 1 heteroatom include, without limitation, pyrrolyl, furanyl and thiophenyl.
  • Exemplary 5-membered heteroaryl groups containing 2 heteroatoms include, without limitation, imidazolyl, pyrazolyl, oxazolyl, isoxazolyl, thiazolyl, and isothiazolyl.
  • Exemplary 5-membered heteroaryl groups containing 3 heteroatoms include, without limitation, triazolyl, oxadiazolyl, and thiadiazolyl.
  • Exemplary 5-membered heteroaryl groups containing 4 heteroatoms include, without limitation, tetrazolyl.
  • Exemplary 6-membered heteroaryl groups containing 1 heteroatom include, without limitation, pyridinyl.
  • Exemplary 6-membered heteroaryl groups containing 2 heteroatoms include, without limitation, pyridazinyl, pyrimidinyl, and pyrazinyl.
  • Exemplary 6-membered heteroaryl groups containing 3 or 4 heteroatoms include, without limitation, triazinyl and tetrazinyl, respectively.
  • Exemplary 7-membered heteroaryl groups containing 1 heteroatom include, without limitation, azepinyl, oxepinyl, and thiepinyl.
  • Exemplary 5,6-bicyclic heteroaryl groups include, without limitation, indolyl, isoindolyl, indazolyl, benzotriazolyl, benzothiophenyl, isobenzothiophenyl, benzofuranyl, benzoisofuranyl, benzimidazolyl, benzoxazolyl, benzisoxazolyl, benzoxadiazolyl, benzthiazolyl,
  • Exemplary 6,6-bicyclic heteroaryl groups include, without limitation, naphthyridinyl, pteridinyl, quinolinyl, isoquinolinyl, cinnolinyl, quinoxalinyl, phthalazinyl, and quinazolinyl.
  • Exemplary tricyclic heteroaryl groups include, without limitation, phenanthridinyl, dibenzofuranyl, carbazolyl, acridinyl, phenothiazinyl, phenoxazinyl and phenazinyl.
  • partially unsaturated refers to a ring moiety that includes at least one double or triple bond.
  • the term “partially unsaturated” is intended to encompass rings having multiple sites of unsaturation, but is not intended to include aromatic groups (e.g., aryl or heteroaryl moieties) as herein defined.
  • saturated refers to a ring moiety that does not contain a double or triple bond, i.e., the ring contains all single bonds.
  • alkylene is the divalent moiety of alkyl
  • alkenylene is the divalent moiety of alkenyl
  • alkynylene is the divalent moiety of alkynyl
  • heteroalkylene is the divalent moiety of heteroalkyl
  • heteroalkenylene is the divalent moiety of heteroalkenyl
  • heteroalkynylene is the divalent moiety of heteroalkynyl
  • carbocyclylene is the divalent moiety of carbocyclyl
  • heterocyclylene is the divalent moiety of heterocyclyl
  • arylene is the divalent moiety of aryl
  • heteroarylene is the divalent moiety of heteroaryl.
  • alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl groups, as defined herein, are, in certain embodiments, optionally substituted.
  • Optionally substituted refers to a group which may be substituted or unsubstituted (e.g., "substituted” or "unsubstituted” alkyl, "substituted” or “unsubstituted” alkenyl, "substituted” or “unsubstituted” alkynyl, "substituted” or
  • substituted means that at least one hydrogen present on a group is replaced with a permissible substituent, e.g., a substituent which upon substitution results in a stable compound, e.g., a compound which does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, or other reaction.
  • a "substituted" group has a substituent at one or more substitutable positions of the group, and when more than one position in any given structure is substituted, the substituent is either the same or different at each position.
  • substituted is contemplated to include substitution with all permissible substituents of organic compounds, any of the substituents described herein that results in the formation of a stable compound.
  • the present invention contemplates any and all such combinations in order to arrive at a stable compound.
  • heteroatoms such as nitrogen may have hydrogen substituents and/or any suitable substituent as described herein which satisfy the valencies of the heteroatoms and result in the formation of a stable moiety.
  • R aa is, independently, selected from C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2-10 heteroalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two R aa groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered
  • each instance of R cc is, independently, selected from hydrogen, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2
  • each instance of R ee is, independently, selected from C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, C1-6 heteroalkyl, C2-6 heteroalkenyl, C2-6 heteroalkynyl, C3-10 carbocyclyl, C6-10 aryl, 3-10 membered heterocyclyl, and 3-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R gg groups;
  • each instance of R ff is, independently, selected from hydrogen, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, C1-6 heteroalkyl, C2-6 heteroalkenyl, C2
  • each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R gg groups;
  • halo refers to fluorine (fluoro, -F), chlorine (chloro, -Cl), bromine (bromo, -Br), or iodine (iodo, -I).
  • a "counterion" is a negatively charged group associated with a positively charged quaternary amine in order to maintain electronic neutrality.
  • exemplary counterions include halide ions (e.g., F⁻, Cl⁻, Br⁻, I⁻), NO3⁻, ClO4⁻, OH⁻, H2PO4⁻, HSO4⁻, sulfonate ions (e.g., methanesulfonate, trifluoromethanesulfonate, p-toluenesulfonate, benzenesulfonate, 10-camphor sulfonate, naphthalene-2-sulfonate, naphthalene-1-sulfonic acid-5-sulfonate, ethan-1-sulfonic acid-2-sulfonate, and the like), and carboxylate ions (e.g., acetate, ethanoate, propanoate, benzoate
  • the substituent present on the nitrogen atom is a nitrogen protecting group (also referred to as an "amino protecting group").
  • heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aralkyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R dd groups, and wherein R aa , R bb , R cc and R dd are as defined herein.
  • Nitrogen protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M.
  • Nitrogen protecting groups such as sulfonamide groups include, but are not limited to, p-toluenesulfonamide (Ts), benzenesulfonamide, 2,3,6-trimethyl-4-methoxybenzenesulfonamide (Mtr), 2,4,6-trimethoxybenzenesulfonamide (Mtb), 2,6-dimethyl-4-methoxybenzenesulfonamide (Pme), 2,3,5,6-tetramethyl-4-methoxybenzenesulfonamide (Mte), 4-methoxybenzenesulfonamide (Mbs), 2,4,6-trimethylbenzenesulfonamide (Mts), 2,6-dimethoxy-4-methylbenzenesulfonamide (iMds), 2,2,5,7,8-pentamethylchroman-6-sulfonamide (Pmc), methanesulfonamide
  • nitrogen protecting groups include, but are not limited to, phenothiazinyl-(10)-acyl derivative, N'-p-toluenesulfonylaminoacyl derivative, N'-phenylaminothioacyl derivative, N-benzoylphenylalanyl derivative, N-acetylmethionine derivative, 4,5-diphenyl-3-oxazolin-2-one, N-phthalimide, N-dithiasuccinimide (Dts), N-2,3-diphenylmaleimide, N-2,5-dimethylpyrrole, N-1,1,4,4-tetramethyldisilylazacyclopentane adduct (STABASE), 5-substituted 1,3-dimethyl-1,3,5-triazacyclohexan-2-one, 5-substituted 1,3-dibenzyl-1,3,5-triazacyclohexan-2-one, 1-
  • benzenesulfenamide, o-nitrobenzenesulfenamide (Nps), 2,4-dinitrobenzenesulfenamide, pentachlorobenzenesulfenamide, 2-nitro-4-methoxybenzenesulfenamide, triphenylmethylsulfenamide, and 3-nitropyridinesulfenamide (Npys).
  • the substituent present on an oxygen atom is an oxygen protecting group (also referred to as a "hydroxyl protecting group").
  • Oxygen protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference.
  • oxygen protecting groups include, but are not limited to, methyl, methoxymethyl (MOM), methylthiomethyl (MTM), t-butylthiomethyl,
  • diphenylmethylsilyl (DPMS)
  • t-butylmethoxyphenylsilyl (TMPS)
  • the substituent present on a sulfur atom is a sulfur protecting group (also referred to as a "thiol protecting group").
  • Sulfur protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference.
  • salt refers to any and all salts.
  • Exemplary acid-addition salts include, but are not limited to, salts formed between an amino substituent and an inorganic acid such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid, or perchloric acid, or an organic acid such as acetic acid, oxalic acid, maleic acid, tartaric acid, citric acid, succinic acid, or malonic acid, as well as salts formed by other methods used in the art such as ion exchange.
  • acid addition salts include salts formed from adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate salts, and the like.
  • Exemplary salts derived from appropriate bases include amino acids having a net positive charge, metals, and quaternary amine salts (e.g., NH4⁺ and N⁺(C1-4 alkyl)4 salts).
  • Representative metals include, but are not limited to, alkali metals (e.g., Li, Na, K, Cs), alkaline earth metals (e.g., Mg, Ca, Ba), and transition metals (e.g., Hg).
  • Exemplary amino acids include, but are not limited to, arginine, histidine, lysine, aspartic acid, glutamic acid, serine, threonine, asparagine, glutamine, cysteine, selenocysteine, glycine, proline, alanine, valine, isoleucine, leucine, methionine, phenylalanine, tyrosine, and tryptophan.
  • compositions of heavy-atom labeled nucleic acids for use in systems and methods of sequencing, identifying, and/or detecting nucleic acid polymers, such as DNA, are provided.
  • the methods can involve using a particle beam, such as an electron beam, or ion beam, to obtain information regarding the heavy-atom labeled nucleic acid polymer.
  • a sample of heavy-atom labeled DNA can be exposed to a particle beam and changes in the beam resulting from interaction with the sample may form a pattern which can be interpreted to provide the information.
  • a particle beam instrument e.g. , an electron microscope
  • the methods can enable nucleic acid sequencing, identifying and/or detection at high speeds, low costs, and high accuracy, amongst other advantages.
  • a complementary strand of a nucleic acid polymer may be analyzed to determine the sequence and/or presence of a nucleic acid polymer.
  • the sample may be formed of one or more strands of the nucleic acid polymer along with or separate from the complementary strand.
  • complementary strand is to obtain a single strand of a nucleic acid polymer.
  • Any suitable technique may be used to obtain a single strand. Standard denaturing processes (e.g., thermal, enzymatic) which break the hydrogen bonding between the strands may be used.
  • a single strand can be created by synthesizing it from a template. For example, polymerase chain reaction (PCR) or reverse transcriptase processes that are well known in the art may be used.
  • a single strand may be chemically synthesized one nucleotide at a time, for example, in an oligonucleotide synthesis process. Such synthetic processes are well known in the art and can be automated. It is also possible to obtain a single strand by purifying it from a natural source, such as single stranded RNA from cells. Combinations of the foregoing (and other methods known to those of skill in the art) also can be used.
  • a complementary strand of a nucleic acid polymer can be created from the single strand using any suitable conventional technique.
  • standard polymerization techniques may be used including polymerase chain reaction (PCR) (e.g., standard PCR, long PCR protocols).
  • the techniques generally involve exposing the single strand to an excess of nucleotides under the proper reaction conditions.
  • the nucleotides may be labeled, as described further below, and shown schematically in FIG. 2.
  • single or multiple polymerase enzymes are used to facilitate reactions.
  • Polymerase enzymes include DNA-dependent DNA polymerases (including thermostable enzymes such as Taq), RNA-dependent DNA polymerases (e.g., reverse transcriptases), and RNA-dependent RNA polymerases; in some cases, enzymes need not be used (e.g., in vitro chemical synthesis).
  • suitable components e.g., nucleotide primers, other enzymes such as primases, and the like may also be present.
  • complementary strands may be modified to include other components that would not otherwise be present in a DNA strand.
  • the complementary strand may be modified to include labels (e.g. , during formation) that facilitate detection and identification of nucleotides in methods of the invention. Labels (e.g. , atoms or molecules) when exposed to a particle beam create characteristic particle beam species that may be detected and identified using the systems and methods of the invention.
  • the nucleic acid polymer also can be modified to include labels as described herein. This advantageously is done during synthesis of the nucleic acid, for example using PCR, which typically results in the synthesis of both strands (i.e., the nucleic acid polymer and its complementary strand).
  • When labels are present, it may be preferable to attach the labels to nucleotides of the complementary strand only (e.g., as shown in FIG. 2) or to both strands of the nucleic acid. Labels can be incorporated in the complementary strand only (e.g., using a single round of PCR) or in both strands of the nucleic acid (e.g., using two or more rounds of PCR). In certain embodiments, specific types of label are respectively attached to each type of nucleotide (e.g., cytosine triphosphate (CTP), adenosine triphosphate (ATP), thymine triphosphate (TTP), uracil triphosphate (UTP), and guanosine triphosphate (GTP)).
  • a first type of label is attached to a first nucleotide type (e.g. , CTP);
  • a second type of label is attached to a second nucleotide type (e.g., ATP);
  • a third type of label is attached to a third nucleotide type (e.g.
  • nucleotide types may be identified by identifying a particular label or labels on the labeled nucleotide. Modified (non-natural) or atypical natural nucleotides also can be used, in which the bases, sugars or phosphate moieties can be different than those present in typical naturally occurring nucleotides (e.g. , in A, C, G, T and U).
  • locked nucleic acids which for example can be a bicyclic nucleic acid where a ribonucleoside is linked between the 2'-oxygen and the 4'-carbon atoms with a methylene unit. Mixtures of the foregoing can be employed in the invention.
  • nucleotide comprises a
  • nucleotide is a nucleotide triphosphate, such as cytosine triphosphate as referred to above.
  • nucleoside comprises a nitrogenous base and a sugar molecule, as described above, but no linking group.
  • base comprises a nitrogenous base, but not the sugar molecule or linking group.
  • nucleotide can be polymerized into a nucleic acid polymer, but a nucleoside or base cannot.
  • labels may be attached to nucleotides, which may be polymerized into nucleic acid polymer, as opposed to nucleic acid bases.
  • a "base pair” is conventionally used to denote pairs of nucleotides that are bound in a sequence specific manner, e.g. , Watson-Crick pairing such as A-T and C-G, in a double stranded nucleic acid polymer.
  • this term also can refer to pairings of nucleosides or bases, which by definition are not part of nucleic acid polymers.
  • One advantage of each nucleotide type bearing a unique label is that only a single "data read" is needed to obtain the sequence directly. Some interpretation as to which strand a given nucleotide is on may be required. Labeling each type of nucleotide uniquely also allows for some flexibility in data interpretation, as each base pair is identified twice: each nucleotide is identified directly and there are two nucleotides per base pair, which provides an internal control for the correctness of the data read and sequence.
  • In other embodiments, each nucleotide type (e.g., C, A, T, U, G) in a given strand bears a unique label, but the labels on the other strand are different. This can be accomplished by using different sets of labeled nucleotides in sequential PCR cycles, or other synthetic methods, and allows for greater ease in tracking the strand to which a nucleotide belongs.
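  • The internal control described above, in which each base pair is identified twice, can be sketched as a consistency check between the two strand reads. The sketch below is illustrative only: the function name `mismatched_positions`, the string representation of a read, and the assumption that the two reads are already aligned position by position (so that `second[i]` is the nucleotide paired with `first[i]`) are not taken from the disclosure.

```python
# Watson-Crick partners for DNA (assumed alphabet; RNA would pair A with U).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatched_positions(first: str, second: str) -> list[int]:
    """Return the indices at which the two strand reads disagree.

    `first` and `second` are hypothetical reads of the two strands,
    aligned so that second[i] is paired with first[i].
    """
    if len(first) != len(second):
        raise ValueError("reads must cover the same positions")
    return [
        i
        for i, (a, b) in enumerate(zip(first, second))
        if COMPLEMENT.get(a) != b
    ]


# A fully consistent pair of reads yields no mismatches.
print(mismatched_positions("ACGT", "TGCA"))
# A read error at one position is flagged by its index.
print(mismatched_positions("ACGT", "TGAA"))
```

  • Any position returned by the check marks a base pair where the two direct identifications conflict and the data read may need to be repeated or reinterpreted.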
  • Not all nucleotide types need to be labeled. For example, if three nucleotide types (e.g., C, A, T) are labeled and the fourth (e.g., G) is unlabeled, then each "unlabeled" type may readily be identified as the fourth nucleotide type (e.g., G).
  • the position of the unlabeled nucleotides can be inferred from observation of the distances between labeled nucleotides, given the highly regular spacing of nucleotides in nucleic acid polymers. In other embodiments, only two of the nucleotide types may be labeled.
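  • The inference of unlabeled positions from regular nucleotide spacing can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `fill_unlabeled`, the input format (measured distance in nanometers paired with the identified nucleotide), and the default spacing of 0.34 nm per nucleotide (the approximate rise of B-form double-stranded DNA) are assumptions; a straightened or single-stranded sample may have a different spacing, which would have to be calibrated.

```python
def fill_unlabeled(labeled, length, rise_nm=0.34, unlabeled="G"):
    """Place labeled nucleotides by position and fill gaps with the unlabeled type.

    labeled   -- list of (position_nm, nucleotide) pairs, with position
                 measured from the first nucleotide (hypothetical format)
    length    -- total number of nucleotides in the strand
    rise_nm   -- assumed spacing between adjacent nucleotides
    unlabeled -- the nucleotide type that carries no label
    """
    sequence = [unlabeled] * length
    for position_nm, nucleotide in labeled:
        # Convert the measured distance to the nearest nucleotide index.
        index = round(position_nm / rise_nm)
        if not 0 <= index < length:
            raise ValueError(f"position {position_nm} nm is outside the strand")
        sequence[index] = nucleotide
    return "".join(sequence)


# Labels observed at 0, 0.34 and 1.02 nm; the gaps are read as G.
print(fill_unlabeled([(0.0, "C"), (0.34, "A"), (1.02, "T")], length=5))
```

  • Rounding to the nearest index tolerates small measurement errors, but errors approaching half the nucleotide spacing would make the assignment ambiguous.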
  • a first set of sequencing data may be generated with two nucleotide types labeled (e.g., C, A) and a second set of sequencing data may be generated with the other two nucleotide types labeled (e.g. , T, G). Both data sets may be processed to provide information regarding the entire sequence.
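  • Combining two data sets, each with a different pair of nucleotide types labeled, can be sketched as a position-wise merge. The representation below is hypothetical: each data set is a string over the same positions, with "-" marking a position where no label was observed, and "N" is used in the output for a position that neither data set resolves.

```python
def merge_reads(first: str, second: str) -> str:
    """Merge two partial reads of the same strand into one sequence.

    `first` and `second` are hypothetical reads in which "-" marks an
    unlabeled position; each position should be labeled in exactly one read.
    """
    if len(first) != len(second):
        raise ValueError("reads must cover the same positions")
    merged = []
    for index, (a, b) in enumerate(zip(first, second)):
        if a != "-" and b != "-":
            # Both data sets claim a label here, which the scheme does not allow.
            raise ValueError(f"conflicting labels at position {index}")
        if a != "-":
            merged.append(a)
        elif b != "-":
            merged.append(b)
        else:
            merged.append("N")  # unresolved in both data sets
    return "".join(merged)


# First data set has C and A labeled; second has T and G labeled.
print(merge_reads("CA-A--", "--T-GG"))
```

  • Positions reported as "N" indicate that a label was missed in one of the data sets, so the merge also acts as a simple completeness check.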
  • by labeling only two nucleotides (e.g., A, C) on both strands of a nucleic acid polymer, the sequence of either strand can be inferred from the sequence of the other strand.
  • all labeled adenines in one strand of a double stranded nucleic acid polymer will be bound to thymines on the opposite strand in accordance with Watson- Crick nucleotide binding rules.
  • observation of an adenine on one strand allows one to infer the existence of a thymine in the corresponding position of the other strand of a double stranded nucleic acid.
  • the positions of other nucleotides can likewise be directly read or inferred from observing a double stranded nucleic acid that incorporates only two nucleotide-specific labels.
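  • The two-label inference described above can be sketched in Python. The sketch assumes, for illustration only, that A and C are the labeled types on both strands, that each strand read is a string with "-" at unlabeled positions, and that the reads are aligned so that `second[i]` is paired with `first[i]`; the function name `infer_first_strand` is hypothetical.

```python
# If the labeled nucleotide is seen on the opposite strand, the first strand
# carries its Watson-Crick partner at that position.
PARTNER = {"A": "T", "C": "G"}


def infer_first_strand(first: str, second: str) -> str:
    """Reconstruct the first strand from A/C labels observed on both strands."""
    if len(first) != len(second):
        raise ValueError("reads must cover the same positions")
    result = []
    for a, b in zip(first, second):
        if a in PARTNER and b == "-":
            result.append(a)           # read directly from the first strand
        elif a == "-" and b in PARTNER:
            result.append(PARTNER[b])  # inferred from the opposite strand
        else:
            result.append("N")         # missing or inconsistent labels
    return "".join(result)


# A and C are read directly; T and G are inferred from the partner strand.
print(infer_first_strand("AC--", "--AC"))
```

  • The same routine, called with the arguments swapped, reconstructs the second strand, which reflects the statement that either strand can be inferred from the other.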
  • the labels may be attached to nucleotides in a variety of different locations.
  • labels are attached to the nucleotides on, or within, the nitrogenous base (e.g., adenine, guanine, thymine, cytosine, uracil).
  • labels may be attached to carbon/nitrogen rings in the base or may replace carbon or nitrogen atoms in the base.
  • labels are attached to the nucleotides on, or within, the sugar molecule (e.g. , ribose in RNA, or deoxyribose in DNA).
  • labels are attached on, or within, linking groups of the nucleotides.
  • the labels may be attached on, or within, a phosphate linking group.
  • the labels may be attached to oxygen substitutes, such as sulfur (e.g., alpha-substituted phosphates, αS) or may replace the phosphorus atom at certain sites.
  • the labels are attached to the nucleotides by covalent bonding.
  • covalent bonding provides strong attachment between labels and nucleotides which can enable labeled samples to withstand exposure to relatively high particle beam energies (e.g., greater than about 50 kV for electron beams, for example about 80-120 kV) that may be important to detection and/or identification of nucleic acids.
  • the labels are attached to nucleotides prior to the nucleotides forming the complementary strand (and/or copies of the first strand of the nucleic acid polymer).
  • the labels may be selected from types, as described further below, that do not prevent polymerase reactions that form the complementary strand (and/or copies of the first strand of the nucleic acid polymer).
  • the complementary strand is labeled during its formation.
  • nucleotides may have been modified (prior to formation of the complementary strand and/or copies of the first strand of the nucleic acid polymer) to include a suitable attachment site which can be bound, preferably covalently, to a desired label type. After formation, the nucleic acid strand(s) may be exposed to the labels which attach to the sites.
  • the complementary strand is separated from first strand to form a single complementary strand as shown which is used as the sample.
  • the complementary strand may be separated from the first strand using conventional denaturing techniques (e.g., thermal, enzymatic). After separation, the first strand may be discarded, or may be retained and otherwise used.
  • separation and use of the complementary strand can simplify detection and/or identification and/or quantitation in subsequent method steps.
  • the complementary strand and the first strand are not separated, and the double-stranded structure is used as a sample in the detection and/or identification steps.
  • the complementary strand when the complementary strand is separated from the first strand, the complementary strand is used as a template to create another strand which may be labeled.
  • This can create a double-stranded structure which includes two labeled strands (i.e., the complementary strand and the new strand created from the complementary strand).
  • this double-stranded structure is used as the sample in the detection and/or identification steps.
  • Methods of the invention may involve attaching a sample (e.g. , complementary strand, complementary strand and first strand, complementary strand and new strand), or more than one sample, to a substrate.
  • the sample may be the same (i.e., based on the same sequence) or different.
  • the substrate should be suitable for exposure to a particle beam. In embodiments in which particle beam species transmitted through the sample are detected, the substrate should permit sufficient transmission of the particle beam.
  • the substrate is generally thin to enable sufficient particle beam transmission therethrough.
  • the substrate may be less than 5 nanometers (nm) thick; in some cases, less than 2 nm; or, even less than 1.5 or 1.1 nm.
  • the substrate may be formed of a single layer or multiple layers. In certain cases, the layer(s) may be cross-linked. Conventional techniques can be used to form the substrates including vapor deposition and FIB milling, amongst others.
  • Suitable substrate materials are known to those of skill in the art and can include carbon (e.g. , pure carbon, graphene, diamond), boron nitride (e.g., having a cubic structure), aluminum and certain polymeric resins (e.g. , FORMVAR® (polyvinyl formal)).
  • the substrate is formed from organic materials such as a lipid, natural protein or synthetic protein.
  • the substrate material may be doped with chemicals, for example, to cross-link layers or to facilitate attachment of the sample as described further below.
  • Samples may be attached to the substrate by chemically bonding at least a portion of the sample to the substrate. Suitable techniques are known to those of skill in the art. For example, molecules present on the surface of the substrate (e.g. , pre-existing as part of the substrate or following derivatization of the substrate) may be used to bind to the sample.
  • the molecules may be nucleic acid sequence specific molecules (e.g. , oligonucleotides).
  • the substrate surface may be derivatized to provide attachment points that are sequence non-specific.
  • electrical charge may be used to bind the sample to the substrate surface.
  • the attachment points for the samples can be spaced apart in a predetermined pattern, such as a grid or microarray.
  • a portion, or portions, of a sample may be attached to the substrate.
  • both ends of the sample e.g. , complementary strand, complementary strand and first strand, complementary strand and new strand
  • only one end of the sample may be attached; in some cases, one or more non-end portions along the length of the sample may be attached.
  • the attachment at the end(s) or along the length of the nucleic acid molecule(s) can be facilitated, if desired, by including in the nucleic acid during synthesis nucleotides capable of forming bonds with the substrate.
  • Certain methods of the invention involve substantially straightening a sample (e.g., labeled double strand) prior to, during, or even after, attachment to the substrate. This can facilitate detection and/or identification.
  • the labeled double strand may be attached to the substrate, for example, via a linking bond to a bonding site as described further below.
  • a sample may be straightened using fluid flow (e.g. , molecular combing).
  • the fluid may comprise one or more liquids, gases, or combinations thereof.
  • the sample is attached and straightened by hybridization in a fluid flow to oligonucleotides present on the substrate surface.
  • electrical fields may be used (either in the presence of fluid flow, or alone) to promote sample straightening.
  • each sample may be aligned substantially parallel to one another to facilitate exposure to the beam. Methods exist to perform molecular alignment of nucleic acid molecules in a thin or monolayer on a substrate.
  • methods and compositions of the present disclosure may be combined with methods to perform high-density molecular alignment of nucleic acid molecules on substrates or surfaces as embodied in PCT Publication Nos. WO 2009/002506 A2 and WO 2010/144128 A2, entitled “High Density Molecular Alignment of Nucleic Acid Molecules,” and “Molecular Alignment and Attachment of Nucleic Acid Molecules,” respectively, both of which are incorporated herein by reference in their entirety.
  • the disclosure provides compositions and methods aside from nucleic acid sequencing and/or identification, such as gene expression analysis.
  • Procedures used for gene expression are generally based on immobilizing mRNA or cDNA (prepared via reverse transcriptase PCR from mRNA) to microarrays, and estimating quantity from fluorescent images. Some of these procedures are described in U.S. Patent Nos. 5,405,783; 5,424,186; 5,445,934; 5,744,305; 6,261,776; 6,406,844; 6,416,952; 6,506,558; and 5,143,854.
  • One aspect of the disclosure provides a substrate having a combination of materials and dimensions that allows the substrate to have distinct physical properties. Specifically, in one embodiment, the materials and dimensions of the substrate allow it to be used for imaging samples with a particle beam instrument such as a transmission electron microscope.
  • the substrate can include one or more ligands (e.g. , nucleic acids, polypeptides,
  • the array dimensions are on the order of nanometers per functional region rather than micrometers as in certain conventional arrays. With these dimensions, smaller amounts of sample material can be used and more accurate genetic analyses performed. These smaller substrate dimensions may also give rise to dramatically reduced production costs, amongst other advantages.
  • the transparency of the substrate, due to thinness, material type, and other factors, may provide a suitable contrast ratio between the labeled molecules and the substrate, resulting in higher quality readings and lower cost analysis than some conventional techniques.
  • quantification, sequencing, fingerprinting, and mapping of polymers particularly biological polymers.
  • Various embodiments of the invention may be applied, for example, in the sequencing, fingerprinting, identification, quantification, or mapping of nucleic acids, polypeptides, oligosaccharides, and synthetic polymers.
  • WO06019903 all entitled, “Systems and Methods of Analyzing Nucleic Acid Polymers and Related Components,” as well as 2007/0134699, which corresponds to WO07120202, entitled “Nano-Scale Ligand Arrays on Substrates for Particle Beam Instruments and Related Methods,” each of which is incorporated herein by reference in its entirety.
  • These references may provide, for example, methods and devices for incorporating contrast heavy atom labels in a biologic sample that are designed to interfere with a beam from a particle beam instrument.
  • the labeled sample materials are binding partners, which can be bound to ligands in an array on a suitable substrate.
  • a particle beam may be directed through the array and the labels can create interference patterns that are then read by a detector instrument and processed by a data analysis module.
  • Methods of the invention involve exposing the sample to a particle beam.
  • the particle beam is a lepton beam such as an electron beam.
  • the particle beam may be an x-ray beam.
  • the particle beam may be an ion beam such as a helium or gallium ion beam.
  • a beam generator produces a beam having a desired voltage which, for example, can be greater than 50 kV, e.g., 80-300 kV, preferably 80-120 kV.
  • Beam energies are a function of both voltage and current.
  • the beam current typically ranges between 5 and 25 μA, preferably between 8 and 15 μA.
  • the specific beam energy depends, in part, on the specific analysis being performed.
  • Methods can include properly focusing the beam on the sample using a lens arrangement as known to those of skill in the art. Methods may also include a calibration step. In certain cases, the system may be automatically calibrated based on known information from nucleic acid molecules in the sample (such as known molecular geometries and structures) using a feedback loop. For example, data obtained from a nucleic acid sample using an electron beam may include internucleotide (e.g. , interlabel) distances. As used herein, an internucleotide distance is the distance from one nucleotide base in one strand to the adjacent nucleotide base in the same strand. While the internucleotide distances of, for example, a DNA molecule are generally known, the internucleotide distance in any given sample may not correspond to the generally known distance, but will typically by
  • a sample as affixed to a substrate particularly a sample that has been straightened, e.g. , by treatment using molecular combing or like methods.
  • various aspects of the system can be calibrated or adjusted using a feedback control system. For example, knowing the internucleotide distances permits feedback relevant to focusing the particle beam and movement of the sample relative to the particle beam.
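The calibration idea above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every nucleotide in the strand carries a label, that label positions have already been extracted from the image in nanometers, and that 0.34 nm (the commonly cited base spacing of B-form DNA) is a suitable reference; a straightened sample may have a different spacing, as noted above. The function names are hypothetical and not part of the disclosure.

```python
from statistics import median

# Assumed reference spacing (nm) between adjacent bases in B-form DNA.
# A combed or stretched sample may deviate from this value.
REFERENCE_SPACING_NM = 0.34

def interlabel_distances(positions_nm):
    """Distances between consecutive label positions along one strand."""
    ordered = sorted(positions_nm)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def scale_correction(positions_nm, reference_nm=REFERENCE_SPACING_NM):
    """Factor that rescales measured distances so their median matches the reference."""
    distances = interlabel_distances(positions_nm)
    if not distances:
        raise ValueError("need at least two label positions")
    return reference_nm / median(distances)
```

A feedback loop could apply this factor to the magnification or stage motion and re-measure until the factor is close to 1; the median is used so that a few missed or spurious labels do not dominate the estimate.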
  • although systems of the invention may include several components similar to those of a conventional transmission electron microscope (e.g., beam generator, lens, etc.), certain systems of the invention may be simpler than typical conventional TEMs.
  • the systems are simplified by limiting the magnification range, accelerating voltages, probe diameter, beam current, and sample flexibility, amongst other features.
  • problems related to spherical aberration in conventional TEMs may be limited, or eliminated, by using a lens arrangement that is pre-set for typical operating conditions for the system.
  • Characteristics of the particle beam are changed when the beam interacts with the sample. For example, one or more of the following characteristics of the particle beam may change: energy, direction, absorbance, reflection and deflection. Such changes may result from interactions between the particle beam and labels attached to nucleotides as described above. Specific types of labels may produce specific or characteristic changes. Thus, a label (and, the specific nucleotide to which it is attached) may be identified by recognizing the specific or characteristic beam changes.
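As a rough sketch of how a label might be identified from its characteristic beam change, the example below assumes the detector signal has already been reduced to an estimated effective atomic number for each label site. That reduction is a simplification not described in the source, and the label-to-base assignment is hypothetical. Only the atomic numbers themselves are established facts.

```python
# Atomic numbers of the heavy atoms named in the disclosure.
ATOMIC_NUMBER = {"Se": 34, "Br": 35, "Te": 52, "I": 53, "Hg": 80}

# Hypothetical label-to-nucleotide assignment, for illustration only.
LABEL_TO_BASE = {"Br": "A", "I": "C", "Hg": "G"}

def identify_base(estimated_z, tolerance=5.0):
    """Return the base whose label's atomic number is closest to the estimate.

    Returns None when no assigned label lies within `tolerance` of the estimate.
    """
    label = min(LABEL_TO_BASE, key=lambda s: abs(ATOMIC_NUMBER[s] - estimated_z))
    if abs(ATOMIC_NUMBER[label] - estimated_z) > tolerance:
        return None
    return LABEL_TO_BASE[label]
```

The assignment deliberately uses labels with well-separated atomic numbers; selenium (34) and bromine (35), for instance, would be hard to tell apart under this simple scheme.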
  • a detector collects particle beam species after the interaction between the particle beam and the sample.
  • the detector typically collects beam species that have been transmitted through the sample, though also can collect beam species that are reflected and/or scattered.
  • the detector may include a charge coupled device (CCD).
  • the CCD may directly convert the beam species into digital information. Technologies other than CCD technology may be used to convert the beam species into digital information, and are intended to fall within the scope of the invention.
  • a nucleic acid polymer may be detected, and/or sequenced and/or identified based on particle beam species detected by a detector (e.g., the detector described above).
  • Particle beam species may result from exposure of a sample comprising a nucleic acid polymer and/or its complementary strand to a particle beam (e.g., a lepton beam such as an electron beam).
  • Heavy-atom labeled compounds contemplated herein include, but are not limited to, heavy-atom labeled nucleosides, heavy-atom labeled nucleotides, and heavy-atom labeled nucleic acid polymers. Such compounds may be useful in the inventive methods as described herein.
  • each instance of G 1 is independently -O-, -S-, -Se-, -CH 2 -, or -NH-;
  • each instance of G 2 is independently hydrogen, halogen, -OR A , -SR A , -N(R A ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D ;
  • each instance of R A is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two R A groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
  • each instance of M 1 is independently -O-, -S-, -NH-, -Se-, or -C(R M ) 2 -, wherein each instance of R M is independently hydrogen or halogen;
  • each instance of G 3 is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
  • each instance of M is independently -O-, -S-, or -Se-;
  • each instance of R 1 , R 2 , R 4 , and R 5 is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -OR B , or -SR B , wherein each instance of R B is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted
  • each instance of R is independently substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or
  • R C is hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or
  • each instance of L 1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C 1-20 alkylene, substituted or unsubstituted C 2-20 alkenylene, substituted or unsubstituted C 2-20 alkynylene, substituted or unsubstituted heteroC 1-20 alkylene, substituted or unsubstituted heteroC 2-20 alkenylene, substituted or unsubstituted heteroC 2-20 alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
  • each instance of R D is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
  • each instance of M 3 and M 4 is independently O, Se, Te, CH 2 , CF 2 , CCl 2 , CBr 2 , or CI 2 ; provided that the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
  • An exemplary nucleic acid polymer is a heavy-atom labeled nucleic acid polymer of Formula (II):
  • n is 1 to 200,000, inclusive
  • the polymer comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
  • n is 1 to 180,000, inclusive; n is 1 to 160,000, inclusive; n is 1 to 140,000, inclusive; n is 1 to 120,000, inclusive; n is 1 to 100,000, inclusive; n is 1 to 80,000, inclusive; n is 1 to 60,000, inclusive; n is 1 to 40,000, inclusive; n is 1 to 20,000, inclusive; n is 1 to 18,000, inclusive; n is 1 to 16,000, inclusive; n is 1 to 14,000, inclusive; n is 1 to 12,000, inclusive; n is 1 to 10,000, inclusive; n is 1 to 9,000, inclusive; n is 1 to 8,000, inclusive; n is 1 to 7,000, inclusive; n is 1 to 6,000, inclusive; n is 1 to 5,000, inclusive; n is 1 to 4,000, inclusive; n is 1 to 3,000, inclusive; n is 1 to 2,000, inclusive; n is 1 to 1,000, inclusive; n is 1 to 900, inclusive; n is 1 to 3,000, inclusive; n is 1 to 2,000
  • nucleic acid polymers such as polymers of Formula (II), comprising one or more units of the below formula are also specifically excluded:
  • Further compounds excluded include, but are not limited to, 2'MeSe-ATP, 2'-TePh, 2'-SeCR, and C5-TePh, and selenium compounds as disclosed in JP2008195648 and JP2007000032111.
  • the heavy-atom labeled compound of Formula (I), polymer of Formula (II), or unit of Formula (II'), comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
  • the Base region comprises at least one instance of the heavy atom.
  • the sugar region comprises at least one instance of the heavy atom.
  • the phosphate region comprises at least one instance of the heavy atom.
  • at least one instance of the heavy atom is provided in the Base region. Labeling in the Base region as described herein is contemplated to provide clearer and unambiguous imaging results compared to labeling elsewhere in the molecule.
  • the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of bromine.
  • bromine is attached to a carbon atom which optionally comprises one or two additional instances of a halogen, e.g., -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of iodine.
  • iodine is attached to a carbon atom which optionally comprises one or two additional instances of a halogen, e.g., -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • the compound comprises a divalent -Se- group.
  • the compound comprises a monovalent -SeR D group.
  • the compound comprises a divalent -Te- group.
  • the compound comprises a monovalent -TeR D group.
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D .
  • the compound comprises at least one instance of -SeR D or -TeR D , wherein R D is hydrogen, i.e. , to provide -SeH or -TeH.
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D , wherein R D is substituted or unsubstituted C 1-20 alkyl, e.g., R D is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or
  • R D is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • R D is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1
  • R D is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R D is -CX 3 , wherein X is halogen.
  • R D is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • R D is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • R D is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D , wherein R D is substituted or unsubstituted C 2-20 alkenyl, e.g., R D is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • R D is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • R D is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • R D is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D , wherein R D is substituted or unsubstituted C 2-20 alkynyl, e.g., R D is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • R D is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • R D is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2
  • the haloalkynyl is a perhaloalkynyl group.
  • R D is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D , wherein R D is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • R D is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • the compound comprises at least one instance of -SeR D or -TeR D , wherein R D is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered
  • R D is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D , wherein R D is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • R D is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted haloaryl.
  • R D is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • the compound comprises at least one instance of -SHgR D , -SeR D , or -TeR D , wherein R D is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • R D is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R D is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of mercury; for example, in certain embodiments, the compound comprises at least one instance of -SHg, -SO 2 SHg, or -SHgR D (e.g., -SHgMe).
  • one or more heavy atoms are present in the sugar region of the compound. In certain embodiments, one or more heavy atoms (e.g. , 1, 2, 3, or 4 heavy atoms) are present in the phosphate region of the compound. In certain embodiments, one or more heavy atoms (e.g. , 1, 2, 3, or 4 heavy atoms) are present in the base region of the compound.
  • the compound, in certain embodiments, may comprise heavy atoms in either the sugar, the phosphate, or the base region. In certain embodiments, the compound may comprise only heavy atoms in the sugar or phosphate region. In certain embodiments, the compound may comprise only heavy atoms in the base. In certain embodiments, the compound may comprise no heavy atoms in the base, and in that instance, heavy atoms are necessarily present in the sugar or phosphate region.
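The region rules above can be expressed as a small check. This is only a sketch: representing a compound as a mapping from region name to a list of element symbols is an assumption made here for illustration, not a representation used in the disclosure.

```python
# Heavy atoms named in the disclosure.
HEAVY_ATOMS = {"Br", "I", "Se", "Te", "Hg"}

def heavy_atom_regions(compound):
    """Return the set of regions that contain at least one heavy atom.

    `compound` maps a region name ('base', 'sugar', 'phosphate') to the
    element symbols present in that region (hypothetical representation).
    """
    return {region for region, atoms in compound.items() if HEAVY_ATOMS & set(atoms)}

def is_heavy_atom_labeled(compound):
    """True if at least one region carries a heavy atom."""
    return bool(heavy_atom_regions(compound))
```

For a compound with no heavy atom in the base, `heavy_atom_regions` must return `sugar` and/or `phosphate` for the compound to count as labeled, which mirrors the last sentence of the passage.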
  • the compound is a nucleic acid polymer comprising one or more units of Formula (II'), such as a compound of Formula (II)
  • the 5' and/or 3' terminating group and/or one or more repeating units, e.g., 1 to 25,000 units, of the nucleic acid polymer may comprise heavy atoms.
  • the nucleic acid polymer comprises one or more instances of a heavy-atom labeled nucleotide in combination with one or more instances of an unlabeled nucleotide. In certain embodiments, there are multiple instances, e.g., 2 or more instances, of the same heavy-atom labeled nucleotide.
  • each instance of a particular nucleotide is replaced with a different heavy-atom labeled nucleotide as described herein.
  • each instance of A is replaced with a heavy-atom labeled nucleotide as described herein.
  • each instance of G is replaced with a heavy-atom labeled nucleotide as described herein.
  • each instance of T is replaced with a heavy-atom labeled nucleotide as described herein.
  • each instance of C is replaced with a heavy-atom labeled nucleotide as described herein.
  • each instance of U is replaced with a heavy-atom labeled nucleotide as described herein.
  • one of the heavy-atom labeled compounds is labeled in the sugar or phosphate region, and one of the heavy-atom labeled compounds is labeled in the base region, in order to better distinguish between A, G, T, C, or U.
  • one of the heavy-atom labeled compounds is labeled in the sugar or phosphate region with one type of label, and one of the heavy-atom labeled compounds is labeled in the base region with a different type of label, in order to better distinguish between A, G, T, C, or U.
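To make the substitution scheme concrete, the following sketch marks which positions of a sequence would carry a heavy-atom label when every instance of a chosen nucleotide is replaced. The sequence, the choice of labeled bases, and the function names are illustrative assumptions only.

```python
def label_sequence(sequence, labeled_bases):
    """Pair each base with a flag saying whether it is heavy-atom labeled.

    Every occurrence of a base listed in `labeled_bases` is treated as
    replaced by its labeled counterpart; all other bases stay unlabeled.
    """
    return [(base, base in labeled_bases) for base in sequence]

def label_positions(sequence, base):
    """Zero-based positions at which the given base occurs in the sequence."""
    return [i for i, b in enumerate(sequence) if b == base]
```

For example, labeling every T in the sequence GATTACA yields labels at positions 2 and 3, and the distances between those positions are what an imaging step would observe.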
  • the "sugar region" of the heavy-atom labeled compound of Formula (I), polymer of Formula (II), or unit of Formula (II'), may comprise a heavy atom, or may not comprise a heavy atom. If the sugar region does not comprise a heavy atom, the phosphate and/or base region of the heavy-atom labeled compound of Formula (I), polymer of Formula (II), or unit of Formula (II') comprises a heavy atom.
  • each instance of G 1 is independently -O-, -S-, -Se-, -CH 2 -, or -NH-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 1 is -O-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 1 is -S-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 1 is -Se-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 1 is -CH 2 -.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 1 is -NH-.
  • each instance of G 2 is independently hydrogen, halogen, -OR A , -SR A , -N(R A ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • At least one instance (e.g. , 1, 2, 3, 4 or more instances, or each instance) of G 2 is hydrogen.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 2 is halogen, i.e., G 2 is -Br, -I, -F, or -Cl.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 2 is -OR A , wherein R A is hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or an oxygen protecting group.
  • G 2 is -OR A and R A is hydrogen, i.e., G 2 is -OH.
  • G 2 is -OR A and R A is an oxygen protecting group, as defined herein.
  • G 2 is -OR A and R A is substituted or unsubstituted C 1-20 alkyl, e.g., G 2 is -OR A and R A is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • G 2 is -OR A and R A is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • R A is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • G 2 is -OR A and R A is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R A is -CX 3 , wherein X is halogen.
  • R A is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • R A is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • R A is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • G 2 is -OR A and R A is substituted or unsubstituted C 2-20 alkenyl, e.g., G 2 is -OR A and R A is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or
  • G 2 is -OR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • R A is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • G 2 is -OR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • G 2 is -OR A and R A is substituted or unsubstituted C 2-20 alkynyl, e.g., G 2 is -OR A and R A is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or
  • G 2 is -OR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • R A is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-1
  • the haloalkynyl is a perhaloalkynyl group.
  • G 2 is -OR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • G 2 is -OR A and R A is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • R A is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted
  • G 2 is -OR A and R A is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • R A is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • G 2 is -OR A and R A is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • R A is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted haloaryl.
  • R A is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • G 2 is -OR A and R A is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • R A is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • G 2 is -SR A and R A is hydrogen, i.e., G 2 is -SH.
  • G 2 is -SR A and R A is a sulfur protecting group, as defined herein.
  • G 2 is -SR A and R A is substituted or unsubstituted C 1-20 alkyl, e.g., G 2 is -SR A and R A is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • G 2 is -SR A and R A is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • R A is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -CI atoms), e.g., R A is substituted or unsubstituted C 1 _ 2 ohaloalkyl, substituted or unsubstituted Ci-ighaloalkyl, substituted or unsubstituted Ci-iehaloalkyl, substituted or unsubstituted Ci- ⁇ haloalkyl, substituted or unsubstituted Ci- ⁇ haloalkyl, substituted or unsubstituted Ci- ⁇ haloalkyl, substituted or unsubstituted Ci-iohaloalkyl, substituted or unsubstituted Ci-io
  • G 2 is -SR A and R A is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R A is -CX 3 , wherein X is halogen.
  • R A is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • R A is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • R A is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
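The bromine-bearing methyl options listed above (-CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, and -CBrX 2 , with each X chosen independently from chlorine, fluorine, and iodine) form a small finite set. The following is a minimal Python sketch that lists it, for illustration only; the function name `bromomethyl_groups` and the plain-text group strings are assumptions of this sketch and are not notation from the disclosure.

```python
from itertools import combinations_with_replacement


def bromomethyl_groups(other_halogens=("Cl", "F", "I")):
    """List the distinct -CBr3, -CBr2H, -CBrH2, -CBr2X, and -CBrX2 groups."""
    groups = ["-CBr3", "-CBr2H", "-CBrH2"]
    # -CBr2X: one position is filled by a single non-bromine halogen X.
    groups += [f"-CBr2{x}" for x in other_halogens]
    # -CBrX2: two positions, each X chosen independently; order does not matter.
    for a, b in combinations_with_replacement(other_halogens, 2):
        groups.append(f"-CBr{a}2" if a == b else f"-CBr{a}{b}")
    return groups


if __name__ == "__main__":
    print(bromomethyl_groups())
```

With the three non-bromine halogens named above, the sketch returns twelve distinct groups; the iodine-bearing series can be listed the same way by exchanging the roles of bromine and iodine.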
  • G 2 is -SR A and R A is substituted or unsubstituted C 2-20 alkenyl, e.g., G 2 is -SR A and R A is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • G 2 is -SR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • R A is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • G 2 is -SR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • G 2 is -SR A and R A is substituted or unsubstituted C 2-20 alkynyl, e.g., G 2 is -SR A and R A is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • G 2 is -SR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • R A is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2-4 haloalkynyl, or substituted or unsubstituted C 2-3 haloalkynyl.
  • the haloalkynyl is a perhaloalkynyl group.
  • G 2 is -SR A and R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • G 2 is -SR A and R A is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • R A is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • G 2 is -SR A and R A is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • R A is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • G 2 is -SR A and R A is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • R A is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted haloaryl.
  • R A is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • G 2 is -SR A and R A is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • R A is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R A is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • G 2 is -N(R A ) 2 and at least one R A is hydrogen, i.e., G 2 is -NHR A or -NH 2 .
  • G 2 is -N(R A ) 2 and at least one R A is a nitrogen protecting group, as defined herein.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 1-20 alkyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • at least one R A is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • at least one R A is -CX 3 , wherein X is halogen.
  • at least one R A is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • At least one R A is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one R A is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2-20 alkenyl, e.g., G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • at least one R A is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2-20 alkynyl, e.g., G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • at least one R A is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2-4 haloalkynyl, or substituted or unsubstituted C 2-3 haloalkynyl.
  • the haloalkynyl is a perhaloalkynyl group.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • At least one R A is -CH 2 X, -CHX 2 , or -CX 3 , wherein each X is independently -Cl, -F, -Br, or -I.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • At least one R A is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • At least one R A is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • at least one R A is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted haloaryl.
  • At least one R A is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • G 2 is -N(R A ) 2 and at least one R A is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • At least one R A is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R A is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • G 2 is -N(R A ) 2 , and two R A groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6-membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 2 is -SHg, -SO 2 SHg, or -SHgR D , wherein R D is as defined herein.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 2 is -SeR D , wherein R D is as defined herein.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 2 is -TeR D , wherein R D is as defined herein.
  • each instance of M 1 is independently -O-, -S-, -NH-, -Se-, or -C(R M ) 2 -, wherein each instance of R M is independently hydrogen or halogen.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M 1 is -O-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M 1 is -S-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M 1 is -NH-.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M 1 is -Se-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M 1 is -C(R M ) 2 -, wherein each instance of R M is independently hydrogen or halogen. In certain embodiments, each instance of R M is hydrogen. In certain embodiments, at least one instance of R M is halogen, e.g., -Br, -I, -F, or -Cl.
  • each instance of G 1 is O to provide a compound of Formula (I-b), polymer of Formula (II-b), or unit of Formula (II'-b):
  • G 2 is hydrogen.
  • G 2 is -SHgR D (e.g., -SHgMe), -SHg, or -SO 2 SHg.
  • G 2 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • G 2 is -TeR D , e.g., -TeCX 3 , wherein X is halogen.
  • each instance of G 1 and M 1 is O to provide a compound of Formula (I-c), polymer of Formula (II-c), or unit of Formula (II'-c):
  • G 2 is hydrogen.
  • G 2 is -SHgR D (e.g., -SHgMe), -SHg, or -SO 2 SHg.
  • G 2 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • G 2 is -TeR D , e.g., -TeCX 3 , wherein X is halogen.
  • G 2 is -Se-CBr 3 or -TeBr 3 .
  • each instance of G 1 and M 1 is O to provide a compound of Formula (I-d), polymer of Formula (II-d), or unit of Formula (II'-d) with the specified stereochemistry:
  • G 2 is hydrogen.
  • G 2 is -SHgR D (e.g., -SHgMe), -SHg, or -SO 2 SHg.
  • G 2 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • G 2 is -TeR D , e.g., -TeCX 3 , wherein X is halogen.
  • each instance of G 3 is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate group.
  • At least one instance of G 3 is hydrogen.
  • At least one instance of G 3 is substituted or unsubstituted C 1-20 alkyl, e.g., substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • At least one instance of G 3 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • at least one instance of G 3 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • At least one instance of G 3 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R D is -CX 3 , wherein X is halogen.
  • at least one instance of G 3 is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • At least one instance of G 3 is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one instance of G 3 is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • At least one instance of G 3 is substituted or unsubstituted C 2-20 alkenyl, e.g., at least one instance of G 3 is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • At least one instance of G 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • at least one instance of G 3 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • at least one instance of G 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • At least one instance of G 3 is substituted or unsubstituted C 2-20 alkynyl, e.g., at least one instance of G 3 is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • At least one instance of G 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl. In certain embodiments, at least one instance of G 3 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2-4 haloalkynyl, or substituted or unsubstituted C 2-3 haloalkynyl.
  • the haloalkynyl is a perhaloalkynyl group.
  • at least one instance of G 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • At least one instance of G 3 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • at least one instance of G 3 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • At least one instance of G 3 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • At least one instance of G 3 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • At least one instance of G 3 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • at least one instance of G 3 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted haloaryl.
  • At least one instance of G 3 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • At least one instance of G 3 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • at least one instance of G 3 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G 3 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • At least one instance of G 3 is a monophosphate, diphosphate, or triphosphate, referred to herein as the "phosphate region" of Formula (I) and (II).
  • G 3 may comprise a heavy atom, or may not comprise a heavy atom. If the "phosphate region" does not comprise a heavy atom, the sugar and/or base region of Formula (I) or (II) comprises a heavy atom.
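The region rule stated above amounts to requiring that at least one of the phosphate, sugar, and base regions comprises a heavy atom. The following is a minimal Python sketch of that check, for illustration only; the `RegionLabels` class, its three boolean fields, and the function name are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionLabels:
    """Hypothetical flags: does each region of Formula (I) or (II) comprise a heavy atom?"""
    phosphate: bool
    sugar: bool
    base: bool


def satisfies_heavy_atom_rule(labels: RegionLabels) -> bool:
    # If the phosphate region comprises a heavy atom, the rule is already met.
    if labels.phosphate:
        return True
    # Otherwise the sugar and/or base region must comprise a heavy atom.
    return labels.sugar or labels.base
```

For example, `satisfies_heavy_atom_rule(RegionLabels(phosphate=False, sugar=False, base=True))` evaluates to `True`, whereas a compound with all three flags `False` fails the check.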
  • each instance of the "phosphate region" G 3 is independently a monophosphate, diphosphate, or triphosphate of formula:
  • each instance of M is independently -O-, -S-, or -Se-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M is -O-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M is -S-.
  • at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M is -Se-.
  • At least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 3 is a monophosphate group. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 3 is a diphosphate group. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G 3 is a triphosphate group. In certain embodiments, each instance of G 3 is a triphosphate group.
  • the compound of Formula (I) and (II) may also be provided as a salt, and in this instance, in certain embodiments, the monophosphate, diphosphate, or triphosphate groups may be provided as a salt form:
  • each Y is independently hydrogen or an electropositive group (e.g., a quaternary amine, an amino acid, a metal), provided at least one instance of Y (e.g., at least 1, 2, 3, or all instances of Y) is an electropositive group in order to provide the salt.
  • at least one instance of Y is a metal.
  • Exemplary metals include alkali metals (e.g., Li, Na, K, Cs), alkaline earth metals (e.g., Mg, Ca, Ba), and transition metals (e.g., Hg).
  • at least one instance of Y is a quaternary amine.
  • At least one instance of Y is an amino acid having a net positive charge, e.g., wherein the zwitterionic form which predominates in equilibrium is the amino acid with a quaternized alpha-amino group and the protonated alpha-carboxylic acid group.
  • Exemplary amino acids include, but are not limited to, arginine, histidine, lysine, aspartic acid, glutamic acid, serine, threonine, asparagine, glutamine, cysteine, selenocysteine, glycine, proline, alanine, valine, isoleucine, leucine, methionine, phenylalanine, tyrosine, and tryptophan.
  • each instance of G 3 is independently a hydrogen or a triphosphate to provide a compound of Formula (I-e1), (I-e2), (II-e1), (II-e2), (II-e3), or (II-e4):
  • G 1 is O.
  • M 1 is O.
  • G 2 is hydrogen.
  • G 2 is -SHg or -SO 2 SHg.
  • G 2 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • G 2 is -TeR D , e.g., -TeCX 3 , wherein X is halogen.
  • G 2 is -Se-CBr 3 or -TeBr 3 .
  • M 2 is O.
  • the compound is a salt, e.g., a salt of a quaternary amine, an amino acid, or a metal.
  • each instance of G 3 is independently a hydrogen or a triphosphate, and G 1 , M 1 , and M 2 are O, to provide a compound of Formula (I-f1), (I-f2), (II-f1), (II-f2), (II-f3), or (II-f4):
  • G 2 is hydrogen. In certain embodiments, G 2 is -SHg or -SO 2 SHg. In certain embodiments, G 2 is -SeR D , e.g., -SeCX 3 , wherein X is halogen. In certain embodiments, G 2 is -TeR D , e.g., -TeCX 3 , wherein X is halogen. In certain embodiments, G 2 is -Se-CBr 3 or -TeBr 3 . In certain embodiments, the compound is a salt, e.g., a salt of a quaternary amine, an amino acid, or a metal.
  • each instance of G 3 is independently a hydrogen or a triphosphate, and G 1 , M 1 , and M 2 are O, to provide a compound of Formula (I-g1), (I-g2), (II-g1), (II-g2), (II-g3), or (II-g4) with the specified stereochemistry:
  • G 2 is hydrogen. In certain embodiments, G 2 is -SHg or -SO 2 SHg. In certain embodiments, G 2 is -SeR D , e.g., -SeCX 3 , wherein X is halogen. In certain embodiments, G 2 is -TeR D , e.g., -TeCX 3 , wherein X is halogen. In certain embodiments, G 2 is -Se-CBr 3 or -TeBr 3 . In certain embodiments, the compound is a salt, e.g., a salt of a quaternary amine, an amino acid, or a metal.
  • the "base region" of a compound of Formula (I-b), polymer of Formula (II-b), or unit of Formula (II'-b) may comprise a heavy atom, or may not comprise a heavy atom. If the "base region" does not comprise a heavy atom, the phosphate and/or sugar region of a compound of Formula (I-b), polymer of Formula (II-b), or unit of Formula (II'-b) comprises a heavy atom.
  • the Base does not comprise a heavy atom, and is selected from the group consisting of:
  • a nucleic acid polymer, such as a polymer of Formula (II), may have one or more instances of any of the above formulae.
  • the Base is an analog of adenine or guanine, which optionally comprises a heavy atom, and is selected from the group consisting of:
  • a nucleic acid polymer, such as a polymer of Formula (II), may have one or more instances of any of the above formulae.
  • the Base is an analog of cytosine, uracil, or thymine, which optionally comprises a heavy atom, and is selected from the group consisting of:
  • a nucleic acid polymer, such as a polymer of Formula (II), may have one or more instances of any of the above formulae.
  • each instance of R 1 and R 2 is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or
  • -OR B or -SR B , or R 1 and R 2 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
  • R 1 is hydrogen. In certain embodiments, both R 1 and R 2 are hydrogen.
  • At least one of R 1 and R 2 is substituted or unsubstituted C 1-20 alkyl, e.g., at least one of R 1 and R 2 is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • At least one of R 1 and R 2 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • At least one of R 1 and R 2 is alkyl substituted with at least one or more halogen atoms, e.g., at least one of R 1 and R 2 is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • At least one of R 1 and R 2 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • At least one of R 1 and R 2 is -CX 3 , wherein X is halogen.
  • At least one of R 1 and R 2 is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • At least one of R 1 and R 2 is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • at least one of R 1 and R 2 is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • In any of the above instances, in certain embodiments, R 1 is as defined above, and R 2 is hydrogen.
  • At least one of R 1 and R 2 is substituted or unsubstituted C 2-20 alkenyl, e.g., substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • At least one of R 1 and R 2 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 1 and R 2 is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • at least one of R 1 and R 2 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • the alkenyl group is trans or the E-isomer.
  • At least one of R 1 and R 2 is substituted or unsubstituted C 2-20 alkynyl, e.g., at least one of R 1 and R 2 is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • At least one of R 1 and R 2 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • at least one of R 1 and R 2 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 1 and R 2 is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2-4 haloalkynyl, or substituted or unsubstituted C 2-3 haloalkynyl.
  • R 1 is as defined above, and R 2 is hydrogen.
  • At least one of R 1 and R 2 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • At least one of R 1 and R 2 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 1 and R 2 is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • R 1 is as defined above, and R 2 is hydrogen.
  • At least one of R 1 and R 2 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • At least one of R 1 and R 2 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 1 and R 2 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • R 1 is as defined above, and R 2 is hydrogen.
  • At least one of R 1 and R 2 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • At least one of R 1 and R 2 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 1 and R 2 is substituted or unsubstituted haloaryl.
  • at least one of R 1 and R 2 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • At least one of R 1 and R 2 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • At least one of R 1 and R 2 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 1 and R 2 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • R 1 is as defined above, and R 2 is hydrogen.
  • At least one of R 1 and R 2 is a nitrogen protecting group, as defined herein.
  • At least one of R 1 and R 2 is -OR B , wherein R B is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or an oxygen protecting group.
  • At least one of R 1 and R 2 is -SR B , wherein R B is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a sulfur protecting group.
  • R 1 and R 2 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6- membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
  • each instance of R 4 and R 5 is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or
  • -OR B , or -SR B , or R 4 and R 5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
  • R 4 is hydrogen. In certain embodiments, both R 4 and R 5 are hydrogen.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 1-20 alkyl, e.g., at least one of R 4 and R 5 is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl. In certain embodiments, at least one of R 4 and R 5 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • at least one of R 4 and R 5 is -CX 3 , wherein X is halogen.
  • at least one of R 4 and R 5 is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • At least one of R 4 and R 5 is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • at least one of R 4 and R 5 is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • R 4 is as defined above, and R 5 is hydrogen.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 2-20 alkenyl, e.g., substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl. In certain embodiments, at least one of R 4 and R 5 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • at least one of R 4 and R 5 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • the alkenyl group is trans or the E-isomer.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 2-20 alkynyl, e.g., at least one of R 4 and R 5 is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • At least one of R 4 and R 5 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • at least one of R 4 and R 5 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2-4 haloalkynyl, or substituted or unsubstituted C 2-3 haloalkynyl.
  • the haloalkynyl is a perhaloalkynyl group.
  • at least one of R 4 and R 5 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • R 4 is as defined above, and R 5 is hydrogen.
  • At least one of R 4 and R 5 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • At least one of R 4 and R 5 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • R 4 is as defined above, and R 5 is hydrogen.
  • At least one of R 4 and R 5 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • At least one of R 4 and R 5 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • R 4 is as defined above, and R 5 is hydrogen.
  • At least one of R 4 and R 5 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • at least one of R 4 and R 5 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted haloaryl.
  • At least one of R 4 and R 5 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • R 4 is as defined above, and R 5 is hydrogen.
  • At least one of R 4 and R 5 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • At least one of R 4 and R 5 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R 4 and R 5 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • R 4 is as defined above, and R 5 is hydrogen.
  • At least one of R 4 and R 5 is a nitrogen protecting group, as defined herein.
  • at least one of R 4 and R 5 is -OR B , wherein R B is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or an oxygen protecting group.
  • At least one of R 4 and R 5 is -SR B , wherein R B is independently hydrogen, substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a sulfur protecting group.
  • R 4 and R 5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6- membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
  • R 3 is independently substituted or unsubstituted C 1-20 alkyl, substituted or unsubstituted C 2-20 alkenyl, substituted or unsubstituted C 2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -OR C , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R 3 is substituted or unsubstituted C 1-20 alkyl, e.g., substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • R 3 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • R 3 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • R 3 is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R 3 is -CX 3 , wherein X is halogen.
  • R 3 is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • R 3 is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • R 3 is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • R 3 is substituted or unsubstituted C 2-20 alkenyl, e.g., substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • R 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • R 3 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted C 2-20 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • R 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • the alkenyl group is trans or the E-isomer.
  • R 3 is substituted or unsubstituted C 2-20 alkynyl, e.g., R 3 is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • R 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • R 3 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted C 2-20 haloalkynyl.
  • the haloalkynyl is a perhaloalkynyl group.
  • R 3 is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • R 3 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • R 3 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • R 3 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • R 3 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • R 3 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • R 3 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted haloaryl.
  • R 3 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • R 3 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • R 3 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R 3 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • R 3 is halogen, i.e., R 3 is -Br, -I, -F, or -Cl.
  • R 3 is -OR C , and R C is hydrogen, i.e., R 3 is -OH.
  • R 3 is -OR C , and R C is an oxygen protecting group, as defined herein.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 1-20 alkyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • R C is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-6 haloalkyl, substituted or unsubstituted C 1-4 haloalkyl, substituted or unsubstituted C 1-3 haloalkyl, or substituted or unsubstituted C 1-2 haloalkyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R C is -CX 3 , wherein X is halogen.
  • R C is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • R C is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • R C is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 2-20 alkenyl, e.g., R 3 is -OR C , and R C is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or unsubstituted C 2-6 alkenyl, substituted or unsubstituted C 2-4 alkenyl, or substituted or unsubstituted C 2-3 alkenyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • R C is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted C 2-4 haloalkenyl, or substituted or unsubstituted C 2-3 haloalkenyl.
  • the haloalkenyl is a perhaloalkenyl group.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 2-20 alkynyl, e.g., R 3 is -OR C , and R C is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alkynyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • R C is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2-4 haloalkynyl, or substituted or unsubstituted C 2-3 haloalkynyl.
  • the haloalkynyl is substituted or unsub
  • R 3 is -OR C , and R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • R C is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • R 3 is -OR C , and R C is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • R C is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • R 3 is -OR c , and R c is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • R c is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R c is substituted or unsubstituted haloaryl.
  • R c is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • R 3 is -OR c
  • R c is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • R c is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R c is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • R 3 is -SR C
  • R c is hydrogen, i.e., R 3 is -SH.
  • R 3 is -SR C
  • R c is a sulfur protecting group, as defined herein.
  • R 3 is -SR C
  • R c is substituted or unsubstituted C 1-20 alkyl
  • R 3 is -SR C
  • R C is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted
  • R 3 is -SR C
  • R C is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • R C is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstituted C 1-8 haloalkyl, substituted or unsubstituted C 1-
  • R 3 is -SR C , and R C is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • R c is -CX 3 , wherein X is halogen.
  • R c is -CBr 3 , -CBr 2 H, -CBrH 2 , -CBr 2 X, or -CBrX 2 , wherein each instance of X is independently -Cl, -F, or -I.
  • R c is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • R c is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • R 3 is -OR c , and R c is substituted or unsubstituted C 2-20 alkenyl, e.g., R 3 is -SR C , and R C is substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or
  • R 3 is -SR C
  • R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • R C is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted or unsub
  • the haloalkenyl is a perhaloalkenyl group.
  • R 3 is -SR C
  • R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • the alkenyl group is trans or the E-isomer.
  • R 3 is -SR C
  • R c is substituted or unsubstituted C 2-20 alkynyl
  • R 3 is -SR C
  • R C is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or unsubstituted C 2-6 alkynyl, substituted or unsubstituted C 2-4 alkynyl, or substituted or unsubstituted C 2-3 alky
  • R 3 is -SR C
  • R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • R C is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-12 haloalky
  • R 3 is -SR C , and R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • R 3 is -SR C
  • R c is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • R c is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R c is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • R 3 is -SR C
  • R c is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • R c is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R c is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • R 3 is -SR C
  • R c is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • R c is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R c is substituted or unsubstituted haloaryl.
  • R c is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • R 3 is -SR C
  • R c is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • R c is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R c is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • R 3 is -N(R C ) 2 , and at least one R C is hydrogen, i.e., R 3 is -NHR C or -NH 2 .
  • R 3 is -N(R C ) 2 , and at least one R c is a nitrogen protecting group, as defined herein.
  • R 3 is -N(R C ) 2 , and at least one R c is substituted or unsubstituted C 1-20 alkyl, e.g., R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 1-18 alkyl, substituted or unsubstituted C 1-16 alkyl, substituted or unsubstituted C 1-14 alkyl, substituted or unsubstituted C 1-12 alkyl, substituted or unsubstituted C 1-10 alkyl, substituted or unsubstituted C 1-8 alkyl, substituted or unsubstituted C 1-6 alkyl, substituted or unsubstituted C 1-4 alkyl, substituted or unsubstituted C 1-3 alkyl, or substituted or unsubstituted C 1-2 alkyl.
  • R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -alkyl.
  • at least one R C is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one R C is substituted or unsubstituted C 1-20 haloalkyl, substituted or unsubstituted C 1-18 haloalkyl, substituted or unsubstituted C 1-16 haloalkyl, substituted or unsubstituted C 1-14 haloalkyl, substituted or unsubstituted C 1-12 haloalkyl, substituted or unsubstituted C 1-10 haloalkyl, substituted or unsubstit
  • R 3 is -N(R c ) 2 , and at least one R c is substituted or unsubstituted C 1 , C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkyl.
  • the haloalkyl is a perhaloalkyl group.
  • at least one R c is -CX 3 , wherein X is halogen.
  • at least one R c is -
  • At least one R c is -CI 3 , -CI 2 H, -CIH 2 , -CI 2 X, or -CIX 2 , wherein each instance of X is independently -Cl, -F, or -Br.
  • at least one R c is -CBr 3 , -CI 3 , -CFClBr, or -CClBrI.
  • R 3 is -N(R C ) 2 , and at least one R c is substituted or unsubstituted C 2-20 alkenyl, e.g., R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 2-18 alkenyl, substituted or unsubstituted C 2-16 alkenyl, substituted or unsubstituted C 2-14 alkenyl, substituted or unsubstituted C 2-12 alkenyl, substituted or unsubstituted C 2-10 alkenyl, substituted or unsubstituted C 2-8 alkenyl, substituted or
  • R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkenyl.
  • At least one R C is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R C is substituted or unsubstituted C 2-20 haloalkenyl, substituted or unsubstituted C 2-18 haloalkenyl, substituted or unsubstituted C 2-16 haloalkenyl, substituted or unsubstituted C 2-14 haloalkenyl, substituted or unsubstituted C 2-12 haloalkenyl, substituted or unsubstituted C 2-10 haloalkenyl, substituted or unsubstituted C 2-8 haloalkenyl, substituted or unsubstituted C 2-6 haloalkenyl, substituted or unsubstituted or unsub
  • the haloalkenyl is a perhaloalkenyl group.
  • R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkenyl.
  • the alkenyl group is trans or the E-isomer.
  • R 3 is -N(R C ) 2 , and at least one R c is substituted or unsubstituted C 2-20 alkynyl, e.g., R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 2-18 alkynyl, substituted or unsubstituted C 2-16 alkynyl, substituted or unsubstituted C 2-14 alkynyl, substituted or unsubstituted C 2-12 alkynyl, substituted or unsubstituted C 2-10 alkynyl, substituted or unsubstituted C 2-8 alkynyl, substituted or
  • R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -alkynyl.
  • At least one R C is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R C is substituted or unsubstituted C 2-20 haloalkynyl, substituted or unsubstituted C 2-18 haloalkynyl, substituted or unsubstituted C 2-16 haloalkynyl, substituted or unsubstituted C 2-14 haloalkynyl, substituted or unsubstituted C 2-12 haloalkynyl, substituted or unsubstituted C 2-10 haloalkynyl, substituted or unsubstituted C 2-8 haloalkynyl, substituted or unsubstituted C 2-6 haloalkynyl, substituted or unsubstituted C 2
  • the haloalkynyl is a perhaloalkynyl group.
  • R 3 is -N(R C ) 2 , and at least one R C is substituted or unsubstituted C 2 , C 3 , C 4 , C 5 , or C 6 -haloalkynyl.
  • R 3 is -N(R C ) 2
  • at least one R c is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C 3 carbocyclyl, substituted or unsubstituted C 4 carbocyclyl, substituted or unsubstituted C 5 carbocyclyl, or substituted or unsubstituted C 6 carbocyclyl.
  • At least one R c is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R c is substituted or unsubstituted C 3 halocarbocyclyl, substituted or unsubstituted C 4 halocarbocyclyl, substituted or unsubstituted C 5 halocarbocyclyl, or substituted or unsubstituted C 6 halocarbocyclyl.
  • R 3 is -N(R C ) 2 , and at least one R c is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl.
  • At least one R c is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R c is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
  • R 3 is -N(R C ) 2
  • at least one R c is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl.
  • at least one R c is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R c is substituted or unsubstituted haloaryl.
  • At least one R c is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
  • R 3 is -N(R C ) 2 , and at least one R c is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl.
  • At least one R c is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one R c is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
  • R 3 is -N(R C ) 2 , and two R c groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6-membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
  • R 3 is -SHgR D , wherein RD is
  • R 3 is -SHgMe, -SHg or -SO 2 SHg.
  • R 3 is -SeR D , as defined herein.
  • R 3 is -TeR D , as defined herein.
  • each instance of L 1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C 1-20 alkylene, substituted or unsubstituted C 2-20 alkenylene, substituted or unsubstituted C 2-20 alkynylene, substituted or unsubstituted heteroC 1-20 alkylene, substituted or unsubstituted heteroC 2-20 alkenylene, substituted or unsubstituted heteroC 2-20 alkynylene, substituted or unsubstituted carbocyclylene, substituted or
  • L 1 is absent, and R 3 is attached directly to the ring system.
  • L 1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene; substituted and unsubstituted
  • a linking moiety consisting of "a combination" refers to a linking moiety comprising 1, 2, 3, 4, or more of the recited moieties.
  • the linking moiety may consist of an alkynylene attached to an alkylene.
  • at least one instance refers to 1, 2, 3, 4, or more instances of the recited moiety.
  • L 1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene; substituted and unsubstituted alkenylene; and substituted and unsubstituted alkynylene, and combinations thereof.
  • L 1 comprises at least one instance of substituted or unsubstituted alkylene, e.g., substituted or unsubstituted C 1-6 alkylene, substituted or unsubstituted C 1-2 alkylene, substituted or unsubstituted C 2-3 alkylene, substituted or unsubstituted C 3-4 alkylene, substituted or unsubstituted C 4-5 alkylene, substituted or unsubstituted C 5-6 alkylene, substituted or unsubstituted C 3-6 alkylene, or substituted or unsubstituted C 4-6 alkylene.
  • Exemplary alkylene groups include unsubstituted alkylene groups such as methylene -CH 2 -, ethylene -(CH 2 ) 2 -, n-propylene -(CH 2 ) 3 -, n-butylene -(CH 2 ) 4 -, n-pentylene -(CH 2 ) 5 -, and n-hexylene -(CH 2 ) 6 -.
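The unbranched alkylene examples listed above follow one pattern, a chain of n methylene units. The following is a minimal Python sketch of that pattern, offered only as an illustration; the names `ALKYLENE_NAMES` and `alkylene_formula` are hypothetical and do not appear in the source, and the plain-text formula strings are an assumed notation.

```python
# Illustrative sketch only: names and string notation are assumptions,
# chosen to mirror the six unsubstituted alkylene examples in the text.
ALKYLENE_NAMES = {
    1: "methylene",
    2: "ethylene",
    3: "n-propylene",
    4: "n-butylene",
    5: "n-pentylene",
    6: "n-hexylene",
}


def alkylene_formula(n: int) -> str:
    """Return a plain-text formula for an unbranched alkylene of n carbons."""
    if n not in ALKYLENE_NAMES:
        raise ValueError("only the C1 to C6 examples from the text are covered")
    # A single methylene is written without the repeat count.
    return "-CH2-" if n == 1 else f"-(CH2){n}-"


if __name__ == "__main__":
    for n, name in ALKYLENE_NAMES.items():
        print(f"{name}: {alkylene_formula(n)}")
```

The sketch covers only the six named examples; substituted or longer alkylene groups recited elsewhere are outside its scope.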
  • L 1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene and substituted and unsubstituted alkenylene, and combinations thereof.
  • L 1 comprises at least one instance of substituted or unsubstituted alkenylene, e.g., substituted or unsubstituted C 2-6 alkenylene, substituted or unsubstituted C 2-3 alkenylene, substituted or unsubstituted C 3-4 alkenylene, substituted or unsubstituted C 4-5 alkenylene, or substituted or unsubstituted C 5-6 alkenylene.
  • L 1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene and substituted and unsubstituted alkynylene, and combinations thereof.
  • L 1 comprises at least one instance of substituted or unsubstituted alkynylene, e.g., substituted or unsubstituted C 2-6 alkynylene, substituted or unsubstituted C 2-3 alkynylene, substituted or unsubstituted C 3-4 alkynylene, substituted or unsubstituted C 4-5 alkynylene, or substituted or unsubstituted C 5-6 alkynylene.
  • L 1 comprises at least one instance of substituted or unsubstituted heteroalkylene, e.g., substituted or unsubstituted heteroC 1-6 alkylene, substituted or unsubstituted heteroC 1-2 alkylene, substituted or unsubstituted heteroC 2-3 alkylene, substituted or unsubstituted heteroC 3-4 alkylene, substituted or unsubstituted heteroC 4-5 alkylene, or substituted or unsubstituted heteroC 5-6 alkylene.
  • exemplary heteroalkylene groups include unsubstituted alkylene groups such as -(CH 2 ) 2 -O(CH 2 ) 2 -, -OCH 2 -, -CH 2 O-,
  • L 1 comprises at least one instance of substituted or unsubstituted heteroalkenylene, e.g., substituted or unsubstituted heteroC 2-6 alkenylene, substituted or unsubstituted heteroC 2-3 alkenylene, substituted or unsubstituted heteroC 3-4 alkenylene, substituted or unsubstituted heteroC 4-5 alkenylene, or substituted or unsubstituted heteroC 5-6 alkenylene.
  • L 1 comprises at least one instance of substituted or unsubstituted heteroalkynylene, e.g., substituted or unsubstituted heteroC 2-6 alkynylene, substituted or unsubstituted heteroC 2-3 alkynylene, substituted or unsubstituted heteroC 3-4 alkynylene, substituted or unsubstituted heteroC 4-5 alkynylene, or substituted or unsubstituted heteroC 5-6 alkynylene.
  • L 1 comprises at least one instance of substituted or unsubstituted carbocyclylene, e.g., substituted or unsubstituted C 3-6 carbocyclylene, substituted or unsubstituted C 3-4 carbocyclylene, substituted or unsubstituted C 4-5
  • L 1 comprises at least one instance of substituted or unsubstituted heterocyclylene, e.g., substituted or unsubstituted C 3-6 heterocyclylene, substituted or unsubstituted C 3-4 heterocyclylene, substituted or unsubstituted C 4-5 heterocyclylene, or substituted or unsubstituted C 5-6 heterocyclylene.
  • L 1 comprises at least one instance of substituted or unsubstituted arylene, e.g., substituted or unsubstituted phenylene.
  • L 1 comprises at least one instance of substituted or unsubstituted heteroarylene, e.g., substituted or unsubstituted 5- to 6-membered
  • L 1 represents a linking moiety consisting of a combination of one or more consecutive covalently bonded groups of the formula:
  • each instance of m is independently an integer between 1 to 10, inclusive;
  • each instance of p is independently an integer between 1 to 4, inclusive;
  • each instance of R W1 is independently hydrogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or a nitrogen protecting group; each instance of R W2 is independently hydrogen; halogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or two R W2 groups are joined to form a substituted or unsubstituted 5- to 6-membered
  • L 1 represents a linking moiety consisting of a combination of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 consecutive covalently bonded groups of the above described formulae. It should be generally understood that multiple instances of a given variable or group present in a linking moiety may optionally differ.
  • each instance of R W1 is independently hydrogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; a nitrogen protecting group if attached to a nitrogen atom, or an oxygen protecting group if attached to an oxygen atom.
  • each instance of R W1 is independently hydrogen; substituted or unsubstituted alkyl (e.g., methyl); or a nitrogen protecting group.
  • each instance of R W2 is independently hydrogen; halogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or two R W2 groups are joined to form a 5-6 membered ring.
  • each instance of R W2 is independently hydrogen, halogen (e.g., -Br, -Cl, -F, or -I), or substituted or unsubstituted alkyl (e.g., methyl).
  • each instance of m is independently an integer between 1 to 10, inclusive.
  • m is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
  • each instance of p is independently an integer between 1 to 4, inclusive. In certain embodiments, p is 1, 2, 3, or 4.
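The numeric limits stated above (up to 20 consecutive groups in a linking moiety, m from 1 to 10 inclusive, p from 1 to 4 inclusive) can be expressed as a simple range check. The sketch below is an illustration under stated assumptions: the `LinkerGroup` record and `validate_linker` function are hypothetical names, the `kind` label is free text, and the structural formulae of the groups themselves are not modeled because they are not reproduced in this text.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class LinkerGroup:
    """Hypothetical record for one group in a linking moiety L1."""
    kind: str                # free-text label, e.g. "alkylene" (assumed)
    m: Optional[int] = None  # repeat index m, if the group uses one
    p: Optional[int] = None  # repeat index p, if the group uses one


def validate_linker(groups: Sequence[LinkerGroup]) -> List[str]:
    """Return a list of range violations; an empty list means none were found."""
    problems: List[str] = []
    # An empty sequence models the case where L1 is absent.
    if len(groups) > 20:
        problems.append("more than 20 consecutive groups")
    for i, g in enumerate(groups):
        if g.m is not None and not 1 <= g.m <= 10:
            problems.append(f"group {i}: m={g.m} is outside 1 to 10")
        if g.p is not None and not 1 <= g.p <= 4:
            problems.append(f"group {i}: p={g.p} is outside 1 to 4")
    return problems


if __name__ == "__main__":
    linker = [LinkerGroup("alkynylene"), LinkerGroup("alkylene", m=3)]
    print(validate_linker(linker))
```

Each group carries its own m and p, which reflects the statement that multiple instances of a variable within one linking moiety may differ.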
  • R W2 , m, and R 3 are as defined herein.
  • m is 1, 2, 3, or 4.
  • each instance of R W2 is hydrogen.
  • R 3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I).
  • R 3 is halogen (e.g., -Br, -I), -OR c , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R 3 is -OR C . In certain embodiments, R 3 is -SR C .
  • R 3 is -N(R C ) 2 . In certain embodiments, R C is -CX 3 , wherein X is halogen.
  • R 3 is -SHgR D (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R 3 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • R 3 is Se-CBr 3 or -TeBr 3 .
  • each instance of M 3 and M 4 is independently O, Se, Te, CH 2 , CF 2 , CCl 2 , CBr 2 , or CI 2 .
  • M 3 is O. In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is Se. In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is Te. In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is CH 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is CF 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is CCl 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is CBr 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 3 is CI 2 .
  • In certain embodiments of formula (ii), (iv), (ix), and (x), M 4 is O.
  • M 4 is Se. In certain embodiments of formula (iii), (iv), (ix), and (x), M 4 is Te. In certain embodiments of formula (vii), (viii), (ix), and (x), M 4 is CH 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 4 is CF 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 4 is CCl 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 4 is CBr 2 . In certain embodiments of formula (vii), (viii), (ix), and (x), M 4 is CI 2 .
  • M 3 is O and M 4 is O.
  • M 3 is Se and M 4 is O.
  • M 3 is Te and M 4 is O.
  • M 3 is CH 2 and M 4 is O.
  • M 3 is CF 2 and M 4 is O.
  • M 3 is CCl 2 and M 4 is O.
  • M 3 is CBr 2 and M 4 is O.
  • M 3 is CI 2 and M 4 is O.
  • M 3 is O and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is Se and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is Te and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is CH 2 and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is CF 2 and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is CCl 2 and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is CBr 2 and M 4 is Se. In certain embodiments of formula (ix) and (x), M 3 is CI 2 and M 4 is Se.
  • M 3 is O and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is Se and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is Te and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is CH 2 and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is CF 2 and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is CCl 2 and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is CBr 2 and M 4 is Te. In certain embodiments of formula (ix) and (x), M 3 is CI 2 and M 4 is Te.
  • M 3 is O and M 4 is CH 2 .
  • M 3 is Se and M 4 is CH 2 .
  • M 3 is Te and M 4 is CH 2 .
  • M 3 is CH 2 and M 4 is CH 2 .
  • M 3 is CF 2 and M 4 is CH 2 .
  • M 3 is CCl 2 and M 4 is CH 2 .
  • M 3 is CBr 2 and M 4 is CH 2 .
  • M 3 is CI 2 and M 4 is CH 2 .
  • M 3 is O and M 4 is CF 2 .
  • M 3 is Se and M 4 is CF 2 .
  • M 3 is Te and M 4 is CF 2 .
  • M 3 is CH 2 and M 4 is CF 2 .
  • M 3 is CF 2 and M 4 is CF 2 .
  • M 3 is CCl 2 and M 4 is CF 2 .
  • M 3 is CBr 2 and M 4 is CF 2 .
  • M 3 is CI 2 and M 4 is CF 2 .
  • M 3 is O and M 4 is CCl 2 .
  • M 3 is Se and M 4 is CCl 2 .
  • M 3 is Te and M 4 is CCl 2 .
  • M 3 is CH 2 and M 4 is CCl 2 . In certain embodiments of formula (ix) and (x), M 3 is CF 2 and M 4 is CCl 2 . In certain embodiments of formula (ix) and (x), M 3 is CCl 2 and M 4 is CCl 2 . In certain embodiments of formula (ix) and (x), M 3 is CBr 2 and M 4 is CCl 2 . In certain embodiments of formula (ix) and (x), M 3 is CI 2 and M 4 is CCl 2 .
  • M 3 is O and M 4 is CBr 2 .
  • M 3 is Se and M 4 is CBr 2 .
  • M 3 is Te and M 4 is CBr 2 .
  • M 3 is CH 2 and M 4 is CBr 2 . In certain embodiments of formula (ix) and (x), M 3 is CF 2 and M 4 is CBr 2 . In certain embodiments of formula (ix) and (x), M 3 is CCl 2 and M 4 is CBr 2 . In certain embodiments of formula (ix) and (x), M 3 is CBr 2 and M 4 is CBr 2 . In certain embodiments of formula (ix) and (x), M 3 is CI 2 and M 4 is CBr 2 .
  • M 3 is O and M 4 is CI 2 .
  • M 3 is Se and M 4 is CI 2 .
  • M 3 is Te and M 4 is CI 2 .
  • M 3 is CH 2 and M 4 is CI 2 .
  • M 3 is CF 2 and M 4 is CI 2 .
  • M 3 is CCl 2 and M 4 is CI 2 .
  • M 3 is CBr 2 and M 4 is CI 2 .
  • M 3 is CI 2 and M 4 is CI 2 .
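The pairings of M 3 and M 4 enumerated above are the Cartesian product of the eight options recited for each position. As a minimal sketch, assuming plain-text labels for the eight options (the names `M_OPTIONS` and `M3_M4_PAIRS` are hypothetical), the full set can be generated rather than listed by hand:

```python
from itertools import product

# Plain-text labels for the eight recited options; CI2 is the diiodo
# carbon (carbon plus two iodine atoms), CCl2 the dichloro carbon.
M_OPTIONS = ("O", "Se", "Te", "CH2", "CF2", "CCl2", "CBr2", "CI2")

# Every ordered (M3, M4) pairing, since each is chosen independently.
M3_M4_PAIRS = list(product(M_OPTIONS, repeat=2))

if __name__ == "__main__":
    print(len(M3_M4_PAIRS), "combinations")
    for m3, m4 in M3_M4_PAIRS:
        print(f"M3 is {m3} and M4 is {m4}")
```

Because M 3 and M 4 are each selected independently, the pairs are ordered, so (O, Se) and (Se, O) count as distinct combinations.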
  • Various combinations of the above-described embodiments of the "base region" are further contemplated herein.
  • at least one instance of a Base is of formula (i), (iii), (v), (vii), or (ix):
  • the base optionally comprises a heavy atom.
  • the base comprises a heavy atom.
  • the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom.
  • M is O. In certain embodiments, M is Se. In certain embodiments, M is O. In certain
  • M is Se. In certain embodiments, R and R are hydrogen. In certain embodiments, R 4 and R 5 are hydrogen. In certain embodiments, L 1 is absent. In certain embodiments, L 1 is substituted or unsubstituted alkynylene. In certain embodiments, L 1 is substituted or uns formula:
  • R W2 , m, and R 3 are as defined herein.
  • m is 1, 2, 3, or 4.
  • each instance of R W2 is hydrogen.
  • R 3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I).
  • R 3 is halogen (e.g., -Br, -I), -OR c , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R 3 is -OR C . In certain embodiments, R 3 is -SR C .
  • R 3 is -N(R C ) 2 . In certain embodiments, R C is -CX 3 , wherein X is halogen.
  • R 3 is -SHgR D (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R 3 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • R is
  • R 3 is Se-CBr 3 or -TeBr 3 .
  • the base optionally comprises a heavy atom.
  • the base comprises a heavy atom.
  • the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom.
  • R 3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I).
  • R 3 is halogen (e.g., -Br, -I), -OR c , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D ,
  • R 3 is -OR C . In certain embodiments, R 3 is -SR C .
  • R 3 is -N(R C ) 2 . In certain embodiments, R C is -CX 3 , wherein X is halogen. In certain embodiments, R 3 is -SHgR D (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R 3 is -SeR D , e.g., -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g., -TeCX 3 , wherein X is halogen.
  • R 3 is Se-CBr 3 or -TeBr 3 .
  • R 4 and R 5 are hydrogen.
  • G 2 of the sugar region is hydrogen.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr 3 or -TeBr 3 .
  • G is a monophosphate, diphosphate, or triphosphate of formula:
  • M is -O-.
  • R is aryl substituted with at least one or more halogen atoms (e.g. , -Br, -I).
  • R is halogen (e.g., -Br, -I), -OR C , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R is -OR . In certain embodiments, R is -SR . In certain embodiments, R c c
  • R is -N(R ) 2 .
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g. , - TeCX 3 , wherein X is halogen.
  • R is Se-CBr 3 or -TeBr 3 .
  • Base is of formula (ix-a), (ix-b), (ix-c), or (ix-d)
  • a compound of Formula (II'-d) comprising at least one instance of (II'-d-ix-a), (II'-d-ix-b), (II'-d-ix-c), and/or (II'-d-ix-d):
  • M 2 is -O-.
  • R 4 and R 5 are hydrogen.
  • G 2 of the sugar region is hydrogen.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr or -TeBr .
  • R is aryl substituted with at least one or more halogen atoms (e.g. , -Br, -I).
  • R is halogen (e.g.
  • R 3 is -OR C .
  • R 3 is -SR C .
  • R 3 is -N(R c ) 2 .
  • R c is -CX 3 , wherein X is halogen.
  • R 3 is -SHgRD (e In certain embodiments, R 3
  • R 3 is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g. , - TeCX 3 , wherein X is halogen.
  • R is Se-CBr 3 or -TeBr 3 .
  • L 1 is absent, or -L 1 -R3 is of formula or , m is 1, and each R W2 is hydrogen, provided is a Base of formula (i-a), (i-b), (i-c), or (i-d):
  • the base optionally comprises a heavy atom.
  • the base comprises a heavy atom.
  • the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom.
  • R 4
  • R is aryl substituted with at least one or more
  • R is halogen (e.g. , -Br, -I).
  • R is halogen (e.g., -Br, -I), -OR C , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R 3 is -
  • R is -SR . In certain embodiments, R is -N(R ) 2 . In certain
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R 3 is -SeR D , e.g. , -SeCX 3 ,
  • R is -TeR , e.g. , -TeCX 3 , wherein X is halogen.
  • R is Se-CBr or -TeBr .
  • R 4 and R 5 are hydrogen.
  • G 2 of the sugar region is hydrogen.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr or -TeBr .
  • G is a monophosphate, diphosphate, or triphosphate of formula:
  • M is -O-.
  • R is aryl substituted with at least one or more halogen atoms (e.g. , -Br, -I).
  • R is halogen (e.g., -Br, -I), -OR C , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R is -OR .
  • R is -SR .
  • R c c
  • R is -N(R ) 2 .
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g. , - TeCX 3 , wherein X is halogen.
  • R is Se-CBr 3 or -TeBr 3 .
  • Base is of formula (i-a), (i-b), (i-c), or (i-d)
  • a compound of Formula (II'-d) comprising at least one instance of (II'-d-i-a), (II'-d-i-b), (II'-d-i-c), and/or (II'-d-i-d):
  • M 2 is -O-.
  • R 4 and R 5 are hydrogen.
  • G 2 of the sugar region is hydrogen.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr 3 or -TeBr 3 .
  • R is aryl substituted with at least one or more halogen atoms (e.g. , -Br, -I).
  • R is halogen (e.g.
  • R is -OR . In certain embodiments, R is -SR . In certain embodiments, R c c
  • R is -N(R ) 2 .
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g. , - TeCX 3 , wherein X is halogen.
  • R is Se-CBr 3 or -TeBr 3 .
  • L is absent, or -L -R is a group
  • the base optionally comprises a heavy atom.
  • the base comprises a heavy atom.
  • the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom.
  • R 1 is a heavy atom.
  • R is aryl substituted with at least one or more
  • R is halogen (e.g. , -Br, -I).
  • R is halogen (e.g., -Br, -I), -OR C , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R 3 is -
  • R is -SR . In certain embodiments, R is -N(R ) 2 . In certain
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R 3 is -SeR D , e.g. , -SeCX 3 ,
  • R is -TeR , e.g. , -TeCX 3 , wherein X is halogen.
  • R 3 is Se-CBr or -TeBr .
  • M 4 is O. In certain embodiments, M 4 is Se.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr or -TeBr .
  • G is a monophosphate, diphosphate, or triphosphate of formula:
  • M is -O-.
  • R is aryl substituted with at least one or more halogen atoms (e.g. , -Br, -I).
  • R is halogen (e.g., -Br, -I), -OR C , -SR C , -N(R C ) 2 , -SHg, -SO 2 SHg, -SHgR D , -SeR D , or -TeR D .
  • R is -OR . In certain embodiments, R is -SR . In certain embodiments, R c c
  • R is -N(R ) 2 .
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g. , - TeCX 3 , wherein X is halogen.
  • R is Se-CBr 3 or -TeBr 3 .
  • M 4 is O. In certain embodiments, M 4 is Se.
  • Base is of formula (i-a), (i-b), (i-c), or (i-d)
  • a compound of Formula (II'-d) comprising at least one instance of (II'-d-i-a), (II'-d-i-b), (II'-d-i-c), and/or (II'-d-i-d):
  • M 2 is -O-.
  • R 4 and R 5 are hydrogen.
  • G 2 of the sugar region is hydrogen.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr 3 or -TeBr 3 .
  • R is aryl substituted with at least one or more halogen atoms (e.g. , -Br, -I).
  • R is halogen (e.g.
  • R is -OR . In certain embodiments, R is -SR . In certain embodiments, R c c
  • R is -N(R ) 2 .
  • R is -CX 3 , wherein X is halogen.
  • R is -SHgR (e.g., -SHgMe), -SHg or -SO 2 SHg.
  • R is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • R 3 is -TeR D , e.g. , - TeCX 3 , wherein X is halogen.
  • R is Se-CBr 3 or -TeBr 3 .
  • M 4 is O. In certain embodiments, M 4 is Se.
  • At least one instance of a Base is of formula (ii), (iv), (vi), (viii), or (x):
  • the base optionally comprises a heavy atom.
  • the base comprises a heavy atom.
  • the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom.
  • M 3 is O.
  • M 4 is O.
  • M 3 and M 4 are both O.
  • M 3 is Se or Te.
  • M 4 is Se or Te.
  • R 4 is -CX 3 wherein X is halogen.
  • At least one instance of a Base is:
  • sugar region comprises a heavy atom.
  • M 3 and M 4 are O.
  • R 1 and R 2 are hydrogen.
  • R 4 and R 5 are hydrogen.
  • G 2 of the sugar region is hydrogen.
  • G 2 of the sugar region is -SHgMe, -SHg or -SO 2 SHg.
  • G 2 of the sugar region is -SeR D , e.g. , -SeCX 3 , wherein X is halogen.
  • G 2 of the sugar region is -TeR D , e.g. , -TeCX 3 , wherein X is halogen.
  • G 2 is Se-CBr or -TeBr .
  • Exemplary compounds of Formula (I) include, but are not limited to:
  • Exemplary compounds of Formula (II), and salts thereof, comprise at least one instance of any one of the formula:
  • annular dark-field (ADF) imaging is utilized in the present work.
  • MCSTEM monochromated, spherical-aberration corrected scanning transmission electron microscope
  • FIG. 1 The ADF-STEM was the method of choice for Crewe and co-workers to originally image single heavy atoms in 1970 (Crewe, 1970; Crewe et al., 1970), anticipating that the method might be used for sequencing DNA.
  • Recent STEM improvements now allow studies of atomic-level and single atom imaging (Batson et al., 2002; Voyles et al., 2002; Jia et al., 2003).
  • a very small electron beam is raster-scanned across the sample.
  • test pattern DNA was built from a synthetic gene (provided by DNA 2.0, Menlo Park, CA, USA) with a 3,072 base-pair segment with all the thymines of one strand in a repeating pattern,...TNTNNNNNNN..., where T represents thymine and N represents any of the other three nucleobases.
  • This pseudo-repeating region was amplified using flanking priming sites and standard polymerase chain reaction methods, with one standard primer and one biotinylated primer.
  • the product was purified by centrifugation filter and bound to Dynabeads. Single-stranded DNA was obtained by denaturation, and the template strand was then used as template in a one-primer, one-cycle polymerization reaction using Bst polymerase under standard reaction conditions, replacing 1 ⁇ of dTTP with 1.5 ⁇ CH 3 -Hg-S-dUTP (Livingston et al., 1976). The DNA product was gel purified. Final concentration and buffer exchange were done on a centrifugation filter. The efficacy of label inclusion was tested with restriction enzymes, which were seen not to react with modified recognition sites (Banfalvi & Sarkar, 1995), confirming the presence of modifications at those sites. The DNA was also assayed by inductively coupled plasma mass spectrometry, which confirmed the presence of mercury. Single-stranded M13 and primers were processed in the same manner.
  • Double-stranded DNA was prepared that had been completely substituted with mercury-labeled nucleotides on one strand, using 5-MeHgS-dUTP.
  • This "Z-dNTP" is labeled on the 5 carbon of the uridines and is known to readily incorporate into DNA (Bridgman & Petersen, 1996), taking the place of the thymines.
  • M13 DNA a 7,249 base-pair viral genome molecule
  • a 3,272 base-pair synthetic molecule with a visually identifiable "test pattern." Success with both confirmed the efficient incorporation of labels into DNA molecules substantially longer than those sequenceable via other technologies.
  • FIG 3B shows a bright-field TEM image of a prepared and linearized DNA molecule on a thin amorphous carbon substrate.
  • a critical distinguishing factor in identifying these molecules is their general morphology. Specifically, at relatively low TEM
  • the labeled DNA molecules are seen to be 2 nm in width.
  • the lengths of the molecules match the known lengths of the M13 and test pattern molecules, with an allowance for elongation. These observed widths and lengths correspond to the known dimensions of DNA molecules; no such features are found in control samples that do not include labeled DNA.
  • the resulting STEM images were despeckled and thresholded to identify the features with the greatest contrast. Features that matched the known morphology of linearized DNA were selected. Features that did not match known DNA morphology were not included in subsequent analysis. A trace was drawn over the centerline of the resulting linear features. The individual features are assessed to be individual mercury atoms, or in the case of the M13 molecules, adjacent mercury atoms. This continuous trace, including dark-field current values for both high contrast features and low contrast gaps, is shown in FIG. 4 as the dark-field current.
  • test pattern molecule is a synthetic gene (Villalobos et al., 2006), with a sequence specified to include two identifiable patterns. In this molecule, a pair of uridine bases, each with one heavy atom, are separated by exactly one nonlabeled base.
  • FIG. 5 shows a region of the test pattern molecule. The ADF-STEM intensity is noticeable above the substrate
  • the track of the atoms follows an approximately linear pattern, which corresponds to the known position of the linearized DNA. This allows us to conclude not only that the features represent individual, high-Z atoms, but that those atoms are collocated with the DNA molecule.
  • Every heavy atom in this sequence is in a location predicted by the pattern built into the synthetic sequence. While there are missing mercury atoms, there are none where they should not be. More specifically, within this 180 base-pair segment of DNA, the test pattern repeats, in part, 15 times. Of the 30 predicted labels, 17 are present. Fifteen occur in positions predicted by proximity to neighbors on both sides; the other two labels appear in locations predicted by neighbors on only one side. These two also are larger in cross section than the others, probably indicating a tangle that has added to the contrast of the mercury atom, and shortened the distance on one side of the label. All labels are found within the DNA molecule itself. The mercury atom contrast is consistent and statistically above the contrast found either outside the molecule or elsewhere within the molecule.
  • the smaller scale pattern is expected to have a characteristic distance of 0.7 to 1.2 nm.
  • the larger scale pattern is predicted to have a pitch of between 4.1 and 7.3 nm.
  • three characteristic modes are observed, around 1 nm, around 4.5 to 7.5 nm, and between 14 and 16 nm. All three of these distances match the test pattern.
  • the mercury atoms are seen to be between 0.7 and 1.1 nm apart (FIG. 6A). These measurements closely match the predicted spacing of the small-scale test pattern.
  • the cluster of spacing between 4.5 and 7.5 nm matches the predicted large-scale pattern.
  • DNA base pairs have a nominal linear pitch of 0.34 nm. If the distance between labeled bases is less than about 1 nm, the number of unlabeled bases that this distance indicates is likely to be fairly certain, whereas if the distance is greater, the number of unlabeled bases in between could become impractically ambiguous. A higher labeling density might be able to overcome this problem, either using the same label for multiple base types or distinct labels to identify distinct base types.
  • using the same label for multiple base types would require parallel experiments to deduce actual sequence. For example, labeling C's and T's in one experiment, then C's and A's in the next, then combining the information to deduce the identity of the bases.
  • using different labels for different bases would avoid the issue of parallel experiments.
  • Using different labels for distinct bases allows for differentiating distinct signals, either by number of atoms in the label, or by atomic number of individual labeling atoms.
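
The spacing and parallel-experiment reasoning in the excerpts above can be sketched in a few lines of Python. This is an illustrative sketch only, not a method from the disclosure. It assumes the nominal 0.34 nm pitch with no allowance for elongation, complete labeling in both experiments, and that a position labeled in neither experiment is G by elimination. The function names `bases_between` and `deduce_bases` are hypothetical.

```python
BASE_PITCH_NM = 0.34  # nominal linear pitch per base pair, as quoted in the text


def bases_between(distance_nm, pitch_nm=BASE_PITCH_NM):
    """Estimate how many unlabeled bases lie between two labeled bases.

    Assumes an unstretched molecule; any elongation would need a correction
    factor that is not modeled here.
    """
    steps = round(distance_nm / pitch_nm)  # label-to-label distance in base steps
    return max(steps - 1, 0)


def deduce_bases(length, labeled_ct, labeled_ca):
    """Combine two hypothetical experiments into one base call per position.

    labeled_ct: positions labeled when C's and T's carry the label.
    labeled_ca: positions labeled when C's and A's carry the label.
    """
    calls = []
    for pos in range(length):
        in_ct = pos in labeled_ct
        in_ca = pos in labeled_ca
        if in_ct and in_ca:
            calls.append("C")  # labeled in both experiments
        elif in_ct:
            calls.append("T")  # labeled only in the C+T experiment
        elif in_ca:
            calls.append("A")  # labeled only in the C+A experiment
        else:
            calls.append("G")  # unlabeled in both: G by elimination (assumption)
    return "".join(calls)


if __name__ == "__main__":
    print(bases_between(0.68))                 # two labels one base apart
    print(deduce_bases(4, {0, 1}, {1, 2}))     # prints "TCAG"
```

With incomplete labeling, as in the 17-of-30 result reported above, a missing label would be misread as a different base, so a practical scheme would need redundancy across molecules or scans.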

Abstract

The present disclosure provides compositions and methods to sequence nucleic acid polymers and to improve sequencing read length using electron microscopy, e.g., high-resolution scanning transmission electron microscopy (STEM). The present disclosure further provides heavy-atom labeled compounds of Formula (I); nucleic acid polymers comprising one or more heavy-atom labeled units of Formula (II'), such as heavy-atom labeled nucleic acid polymers of Formula (II); and salts thereof, wherein each of G1, G2, G3, M1, M2, Base, and n is as defined herein, optionally for use in the methods described.

Description

HEAVY ATOM LABELED NUCLEOSIDES, NUCLEOTIDES, AND
NUCLEIC ACID POLYMERS, AND USES THEREOF
RELATED APPLICATIONS
[0001] The present application claims priority under 35 U.S.C. § 119(e) to U.S. provisional patent application, U.S. S.N. 61/727,589, filed November 16, 2012, which is incorporated herein by reference.
BACKGROUND OF THE INVENTION
[0002] Advances over the last decade have greatly improved the speed and reduced the costs of DNA sequencing. Currently, however, sequencing methods are limited to molecules less than 1,000 base pairs long, principally due to the inefficiency or incomplete nature of the fluorescent labeling reactions of "next generation" approaches (Schuster, 2008). Richard Feynman (1959) famously suggested that the incredible magnification power of electron microscopes might be harnessed to read DNA sequence; until now this challenge had not been met. The limiting issue has not been the small size of DNA, but the fact that the four different base types differ by only a few atoms, and all of the differing atoms are light elements, which are particularly hard to distinguish by electron microscopy. Standard techniques used to increase sample contrast for electron microscopy have not been able to do so in a reliably sequence-specific manner, even after 40 years of effort (Gal-Or et al., 1967; ASTA, 2010).
[0003] The ADF-STEM was the method of choice for Crewe and co-workers to originally image single heavy atoms in 1970, anticipating that the method might be used for sequencing DNA (Crewe, 1970; Crewe et al., 1970). Recent STEM improvements now allow studies of atomic-level and single atom imaging (Batson et al., 2002; Voyles et al., 2002; Jia et al., 2003). In an ADF-STEM, a very small electron beam is raster-scanned across the sample. Most of the electrons pass through the sample with only subtle changes of energy, direction, and/or phase. However, some electrons scatter at a high angle. The high angle scattering process (Rutherford scattering) scales with the atomic number (Z) of the atom (Muller et al., 2008) raised to the power of approximately 1.5. This Z^1.5 dependence allows heavy nuclei to be definitively discriminated from light nuclei. The direct identification of unlabeled DNA base pairs, with average Z ~ 5.5, has proven to be difficult, and to date unsuccessful. There is simply not enough difference between the base types to be detected without suitable contrast enhancement. Various groups have worked to overcome this problem, chiefly by chemically modifying single-stranded DNA with clusters of heavy atoms (Beer & Moudrianakis, 1962; Moudrianakis & Beer, 1965; Ottensmeyer, 1979).
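
As a rough illustration of the scaling described in paragraph [0003], the following Python sketch compares the high-angle scattering strength of the heavy atoms named in this disclosure (bromine, selenium, tellurium, iodine, mercury) with that of an average DNA atom. It assumes a pure power law with exponent 1.5 and the quoted average Z of about 5.5. Real ADF-STEM contrast also depends on detector geometry and sample thickness, which are not modeled, and the function name `relative_scattering` is hypothetical.

```python
# Illustrative only: relative high-angle scattering under an assumed Z**1.5 law.

ATOMIC_NUMBER = {"Se": 34, "Br": 35, "Te": 52, "I": 53, "Hg": 80}
DNA_AVERAGE_Z = 5.5   # average Z of unlabeled DNA base pairs, as quoted in the text
EXPONENT = 1.5        # approximate exponent quoted in the text


def relative_scattering(z, reference_z=DNA_AVERAGE_Z, exponent=EXPONENT):
    """Return scattering strength of atomic number z relative to reference_z."""
    return (z / reference_z) ** exponent


if __name__ == "__main__":
    for symbol, z in sorted(ATOMIC_NUMBER.items(), key=lambda item: item[1]):
        print(f"{symbol:>2} (Z={z}): {relative_scattering(z):.1f}x an average DNA atom")
```

Under these assumptions a single mercury atom scatters roughly 55 times more strongly than an average DNA atom, which is consistent with the choice of mercury labels in the experiments described.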
SUMMARY OF THE INVENTION
[0004] The present disclosure provides compositions and methods to sequence nucleic acid molecules including improving sequencing read length by directly visualizing DNA as long, intact molecules using electron microscopy, such as high-resolution scanning transmission electron microscopy (STEM). In some aspects, template-directed polymerase enzymes are used to incorporate heavy-atom labeled bases directly into a long DNA molecule. As shown herein, the incorporation of heavy-atom labeled bases provides annular dark-field imaging (ADF-STEM) contrast substantially greater than in unlabeled DNA. The methods disclosed also simplify the challenge of making the labeling reactions sequence-specific because polymerase reactions are intrinsically sequence specific.
[0005] The present disclosure further provides inventive heavy-atom labeled compounds optionally for use in the inventive methods as described herein. Exemplary heavy-atom labeled compounds include compounds of Formula (I):
Figure imgf000004_0001
and nucleic acid polymers comprising one or more heavy-atom labeled units of Formula (II'):
Figure imgf000004_0002
such as heavy-atom labeled nucleic acid polymers of Formula (II):
Figure imgf000004_0003
and salts thereof;
wherein:
each instance of G1 is independently -O-, -S-, -Se-, -CH2-, or -NH-;
each instance of G2 is independently hydrogen, halogen, -ORA, -SRA, -N(RA)2, -SHg, -SO2SHg, -SHgRD, -SeRD or -TeRD;
each instance of RA is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RA groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of M1 is independently -O-, -S-, -NH-, -Se-, or -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen;
each instance of G3 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)- , HO-P(=O)(OH)-O-P(=O)(M2-H)- , or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-
wherein each instance of M2 is independently -O-, -S-, or -Se-; and
each instance of Base is independently:
Figure imgf000005_0001
Adenine, Guanine, Cytosine, Uracil, Thymine, or an analog thereof selected from the group consisting of:
Figure imgf000005_0002
Figure imgf000006_0001
Figure imgf000006_0002
Figure imgf000006_0003
wherein:
each instance of R1, R2, R4, and R5 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, wherein each instance of RB is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen group, or a sulfur protecting group when attached to a sulfur group; or R1 and R2 and/or R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of R3 is independently substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD, wherein each instance of RC is hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RC groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of L1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20alkylene, substituted or unsubstituted C2-20alkenylene, substituted or unsubstituted C2-20alkynylene, substituted or unsubstituted heteroC1-20alkylene, substituted or unsubstituted heteroC2-20alkenylene, substituted or unsubstituted heteroC2-20alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
each instance of RD is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
each instance of M3 and M4 is independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2; and
n is 1 to 200,000, inclusive;
provided that the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
[0006] In some aspects, the invention provides compositions of heavy atom labeled nucleic acids, as well as systems and methods of identifying, sequencing and/or detecting nucleic acid polymers, as well as related components (e.g., substrates, software and the like).
[0007] According to one aspect of the invention, methods of determining the sequence of a nucleic acid polymer labeled with heavy atoms are provided. The methods include forming a complementary strand of the nucleic acid polymer comprising one or more heavy-atom labeled compounds as described herein and identifying a sequence of nucleotides in the nucleic acid polymer and/or in the complementary strand using a particle beam.
[0008] In certain embodiments, the nucleic acid polymer and/or the complementary strand is DNA or RNA. In other embodiments, the nucleic acid polymer and/or its complementary strand is formed by a nucleic acid polymerase enzyme, such as using polymerase chain reaction (PCR).
[0009] In preferred embodiments, the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels comprising one or more heavy-atom labeled compounds as described herein. Preferably the labels are specific for each type of nucleotide. However, in some embodiments, at least two types of nucleotides are labeled with the same type of heavy-atom label. In other embodiments, one type of nucleotide is labeled, two types of nucleotides are labeled, three types of nucleotides are labeled, or four types of nucleotides are labeled. In some embodiments, substantially all of the nucleotides of the nucleic acid polymer and/or complementary strand are modified such that all nucleotides are labeled.
[0010] Preferably nucleotide specific labels are incorporated in the nucleic acid polymer and/or the complementary strand during formation of the nucleic acid polymer and/or the complementary strand.
[0011] In further embodiments, the nucleic acid polymer and/or the complementary strand are affixed to a substrate, and prior to the step of identification the nucleotides of the nucleic acid polymer and/or its complementary strand are substantially removed from the substrate, leaving the labels of the labeled nucleotides affixed to the substrate.
[0012] In still other embodiments, the step of identifying a sequence of nucleotides includes generating a particle beam, exposing the nucleic acid polymer and/or the complementary strand to the particle beam, and identifying the nucleotides due to characteristic changes to the particle beam. Preferably the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels as described herein, and more preferably the step of identifying the nucleotides includes detecting characteristic changes to the particle beam due to the heavy atom label(s) on the nucleotides. In certain embodiments, the particle beam is a lepton beam; more preferably the lepton beam is an electron beam. In some embodiments the particle beam is an ion beam; more preferably a helium ion beam or a gallium ion beam.
[0013] In other embodiments the nucleic acid polymer and/or the complementary strand are affixed to a substrate. The nucleic acid polymer and/or the complementary strand can be affixed to a substrate at one end of the nucleic acid polymer and/or the complementary strand, at both ends of the nucleic acid polymer and/or the complementary strand, and/or at a plurality of locations along the length of the nucleic acid polymer and/or the complementary strand.
[0014] In certain embodiments, the nucleic acid polymer and/or the complementary strand are substantially straightened prior to identifying the sequence. Preferably the nucleic acid polymer and/or the complementary strand are straightened by fluid flow, and more preferably the fluid flow includes molecular combing. The fluid can include one or more liquids, gases, phases or a combination thereof. In some embodiments, the nucleic acid polymer and/or the complementary strand are attached to a substrate and straightened by hybridization in the fluid flow to oligonucleotides that are attached to the substrate.
[0015] In additional embodiments, the step of identifying the nucleotides in the nucleic acid polymer and/or its complementary strand includes interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the nucleotides in the nucleic acid polymer and/or its complementary strand, whereby the sequence of the nucleic acid polymer is determined. Preferably the nucleotides are labeled as described herein. The changes in the particle beam include changes in absorbance, reflection, deflection, energy or direction. The changes in the particle beam also can be changes in a spatial pattern, for example, a one dimensional pattern, a two dimensional pattern or a three dimensional pattern.
[0016] In further embodiments, the method also includes attaching the complementary strand and/or the nucleic acid polymer to a substrate. Preferably the attachment is by nucleic acid sequence-specific molecules, which preferably are oligonucleotides. In other preferred embodiments, the substrate is derivatized to provide attachment points that are sequence non-specific. The complementary strand and optionally the nucleic acid polymer can be attached to the substrate in a grid pattern. Preferably the substrate includes a carbon thin film.
[0017] In other embodiments, the step of identifying the sequence of nucleotides includes performing a plurality of scans of the nucleic acid polymer and/or the complementary strand using the particle beam. Preferably at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more nucleotides are identified in each scan.
[0018] According to another aspect of the invention, methods of determining the sequence of a nucleic acid polymer are provided. The methods include synthesizing the nucleic acid polymer and/or its complementary strand using labeled ribonucleotide and/or deoxyribonucleotide triphosphates as described herein, and identifying labeled ribonucleotides and/or deoxyribonucleotides in the nucleic acid polymer and/or its complementary strand using a particle beam, wherein the labeled ribonucleotides and/or deoxyribonucleotides, when incorporated in the nucleic acid polymer and/or its complementary strand, are identifiable using the particle beam.
[0019] In certain embodiments, the nucleic acid polymer and/or the complementary strand is DNA or RNA. In other embodiments, the nucleic acid polymer and/or its complementary strand is synthesized by a nucleic acid polymerase enzyme, such as using polymerase chain reaction (PCR).
[0020] In preferred embodiments, the labels are specific for each type of nucleotide. However, in some embodiments, at least two types of nucleotides are labeled with the same type of heavy-atom label. In other embodiments, one type of nucleotide is labeled, two types of nucleotides are labeled, three types of nucleotides are labeled, or four types of nucleotides are labeled. In some embodiments, substantially all of the nucleotides of the nucleic acid polymer and/or complementary strand are modified such that all nucleotides are labeled.
[0021] Preferably, the labels are incorporated in the ribonucleotide and/or deoxyribonucleotide triphosphates used in synthesis of the nucleic acid polymer and/or the complementary strand.
[0022] In further embodiments, the step of identifying the labeled ribonucleotides and/or deoxyribonucleotides includes generating a particle beam, exposing the nucleic acid polymer and the complementary strand to the particle beam, and identifying the ribonucleotides and/or deoxyribonucleotides due to characteristic changes to the particle beam due to the heavy atom label(s) on the nucleotides. Preferably the step of detecting the ribonucleotides and/or deoxyribonucleotides includes detecting characteristic changes to the particle beam. In certain embodiments, the particle beam is a lepton beam; more preferably the lepton beam is an electron beam. In some embodiments the particle beam is an ion beam; more preferably a helium ion beam or a gallium ion beam.
[0023] In other embodiments the nucleic acid polymer and/or the complementary strand are affixed to a substrate. In certain embodiments, prior to the step of identification the ribonucleotides and/or deoxyribonucleotides of the nucleic acid polymer and/or its complementary strand are substantially removed from the substrate, leaving the labels of the labeled ribonucleotides and/or deoxyribonucleotides affixed to the substrate. The nucleic acid polymer and/or the complementary strand can be affixed to a substrate at one end of the nucleic acid polymer and/or the complementary strand, at both ends of the nucleic acid polymer and/or the complementary strand, and/or at a plurality of locations along the length of the nucleic acid polymer and/or the complementary strand.
[0024] In certain embodiments, the nucleic acid polymer and/or the complementary strand are substantially straightened prior to identifying the labeled ribonucleotides and/or deoxyribonucleotides. Preferably the nucleic acid polymer and/or the complementary strand are straightened by fluid flow, and more preferably the fluid flow includes molecular combing. The fluid can include one or more liquids, gases, phases or a combination thereof. In some embodiments, the nucleic acid polymer and/or the complementary strand are attached to a substrate and straightened by hybridization in the fluid flow to oligonucleotides that are attached to the substrate.
[0025] In additional embodiments, the step of identifying the nucleotides in the nucleic acid polymer and/or its complementary strand includes interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the ribonucleotides and/or deoxyribonucleotides in the nucleic acid polymer and/or its complementary strand, whereby the sequence of the nucleic acid polymer is determined. Preferably the nucleotides are labeled as described herein. The changes in the particle beam include changes in absorbance, reflection, deflection, energy or direction. The changes in the particle beam also can be changes in a spatial pattern, for example, a one dimensional pattern, a two dimensional pattern or a three dimensional pattern.
[0026] In further embodiments, the method also includes attaching the complementary strand and/or the nucleic acid polymer to a substrate. Preferably the attachment is by nucleic acid sequence-specific molecules, which preferably are oligonucleotides. In other preferred embodiments, the substrate is derivatized to provide attachment points that are sequence non-specific. The complementary strand and optionally the nucleic acid polymer can be attached to the substrate in a grid pattern. Preferably the substrate includes a carbon thin film.
[0027] In other embodiments, the step of identifying the sequence of nucleotides includes performing a plurality of scans of the nucleic acid polymer and/or the complementary strand using the particle beam. Preferably at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more nucleotides are identified in each scan.
[0028] According to another aspect of the invention, methods of determining the sequence of a nucleic acid polymer are provided. The methods include synthesizing a complementary strand of the nucleic acid polymer using labeled ribonucleotide triphosphates or deoxyribonucleotide triphosphates as described herein, attaching the nucleic acid polymer and/or the complementary strand to a substrate, substantially straightening the nucleic acid polymer and/or the complementary strand using molecular combing, generating a particle beam, exposing the nucleic acid polymer and the complementary strand to the particle beam through the complementary strand on the substrate, and interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the labeled nucleotides in the complementary strand, whereby the sequence of a nucleic acid polymer is determined.
[0029] According to another aspect of the invention, methods of detecting the presence and/or identifying a nucleic acid polymer are provided. The methods include forming a complementary strand of the nucleic acid polymer, attaching the complementary strand and, optionally, the nucleic acid polymer to a substrate, and detecting the presence and/or identifying the complementary strand and/or the nucleic acid polymer using a particle beam.
[0030] In some embodiments, the step of identifying includes measuring the length or determining at least a partial sequence of the complementary strand and/or the nucleic acid polymer.
[0031] In certain embodiments, the nucleic acid polymer and/or its complementary strand is DNA or RNA. In other embodiments, the nucleic acid polymer and/or its complementary strand is formed by a nucleic acid polymerase enzyme, e.g., using polymerase chain reaction (PCR); preferably the nucleic acid polymerase enzyme is a DNA-dependent DNA polymerase, a RNA-dependent DNA polymerase or a RNA-dependent RNA polymerase.
[0032] In other embodiments, the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels as described herein. In preferred embodiments, the labels are specific for each type of nucleotide. However, in some embodiments, at least two types of nucleotides are labeled with the same type of heavy-atom label. In other embodiments, one type of nucleotide is labeled, two types of nucleotides are labeled, three types of nucleotides are labeled, or four types of nucleotides are labeled. In some embodiments, substantially all of the nucleotides of the nucleic acid polymer and/or complementary strand are modified such that all nucleotides are labeled.
[0033] Preferably nucleotide specific labels are incorporated in the nucleic acid polymer and/or the complementary strand during formation of the nucleic acid polymer and/or the complementary strand.
[0034] In further embodiments, the step of detecting the presence and/or identifying of the complementary strand and/or the nucleic acid polymer using a particle beam includes generating a particle beam, exposing the nucleic acid polymer and/or the complementary strand to the particle beam, and detecting the nucleotides of the complementary strand and/or the nucleic acid polymer due to characteristic changes to the particle beam.
[0035] In some embodiments, the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels as described herein. Preferably the step of detecting the ribonucleotides and/or deoxyribonucleotides includes detecting characteristic changes to the particle beam due to the heavy atom label(s) on the nucleotides. In certain embodiments, the particle beam is a lepton beam; more preferably the lepton beam is an electron beam. In some embodiments the particle beam is an ion beam; more preferably a helium ion beam or a gallium ion beam.
[0036] In certain embodiments, the nucleic acid polymer and/or the complementary strand are substantially straightened prior to identifying the sequence. Preferably the nucleic acid polymer and/or the complementary strand are straightened by fluid flow, and more preferably the fluid flow includes molecular combing. The fluid can include one or more liquids, gases, phases or a combination thereof. In some embodiments, the nucleic acid polymer and/or the complementary strand are attached to a substrate and straightened by hybridization in the fluid flow to oligonucleotides that are attached to the substrate.
[0037] In additional embodiments, the step of identifying the nucleotides in the nucleic acid polymer and/or its complementary strand includes interpreting changes in the particle beam resulting from interactions with the nucleotides to detect the nucleotides in the nucleic acid polymer and/or its complementary strand, whereby the presence of the nucleic acid polymer is determined and/or the nucleic acid polymer is identified. Preferably the nucleotides are labeled as described herein and the characteristic changes to the particle beam are due to the heavy atom label(s) on the nucleotides. The changes in the particle beam include changes in absorbance, reflection, deflection, energy or direction. The changes in the particle beam also can be changes in a spatial pattern, for example, a one dimensional pattern, a two dimensional pattern or a three dimensional pattern.
[0038] In further embodiments, the method also includes attaching the complementary strand and/or the nucleic acid polymer to a substrate. Preferably the attachment is by nucleic acid sequence-specific molecules, which preferably are oligonucleotides. In other preferred embodiments, the substrate is derivatized to provide attachment points that are sequence non-specific. The complementary strand and optionally the nucleic acid polymer can be attached to the substrate in a grid pattern. Preferably the substrate includes a carbon thin film.
[0039] In other embodiments, the method also includes quantifying the amount of the complementary strand and/or the nucleic acid polymer.
[0040] According to another aspect of the invention, a device is provided that includes a substrate that is substantially transparent to a particle beam, and nucleic acid polymer binding sites on a surface of the substrate.
[0041] In some embodiments the substrate is substantially transparent to an electron beam. Preferably the substrate includes a carbon thin film.
[0042] In other embodiments, the device also includes a support that is substantially transparent to a particle beam.
[0043] Preferably the substrate is less than 5 nm thick, more preferably less than 2 nm thick, still more preferably less than 1.5 nm thick, and yet more preferably less than 1.1 nm thick.
[0044] In other embodiments, the nucleic acid polymer binding sites are formed at predetermined positions on the surface of the substrate, preferably in a grid pattern. In certain embodiments, the nucleic acid polymer binding sites are sequence specific, preferably oligonucleotides. In other embodiments, the nucleic acid polymer binding sites are not sequence specific.
[0045] In further embodiments, the device also includes one or more nucleic acid polymers affixed to the nucleic acid polymer binding sites. Preferably the one or more nucleic acid polymers are modified to include labels.
[0046] According to another aspect of the invention, methods for making a device are provided. The methods include obtaining a substrate that is substantially transparent to a particle beam, and forming nucleic acid polymer binding sites on a surface of the substrate.
[0047] In some embodiments the substrate is substantially transparent to an electron beam. Preferably the substrate includes a carbon thin film. In some embodiments, the nucleic acid polymer binding sites are formed at predetermined positions on the surface of the substrate, preferably in a grid pattern.
[0048] In other embodiments, the method also includes attaching to the substrate a support that is substantially transparent to a particle beam.
[0049] Preferably the substrate is less than 5 nm thick, more preferably less than 2 nm thick, still more preferably less than 1.5 nm thick, and yet more preferably less than 1.1 nm thick.
[0050] In certain embodiments, the nucleic acid polymer binding sites are sequence specific, preferably oligonucleotides. In other embodiments, the nucleic acid polymer binding sites are not sequence specific.
[0051] In still other embodiments, the methods also include affixing one or more nucleic acid polymers to the nucleic acid polymer binding sites. Preferably, the one or more nucleic acid polymers are modified to include labels.
[0052] According to another aspect of the invention, systems designed to detect the presence of, determine the sequence of and/or identify a nucleic acid polymer are provided. The systems include: a sample chamber; a particle beam generator associated with the chamber; a sample comprising a labeled complementary strand of a nucleic acid polymer, wherein the sample, when positioned in the chamber, is exposed to a particle beam generated by the particle beam generator resulting in an interaction between the particle beam and the complementary strand; and a detector constructed and arranged to collect particle beam species after the interaction.
[0053] In some embodiments, the system also includes a data analysis module operative to receive and analyze signals from the detector. Preferably the data analysis module is operative to analyze signals related to absorbance, reflection, deflection, energy or direction. In other embodiments, the data analysis module is operative to use pattern recognition techniques to analyze the signals.
[0054] In further embodiments, the system also includes a user interface operative to control a display of information received and/or generated by the data analysis module.
[0055] In preferred embodiments, the particle beam generator is an electron beam generator.
[0056] The system in other embodiments also includes a feedback module designed to calibrate the system based on nucleic acid polymer data.
[0057] According to another aspect of the invention, systems designed to detect the presence of, determine the sequence of and/or identify a nucleic acid polymer are provided. The systems include: a sample chamber; a particle beam generator associated with the chamber; a detector constructed and arranged to collect particle beam species after interaction between the particle beam and a sample comprising the nucleic acid polymer and/or a complementary strand of the nucleic acid polymer; a data analysis module designed to analyze signals related to the particle beam species to determine information related to the nucleic acid polymer; and a feedback module designed to calibrate the system based on the information.
[0058] In some embodiments, the sample includes a labeled complementary strand of a nucleic acid polymer.
[0059] In certain embodiments, the feedback module is designed to calibrate the system based on a base-base distance of the nucleic acid polymer. In other embodiments, the feedback module is designed to calibrate the system based on known geometries of the nucleic acid polymer.
[0060] Also provided in accordance with another aspect of the invention are methods for calibrating a particle beam instrument. The methods include acquiring data related to a nucleic acid polymer; and calibrating the instrument based on the data. Preferably the data is related to a base-base distance of the nucleic acid polymer. In some embodiments, the calibrating includes calibrating the instrument based on known geometries of the nucleic acid polymer.
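As a minimal sketch of calibrating against a base-base distance, the snippet below fits a single nm-per-pixel scale factor from measured label spacings. The text does not specify a calibration procedure; the function name `nm_per_pixel`, the least-squares fit through the origin, and the 0.34 nm rise per base pair (the commonly cited B-DNA value, taken here for an unstretched molecule) are all assumptions made for illustration.

```python
def nm_per_pixel(pixel_spacings, bp_between, rise_nm=0.34):
    """Estimate a scale factor (nm per pixel) from label spacings.

    pixel_spacings: measured distances between neighboring labels, in pixels.
    bp_between:     known number of base pairs separating each pair of labels.
    rise_nm:        assumed rise per base pair (hypothetical default, 0.34 nm).
    """
    # Known physical spacing for each measured pair, from the assumed geometry.
    known_nm = [n * rise_nm for n in bp_between]
    # Least-squares fit of known_nm = scale * pixel_spacings (line through origin).
    numerator = sum(p * k for p, k in zip(pixel_spacings, known_nm))
    denominator = sum(p * p for p in pixel_spacings)
    return numerator / denominator


scale = nm_per_pixel([10.0, 20.0], [5, 10])
print(f"{scale:.3f} nm per pixel")
```

A single global scale factor is the simplest choice; local stretching of the molecule would call for a position-dependent correction, which this sketch does not attempt.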
[0061] According to another aspect of the invention, systems are provided for detecting, sequencing and/or identifying a nucleic acid polymer based on particle beam species detected by a detector, the particle beam species resulting from exposure of a sample comprising a nucleic acid polymer and/or its complementary strand to a particle beam. The systems include a data analysis module operative to receive one or more signals from the detector, the one or more signals representing the particle beam species, and to detect, sequence and/or identify the nucleic acid polymer and/or its complementary strand comprised in the sample based at least in part on the received one or more signals. Preferably the nucleic acid polymer and/or its complementary strand is labeled.
[0062] In some embodiments, the particle beam species has one or more of the following properties: absorbance, reflection, deflection, energy and direction, and the data analysis module is operative to analyze the one or more signals to determine values of the one or more properties.
[0063] In other embodiments, the data analysis module is operative to access a data resource comprising nucleic acid polymer information, the data resource including a data structure having a plurality of entries, each entry specifying information about a respective nucleic acid polymer sequence. Preferably the data analysis module is operative to partially sequence the nucleic acid polymer based on the one or more signals, the data analysis module further comprising: a combining module to combine the partial sequence with sequencing information of the nucleic acid polymer accessed from the data resource. In preferred embodiments the data analysis module includes a comparison module operative to compare information determined from the one or more signals to the information specified by one or more of the data structure entries. Preferably the comparison module is operative to use pattern recognition techniques to compare the information determined from the one or more signals to the information specified by the one or more of the data structure entries.
[0064] In other embodiments the data analysis module includes a user interface module to display information received and/or generated by the data analysis module to a user.
[0065] In further embodiments the particle beam to which the sample is exposed is generated by a particle beam generator, and the data analysis module includes a feedback module operative to provide one or more feedback signals to the particle beam generator and/or the detector, the one or more feedback signals specifying information determined at least in part from the one or more signals received from the detector. Preferably the one or more feedback signals include information for calibrating the particle beam generator. In preferred embodiments the feedback module is operative to generate the one or more feedback signals based at least in part on known geometries of the nucleic acid polymer. The data analysis module preferably includes a storage module operative to store information received and/or generated by the data analysis module on a computer-readable medium.
[0066] In some embodiments the sample includes a plurality of molecules of a same nucleic acid polymer and/or its complementary strand, and a plurality of particle beam species results from exposure of the plurality of molecules of the sample to the particle beam, the one or more signals representing the plurality of particle beam species, wherein the data analysis module is operative to partially sequence the nucleic acid polymer based on a first of the plurality of molecules to produce a first partial sequence, and to partially sequence the nucleic acid polymer based on a second of the plurality of molecules to produce a second partial sequence, and wherein the data processing module further includes a combining module to combine the first and second partial sequences.
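One way a combining module might merge a first and second partial sequence is by their longest suffix-prefix overlap, sketched below. The text does not say how the partial sequences are combined, so the overlap strategy, the function name `combine_partials`, and the `min_overlap` parameter are hypothetical.

```python
from typing import Optional


def combine_partials(first: str, second: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two partial sequences where the end of `first` overlaps the start of `second`.

    Returns the merged sequence, or None if no overlap of at least
    `min_overlap` bases is found.
    """
    longest_possible = min(len(first), len(second))
    # Try the longest candidate overlap first, then shrink.
    for k in range(longest_possible, min_overlap - 1, -1):
        if first.endswith(second[:k]):
            return first + second[k:]
    return None


merged = combine_partials("ACGTTGCA", "TGCAGGT")
print(merged)
```

Exact string matching is used only to keep the sketch short; reads from real detector signals would likely need a tolerance for miscalled bases.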
[0067] According to another aspect of the invention, a computer-readable medium is provided having computer-readable signals stored thereon that define instructions that, as a result of being executed by a computer, control the computer to perform a process of detecting, sequencing and/or identifying a nucleic acid polymer based on particle beam species detected by a detector, the particle beam species resulting from exposure of a sample comprising a nucleic acid polymer and/or its complementary strand to a particle beam. The process includes: receiving one or more signals from the detector, the one or more signals representing the particle beam species; and detecting, sequencing and/or identifying the nucleic acid polymer and/or its complementary strand comprised in the sample based at least in part on the received one or more signals. Preferably the nucleic acid polymer and/or its complementary strand is labeled.
[0068] In some embodiments, the particle beam species has one or more of the following properties: absorbance, reflection, deflection, energy and direction, and the act of detecting, sequencing and/or identifying includes analyzing the one or more signals to determine values of the one or more properties.
[0069] In other embodiments, the act of detecting, sequencing and/or identifying includes accessing a data resource comprising nucleic acid polymer information, the data resource including a data structure having a plurality of entries, each entry specifying information about a respective nucleic acid polymer sequence. Preferably the act of detecting, sequencing and/or identifying includes partially sequencing the nucleic acid polymer based on the one or more signals to produce a partial sequence; accessing partial sequence information of the nucleic acid polymer from the data resource; and combining the partial sequence with the partial sequence information. In preferred embodiments the act of detecting, sequencing and/or identifying includes comparing information determined from the one or more signals to the information specified by one or more of the entries. In some of these embodiments, the act of detecting, sequencing and/or identifying preferably includes using pattern recognition techniques to compare the information determined from the one or more signals to the information specified by the one or more entries.
[0070] In further embodiments, the process further includes displaying information determined from the one or more received signals to a user.
[0071] In other embodiments the particle beam to which the sample is exposed is generated by a particle beam generator, and the process further includes providing one or more feedback signals to the particle beam generator and/or the detector, the one or more feedback signals specifying information determined at least in part from the one or more signals received from the detector. Preferably the act of providing includes providing one or more feedback signals that include information for calibrating the particle beam generator. In some embodiments the process further includes generating the one or more feedback signals based at least in part on known geometries of the nucleic acid polymer.
[0072] In other embodiments the process further includes storing information determined from the one or more signals on a computer-readable medium.
[0073] In further embodiments the sample includes a plurality of molecules of a same nucleic acid polymer and/or its complementary strand, and a plurality of particle beam species results from exposure of the plurality of molecules of the sample to the particle beam, the one or more signals representing the plurality of particle beam species, and the act of detecting, sequencing and/or identifying includes partially sequencing the nucleic acid polymer based on a first of the plurality of molecules to produce a first partial sequence; partially sequencing the nucleic acid polymer based on a second of the plurality of molecules to produce a second partial sequence; and combining the first and second partial sequences.
[0074] The details of one or more embodiments of the invention are set forth in the accompanying Figures and the Detailed Description. Other features, objects, and advantages of the invention will be apparent from the description and from the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0075] Figure 1 depicts a non-limiting example of heavy-atom labels detected within DNA molecules. (A): Schematic showing heavy atoms deflecting a portion of the raster scanned electron beam. Highly deflected electrons are detected on the ADF detector. (B): Unlabeled DNA bases scatter fewer electrons than the heavy-atom-labeled bases, distinguished by detector current.
[0076] Figure 2 depicts a non-limiting example of heavy-atom-labeling strategy. A single stranded template is primed with a complementary oligonucleotide primer. For simplicity, the lengths of the primer and the template have been shortened. In the presence of polymerase, the template directs the synthesis of a complementary strand. Thymine deoxyribose nucleotide triphosphates in the primer extension reaction have been completely replaced with a heavy-atom-modified analog. Consequently, the resulting double-stranded DNA molecule is modified with heavy atoms on the thymine bases of the synthetic strand. These heavy atoms provide signal to the dark-field detector of a STEM system.
[0077] Figure 3 depicts DNA alignment of a prepared and linearized DNA molecule on a thin amorphous carbon substrate. (A) Bright-field TEM image of multiple DNA molecules linearized on amorphous carbon surface. (B) Darkfield STEM image of linearized DNA molecule on thin amorphous carbon substrate.
[0078] Figure 4 depicts heavy-atom locations and contrast distribution. (A): ADF detector readings from labeled M13 viral DNA (NEB product #N4040S) showing thymine discrimination with red-colored band, 1.66 to 5 standard deviations (95% to 99.99+% confidence) from unlabeled DNA regions. (B): Corresponding residual current histogram of labeled DNA molecule scan. Data acquired in Carl Zeiss Libra 200-80kV aberration-corrected EF-STEM with Cs = -1.2 μm, 80 kV with elastic scattering using in-column energy-filter retaining only zero energy-loss electrons. DNA labeled with 5-Me-Hg-S-dUTP replacing dTTP in primer extension labeling reaction.
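The discrimination band in panel (A) can be illustrated with a small thresholding sketch: readings are converted to standard deviations above the unlabeled-DNA background, and those at or above 1.66 are flagged. This is not the authors' analysis pipeline; the function name `flag_labeled`, the use of the sample standard deviation, and the clipping of the score at 5 (the top of the band) are assumptions, and the readings are made-up numbers.

```python
from statistics import mean, stdev


def flag_labeled(readings, background, low_sd=1.66, high_sd=5.0):
    """Return (index, score) for readings at least `low_sd` above background.

    readings:   detector values along the molecule (arbitrary units).
    background: detector values from unlabeled DNA regions.
    The score is the number of standard deviations above the background
    mean, clipped at `high_sd` so it maps onto a fixed color band.
    """
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation of the background
    hits = []
    for i, value in enumerate(readings):
        z = (value - mu) / sigma
        if z >= low_sd:
            hits.append((i, min(z, high_sd)))
    return hits


background = [10.0, 11.0, 9.0, 10.0, 10.0]   # illustrative values only
readings = [10.0, 12.0, 10.5, 20.0]          # illustrative values only
hits = flag_labeled(readings, background)
print(hits)
```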
[0079] Figure 5 depicts a schematic of repeating "test pattern" molecule. Heavy atoms are attached to thymine/uridine bases of one strand of double-stranded DNA molecules. The labels nearest one another are separated by one unlabeled base pair; the theoretical pitch between the heavy atoms is 0.7 to 1.2 nm. These doublets repeat every 12 base pairs, for a theoretical pitch of 4.1 to 7.3 nm. Actual spacing of both patterns depends on local stretching, predicted to be 0% to 80%.
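The pitch ranges quoted for Figure 5 can be reproduced with simple arithmetic, sketched below. The rise of 0.34 nm per base pair is the commonly cited B-DNA value and is an assumption here (the text does not state it), as are the names `RISE_NM` and `pitch_range_nm`. Labels separated by one unlabeled base pair sit two base-pair steps apart.

```python
RISE_NM = 0.34  # assumed rise per base pair for relaxed B-DNA (not stated in the text)


def pitch_range_nm(bp_steps, min_stretch=0.0, max_stretch=0.8):
    """Return (min, max) spacing in nm for labels `bp_steps` base pairs apart.

    Stretch is a fractional elongation: 0.0 is relaxed, 0.8 is 80% stretched.
    """
    relaxed = bp_steps * RISE_NM
    return relaxed * (1 + min_stretch), relaxed * (1 + max_stretch)


doublet = pitch_range_nm(2)    # labels separated by one unlabeled base pair
repeat = pitch_range_nm(12)    # doublets repeating every 12 base pairs

print([round(x, 1) for x in doublet])
print([round(x, 1) for x in repeat])
```

Under these assumptions the rounded values agree with the 0.7 to 1.2 nm and 4.1 to 7.3 nm ranges given in the figure description.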
[0080] Figure 6 depicts sequence data from repeating "test pattern" molecule. (A): Partial sequence of DNA molecule. Yellow lines (starred, *) show heavy atoms in predicted large-scale test pattern positions, where distances to neighbors in both directions match the large-scale test pattern. White circles show pairs of atoms matching small-scale pattern. Red lines (indicated by arrows) show atoms of the large-scale pattern in positions predicted by spacing with one rather than two neighbors. (B): Schematic of repeating test pattern shows both small-scale and large-scale patterns. Not to scale.
DEFINITIONS
[0081] Definitions of specific functional groups and chemical terms are described in more detail below. The chemical elements are identified in accordance with the Periodic Table of the Elements, CAS version, Handbook of Chemistry and Physics, 75th Ed., inside cover, and specific functional groups are generally defined as described therein. Additionally, general principles of organic chemistry, as well as specific functional moieties and reactivity, are described in Organic Chemistry, Thomas Sorrell, University Science Books, Sausalito, 1999; Smith and March, March's Advanced Organic Chemistry, 5th Edition, John Wiley & Sons, Inc., New York, 2001; Larock, Comprehensive Organic Transformations, VCH Publishers, Inc., New York, 1989; and Carruthers, Some Modern Methods of Organic Synthesis, 3rd Edition, Cambridge University Press, Cambridge, 1987.
[0082] Compounds described herein can comprise one or more asymmetric centers, and thus can exist in various stereoisomeric forms, e.g., enantiomers and/or diastereomers. For example, the compounds described herein can be in the form of an individual enantiomer, diastereomer or geometric isomer, or can be in the form of a mixture of stereoisomers, including racemic mixtures and mixtures enriched in one or more stereoisomer. Isomers can be isolated from mixtures by methods known to those skilled in the art, including chiral high pressure liquid chromatography (HPLC) and the formation and crystallization of chiral salts; or preferred isomers can be prepared by asymmetric syntheses. See, for example, Jacques et al., Enantiomers, Racemates and Resolutions (Wiley Interscience, New York, 1981); Wilen et al., Tetrahedron 33:2725 (1977); Eliel, E.L. Stereochemistry of Carbon Compounds (McGraw-Hill, NY, 1962); and Wilen, S.H. Tables of Resolving Agents and Optical Resolutions p. 268 (E.L. Eliel, Ed., Univ. of Notre Dame Press, Notre Dame, IN 1972). The invention additionally encompasses compounds as individual isomers substantially free of other isomers, and alternatively, as mixtures of various isomers.
[0083] When a range of values is listed, it is intended to encompass each value and subrange within the range. For example, "C1-6 alkyl" is intended to encompass C1, C2, C3, C4, C5, C6, C1-6, C1-5, C1-4, C1-3, C1-2, C2-6, C2-5, C2-4, C2-3, C3-6, C3-5, C3-4, C4-6, C4-5, and C5-6 alkyl.
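The "each value and subrange" convention can be made concrete with a short enumeration. The sketch below is purely illustrative; the function name `subranges` is hypothetical and the ordering of the output differs from the order in which the text lists the ranges.

```python
def subranges(lo, hi):
    """List every single carbon count and every subrange within C{lo}-{hi}."""
    singles = [f"C{n}" for n in range(lo, hi + 1)]
    ranges = [
        f"C{a}-{b}"
        for a in range(lo, hi + 1)
        for b in range(a + 1, hi + 1)
    ]
    return singles + ranges


labels = subranges(1, 6)
print(len(labels))
print(labels)
```

For the range 1 to 6 this yields 6 single values and 15 subranges, 21 entries in total, matching the count of items enumerated for the example in the text.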
[0084] As used herein, "alkyl" refers to a radical of a straight-chain or branched saturated hydrocarbon group having from 1 to 20 carbon atoms ("C1-20 alkyl"). In some embodiments, an alkyl group has 1 to 10 carbon atoms ("C1-10 alkyl"). In some embodiments, an alkyl group has 1 to 9 carbon atoms ("C1-9 alkyl"). In some embodiments, an alkyl group has 1 to 8 carbon atoms ("C1-8 alkyl"). In some embodiments, an alkyl group has 1 to 7 carbon atoms ("C1-7 alkyl"). In some embodiments, an alkyl group has 1 to 6 carbon atoms ("C1-6 alkyl"). In some embodiments, an alkyl group has 1 to 5 carbon atoms ("C1-5 alkyl"). In some embodiments, an alkyl group has 1 to 4 carbon atoms ("C1-4 alkyl"). In some embodiments, an alkyl group has 1 to 3 carbon atoms ("C1-3 alkyl"). In some embodiments, an alkyl group has 1 to 2 carbon atoms ("C1-2 alkyl"). In some embodiments, an alkyl group has 1 carbon atom ("C1 alkyl"). In some embodiments, an alkyl group has 2 to 6 carbon atoms ("C2-6 alkyl"). Examples of C1-6 alkyl groups include methyl (C1), ethyl (C2), n-propyl (C3), isopropyl (C3), n-butyl (C4), tert-butyl (C4), sec-butyl (C4), iso-butyl (C4), n-pentyl (C5), 3-pentanyl (C5), amyl (C5), neopentyl (C5), 3-methyl-2-butanyl (C5), tertiary amyl (C5), and n-hexyl (C6). Additional examples of alkyl groups include n-heptyl (C7), n-octyl (C8) and the like. Unless otherwise specified, each instance of an alkyl group is independently unsubstituted (an "unsubstituted alkyl") or substituted (a "substituted alkyl") with one or more substituents. In certain embodiments, the alkyl group is an unsubstituted C1-20 alkyl (e.g., -CH3). In certain embodiments, the alkyl group is a substituted C1-20 alkyl.
[0085] As used herein, "haloalkyl" is an alkyl group as defined herein wherein one or more of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo. "Perhaloalkyl" is a subset of haloalkyl, and refers to an alkyl group wherein all of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo. In some embodiments, the haloalkyl moiety has 1 to 20 carbon atoms ("C1-20 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 10 carbon atoms ("C1-10 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 8 carbon atoms ("C1-8 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 6 carbon atoms ("C1-6 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 4 carbon atoms ("C1-4 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 3 carbon atoms ("C1-3 haloalkyl"). In some embodiments, the haloalkyl moiety has 1 to 2 carbon atoms ("C1-2 haloalkyl"). In some embodiments, all of the haloalkyl hydrogen atoms are replaced with fluoro to provide a "perfluoroalkyl" group. In some embodiments, all of the haloalkyl hydrogen atoms are replaced with chloro to provide a "perchloroalkyl" group. Examples of haloalkyl groups include -CF3, -CF2CF3, -CF2CF2CF3, -CCl3, -CFCl2, -CF2Cl, and the like.
[0086] Haloalkenyl, haloalkynyl, halocarbocyclyl, haloheterocyclyl, haloaryl, and haloheteroaryl follow the definition of haloalkyl, and refer to an alkenyl, alkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl group, as defined herein, wherein one or more of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo. Likewise, perhaloalkenyl, perhaloalkynyl, perhalocarbocyclyl, perhaloheterocyclyl, perhaloaryl, and perhaloheteroaryl follow the definition of perhaloalkyl, and refer to an alkenyl, alkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl group, as defined herein, wherein all of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo.
[0087] As used herein, "heteroalkyl" refers to an alkyl group as defined herein which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain. In certain embodiments, a heteroalkyl group refers to a saturated group having from 1 to 10 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-10 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 9 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-9 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 8 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-8 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 7 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-7 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 6 carbon atoms and 1 or more heteroatoms within the parent chain ("heteroC1-6 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 5 carbon atoms and 1 or 2 heteroatoms within the parent chain ("heteroC1-5 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 4 carbon atoms and 1 or 2 heteroatoms within the parent chain ("heteroC1-4 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 3 carbon atoms and 1 heteroatom within the parent chain ("heteroC1-3 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 to 2 carbon atoms and 1 heteroatom within the parent chain ("heteroC1-2 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 1 carbon atom and 1 heteroatom ("heteroC1 alkyl"). In some embodiments, a heteroalkyl group is a saturated group having 2 to 6 carbon atoms and 1 or 2 heteroatoms within the parent chain ("heteroC2-6 alkyl"). Unless otherwise specified, each instance of a heteroalkyl group is independently unsubstituted (an "unsubstituted heteroalkyl") or substituted (a "substituted heteroalkyl") with one or more substituents. In certain embodiments, the heteroalkyl group is an unsubstituted heteroC1-10 alkyl. In certain embodiments, the heteroalkyl group is a substituted heteroC1-10 alkyl.
[0088] As used herein, "alkenyl" refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms and one or more carbon-carbon double bonds (e.g., 1, 2, 3, or 4 double bonds). In some embodiments, an alkenyl group has 2 to 10 carbon atoms ("C2-10 alkenyl"). In some embodiments, an alkenyl group has 2 to 9 carbon atoms ("C2-9 alkenyl"). In some embodiments, an alkenyl group has 2 to 8 carbon atoms ("C2-8 alkenyl"). In some embodiments, an alkenyl group has 2 to 7 carbon atoms ("C2-7 alkenyl"). In some embodiments, an alkenyl group has 2 to 6 carbon atoms ("C2-6 alkenyl"). In some embodiments, an alkenyl group has 2 to 5 carbon atoms ("C2-5 alkenyl"). In some embodiments, an alkenyl group has 2 to 4 carbon atoms ("C2-4 alkenyl"). In some embodiments, an alkenyl group has 2 to 3 carbon atoms ("C2-3 alkenyl"). In some embodiments, an alkenyl group has 2 carbon atoms ("C2 alkenyl"). The one or more carbon-carbon double bonds can be internal (such as in 2-butenyl) or terminal (such as in 1-butenyl). Examples of C2-4 alkenyl groups include ethenyl (C2), 1-propenyl (C3), 2-propenyl (C3), 1-butenyl (C4), 2-butenyl (C4), butadienyl (C4), and the like. Examples of C2-6 alkenyl groups include the aforementioned C2-4 alkenyl groups as well as pentenyl (C5), pentadienyl (C5), hexenyl (C6), and the like. Additional examples of alkenyl include heptenyl (C7), octenyl (C8), octatrienyl (C8), and the like. Unless otherwise specified, each instance of an alkenyl group is independently unsubstituted (an "unsubstituted alkenyl") or substituted (a "substituted alkenyl") with one or more substituents. In certain embodiments, the alkenyl group is an unsubstituted C2-20 alkenyl. In certain embodiments, the alkenyl group is a substituted C2-20 alkenyl.
[0089] As used herein, "heteroalkenyl" refers to an alkenyl group as defined herein which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain. In certain embodiments, a heteroalkenyl group refers to a group having from 2 to 20 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-20 alkenyl"). In certain embodiments, a heteroalkenyl group refers to a group having from 2 to 10 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-10 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 9 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-9 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 8 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-8 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 7 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-7 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 6 carbon atoms, at least one double bond, and 1 or more heteroatoms within the parent chain ("heteroC2-6 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 5 carbon atoms, at least one double bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-5 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 4 carbon atoms, at least one double bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-4 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 3 carbon atoms, at least one double bond, and 1 heteroatom within the parent chain ("heteroC2-3 alkenyl"). In some embodiments, a heteroalkenyl group has 2 to 6 carbon atoms, at least one double bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-6 alkenyl"). Unless otherwise specified, each instance of a heteroalkenyl group is independently unsubstituted (an "unsubstituted heteroalkenyl") or substituted (a "substituted heteroalkenyl") with one or more substituents. In certain embodiments, the heteroalkenyl group is an unsubstituted heteroC2-20 alkenyl. In certain embodiments, the heteroalkenyl group is a substituted heteroC2-20 alkenyl.
[0090] As used herein, "alkynyl" refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms and one or more carbon-carbon triple bonds (e.g., 1, 2, 3, or 4 triple bonds) ("C2-20 alkynyl"). In some embodiments, an alkynyl group has 2 to 10 carbon atoms ("C2-10 alkynyl"). In some embodiments, an alkynyl group has 2 to 9 carbon atoms ("C2-9 alkynyl"). In some embodiments, an alkynyl group has 2 to 8 carbon atoms ("C2-8 alkynyl"). In some embodiments, an alkynyl group has 2 to 7 carbon atoms ("C2-7 alkynyl"). In some embodiments, an alkynyl group has 2 to 6 carbon atoms ("C2-6 alkynyl"). In some embodiments, an alkynyl group has 2 to 5 carbon atoms ("C2-5 alkynyl"). In some embodiments, an alkynyl group has 2 to 4 carbon atoms ("C2-4 alkynyl"). In some embodiments, an alkynyl group has 2 to 3 carbon atoms ("C2-3 alkynyl"). In some embodiments, an alkynyl group has 2 carbon atoms ("C2 alkynyl"). The one or more carbon-carbon triple bonds can be internal (such as in 2-butynyl) or terminal (such as in 1-butynyl). Examples of C2-4 alkynyl groups include, without limitation, ethynyl (C2), 1-propynyl (C3), 2-propynyl (C3), 1-butynyl (C4), 2-butynyl (C4), and the like. Examples of C2-6 alkynyl groups include the aforementioned C2-4 alkynyl groups as well as pentynyl (C5), hexynyl (C6), and the like. Additional examples of alkynyl include heptynyl (C7), octynyl (C8), and the like. Unless otherwise specified, each instance of an alkynyl group is independently unsubstituted (an "unsubstituted alkynyl") or substituted (a "substituted alkynyl") with one or more substituents. In certain embodiments, the alkynyl group is an unsubstituted C2-20 alkynyl. In certain embodiments, the alkynyl group is a substituted C2-20 alkynyl.
[0091] As used herein, "heteroalkynyl" refers to an alkynyl group as defined herein which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain. In certain embodiments, a heteroalkynyl group refers to a group having from 2 to 20 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-20 alkynyl"). In certain embodiments, a heteroalkynyl group refers to a group having from 2 to 10 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-10 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 9 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-9 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 8 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-8 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 7 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-7 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 6 carbon atoms, at least one triple bond, and 1 or more heteroatoms within the parent chain ("heteroC2-6 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 5 carbon atoms, at least one triple bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-5 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 4 carbon atoms, at least one triple bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-4 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 3 carbon atoms, at least one triple bond, and 1 heteroatom within the parent chain ("heteroC2-3 alkynyl"). In some embodiments, a heteroalkynyl group has 2 to 6 carbon atoms, at least one triple bond, and 1 or 2 heteroatoms within the parent chain ("heteroC2-6 alkynyl"). Unless otherwise specified, each instance of a heteroalkynyl group is independently unsubstituted (an "unsubstituted heteroalkynyl") or substituted (a "substituted heteroalkynyl") with one or more substituents. In certain embodiments, the heteroalkynyl group is an unsubstituted heteroC2-20 alkynyl. In certain embodiments, the heteroalkynyl group is a substituted heteroC2-20 alkynyl.
[0092] As used herein, "carbocyclyl" or "carbocyclic" refers to a radical of a non-aromatic cyclic hydrocarbon group having from 3 to 10 ring carbon atoms ("C3-10 carbocyclyl") and zero heteroatoms in the non-aromatic ring system. In some embodiments, a carbocyclyl group has 3 to 8 ring carbon atoms ("C3-8 carbocyclyl"). In some embodiments, a carbocyclyl group has 3 to 7 ring carbon atoms ("C3-7 carbocyclyl"). In some embodiments, a carbocyclyl group has 3 to 6 ring carbon atoms ("C3-6 carbocyclyl"). In some embodiments, a carbocyclyl group has 4 to 6 ring carbon atoms ("C4-6 carbocyclyl"). In some embodiments, a carbocyclyl group has 5 to 6 ring carbon atoms ("C5-6 carbocyclyl"). In some embodiments, a carbocyclyl group has 5 to 10 ring carbon atoms ("C5-10 carbocyclyl"). Exemplary C3-6 carbocyclyl groups include, without limitation, cyclopropyl (C3), cyclopropenyl (C3), cyclobutyl (C4), cyclobutenyl (C4), cyclopentyl (C5), cyclopentenyl (C5), cyclohexyl (C6), cyclohexenyl (C6), cyclohexadienyl (C6), and the like. Exemplary C3-8 carbocyclyl groups include, without limitation, the aforementioned C3-6 carbocyclyl groups as well as cycloheptyl (C7), cycloheptenyl (C7), cycloheptadienyl (C7), cycloheptatrienyl (C7), cyclooctyl (C8), cyclooctenyl (C8), bicyclo[2.2.1]heptanyl (C7), bicyclo[2.2.2]octanyl (C8), and the like. Exemplary C3-10 carbocyclyl groups include, without limitation, the aforementioned C3-8 carbocyclyl groups as well as cyclononyl (C9), cyclononenyl (C9), cyclodecyl (C10), cyclodecenyl (C10), octahydro-1H-indenyl (C9), decahydronaphthalenyl (C10), spiro[4.5]decanyl (C10), and the like. As the foregoing examples illustrate, in certain embodiments, the carbocyclyl group is either monocyclic ("monocyclic carbocyclyl") or polycyclic (e.g., containing a fused, bridged or spiro ring system such as a bicyclic system ("bicyclic carbocyclyl") or tricyclic system ("tricyclic carbocyclyl")) and can be saturated or can contain one or more carbon-carbon double or triple bonds. "Carbocyclyl" also includes ring systems wherein the carbocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups wherein the point of attachment is on the carbocyclyl ring, and in such instances, the number of carbons continues to designate the number of carbons in the carbocyclic ring system. Unless otherwise specified, each instance of a carbocyclyl group is independently unsubstituted (an "unsubstituted carbocyclyl") or substituted (a "substituted carbocyclyl") with one or more substituents. In certain embodiments, the carbocyclyl group is an unsubstituted C3-10 carbocyclyl. In certain embodiments, the carbocyclyl group is a substituted C3-10 carbocyclyl.
[0093] In some embodiments, "carbocyclyl" is a monocyclic, saturated carbocyclyl group having from 3 to 10 ring carbon atoms ("C3-10 cycloalkyl"). In some embodiments, a cycloalkyl group has 3 to 8 ring carbon atoms ("C3-8 cycloalkyl"). In some embodiments, a cycloalkyl group has 3 to 6 ring carbon atoms ("C3-6 cycloalkyl"). In some embodiments, a cycloalkyl group has 4 to 6 ring carbon atoms ("C4-6 cycloalkyl"). In some embodiments, a cycloalkyl group has 5 to 6 ring carbon atoms ("C5-6 cycloalkyl"). In some embodiments, a cycloalkyl group has 5 to 10 ring carbon atoms ("C5-10 cycloalkyl"). Examples of C5-6 cycloalkyl groups include cyclopentyl (C5) and cyclohexyl (C6). Examples of C3-6 cycloalkyl groups include the aforementioned C5-6 cycloalkyl groups as well as cyclopropyl (C3) and cyclobutyl (C4). Examples of C3-8 cycloalkyl groups include the aforementioned C3-6 cycloalkyl groups as well as cycloheptyl (C7) and cyclooctyl (C8). Unless otherwise specified, each instance of a cycloalkyl group is independently unsubstituted (an "unsubstituted cycloalkyl") or substituted (a "substituted cycloalkyl") with one or more substituents. In certain embodiments, the cycloalkyl group is an unsubstituted C3-10 cycloalkyl. In certain embodiments, the cycloalkyl group is a substituted C3-10 cycloalkyl.
[0094] As used herein, "heterocyclyl" or "heterocyclic" refers to a radical of a 3- to 14-membered non-aromatic ring system having ring carbon atoms and 1 to 4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("3-14 membered heterocyclyl"). In heterocyclyl groups that contain one or more nitrogen atoms, the point of attachment can be a carbon or nitrogen atom, as valency permits. A heterocyclyl group can either be monocyclic ("monocyclic heterocyclyl") or polycyclic (e.g., a fused, bridged or spiro ring system such as a bicyclic system ("bicyclic heterocyclyl") or tricyclic system ("tricyclic heterocyclyl")), and can be saturated or can contain one or more carbon-carbon double or triple bonds. Heterocyclyl polycyclic ring systems can include one or more heteroatoms in one or both rings. "Heterocyclyl" also includes ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more carbocyclyl groups wherein the point of attachment is either on the carbocyclyl or heterocyclyl ring, or ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups, wherein the point of attachment is on the heterocyclyl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heterocyclyl ring system. Unless otherwise specified, each instance of heterocyclyl is independently unsubstituted (an "unsubstituted heterocyclyl") or substituted (a "substituted heterocyclyl") with one or more substituents. In certain embodiments, the heterocyclyl group is an unsubstituted 3-14 membered heterocyclyl. In certain embodiments, the heterocyclyl group is a substituted 3-14 membered heterocyclyl.
[0095] In some embodiments, a heterocyclyl group is a 5-10 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-10 membered heterocyclyl"). In some embodiments, a heterocyclyl group is a 5-8 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is
independently selected from nitrogen, oxygen, and sulfur ("5-8 membered heterocyclyl"). In some embodiments, a heterocyclyl group is a 5-6 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is
independently selected from nitrogen, oxygen, and sulfur ("5-6 membered heterocyclyl"). In some embodiments, the 5-6 membered heterocyclyl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur.
[0096] Exemplary 3-membered heterocyclyl groups containing 1 heteroatom include, without limitation, aziridinyl, oxiranyl, and thiiranyl. Exemplary 4-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azetidinyl, oxetanyl and thietanyl. Exemplary 5-membered heterocyclyl groups containing 1 heteroatom include, without limitation, tetrahydrofuranyl, dihydrofuranyl, tetrahydrothiophenyl, dihydrothiophenyl, pyrrolidinyl, dihydropyrrolyl and pyrrolyl-2,5-dione. Exemplary 5-membered heterocyclyl groups containing 2 heteroatoms include, without limitation, dioxolanyl, oxathiolanyl and dithiolanyl. Exemplary 5-membered heterocyclyl groups containing 3 heteroatoms include, without limitation, triazolinyl, oxadiazolinyl, and thiadiazolinyl. Exemplary 6-membered heterocyclyl groups containing 1 heteroatom include, without limitation, piperidinyl, tetrahydropyranyl, dihydropyridinyl, and thianyl. Exemplary 6-membered heterocyclyl groups containing 2 heteroatoms include, without limitation, piperazinyl, morpholinyl, dithianyl, and dioxanyl. Exemplary 6-membered heterocyclyl groups containing 3 heteroatoms include, without limitation, triazinanyl. Exemplary 7-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azepanyl, oxepanyl and thiepanyl. Exemplary 8-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azocanyl, oxocanyl and thiocanyl. Exemplary bicyclic heterocyclyl groups include, without limitation, indolinyl, isoindolinyl, dihydrobenzofuranyl, dihydrobenzothienyl, tetrahydrobenzothienyl, tetrahydrobenzofuranyl, tetrahydroindolyl, tetrahydroquinolinyl, tetrahydroisoquinolinyl, decahydroquinolinyl, decahydroisoquinolinyl, octahydrochromenyl, octahydroisochromenyl, decahydronaphthyridinyl, decahydro-1,8-naphthyridinyl, octahydropyrrolo[3,2-b]pyrrole, indolinyl, phthalimidyl, naphthalimidyl, chromanyl, chromenyl, 1H-benzo[e][1,4]diazepinyl, 1,4,5,7-tetrahydropyrano[3,4-b]pyrrolyl, 5,6-dihydro-4H-furo[3,2-b]pyrrolyl, 6,7-dihydro-5H-furo[3,2-b]pyranyl, 5,7-dihydro-4H-thieno[2,3-c]pyranyl, 2,3-dihydro-1H-pyrrolo[2,3-b]pyridinyl, 2,3-dihydrofuro[2,3-b]pyridinyl, 4,5,6,7-tetrahydro-1H-pyrrolo[2,3-b]pyridinyl, 4,5,6,7-tetrahydrofuro[3,2-c]pyridinyl, 4,5,6,7-tetrahydrothieno[3,2-b]pyridinyl, 1,2,3,4-tetrahydro-1,6-naphthyridinyl, and the like.
[0097] As used herein, "aryl" refers to a radical of a monocyclic or polycyclic (e.g., bicyclic or tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having 6-14 ring carbon atoms and zero heteroatoms provided in the aromatic ring system ("C6-14 aryl"). In some embodiments, an aryl group has 6 ring carbon atoms ("C6 aryl"; e.g., phenyl). In some embodiments, an aryl group has 10 ring carbon atoms ("C10 aryl"; e.g., naphthyl such as 1-naphthyl and 2-naphthyl). In some embodiments, an aryl group has 14 ring carbon atoms ("C14 aryl"; e.g., anthracyl). "Aryl" also includes ring systems wherein the aryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the radical or point of attachment is on the aryl ring, and in such instances, the number of carbon atoms continues to designate the number of carbon atoms in the aryl ring system. Unless otherwise specified, each instance of an aryl group is independently unsubstituted (an "unsubstituted aryl") or substituted (a "substituted aryl") with one or more substituents. In certain embodiments, the aryl group is an unsubstituted C6-14 aryl. In certain embodiments, the aryl group is a substituted C6-14 aryl.
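By way of illustration only, the "4n+2" electron count referred to in the definition of aryl can be checked with a short Python sketch. The function name `is_huckel_count` is a hypothetical helper introduced here. The check covers only the electron-count criterion; it does not by itself establish aromaticity, which also depends on the electrons being shared in a cyclic, conjugated array.

```python
def is_huckel_count(pi_electrons: int) -> bool:
    """Return True if the π-electron count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# The counts named in the definition (6, 10, and 14) correspond to n = 1, 2, 3.
for count in (6, 10, 14):
    assert is_huckel_count(count)

# Counts of the form 4n, such as 4 or 8, do not satisfy the rule.
for count in (4, 8, 12):
    assert not is_huckel_count(count)
```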
[0098] As used herein, "heteroaryl" refers to a radical of a 5-14 membered monocyclic or polycyclic (e.g., bicyclic, tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen and sulfur ("5-14 membered heteroaryl"). In heteroaryl groups that contain one or more nitrogen atoms, the point of attachment can be a carbon or nitrogen atom, as valency permits. Heteroaryl polycyclic ring systems can include one or more heteroatoms in one or both rings. "Heteroaryl" includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the point of attachment is on the heteroaryl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heteroaryl ring system. "Heteroaryl" also includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more aryl groups wherein the point of attachment is either on the aryl or heteroaryl ring, and in such instances, the number of ring members designates the number of ring members in the fused polycyclic (aryl/heteroaryl) ring system. In polycyclic heteroaryl groups wherein one ring does not contain a heteroatom (e.g., indolyl, quinolinyl, carbazolyl, and the like), the point of attachment can be on either ring, i.e., either the ring bearing a heteroatom (e.g., 2-indolyl) or the ring that does not contain a heteroatom (e.g., 5-indolyl).
[0099] In some embodiments, a heteroaryl group is a 5-10 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-10 membered heteroaryl"). In some embodiments, a heteroaryl group is a 5-8 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-8 membered heteroaryl"). In some embodiments, a heteroaryl group is a 5-6 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur ("5-6 membered heteroaryl"). In some embodiments, the 5-6 membered heteroaryl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heteroaryl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heteroaryl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur. Unless otherwise specified, each instance of a heteroaryl group is independently unsubstituted (an "unsubstituted heteroaryl") or substituted (a "substituted heteroaryl") with one or more substituents. In certain embodiments, the heteroaryl group is an unsubstituted 5-14 membered heteroaryl. In certain embodiments, the heteroaryl group is a substituted 5-14 membered heteroaryl.
[00100] Exemplary 5-membered heteroaryl groups containing 1 heteroatom include, without limitation, pyrrolyl, furanyl and thiophenyl. Exemplary 5-membered heteroaryl groups containing 2 heteroatoms include, without limitation, imidazolyl, pyrazolyl, oxazolyl, isoxazolyl, thiazolyl, and isothiazolyl. Exemplary 5-membered heteroaryl groups containing 3 heteroatoms include, without limitation, triazolyl, oxadiazolyl, and thiadiazolyl. Exemplary 5-membered heteroaryl groups containing 4 heteroatoms include, without limitation, tetrazolyl. Exemplary 6-membered heteroaryl groups containing 1 heteroatom include, without limitation, pyridinyl. Exemplary 6-membered heteroaryl groups containing 2 heteroatoms include, without limitation, pyridazinyl, pyrimidinyl, and pyrazinyl. Exemplary 6-membered heteroaryl groups containing 3 or 4 heteroatoms include, without limitation, triazinyl and tetrazinyl, respectively. Exemplary 7-membered heteroaryl groups containing 1 heteroatom include, without limitation, azepinyl, oxepinyl, and thiepinyl. Exemplary 5,6-bicyclic heteroaryl groups include, without limitation, indolyl, isoindolyl, indazolyl, benzotriazolyl, benzothiophenyl, isobenzothiophenyl, benzofuranyl, benzoisofuranyl, benzimidazolyl, benzoxazolyl, benzisoxazolyl, benzoxadiazolyl, benzthiazolyl, benzisothiazolyl, benzthiadiazolyl, indolizinyl, and purinyl. Exemplary 6,6-bicyclic heteroaryl groups include, without limitation, naphthyridinyl, pteridinyl, quinolinyl, isoquinolinyl, cinnolinyl, quinoxalinyl, phthalazinyl, and quinazolinyl. Exemplary tricyclic heteroaryl groups include, without limitation, phenanthridinyl, dibenzofuranyl, carbazolyl, acridinyl, phenothiazinyl, phenoxazinyl and phenazinyl.
[00101] As used herein, the term "partially unsaturated" refers to a ring moiety that includes at least one double or triple bond. The term "partially unsaturated" is intended to encompass rings having multiple sites of unsaturation, but is not intended to include aromatic groups (e.g., aryl or heteroaryl moieties) as herein defined.
[00102] As used herein, the term "saturated" refers to a ring moiety that does not contain a double or triple bond, i.e., the ring contains all single bonds.
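For illustration only, the distinction drawn above between "saturated", "partially unsaturated", and aromatic ring moieties can be sketched in Python. The sketch assumes that a ring is described by a list of its ring bond orders (1, 2, or 3) and that aromaticity has been determined separately and is passed in as a flag; the function name `classify_ring` and this input format are hypothetical.

```python
def classify_ring(bond_orders: list[int], aromatic: bool = False) -> str:
    """Classify a ring moiety from its ring bond orders.

    `bond_orders` lists the order (1, 2, or 3) of each bond in the ring.
    `aromatic` is an externally supplied flag; this sketch does not
    attempt to perceive aromaticity itself.
    """
    if any(order not in (1, 2, 3) for order in bond_orders):
        raise ValueError("bond orders must be 1, 2, or 3")
    if aromatic:
        # Aromatic groups are excluded from "partially unsaturated".
        return "aromatic"
    if any(order > 1 for order in bond_orders):
        # At least one double or triple bond; multiple sites are allowed.
        return "partially unsaturated"
    return "saturated"  # the ring contains all single bonds
```

Under these assumptions, a cyclohexane ring (six single bonds) is classified as saturated, a cyclohexene or cyclohexadiene ring as partially unsaturated, and a phenyl ring flagged as aromatic falls outside both terms.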
[00103] Affixing the suffix "-ene" to a group indicates the group is a divalent moiety, e.g., alkylene is the divalent moiety of alkyl, alkenylene is the divalent moiety of alkenyl, alkynylene is the divalent moiety of alkynyl, heteroalkylene is the divalent moiety of heteroalkyl, heteroalkenylene is the divalent moiety of heteroalkenyl, heteroalkynylene is the divalent moiety of heteroalkynyl, carbocyclylene is the divalent moiety of carbocyclyl, heterocyclylene is the divalent moiety of heterocyclyl, arylene is the divalent moiety of aryl, and heteroarylene is the divalent moiety of heteroaryl.
[00104] As understood from the above, alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl groups, as defined herein, are, in certain embodiments, optionally substituted. Optionally substituted refers to a group which may be substituted or unsubstituted (e.g., "substituted" or "unsubstituted" alkyl, "substituted" or "unsubstituted" alkenyl, "substituted" or "unsubstituted" alkynyl, "substituted" or "unsubstituted" heteroalkyl, "substituted" or "unsubstituted" heteroalkenyl, "substituted" or "unsubstituted" heteroalkynyl, "substituted" or "unsubstituted" carbocyclyl, "substituted" or "unsubstituted" heterocyclyl, "substituted" or "unsubstituted" aryl or "substituted" or "unsubstituted" heteroaryl group). In general, the term "substituted" means that at least one hydrogen present on a group is replaced with a permissible substituent, e.g., a substituent which upon substitution results in a stable compound, e.g., a compound which does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, or other reaction. Unless otherwise indicated, a "substituted" group has a substituent at one or more substitutable positions of the group, and when more than one position in any given structure is substituted, the substituent is either the same or different at each position. The term "substituted" is contemplated to include substitution with all permissible substituents of organic compounds, including any of the substituents described herein that result in the formation of a stable compound. The present invention contemplates any and all such combinations in order to arrive at a stable compound. For purposes of this invention, heteroatoms such as nitrogen may have hydrogen substituents and/or any suitable substituent as described herein which satisfies the valencies of the heteroatoms and results in the formation of a stable moiety.
[00105] Exemplary carbon atom substituents include, but are not limited to, halogen, -CN, -NO2, -N3, -SO2H, -SO3H, -OH, -ORaa, -ON(Rbb)2, -N(Rbb)2, -N(Rbb)3+X-, -N(ORcc)Rbb, -SH, -SRaa, -SSRcc, -C(=O)Raa, -CO2H, -CHO, -C(ORcc)2, -CO2Raa, -OC(=O)Raa, -OCO2Raa, -C(=O)N(Rbb)2, -OC(=O)N(Rbb)2, -NRbbC(=O)Raa, -NRbbCO2Raa, -NRbbC(=O)N(Rbb)2, -C(=NRbb)Raa, -C(=NRbb)ORaa, -OC(=NRbb)Raa, -OC(=NRbb)ORaa, -C(=NRbb)N(Rbb)2, -OC(=NRbb)N(Rbb)2, -NRbbC(=NRbb)N(Rbb)2, -C(=O)NRbbSO2Raa, -NRbbSO2Raa, -SO2N(Rbb)2, -SO2Raa, -SO2ORaa, -OSO2Raa, -S(=O)Raa, -OS(=O)Raa, -Si(Raa)3, -OSi(Raa)3, -C(=S)N(Rbb)2, -C(=O)SRaa, -C(=S)SRaa, -SC(=S)SRaa, -SC(=O)SRaa, -OC(=O)SRaa, -SC(=O)ORaa, -SC(=O)Raa, -P(=O)2Raa, -OP(=O)2Raa, -P(=O)(Raa)2, -OP(=O)(Raa)2, -OP(=O)(ORcc)2, -P(=O)2N(Rbb)2, -OP(=O)2N(Rbb)2, -P(=O)(NRbb)2, -OP(=O)(NRbb)2, -NRbbP(=O)(ORcc)2, -NRbbP(=O)(NRbb)2, -P(Rcc)2, -P(Rcc)3, -OP(Rcc)2, -OP(Rcc)3, -B(Raa)2, -B(ORcc)2, -BRaa(ORcc), C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2-10 heteroalkynyl, C3-14 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups;
or two geminal hydrogens on a carbon atom are replaced with the group =O, =S, =NN(Rbb)2, =NNRbbC(=O)Raa, =NNRbbC(=O)ORaa, =NNRbbS(=O)2Raa, =NRbb, or =NORcc;
each instance of Raa is, independently, selected from C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2-10 heteroalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two Raa groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups;
each instance of Rbb is, independently, selected from hydrogen, -OH, -ORaa, -N(Rcc)2, -CN, -C(=O)Raa, -C(=O)N(Rcc)2, -CO2Raa, -SO2Raa, -C(=NRcc)ORaa, -C(=NRcc)N(Rcc)2, -SO2N(Rcc)2, -SO2Rcc, -SO2ORcc, -SORaa, -C(=S)N(Rcc)2, -C(=O)SRcc, -C(=S)SRcc, -P(=O)2Raa, -P(=O)(Raa)2, -P(=O)2N(Rcc)2, -P(=O)(NRcc)2, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2-10 heteroalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two Rbb groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups;
each instance of Rcc is, independently, selected from hydrogen, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2-10 heteroalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two Rcc groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups;
each instance of Rdd is, independently, selected from halogen, -CN, -NO2, -N3, -SO2H, -SO3H, -OH, -ORee, -ON(Rff)2, -N(Rff)2, -N(Rff)3+X-, -N(ORee)Rff, -SH, -SRee, -SSRee, -C(=O)Ree, -CO2H, -CO2Ree, -OC(=O)Ree, -OCO2Ree, -C(=O)N(Rff)2, -OC(=O)N(Rff)2, -NRffC(=O)Ree, -NRffCO2Ree, -NRffC(=O)N(Rff)2, -C(=NRff)ORee, -OC(=NRff)Ree, -OC(=NRff)ORee, -C(=NRff)N(Rff)2, -OC(=NRff)N(Rff)2, -NRffC(=NRff)N(Rff)2, -NRffSO2Ree, -SO2N(Rff)2, -SO2Ree, -SO2ORee, -OSO2Ree, -S(=O)Ree, -Si(Ree)3, -OSi(Ree)3, -C(=S)N(Rff)2, -C(=O)SRee, -C(=S)SRee, -SC(=S)SRee, -P(=O)2Ree, -P(=O)(Ree)2, -OP(=O)(Ree)2, -OP(=O)(ORee)2, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, C1-6 heteroalkyl, C2-6 heteroalkenyl, C2-6 heteroalkynyl, C3-10 carbocyclyl, 3-10 membered heterocyclyl, C6-10 aryl, 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rgg groups, or two geminal Rdd substituents can be joined to form =O or =S;
each instance of Ree is, independently, selected from C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, C1-6 heteroalkyl, C2-6 heteroalkenyl, C2-6 heteroalkynyl, C3-10 carbocyclyl, C6-10 aryl, 3-10 membered heterocyclyl, and 3-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rgg groups;
each instance of Rff is, independently, selected from hydrogen, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, C1-6 heteroalkyl, C2-6 heteroalkenyl, C2-6 heteroalkynyl, C3-10 carbocyclyl, 3-10 membered heterocyclyl, C6-10 aryl and 5-10 membered heteroaryl, or two Rff groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rgg groups; and
each instance of Rgg is, independently, halogen, -CN, -NO2, -N3, -SO2H, -SO3H, -OH, -OC1-6 alkyl, -ON(C1-6 alkyl)2, -N(C1-6 alkyl)2, -N(C1-6 alkyl)3+X-, -NH(C1-6 alkyl)2+X-, -NH2(C1-6 alkyl)+X-, -NH3+X-, -N(OC1-6 alkyl)(C1-6 alkyl), -N(OH)(C1-6 alkyl), -NH(OH), -SH, -SC1-6 alkyl, -SS(C1-6 alkyl), -C(=O)(C1-6 alkyl), -CO2H, -CO2(C1-6 alkyl), -OC(=O)(C1-6 alkyl), -OCO2(C1-6 alkyl), -C(=O)NH2, -C(=O)N(C1-6 alkyl)2, -OC(=O)NH(C1-6 alkyl), -NHC(=O)(C1-6 alkyl), -N(C1-6 alkyl)C(=O)(C1-6 alkyl), -NHCO2(C1-6 alkyl), -NHC(=O)N(C1-6 alkyl)2, -NHC(=O)NH(C1-6 alkyl), -NHC(=O)NH2, -C(=NH)O(C1-6 alkyl), -OC(=NH)(C1-6 alkyl), -OC(=NH)OC1-6 alkyl, -C(=NH)N(C1-6 alkyl)2, -C(=NH)NH(C1-6 alkyl), -C(=NH)NH2, -OC(=NH)N(C1-6 alkyl)2, -OC(NH)NH(C1-6 alkyl), -OC(NH)NH2, -NHC(NH)N(C1-6 alkyl)2, -NHC(=NH)NH2, -NHSO2(C1-6 alkyl), -SO2N(C1-6 alkyl)2, -SO2NH(C1-6 alkyl), -SO2NH2, -SO2C1-6 alkyl, -SO2OC1-6 alkyl, -OSO2C1-6 alkyl, -SOC1-6 alkyl, -Si(C1-6 alkyl)3, -OSi(C1-6 alkyl)3, -C(=S)N(C1-6 alkyl)2, C(=S)NH(C1-6 alkyl), C(=S)NH2, -C(=O)S(C1-6 alkyl), -C(=S)SC1-6 alkyl, -SC(=S)SC1-6 alkyl, -P(=O)2(C1-6 alkyl), -P(=O)(C1-6 alkyl)2, -OP(=O)(C1-6 alkyl)2, -OP(=O)(OC1-6 alkyl)2, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, C1-6 heteroalkyl, C2-6 heteroalkenyl, C2-6 heteroalkynyl, C3-10 carbocyclyl, C6-10 aryl, 3-10 membered heterocyclyl, 5-10 membered heteroaryl; or two geminal Rgg substituents can be joined to form =O or =S; wherein X- is a counterion.
[00106] As used herein, the term "halo" or "halogen" refers to fluorine (fluoro, -F), chlorine (chloro, -Cl), bromine (bromo, -Br), or iodine (iodo, -I).
[00107] As used herein, a "counterion" is a negatively charged group associated with a positively charged quaternary amine in order to maintain electronic neutrality. Exemplary counterions include halide ions (e.g., F-, Cl-, Br-, I-), NO3-, ClO4-, OH-, H2PO4-, HSO4-, sulfonate ions (e.g., methanesulfonate, trifluoromethanesulfonate, p-toluenesulfonate, benzenesulfonate, 10-camphor sulfonate, naphthalene-2-sulfonate, naphthalene-1-sulfonic acid-5-sulfonate, ethan-1-sulfonic acid-2-sulfonate, and the like), and carboxylate ions (e.g., acetate, ethanoate, propanoate, benzoate, glycerate, lactate, tartrate, glycolate, and the like).
[00108] In certain embodiments, the substituent present on the nitrogen atom is a nitrogen protecting group (also referred to as an "amino protecting group"). Nitrogen protecting groups include, but are not limited to, -OH, -ORaa, -N(Rcc)2, -C(=O)Raa, -C(=O)N(Rcc)2, -CO2Raa, -SO2Raa, -C(=NRcc)Raa, -C(=NRcc)ORaa, -C(=NRcc)N(Rcc)2, -SO2N(Rcc)2, -SO2Rcc, -SO2ORcc, -SORaa, -C(=S)N(Rcc)2, -C(=O)SRcc, -C(=S)SRcc, C1-10 alkyl (e.g., aralkyl, heteroaralkyl), C2-10 alkenyl, C2-10 alkynyl, C1-10 heteroalkyl, C2-10 heteroalkenyl, C2-10 heteroalkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl groups, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aralkyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups, and wherein Raa, Rbb, Rcc and Rdd are as defined herein. Nitrogen protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference.
[00109] For example, nitrogen protecting groups such as amide groups (e.g., -C(=O)Raa) include, but are not limited to, formamide, acetamide, chloroacetamide, trichloroacetamide, trifluoroacetamide, phenylacetamide, 3-phenylpropanamide, picolinamide, 3-pyridylcarboxamide, N-benzoylphenylalanyl derivative, benzamide, p-phenylbenzamide, o-nitrophenylacetamide, o-nitrophenoxyacetamide, acetoacetamide, (N'-dithiobenzyloxyacylamino)acetamide, 3-(p-hydroxyphenyl)propanamide, 3-(o-nitrophenyl)propanamide, 2-methyl-2-(o-nitrophenoxy)propanamide, 2-methyl-2-(o-phenylazophenoxy)propanamide, 4-chlorobutanamide, 3-methyl-3-nitrobutanamide, o-nitrocinnamide, N-acetylmethionine derivative, o-nitrobenzamide and o-(benzoyloxymethyl)benzamide.
[00110] Nitrogen protecting groups such as carbamate groups (e.g., -C(=O)ORaa) include, but are not limited to, methyl carbamate, ethyl carbamate, 9-fluorenylmethyl carbamate (Fmoc), 9-(2-sulfo)fluorenylmethyl carbamate, 9-(2,7-dibromo)fluorenylmethyl carbamate, 2,7-di-t-butyl-[9-(10,10-dioxo-10,10,10,10-tetrahydrothioxanthyl)]methyl carbamate (DBD-Tmoc), 4-methoxyphenacyl carbamate (Phenoc), 2,2,2-trichloroethyl carbamate (Troc), 2-trimethylsilylethyl carbamate (Teoc), 2-phenylethyl carbamate (hZ), 1-(1-adamantyl)-1-methylethyl carbamate (Adpoc), 1,1-dimethyl-2-haloethyl carbamate, 1,1-dimethyl-2,2-dibromoethyl carbamate (DB-t-BOC), 1,1-dimethyl-2,2,2-trichloroethyl carbamate (TCBOC), 1-methyl-1-(4-biphenylyl)ethyl carbamate (Bpoc), 1-(3,5-di-t-butylphenyl)-1-methylethyl carbamate (t-Bumeoc), 2-(2'- and 4'-pyridyl)ethyl carbamate (Pyoc), 2-(N,N-dicyclohexylcarboxamido)ethyl carbamate, t-butyl carbamate (BOC), 1-adamantyl carbamate (Adoc), vinyl carbamate (Voc), allyl carbamate (Alloc), 1-isopropylallyl carbamate (Ipaoc), cinnamyl carbamate (Coc), 4-nitrocinnamyl carbamate (Noc), 8-quinolyl carbamate, N-hydroxypiperidinyl carbamate, alkyldithio carbamate, benzyl carbamate (Cbz), p-methoxybenzyl carbamate (Moz), p-nitrobenzyl carbamate, p-bromobenzyl carbamate, p-chlorobenzyl carbamate, 2,4-dichlorobenzyl carbamate, 4-methylsulfinylbenzyl carbamate (Msz), 9-anthrylmethyl carbamate, diphenylmethyl carbamate, 2-methylthioethyl carbamate, 2-methylsulfonylethyl carbamate, 2-(p-toluenesulfonyl)ethyl carbamate, [2-(1,3-dithianyl)]methyl carbamate (Dmoc), 4-methylthiophenyl carbamate (Mtpc), 2,4-dimethylthiophenyl carbamate (Bmpc), 2-phosphonioethyl carbamate (Peoc), 2-triphenylphosphonioisopropyl carbamate (Ppoc), 1,1-dimethyl-2-cyanoethyl carbamate, m-chloro-p-acyloxybenzyl carbamate, p-(dihydroxyboryl)benzyl carbamate, 5-benzisoxazolylmethyl carbamate, 2-(trifluoromethyl)-6-chromonylmethyl carbamate (Tcroc), m-nitrophenyl carbamate, 3,5-dimethoxybenzyl carbamate, o-nitrobenzyl carbamate, 3,4-dimethoxy-6-nitrobenzyl carbamate, phenyl(o-nitrophenyl)methyl carbamate, t-amyl carbamate, S-benzyl thiocarbamate, p-cyanobenzyl carbamate, cyclobutyl carbamate, cyclohexyl carbamate, cyclopentyl carbamate, cyclopropylmethyl carbamate, p-decyloxybenzyl carbamate, 2,2-dimethoxyacylvinyl carbamate, o-(N,N-dimethylcarboxamido)benzyl carbamate, 1,1-dimethyl-3-(N,N-dimethylcarboxamido)propyl carbamate, 1,1-dimethylpropynyl carbamate, di(2-pyridyl)methyl carbamate, 2-furanylmethyl carbamate, 2-iodoethyl carbamate, isobornyl carbamate, isobutyl carbamate, isonicotinyl carbamate, p-(p'-methoxyphenylazo)benzyl carbamate, 1-methylcyclobutyl carbamate, 1-methylcyclohexyl carbamate, 1-methyl-1-cyclopropylmethyl carbamate, 1-methyl-1-(3,5-dimethoxyphenyl)ethyl carbamate, 1-methyl-1-(p-phenylazophenyl)ethyl carbamate, 1-methyl-1-phenylethyl carbamate, 1-methyl-1-(4-pyridyl)ethyl carbamate, phenyl carbamate, p-(phenylazo)benzyl carbamate, 2,4,6-tri-t-butylphenyl carbamate, 4-(trimethylammonium)benzyl carbamate, and 2,4,6-trimethylbenzyl carbamate.
[00111] Nitrogen protecting groups such as sulfonamide groups (e.g., -S(=O)2Raa) include, but are not limited to, p-toluenesulfonamide (Ts), benzenesulfonamide, 2,3,6-trimethyl-4-methoxybenzenesulfonamide (Mtr), 2,4,6-trimethoxybenzenesulfonamide (Mtb), 2,6-dimethyl-4-methoxybenzenesulfonamide (Pme), 2,3,5,6-tetramethyl-4-methoxybenzenesulfonamide (Mte), 4-methoxybenzenesulfonamide (Mbs), 2,4,6-trimethylbenzenesulfonamide (Mts), 2,6-dimethoxy-4-methylbenzenesulfonamide (iMds), 2,2,5,7,8-pentamethylchroman-6-sulfonamide (Pmc), methanesulfonamide (Ms), β-trimethylsilylethanesulfonamide (SES), 9-anthracenesulfonamide, 4-(4',8'-dimethoxynaphthylmethyl)benzenesulfonamide (DNMBS), benzylsulfonamide, trifluoromethylsulfonamide, and phenacylsulfonamide.
[00112] Other nitrogen protecting groups include, but are not limited to, phenothiazinyl-(10)-acyl derivative, N'-p-toluenesulfonylaminoacyl derivative, N'-phenylaminothioacyl derivative, N-benzoylphenylalanyl derivative, N-acetylmethionine derivative, 4,5-diphenyl-3-oxazolin-2-one, N-phthalimide, N-dithiasuccinimide (Dts), N-2,3-diphenylmaleimide, N-2,5-dimethylpyrrole, N-1,1,4,4-tetramethyldisilylazacyclopentane adduct (STABASE), 5-substituted 1,3-dimethyl-1,3,5-triazacyclohexan-2-one, 5-substituted 1,3-dibenzyl-1,3,5-triazacyclohexan-2-one, 1-substituted 3,5-dinitro-4-pyridone, N-methylamine, N-allylamine, N-[2-(trimethylsilyl)ethoxy]methylamine (SEM), N-3-acetoxypropylamine, N-(1-isopropyl-4-nitro-2-oxo-3-pyrrolin-3-yl)amine, quaternary ammonium salts, N-benzylamine, N-di(4-methoxyphenyl)methylamine, N-5-dibenzosuberylamine, N-triphenylmethylamine (Tr), N-[(4-methoxyphenyl)diphenylmethyl]amine (MMTr), N-9-phenylfluorenylamine (PhF), N-2,7-dichloro-9-fluorenylmethyleneamine, N-ferrocenylmethylamino (Fcm), N-2-picolylamino N'-oxide, N-1,1-dimethylthiomethyleneamine, N-benzylideneamine, N-p-methoxybenzylideneamine, N-diphenylmethyleneamine, N-[(2-pyridyl)mesityl]methyleneamine, N-(N',N'-dimethylaminomethylene)amine, N,N'-isopropylidenediamine, N-p-nitrobenzylideneamine, N-salicylideneamine, N-5-chlorosalicylideneamine, N-(5-chloro-2-hydroxyphenyl)phenylmethyleneamine, N-cyclohexylideneamine, N-(5,5-dimethyl-3-oxo-1-cyclohexenyl)amine, N-borane derivative, N-diphenylborinic acid derivative, N-[phenyl(pentaacylchromium- or tungsten)acyl]amine, N-copper chelate, N-zinc chelate, N-nitroamine, N-nitrosoamine, amine N-oxide, diphenylphosphinamide (Dpp), dimethylthiophosphinamide (Mpt), diphenylthiophosphinamide (Ppt), dialkyl phosphoramidates, dibenzyl phosphoramidate, diphenyl phosphoramidate, benzenesulfenamide, o-nitrobenzenesulfenamide (Nps), 2,4-dinitrobenzenesulfenamide, pentachlorobenzenesulfenamide, 2-nitro-4-methoxybenzenesulfenamide, triphenylmethylsulfenamide, and 3-nitropyridinesulfenamide (Npys).
[00113] In certain embodiments, the substituent present on an oxygen atom is an oxygen protecting group (also referred to as a "hydroxyl protecting group"). Oxygen protecting groups include, but are not limited to, -Raa, -N(Rbb)2, -C(=O)SRaa, -C(=O)Raa, -CO2Raa, -C(=O)N(Rbb)2, -C(=NRbb)Raa, -C(=NRbb)ORaa, -C(=NRbb)N(Rbb)2, -S(=O)Raa, -SO2Raa, -Si(Raa)3, -P(Rcc)2, -P(Rcc)3, -P(=O)2Raa, -P(=O)(Raa)2, -P(=O)(ORcc)2, -P(=O)2N(Rbb)2, and -P(=O)(NRbb)2, wherein Raa, Rbb, and Rcc are as defined herein. Oxygen protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference.
[00114] Exemplary oxygen protecting groups include, but are not limited to, methyl, methoxymethyl (MOM), methylthiomethyl (MTM), t-butylthiomethyl, (phenyldimethylsilyl)methoxymethyl (SMOM), benzyloxymethyl (BOM), p-methoxybenzyloxymethyl (PMBM), (4-methoxyphenoxy)methyl (p-AOM), guaiacolmethyl (GUM), t-butoxymethyl, 4-pentenyloxymethyl (POM), siloxymethyl, 2-methoxyethoxymethyl (MEM), 2,2,2-trichloroethoxymethyl, bis(2-chloroethoxy)methyl, 2-(trimethylsilyl)ethoxymethyl (SEMOR), tetrahydropyranyl (THP), 3-bromotetrahydropyranyl, tetrahydrothiopyranyl, 1-methoxycyclohexyl, 4-methoxytetrahydropyranyl (MTHP), 4-methoxytetrahydrothiopyranyl, 4-methoxytetrahydrothiopyranyl S,S-dioxide, 1-[(2-chloro-4-methyl)phenyl]-4-methoxypiperidin-4-yl (CTMP), 1,4-dioxan-2-yl, tetrahydrofuranyl, tetrahydrothiofuranyl, 2,3,3a,4,5,6,7,7a-octahydro-7,8,8-trimethyl-4,7-methanobenzofuran-2-yl, 1-ethoxyethyl, 1-(2-chloroethoxy)ethyl, 1-methyl-1-methoxyethyl, 1-methyl-1-benzyloxyethyl, 1-methyl-1-benzyloxy-2-fluoroethyl, 2,2,2-trichloroethyl, 2-trimethylsilylethyl, 2-(phenylselenyl)ethyl, t-butyl, allyl, p-chlorophenyl, p-methoxyphenyl, 2,4-dinitrophenyl, benzyl (Bn), p-methoxybenzyl, 3,4-dimethoxybenzyl, o-nitrobenzyl, p-nitrobenzyl, p-halobenzyl, 2,6-dichlorobenzyl, p-cyanobenzyl, p-phenylbenzyl, 2-picolyl, 4-picolyl, 3-methyl-2-picolyl N-oxido, diphenylmethyl, p,p'-dinitrobenzhydryl, 5-dibenzosuberyl, triphenylmethyl, α-naphthyldiphenylmethyl, p-methoxyphenyldiphenylmethyl, di(p-methoxyphenyl)phenylmethyl, tri(p-methoxyphenyl)methyl, 4-(4'-bromophenacyloxyphenyl)diphenylmethyl, 4,4',4"-tris(4,5-dichlorophthalimidophenyl)methyl, 4,4',4"-tris(levulinoyloxyphenyl)methyl, 4,4',4"-tris(benzoyloxyphenyl)methyl, 3-(imidazol-1-yl)bis(4',4"-dimethoxyphenyl)methyl, 1,1-bis(4-methoxyphenyl)-1'-pyrenylmethyl, 9-anthryl, 9-(9-phenyl)xanthenyl, 9-(9-phenyl-10-oxo)anthryl, 1,3-benzodithiolan-2-yl, benzisothiazolyl S,S-dioxido, trimethylsilyl (TMS), triethylsilyl (TES), triisopropylsilyl (TIPS), dimethylisopropylsilyl (IPDMS), diethylisopropylsilyl (DEIPS), dimethylthexylsilyl, t-butyldimethylsilyl (TBDMS), t-butyldiphenylsilyl (TBDPS), tribenzylsilyl, tri-p-xylylsilyl, triphenylsilyl, diphenylmethylsilyl (DPMS), t-butylmethoxyphenylsilyl (TBMPS), formate, benzoylformate, acetate, chloroacetate, dichloroacetate, trichloroacetate, trifluoroacetate, methoxyacetate, triphenylmethoxyacetate, phenoxyacetate, p-chlorophenoxyacetate, 3-phenylpropionate, 4-oxopentanoate (levulinate), 4,4-(ethylenedithio)pentanoate (levulinoyldithioacetal), pivaloate, adamantoate, crotonate, 4-methoxycrotonate, benzoate, p-phenylbenzoate, 2,4,6-trimethylbenzoate (mesitoate), alkyl methyl carbonate, 9-fluorenylmethyl carbonate (Fmoc), alkyl ethyl carbonate, alkyl 2,2,2-trichloroethyl carbonate (Troc), 2-(trimethylsilyl)ethyl carbonate (TMSEC), 2-(phenylsulfonyl)ethyl carbonate (Psec), 2-(triphenylphosphonio)ethyl carbonate (Peoc), alkyl isobutyl carbonate, alkyl vinyl carbonate, alkyl allyl carbonate, alkyl p-nitrophenyl carbonate, alkyl benzyl carbonate, alkyl p-methoxybenzyl carbonate, alkyl 3,4-dimethoxybenzyl carbonate, alkyl o-nitrobenzyl carbonate, alkyl p-nitrobenzyl carbonate, alkyl S-benzyl thiocarbonate, 4-ethoxy-1-naphthyl carbonate, methyl dithiocarbonate, 2-iodobenzoate, 4-azidobutyrate, 4-nitro-4-methylpentanoate, o-(dibromomethyl)benzoate, 2-formylbenzenesulfonate, 2-(methylthiomethoxy)ethyl, 4-(methylthiomethoxy)butyrate, 2-(methylthiomethoxymethyl)benzoate, 2,6-dichloro-4-methylphenoxyacetate, 2,6-dichloro-4-(1,1,3,3-tetramethylbutyl)phenoxyacetate, 2,4-bis(1,1-dimethylpropyl)phenoxyacetate, chlorodiphenylacetate, isobutyrate, monosuccinoate, (E)-2-methyl-2-butenoate, o-(methoxyacyl)benzoate, α-naphthoate, nitrate, alkyl N,N,N',N'-tetramethylphosphorodiamidate, alkyl N-phenylcarbamate, borate, dimethylphosphinothioyl, alkyl 2,4-dinitrophenylsulfenate, sulfate, methanesulfonate (mesylate), benzylsulfonate, and tosylate (Ts).
[00115] In certain embodiments, the substituent present on a sulfur atom is a sulfur protecting group (also referred to as a "thiol protecting group"). Sulfur protecting groups include, but are not limited to, -Raa, -N(Rbb)2, -C(=O)SRaa, -C(=O)Raa, -CO2Raa, -C(=O)N(Rbb)2, -C(=NRbb)Raa, -C(=NRbb)ORaa, -C(=NRbb)N(Rbb)2, -S(=O)Raa, -SO2Raa, -Si(Raa)3, -P(Rcc)2, -P(Rcc)3, -P(=O)2Raa, -P(=O)(Raa)2, -P(=O)(ORcc)2, -P(=O)2N(Rbb)2, and -P(=O)(NRbb)2, wherein Raa, Rbb, and Rcc are as defined herein. Sulfur protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference.
[00116] These and other exemplary substituents are described in more detail in the Detailed Description, Examples, and claims. The invention is not intended to be limited in any manner by the above exemplary listing of substituents.
[00117] As used herein, the term "salt" refers to any and all salts.
[00118] Exemplary acid-addition salts include, but are not limited to, acid-addition salt between an amino substituent and an inorganic acid such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid and perchloric acid, or with organic acids such as acetic acid, oxalic acid, maleic acid, tartaric acid, citric acid, succinic acid or malonic acid or by using other methods used in the art such as ion exchange. Other acid addition salts include salts formed from adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate,
glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate salts, and the like.
[00119] Exemplary salts derived from appropriate bases include amino acids having a net positive charge, metals, and quaternary amine salts (e.g., +NH4 and +N(C1-4 alkyl)4 salts). Representative metals include, but are not limited to, alkali metals (e.g., Li, Na, K, Cs), alkaline earth metals (e.g., Mg, Ca, Ba), and transition metals (e.g., Hg). Exemplary amino acids include, but are not limited to, arginine, histidine, lysine, aspartic acid, glutamic acid, serine, threonine, asparagine, glutamine, cysteine, selenocysteine, glycine, proline, alanine, valine, isoleucine, leucine, methionine, phenylalanine, tyrosine, and tryptophan.
DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS OF THE INVENTION
[00120] Compositions of heavy-atom labeled nucleic acids for use in systems and methods of sequencing, identifying and/or detecting nucleic acid polymers, such as DNA, are provided. The methods can involve using a particle beam, such as an electron beam, or ion beam, to obtain information regarding the heavy-atom labeled nucleic acid polymer.
Examples of such methods using particle beams to obtain information can be found in U.S. Patent Publication Nos. 2006/0024716, 2006/0024717, 2006/0024718, 2006/0029957, which correspond to PCT Application Publication No. WO06019903; and U.S. Patent Publication 2007/0190557, which corresponds to PCT Application Publication No. WO07089542, all entitled, "Systems and Methods of Analyzing Nucleic Acid Polymers and Related Components," each of which is incorporated by reference in its entirety. For example, a sample of heavy-atom labeled DNA can be exposed to a particle beam and changes in the beam resulting from interaction with the sample may form a pattern which can be interpreted to provide the information. In some embodiments, a particle beam instrument (e.g., an electron microscope) can be used to directly view samples of DNA. As described further below, the methods can enable nucleic acid sequencing, identifying and/or detection at high speeds, low costs, and high accuracy, amongst other advantages.
[00121] In some embodiments, a complementary strand of a nucleic acid polymer may be analyzed to determine the sequence and/or presence of a nucleic acid polymer. In certain embodiments, it is preferred that the sample be formed of one or more complementary strands of the nucleic acid polymer. In other embodiments, the sample may be formed of one or more strands of the nucleic acid polymer along with or separate from the complementary strand.
[00122] Conventional techniques may be used to form a complementary strand of a nucleic acid polymer and/or the polymer itself. Typically, the first step in forming the complementary strand is to obtain a single strand of a nucleic acid polymer. Any suitable technique may be used to obtain a single strand. Standard denaturing processes (e.g., thermal, enzymatic) which break the hydrogen bonding between the strands may be used. In other embodiments, a single strand can be created by synthesizing it from a template. For example, polymerase chain reaction (PCR) or reverse transcriptase processes that are well known in the art may be used. In other embodiments, a single strand may be chemically synthesized one nucleotide at a time, for example, in an oligonucleotide synthesis process. Such synthetic processes are well known in the art and can be automated. It is also possible to obtain a single strand by purifying it from a natural source, such as single stranded RNA from cells. Combinations of the foregoing (and other methods known to those of skill in the art) also can be used.
[00123] A complementary strand of a nucleic acid polymer can be created from the single strand using any suitable conventional technique. For example, standard polymerization techniques may be used including polymerase chain reaction (PCR) (e.g., standard PCR, long PCR protocols). The techniques generally involve exposing the single strand to an excess of nucleotides under the proper reaction conditions. The nucleotides may be labeled, as described further below, and shown schematically in FIG. 2. In some embodiments, single or multiple polymerase enzymes are used to facilitate reactions. Polymerase enzymes include DNA-dependent DNA polymerases (including thermostable enzymes such as Taq polymerase), RNA-dependent DNA polymerases (e.g., reverse transcriptases) and RNA-dependent RNA polymerases. In other embodiments, enzymes need not be used (e.g., in vitro chemical synthesis). Other suitable components (e.g., nucleotide primers, other enzymes such as primases, and the like) may also be present.
[00124] It should be understood that complementary strands may be modified to include other components that would not otherwise be present in a DNA strand. For example, the complementary strand may be modified to include labels (e.g., during formation) that facilitate detection and identification of nucleotides in methods of the invention. Labels (e.g., atoms or molecules) when exposed to a particle beam create characteristic particle beam species that may be detected and identified using the systems and methods of the invention. Similarly, the nucleic acid polymer also can be modified to include labels as described herein. This advantageously is done during synthesis of the nucleic acid, for example using PCR, which typically results in the synthesis of both strands (i.e., the nucleic acid polymer and its complementary strand).
[00125] When labels are present, it may be preferable to attach the labels to nucleotides of the complementary strand only (e.g., as shown in FIG. 2) or to both strands of the nucleic acid. Labels can be incorporated in the complementary strand only (e.g., using a single round of PCR) or in both strands of the nucleic acid (e.g., using two or more rounds of PCR). In certain embodiments, specific types of label are respectively attached to each type of nucleotide (e.g., cytosine triphosphate (CTP), adenosine triphosphate (ATP), thymine triphosphate (TTP), uracil triphosphate (UTP), guanosine triphosphate (GTP); conventionally these nucleotides as incorporated into nucleic acid molecules are referred to by a single letter, e.g., A, C, G, T or U). For example, for labeling DNA, a first type of label is attached to a first nucleotide type (e.g., CTP); a second type of label is attached to a second nucleotide type (e.g., ATP); a third type of label is attached to a third nucleotide type (e.g., TTP); and a fourth type of label is attached to a fourth nucleotide type (e.g., GTP). Thus, as described further below, nucleotide types may be identified by identifying a particular label or labels on the labeled nucleotide. Modified (non-natural) or atypical natural nucleotides also can be used, in which the bases, sugars or phosphate moieties can be different than those present in typical naturally occurring nucleotides (e.g., in A, C, G, T and U). One example of this is "locked" nucleic acids, which for example can be a bicyclic nucleic acid where a ribonucleoside is linked between the 2'-oxygen and the 4'-carbon atoms with a methylene unit. Mixtures of the foregoing can be employed in the invention.
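By way of illustration only, the one-label-per-nucleotide-type scheme of paragraph [00125] can be sketched as a lookup from detected label to nucleotide letter. The label names (label_1 through label_4), the dictionary, and the function name below are hypothetical placeholders and are not taken from this disclosure; the sketch assumes the labels have already been detected and ordered along the strand.

```python
# Hypothetical assignment of four distinguishable labels to the four DNA
# nucleotide types (first label -> C, second -> A, third -> T, fourth -> G).
LABEL_TO_NUCLEOTIDE = {
    "label_1": "C",
    "label_2": "A",
    "label_3": "T",
    "label_4": "G",
}


def decode_labels(observed_labels):
    """Translate an ordered list of detected labels into a nucleotide string."""
    letters = []
    for label in observed_labels:
        if label not in LABEL_TO_NUCLEOTIDE:
            raise ValueError(f"unrecognized label: {label}")
        letters.append(LABEL_TO_NUCLEOTIDE[label])
    return "".join(letters)


print(decode_labels(["label_2", "label_1", "label_4", "label_3"]))  # ACGT
```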
[00126] It should be understood that, as used herein, a "nucleotide" comprises a nitrogenous base, a sugar molecule (e.g., deoxyribose in DNA, ribose in RNA) and one or more (typically 1-3) linking groups (e.g., phosphate, peptide). A typical nucleotide is a nucleotide triphosphate, such as cytosine triphosphate as referred to above. As used herein, a "nucleoside" comprises a nitrogenous base and a sugar molecule, as described above, but no linking group. As used herein, a "base" comprises a nitrogenous base, but not the sugar molecule or linking group. Because of these composition differences, a nucleotide can be polymerized into a nucleic acid polymer, but a nucleoside or base cannot. As described further below, one advantage of certain embodiments of the present invention is that labels may be attached to nucleotides, which may be polymerized into nucleic acid polymer, as opposed to nucleic acid bases. Note, however, that a "base pair" is conventionally used to denote pairs of nucleotides that are bound in a sequence specific manner, e.g., Watson-Crick pairing such as A-T and C-G, in a double stranded nucleic acid polymer. However, this term also can refer to pairings of nucleosides or bases, which by definition are not part of nucleic acid polymers.
[00127] One of the advantages of having each nucleotide type bearing a unique label is that only a single "data read" is needed to obtain the sequence directly. Some interpretation as to which strand a given nucleotide is on may be required. Labeling each type of nucleotide uniquely also allows for some flexibility in data interpretation, as each base pair is identified twice: each nucleotide is identified directly and there are two nucleotides per base pair, which provides an internal control for the correctness of the data read and sequence.
[00128] In other embodiments, each nucleotide type (e.g., C, A, T, U, G) in a given strand bears a unique label, but the labels on the other strand are different. This can be accomplished by using different sets of labeled nucleotides in sequential PCR cycles, or other synthetic methods, and allows for greater ease in tracking the strand to which a nucleotide belongs.
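As a non-limiting sketch of the internal control mentioned in paragraph [00127], the two reads of a fully labeled double stranded DNA polymer can be compared position by position against Watson-Crick pairing. The function name and the assumption that both reads are already aligned (index i of each read refers to the same base pair) are illustrative and are not specified in this disclosure.

```python
# Watson-Crick partners for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatched_positions(strand_read, opposite_read):
    """Return the indices at which two aligned reads are not complementary.

    Both reads are assumed to be position-aligned, so that index i of each
    read refers to the same base pair.
    """
    if len(strand_read) != len(opposite_read):
        raise ValueError("reads must cover the same number of base pairs")
    return [
        i
        for i, (a, b) in enumerate(zip(strand_read, opposite_read))
        if COMPLEMENT.get(a) != b
    ]


print(mismatched_positions("ACGT", "TGCA"))  # []
print(mismatched_positions("ACGT", "TGAA"))  # [2]
```

An empty result indicates that every base pair was identified consistently on both strands; any reported index marks a position where the data read may need to be re-examined.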
[00129] In certain embodiments, not all nucleotide types need to be labeled. For example, if three nucleotide types (e.g., C, A, T) are labeled and the fourth (e.g., G) is unlabeled, then each "unlabeled" type may readily be identified as the fourth nucleotide type (e.g., G). The position of the unlabeled nucleotides can be inferred from observation of the distances between labeled nucleotides, given the highly regular spacing of nucleotides in nucleic acid polymers. In other embodiments, only two of the nucleotide types may be labeled. For example, a first set of sequencing data may be generated with two nucleotide types labeled (e.g., C, A) and a second set of sequencing data may be generated with the other two nucleotide types labeled (e.g., T, G). Both data sets may be processed to provide information regarding the entire sequence.
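The inference of unlabeled positions from regular spacing, described in paragraph [00129], can be illustrated with the following minimal sketch. It assumes that each labeled nucleotide is reported as a (distance from the first nucleotide, letter) pair, that the strand length in nucleotides is known, and that the spacing between adjacent nucleotides is uniform. The 0.34 nm value used in the example is the commonly cited rise per base pair of B-form DNA and is an assumption of this sketch, not a value stated in this disclosure; the function and parameter names are hypothetical.

```python
def fill_unlabeled(labeled, spacing, length, unlabeled="G"):
    """Place labeled nucleotides on a regular grid and fill the gaps.

    labeled   -- iterable of (distance, letter) pairs for labeled nucleotides,
                 with distance measured from the first nucleotide position
    spacing   -- assumed uniform distance between adjacent nucleotides
    length    -- number of nucleotides in the strand
    unlabeled -- letter assigned to every position where no label was seen
    """
    sequence = [unlabeled] * length
    for distance, letter in labeled:
        index = round(distance / spacing)
        if not 0 <= index < length:
            raise ValueError(f"distance {distance} falls outside the strand")
        sequence[index] = letter
    return "".join(sequence)


# Three labeled types (A, C, T) observed; the gap at index 2 is inferred as G.
observations = [(0.0, "A"), (0.34, "C"), (1.02, "T")]
print(fill_unlabeled(observations, spacing=0.34, length=4))  # ACGT
```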
[00130] Alternatively, by labeling only two nucleotides (e.g., A, C) on both strands of a nucleic acid polymer, the sequence of either strand can be inferred from the sequence of the other strand. For example, all labeled adenines in one strand of a double stranded nucleic acid polymer will be bound to thymines on the opposite strand in accordance with Watson-Crick nucleotide binding rules. Thus, observation of an adenine on one strand allows one to infer the existence of a thymine in the corresponding position of the other strand of a double stranded nucleic acid. The positions of other nucleotides can likewise be directly read or inferred from observing a double stranded nucleic acid that incorporates only two nucleotide-specific labels.
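The two-label inference of paragraph [00130] can be sketched as follows. The sketch assumes that only A and C carry labels on both strands, that each read is position-aligned with the other, and that an unlabeled position is written as "-". These conventions and the function name are illustrative assumptions, not part of this disclosure.

```python
# Watson-Crick partner of each labeled nucleotide type.
PARTNER = {"A": "T", "C": "G"}


def infer_strands(read_1, read_2):
    """Reconstruct both strands from aligned reads labeled only at A and C.

    Each read uses "A" or "C" where a label was observed and "-" elsewhere.
    At every base pair exactly one strand is expected to carry a label.
    """
    if len(read_1) != len(read_2):
        raise ValueError("reads must cover the same number of base pairs")
    strand_1, strand_2 = [], []
    for i, (x, y) in enumerate(zip(read_1, read_2)):
        if x in PARTNER and y == "-":
            strand_1.append(x)
            strand_2.append(PARTNER[x])
        elif y in PARTNER and x == "-":
            strand_1.append(PARTNER[y])
            strand_2.append(y)
        else:
            raise ValueError(f"inconsistent labeling at position {i}")
    return "".join(strand_1), "".join(strand_2)


print(infer_strands("AC--", "--CA"))  # ('ACGT', 'TGCA')
```

Because every A-T or C-G pair contains exactly one A or C, a position with no label on either strand, or with a label on both, signals a detection error rather than a valid base pair.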
[00131] The labels may be attached to nucleotides in a variety of different locations. In some embodiments, labels are attached to the nucleotides on, or within, the nitrogenous base (e.g., adenine, guanine, thymine, cytosine, uracil). For example, in these embodiments, labels may be attached to carbon/nitrogen rings in the base or may replace carbon or nitrogen atoms in the base. In other embodiments, labels are attached to the nucleotides on, or within, the sugar molecule (e.g., ribose in RNA, or deoxyribose in DNA). In other embodiments, labels are attached on, or within, linking groups of the nucleotides. For example, the labels may be attached on, or within, a phosphate linking group. The labels may be attached to oxygen substitutes, such as sulfur (e.g., alpha substituted phosphates, αS) or may replace the phosphorus atom at certain sites.
[00132] In certain embodiments, the labels are attached to the nucleotides by covalent bonding. As described further below, covalent bonding provides strong attachment between labels and nucleotides which can enable labeled samples to withstand exposure to relatively high particle beam energies (e.g., greater than about 50 kV for electron beams, for example about 80-120 kV) that may be important to detection and/or identification of nucleic acids.
[00133] In certain embodiments, it is preferable that the labels are attached to nucleotides prior to the nucleotides forming the complementary strand (and/or copies of the first strand of the nucleic acid polymer). In these embodiments, the labels may be selected from types, as described further below, that do not prevent polymerase reactions that form the complementary strand (and/or copies of the first strand of the nucleic acid polymer). Thus, in these cases, the complementary strand is labeled during its formation.
[00134] However, in other embodiments, it may be desired to attach additional labels to nucleotides after formation of the complementary strand (and/or copies of the first strand of the nucleic acid polymer). In these cases, the nucleotides may have been modified (prior to formation of the complementary strand and/or copies of the first strand of the nucleic acid polymer) to include a suitable attachment site which can be bound, preferably covalently, to a desired label type. After formation, the nucleic acid strand(s) may be exposed to the labels which attach to the sites.
[00135] In certain methods of the invention, the complementary strand is separated from the first strand to form a single complementary strand, as shown, which is used as the sample. The complementary strand may be separated from the first strand using conventional denaturing techniques (e.g., thermal, enzymatic). After separation, the first strand may be discarded, or may be retained and otherwise used.
[00136] In some cases, separation and use of the complementary strand can simplify detection and/or identification and/or quantitation in subsequent method steps. In some embodiments, however, the complementary strand and the first strand are not separated, and the double-stranded structure is used as a sample in the detection and/or identification steps.
[00137] In certain embodiments, when the complementary strand is separated from the first strand, the complementary strand is used as a template to create another strand which may be labeled. This can create a double-stranded structure which includes two labeled strands (i.e., the complementary strand and the new strand created from the complementary strand). In certain methods, this double-stranded structure is used as the sample in the detection and/or identification steps.
[00138] Methods of the invention may involve attaching a sample (e.g., complementary strand, complementary strand and first strand, complementary strand and new strand), or more than one sample, to a substrate. When more than one sample is attached, the samples may be the same (i.e., based on the same sequence) or different. In general, the substrate should be suitable for exposure to a particle beam. In embodiments in which particle beam species transmitted through the sample are detected, the substrate should permit sufficient transmission of the particle beam.
[00139] The substrate is generally thin to enable sufficient particle beam transmission therethrough. For example, the substrate may be less than 5 nanometers (nm); in some cases, less than 2 nm; or, even less than 1.5 or 1.1 nm. The substrate may be formed of a single layer or multiple layers. In certain cases, the layer(s) may be cross-linked. Conventional techniques can be used to form the substrates including vapor deposition and FIB milling, amongst others.
[00140] Suitable substrate materials are known to those of skill in the art and can include carbon (e.g., pure carbon, graphene, diamond), boron nitride (e.g., having a cubic structure), aluminum and certain polymeric resins (e.g., FORMVAR® (polyvinyl formal)). In other embodiments, the substrate is formed from organic materials such as a lipid, natural protein or synthetic protein. The substrate material may be doped with chemicals, for example, to cross-link layers or to facilitate attachment of the sample as described further below.
[00141] Samples may be attached to the substrate by chemically bonding at least a portion of the sample to the substrate. Suitable techniques are known to those of skill in the art. For example, molecules present on the surface of the substrate (e.g. , pre-existing as part of the substrate or following derivatization of the substrate) may be used to bind to the sample. The molecules may be nucleic acid sequence specific molecules (e.g. , oligonucleotides). In other cases, the substrate surface may be derivatized to provide attachment points that are sequence non-specific. In other cases, electrical charge may be used to bind the sample to the substrate surface. The attachment points for the samples can be spaced apart in a predetermined pattern, such as a grid or microarray.
[00142] A portion, or portions, of a sample may be attached to the substrate. In some cases, both ends of the sample (e.g. , complementary strand, complementary strand and first strand, complementary strand and new strand) may be attached; in other cases, only one end of the sample may be attached; in some cases, one or more non-end portions along the length of the sample may be attached. The attachment at the end(s) or along the length of the nucleic acid molecule(s) can be facilitated, if desired, by including in the nucleic acid during synthesis nucleotides capable of forming bonds with the substrate.
[00143] Certain methods of the invention involve substantially straightening a sample (e.g., labeled double strand) prior to, during, or even after, attachment to the substrate. This can facilitate detection and/or identification. The labeled double strand may be attached to the substrate, for example, via a linking bond to a bonding site as described further below.
Conventional techniques may be used to straighten the sample. For example, a sample may be straightened using fluid flow (e.g., molecular combing). The fluid may comprise one or more liquids, gases, or combinations thereof. In certain embodiments, the sample is attached and straightened by hybridization in a fluid flow to oligonucleotides present on the substrate surface. In some cases, electrical fields may be used (either in the presence of fluid flow, or alone) to promote sample straightening. In embodiments in which more than one sample is attached to the substrate, it may be preferred for each sample to be aligned substantially parallel to one another to facilitate exposure to the beam. Methods exist to perform molecular alignment of nucleic acid molecules in a thin or monolayer on a substrate. Some focus on isolating one or a few strands of materials and stretching them out for observation and genetic analysis. Examples of such methods are molecular combing using an air-water meniscus developed by the Pasteur Institute (e.g., US Patents to Bensimon et al. 5,840,862, 6,265,153 and 6,548,255) and a molecular alignment technique for optical mapping used by OpGen, Inc. Methods also exist to attach nucleic acid molecules in high density patterns on a substrate with a thickness of tens to millions of atoms. An example would be oligo synthesis or spotting on a microarray.
[00144] In certain embodiments, methods and compositions of the present disclosure may be combined with methods to perform high-density molecular alignment of nucleic acid molecules on substrates or surfaces as embodied in PCT Publication Nos. WO 2009/002506 A2 and WO 2010/144128 A2, entitled "High Density Molecular Alignment of Nucleic Acid Molecules," and "Molecular Alignment and Attachment of Nucleic Acid Molecules," respectively, both of which are incorporated herein by reference in their entirety.
[00145] In some embodiments, the disclosure provides compositions and methods aside from nucleic acid sequencing and/or identification, such as gene expression analysis.
Procedures used for gene expression are generally based on immobilizing mRNA or cDNA (prepared via reverse transcriptase PCR from mRNA) to microarrays, and estimating quantity from fluorescent images. Some of these procedures are described in U.S. Patent Nos. 5,405,783; 5,424,186; 5,445,934; 5,744,305; 6,261,776; 6,406,844; 6,416,952; 6,506,558; and 5,143,854.
[00146] One aspect of the disclosure provides a substrate having a combination of materials and dimensions that allows the substrate to have distinct physical properties. Specifically, in one embodiment, the materials and dimensions of the substrate allow it to be used for imaging samples with a particle beam instrument such as a transmission electron microscope. The substrate can include one or more ligands (e.g., nucleic acids, polypeptides, oligosaccharides, and synthetic polymers) which may form an array. Corresponding changes in labeling chemistry can allow for ligands, binding partners and other relevant materials to be identifiable, quantifiable, and even sequenceable via modified forms of electron microscopy. In certain embodiments, the array dimensions are on the order of nanometers per functional region rather than micrometers as in certain conventional arrays. With these dimensions, smaller amounts of sample material can be used and more accurate genetic analyses performed. These smaller substrate dimensions may also give rise to dramatically reduced production costs, amongst other advantages. The transparency of the substrate, due to thinness, material type and other factors, may provide a suitable contrast ratio between the labeled molecules and the substrate that results in higher quality readings and lower cost analysis than some conventional techniques.
[00147] Certain embodiments of the invention may be used for identification, quantification, sequencing, fingerprinting, and mapping of polymers, particularly biological polymers. Various embodiments of the invention may be applied, for example, in the sequencing, fingerprinting, identification, quantification, or mapping of nucleic acids, polypeptides, oligosaccharides, and synthetic polymers.
[00148] Aspects of the present disclosure may be combined with the description of certain embodiments in U.S. Patent Publication Nos. 2006/0024716, 2006/0024717, 2006/0024718, 2006/0029957, which correspond to PCT Application Publication No. WO06019903, all entitled, "Systems and Methods of Analyzing Nucleic Acid Polymers and Related Components," as well as 2007/0134699, which corresponds to WO07120202, entitled "Nano-Scale Ligand Arrays on Substrates for Particle Beam Instruments and Related Methods," each of which is incorporated herein by reference in its entirety. These references may provide, for example, methods and devices for incorporating contrast heavy atom labels in a biologic sample that are designed to interfere with a beam from a particle beam instrument. In certain embodiments, the labeled sample materials are binding partners, which can be bound to ligands in an array on a suitable substrate. A particle beam may be directed through the array and the labels can create interference patterns that are then read by a detector instrument and processed by a data analysis module.
[00149] Methods of the invention involve exposing the sample to a particle beam. In certain embodiments, it is preferred that the particle beam is a lepton beam such as an electron beam. In other cases, the particle beam may be an x-ray beam. In yet other embodiments, the particle beam may be an ion beam such as a helium or gallium ion beam. When an electron beam is used, a beam generator produces a beam having a desired voltage which, for example, can be greater than 50 kV, e.g., 80-300 kV, preferably 80-120 kV. Beam energies are a function of both voltage and current. The beam current typically ranges between 5 and 25 μA, preferably between 8 and 15 μA. The specific beam energy depends, in part, on the specific analysis being performed.
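One way to read the dependence on both voltage and current is through beam power, the product of accelerating voltage and beam current (P = V × I). That reading is an assumption made for this illustration and is not stated in the disclosure; the function name `beam_power_watts` is likewise hypothetical. The sketch below evaluates the product at the corners of the preferred ranges given above.

```python
def beam_power_watts(voltage_kv, current_ua):
    """Electrical beam power P = V * I, with V in kilovolts and I in microamperes."""
    return (voltage_kv * 1e3) * (current_ua * 1e-6)


# Corners of the preferred ranges quoted in the text: 80-120 kV and 8-15 uA.
for kv, ua in [(80, 8), (120, 15)]:
    print(f"{kv} kV at {ua} uA -> {beam_power_watts(kv, ua):.2f} W")
```

Under this reading, the preferred operating window corresponds to a beam power on the order of one watt.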
[00150] Methods can include properly focusing the beam on the sample using a lens arrangement as known to those of skill in the art. Methods may also include a calibration step. In certain cases, the system may be automatically calibrated based on known information from nucleic acid molecules in the sample (such as known molecular geometries and structures) using a feedback loop. For example, data obtained from a nucleic acid sample using an electron beam may include internucleotide (e.g., interlabel) distances. As used herein, an internucleotide distance is the distance from one nucleotide base in one strand to the adjacent nucleotide base in the same strand. While the internucleotide distances of, for example, a DNA molecule are generally known, the internucleotide distance in any given sample may not correspond to the generally known distance, but will typically be substantially uniform within a sample as affixed to a substrate, particularly a sample that has been straightened, e.g., by treatment using molecular combing or like methods. Thus, after obtaining a data read on a given sample, various aspects of the system can be calibrated or adjusted using a feedback control system. For example, knowing the internucleotide distances permits feedback relevant to focusing the particle beam and movement of the sample relative to the particle beam.
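A feedback check of this kind might be sketched as follows. This is a hypothetical illustration, not the disclosed control system: the function name `spacing_feedback`, the 10% tolerance, and the assumption that every adjacent nucleotide position has been located along a straightened strand are all choices made for the example.

```python
from statistics import mean, pstdev


def spacing_feedback(positions_nm, tolerance=0.10):
    """Summarize internucleotide spacing along one straightened strand.

    positions_nm: measured positions of adjacent nucleotides, in order.
    Returns (mean spacing, relative spread, within_tolerance). A large
    relative spread suggests the read is not uniform, which a feedback
    loop could treat as a cue to refocus or adjust sample movement.
    """
    gaps = [b - a for a, b in zip(positions_nm, positions_nm[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three positions")
    avg = mean(gaps)
    spread = pstdev(gaps) / avg  # coefficient of variation of the gaps
    return avg, spread, spread <= tolerance


print(spacing_feedback([0.0, 0.5, 1.0, 1.5]))
print(spacing_feedback([0.0, 0.5, 1.5, 1.7]))
```

The first call represents a uniform read and passes the tolerance check; the second has uneven gaps and fails it.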
[00151] Though systems of the invention may include several components similar to those of a conventional transmission electron microscope (e.g., beam generator, lens, etc.), certain systems of the invention may be simpler than typical conventional TEMs. For example, in some embodiments, the systems are simplified by limiting the magnification range, accelerating voltages, probe diameter, beam current, and sample flexibility, amongst other features. Also, problems related to spherical aberration in conventional TEMs may be limited, or eliminated, by using a lens arrangement that is pre-set for typical operating conditions for the system.
[00152] Characteristics of the particle beam are changed when the beam interacts with the sample. For example, one or more of the following characteristics of the particle beam may change: energy, direction, absorbance, reflection and deflection. Such changes may result from interactions between the particle beam and labels attached to nucleotides as described above. Specific types of labels may produce specific or characteristic changes. Thus, a label (and, the specific nucleotide to which it is attached) may be identified by recognizing the specific or characteristic beam changes.
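Identification by characteristic beam change can be pictured as a nearest-match lookup. The sketch below is purely illustrative: the reference signal values, the label-to-nucleotide assignment, and the function name `identify_label` are hypothetical and are not taken from the disclosure.

```python
# Hypothetical reference signals (arbitrary units) for two label types.
REFERENCE_SIGNAL = {"I": 0.53, "Hg": 0.80}

# Hypothetical assignment of each label type to the nucleotide it marks.
LABEL_TO_NUCLEOTIDE = {"I": "A", "Hg": "C"}


def identify_label(measured, references=REFERENCE_SIGNAL):
    """Return the label whose reference signal is closest to the measurement."""
    return min(references, key=lambda name: abs(references[name] - measured))


label = identify_label(0.55)
print(label, LABEL_TO_NUCLEOTIDE[label])
```

A practical system would need calibrated reference values and a way to reject measurements that match no label well; both are outside the scope of this sketch.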
[00153] A detector collects particle beam species after the interaction between the particle beam and the sample. The detector typically collects beam species that have been transmitted through the sample, though also can collect beam species that are reflected and/or scattered. The detector may include a charge coupled device (CCD). The CCD may directly convert the beam species into digital information. Technologies other than CCD technology may be used to convert the beam species into digital information, and are intended to fall within the scope of the invention.
[00154] In some embodiments of the invention, a nucleic acid polymer may be detected, and/or sequenced and/or identified based on particle beam species detected by a detector (e.g., the detector described above). Particle beam species may result from exposure of a sample comprising a nucleic acid polymer and/or its complementary strand to a particle beam (e.g., a lepton beam such as an electron beam). Methods, systems, computers, computer systems, computer storage media, software, and components for analyzing digital information generated by a detector, e.g., a CCD, are known in the art, and are exemplified in U.S. Patent Publication Nos. 2006/0024716, 2006/0024717, 2006/0024718, and 2006/0029957, which correspond to PCT Application Publication No. WO06019903, each of which is incorporated by reference in its entirety.
Compounds
[00155] Heavy-atom labeled compounds contemplated herein include, but are not limited to, heavy-atom labeled nucleosides, heavy-atom labeled nucleotides, and heavy-atom labeled nucleic acid polymers. Such compounds may be useful in the inventive methods as described herein.
[00156] In one aspect, provided are heavy-atom labeled compounds of Formula (I):
Figure imgf000051_0001
and salts thereof;
wherein:
each instance of G1 is independently -O-, -S-, -Se-, -CH2-, or -NH-;
each instance of G2 is independently hydrogen, halogen, -ORA, -SRA, -N(RA)2, -SHg, -SO2SHg, -SHgRD, -SeRD or -TeRD;
each instance of RA is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RA groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of M1 is independently -O-, -S-, -NH-, -Se-, or -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen;
each instance of G3 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-
wherein each instance of M2 is independently -O-, -S-, or -Se-; and
each instance of Base is independently:
Figure imgf000051_0002
Adenine, Guanine, Cytosine, Uracil, Thymine, or an analog thereof selected from the group consisting of:
Figure imgf000052_0001
Figure imgf000052_0002
Figure imgf000052_0003
Figure imgf000052_0004
Figure imgf000052_0005
wherein:
each instance of R1, R2, R4, and R5 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, wherein each instance of RB is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen group, or a sulfur protecting group when attached to a sulfur group; or R1 and R2 and/or R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of R3 is independently substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD wherein each instance of RC is hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RC groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of L1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20alkylene, substituted or unsubstituted C2-20alkenylene, substituted or unsubstituted C2-20alkynylene, substituted or unsubstituted heteroC1-20alkylene, substituted or unsubstituted heteroC2-20alkenylene, substituted or unsubstituted heteroC2-20alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
each instance of RD is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
each instance of M3 and M4 is independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2;
provided that the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
[00157] In another aspect, provided are nucleic acid polymers comprising one or more heavy-atom labeled units of Formula (II'), wherein one or more units may be the same or different:
Figure imgf000054_0001
wherein Base, G1, G2, G3, M1, and M2 are as defined herein.
[00158] An exemplary nucleic acid polymer is a heavy-atom labeled nucleic acid polymer of Formula (II):
Figure imgf000054_0002
and salts thereof; wherein Base, G1, G2, G3, M1, and M2 are as defined herein; and
n is 1 to 200,000, inclusive;
provided that the polymer comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
[00159] In certain embodiments of Formula (II), n is 1 to 180,000, inclusive; n is 1 to 160,000, inclusive; n is 1 to 140,000, inclusive; n is 1 to 120,000, inclusive; n is 1 to 100,000, inclusive; n is 1 to 80,000, inclusive; n is 1 to 60,000, inclusive; n is 1 to 40,000, inclusive; n is 1 to 20,000, inclusive; n is 1 to 18,000, inclusive; n is 1 to 16,000, inclusive; n is 1 to 14,000, inclusive; n is 1 to 12,000, inclusive; n is 1 to 10,000, inclusive; n is 1 to 9,000, inclusive; n is 1 to 8,000, inclusive; n is 1 to 7,000, inclusive; n is 1 to 6,000, inclusive; n is 1 to 5,000, inclusive; n is 1 to 4,000, inclusive; n is 1 to 3,000, inclusive; n is 1 to 2,000, inclusive; n is 1 to 1,000, inclusive; n is 1 to 900, inclusive; n is 1 to 800, inclusive; n is 1 to 700, inclusive; n is 1 to 600, inclusive; n is 1 to 500, inclusive; n is 1 to 400, inclusive; n is 1 to 300, inclusive; n is 1 to 200, inclusive; n is 1 to 100, inclusive; n is 1 to 90, inclusive; n is 1 to 80, inclusive; n is 1 to 70, inclusive; n is 1 to 60, inclusive; n is 1 to 50, inclusive; n is 1 to 40, inclusive; n is 1 to 30, inclusive; n is 1 to 20, inclusive; n is 1 to 10, inclusive; n is 1 to 5, inclusive; n is 5,000 to 15,000, inclusive; n is 5,000 to 50,000, inclusive; or n is 5,000 to 200,000, inclusive.
[00160] As depicted herein, it is understood that the compound of Formula (I), polymer of Formula (II), or unit of Formula (II') encompasses any number of stereoisomers. However, in certain embodiments, Formula (I), Formula (II), and Formula (II'), respectively, encompass the stereoisomers (I-a), (II-a), and (II'-a):
Figure imgf000055_0001
Figure imgf000055_0002
the enantiomer thereof, and/or salt thereof.
[00161] Such compounds may be employed using the inventive methods as described herein. However, in certain embodiments, any one of the following compounds of Formula (I) are specifically excluded:
Figure imgf000055_0003
and salts thereof.
[00162] In certain embodiments, nucleic acid polymers, such as polymers of Formula (II), comprising one or more units of the below formula are also specifically excluded:
Figure imgf000056_0001
and salts thereof.
[00163] Further compounds excluded include, but are not limited to, 2'MeSe-ATP, 2'-TePh, 2'-SeCR, and C5-TePh, and selenium compounds as disclosed in JP2008195648 and JP2007000032111.
Heavy atom labels
[00164] As generally described herein, the heavy-atom labeled compound of Formula (I), polymer of Formula (II), or unit of Formula (II'), comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury.
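The heavy-atom proviso can be checked mechanically against a molecular formula. The following is a minimal sketch under stated assumptions: the helper names `elements_in` and `has_heavy_atom` are hypothetical, the input is assumed to be a simple Hill-style formula string without parentheses or charges, and the two example formulas (5-bromo-2'-deoxyuridine and unlabeled thymidine) are chosen only for illustration.

```python
import re

# Heavy atoms named in the proviso: bromine, iodine, selenium, tellurium, mercury.
HEAVY_ATOMS = {"Br", "I", "Se", "Te", "Hg"}


def elements_in(formula):
    """Return the set of element symbols in a simple formula such as 'C9H11BrN2O5'."""
    return {symbol for symbol, _count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)}


def has_heavy_atom(formula):
    """True if the formula contains at least one of the listed heavy atoms."""
    return bool(elements_in(formula) & HEAVY_ATOMS)


print(has_heavy_atom("C9H11BrN2O5"))  # 5-bromo-2'-deoxyuridine
print(has_heavy_atom("C10H14N2O5"))   # thymidine, no heavy-atom label
```

The check only tests for presence of a listed element; it says nothing about where in the Base, sugar, or phosphate region the atom sits.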
[00165] In certain embodiments, there is only one instance of a heavy atom provided in a compound of Formula (I) or unit of Formula (II') as described herein. In certain embodiments, there is more than one instance of a heavy atom provided in a compound of Formula (I) or unit of Formula (II') as described herein (e.g., 2, 3, 4 or more instances). In certain preferred embodiments, the Base region comprises at least one instance of the heavy atom. In certain embodiments, the sugar region comprises at least one instance of the heavy atom. In certain embodiments, the phosphate region comprises at least one instance of the heavy atom. In embodiments, at least one instance of the heavy atom is provided in the Base region. Labeling in the Base region as described herein is contemplated to provide clearer and unambiguous imaging results compared to labeling elsewhere in the molecule.
[00166] In certain embodiments, the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of bromine. In certain embodiments, bromine is attached to a carbon atom which optionally comprises one or two additional instances of a halogen, e.g., -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I.
[00167] In certain embodiments, the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of iodine. In certain embodiments, iodine is attached to a carbon atom which optionally comprises one or two additional instances of a halogen, e.g., -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br.
[00168] In certain embodiments, the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of selenium, e.g., a divalent =Se or -Se- group, or a monovalent -SeRD group. In certain embodiments, the compound comprises a divalent =Se group. In certain embodiments, the compound comprises a divalent -Se- group. In certain embodiments, the compound comprises a monovalent -SeRD group.
[00169] In certain embodiments, the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of tellurium, e.g., a divalent =Te or -Te- group, or a monovalent -TeRD group. In certain embodiments, the compound comprises a divalent =Te group. In certain embodiments, the compound comprises a divalent -Te- group. In certain embodiments, the compound comprises a monovalent -TeRD group.
[00170] As understood from the above, in certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD.
[00171] In certain embodiments, the compound comprises at least one instance of -SeRD or -TeRD, wherein RD is hydrogen, i.e. , to provide -SeH or -TeH.
[00172] In certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD, wherein RD is substituted or unsubstituted C1-20alkyl, e.g., RD is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, RD is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, RD is alkyl substituted with one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, RD is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, RD is -CX3, wherein X is halogen. In certain embodiments, RD is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, RD is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, RD is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00173] In certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD, wherein RD is substituted or unsubstituted C2-20alkenyl, e.g., RD is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, RD is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, RD is alkenyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, RD is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, RD is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, or -CH2CX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I.
[00174] In certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD, wherein RD is substituted or unsubstituted C2-20alkynyl, e.g., RD is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, RD is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, RD is alkynyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, RD is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, RD is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00175] In certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD, wherein RD is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, RD is carbocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00176] In certain embodiments, the compound comprises at least one instance of -SeRD or -TeRD, wherein RD is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, RD is heterocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00177] In certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD, wherein RD is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, RD is aryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted haloaryl. In certain embodiments, RD is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00178] In certain embodiments, the compound comprises at least one instance of -SHgRD, -SeRD, or -TeRD, wherein RD is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, RD is heteroaryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RD is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00179] In certain embodiments, the compound comprises at least one instance (e.g., 1, 2, 3, 4 or more instances) of mercury; for example, in certain embodiments, the compound comprises at least one instance of -SHg, -SO2SHg, or -SHgRD (e.g., -SHgMe).
[00180] In certain embodiments, one or more heavy atoms (e.g., 1, 2, 3, 4, or more heavy atoms) are present in the sugar region of the compound. In certain embodiments, one or more heavy atoms (e.g., 1, 2, 3, or 4 heavy atoms) are present in the phosphate region of the compound. In certain embodiments, one or more heavy atoms (e.g., 1, 2, 3, or 4 heavy atoms) are present in the base region of the compound.
[00181] In the instance wherein the compound is a nucleotide of Formula (I), in certain embodiments, the compound may comprise heavy atoms in the sugar, the phosphate, or the base region. In certain embodiments, the compound may comprise heavy atoms only in the sugar or phosphate region. In certain embodiments, the compound may comprise heavy atoms only in the base. In certain embodiments, the compound may comprise no heavy atoms in the base, and in that instance, heavy atoms are necessarily present in the sugar or phosphate region.
[00182] In the instance wherein the compound is a nucleic acid polymer comprising one or more units of Formula (II'), such as a compound of Formula (II), the 5' and/or 3' terminating group and/or one or more repeating units, e.g., 1 to 25,000 units, of the nucleic acid polymer may comprise heavy atoms. In certain embodiments, the nucleic acid polymer comprises one or more instances of a heavy-atom labeled nucleotide in combination with one or more instances of an unlabeled nucleotide. In certain embodiments, there are multiple instances, e.g., 2 or more instances, of the same heavy-atom labeled nucleotide. In certain embodiments, each instance of a particular nucleotide, for example, A, G, T, C, or U, is replaced with a different heavy-atom labeled nucleotide as described herein. For example, in certain embodiments, each instance of A is replaced with a heavy-atom labeled nucleotide as described herein. In certain embodiments, each instance of G is replaced with a heavy-atom labeled nucleotide as described herein. In certain embodiments, each instance of T is replaced with a heavy-atom labeled nucleotide as described herein. In certain embodiments, each instance of C is replaced with a heavy-atom labeled nucleotide as described herein. In certain embodiments, each instance of U is replaced with a heavy-atom labeled nucleotide as described herein. In these instances, in certain embodiments, one of the heavy-atom labeled compounds is labeled in the sugar or phosphate region, and one of the heavy-atom labeled compounds is labeled in the base region, in order to better distinguish between A, G, T, C, or U. In certain embodiments, one of the heavy-atom labeled compounds is labeled in the sugar or phosphate region with one type of label, and one of the heavy-atom labeled compounds is labeled in the base region with a different type of label, in order to better distinguish between A, G, T, C, or U.
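The per-base replacement scheme described in the preceding paragraph can be illustrated with a short sketch. It is only a bookkeeping illustration under stated assumptions: the `Label` type, the `label_sequence` function, and the example scheme (mercury in the sugar region for A, iodine in the base region for C) are hypothetical names and choices made here for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Label:
    """Hypothetical description of a heavy-atom label (illustrative only)."""
    heavy_atom: str  # e.g. "Hg", "Se", "Te", "I"
    region: str      # "sugar", "phosphate", or "base"


def label_sequence(
    sequence: str, scheme: Dict[str, Label]
) -> List[Tuple[str, Optional[Label]]]:
    """Pair every base with its label, or None if that base stays unlabeled.

    Every instance of a base listed in `scheme` receives the same label,
    mirroring "each instance of A is replaced with a heavy-atom labeled
    nucleotide".
    """
    allowed = set("AGTCU")
    result = []
    for base in sequence.upper():
        if base not in allowed:
            raise ValueError(f"unexpected base: {base!r}")
        result.append((base, scheme.get(base)))
    return result


# Assumed example scheme: two bases labeled in different regions with
# different label types, the remaining bases left unlabeled.
scheme = {
    "A": Label(heavy_atom="Hg", region="sugar"),
    "C": Label(heavy_atom="I", region="base"),
}
labeled = label_sequence("GATTACA", scheme)
for base, label in labeled:
    print(base, label if label is not None else "unlabeled")
```

Labeling different bases in different regions, as in this assumed scheme, corresponds to the embodiments in which one labeled compound carries its label in the sugar or phosphate region and another carries a different label in the base region.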
The Sugar Region and Groups G1, G2, and M1
[00183] As generally described herein, the "sugar region" of the heavy-atom labeled compound of Formula (I), polymer of Formula (II), or unit of Formula (II'), may comprise a heavy atom, or may not comprise a heavy atom. If the sugar region does not comprise a heavy atom, the phosphate and/or base region of the heavy-atom labeled compound of Formula (I), polymer of Formula (II), or unit of Formula (II') comprises a heavy atom.
Figure imgf000061_0001
"sugar region" of Formula (I) "sugar region" of Formula (II)
[00184] As generally described herein, each instance of G1 is independently -O-, -S-, -Se-, -CH2-, or -NH-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G1 is -O-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G1 is -S-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G1 is -Se-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G1 is -CH2-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G1 is -NH-.
[00185] As generally described herein, each instance of G2 is independently hydrogen, halogen, -ORA, -SRA, -N(RA)2, -SHg, -SO2SHg, -SHgRD, -SeRD or -TeRD.
[00186] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G2 is hydrogen.
[00187] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G2 is halogen, i.e., G2 is -Br, -I, -F, or -Cl.
[00188] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G2 is -ORA, wherein RA is hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or an oxygen protecting group.
[00189] In certain embodiments, G2 is -ORA and RA is hydrogen, i.e., G2 is -OH.
[00190] In certain embodiments, G2 is -ORA and RA is an oxygen protecting group, as defined herein.
[00191] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C1-20alkyl, e.g., G2 is -ORA and RA is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, RA is alkyl substituted with one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, RA is -CX3, wherein X is halogen. In certain embodiments, RA is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, RA is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, RA is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00192] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C2-20alkenyl, e.g., G2 is -ORA and RA is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, RA is alkenyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, RA is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, or -CH2CX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I.
[00193] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C2-20alkynyl, e.g., G2 is -ORA and RA is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, RA is alkynyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, RA is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00194] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, RA is carbocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00195] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, RA is heterocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00196] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, RA is aryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted haloaryl. In certain embodiments, RA is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00197] In certain embodiments, G2 is -ORA and RA is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, RA is heteroaryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00198] In certain embodiments, G2 is -SRA and RA is hydrogen, i.e., G2 is -SH.
[00199] In certain embodiments, G2 is -SRA and RA is a sulfur protecting group, as defined herein.
[00200] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C1-20alkyl, e.g., G2 is -SRA and RA is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, RA is alkyl substituted with one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, RA is -CX3, wherein X is halogen. In certain embodiments, RA is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, RA is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, RA is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00201] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C2-20alkenyl, e.g., G2 is -SRA and RA is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, RA is alkenyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, RA is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, or -CH2CX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I.
[00202] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C2-20alkynyl, e.g., G2 is -SRA and RA is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, RA is alkynyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, RA is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00203] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, RA is carbocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00204] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, RA is heterocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00205] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, RA is aryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted haloaryl. In certain embodiments, RA is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00206] In certain embodiments, G2 is -SRA and RA is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, RA is heteroaryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00207] In certain embodiments, G2 is -N(RA)2 and at least one RA is hydrogen, i.e., G2 is -NHRA or -NH2.
[00208] In certain embodiments, G2 is -N(RA)2 and at least one RA is a nitrogen protecting group, as defined herein.
[00209] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C1-20alkyl, e.g., G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, at least one RA is alkyl substituted with one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one RA is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, at least one RA is -CX3, wherein X is halogen. In certain embodiments, at least one RA is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, at least one RA is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one RA is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00210] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2-20alkenyl, e.g., G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, at least one RA is alkenyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RA is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, at least one RA is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, or -CH2CX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I.
[00211] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2-20alkynyl, e.g., G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, at least one RA is alkynyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RA is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, at least one RA is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00212] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, at least one RA is carbocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RA is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00213] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, at least one RA is heterocyclyl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RA is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00214] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, at least one RA is aryl substituted with one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RA is substituted or unsubstituted haloaryl. In certain embodiments, at least one RA is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho, meta, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00215] In certain embodiments, G2 is -N(RA)2 and at least one RA is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, at least one RA is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RA is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00216] In certain embodiments, G2 is -N(RA)2, and two RA groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6-membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
[00217] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G2 is -SHg, -SO2SHg, or -SHgRD, wherein RD is as defined herein.
[00218] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G2 is -SeRD, wherein RD is as defined herein.
[00219] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G2 is -TeRD, wherein RD is as defined herein.
[00220] As generally described herein, each instance of M1 is independently -O-, -S-, -NH-, -Se-, or -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M1 is -O-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M1 is -S-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M1 is -NH-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M1 is -Se-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M1 is -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen. In certain embodiments, each instance of RM is hydrogen. In certain embodiments, at least one instance of RM is halogen, e.g., -Br, -I, -F, or -Cl.
[00221] Various combinations of the above described embodiments of the "sugar regions" are further contemplated herein.
[00222] For example, in certain embodiments, each instance of G1 is O to provide a compound of Formula (I-b), polymer of Formula (II-b), or unit of Formula (II'-b):
Figure imgf000071_0001
Figure imgf000072_0001
or salt thereof. In certain embodiments, G2 is hydrogen. In certain embodiments, G2 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, G2 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 is -TeRD, e.g., -TeCX3, wherein X is halogen.
[00223] In certain embodiments, each instance of G1 and M1 is O to provide a compound of Formula (I-c), polymer of Formula (II-c), or unit of Formula (II'-c):
Figure imgf000072_0002
or salt thereof. In certain embodiments, G2 is hydrogen. In certain embodiments, G2 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, G2 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is -SeCBr3 or -TeBr3.
[00224] In certain embodiments, each instance of G1 and M1 is O to provide a compound of Formula (I-d), polymer of Formula (II-d), or unit of Formula (II'-d) with the specified stereochemistry:
Figure imgf000073_0001
Figure imgf000073_0002
Figure imgf000073_0003
or the enantiomer thereof and/or salt thereof. In certain embodiments, G2 is hydrogen. In certain embodiments, G2 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, G2 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 is -TeRD, e.g., -TeCX3, wherein X is halogen.
The Phosphate Region and G3
[00225] As generally described herein, each instance of G3 independently is hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or may be a monophosphate, diphosphate, or triphosphate group.
[00226] In certain embodiments, at least one instance of G3 is hydrogen.
[00227] In certain embodiments, at least one instance of G3 is substituted or unsubstituted C1-20 alkyl, e.g., substituted or unsubstituted C1-18 alkyl, substituted or unsubstituted C1-16 alkyl, substituted or unsubstituted C1-14 alkyl, substituted or unsubstituted C1-12 alkyl, substituted or unsubstituted C1-10 alkyl, substituted or unsubstituted C1-8 alkyl, substituted or unsubstituted C1-6 alkyl, substituted or unsubstituted C1-4 alkyl, substituted or unsubstituted C1-3 alkyl, or substituted or unsubstituted C1-2 alkyl. In certain embodiments, at least one instance of G3 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6 alkyl. In certain embodiments, at least one instance of G3 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted C1-20 haloalkyl, substituted or unsubstituted C1-18 haloalkyl, substituted or unsubstituted C1-16 haloalkyl, substituted or unsubstituted C1-14 haloalkyl, substituted or unsubstituted C1-12 haloalkyl, substituted or unsubstituted C1-10 haloalkyl, substituted or unsubstituted C1-8 haloalkyl, substituted or unsubstituted C1-6 haloalkyl, substituted or unsubstituted C1-4 haloalkyl, substituted or unsubstituted C1-3 haloalkyl, or substituted or unsubstituted C1-2 haloalkyl. In certain embodiments, at least one instance of G3 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6 haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, RD is -CX3, wherein X is halogen. In certain embodiments, at least one instance of G3 is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, at least one instance of G3 is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one instance of G3 is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00228] In certain embodiments, at least one instance of G3 is substituted or unsubstituted C2-20 alkenyl, e.g., at least one instance of G3 is substituted or unsubstituted C2-18 alkenyl, substituted or unsubstituted C2-16 alkenyl, substituted or unsubstituted C2-14 alkenyl, substituted or unsubstituted C2-12 alkenyl, substituted or unsubstituted C2-10 alkenyl, substituted or unsubstituted C2-8 alkenyl, substituted or unsubstituted C2-6 alkenyl, substituted or unsubstituted C2-4 alkenyl, or substituted or unsubstituted C2-3 alkenyl. In certain embodiments, at least one instance of G3 is substituted or unsubstituted C2, C3, C4, C5, or C6 alkenyl. In certain embodiments, at least one instance of G3 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted C2-20 haloalkenyl, substituted or unsubstituted C2-18 haloalkenyl, substituted or unsubstituted C2-16 haloalkenyl, substituted or unsubstituted C2-14 haloalkenyl, substituted or unsubstituted C2-12 haloalkenyl, substituted or unsubstituted C2-10 haloalkenyl, substituted or unsubstituted C2-8 haloalkenyl, substituted or unsubstituted C2-6 haloalkenyl, substituted or unsubstituted C2-4 haloalkenyl, or substituted or unsubstituted C2-3 haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, at least one instance of G3 is substituted or unsubstituted C2, C3, C4, C5, or C6 haloalkenyl. In certain embodiments, at least one instance of G3 is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, or -CH2CX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I.
[00229] In certain embodiments, at least one instance of G3 is substituted or unsubstituted C2-20 alkynyl, e.g., at least one instance of G3 is substituted or unsubstituted C2-18 alkynyl, substituted or unsubstituted C2-16 alkynyl, substituted or unsubstituted C2-14 alkynyl, substituted or unsubstituted C2-12 alkynyl, substituted or unsubstituted C2-10 alkynyl, substituted or unsubstituted C2-8 alkynyl, substituted or unsubstituted C2-6 alkynyl, substituted or unsubstituted C2-4 alkynyl, or substituted or unsubstituted C2-3 alkynyl. In certain embodiments, at least one instance of G3 is substituted or unsubstituted C2, C3, C4, C5, or C6 alkynyl. In certain embodiments, at least one instance of G3 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted C2-20 haloalkynyl, substituted or unsubstituted C2-18 haloalkynyl, substituted or unsubstituted C2-16 haloalkynyl, substituted or unsubstituted C2-14 haloalkynyl, substituted or unsubstituted C2-12 haloalkynyl, substituted or unsubstituted C2-10 haloalkynyl, substituted or unsubstituted C2-8 haloalkynyl, substituted or unsubstituted C2-6 haloalkynyl, substituted or unsubstituted C2-4 haloalkynyl, or substituted or unsubstituted C2-3 haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, at least one instance of G3 is substituted or unsubstituted C2, C3, C4, C5, or C6 haloalkynyl. In certain embodiments, at least one instance of G3 is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00230] In certain embodiments, at least one instance of G3 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3 carbocyclyl, substituted or unsubstituted C4 carbocyclyl, substituted or unsubstituted C5 carbocyclyl, or substituted or unsubstituted C6 carbocyclyl. In certain embodiments, at least one instance of G3 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted C3 halocarbocyclyl, substituted or unsubstituted C4 halocarbocyclyl, substituted or unsubstituted C5 halocarbocyclyl, or substituted or unsubstituted C6 halocarbocyclyl.
[00231] In certain embodiments, at least one instance of G3 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, at least one instance of G3 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00232] In certain embodiments, at least one instance of G3 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, at least one instance of G3 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted haloaryl. In certain embodiments, at least one instance of G3 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00233] In certain embodiments, at least one instance of G3 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, at least one instance of G3 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one instance of G3 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00234] In certain embodiments, at least one instance of G3 is a monophosphate, diphosphate, or triphosphate, referred to herein as the "phosphate region" of Formula (I) and (II). In certain embodiments, wherein group G3 is a monophosphate, diphosphate, or triphosphate, G3 may comprise a heavy atom, or may not comprise a heavy atom. If the "phosphate region" does not comprise a heavy atom, the sugar and/or base region of Formula (I) or (II) comprises a heavy atom.
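The requirement stated above, that a heavy atom be present in at least one of the phosphate, sugar, or base regions, can be illustrated with a minimal Python sketch. The `Regions` class, the function names, and the contents of `HEAVY_ATOMS` are hypothetical and are introduced here only for illustration; the governing definition of "heavy atom" is the one given elsewhere in the specification, not this set.

```python
from dataclasses import dataclass

# Assumption for illustration only: a small set of element symbols treated as
# "heavy". The specification's own definition of "heavy atom" controls.
HEAVY_ATOMS = frozenset({"Br", "I", "Hg", "Se", "Te"})


@dataclass(frozen=True)
class Regions:
    """Element symbols present in each region (hypothetical representation)."""
    phosphate: frozenset
    sugar: frozenset
    base: frozenset


def comprises_heavy_atom(elements):
    """Return True if any element symbol in the region is in HEAVY_ATOMS."""
    return any(symbol in HEAVY_ATOMS for symbol in elements)


def satisfies_heavy_atom_requirement(regions):
    """Return True if at least one of the three regions comprises a heavy atom."""
    return any(
        comprises_heavy_atom(region)
        for region in (regions.phosphate, regions.sugar, regions.base)
    )
```

Under these assumptions, a compound whose phosphate region contains only P, O, and H would still satisfy the check if, for example, its sugar region contains Se.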
[00235] As generally described herein, each instance of the "phosphate region" G3 is independently a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-,
wherein each instance of M2 is independently -O-, -S-, or -Se-.
[00236] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M2 is -O-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M2 is -S-. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of M2 is -Se-.
[00237] In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G3 is a monophosphate group. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G3 is a diphosphate group. In certain embodiments, at least one instance (e.g., 1, 2, 3, 4 or more instances, or each instance) of G3 is a triphosphate group. In certain embodiments, each instance of G3 is a triphosphate group.
[00238] It is understood that the compound of Formula (I) and (II) may also be provided as a salt, and in this instance, in certain embodiments, the monophosphate, diphosphate, or triphosphate groups may be provided as a salt form:
YO-P(=O)(M2-Y)-, YO-P(=O)(OY)-O-P(=O)(M2-Y)-, or YO-P(=O)(OY)-O-P(=O)(OY)-O-P(=O)(M2-Y)-,
wherein M2 is as defined herein, and each Y is independently hydrogen or an electropositive group (e.g., a quaternary amine, an amino acid, a metal), provided at least one instance of Y (e.g., at least 1, 2, 3, or all instances of Y) is an electropositive group in order to provide the salt. In certain embodiments, at least one instance of Y is a metal. Exemplary metals include alkali metals (e.g., Li, Na, K, Cs), alkaline earth metals (e.g., Mg, Ca, Ba), and transition metals (e.g., Hg). In certain embodiments, at least one instance of Y is a quaternary amine (e.g., ammonium, NH4+). In certain embodiments, at least one instance of Y is an amino acid having a net positive charge, e.g., wherein the zwitterionic form which predominates in equilibrium is the amino acid with a quaternized alpha-amino group and the protonated alpha-carboxylic acid group. Exemplary amino acids include, but are not limited to, arginine, histidine, lysine, aspartic acid, glutamic acid, serine, threonine, asparagine, glutamine, cysteine, selenocysteine, glycine, proline, alanine, valine, isoleucine, leucine, methionine, phenylalanine, tyrosine, and tryptophan.
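The proviso on Y, that at least one instance be an electropositive group rather than hydrogen, can be sketched in Python as follows. The string labels, the `EXEMPLARY_Y` table, and both function names are hypothetical conveniences; the table lists only the exemplary metals and quaternary amine named above and is not an exhaustive definition.

```python
# Exemplary electropositive groups taken from the paragraph above; the string
# labels (e.g., "NH4+") are an assumed encoding used only in this sketch.
EXEMPLARY_Y = {
    "alkali metal": {"Li", "Na", "K", "Cs"},
    "alkaline earth metal": {"Mg", "Ca", "Ba"},
    "transition metal": {"Hg"},
    "quaternary amine": {"NH4+"},
}


def classify_y(y):
    """Classify one Y label; "H" denotes hydrogen (assumed encoding)."""
    if y == "H":
        return "hydrogen"
    for category, members in EXEMPLARY_Y.items():
        if y in members:
            return category
    # Anything else (e.g., a positively charged amino acid) is treated here
    # as some other electropositive group.
    return "other electropositive group"


def is_salt_form(y_groups):
    """Return True if at least one Y is an electropositive group."""
    return any(classify_y(y) != "hydrogen" for y in y_groups)
```

For example, under this encoding a triphosphate with Y values of H, Na, H, and H would count as a salt form, whereas one in which every Y is H would not.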
[00239] Various combinations of the above described embodiments of the "phosphate region" are further contemplated herein.
[00240] In certain embodiments, each instance of G3 is independently a hydrogen or a triphosphate to provide a compound of Formula (I-e1), (I-e2), (II-e1), (II-e2), (II-e3), or (II-e4):
Figure imgf000078_0001
Figure imgf000078_0002
Figure imgf000078_0003
Figure imgf000078_0004
Figure imgf000078_0005
Figure imgf000078_0006
or salt thereof. In certain embodiments, G1 is O. In certain embodiments, M1 is O. In certain embodiments, G2 is hydrogen. In certain embodiments, G2 is -SHg or -SO2SHg. In certain embodiments, G2 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is -SeCBr3 or -TeBr3. In certain embodiments, M2 is O. In certain embodiments, the compound is a salt, e.g., a salt of a quaternary amine, an amino acid, or a metal.
[00241] In certain embodiments, each instance of G3 is independently a hydrogen or a triphosphate, and G1, M1, and M2 are O, to provide a compound of Formula (I-f1), (I-f2), (II-f1), (II-f2), (II-f3), or (II-f4):
Figure imgf000079_0001
Figure imgf000079_0002
Figure imgf000079_0003
Figure imgf000079_0004
(II-f4),
or salt thereof. In certain embodiments, G2 is hydrogen. In certain embodiments, G2 is -SHg or -SO2SHg. In certain embodiments, G2 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is -SeCBr3 or -TeBr3. In certain embodiments, the compound is a salt, e.g., a salt of a quaternary amine, an amino acid, or a metal.
[00242] In certain embodiments, each instance of G3 is independently a hydrogen or a triphosphate, and G1, M1, and M2 are O, to provide a compound of Formula (I-g1), (I-g2), (II-g1), (II-g2), (II-g3), or (II-g4) with the specified stereochemistry:
Figure imgf000080_0001
Figure imgf000080_0002
Figure imgf000081_0001
or the enantiomer thereof and/or salt thereof. In certain embodiments, G2 is hydrogen. In certain embodiments, G2 is -SHg or -SO2SHg. In certain embodiments, G2 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is -SeCBr3 or -TeBr3. In certain embodiments, the compound is a salt, e.g., a salt of a quaternary amine, an amino acid, or a metal.
The Base Region
[00243] As generally described herein, the "base region" of a compound of Formula (I-b), polymer of Formula (II-b), or unit of Formula (II'-b) may comprise a heavy atom, or may not comprise a heavy atom. If the "base region" does not comprise a heavy atom, the phosphate and/or sugar region of a compound of Formula (I-b), polymer of Formula (II-b), or unit of Formula (II'-b) comprises a heavy atom.
[00244] In certain embodiments, the Base does not comprise a heavy atom, and is selected from the group consisting of:
Figure imgf000081_0002
Adenine, Guanine, Cytosine, Uracil, and Thymine.
A nucleic acid polymer, such as a polymer of Formula (II), may have one or more instances of any of the above formula.
[00245] In certain embodiments, the Base is an analog of adenine and guanine, and which optionally comprises a heavy atom, selected from the group consisting of:
Figure imgf000082_0001
wherein R1, R2, R3, L1, R4, R5, and M3 are as defined herein. A nucleic acid polymer, such as a polymer of Formula (II), may have one or more instances of any of the above formula.
[00246] In certain embodiments, the Base is an analog of cytosine, uracil, and thymine, and which optionally comprises a heavy atom, selected from the group consisting of:
Figure imgf000082_0002
(x), wherein R3, L1, R4, R5, M3, and M4 are as defined herein. A nucleic acid polymer, such as a polymer of Formula (II), may have one or more instances of any of the above formula.
(i) Groups R1 and R2
[00247] In certain embodiments of formula (iii), (iv), (v), and (vi), each instance of R1 and R2 is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, or R1 and R2 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
[00248] In certain embodiments, R1 is hydrogen. In certain embodiments, both R1 and R2 are hydrogen.
[00249] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C1-20 alkyl, e.g., at least one of R1 and R2 is substituted or unsubstituted C1-18 alkyl, substituted or unsubstituted C1-16 alkyl, substituted or unsubstituted C1-14 alkyl, substituted or unsubstituted C1-12 alkyl, substituted or unsubstituted C1-10 alkyl, substituted or unsubstituted C1-8 alkyl, substituted or unsubstituted C1-6 alkyl, substituted or unsubstituted C1-4 alkyl, substituted or unsubstituted C1-3 alkyl, or substituted or unsubstituted C1-2 alkyl. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6 alkyl. In certain embodiments, at least one of R1 and R2 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted C1-20 haloalkyl, substituted or unsubstituted C1-18 haloalkyl, substituted or unsubstituted C1-16 haloalkyl, substituted or unsubstituted C1-14 haloalkyl, substituted or unsubstituted C1-12 haloalkyl, substituted or unsubstituted C1-10 haloalkyl, substituted or unsubstituted C1-8 haloalkyl, substituted or unsubstituted C1-6 haloalkyl, substituted or unsubstituted C1-4 haloalkyl, substituted or unsubstituted C1-3 haloalkyl, or substituted or unsubstituted C1-2 haloalkyl. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6 haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, at least one of R1 and R2 is -CX3, wherein X is halogen. In certain embodiments, at least one of R1 and R2 is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, at least one of R1 and R2 is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one of R1 and R2 is -CBr3, -CI3, -CFClBr, or -CClBrI. In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00250] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C2-20 alkenyl, e.g., substituted or unsubstituted C2-18 alkenyl, substituted or unsubstituted C2-16 alkenyl, substituted or unsubstituted C2-14 alkenyl, substituted or unsubstituted C2-12 alkenyl, substituted or unsubstituted C2-10 alkenyl, substituted or unsubstituted C2-8 alkenyl, substituted or unsubstituted C2-6 alkenyl, substituted or unsubstituted C2-4 alkenyl, or substituted or unsubstituted C2-3 alkenyl. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C2, C3, C4, C5, or C6 alkenyl. In certain embodiments, at least one of R1 and R2 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted C2-20 haloalkenyl, substituted or unsubstituted C2-18 haloalkenyl, substituted or unsubstituted C2-16 haloalkenyl, substituted or unsubstituted C2-14 haloalkenyl, substituted or unsubstituted C2-12 haloalkenyl, substituted or unsubstituted C2-10 haloalkenyl, substituted or unsubstituted C2-8 haloalkenyl, substituted or unsubstituted C2-6 haloalkenyl, substituted or unsubstituted C2-4 haloalkenyl, or substituted or unsubstituted C2-3 haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C2, C3, C4, C5, or C6 haloalkenyl. In certain embodiments, at least one of R1 and R2 is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, -CH2CX=CH2, -CX2CH=CH2, -CX2CX=CX2, -CX2CH=CX2, -CX2CX=CHX, -CX2CH=CHX, -CX2CX=CH2, -CHXCH=CH2, -CHXCX=CX2, -CHXCH=CX2, -CHXCX=CHX, -CHXCH=CHX, or -CHXCX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I. In certain embodiments, the alkenyl group is trans or the E-isomer. In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00251] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C2-20 alkynyl, e.g., at least one of R1 and R2 is substituted or unsubstituted C2-18 alkynyl, substituted or unsubstituted C2-16 alkynyl, substituted or unsubstituted C2-14 alkynyl, substituted or unsubstituted C2-12 alkynyl, substituted or unsubstituted C2-10 alkynyl, substituted or unsubstituted C2-8 alkynyl, substituted or unsubstituted C2-6 alkynyl, substituted or unsubstituted C2-4 alkynyl, or substituted or unsubstituted C2-3 alkynyl. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C2, C3, C4, C5, or C6 alkynyl. In certain embodiments, at least one of R1 and R2 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted C2-20 haloalkynyl, substituted or unsubstituted C2-18 haloalkynyl, substituted or unsubstituted C2-16 haloalkynyl, substituted or unsubstituted C2-14 haloalkynyl, substituted or unsubstituted C2-12 haloalkynyl, substituted or unsubstituted C2-10 haloalkynyl, substituted or unsubstituted C2-8 haloalkynyl, substituted or unsubstituted C2-6 haloalkynyl, substituted or unsubstituted C2-4 haloalkynyl, or substituted or unsubstituted C2-3 haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted C2, C3, C4, C5, or C6 haloalkynyl. In certain embodiments, at least one of R1 and R2 is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I. In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00252] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3 carbocyclyl, substituted or unsubstituted C4 carbocyclyl, substituted or unsubstituted C5 carbocyclyl, or substituted or unsubstituted C6 carbocyclyl. In certain embodiments, at least one of R1 and R2 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted C3 halocarbocyclyl, substituted or unsubstituted C4 halocarbocyclyl, substituted or unsubstituted C5 halocarbocyclyl, or substituted or unsubstituted C6 halocarbocyclyl. In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00253] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, at least one of R1 and R2 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl. In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00254] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, at least one of R1 and R2 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted haloaryl. In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring). In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00255] In certain embodiments, at least one of R1 and R2 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, at least one of R1 and R2 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R1 and R2 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl. In any of the above instances, in certain embodiments, R1 is as defined above, and R2 is hydrogen.
[00256] In certain embodiments, at least one of R1 and R2 is a nitrogen protecting group, as defined herein.
[00257] In certain embodiments, at least one of R1 and R2 is -ORB, wherein RB is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or an oxygen protecting group.
[00258] In certain embodiments, at least one of R1 and R2 is -SRB, wherein RB is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a sulfur protecting group.
[00259] In certain embodiments, R1 and R2 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6-membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
[00260] In certain embodiments of formula (i), (ii), (v), (vi), (vii), and (viii), each instance of R4 and R5 is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, or R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
(ii) Groups R4 and R5
[00261] In certain embodiments, R4 is hydrogen. In certain embodiments, both R4 and R5 are hydrogen.
[00262] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C1-20 alkyl, e.g., at least one of R4 and R5 is substituted or unsubstituted C1-18 alkyl, substituted or unsubstituted C1-16 alkyl, substituted or unsubstituted C1-14 alkyl, substituted or unsubstituted C1-12 alkyl, substituted or unsubstituted C1-10 alkyl, substituted or unsubstituted C1-8 alkyl, substituted or unsubstituted C1-6 alkyl, substituted or unsubstituted C1-4 alkyl, substituted or unsubstituted C1-3 alkyl, or substituted or unsubstituted C1-2 alkyl. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6 alkyl. In certain embodiments, at least one of R4 and R5 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted C1-20 haloalkyl, substituted or unsubstituted C1-18 haloalkyl, substituted or unsubstituted C1-16 haloalkyl, substituted or unsubstituted C1-14 haloalkyl, substituted or unsubstituted C1-12 haloalkyl, substituted or unsubstituted C1-10 haloalkyl, substituted or unsubstituted C1-8 haloalkyl, substituted or unsubstituted C1-6 haloalkyl, substituted or unsubstituted C1-4 haloalkyl, substituted or unsubstituted C1-3 haloalkyl, or substituted or unsubstituted C1-2 haloalkyl. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6 haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, at least one of R4 and R5 is -CX3, wherein X is halogen. In certain embodiments, at least one of R4 and R5 is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, at least one of R4 and R5 is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one of R4 and R5 is -CBr3, -CI3, -CFClBr, or -CClBrI. In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00263] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C2-20alkenyl, e.g., substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, at least one of R4 and R5 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, at least one of R4 and R5 is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, -CH2CX=CH2, -CX2CH=CH2, -CX2CX=CX2, -CX2CH=CX2, -CX2CX=CHX, -CX2CH=CHX, -CX2CX=CH2, -CHXCH=CH2, -CHXCX=CX2, -CHXCH=CX2, -CHXCX=CHX, -CHXCH=CHX, or -CHXCX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I. In certain embodiments, the alkenyl group is trans or the E-isomer. In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00264] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C2-20alkynyl, e.g., at least one of R4 and R5 is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, at least one of R4 and R5 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, at least one of R4 and R5 is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I. In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00265] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, at least one of R4 and R5 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl. In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00266] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, at least one of R4 and R5 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl. In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00267] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, at least one of R4 and R5 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted haloaryl. In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring). In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00268] In certain embodiments, at least one of R4 and R5 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, at least one of R4 and R5 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one of R4 and R5 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl. In any of the above instances, in certain embodiments, R4 is as defined above, and R5 is hydrogen.
[00269] In certain embodiments, at least one of R4 and R5 is a nitrogen protecting group, as defined herein.
[00270] In certain embodiments, at least one of R4 and R5 is -ORB, wherein RB is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or an oxygen protecting group.
[00271] In certain embodiments, at least one of R4 and R5 is -SRB, wherein RB is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a sulfur protecting group.
[00272] In certain embodiments, R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6-membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
(iii) Groups L1 and Group R3
[00273] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is independently substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD.
[00274] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted C1-20alkyl, e.g., substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, R3 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, R3 is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, R3 is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, R3 is -CX3, wherein X is halogen. In certain embodiments, R3 is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, R3 is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, R3 is -CBr3, -CI3, -CFClBr, or -CClBrI.
In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted C2-20alkenyl, e.g., substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, R3 is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, R3 is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, R3 is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, R3 is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, -CH2CX=CH2, -CX2CH=CH2, -CX2CX=CX2, -CX2CH=CX2, -CX2CX=CHX, -CX2CH=CHX, -CX2CX=CH2, -CHXCH=CH2, -CHXCX=CX2, -CHXCH=CX2, -CHXCX=CHX, -CHXCH=CHX, or -CHXCX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I. In certain embodiments, the alkenyl group is trans or the E-isomer.
[00275] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted C2-20alkynyl, e.g., R3 is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, R3 is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, R3 is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, R3 is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, R3 is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00276] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, R3 is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00277] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, R3 is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00278] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted haloaryl. In certain embodiments, R3 is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00279] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, R3 is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., R3 is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00280] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is halogen, i.e., R3 is -Br, -I, -F, or -Cl.
[00281] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is hydrogen, i.e., R3 is -OH.
[00282] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is an oxygen protecting group, as defined herein.
[00283] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted C1-20alkyl, e.g., R3 is -ORC, and RC is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, R3 is -ORC, and RC is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, RC is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, R3 is -ORC, and RC is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, RC is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, RC is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, RC is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00284] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted C2-20alkenyl, e.g., R3 is -ORC, and RC is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, R3 is -ORC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, RC is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, R3 is -ORC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, RC is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, -CH2CX=CH2, -CX2CH=CH2, -CX2CX=CX2, -CX2CH=CX2, -CX2CX=CHX, -CX2CH=CHX, -CX2CX=CH2, -CHXCH=CH2, -CHXCX=CX2, -CHXCH=CX2, -CHXCX=CHX, -CHXCH=CHX, or -CHXCX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I. In certain embodiments, the alkenyl group is trans or the E-isomer.
[00285] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted C2-20alkynyl, e.g., R3 is -ORC, and RC is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, R3 is -ORC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, RC is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, R3 is -ORC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, RC is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00286] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, RC is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00287] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, RC is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00288] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, RC is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted haloaryl. In certain embodiments, RC is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00289] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -ORC, and RC is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, RC is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00290] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is hydrogen, i.e., R3 is -SH.
[00291] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is a sulfur protecting group, as defined herein.
[00292] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted C1-20alkyl, e.g., R3 is -SRC, and RC is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, R3 is -SRC, and RC is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, RC is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, R3 is -SRC, and RC is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, RC is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, RC is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, RC is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00293] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted C2-20alkenyl, e.g., R3 is -SRC, and RC is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, R3 is -SRC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, RC is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, R3 is -SRC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, RC is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, -CH2CX=CH2, -CX2CH=CH2, -CX2CX=CX2, -CX2CH=CX2, -CX2CX=CHX, -CX2CH=CHX, -CX2CX=CH2, -CHXCH=CH2, -CHXCX=CX2, -CHXCH=CX2, -CHXCX=CHX, -CHXCH=CHX, or -CHXCX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I. In certain embodiments, the alkenyl group is trans or the E-isomer.
[00294] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted C2-20alkynyl, e.g., R3 is -SRC, and RC is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, R3 is -SRC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, RC is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, R3 is -SRC, and RC is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, RC is -C≡C-CH2X, -C≡C-CHX2, or -C≡C-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00295] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, RC is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00296] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, RC is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00297] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, RC is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted haloaryl. In certain embodiments, RC is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00298] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SRC, and RC is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, RC is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00299] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -N(RC)2, and at least one RC is hydrogen, i.e., R3 is -NHRC or -NH2.
[00300] In certain embodiments, R3 is -N(RC)2, and at least one Rc is a nitrogen protecting group, as defined herein.
[00301] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C1-20alkyl, e.g., R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C1-18alkyl, substituted or unsubstituted C1-16alkyl, substituted or unsubstituted C1-14alkyl, substituted or unsubstituted C1-12alkyl, substituted or unsubstituted C1-10alkyl, substituted or unsubstituted C1-8alkyl, substituted or unsubstituted C1-6alkyl, substituted or unsubstituted C1-4alkyl, substituted or unsubstituted C1-3alkyl, or substituted or unsubstituted C1-2alkyl. In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-alkyl. In certain embodiments, at least one RC is alkyl substituted with at least one or more halogen atoms (i.e., one or more -Br, -I, -F, or -Cl atoms), e.g., at least one RC is substituted or unsubstituted C1-20haloalkyl, substituted or unsubstituted C1-18haloalkyl, substituted or unsubstituted C1-16haloalkyl, substituted or unsubstituted C1-14haloalkyl, substituted or unsubstituted C1-12haloalkyl, substituted or unsubstituted C1-10haloalkyl, substituted or unsubstituted C1-8haloalkyl, substituted or unsubstituted C1-6haloalkyl, substituted or unsubstituted C1-4haloalkyl, substituted or unsubstituted C1-3haloalkyl, or substituted or unsubstituted C1-2haloalkyl. In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C1, C2, C3, C4, C5, or C6-haloalkyl. In certain embodiments, the haloalkyl is a perhaloalkyl group. In certain embodiments, at least one RC is -CX3, wherein X is halogen. In certain embodiments, at least one RC is -CBr3, -CBr2H, -CBrH2, -CBr2X, or -CBrX2, wherein each instance of X is independently -Cl, -F, or -I. In certain embodiments, at least one RC is -CI3, -CI2H, -CIH2, -CI2X, or -CIX2, wherein each instance of X is independently -Cl, -F, or -Br. In certain embodiments, at least one RC is -CBr3, -CI3, -CFClBr, or -CClBrI.
[00302] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2-20alkenyl, e.g., R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2-18alkenyl, substituted or unsubstituted C2-16alkenyl, substituted or unsubstituted C2-14alkenyl, substituted or unsubstituted C2-12alkenyl, substituted or unsubstituted C2-10alkenyl, substituted or unsubstituted C2-8alkenyl, substituted or unsubstituted C2-6alkenyl, substituted or unsubstituted C2-4alkenyl, or substituted or unsubstituted C2-3alkenyl. In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2, C3, C4, C5, or C6-alkenyl. In certain embodiments, at least one RC is alkenyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., RC is substituted or unsubstituted C2-20haloalkenyl, substituted or unsubstituted C2-18haloalkenyl, substituted or unsubstituted C2-16haloalkenyl, substituted or unsubstituted C2-14haloalkenyl, substituted or unsubstituted C2-12haloalkenyl, substituted or unsubstituted C2-10haloalkenyl, substituted or unsubstituted C2-8haloalkenyl, substituted or unsubstituted C2-6haloalkenyl, substituted or unsubstituted C2-4haloalkenyl, or substituted or unsubstituted C2-3haloalkenyl. In certain embodiments, the haloalkenyl is a perhaloalkenyl group. In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkenyl. In certain embodiments, at least one RC is -CH2CX=CX2, -CH2CH=CX2, -CH2CX=CHX, -CH2CH=CHX, -CH2CX=CH2, -CX2CH=CH2, -CX2CX=CX2, -CX2CH=CX2, -CX2CX=CHX, -CX2CH=CHX, -CX2CX=CH2, -CHXCH=CH2, -CHXCX=CX2, -CHXCH=CX2, -CHXCX=CHX, -CHXCH=CHX, or -CHXCX=CH2, wherein each instance of X is independently -Cl, -F, -Br, or -I. In certain embodiments, the alkenyl group is trans or the E-isomer.
[00303] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2-20alkynyl, e.g., R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2-18alkynyl, substituted or unsubstituted C2-16alkynyl, substituted or unsubstituted C2-14alkynyl, substituted or unsubstituted C2-12alkynyl, substituted or unsubstituted C2-10alkynyl, substituted or unsubstituted C2-8alkynyl, substituted or unsubstituted C2-6alkynyl, substituted or unsubstituted C2-4alkynyl, or substituted or unsubstituted C2-3alkynyl. In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2, C3, C4, C5, or C6-alkynyl. In certain embodiments, at least one RC is alkynyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RC is substituted or unsubstituted C2-20haloalkynyl, substituted or unsubstituted C2-18haloalkynyl, substituted or unsubstituted C2-16haloalkynyl, substituted or unsubstituted C2-14haloalkynyl, substituted or unsubstituted C2-12haloalkynyl, substituted or unsubstituted C2-10haloalkynyl, substituted or unsubstituted C2-8haloalkynyl, substituted or unsubstituted C2-6haloalkynyl, substituted or unsubstituted C2-4haloalkynyl, or substituted or unsubstituted C2-3haloalkynyl. In certain embodiments, the haloalkynyl is a perhaloalkynyl group. In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted C2, C3, C4, C5, or C6-haloalkynyl. In certain embodiments, at least one RC is -≡-CH2X, -≡-CHX2, or -≡-CX3, wherein each X is independently -Cl, -F, -Br, or -I.
[00304] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted carbocyclyl, e.g., substituted or unsubstituted C3carbocyclyl, substituted or unsubstituted C4carbocyclyl, substituted or unsubstituted C5carbocyclyl, or substituted or unsubstituted C6carbocyclyl. In certain embodiments, at least one RC is carbocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RC is substituted or unsubstituted C3halocarbocyclyl, substituted or unsubstituted C4halocarbocyclyl, substituted or unsubstituted C5halocarbocyclyl, or substituted or unsubstituted C6halocarbocyclyl.
[00305] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted heterocyclyl, e.g., substituted or unsubstituted 3-membered heterocyclyl, substituted or unsubstituted 4-membered heterocyclyl, substituted or unsubstituted 5-membered heterocyclyl, or substituted or unsubstituted 6-membered heterocyclyl. In certain embodiments, at least one RC is heterocyclyl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RC is substituted or unsubstituted 3-membered haloheterocyclyl, substituted or unsubstituted 4-membered haloheterocyclyl, substituted or unsubstituted 5-membered haloheterocyclyl, or substituted or unsubstituted 6-membered haloheterocyclyl.
[00306] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted aryl, e.g., substituted or unsubstituted phenyl or substituted or unsubstituted naphthyl. In certain embodiments, at least one RC is aryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RC is substituted or unsubstituted haloaryl. In certain embodiments, at least one RC is substituted or unsubstituted halophenyl, such as monosubstituted halophenyl (e.g., ortho-, meta-, or para-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), disubstituted halophenyl (e.g., 1,2-, 1,3-, 1,4-, 1,5-, 2,3-, 2,4-, or 2,5-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring), or trisubstituted halophenyl (e.g., 1,3,5-, 1,2,3-, 1,2,4-, 1,2,5-, or 2,3,4-substituted with halogen atoms, substitution relative to the point of attachment of the halophenyl ring).
[00307] In certain embodiments, R3 is -N(RC)2, and at least one RC is substituted or unsubstituted heteroaryl, e.g., substituted or unsubstituted 5-membered heteroaryl or substituted or unsubstituted 6-membered heteroaryl. In certain embodiments, at least one RC is heteroaryl substituted with at least one or more halogen atoms (e.g., 1, 2, 3, 4, 5, 6, or more -Br, -I, -F, or -Cl atoms), e.g., at least one RC is substituted or unsubstituted 5-membered haloheteroaryl or substituted or unsubstituted 6-membered haloheteroaryl.
[00308] In certain embodiments, R3 is -N(RC)2, and two RC groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring, e.g., a 5- to 6-membered substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring.
[00309] In certain embodiments, R3 is -SHgRD, wherein RD is
[00310] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SHgMe, -SHg, or -SO2SHg.
[00311] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -SeRD, as defined herein.
[00312] In certain embodiments of formula (i), (iii), (v), (vii), and (ix), R3 is -TeRD, as defined herein.
[00313] Furthermore, in certain embodiments of formula (i), (iii), (v), (vii), and (ix), each instance of L1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20alkylene, substituted or unsubstituted C2-20alkenylene, substituted or unsubstituted C2-20alkynylene, substituted or unsubstituted heteroC1-20alkylene, substituted or unsubstituted heteroC2-20alkenylene, substituted or unsubstituted heteroC2-20alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof.
[00314] In certain embodiments, L1 is absent, and R3 is attached directly to the ring system.
[00315] However, in certain embodiments, L1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene; substituted and unsubstituted alkenylene; substituted and unsubstituted alkynylene; substituted and unsubstituted heteroalkylene; substituted and unsubstituted heteroalkenylene; substituted and unsubstituted heteroalkynylene; substituted and unsubstituted heterocyclylene; substituted and unsubstituted carbocyclylene; substituted and unsubstituted arylene; substituted and unsubstituted heteroarylene; and combinations thereof.
[00316] As used herein, reference to a linking moiety consisting of "a combination" refers to a linking moiety comprising 1, 2, 3, 4, or more of the recited moieties. For example, the linking moiety may consist of an alkynylene attached to an alkylene. As used herein, "at least one instance" refers to 1, 2, 3, 4, or more instances of the recited moiety.
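By way of illustration only, the notion of "a combination" may be sketched in Python. The sketch assumes a linking moiety is modelled as an ordered sequence of moiety class names drawn from the list recited for L1; the names RECITED_MOIETIES, is_valid_linker, and describe_linker are hypothetical and do not appear in the specification, and substitution of the individual moieties is not modelled.

```python
# Moiety classes recited for the linking moiety L1 (substitution not modelled).
RECITED_MOIETIES = frozenset({
    "alkylene", "alkenylene", "alkynylene",
    "heteroalkylene", "heteroalkenylene", "heteroalkynylene",
    "heterocyclylene", "carbocyclylene", "arylene", "heteroarylene",
})


def is_valid_linker(moieties):
    """Return True if every element is a recited moiety class.

    An empty sequence is accepted and stands for "L1 is absent".
    """
    return all(m in RECITED_MOIETIES for m in moieties)


def describe_linker(moieties):
    """Return a readable label for a linker built from consecutive moieties."""
    if not is_valid_linker(moieties):
        raise ValueError(f"unrecognized moiety in {moieties!r}")
    return "absent" if not moieties else "-".join(moieties)


# The example given in the text: an alkynylene attached to an alkylene.
example = describe_linker(("alkynylene", "alkylene"))
```

Here a combination is simply a sequence of length 1, 2, 3, 4, or more; the sketch makes no claim about which combinations are chemically preferred.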
[00317] In certain embodiments, L1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene; substituted and unsubstituted alkenylene; and substituted and unsubstituted alkynylene, and combinations thereof.
[00318] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted alkylene, e.g., substituted or unsubstituted C1-6alkylene, substituted or unsubstituted C1-2alkylene, substituted or unsubstituted C2-3alkylene, substituted or unsubstituted C3-4alkylene, substituted or unsubstituted C4-5alkylene, substituted or unsubstituted C5-6alkylene, substituted or unsubstituted C3-6alkylene, or substituted or unsubstituted C4-6alkylene. Exemplary alkylene groups include unsubstituted alkylene groups such as methylene -CH2-, ethylene -(CH2)2-, n-propylene -(CH2)3-, n-butylene -(CH2)4-, n-pentylene -(CH2)5-, and n-hexylene -(CH2)6-.
[00319] In certain embodiments, L1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene and substituted and unsubstituted alkenylene, and combinations thereof.
[00320] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted alkenylene, e.g., substituted or unsubstituted C2-6alkenylene, substituted or unsubstituted C2-3alkenylene, substituted or unsubstituted C3-4alkenylene, substituted or unsubstituted C4-5alkenylene, or substituted or unsubstituted C5-6alkenylene.
[00321] In certain embodiments, L1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene and substituted and unsubstituted alkynylene, and combinations thereof.
[00322] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted alkynylene, e.g., substituted or unsubstituted C2-6alkynylene, substituted or unsubstituted C2-3alkynylene, substituted or unsubstituted C3-4alkynylene, substituted or unsubstituted C4-5alkynylene, or substituted or unsubstituted C5-6alkynylene.
[00323] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted heteroalkylene, e.g., substituted or unsubstituted heteroC1-6alkylene, substituted or unsubstituted heteroC1-2alkylene, substituted or unsubstituted heteroC2-3alkylene, substituted or unsubstituted heteroC3-4alkylene, substituted or unsubstituted heteroC4-5alkylene, or substituted or unsubstituted heteroC5-6alkylene. Exemplary heteroalkylene groups include unsubstituted heteroalkylene groups such as -(CH2)2-O(CH2)2-, -OCH2-, -CH2O-, -O(CH2)2-, -(CH2)2O-, -O(CH2)3-, -(CH2)3O-, -O(CH2)4-, -(CH2)4O-, -O(CH2)5-, -(CH2)5O-, -O(CH2)6-, and -O(CH2)6O-.
[00324] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted heteroalkenylene, e.g., substituted or unsubstituted heteroC2-6alkenylene, substituted or unsubstituted heteroC2-3alkenylene, substituted or unsubstituted heteroC3-4alkenylene, substituted or unsubstituted heteroC4-5alkenylene, or substituted or unsubstituted heteroC5-6alkenylene.
[00325] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted heteroalkynylene, e.g., substituted or unsubstituted heteroC2-6alkynylene, substituted or unsubstituted heteroC2-3alkynylene, substituted or unsubstituted heteroC3-4alkynylene, substituted or unsubstituted heteroC4-5alkynylene, or substituted or unsubstituted heteroC5-6alkynylene.
[00326] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted carbocyclylene, e.g., substituted or unsubstituted C3-6carbocyclylene, substituted or unsubstituted C3-4carbocyclylene, substituted or unsubstituted C4-5carbocyclylene, or substituted or unsubstituted C5-6carbocyclylene.
[00327] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted heterocyclylene, e.g., substituted or unsubstituted C3-6heterocyclylene, substituted or unsubstituted C3-4heterocyclylene, substituted or unsubstituted C4-5heterocyclylene, or substituted or unsubstituted C5-6heterocyclylene.
[00328] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted arylene, e.g., substituted or unsubstituted phenylene.
[00329] In certain embodiments, L1 comprises at least one instance of substituted or unsubstituted heteroarylene, e.g., substituted or unsubstituted 5- to 6-membered heteroarylene.
[00330] In certain embodiments, L1 represents a linking moiety consisting of a combination of one or more consecutive covalently bonded groups of the formula:
Figure imgf000104_0001
Figure imgf000105_0001
wherein:
each instance of m is independently an integer between 1 to 10, inclusive;
each instance of p is independently an integer between 1 to 4, inclusive;
each instance of RW1 is independently hydrogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or a nitrogen protecting group;
each instance of RW2 is independently hydrogen; halogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or two RW2 groups are joined to form a substituted or unsubstituted 5- to 6-membered ring.
[00331] In certain embodiments, L1 represents a linking moiety consisting of a combination of 1 to 10 consecutive covalently bonded groups of the above described formulae, e.g., L1 represents a linking moiety consisting of a combination of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 consecutive covalently bonded groups of the above described formulae. It should be generally understood that multiple instances of a given variable or group present in a linking moiety may optionally differ.
[00332] As described herein, each instance of RW1 is independently hydrogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; a nitrogen protecting group if attached to a nitrogen atom, or an oxygen protecting group if attached to an oxygen atom. In any of the above formulae, as described herein, in certain embodiments, each instance of RW1 is independently hydrogen; substituted or unsubstituted alkyl (e.g., methyl); or a nitrogen protecting group.
[00333] As described herein, each instance of RW2 is independently hydrogen; halogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or two RW2 groups are joined to form a 5- to 6-membered ring. In any of the above formulae, as described herein, in certain embodiments, each instance of RW2 is independently hydrogen, halogen (e.g., -Br, -Cl, -F, or -I), or substituted or unsubstituted alkyl (e.g., methyl).
[00334] As described herein, each instance of m is independently an integer between 1 to 10, inclusive. In certain embodiments, m is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
[00335] As described herein, each instance of p is independently an integer between 1 to 4, inclusive. In certain embodiments, p is 1, 2, 3, or 4.
[00336]
Figure imgf000106_0001
wherein RW2, m, and R3 are as defined herein. In certain embodiments, m is 1, 2, 3, or 4. In certain embodiments, each instance of RW2 is hydrogen. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is -Se-CBr3 or -TeBr3.
(iv) Group M3 and M4
[00337] As generally defined herein, each instance of M3 and M4 is independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2.
[00338] In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is O. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is Se. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is Te. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is CH2. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is CF2. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is CCl2. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is CBr2. In certain embodiments of formula (vii), (viii), (ix), and (x), M3 is CI2.
[00339] In certain embodiments of formula (iii), (iv), (ix), and (x), M4 is O. In certain embodiments of formula (iii), (iv), (ix), and (x), M4 is Se. In certain embodiments of formula (iii), (iv), (ix), and (x), M4 is Te. In certain embodiments of formula (vii), (viii), (ix), and (x), M4 is CH2. In certain embodiments of formula (vii), (viii), (ix), and (x), M4 is CF2. In certain embodiments of formula (vii), (viii), (ix), and (x), M4 is CCl2. In certain embodiments of formula (vii), (viii), (ix), and (x), M4 is CBr2. In certain embodiments of formula (vii), (viii), (ix), and (x), M4 is CI2.
[00340] In certain embodiments of formula (ix) and (x), M3 is O and M4 is O. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is O. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is O. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is O. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is O. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is O. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is O. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is O.
[00341] In certain embodiments of formula (ix) and (x), M3 is O and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is Se. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is Se.
[00342] In certain embodiments of formula (ix) and (x), M3 is O and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is Te. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is Te.
[00343] In certain embodiments of formula (ix) and (x), M3 is O and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is CH2. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is CH2.
[00344] In certain embodiments of formula (ix) and (x), M3 is O and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is CF2. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is CF2.
[00345] In certain embodiments of formula (ix) and (x), M3 is O and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is CCl2. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is CCl2.
[00346] In certain embodiments of formula (ix) and (x), M3 is O and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is CBr2. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is CBr2.
[00347] In certain embodiments of formula (ix) and (x), M3 is O and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is Se and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is Te and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is CH2 and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is CF2 and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is CCl2 and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is CBr2 and M4 is CI2. In certain embodiments of formula (ix) and (x), M3 is CI2 and M4 is CI2.
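The pairings of M3 and M4 recited above for formula (ix) and (x) amount to every combination of the eight values permitted for each group. A minimal Python sketch of that enumeration follows; the names M_VALUES and m3_m4_pairs are hypothetical and are used for illustration only.

```python
from itertools import product

# The eight values recited for each of M3 and M4.
M_VALUES = ("O", "Se", "Te", "CH2", "CF2", "CCl2", "CBr2", "CI2")


def m3_m4_pairs():
    """Return every (M3, M4) pairing, grouped by M4 as in the text above."""
    # product() varies its last element fastest, so M4 is taken as the
    # first element and each tuple is flipped to (M3, M4).
    return [(m3, m4) for m4, m3 in product(M_VALUES, repeat=2)]


pairs = m3_m4_pairs()
```

The first eight entries hold M4 fixed at O while M3 runs through all eight values, the next eight hold M4 at Se, and so on, for 64 pairings in total.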
(v) Exemplary combinations of the base region
[00348] Various combinations of the above described embodiments of the "base region" are further contemplated herein.
[00349] For example, in certain embodiments, at least one instance of a Base is of formula (i), (iii), (v), (vii), or (ix):
Figure imgf000109_0001
wherein the base optionally comprises a heavy atom. In certain embodiments, the base comprises a heavy atom. In certain embodiments, the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom. In certain embodiments, M3 is O. In certain embodiments, M3 is Se. In certain embodiments, M4 is O. In certain embodiments, M4 is Se. In certain embodiments, R1 and R2 are hydrogen. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, L1 is absent. In certain embodiments, L1 is substituted or unsubstituted alkynylene. In certain embodiments, L1 is substituted or uns formula:
Figure imgf000109_0002
wherein RW2, m, and R3 are as defined herein. In certain embodiments, m is 1, 2, 3, or 4. In certain embodiments, each instance of RW2 is hydrogen. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is -Se-CBr3 or -TeBr3.
[00350] In certain embodiments of formula (ix), wherein M4 is O, wherein L1 is absent, or -
Figure imgf000110_0001
wherein the base optionally comprises a heavy atom. In certain embodiments, the base comprises a heavy atom. In certain embodiments, the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is -Se-CBr3 or -TeBr3.
[00351] In certain embodiments of Formula (I-d), wherein the Base is of formula (ix-a), (ix-b), (ix-c), or (ix-d), provided is a compound of Formula (I-d-ix-a), (I-d-ix-b), (I-d-ix-c), or (I-d-ix-d):
Figure imgf000111_0001
(I-d-ix-c), or (I-d-ix-d)
wherein the compound comprises a heavy atom. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg, or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is -Se-CBr3 or -TeBr3. In certain embodiments, G is a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-; HO-P(=O)(OH)-O-P(=O)(M2-H)-; or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-.
In certain embodiments, M2 is -O-. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg, or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is -Se-CBr3 or -TeBr3.
[00352] In certain embodiments, wherein the Base is of formula (ix-a), (ix-b), (ix-c), or (ix-d), provided is a compound of Formula (II'-d), comprising at least one instance of (II'-d-ix-a), (II'-d-ix-b), (II'-d-ix-c), and/or (II'-d-ix-d):
Figure imgf000112_0001
wherein the at least one instance of (II'-d-ix-a), (II'-d-ix-b), (II'-d-ix-c), and/or (II'-d-ix-d) comprises a heavy atom. In certain embodiments, M2 is -O-. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is Se-CBr3 or -TeBr3. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3.
[00353] In certain embodiments of formula (i), wherein L1 is absent, or -L1-R3 is of formula
Figure imgf000112_0002
or , m is 1, and each RW2 is hydrogen, provided is a Base of formula (i-a), (i-b), (i-c), or (i-d):
Figure imgf000113_0001
wherein the base optionally comprises a heavy atom. In certain embodiments, the base comprises a heavy atom. In certain embodiments, the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3.
[00354] In certain embodiments of Formula (I-d), wherein the Base is of formula (i-a), (i-b), (i-c), or (i-d), provided is a compound of Formula (I-d-i-a), (I-d-i-b), (I-d-i-c), or (I-d-i-d):
Figure imgf000113_0002
Figure imgf000114_0001
wherein the compound comprises a heavy atom. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is Se-CBr3 or -TeBr3. In certain embodiments, G is a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-.
In certain embodiments, M2 is -O-. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3.
[00355] In certain embodiments, wherein the Base is of formula (i-a), (i-b), (i-c), or (i-d), provided is a compound of Formula (II'-d), comprising at least one instance of (II'-d-i-a), (II'-d-i-b), (II'-d-i-c), and/or (II'-d-i-d):
Figure imgf000115_0001
wherein the at least one instance of (II'-d-i-a), (II'-d-i-b), (II'-d-i-c), and/or (II'-d-i-d) comprises a heavy atom. In certain embodiments, M2 is -O-. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is Se-CBr3 or -TeBr3. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3.
[00356] In certain embodiments of formula (iii), wherein L1 is absent, or -L1-R3 is a group of formula
Figure imgf000115_0002
, or , m is 1, and each R is hydrogen, provided is a Base of formula (i-a), (i-b), (i-c), or (i-d):
Figure imgf000116_0001
(iii-a), (iii-b),
Figure imgf000116_0002
wherein the base optionally comprises a heavy atom. In certain embodiments, the base comprises a heavy atom. In certain embodiments, the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom. In certain embodiments, R1 and R2 are hydrogen. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3. In certain embodiments, M4 is O. In certain embodiments, M4 is Se.
[00357] In certain embodiments of Formula (I-d), wherein the Base is of formula (i-a), (i-b), (i-c), or (i-d), provided is a compound of Formula (I-d-i-a), (I-d-i-b), (I-d-i-c), or (I-d-i-d):
Figure imgf000116_0003
Figure imgf000117_0001
wherein the compound comprises a heavy atom. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is Se-CBr3 or -TeBr3. In certain embodiments, G is a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-.
In certain embodiments, M2 is -O-. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3. In certain embodiments, M4 is O. In certain embodiments, M4 is Se.
[00358] In certain embodiments, wherein the Base is of formula (i-a), (i-b), (i-c), or (i-d), provided is a compound of Formula (II'-d), comprising at least one instance of (II'-d-i-a), (II'-d-i-b), (II'-d-i-c), and/or (II'-d-i-d):
Figure imgf000117_0002
Figure imgf000118_0001
wherein the at least one instance of (II'-d-i-a), (II'-d-i-b), (II'-d-i-c), and/or (II'-d-i-d) comprises a heavy atom. In certain embodiments, M2 is -O-. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is Se-CBr3 or -TeBr3. In certain embodiments, R3 is aryl substituted with at least one or more halogen atoms (e.g., -Br, -I). In certain embodiments, R3 is halogen (e.g., -Br, -I), -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD. In certain embodiments, R3 is -ORC. In certain embodiments, R3 is -SRC. In certain embodiments, R3 is -N(RC)2. In certain embodiments, RC is -CX3, wherein X is halogen. In certain embodiments, R3 is -SHgRD (e.g., -SHgMe), -SHg or -SO2SHg. In certain embodiments, R3 is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, R3 is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, R3 is Se-CBr3 or -TeBr3. In certain embodiments, M4 is O. In certain embodiments, M4 is Se.
[00359] In certain embodiments, at least one instance of a Base is of formula (ii), (iv), (vi), (viii), or (x):
Figure imgf000118_0002
Figure imgf000119_0001
wherein the base optionally comprises a heavy atom. In certain embodiments, the base comprises a heavy atom. In certain embodiments, the base does not comprise a heavy atom, but the sugar region or phosphate region comprises a heavy atom. In certain embodiments, M3 is O. In certain embodiments, M4 is O. In certain embodiments, M3 and M4 are both O. In certain embodiments, M3 is Se or Te. In certain embodiments, M4 is Se or Te. In certain embodiments, R4 is -CX3 wherein X is halogen.
[00360] In other embodiments, at least one instance of a Base is:
Figure imgf000119_0002
wherein the sugar region comprises a heavy atom. In certain embodiments, M3 and M4 are O. In certain embodiments, R1 and R2 are hydrogen. In certain embodiments, R4 and R5 are hydrogen. In certain embodiments, G2 of the sugar region is hydrogen. In certain embodiments, G2 of the sugar region is -SHgMe, -SHg or -SO2SHg. In certain embodiments, G2 of the sugar region is -SeRD, e.g., -SeCX3, wherein X is halogen. In certain embodiments, G2 of the sugar region is -TeRD, e.g., -TeCX3, wherein X is halogen. In certain embodiments, G2 is Se-CBr3 or -TeBr3.
Exemplary Compounds of Formula (I) and (II)
[00361] Exemplary compounds of Formula (I) include, but are not limited to:
Figure imgf000120_0001
Figure imgf000121_0001
Figure imgf000122_0001
and salts thereof.
[00363] Exemplary compounds of Formula (II), and salts thereof, comprise at least one instance of any one of the formula:
Figure imgf000123_0001
Figure imgf000124_0001
Figure imgf000125_0001
and/or salts thereof.
EXEMPLIFICATION
[00364] In order that the invention described herein may be more fully understood, the following examples are set forth. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting this invention in any manner.
DNA BASE IDENTIFICATION BY ELECTRON MICROSCOPY
[00365] Advances in DNA sequencing, based on fluorescent microscopy, have transformed many areas of biological research. However, only relatively short molecules can be sequenced by these technologies. Dramatic improvements in genomic research will require accurate sequencing of long (>10,000 base-pairs), intact DNA molecules. Our approach directly visualizes the sequence of DNA molecules using electron microscopy. This disclosure represents the first identification of DNA base pairs within intact DNA molecules by electron microscopy. By enzymatically incorporating modified bases, which contain atoms of increased atomic number, direct visualization and identification of individually labeled bases within a synthetic 3,272 base-pair DNA molecule and a 7,249 base-pair viral genome have been accomplished. This proof of principle is made possible by the use of a dUTP nucleotide, substituted with a single mercury atom attached to the nitrogenous base. One of these contrast-enhanced, heavy-atom-labeled bases is paired with each adenosine base in the template molecule and then built into a double-stranded DNA molecule by a template-directed DNA polymerase enzyme. This modification is small enough to allow very long molecules with labels at each A-U position. Image contrast is further enhanced by using annular dark-field scanning transmission electron microscopy (ADF-STEM). Further refinements to identify additional base types and more precisely determine the location of identified bases would allow full sequencing of long, intact DNA molecules, significantly improving the pace of complex genomic discoveries. The inventors have published this work in Bell et al., Microsc. Microanal. (2012) 18: 1049-1053, published online October 9, 2012.
Introduction
[00366] Advances over the last decade have greatly improved the speed and reduced the costs of DNA sequencing. Currently, they are limited to molecules less than 1,000 base pairs long, principally due to the inefficiency or incomplete nature of the fluorescent labeling reactions of "next generation" approaches (Schuster, 2008).
[00367] The approach taken in this article aims to improve read length by directly visualizing DNA as long, intact molecules using high-resolution scanning transmission electron microscopy (STEM). Richard Feynman (1999) famously suggested that the incredible magnification power of electron microscopes might be harnessed to read DNA sequence; until now this challenge had not been met. The limiting issue has not been the small size of DNA, but the fact that the four different base types differ by only a few atoms, and all of the differing atoms are light elements, differences particularly indistinguishable for electron microscopy. Standard techniques used to increase sample contrast for electron microscopy have not been able to do so in a reliably sequence specific manner, even after 40 years of effort (Gal-Or et al., 1967; ASTA, 2010).
[00368] In the present work, annular dark-field (ADF) imaging is utilized in the monochromated, spherical-aberration corrected scanning transmission electron microscope (MCSTEM) for high-resolution atomic identification (see FIG. 1). The ADF-STEM was the method of choice for Crewe and co-workers to originally image single heavy atoms in 1970 (Crewe, 1970; Crewe et al., 1970), anticipating that the method might be used for sequencing DNA. Recent STEM improvements now allow studies of atomic-level and single atom imaging (Batson et al., 2002; Voyles et al., 2002; Jia et al., 2003). In an ADF-STEM, a very small electron beam is raster-scanned across the sample. Most of the electrons pass through the sample with only subtle changes of energy, direction, and/or phase. However, some electrons scatter at a high angle. The high angle scattering process (Rutherford scattering) scales with the atomic number (Z) of the atom (Muller et al., 2008) raised to the power of approximately 1.5. The Z^1.5 dependence allows heavy nuclei to be definitively discriminated from light nuclei. The direct identification of unlabeled DNA base pairs, with average Z ~ 5.5, has proven to be difficult, and to date unsuccessful. There is simply not enough difference between the base types to be detected without suitable contrast enhancement. Various groups have worked to overcome this problem, chiefly by chemically modifying single-stranded DNA with clusters of heavy atoms (Beer & Moudrianakis, 1962; Moudrianakis & Beer, 1965; Ottensmeyer, 1979). The approach employed here uses a standard, template-directed polymerase enzyme to incorporate heavy-atom-modified bases directly into a long DNA molecule (FIG. 2). The modification, with a single mercury atom on each thymine/uridine, provides ADF-STEM contrast substantially greater than in natural DNA. This also simplifies the challenge of making the labeling reactions sequence-specific because polymerase reactions are intrinsically sequence specific.
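As a rough illustration of the Z^1.5 dependence described above, the following Python sketch compares the expected high-angle scattering signal of a single mercury atom (Z = 80) with that of an average atom in unlabeled DNA (Z ~ 5.5). The exponent and the average Z come from the text; the function name and the pure power-law, single-atom model are simplifying assumptions made only for illustration.

```python
def relative_adf_signal(z_label: float, z_background: float, exponent: float = 1.5) -> float:
    """Ratio of high-angle scattering signals under a simple Z**exponent power law."""
    return (z_label / z_background) ** exponent


Z_MERCURY = 80        # atomic number of Hg
Z_DNA_AVERAGE = 5.5   # average Z of unlabeled DNA, as quoted in the text

ratio = relative_adf_signal(Z_MERCURY, Z_DNA_AVERAGE)
print(f"Hg vs. average DNA atom: ~{ratio:.0f}x")
```

Under these assumptions a single mercury atom scatters roughly 55 times more strongly into the annular detector than an average DNA atom, which is the contrast margin the labeling strategy relies on.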
Methods
[00369] The "test pattern" DNA was built from a synthetic gene (provided by DNA 2.0, Menlo Park, CA, USA) with a 3,072 base-pair segment with all the thymines of one strand in a repeating pattern, ...TNTNNNNNNNNN..., where T represents thymine and N represents any of the other three nucleobases. This pseudo-repeating region was amplified using flanking priming sites and standard polymerase chain reaction methods, with one standard primer and one biotinylated primer. The product was purified by centrifugation filter and bound to Dynabeads. Single-stranded DNA was obtained by denaturation, and the template strand was then used as template in a one-primer, one-cycle polymerization reaction using Bst polymerase standard reaction conditions, replacing 1 µM of dTTP with 1.5 µM CH3-Hg-S-dUTP (Livingston et al., 1976). The DNA product was gel purified. Final concentration and buffer exchange was done on centrifugation filter. The efficacy of label inclusion was tested with restriction enzymes, which were seen not to react with modified recognition sites (Banfalvi & Sarkar, 1995), confirming the presence of modifications at those sites. The DNA was also assayed by inductively coupled plasma mass spectrometry, which confirmed the presence of mercury. Single stranded M13 and primers were processed in the same manner.
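The layout of the test pattern can be sketched in a few lines of Python. The repeat unit and segment length are taken from the paragraph above; treating the segment as an exact tiling of the 12-base unit is an assumption (the text calls the region pseudo-repeating), and the function name is hypothetical.

```python
REPEAT_UNIT = "TNTNNNNNNNNN"   # 12-base unit quoted in the text
SEGMENT_LENGTH = 3072          # length of the patterned segment, in bases


def labeled_positions(unit: str, length: int) -> list[int]:
    """Return 0-based positions of 'T' when `unit` is tiled out to `length` bases."""
    return [i for i in range(length) if unit[i % len(unit)] == "T"]


positions = labeled_positions(REPEAT_UNIT, SEGMENT_LENGTH)
print(len(positions))    # number of label sites in the idealized segment
print(positions[:4])     # first two doublets
```

With an exact tiling, the segment holds 256 repeats and therefore 512 thymine sites, arranged as doublets two positions apart at the start of every 12-base repeat.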
[00370] Mercury-labeled DNA was deposited and aligned on an amorphous carbon film on a 400 mesh Au transmission electron microscopy (TEM) grid using a method similar to that of Bensimon et al. (1994). The sample was vacuum dried for 2 min, then placed immediately into the STEM apparatus. TEM imaging was conducted on an FEI T-12 TEM (FEI Company, Hillsboro, OR) at 80 kV. ADF-STEM imaging was performed by an aberration-corrected STEM, Carl Zeiss Libra 200-80kV (Carl Zeiss, Oberkochen, Germany) with Cs = -1.2 µm, 80 kV with elastic scattering using the in-column energy filter retaining only zero energy-loss electrons.
Results and Discussion
[00371] Double-stranded DNA was prepared that had been completely substituted with mercury labeled nucleotides on one strand, using 5-MeHgS-dUTP. This "Z-dNTP" is labeled on the 5 carbon of the uridines and is known to readily incorporate into DNA (Bridgman & Petersen, 1996), taking the place of the thymines. In this work, we have labeled M13 DNA, a 7,249 base-pair viral genome molecule, and a 3,272 base-pair synthetic molecule, with a visually identifiable "test pattern." Success with both confirmed the efficient incorporation of labels into DNA molecules substantially longer than sequenceable via other technologies.
[00372] Labeled DNA molecules were mounted on a thin supporting substrate, using a method (Bensimon et al., 1995) that separates, linearizes, and may partially stretch individual DNA molecules. FIG. 3B shows a bright-field TEM image of a prepared and linearized DNA molecule on a thin amorphous carbon substrate. A critical distinguishing factor in identifying these molecules is their general morphology. Specifically, at relatively low TEM magnifications (12,000 to 80,000X), the labeled DNA molecules are seen to be 2 nm in width. In separate experiments, the lengths of the molecules match the known lengths of the M13 and test pattern molecules, with an allowance for elongation. These observed widths and lengths correspond to the known dimensions of DNA molecules; no such features are found in control samples that do not include labeled DNA.
[00373] The resulting STEM images were despeckled and thresholded to identify the features with the greatest contrast. Features that match the known morphology of linearized DNA were selected. Features that did not match known DNA morphology were not included in subsequent analysis. A trace was drawn over the centerline of the resulting linear features. The individual features are assessed to be individual mercury atoms, or in the case of the M13 molecules, adjacent mercury atoms. This continuous trace, including dark-field current values for both high contrast features and low contrast gaps, is shown in FIG. 4 as the dark-field current.
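The despeckle-then-threshold idea can be illustrated on a one-dimensional trace. The sketch below is not the processing pipeline used in this work; it uses a simple sliding median as a stand-in for despeckling, and the window size, threshold level, and sample values are all hypothetical.

```python
from statistics import median


def median_filter(trace, window=3):
    """Replace each sample by the median of its neighborhood (edges use a truncated window)."""
    half = window // 2
    return [median(trace[max(0, i - half): i + half + 1]) for i in range(len(trace))]


def threshold(trace, level):
    """Return indices of samples whose value exceeds `level`."""
    return [i for i, value in enumerate(trace) if value > level]


# Hypothetical dark-field current values along a trace (arbitrary units).
raw = [1.0, 1.1, 9.0, 1.0, 1.2, 5.0, 5.2, 5.1, 1.0, 1.1]
smoothed = median_filter(raw)
candidates = threshold(smoothed, level=3.0)
print(candidates)
```

In this toy trace the isolated one-sample spike at index 2 is suppressed by the median step, while the three-sample feature at indices 5 to 7 survives the threshold.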
[00374] In ADF-STEM imaging, individual heavy atoms were clearly visible in the labeled M13 molecules. The heavy atoms create a substantially and statistically higher current in the ADF detector as the electron probe passes through these atoms (FIG. 4). The detection events are mostly distinguishable from background fluctuations, but not perfectly, with a slight overlap between event and nonevent histograms. The test pattern molecule is a synthetic gene (Villalobos et al., 2006), with a sequence specified to include two identifiable patterns. In this molecule, a pair of uridine bases, each with one heavy atom, are separated by exactly one nonlabeled base. Depending on local stretching, the distance between atoms separated by only one unlabeled base pair should be between 0.68 and 1.2 nm (Bensimon et al., 1995) (FIG. 5). This small-scale pattern repeats every 12 base pairs, creating a large-scale pattern. The large-scale pattern should have a period of 4.1 to 7.3 nm. FIG. 6 shows a region of the test pattern molecule. The ADF-STEM intensity is noticeable above the substrate background. In large part, this is due to the in-column energy filter, which selectively eliminates the inelastic scattering contribution, substantially reducing the intensity of substrate background noise. It should be noted, however, that the exact thresholding intensity required varies between individual ADF-STEM images due to variations of the direct current offset on the ADF detector.
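The predicted spacings quoted above follow from the nominal 0.34 nm pitch of a DNA base pair. The sketch below reproduces them under one assumption that is not stated in the text: the maximum stretch factor is inferred from the quoted doublet range (1.2 nm / 0.68 nm). The function and constant names are hypothetical.

```python
BASE_PITCH_NM = 0.34   # nominal pitch of DNA per base pair


def spacing_range(bases_apart, max_stretch):
    """Return (relaxed, fully stretched) distance in nm for labels `bases_apart` positions apart."""
    relaxed = bases_apart * BASE_PITCH_NM
    return relaxed, relaxed * max_stretch


# Stretch factor implied by the quoted doublet range of 0.68 to 1.2 nm (an inference).
MAX_STRETCH = 1.2 / (2 * BASE_PITCH_NM)

doublet = spacing_range(2, MAX_STRETCH)    # T-N-T: labels two positions apart
period = spacing_range(12, MAX_STRETCH)    # one full 12-base repeat
print(f"doublet: {doublet[0]:.2f} to {doublet[1]:.2f} nm")
print(f"period:  {period[0]:.2f} to {period[1]:.2f} nm")
```

The relaxed period of about 4.1 nm matches the quoted lower bound; the stretched value of about 7.2 nm is close to, though not identical with, the quoted 7.3 nm, which suggests the text used a slightly different stretch factor or rounding.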
[00375] With image thresholds and a median filter applied, individual mercury atom labels become distinct due to their scattering intensity and size. Under these imaging conditions, only individual atoms, or clusters of atoms, are capable of producing such high contrast in such a small cross-sectional area. In this case, these features are known to be individual atoms, rather than columns, because the morphology is independent of viewing angle. Moreover, the track of the atoms follows an approximately linear pattern, which corresponds to the known position of the linearized DNA. This allows us to conclude not only that the features represent individual, high-Z atoms, but that those atoms are collocated with the DNA molecule.
[00376] Every heavy atom in this sequence is in a location predicted by the pattern built into the synthetic sequence. While there are missing mercury atoms, there are none where they should not be. More specifically, within this 180 base-pair segment of DNA, the test pattern repeats, in part, 15 times. Of the 30 predicted labels, 17 are present. Fifteen occur in positions predicted by proximity to neighbors on both sides; the other two labels appear in locations predicted by neighbors on only one side. These two also are larger in cross section than the others, probably indicating a tangle that has added to the contrast of the mercury atom, and shortened the distance on one side of the label. All labels are found within the DNA molecule itself. The mercury atom contrast is consistent and statistically above the contrast found either outside the molecule or elsewhere within the molecule.
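The label counts in this paragraph can be checked with simple arithmetic. The sketch below uses only the numbers quoted above (180 base pairs, a 12-base repeat with two labels, 17 observed labels); the variable names are hypothetical.

```python
SEGMENT_BP = 180         # length of the analyzed segment
REPEAT_BP = 12           # length of one test-pattern repeat
LABELS_PER_REPEAT = 2    # one T-N-T doublet per repeat
OBSERVED = 17            # labels actually detected

repeats = SEGMENT_BP // REPEAT_BP
predicted = repeats * LABELS_PER_REPEAT
retention = OBSERVED / predicted
print(repeats, predicted, f"{retention:.0%}")
```

This gives 15 repeats and 30 predicted labels, so the 17 observed labels correspond to a retention of roughly 57 percent.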
[00377] A fraction of the labels are missing due to thermal damage; the samples are heated prior to imaging to prevent adsorbed materials from interfering with STEM imaging. Greater heating has been observed to drive off a higher proportion of the labeling atoms. Nevertheless, the labels that remain, and the corresponding gaps between them, follow precisely the predicted pattern from the synthetic DNA. The smaller scale pattern is expected to have a characteristic distance of 0.7 to 1.2 nm. The larger scale pattern is predicted to have a pitch of between 4.1 and 7.3 nm. In fact, three characteristic modes are observed, around 1 nm, around 4.5 to 7.5 nm, and between 14 and 16 nm. All three of these distances match the test pattern. Where doublets are intact, the mercury atoms are seen to be between 0.7 and 1.1 nm apart (FIG. 6A). These measurements closely match the predicted spacing of the small-scale test pattern. The cluster of spacing between 4.5 and 7.5 nm matches the predicted large-scale pattern. This is more varied than the doublet pattern, likely because these spacings represent a thermal loss of one of the atoms in the doublet; the distances between them depend on which atom of the two was lost. The mode at 14 to 16 nm is very close to twice the large-scale pattern and corresponds to instances in which both atoms of a small-scale pair were lost.
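Why the large-scale spacing is more varied than the doublet spacing can be seen by enumerating which labels might survive. The sketch below assumes labels sit at offsets 0 and 2 of each 12-base repeat (the T-N-T doublet) and lists the possible base separations between one surviving label in a repeat and one in a later repeat; distances are unstretched, and all names are hypothetical.

```python
from itertools import product

BASE_PITCH_NM = 0.34
DOUBLET_OFFSETS = (0, 2)   # label positions within one 12-base repeat (T-N-T)
REPEAT_BP = 12


def separations(repeats_apart):
    """Base separations between one surviving label in a repeat and one in a later repeat."""
    return sorted({repeats_apart * REPEAT_BP + b - a
                   for a, b in product(DOUBLET_OFFSETS, repeat=2)})


for k in (1, 2):
    bases = separations(k)
    relaxed_nm = [round(n * BASE_PITCH_NM, 2) for n in bases]
    print(k, bases, relaxed_nm)
```

For adjacent repeats the separation can be 10, 12, or 14 bases depending on which atom of each doublet was lost, and for repeats two apart it can be 22, 24, or 26 bases. Local stretching then widens each of these into the broader modes reported in the text.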
[00378] The variability within the smaller scale pattern results in less ambiguity in determining the local sequence of the DNA molecule than variability in the larger scale pattern. This suggests a potential limit in the use of this technique for determining local sequence. DNA base pairs have a nominal linear pitch of 0.34 nm. If the distance between labeled bases is less than 1 nm or so, the number of unlabeled bases this distance indicates is likely to be fairly certain, whereas if the distance is greater, the number of unlabeled bases in between could become impractically ambiguous. A higher labeling density might be able to overcome this problem, either using the same label for multiple base types or distinct labels to identify distinct base types. In certain embodiments, using the same label for multiple base types would require parallel experiments to deduce actual sequence. For example, labeling C's and T's in one experiment, then C's and A's in the next, then combining the information to deduce the identity of the bases. However, using different labels for different bases would avoid the issue of parallel experiments. Using different labels for distinct bases allows for differentiating distinct signals, either by number of atoms in the label, or by atomic number of individual labeling atoms.
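The growth of ambiguity with distance can be made concrete. The sketch below lists every base separation that is consistent with a measured label-to-label distance when the local stretch factor is unknown but bounded. The 0.34 nm pitch is from the text; the upper stretch bound of 1.76 is an assumption inferred from the 0.68 to 1.2 nm doublet range, and the function name is hypothetical.

```python
import math

BASE_PITCH_NM = 0.34


def candidate_separations(distance_nm, max_stretch=1.76):
    """Base separations consistent with a distance, for stretch between 1.0 and max_stretch."""
    lowest = math.ceil(distance_nm / (BASE_PITCH_NM * max_stretch) - 1e-9)
    highest = math.floor(distance_nm / BASE_PITCH_NM + 1e-9)
    return list(range(max(lowest, 1), highest + 1))


for d in (1.0, 5.0):
    print(d, candidate_separations(d))
```

A 1.0 nm gap is consistent only with a separation of two positions (one unlabeled base between the labels), whereas a 5.0 nm gap is consistent with six different separations, from 9 to 14 positions.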
[00379] Simultaneous use of multiple labeled bases allows for greater sequence information determination from a DNA molecule. For example, complete labeling of a molecule, with distinct labels for each of the four base types, allows for total sequence information to be extracted. If one of the bases were unlabeled, such that the other three were differentially labeled, the identity of the fourth could be determined by the absence of labels in a given location. However, this could become limiting in certain conditions, such as an extended segment (homopolymer read) section in which multiple unlabeled bases were all in a row. It would be knowable that the region corresponds only to the unlabeled base, but the precise number could be ambiguous. With fewer than three labeled bases, the challenge of interpreting unlabeled regions would become more difficult and require Sanger-style evaluation of multiple molecules representing the same region in order to extract complete sequence information.
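The three-label scheme can be sketched as a lookup in which the absence of a label stands for the fourth base. The mapping of specific heavy atoms to specific bases below is purely hypothetical, as is the idealized per-position readout; in practice the length of an unlabeled run would have to be inferred from a measured distance, which is exactly the homopolymer ambiguity described above.

```python
from itertools import groupby

# Hypothetical mapping from a detected label type to a base; None means "no label seen".
LABEL_TO_BASE = {"Hg": "T", "Se": "C", "I": "A", None: "G"}


def decode(readout):
    """Translate an idealized per-position label readout into a base string."""
    return "".join(LABEL_TO_BASE[label] for label in readout)


def unlabeled_runs(sequence, unlabeled_base="G"):
    """Return (start, length) for each run of the unlabeled base."""
    runs, index = [], 0
    for base, group in groupby(sequence):
        length = len(list(group))
        if base == unlabeled_base:
            runs.append((index, length))
        index += length
    return runs


readout = ["Hg", "I", None, None, None, "Se", "Hg", None]
sequence = decode(readout)
print(sequence)
print(unlabeled_runs(sequence))
```

Runs of length one are unambiguous, while the three-base run starting at position 2 is the kind of region whose exact length would be uncertain when only the end-to-end distance is measured.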
References
(1) ASTA. (2010). Advanced Sequencing Technology Awards 2010. September 1, 2010, National Human Genome Research Institute. Available at genome.gov/27541189.
(2) Banfalvi, G. & Sarkar, N. (1995). Effect of mercury substitution of DNA on its susceptibility to cleavage by restriction endonucleases. DNA Cell Biol 14, 5.
(3) Batson, P.E., Dellby, N. & Krivanek, O.L. (2002). Sub-angstrom resolution using aberration corrected electron optics. Nature 418, 617-620.
(4) Beer, M. & Moudrianakis, E.N. (1962). Determination of base sequence in nucleic acids with the electron microscope: Visibility of a marker. Proc Nat Acad Sci 48, 409-416.
(5) Bensimon, A., Simon, A., Chiffaudel, A., Croquette, V., Heslot, F. & Bensimon, D. (1994). Alignment and sensitive detection of DNA by a moving interface. Science 265, 2096-2098.
(6) Bensimon, D., Simon, A.J., Croquette, V. & Bensimon, A. (1995). Stretching DNA with a receding meniscus: Experiments and models. Phys Rev Lett 74, 4754-4757.
(7) Bridgman, A.J. & Petersen, G.B. (1996). An improved method for the synthesis of mercurated dUTP. J Sequencing and Mapping 6, 199-209.
(8) Crewe, A.V. (1970). Individual atoms photographed. Science News 97, 524.
(9) Crewe, A.V., Wall, J. & Langmore, J. (1970). Visibility of single atoms. Science 168, 1338-1340.
(10) Feynman, R.P. (1959). There is plenty of room at the bottom. In Feynman and Computation, Hey, A.J.G. (Ed.), pp. 63-76. Cambridge, MA: Perseus Press.
(11) Gal-Or, L., Mellema, J.E., Moudrianakis, E.N. & Beer, M. (1967). Electron microscopic study of base sequence in nucleic acids. VII. Cytosine-specific addition of acyl hydrazides. Biochemistry 6(7), 1909-1915.
(12) Jia, C.L., Lentzen, M. & Urban, K. (2003). Atomic resolution imaging of oxygen in perovskite ceramics. Science 299, 870-873.
(13) Livingston, D.C., Dale, R.M.K. & Ward, D.C. (1976). The synthesis and enzymatic polymerization of 5-thio- and 5-methylmercurithio-pyrimidine nucleotides. Biochim Biophys Acta - Nucl Acids Prot Synth 454, 9-20.
(14) Moudrianakis, E.N. & Beer, M. (1965). Base sequence determination in nucleic acids with the electron microscope. III. Chemistry and microscopy of guanine-labelled DNA. Proc Natl Acad Sci 53, 564-581.
(15) Muller, D.A., Fitting Kourkoutis, L., Murfitt, M., Song, J.H., Hwang, H.Y., Silcox, J., Dellby, N. & Krivanek, O.L. (2008). Atomic-scale chemical imaging of composition and bonding by aberration-corrected microscopy. Science 319, 1073.
(16) Ottensmeyer, F.P. (1979). Molecular structure determination by high resolution electron microscopy. Ann Rev Biophys Bioeng 8, 129-144.
(17) Schuster, S.C. (2008). Next-generation sequencing transforms today's biology. Nat Methods 5, 16-18.
(18) Villalobos, A., Ness, J.E., Gustafsson, C., Minshull, J. & Govindarajan, S. (2006). Gene designer: A synthetic biology tool for constructing artificial DNA segments. BMC Bioinformatics 7, 285.
(19) Voyles, P.M., Muller, D.A., Grazul, J.L., Citrin, P.H. & Gossmann, H.-J.L. (2002). Atomic-scale imaging of individual dopant atoms and clusters in highly n-type bulk Si. Nature 416, 826-829.
SYNTHESIS OF EXEMPLARY HEAVY-ATOM LABELED NUCLEOTIDES
Example 1. Selenium-labeled base
Dichlorotetraisopropyldisiloxane (TIPDS)
Figure imgf000133_0001
Pyridine
1.2
1.1
Figure imgf000133_0002
Example 2. Selenium-labeled sugar
1) MsCl/THF/Et3N
2) NaHSe then ICBr3
Figure imgf000133_0003
2.1 2.2 OH SeCB
Figure imgf000134_0001
Example 3. Tellurium labeled base
Figure imgf000134_0002
Figure imgf000134_0003
Example 4. Tellurium labeled sugar
1) MsCl/THF/Et3N
2) NaHTe then ICBr3
Figure imgf000135_0001
4.1 4.2
Figure imgf000135_0002
4.3 4.4
Example 5. Mercury labeled base
Figure imgf000135_0003
Example 6. Iodine labeled base
Figure imgf000136_0001
Figure imgf000136_0002
Figure imgf000137_0001
7.4 7.5
[00380] Reagents and Conditions: a) HC≡C-TMS, Et3N, Pd(0), CuI, DMF, rt, 12 h; b) K2CO3, THF/MeOH; c) 1,4-Diiodobenzene, Et3N, Pd(0), CuI, DMF, rt, 12 h; d) i. (MeO)3P, Proton sponge, POCl3, 0 °C - rt, 3 h; ii. Pyrophosphate, tri-n-butylamine, DMF, 0 °C - rt, 1 h; iii. TEAB, rt, 1 h.
[00381] 5-(4-Iodophenylethynyl)-2'-deoxyuridine: 5-Ethynyl-dU (0.17 g, 0.674 mmol) was dissolved in DMF (5 mL) and maintained under a nitrogen atmosphere. To this solution, NEt3 (0.427 mL, 3.03 mmol), 1,4-diiodobenzene (1.12 g, 3.37 mmol), Pd(Ph3P)4 (78 mg, 0.068 mmol), and CuI (25.7 mg, 0.135 mmol) were added sequentially with stirring under nitrogen. The reaction was continued at rt for 2 h, after which TLC (10% MeOH in DCM) and LCMS (ES+) indicated complete disappearance of starting material. After removing the solvent under reduced pressure, the residue was chromatographed on a silica gel column using a 0 - 20% MeOH gradient in DCM to give pure product (0.178 g, 58%). TLC (10% MeOH in DCM): Rf = 0.68. LCMS (ES+): (M+H) calculated mass: 455.22 and observed mass: 454.67. 1H-NMR (DMSO-d6): δ 11.70 (bs, 1H, 3-NH-), 8.37 (s, 1H, 6-H), 7.75 (d, 2H, Ar-H), 7.23 (d, 2H, Ar-H), 6.09 (t, 1H, 1'-H), 5.44 (d, 1H, 2'-OH), 5.28 (t, 1H, 5'-OH), 4.24 (m, 1H, 4'-H), 3.53 - 3.80 (m, 3H, 3'-H, 5'-H & 5"-H), 2.45 (m, 1H, 2'-H), 2.14 (m, 1H, 2' & 2"-H).
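As a cross-check on the quantities in paragraph [00381], the molar equivalents of each reagent relative to 5-ethynyl-dU can be computed from the stated millimole amounts. The following is a minimal Python sketch; the function name `equivalents` and the dictionary layout are illustrative assumptions and are not part of the described procedure.

```python
def equivalents(limiting_mmol, reagents_mmol):
    """Return molar equivalents of each reagent relative to the limiting reagent."""
    return {name: round(mmol / limiting_mmol, 2) for name, mmol in reagents_mmol.items()}

# Millimole amounts as stated in paragraph [00381]; 5-ethynyl-dU (0.674 mmol) is limiting.
reagents = {
    "NEt3": 3.03,
    "1,4-diiodobenzene": 3.37,
    "Pd(Ph3P)4": 0.068,
    "CuI": 0.135,
}
eq = equivalents(0.674, reagents)
for name, value in eq.items():
    print(f"{name}: {value} equiv")
```

On these figures the coupling uses roughly 4.5 equivalents of base, 5 equivalents of 1,4-diiodobenzene, 10 mol% palladium catalyst, and 20 mol% copper(I) iodide.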
[00382] 5-(4-Iodophenylethynyl)-2'-deoxyuridine-5'-triphosphate: 5-(4-Iodophenylethynyl)-2'-deoxyuridine (150 mg, 0.33 mmol) and proton sponge (108 mg, 0.495 mmol) were dissolved in trimethylphosphate (2 mL), cooled to -10 °C, and maintained under a nitrogen atmosphere. POCl3 (63 μL, 0.66 mmol) was added, and the reaction mixture was brought to rt and stirred for 1 h. A solution of bis-tri-n-butylammonium pyrophosphate (0.784 g, 1.65 mmol) and tri-n-butylamine (0.397 mL, 1.65 mmol) in anhydrous DMF (3 mL) was added at 0 °C. After stirring for 2 h at rt, TEAB buffer (0.5 M, pH 8.5; 15 mL) was added. The reaction was stirred at room temperature for 30 min and lyophilized. The residue was dissolved in water (2 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate product. LCMS (ES-): [M-H] calculated mass was 693. and observed mass was 692.51.
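The reagent amounts in paragraph [00382] correspond to fixed molar ratios relative to the nucleoside (about 1.5 equivalents of proton sponge, 2 of POCl3, and 5 each of pyrophosphate and tri-n-butylamine). The sketch below, in Python, scales those ratios to an arbitrary amount of nucleoside. The ratios are inferred from the stated millimole amounts, and the names `TRIPHOSPHORYLATION_EQUIV` and `scale_reagents` are hypothetical.

```python
# Equivalents inferred from the mmol amounts in paragraph [00382] (an assumption,
# not a ratio stated explicitly in the text).
TRIPHOSPHORYLATION_EQUIV = {
    "proton sponge": 1.5,
    "POCl3": 2.0,
    "bis-tri-n-butylammonium pyrophosphate": 5.0,
    "tri-n-butylamine": 5.0,
}

def scale_reagents(nucleoside_mmol, equiv=TRIPHOSPHORYLATION_EQUIV):
    """Return the mmol of each reagent for the given mmol of nucleoside."""
    return {name: round(nucleoside_mmol * eq, 3) for name, eq in equiv.items()}

# Reproduce the amounts used for 0.33 mmol of nucleoside.
amounts = scale_reagents(0.33)
for name, mmol in amounts.items():
    print(f"{name}: {mmol} mmol")
```

Not every example in this section uses exactly these ratios, so the table of equivalents should be adjusted to the example being scaled.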
[00383] Compounds which may be synthesized following the Example 7 procedure include:
Figure imgf000138_0001
dCTP-Ethyne-Ph
Example 8. Iodine and Bromine labeled base [dUTP-Ethyne- Br^]
Figure imgf000138_0002
[00384] Reagents and Conditions: a) t-BuONO, TMS-N3, ACN, 0 °C - rt; b) CuSO4, Na ascorbate, THF/H2O/t-BuOH 3:1:1; c) (i) POCl3, (MeO)3P, proton sponge, 0 °C - rt, 2 h; (ii) pyrophosphate, tri-n-BuNH, DMF, 0 °C - rt, 1 h; (iii) TEAB, rt, 1 h.
[00385] 1-Azido-2,4,6-tribromo-3,5-diiodobenzene: 2,4,6-Tribromo-3,5-diiodoaniline (272 mg, 0.468 mmol) was suspended in CH3CN (5 mL) and cooled to 0 °C in an ice bath. To this stirred mixture was added t-BuONO (247 μL, 1.87 mmol), followed by TMSN3 (196 μL, 1.40 mmol) dropwise. The resulting suspension was stirred at room temperature for 2 h. The solvent and volatile reagents were removed by rotary evaporation under reduced pressure, and the product was dried under high vacuum. TLC (10% EtOAc in hexanes) indicated that the product (brown solid, 296 mg, 94%) was pure, and it was used without further purification. TLC (10% EtOAc in hexanes): Rf = 0.92. LCMS (ES+): M+H calculated mass 609.62, and observed mass 609.29. 13C-NMR (DMSO-d6): δ 135.86, 135.57, 127.55 and 111.38.
[00386] 5-[1-(2,4,6-Tribromo-3,5-diiodobenzene)-1,2,3-triazol-4-yl]-2'-deoxyuracil or 1-(2-Deoxy-β-D-erythro-pentofuranosyl)-5-[1-(2,4,6-tribromo-3,5-diiodobenzene)-1,2,3-triazol-4-yl]uracil: 5-Ethynyl-dU (0.12 g, 0.476 mmol) and 1-azido-2,4,6-tribromo-3,5-diiodobenzene (0.29 g, 0.476 mmol) were dissolved in THF/H2O/t-BuOH (3:1:1, v/v, 5 mL). To this stirring solution, freshly prepared 1 M sodium ascorbate solution in water (243 μL, 0.238 mmol), followed by CuSO4·5H2O (7.5% in water; 396 μL, 0.119 mmol), were added at rt. The reaction mixture was stirred for 20 h at room temperature. The solvent was evaporated, and the residue was purified by silica gel column chromatography using 2 - 20% MeOH in DCM to give pure product (0.226 g, 55%) as a brown solid. TLC (10% MeOH in DCM): Rf = 0.58. LCMS (ES+): [M+H] calculated mass 859.84, and observed mass 859.23. 1H-NMR (DMSO-d6): δ 11.72 (1H, -NH-), 8.72 (s, 1H, 6-H), 8.66 (s, 1H, Triazole-H), 6.24 (t, 1H, 1'-H), 5.29 (d, 1H, 2'-OH), 5.08 (t, 1H, 5'-OH), 4.29 (m, 1H, 4'-H), 3.86 (m, 1H, 5'-H), 3.58 - 3.64 (m, 2H, 3'-H, 5"-H), 2.20 (m, 2H, 2' & 2"-H). 13C-NMR (DMSO-d6): δ 166.42, 155.04, 146.66, 144.72, 141.95, 139.52, 137.13, 137.08, 128.92, 116.77, 116.78, 109.93, 93.07, 90.22, 76.06, 66.77 and 60.32.
[00387] 5-[1-(2,4,6-Tribromo-3,5-diiodobenzene)-1,2,3-triazol-4-yl]-2'-deoxyuracil-5'-triphosphate (10): 5-[1-(2,4,6-Tribromo-3,5-diiodobenzene)-1,2,3-triazol-4-yl]-2'-deoxyuracil or 1-(2-Deoxy-β-D-erythro-pentofuranosyl)-5-[1-(2,4,6-tribromo-3,5-diiodobenzene)-1,2,3-triazol-4-yl]uracil (75 mg, 0.087 mmol) and proton sponge (27 mg, 0.131 mmol) were dissolved in trimethylphosphate (2 mL), cooled to -10 °C, and maintained under a nitrogen atmosphere. POCl3 (17 μL, 0.175 mmol) was added, and the reaction mixture was brought to rt and stirred for 1 h. A solution of bis-tri-n-butylammonium pyrophosphate (207 mg, 0.436 mmol) and tri-n-butylamine (105 μL, 0.436 mmol) in anhydrous DMF (3.0 mL) was added at 0 °C. After stirring for 2 h at rt, triethylammonium bicarbonate (TEAB) buffer (0.5 M, pH 8.5; 15 mL) was added. The reaction was stirred at room temperature for 30 min and lyophilized. The residue was dissolved in water (3 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate product. LCMS (ES-): [M-H] calculated mass 1097.78, and observed mass 1097.50. 31P-NMR (D2O): δ 8.35, 8.90 and -20.70.
Example 9. Iodine labeled base [dCTP-Ph-I2]
Figure imgf000140_0001
[00388] Reagents and Conditions: a) 1,3,5-triiodobenzene, Et3N, Pd(Ph3P)4, CuI, DMF, rt, 12 h; b) i. (MeO)3P, proton sponge, POCl3, 0 °C - rt, 3 h; ii. pyrophosphate, tri-n-BuNH, DMF, 0 °C - rt, 1 h; iii. TEAB, rt, 1 h.
[00389] 5-(3,5-Diiodophenylethynyl)-2'-deoxycytidine: Ethynyl-dC was synthesized following the procedure of Dodd et al., Org. Biomol. Chem. (2010) 8:663-6665. Ethynyl-dC (0.2 g, 0.796 mmol) was then dissolved in DMF (10 mL) and maintained under a nitrogen atmosphere. To this solution, NEt3 (1.12 mL, 7.96 mmol), 1,3,5-triiodobenzene (1.11 g, 2.39 mmol), Pd(Ph3P)4 (92 mg, 0.080 mmol), and CuI (31 mg, 0.16 mmol) were added sequentially with stirring under nitrogen. The reaction was continued at rt for 2 h, after which TLC (10% MeOH in DCM) and LCMS (ES+) indicated complete disappearance of starting material. After removing the solvent under reduced pressure, the residue was chromatographed on a silica gel column using a 0 - 20% MeOH gradient in DCM to give pure product (0.324 g, 70%). TLC (10% MeOH in DCM): Rf = 0.52. LCMS (ES+): (M+H) calculated mass: 579.14 and observed mass: 579.31. 1H-NMR (DMSO-d6): δ 8.82 (bs, 2H, 4-NH2), 8.38 (s, 1H, 6-H), 8.06 (t, 1H, Ar-H), 8.00 (d, 2H, Ar-H), 6.10 (t, 1H, 1'-H), 5.22 (d, 1H, 2'-OH), 5.13 (t, 1H, 5'-OH), 4.22 (m, 1H, 4'-H), 3.80 (m, 1H, 3'-H), 3.63 - 3.68 (m, 1H, 5'-H), 3.55 - 3.61 (m, 1H, 5"-H), 2.15 - 2.22 (m, 1H, 2'-H), 1.98 - 2.04 (m, 1H, 2"-H).
[00390] 5-(3,5-Diiodophenylethynyl)-2'-deoxycytidine-5'-triphosphate: 5-(3,5-Diiodophenylethynyl)-2'-deoxycytidine (226 mg, 0.39 mmol) and proton sponge (128 mg, 0.585 mmol) were dissolved in trimethylphosphate (6 mL), cooled to -10 °C, and maintained under a nitrogen atmosphere. POCl3 (73 μL, 0.78 mmol) was added, and the reaction mixture was brought to rt and stirred for 1 h. A solution of bis-tri-n-butylammonium pyrophosphate (0.927 g, 1.95 mmol) and tri-n-butylamine (0.47 mL, 1.95 mmol) in anhydrous DMF (3 mL) was added at 0 °C. After stirring for 1.5 h at rt, TEAB buffer (0.5 M, pH 8.5; 30 mL) was added. The reaction was stirred at room temperature for 30 min and lyophilized. The residue was dissolved in water (3 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate. LCMS (ES-): [M-H] calculated mass was 818.08, and observed mass was 817.98.
[00391] Compounds which may be synthesized following the Example 9 procedure include:
Figure imgf000141_0001
Example 10. Te labeled base [dATP-Ethyne-Te-Ph]
Figure imgf000142_0001
[00392] Reagents and Conditions: a) TMSC≡CH, Et3N, Pd(0), CuI, DMF, rt, 12 h; b) K2CO3, THF/MeOH 3:1; c) Ph-Te-Te-Ph, CuI, K2CO3, DMSO, rt, 12 h; d) (i) POCl3, (MeO)3P, proton sponge, 0 °C - rt, 2 h; (ii) pyrophosphate, tri-n-BuNH, DMF, 0 °C - rt, 1 h; (iii) TEAB, rt, 1 h.
[00393] Synthesis of 7-Deaza-7-(phenyltelluro)ethynyl-2'-deoxyadenosine: 7-Deaza-7-ethynyl-2'-dA was synthesized following the procedure of Seela and Zulauf, Synthesis (1996) 726 - 730. A mixture of 7-deaza-7-ethynyl-2'-dA (0.19 g, 0.69 mmol), (PhTe)2 (0.14 g, 0.34 mmol), CuI (13.1 mg, 0.069 mmol), and K2CO3 (0.19 g, 1.37 mmol) in 1.0 mL of commercial grade, undried DMSO was stirred at room temperature for 30 min. TLC (solvent: MeOH/DCM 1:9) and LCMS (ES+) indicated disappearance of most of the starting material and product formation. After stirring was continued at rt overnight, the reaction was quenched with water (1 mL) and lyophilized to dryness. The residue was dissolved in 10% MeOH in DCM and purified on silica gel using a 0 - 20% MeOH gradient in DCM to give pure compound 5 (0.29 g, 64%). TLC (1:9 MeOH/DCM): Rf = 0.74. LCMS (ES+): (M+H) calculated: 478.98 and observed: 478.71. 1H-NMR (DMSO-d6): δ 8.10 (s, 1H, 2-H), 7.77 (m, 3H, Ar-H), 7.27 (m, 2H, Ar-H), 6.44 (m, 2H, 1'-H & 8-H), 5.43 (bs, 1H, 2'-OH), 5.20 (bs, 1H, 5'-OH), 4.32 (bs, 2H, 6-NH2), 3.48 - 3.85 (m, 4H, 3'-H, 4'-H, 5'-H & 5"-H), 2.45 (m, 1H, 2'-H), 2.19 (m, 1H, 2"-H).
[00394] Synthesis of 7-Deaza-7-(phenyltelluro)ethynyl-2'-deoxyadenosine-5'-triphosphate: 7-Deaza-7-(phenyltelluro)ethynyl-2'-dA (60 mg, 0.126 mmol) and proton sponge (41 mg, 0.188 mmol) were dissolved in trimethylphosphate (1 mL), cooled to -10 °C, and maintained under a nitrogen atmosphere. POCl3 (23 μL, 0.251 mmol) was added, and the reaction mixture was brought to rt and stirred for 1 h. A solution of bis-tri-n-butylammonium pyrophosphate (0.45 g, 0.941 mmol) and tri-n-butylamine (0.23 mL, 0.941 mmol) in anhydrous DMF (1.5 mL) was added at 0 °C. After stirring for 1 h at rt, triethylammonium bicarbonate buffer (0.5 M, pH 8.5; 15 mL) was added. The reaction was stirred at room temperature for 1 h and lyophilized. The residue was dissolved in water (2.0 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate. LCMS (ES-): [M-H] calculated mass was 716.92, and observed mass was 716.35.
Example 11. Te labeled base [dUTP-Ethyne-Te-Ph]
Figure imgf000143_0001
[00395] Reagents and Conditions: a) TMSC≡CH, Et3N, Pd(0), CuI, DMF, rt, 12 h; b) K2CO3, THF/MeOH 3:1; c) Ph-Te-Te-Ph, CuI, K2CO3, DMSO, rt, 12 h; d) (i) POCl3, (MeO)3P, proton sponge, 0 °C - rt, 2 h; (ii) pyrophosphate, tri-n-BuNH, DMF, 0 °C - rt, 1 h; (iii) TEAB, rt, 1 h.
[00396] 5-(Phenyltelluro)ethynyl-2'-deoxyuridine: 5-Ethynyl-2'-dU was synthesized following the procedure of Yu, Synlett (2000) 86-88. A mixture of 5-ethynyl-2'-dU (50 mg, 0.17 mmol), (PhTe)2 (42 mg, 0.1 mmol), CuI (4 mg, 0.02 mmol), and K2CO3 (55 mg, 0.40 mmol) in 1.0 mL of commercial grade, undried DMSO was stirred at room temperature for 30 min. TLC (solvent: MeOH/DCM 1:9) and LCMS (ES+) indicated disappearance of most of the starting material and product formation. After stirring was continued at rt overnight, the reaction was quenched with water (1 mL) and lyophilized to dryness. The residue was dissolved in 10% MeOH in DCM and purified on silica gel using a 0 - 20% MeOH gradient in DCM to give pure compound (0.38 g, 42%). TLC (10% MeOH in DCM): Rf = 0.63. LCMS (ES+): (M+H) calculated: 456.93 and observed: 456.82. 1H-NMR (DMSO-d6): δ 8.26 (s, 1H, 6-H), 7.78 (m, 2H, Ar-H), 7.28 (m, 3H, Ar-H), 6.11 (t, 1H, 1'-H), 5.29 (bs, 1H, 2'-OH), 5.14 (bs, 1H, 5'-OH), 4.24 (bs, 1H, 3-NH-), 3.80 (m, 1H, 4'-H), 3.54 - 3.64 (m, 2H, 3'-H, 5'-H), 3.29 (m, 1H, 5"-H), 2.14 (m, 2H, 2' & 2"-H).
[00397] 5-(Phenyltelluro)ethynyl-2'-deoxyuridine-5'-triphosphate: 5-(Phenyltelluro)ethynyl-2'-dU (26 mg, 0.057 mmol) and proton sponge (18.7 mg, 0.086 mmol) were dissolved in trimethylphosphate (1 mL), cooled to -10 °C, and maintained under a nitrogen atmosphere. POCl3 (11 μL, 0.115 mmol) was added, and the reaction mixture was brought to rt and stirred for 1 h. A solution of bis-tri-n-butylammonium pyrophosphate (135 mg, 0.285 mmol) and tri-n-butylamine (69 μL, 0.285 mmol) in anhydrous DMF (1.0 mL) was added at 0 °C. After stirring for 1 h at rt, triethylammonium bicarbonate (TEAB) buffer (0.5 M, pH 8.5; 10 mL) was added. The reaction was stirred at room temperature for 30 min and lyophilized. The residue was dissolved in water (2 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate product. LCMS (ES+): [M+H] calculated mass was 696.87, [M-sugar] calculated mass was 338.80, and observed mass was 338.21.
Example 12. SHgMe labeled base [dCTP-SHgMe]
Figure imgf000145_0001
[00398] Reagents and Conditions: a) propargyl S-benzoate, Et3N, Pd(PPh3)4, CuI, DMF, rt, 12 h; b) (i) POCl3, (MeO)3P, proton sponge, 0 °C - rt, 2 h; (ii) pyrophosphate, tri-n-BuNH, DMF, 0 °C - rt, 1 h; (iii) TEAB, rt, 1 h; c) NH4OH, rt, 2 h; d) (i) TCEP, H2O, 30 min; (ii) MeHgOH, H2O, rt, 1 h.
[00399] 5-(Propargylthiobenzoate)-2'-deoxycytidine: 5-Iodo-dC (0.25 g, 0.708 mmol) was suspended in DMF (5 mL) and maintained under a nitrogen atmosphere. To this solution, NEt3 (0.45 mL, 3.2 mmol), propargyl S-thiobenzoate (0.374 g, 2.12 mmol), Pd(Ph3P)4 (82 mg, 0.071 mmol), and CuI (27 mg, 0.142 mmol) were added sequentially with stirring under nitrogen. The reaction mixture was stirred at rt for 12 h. TLC (10% MeOH in DCM) and LCMS (ES+) indicated complete disappearance of starting material. After removing the solvent under reduced pressure, the residue was chromatographed on a silica gel column using a 0 - 20% MeOH gradient in DCM to give pure product (0.14 g, 49%). TLC (10% MeOH in DCM): Rf = 0.45. LCMS (ES+): (M+H) calculated mass: 402.45 and observed mass: 402.14. 1H-NMR (CD3OD): δ 8.08 (bs, 2H, 4-NH2), 8.31 (s, 1H, 6-H), 7.97 (dd, 2H, Ar-H), 7.66 (m, 1H, Ar-H), 7.54 (m, 2H, Ar-H), 6.19 (t, 1H, 1'-H), 4.89 (bs, 1H, 2'-OH), 4.59 (bs, 1H, 5'-OH), 4.35 (m, 1H, 4'-H), 3.94 (m, 1H, 3'-H), 3.70 - 3.83 (m, 2H, 5'&5"-H), 2.38 (m, 1H, 2'-H), 2.13 (m, 1H, 2"-H).
[00400] 5-(Propargylthiobenzoate)-2'-deoxycytidine-5'-triphosphate: 5-(Propargylthiobenzoate)-2'-deoxycytidine (135 mg, 0.336 mmol) and proton sponge (108 mg, 0.504 mmol) were dissolved in trimethylphosphate (2 mL), cooled to -10 °C, and maintained under a nitrogen atmosphere. POCl3 (63 μL, 0.673 mmol) was added, and the reaction mixture was brought to rt and stirred for 1 h. A solution of bis-tri-n-butylammonium pyrophosphate (0.799 g, 1.68 mmol) and tri-n-butylamine (0.405 mL, 1.68 mmol) in anhydrous DMF (3 mL) was added at 0 °C. After stirring for 1 h at rt, TEAB buffer (0.5 M, pH 8.5; 15 mL) was added. The reaction was stirred at room temperature for 30 min and lyophilized. The residue was dissolved in water (3 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate product (16.4 mg, 7.6%) as a white solid. LCMS (ES+): [M+H] calculated mass was 642.39, and observed mass was 642.34.
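The isolated yield reported in paragraph [00400] can be checked against the stated amounts. The Python sketch below is illustrative only: the helper `percent_yield` is hypothetical, and the product molar mass is assumed to be the stated [M+H] calculated mass minus 1.008, that is, the free-acid form rather than a triethylammonium salt.

```python
def percent_yield(isolated_mg, substrate_mmol, product_molar_mass):
    """Percent yield from isolated mass (mg), limiting substrate (mmol), and product molar mass (g/mol)."""
    theoretical_mg = substrate_mmol * product_molar_mass  # mmol * g/mol = mg
    return 100.0 * isolated_mg / theoretical_mg

# Assumption: molar mass of the free acid = [M+H] calculated mass (642.39) - 1.008.
product_molar_mass = 642.39 - 1.008
y = percent_yield(isolated_mg=16.4, substrate_mmol=0.336, product_molar_mass=product_molar_mass)
print(f"{y:.1f}%")
```

Under that assumption the computed value rounds to the 7.6% stated in the text.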
[00401] 5-(Propargyl S-thiomethylmercury)-2'-deoxycytidine-5'-triphosphate: 5-(Propargylthiobenzoate)-2'-deoxycytidine-5'-triphosphate (8.5 mg, 0.013 mmol) was dissolved in ammonium hydroxide 28 - 30% (5 mL); the vial was capped tightly and agitated occasionally for 1 h at room temperature. LCMS (ES+) indicated complete disappearance of starting material. Ammonia was removed under vacuum, and the solution was lyophilized. The resulting solid was dissolved in water (1 mL), and TCEP (8.2, 0.028 mmol) was added and agitated for 30 min at room temperature. To this solution, methylmercury(II) hydroxide (1 M, 60 μL, 56 μmol) was added, and the mixture was agitated occasionally for 1 h at room temperature. The solution was purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate product. LCMS (ES-): [M-H] calculated mass was 750.89, and observed mass was 750.83.
Example 13. Iodine labeled base [dUTP-Ethyne-PhI3]
Figure imgf000146_0001
Figure imgf000147_0001
Compounds shown: 13.4 and 13.5.
[00402] Reagents and Conditions: a) HC≡C-TMS, Et3N, Pd(0), CuI, DMF, rt, 12 h; b) K2CO3, THF/MeOH; c) 1,2,4,5-tetraiodobenzene, Et3N, Pd(Ph3P)4, CuI, DMF, rt, 12 h; d) i. (MeO)3P, proton sponge, POCl3, 0 °C - rt, 3 h; ii. pyrophosphate, tri-n-BuNH, DMF, 0 °C - rt, 1 h; iii. TEAB, rt, 1 h.
[00403] 5-(2,4,5-Triiodophenylethynyl)-2'-deoxyuridine: 5-Ethynyl-dU was synthesized following the procedure of Yu, Synlett 2000, 86-88. 1,2,4,5-Tetraiodobenzene may be synthesized following the procedure of Mattern, J. Org. Chem., 1983, 48, 4773-4774.
[00404] 5-Ethynyl-dU (0.1 g, 0.396 mmol) was suspended in DMF (10 mL) and maintained under a nitrogen atmosphere. To this solution, NEt3 (0.558 mL, 3.96 mmol), 1,2,4,5-tetraiodobenzene (0.941 g, 1.59 mmol), Pd(Ph3P)4 (46 mg, 0.04 mmol), and CuI (16 mg, 0.08 mmol) were added sequentially with stirring under nitrogen. The reaction mixture was heated to 100 °C, and the solution was stirred for 2 h. TLC (10% MeOH in DCM) and LCMS (ES+) indicated complete disappearance of starting material. The reaction mixture was cooled to rt and, after removing the solvent under reduced pressure, the residue was chromatographed on a silica gel column using a 0 - 20% MeOH gradient in DCM to give pure product (0.106 g, 38%). TLC (10% MeOH in DCM): Rf = 0.68. LCMS (ES+): (M+H) calculated mass: 707.02 and observed mass: 706.36. 1H-NMR (DMSO-d6): δ 11.71 (bs, 1H, 3-NH-), 8.44 (s, 1H, 6-H), 8.37 (s, 1H, Ar-H), 7.90 (s, 1H, Ar-H), 6.11 (t, 1H, 1'-H), 5.24 (d, 1H, 2'-OH), 5.07 (t, 1H, 5'-OH), 4.24 (m, 1H, 4'-H), 3.81 (m, 1H, 3'-H), 3.55 - 3.67 (m, 2H, 5'&5"-H), 2.17 - 2.22 (m, 2H, 2'&2"-H).
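The calculated (M+H) mass quoted in paragraph [00404] can be reproduced from a molecular formula. The Python sketch below assumes the formula C17H13I3N2O5, which is derived here from the compound name and is not stated in the text, and it uses standard average atomic masses. The function `average_mass` is a hypothetical helper that handles only simple formulas without parentheses.

```python
import re

# Standard average atomic masses in g/mol for the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "I": 126.904}

def average_mass(formula):
    """Average molecular mass of a simple formula such as 'C17H13I3N2O5'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total

# Assumed formula for 5-(2,4,5-triiodophenylethynyl)-2'-deoxyuridine.
m = average_mass("C17H13I3N2O5")
mh = m + ATOMIC_MASS["H"]  # approximate (M+H) by adding one average hydrogen mass
print(f"M = {m:.2f}, M+H = {mh:.2f}")
```

With these inputs the (M+H) value agrees with the calculated mass of 707.02 given in the text. The same molar mass, combined with the 0.396 mmol of 5-ethynyl-dU used, is consistent with the reported 38% yield for 0.106 g of product.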
[00405] 5-(2,4,5-Triiodophenylethynyl)-2'-deoxyuridine-5'-triphosphate: 5-(2,4,5-Triiodophenylethynyl)-2'-deoxyuridine (90 mg, 0.127 mmol) and proton sponge (42 mg, 0.191 mmol) were dissolved in trimethylphosphate (3 mL), cooled to 0 °C, and maintained under a nitrogen atmosphere. POCl3 (24 μL, 0.255 mmol) was added, and the reaction mixture was brought to rt and stirred for 2 h. The reaction became clear, and LCMS (ES-) indicated monophosphate formation. A solution of bis-tri-n-butylammonium pyrophosphate (0.303 g, 0.637 mmol) and tri-n-butylamine (0.153 mL, 0.637 mmol) in anhydrous DMF (3 mL) was added at 0 °C. After stirring at rt for 2 h, TEAB buffer (0.5 M, pH 8.5; 30 mL) was added. The reaction was stirred at room temperature for 30 min and lyophilized. The residue was dissolved in water (3 mL), filtered, and purified on RP-HPLC using 50 mM TEAB buffer pH 8.5 and acetonitrile to yield the triphosphate product. LCMS (ES-): [M-H] calculated mass was 944.96, and observed mass was 944.29.
Example 14. Se labeled base
Figure imgf000148_0001
[00406] Reagents and Conditions: a) 1,3-Dichloro-1,1,3,3-tetraisopropyldisiloxane, DMAP; b) POCl3, with heat; c) NaHSe, DMF, with heat; d) TBAF; e) i) (MeO)3P=O/POCl3, proton sponge, ii) tetrabutylammonium pyrophosphate.
OTHER EMBODIMENTS
[00407] In the claims articles such as "a," "an," and "the" may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include "or" between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process.
[00408] Furthermore, the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, and descriptive terms from one or more of the listed claims is introduced into another claim. For example, any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim. Where elements are presented as lists, e.g., in Markush group format, each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements and/or features, certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements and/or features. For purposes of simplicity, those embodiments have not been specifically set forth in haec verba herein. It is also noted that the terms "comprising" and "containing" are intended to be open and permit the inclusion of additional elements or steps. Where ranges are given, endpoints are included. Furthermore, unless otherwise indicated or otherwise evident from the context and understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value or sub-range within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise.
[00409] This application refers to various issued patents, published patent applications, journal articles, and other publications, all of which are incorporated herein by reference. If there is a conflict between any of the incorporated references and the instant specification, the specification shall control. In addition, any particular embodiment of the present invention that falls within the prior art may be explicitly excluded from any one or more of the claims. Because such embodiments are deemed to be known to one of ordinary skill in the art, they may be excluded even if the exclusion is not set forth explicitly herein. Any particular embodiment of the invention can be excluded from any claim, for any reason, whether or not related to the existence of prior art.
[00410] Those skilled in the art will recognize or be able to ascertain using no more than routine experimentation many equivalents to the specific embodiments described herein. The scope of the present embodiments described herein is not intended to be limited to the above Description, but rather is as set forth in the appended claims. Those of ordinary skill in the art will appreciate that various changes and modifications to this description may be made without departing from the spirit or scope of the present invention, as defined in the following claims.

Claims

CLAIMS What is claimed is:
1. A heavy-atom labeled nucleotide of Formula (I):
Figure imgf000151_0001
or a salt thereof;
wherein:
each instance of G1 is independently -O-, -S-, -Se-, -CH2-, or -NH-;
each instance of G2 is independently hydrogen, halogen, -ORA, -SRA, -N(RA)2, -SHg, -SO2SHg, -SeRD or -TeRD;
each instance of RA is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RA groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of M1 is independently -O-, -S-, -NH-, -Se-, or -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen;
each instance of G3 is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-;
wherein each instance of M2 is independently -O-, -S-, or -Se-; and
each instance of Base is independently:
Figure imgf000152_0001
Figure imgf000152_0002
Figure imgf000152_0003
Figure imgf000152_0004
Figure imgf000152_0005
wherein:
each instance of R1, R2, R4, and R5 is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, wherein each instance of RB is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen group, or a sulfur protecting group when attached to a sulfur group; or R1 and R2 and/or R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of R3 is independently substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SeRD, or -TeRD, wherein each instance of RC is hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RC groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of L1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20 alkylene, substituted or unsubstituted C2-20 alkenylene, substituted or unsubstituted C2-20 alkynylene, substituted or unsubstituted heteroC1-20 alkylene, substituted or unsubstituted heteroC2-20 alkenylene, substituted or unsubstituted heteroC2-20 alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
each instance of RD is independently hydrogen, substituted or unsubstituted C1-20 alkyl, substituted or unsubstituted C2-20 alkenyl, substituted or unsubstituted C2-20 alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
each instance of M3 and M4 are independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2;
provided that the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury; and
further provided that the following compounds are specifically excluded:
Figure imgf000154_0001
and salts thereof.
2. The heavy-atom labeled nucleotide of claim 1 having the following stereochemistry:
G3-C Base
Figure imgf000154_0002
or the enantiomer thereof and/or salt thereof.
3. The heavy-atom labeled nucleotide of claim 1, wherein the nucleotide is a quaternary amine, an amino acid, or a metal salt.
4. The heavy-atom labeled nucleotide of claim 1, comprising at least one instance of -SeRD or -TeRD, wherein RD is substituted or unsubstituted C1-20 haloalkyl.
5. The heavy-atom labeled nucleotide of claim 1, comprising at least one instance of -SHg or -SO2SHg.
6. The heavy-atom labeled nucleotide of claim 1, comprising at least one instance of -Br or -I.
7. The heavy-atom labeled nucleotide of claim 1, wherein the base comprises a heavy atom.
8. The heavy-atom labeled nucleotide of claim 1, wherein the base does not comprise a heavy atom.
9. The heavy-atom labeled nucleotide of claim 1, wherein G2 is hydrogen.
10. The heavy-atom labeled nucleotide of claim 1, wherein G2 is -SHg or -SO2SHg.
11. The heavy-atom labeled nucleotide of claim 1, wherein G2 is -SeRD or -TeRD, wherein RD is substituted or unsubstituted C1-20 haloalkyl.
12. The heavy-atom labeled nucleotide of claim 1, wherein G3 is hydrogen or a triphosphate group.
13. The heavy-atom labeled nucleotide of claim 1, wherein the base is selected from the group consisting of:
Figure imgf000155_0001
wherein the base comprises a heavy atom.
14. The heavy-atom labeled nucleotide of claim 13, wherein L1 is a linking moiety selected from the group consisting of substituted and unsubstituted alkylene; substituted and unsubstituted alkenylene; substituted and unsubstituted alkynylene.
15. The heavy-atom labeled nucleotide of claim 13, wherein L1 represents a linker consisting of a combination of one or more consecutive covalently bonded groups of the formula:
Figure imgf000156_0001
wherein:
each instance of m is independently an integer between 1 to 10, inclusive;
each instance of p is independently an integer between 1 to 4, inclusive;
each instance of RW1 is independently hydrogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or a nitrogen protecting group;
each instance of RW2 is independently hydrogen; halogen; substituted or unsubstituted alkyl; substituted or unsubstituted alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted carbocyclyl; substituted or unsubstituted heterocyclyl; substituted or unsubstituted aryl; substituted or unsubstituted heteroaryl; or two RW2 groups are joined to form a substituted or unsubstituted 5- to 6-membered ring.
16. The heavy-atom labeled nucleotide of claim 15, wherein L1-R3 is a group of formula:
Figure imgf000156_0002
17. The heavy-atom labeled nucleotide of claim 13, wherein M3 is O.
18. The heavy-atom labeled nucleotide of claim 13, wherein M4 is O.
19. The heavy-atom labeled nucleotide of claim 13, wherein R1 and R2 are hydrogen.
20. The heavy-atom labeled nucleotide of claim 13, wherein R4 and R5 are hydrogen.
21. The heavy-atom labeled nucleotide of claim 13, wherein R3 is -Br, -I, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SeRD, or -TeRD.
22. The heavy-atom labeled nucleotide of claim 1 selected from the group consisting of:
Figure imgf000157_0001
and salts thereof.
23. A heavy-atom labeled nucleic acid polymer of Formula (II):
Figure imgf000158_0001
(II)
or a salt thereof;
wherein:
each instance of G1 is independently -O-, -S-, -Se-, -CH2-, or -NH-;
each instance of G2 is independently hydrogen, halogen, -ORA, -SRA, -N(RA)2, -SHg, -SO2SHg, -SeRD, or -TeRD;
each instance of RA is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RA groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of M1 is independently -O-, -S-, -NH-, -Se-, or -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen;
each instance of G3 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-
wherein each instance of M2 is independently -O-, -S-, or -Se-; and
each instance of Base is independently:
Figure imgf000159_0001
Figure imgf000159_0002
Figure imgf000159_0003
Figure imgf000159_0004
Figure imgf000159_0005
wherein:
each instance of R1, R2, R4, and R5 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, wherein each instance of RB is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen group, or a sulfur protecting group when attached to a sulfur group; or R1 and R2 and/or R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of R3 is independently substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SeRD, or -TeRD, wherein each instance of RC is hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RC groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of L1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20alkylene, substituted or unsubstituted C2-20alkenylene, substituted or unsubstituted C2-20alkynylene, substituted or unsubstituted heteroC1-20alkylene, substituted or unsubstituted heteroC2-20alkenylene, substituted or unsubstituted heteroC2-20alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
each instance of RD is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;
each instance of M3 and M4 is independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2; and
n is 1 to 25,000, inclusive;
provided that the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury; and
further provided that the compounds of Formula (II) comprising one or more instances of the formula:
Figure imgf000161_0001
24. The heavy-atom labeled nucleic acid polymer of claim 23 comprising one or more instances of formula:
Figure imgf000162_0002
Figure imgf000162_0003
25. A method of determining the sequence of a nucleic acid polymer comprising forming a complementary strand of the nucleic acid polymer from one or more atom labeled compounds of Formula (I):
Figure imgf000162_0004
or a salt thereof; wherein:
each instance of G1 is independently -O-, -S-, -Se-, -CH2-, or -NH-;
each instance of G2 is independently hydrogen, halogen, -ORA, -SRA, -N(RA)2, -SHg, -SO2SHg, -SeRD, or -TeRD;
each instance of RA is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RA groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of M1 is independently -O-, -S-, -NH-, -Se-, or -C(RM)2-, wherein each instance of RM is independently hydrogen or halogen;
each instance of G3 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, or a monophosphate, diphosphate, or triphosphate of formula:
HO-P(=O)(M2-H)-, HO-P(=O)(OH)-O-P(=O)(M2-H)-, or HO-P(=O)(OH)-O-P(=O)(OH)-O-P(=O)(M2-H)-
wherein each instance of M2 is independently -O-, -S-, or -Se-; and
each instance of Base is independently:
Figure imgf000163_0001
Adenine Guanine Cytosine Uracil Thymine or an analog thereof selected from the group consisting of:
Figure imgf000164_0001
Figure imgf000164_0002
Figure imgf000164_0003
Figure imgf000164_0004
Figure imgf000164_0005
wherein:
each instance of R1, R2, R4, and R5 is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, a nitrogen protecting group, -ORB, or -SRB, wherein each instance of RB is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen group, or a sulfur protecting group when attached to a sulfur group; or R1 and R2 and/or R4 and R5 are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of R3 is independently substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, halogen, -ORC, -SRC, -N(RC)2, -SHg, -SO2SHg, -SHgRD, -SeRD, or -TeRD, wherein each instance of RC is hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl, an oxygen protecting group when attached to an oxygen atom, a sulfur protecting group when attached to a sulfur atom, a nitrogen protecting group when attached to a nitrogen atom; or two RC groups are joined to form a substituted or unsubstituted heterocyclic or substituted or unsubstituted heteroaryl ring;
each instance of L1 is independently absent or a linking moiety selected from the group consisting of substituted or unsubstituted C1-20alkylene, substituted or unsubstituted C2-20alkenylene, substituted or unsubstituted C2-20alkynylene, substituted or unsubstituted heteroC1-20alkylene, substituted or unsubstituted heteroC2-20alkenylene, substituted or unsubstituted heteroC2-20alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a combination thereof;
each instance of RD is independently hydrogen, substituted or unsubstituted C1-20alkyl, substituted or unsubstituted C2-20alkenyl, substituted or unsubstituted C2-20alkynyl, substituted or unsubstituted carbocyclyl, substituted or unsubstituted heterocyclyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
each instance of M3 and M4 is independently O, Se, Te, CH2, CF2, CCl2, CBr2, or CI2; provided that the compound comprises at least one instance of a heavy atom selected from the group consisting of bromine, iodine, selenium, tellurium, or mercury;
and identifying a sequence of nucleotides in the nucleic acid polymer and/or in the complementary strand using a particle beam.
26. The method of claim 25, wherein the nucleic acid polymer is DNA or RNA.
27. The method of claim 25, wherein the complementary strand is DNA or RNA.
28. The method of claim 25, wherein the nucleic acid polymer and/or its complementary strand is formed by a nucleic acid polymerase enzyme.
29. The method of claim 25, wherein the complementary strand of the nucleic acid polymer is formed using polymerase chain reaction (PCR).
30. The method of claim 25, wherein the nucleotides of the nucleic acid polymer and/or the complementary strand are modified to include labels comprising one or more heavy-atom labeled compounds of Formula (I).
31. The method of claim 30, wherein at least two types of nucleotides are labeled with the same type of heavy-atom label.
32. The method of claim 30, wherein one type of nucleotide is labeled.
33. The method of claim 30, wherein two types of nucleotides are labeled.
34. The method of claim 30, wherein three types of nucleotides are labeled.
35. The method of claim 30, wherein all the nucleotides are labeled.
36. The method of claim 25, wherein nucleotide specific labels are incorporated in the nucleic acid polymer and/or the complementary strand during formation of the nucleic acid polymer and/or the complementary strand.
37. The method of claim 25, wherein the step of identifying a sequence of nucleotides comprises generating a particle beam, exposing the nucleic acid polymer and/or the complementary strand to the particle beam, and identifying the nucleotides due to characteristic changes to the particle beam.
38. The method of claim 37, wherein the step of identifying the nucleotides comprises detecting characteristic changes to the particle beam.
39. The method of claim 25, 37 or 38, wherein the particle beam is a lepton beam.
40. The method of claim 39, wherein the lepton beam is an electron beam.
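
As an informal illustration of the method of claims 25 and 30 to 38, and not part of the claims, the following minimal sketch shows how an idealized, noise-free readout of heavy-atom labels along a complementary strand could be mapped back to the template sequence. Everything in it is an assumption made for illustration: the function names, the particular assignment of Hg, I, and Br to base types, and the use of atomic number as a stand-in for the "characteristic changes to the particle beam". The atomic numbers themselves are standard values.

```python
# Illustrative sketch only: label assignments and the ideal detector are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Standard atomic numbers of the heavy atoms named in the claims.
ATOMIC_NUMBER = {"Se": 34, "Br": 35, "Te": 52, "I": 53, "Hg": 80}

# Hypothetical scheme: three nucleotide types labeled, thymine left unlabeled.
LABELS = {"A": "Hg", "C": "I", "G": "Br"}


def complementary_strand(template):
    """Return the base-paired complement, position by position (no reversal)."""
    return "".join(COMPLEMENT[base] for base in template)


def label_signal(strand, labels):
    """Idealized readout: atomic number of the label at each position, 0 if unlabeled."""
    return [ATOMIC_NUMBER[labels[base]] if base in labels else 0 for base in strand]


def recover_template(signal, labels, unlabeled_base):
    """Call each complementary-strand base from its signal, then pair back to the template."""
    base_by_z = {ATOMIC_NUMBER[atom]: base for base, atom in labels.items()}
    called = "".join(base_by_z.get(z, unlabeled_base) for z in signal)
    return "".join(COMPLEMENT[base] for base in called)


template = "GATTACA"
signal = label_signal(complementary_strand(template), LABELS)
print(signal)
print(recover_template(signal, LABELS, unlabeled_base="T"))
```

The sketch relies on each labeled nucleotide type carrying a distinct heavy atom and on at most one type being unlabeled. If two types share the same label, as in claim 31, a single readout of this kind cannot distinguish them, and some additional information would be needed.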
PCT/US2013/070299 2012-11-16 2013-11-15 Heavy atom labeled nucleosides, nucleotides, and nucleic acid polymers, and uses thereof WO2014078652A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261727589P 2012-11-16 2012-11-16
US61/727,589 2012-11-16

Publications (1)

Publication Number Publication Date
WO2014078652A1 true WO2014078652A1 (en) 2014-05-22

Family

ID=49674403

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/070299 WO2014078652A1 (en) 2012-11-16 2013-11-15 Heavy atom labeled nucleosides, nucleotides, and nucleic acid polymers, and uses thereof

Country Status (1)

Country Link
WO (1) WO2014078652A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104630346A (en) * 2015-01-05 2015-05-20 东南大学 Sequencing method of single DNA molecule based on high-resolution transmission electron microscopy
WO2017075179A1 (en) * 2015-10-27 2017-05-04 Zs Genetics, Inc. Sequencing by deconvolution

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13796214

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13796214

Country of ref document: EP

Kind code of ref document: A1