WO2011107887A2 - Methods for replicating polynucleotides with secondary structure - Google Patents

Methods for replicating polynucleotides with secondary structure

Info

Publication number
WO2011107887A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
polynucleotide
region
sequence
stranded
Prior art date
Application number
PCT/IB2011/000830
Other languages
French (fr)
Other versions
WO2011107887A3 (en)
Inventor
Sydney Brenner
Robert Osborne
Andrew Slatter
Original Assignee
Population Genetic Technologies Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Population Genetic Technologies Ltd.
Publication of WO2011107887A2
Publication of WO2011107887A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6848: Nucleic acid amplification reactions characterised by the means for preventing contamination or increasing the specificity or sensitivity of an amplification reaction
    • C12Q1/6853: Nucleic acid amplification reactions using modified primers or templates

Definitions

  • Genomic analysis often includes the replication of large polynucleotide sequences. This analysis can be complicated when polynucleotides, e.g., single stranded genomic polynucleotides, have stable secondary structures that hamper replication/amplification reactions.
  • polynucleotide sequences having secondary structure(s) that reduce or inhibit the replication process.
  • Compositions, kits and systems that find use in carrying out the replication processes described herein are also provided.
  • Figure 1 An exemplary schematic showing one embodiment of replicating a polynucleotide having stable secondary structure using multiple sequence-specific synthesis primers.
  • Figure 2 Schematics of exemplary starting structures used to produce a replication adapter.
  • Figure 3 Schematic showing examples of how to employ the structures in Figure 2 to produce a replication adapter.
  • Figure 4 Schematic showing exemplary methods for employing a replication adapter to amplify/replicate a region(s) of interest from a target polynucleotide (in this case a viral genomic RNA).
  • Figures 5 and 6 Schematics showing exemplary steps for processing and/or analyzing the amplified/replicated polynucleotides produced using replication adapters (as depicted in Figure 4).
  • Amplicon means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides, usually double stranded, that are replicated from one or more starting sequences. The one or more starting sequences may be one or more copies of the same sequence, or they may be a mixture of different sequences. Amplicons may be produced by a variety of amplification reactions whose products are multiple replicates of one or more target nucleic acids. Generally, amplification reactions producing amplicons are “template-driven” in that the reactants, either nucleotides or oligonucleotides, base pair with complements in a template polynucleotide, and this base pairing is required for the creation of reaction products. In one aspect, template-driven reactions are primer
  • Examples include, but are not limited to, polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplifications (NASBAs), rolling circle amplifications, and the like, disclosed in the following references that are
  • amplicons of the invention are produced by PCRs.
  • An amplification reaction may be a "real-time"
  • reaction mixture means a solution containing all the necessary reactants for performing a reaction, which may include, but not be limited to, buffering agents to maintain pH at a selected level during a reaction, salts, co-factors, scavengers, and the like.
  • assessing includes any form of measurement, and includes determining if an element is present or not.
  • the terms “determining”, “measuring”, “evaluating”, “assessing” and “assaying” are used interchangeably and include both quantitative and qualitative determinations. Assessing may be relative or absolute. “Assessing the presence of” includes determining the amount of something present, and/or determining whether it is present or absent.
  • “Complementary” or “substantially complementary” refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid.
  • Complementary nucleotides are, generally, A and T (or A and U), or C and G.
  • Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%.
  • RNA or DNA strand will hybridize under selective hybridization conditions to its complement.
  • selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary. See, M. Kanehisa Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
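The percentage thresholds in the definition of “substantially complementary” above can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the procedure of this document: the two strands are compared ungapped and at equal length (the definition also allows insertions and deletions after optimal alignment), only A/T (or A/U) and C/G pairs are counted, and the function names `fraction_complementary` and `is_substantially_complementary` are hypothetical.

```python
# Watson-Crick partners for DNA; RNA input is handled by mapping U to T first.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def fraction_complementary(strand_a: str, strand_b: str) -> float:
    """Return the fraction of positions in strand_a that pair with strand_b.

    Both strands are written 5'->3', so strand_b is reversed to align it
    antiparallel to strand_a. Assumption: ungapped, equal-length comparison.
    """
    a = strand_a.upper().replace("U", "T")
    b = strand_b.upper().replace("U", "T")[::-1]
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if PAIRS.get(x) == y)
    return matches / len(a)


def is_substantially_complementary(strand_a: str, strand_b: str,
                                   threshold: float = 0.80) -> bool:
    """Apply the 'at least about 80%' threshold from the definition above."""
    return fraction_complementary(strand_a, strand_b) >= threshold
```

For example, `fraction_complementary("ATGCCTG", "CAGGCAT")` compares a strand with its exact reverse complement. A stricter cut-off such as 0.90 or 0.98 can be passed as `threshold` to mirror the narrower ranges in the definition.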
  • Duplex means at least two oligonucleotides and/or polynucleotides that are fully or partially complementary and that undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed.
  • The terms “annealing” and “hybridization” are used interchangeably to mean the formation of a stable duplex.
  • Perfectly matched in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick base pairing with a nucleotide in the other strand.
  • a stable duplex can include Watson-Crick base pairing and/or non-Watson-Crick base pairing between the strands of the duplex (where base pairing means the formation of hydrogen bonds).
  • a non-Watson-Crick base pair includes a nucleoside analog, such as deoxyinosine, 2,6-diaminopurine, PNAs, LNAs and the like.
  • a non-Watson-Crick base pair includes a "wobble base", such as deoxyinosine, 8-oxo-dA, 8-oxo-dG and the like, where by “wobble base” is meant a nucleic acid base that can base pair with a first nucleotide base in a complementary nucleic acid strand but that, when employed as a template strand for nucleic acid synthesis, leads to the incorporation of a second, different nucleotide base into the synthesizing strand (wobble bases are described in further detail below).
  • a "mismatch" in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick bonding.
  • Genetic locus in reference to a genome or target polynucleotide, means a contiguous sub-region or segment of the genome or target polynucleotide.
  • genetic locus, locus, or locus of interest may refer to the position of a nucleotide, a gene or a portion of a gene in a genome, including mitochondrial DNA or other non-chromosomal DNA (e.g., bacterial plasmid), or it may refer to any contiguous portion of genomic sequence whether or not it is within, or associated with, a gene.
  • a genetic locus, locus, or locus of interest can be from a single nucleotide to a segment of a few hundred or a few thousand nucleotides in length or more.
  • a locus of interest will have a reference sequence associated with it (see description of "reference sequence” below).
  • Kit refers to any delivery system for delivering materials or reagents for carrying out a method of the invention.
  • delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., probes, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay etc.) from one location to another.
  • kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials.
  • Such contents may be delivered to the intended recipient together or separately.
  • a first container may contain an enzyme for use in an assay, while a second container contains probes.
  • Ligation means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g. oligonucleotides and/or polynucleotides, in a template-driven reaction.
  • the nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically.
  • ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5' carbon of a terminal nucleotide of one oligonucleotide and a 3' carbon of another oligonucleotide.
  • Multiplex Identifier (MID) refers to a tag associated with a nucleic acid whose identity (e.g., sequence) can be used to differentiate nucleic acids in a sample.
  • the MID on a nucleic acid is used to identify the source from which a nucleic acid is derived.
  • a nucleic acid sample may be a pool of nucleic acids derived from different sources (e.g., nucleic acids derived from different individuals, different tissues or cells, or nucleic acids isolated at different time points), where the nucleic acids from each different source are tagged with a unique MID.
  • MIDs are employed to uniquely tag each individual nucleic acid in a sample. Identification of the number of unique MIDs in a sample will provide a readout of how many individual nucleic acids are present in the sample (or from how many original nucleic acids a manipulated nucleic acid sample was derived; see, e.g., U.S. Patent No. 7,537,897, issued on May 26, 2009, incorporated herein by reference in its entirety). MIDs can range in length from 2 to 100 nucleotide bases or more.
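The readout described above, counting the unique MIDs in a sample, can be sketched in Python as follows. The sketch rests on assumptions that are not in the text: each read carries its MID as a fixed-length prefix at the 5' end, reads are plain strings, and no correction for sequencing errors in the MID is attempted. The name `count_by_mid` and the example reads are hypothetical.

```python
from collections import Counter


def count_by_mid(reads, mid_length):
    """Tally reads by the MID assumed to occupy the first mid_length bases."""
    counts = Counter()
    for read in reads:
        if len(read) < mid_length:
            continue  # too short to carry a full MID, so it is skipped
        counts[read[:mid_length].upper()] += 1
    return counts


# Made-up reads: a 4-base MID followed by the same 6-base insert.
reads = ["ACGTTTGACC", "ACGTTTGACC", "GGCATTGACC", "TTAGCTGACC"]
counts = count_by_mid(reads, mid_length=4)
n_unique_mids = len(counts)  # number of distinct MIDs observed
```

When MIDs tag sources, `counts` gives the number of reads per source. When MIDs tag individual molecules, `n_unique_mids` estimates how many original nucleic acids contributed to the sample.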
  • nucleic acid tags that find use as MIDs are described in U.S. Patent 7,544,473, issued on June 6, 2009, and titled "Nucleic Acid Analysis Using Sequence Tokens", as well as U.S. Patent 7,393,665, issued on July 1, 2008, and titled
  • nucleoside includes the natural nucleosides, including 2'-deoxy and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992).
  • Analogs in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John Wiley, New York, 1980); Uhlman and Peyman,
  • Exemplary types of polynucleotides that are capable of enhancing duplex stability include oligonucleotide N3'→P5' phosphoramidates (referred to herein as “amidates”), peptide nucleic acids (referred to herein as “PNAs”), oligo-2'-O-alkylribonucleotides, polynucleotides containing C-5 propynylpyrimidines, locked nucleic acids (“LNAs”), and like compounds.
  • Such oligonucleotides are either available
  • PCR Polymerase chain reaction
  • PCR is a reaction for making multiple copies or replicates of a target nucleic acid flanked by primer binding sites, such reaction comprising one or more repetitions of the following steps: (i) denaturing the target nucleic acid, (ii) annealing primers to the primer binding sites, and (iii) extending the primers by a nucleic acid polymerase in the presence of nucleoside triphosphates.
  • the reaction is cycled through different temperatures optimized for each step in a thermal cycler instrument. Particular temperatures, durations at each step, and rates of change between steps depend on many factors well-known to those of ordinary skill in the art, e.g.
  • PCR encompasses derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, and the like.
  • Reaction volumes range from a few hundred nanoliters, e.g. 200 nL, to a few hundred μL, e.g. 200 μL.
  • Reverse transcription PCR or “RT-PCR” means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified, e.g. Tecott et al, U.S. patent 5,168,038, which patent is incorporated herein by reference.
  • Real-time PCR means a PCR for which the amount of reaction product, i.e. amplicon, is monitored as the reaction proceeds.
  • Nested PCR means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon.
  • initial primers in reference to a nested amplification reaction mean the primers used to generate a first amplicon
  • secondary primers mean the one or more primers used to generate a second, or nested, amplicon.
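The requirement that at least one secondary primer binds an interior location of the first amplicon can be checked with a small Python sketch. It assumes exact, ungapped matching, unambiguous A/C/G/T sequences written 5'→3', and a first amplicon given as its top strand. The helper names (`reverse_complement`, `binding_span`, `is_interior`) and the example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def binding_span(amplicon, primer):
    """Return (start, end) of the primer footprint on the top strand, or None.

    A primer identical to the top strand anneals to the bottom strand; a
    primer whose reverse complement occurs in the top strand anneals to the
    top strand. Either way the footprint is reported in top-strand coordinates.
    """
    top = amplicon.upper()
    pos = top.find(primer.upper())
    if pos == -1:
        pos = top.find(reverse_complement(primer))
    if pos == -1:
        return None
    return (pos, pos + len(primer))


def is_interior(amplicon, primer):
    """True if the primer footprint touches neither end of the amplicon."""
    span = binding_span(amplicon, primer)
    return span is not None and span[0] > 0 and span[1] < len(amplicon)


first_amplicon = "GATTACAGGCTTCAAGTCCGATGC"  # made-up 24-base top strand
```

Here `is_interior(first_amplicon, "GGCTTC")` tests a forward secondary primer, and `is_interior(first_amplicon, "ATCGGA")` tests a reverse one. An initial primer such as `"GATTACA"` sits at the amplicon end and is not interior.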
  • Multiplexed PCR means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are
  • Quantitative PCR means a PCR designed to measure the abundance of one or more specific target sequences in a sample or specimen. Quantitative PCR includes both absolute quantitation and relative quantitation of such target sequences. Quantitative measurements are made using one or more reference sequences that may be assayed separately or together with a target sequence.
  • the reference sequence may be endogenous or exogenous to a sample or specimen, and in the latter case, may comprise one or more competitor templates.
  • Typical endogenous reference sequences include segments of transcripts of the following genes: β-actin, GAPDH, β2-microglobulin, ribosomal RNA, and the like.
  • Monomers making up polynucleotides and oligonucleotides are capable of specifically binding to a natural polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, wobble base pairing, or the like.
  • wobble base is meant a nucleic acid base that can base pair with a first nucleotide base in a complementary nucleic acid strand but that, when employed as a template strand for nucleic acid synthesis, leads to the incorporation of a second, different nucleotide base into the synthesizing strand.
  • Such monomers and their internucleosidic linkages may be naturally occurring or may be analogs thereof, e.g. naturally occurring or non-naturally occurring analogs.
  • Non-naturally occurring analogs may include peptide nucleic acids (PNAs, e.g., as described in U.S. Patent 5,539,082, incorporated herein by reference), locked nucleic acids (LNAs, e.g., as described in U.S. Patent 6,670,461, incorporated herein by reference), phosphorothioate internucleosidic linkages, bases containing linking groups permitting the attachment of labels, such as fluorophores, or haptens, and the like.
  • Where the use of an oligonucleotide or polynucleotide requires enzymatic processing, such as extension by a polymerase, ligation by a ligase, or the like, one of ordinary skill would understand that oligonucleotides or polynucleotides in those instances would not contain certain analogs of internucleosidic linkages, sugar moieties, or bases at any or some positions.
  • Polynucleotides typically range in size from a few monomeric units, e.g. 5-40, when they are usually referred to as “oligonucleotides,” to several thousand monomeric units.
  • Whenever a polynucleotide or oligonucleotide is represented by a sequence of letters (upper or lower case), such as “ATGCCTG,” it will be understood that the nucleotides are in 5'→3' order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, “T” denotes thymidine, “I” denotes deoxyinosine, and “U” denotes uridine, unless otherwise indicated or obvious from context.
  • polynucleotides comprise the four natural nucleosides (e.g.
  • selection of appropriate composition for the oligonucleotide or polynucleotide substrates is well within the knowledge of one of ordinary skill, especially with guidance from treatises, such as Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989), and like references.
  • Primer means an oligonucleotide, either natural or synthetically produced, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3' end along the template so that an extended duplex is formed.
  • the sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Usually primers are extended by a DNA polymerase.
  • Primers are generally of a length compatible with their use in synthesis of primer extension products, and are usually in the range of between 8 to 100 nucleotides in length, such as 10 to 75, 15 to 60, 15 to 40, 18 to 30, 20 to 40, 21 to 50, 22 to 45, 25 to 40, and so on, more typically in the range of between 18-40, 20-35, 21-30 nucleotides long, and any length between the stated ranges.
  • Typical primers can be in the range of between 10-50 nucleotides long, such as 15-45, 18-40, 20-30, 21-25 and so on, and any length between the stated ranges.
  • the primers are usually not more than about 10, 12, 15, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, or 70 nucleotides in length.
  • Primers are usually single-stranded for maximum efficiency in amplification, but may alternatively be double-stranded. If double-stranded, the primer is usually first treated to separate its strands before being used to prepare extension products. This denaturation step is typically effected by heat, but may alternatively be carried out using alkali, followed by neutralization.
  • a "primer" is complementary to a template, and complexes by hydrogen bonding or hybridization with the template to give a primer/template complex for initiation of synthesis by a polymerase, which is extended by the addition of covalently bonded bases linked at its 3' end complementary to the template in the process of DNA synthesis.
  • a “primer pair” as used herein refers to first and second primers having nucleic acid sequence suitable for nucleic acid-based amplification of a target nucleic acid.
  • Such primer pairs generally include a first primer having a sequence that is the same or similar to that of a first portion of a target nucleic acid, and a second primer having a sequence that is complementary to a second portion of a target nucleic acid to provide for amplification of the target nucleic acid or a fragment thereof.
  • Reference to "first” and “second” primers herein is arbitrary, unless specifically indicated otherwise.
  • the first primer can be designed as a "forward primer” (which initiates nucleic acid synthesis from a 5' end of the target nucleic acid) or as a "reverse primer” (which initiates nucleic acid synthesis from a 5' end of the extension product produced from synthesis initiated from the forward primer).
  • the second primer can be designed as a forward primer or a reverse primer.
  • Readout means a parameter, or parameters, that are measured and/or detected and can be converted to a number or value.
  • readout may refer to an actual numerical representation of such collected or recorded data.
  • a readout of fluorescence intensity signals from a microarray is the address and fluorescence intensity of a signal being generated at each hybridization site of the microarray; thus, such a readout may be registered or stored in various ways, for example, as an image of the microarray, as a table of numbers, or the like.
  • “Reflex site”, “reflex sequence” and equivalents are used to indicate one or more sequences present in a polynucleotide that are employed to move a domain intra-molecularly from its initial location to a different location in the polynucleotide.
  • reflex sequences are described in detail in U.S. provisional applications 61/235,595 and 61/288,792, filed on August 20, 2009 and December 21, 2009, respectively, and entitled “Compositions and Methods for Intramolecular Nucleic Acid Rearrangement Using Reflex Sequences", incorporated herein by reference.
  • a reflex sequence is chosen so as to be distinct from other sequences in the polynucleotide (i.e., with little sequence homology to other sequences likely to be present in the polynucleotide, e.g., genomic or sub-genomic sequences to be processed).
  • a reflex sequence should be selected so as to not hybridize to any sequence except its complement under the conditions employed in the reflex processes.
  • the reflex sequence may be a synthetic or artificially generated sequence (e.g., added to a polynucleotide in an adapter domain) or a sequence present normally in a polynucleotide being assayed (e.g., a sequence present within a region of interest in a polynucleotide being assayed).
  • a complement to the reflex sequence is present (e.g., inserted in an adapter domain) on the same strand of the polynucleotide as the reflex sequence (e.g., the same strand of a double-stranded polynucleotide or on the same single stranded polynucleotide), where the complement is placed in a particular location so as to facilitate an intramolecular binding and polymerization event on such particular strand.
  • Reflex sequences employed in the reflex process described herein can thus have a wide range of lengths and sequences. Reflex sequences may range from 5 to 200 nucleotide bases in length.
  • “Solid support”, “support”, and “solid phase support” are used interchangeably and refer to a material or group of materials having a rigid or semi-rigid surface or surfaces.
  • at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like.
  • the solid support(s) will take the form of beads, resins, gels, microspheres, or other geometric configurations.
  • Microarrays usually comprise at least one planar solid phase support, such as a glass microscope slide.
  • “Specific” or “specificity” in reference to the binding of one molecule to another molecule means the recognition, contact, and formation of a stable complex between the two molecules, together with substantially less recognition, contact, or complex formation of that molecule with other molecules.
  • “specific” in reference to the binding of a first molecule to a second molecule means that to the extent the first molecule recognizes and forms a complex with another molecule in a reaction or sample, it forms the largest number of the complexes with the second molecule. Preferably, this largest number is at least fifty percent.
  • molecules involved in a specific binding event have areas on their surfaces or in cavities giving rise to specific recognition between the molecules binding to each other.
  • specific binding examples include antibody-antigen interactions, enzyme-substrate interactions, formation of duplexes or triplexes among polynucleotides and/or oligonucleotides, receptor-ligand interactions, and the like.
  • contact in reference to specificity or specific binding means two molecules are close enough that weak noncovalent chemical interactions, such as van der Waals forces, hydrogen bonding, base-stacking interactions, ionic and hydrophobic interactions, and the like, dominate the interaction of the molecules.
  • “Tm” is used in reference to the “melting temperature.”
  • the melting temperature is the temperature (e.g., as measured in °C) at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
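The definition above gives no formula for the melting temperature. As a rough illustration only, the Python sketch below uses the Wallace rule, a common rule of thumb for short oligonucleotides (roughly 14 bases or fewer) that estimates Tm as 2 °C per A or T plus 4 °C per G or C. This rule is swapped in for illustration and is not taken from this document. It ignores salt and strand concentration, and the function name `wallace_tm` is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: a short, unmodified DNA oligonucleotide made only of A, C, G, T.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 2 * at + 4 * gc
```

For the example sequence “ATGCCTG” used elsewhere in this section, the rule counts three A/T bases and four G/C bases. Longer primers are usually better served by nearest-neighbor thermodynamic models.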
  • sample means a quantity of material from a biological, environmental, medical, or patient source in which detection, measurement, or labeling of target nucleic acids is sought.
  • a specimen or culture e.g., microbiological cultures
  • a sample may include a specimen of synthetic origin.
  • Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste.
  • Biological samples may include materials taken from a patient including, but not limited to cultures, blood, saliva, cerebral spinal fluid, pleural fluid, milk, lymph, sputum, semen, needle aspirates, and the like. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the present invention.
  • upstream and downstream in describing nucleic acid molecule orientation and/or polymerization are used herein as understood by one of skill in the art.
  • downstream generally means proceeding in the 5' to 3' direction, i.e., the direction in which a nucleotide polymerase normally extends a sequence
  • upstream generally means the converse.
  • a first primer that hybridizes "upstream” of a second primer on the same target nucleic acid molecule is located on the 5' side of the second primer (and thus nucleic acid polymerization from the first primer proceeds towards the second primer).
  • nucleic acid includes a plurality of such nucleic acids
  • compound includes reference to one or more compounds and equivalents thereof known to those skilled in the art, and so forth.
  • the practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art.
  • Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used.
  • Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols.
  • the replication processes can be employed for the manipulation and analysis of polynucleotide sequences of interest from virtually any polynucleotide source, including but not limited to genomic DNA, complementary DNA (cDNA), RNA (e.g., messenger RNA, ribosomal RNA, short interfering RNA, microRNA, viral genomic RNA, etc.), plasmid DNA, mitochondrial DNA, etc.
  • Exemplary organisms include, but are not limited to, plants, animals (e.g., reptiles, mammals, insects, worms, fish, etc.), bacteria, fungi (e.g., yeast), viruses, etc.
  • the polynucleotide sequences are enriched prior to the replication processes described herein.
  • enriched is meant that the polynucleotide is subjected to a process that reduces the complexity of the polynucleotides, generally by increasing the relative concentration of particular polynucleotide species in the sample (e.g., having a specific locus of interest, including a specific polynucleotide sequence, lacking a locus or sequence, being within a specific size range, etc.).
  • there are a wide variety of ways to enrich polynucleotides having a specific characteristic(s) or sequence, and as such any convenient method to accomplish this may be employed.
  • the enrichment can take place at any of a number of steps in the process, and will be determined by the desires of the user. For example, enrichment can take place in individual parental samples (e.g., untagged polynucleotides prior to adaptor ligation) or in multiplexed samples (e.g., polynucleotides tagged with primer binding sites, MID and/or reflex sequences and pooled; MIDs are described in further detail below).
  • a sample may be enriched during isolation from its source.
  • pathogen-derived polynucleotides e.g., viral genomes, e.g., Hepatitis C
  • an enrichment process e.g., the process enriches for pathogen-derived polynucleotides
  • polynucleotides in the polynucleotide sample are amplified prior to processing or analysis.
  • the amplification reaction also serves to enrich a starting polynucleotide sample for a sequence or locus of interest.
  • a starting polynucleotide sample can be subjected to a polymerase chain reaction (PCR) that amplifies one or more regions of interest.
  • PCR polymerase chain reaction
  • amplification reaction is an exponential amplification reaction, whereas in certain other embodiments, the amplification reaction is a linear amplification reaction. Any convenient method for performing amplification reactions on a starting polynucleotide sample can be used in practicing the subject invention.
  • the nucleic acid in certain embodiments, can be used in practicing the subject invention.
  • polymerase employed in the amplification reaction is a polymerase that has proofreading capability (e.g., phi29 DNA Polymerase, Thermococcus litoralis DNA polymerase,
  • the polynucleotide sample being processed or analyzed is derived from a single source (e.g., a single organism, virus, tissue, cell, subject, etc.), whereas in other embodiments, the polynucleotide sample is a pool of polynucleotides extracted from a plurality of sources (e.g., a pool of polynucleotides from a plurality of organisms, tissues, cells, subjects, etc.), where by "plurality" is meant two or more.
  • a polynucleotide sample can contain polynucleotides from 2 or more sources, 3 or more sources, 5 or more sources, 10 or more sources, 50 or more sources, 100 or more sources, 500 or more sources, 1000 or more sources, 5000 or more sources, up to and including about 10,000 or more sources. It is noted here that in certain embodiments, a single isolated polynucleotide sample can include polynucleotides derived from a plurality of sources.
  • viral genomes isolated from a single patient sample can include genomes derived from many different viruses (e.g., the sample may contain a plurality of different clonal genomes of a virus of interest, e.g., different clones of HepC virus).
  • polynucleotide fragments that are to be pooled with polynucleotide fragments derived from a plurality of sources e.g., a plurality of organisms, tissues, cells, subjects, etc.
  • the polynucleotides derived from each source include a multiplex identifier (MID) such that the source from which each tagged polynucleotide fragment was derived can be determined.
  • MID multiplex identifier
  • each polynucleotide sample source is correlated with a unique MID, where by unique MID is meant that each different MID employed can be differentiated from every other MID employed by virtue of at least one characteristic, e.g., the polynucleotide sequence of the MID.
  • Any type of MID can be used, including but not limited to those described in co-pending U.S. Patent Application Serial Number
  • a set of MIDs employed to tag a plurality of samples need not have any particular common property (e.g., Tm, length, base composition, etc.), as the asymmetric tagging methods (and many tag readout methods, including but not limited to sequencing of the tag or measuring the length of the tag) can accommodate a wide variety of unique MID sets.
  • each individual polynucleotide (e.g., double-stranded or single-stranded, as appropriate to the methodological details employed) in a sample to be analyzed is tagged with a unique MID so that the fate of each polynucleotide can be tracked in subsequent processes (where, as noted above, unique MID is meant to indicate that each different MID employed can be differentiated from every other MID employed by virtue of at least one characteristic, e.g., the polynucleotide sequence of the MID). For example (and as described below), having each polynucleotide tagged with a unique MID allows analysis of the sequence of each individual polynucleotide using the reflex sequence methods described herein.
  • replication adapters can be used to tag each polynucleotide in a sample with a different MID.
  • Single-stranded polynucleotides can form a host of complex secondary structures, including hairpin loops, bulge loops, interior loops, or junctions. Secondary structures are often thermodynamically stable and can inhibit replication (e.g., reverse transcription (RT) (Harrison et al., 1998)). Certain templates, especially when long or highly structured, are refractory to known approaches for improving replication efficiency.
  • RT reverse transcription
  • Oligonucleotide binding can be used to alter the secondary structure of a single-stranded DNA, or RNA, molecule. This approach can be used to prevent hairpin formation and/or other secondary structure formation in a polynucleotide, and thus facilitate its replication (e.g., reverse transcription).
  • minimizing or reducing secondary structure in a single-stranded polynucleotide can be achieved by hybridization to multiple single-stranded synthesis primers (or oligonucleotides, e.g., DNA, or LNA).
  • LNAs can be advantageous for use in highly thermostable regions of a polynucleotide (Fratczak et al., 2009) since the relative stability of duplexes is RNA:LNA > RNA:RNA > RNA:DNA.
  • the positioning of oligonucleotides can be optimized in silico by prior analysis of RNA secondary structure.
  • aspects of the present invention include methods of replicating a region in a polynucleotide having stable secondary structure.
  • the method includes denaturing a polynucleotide having a region of stable secondary structure, annealing at least two different sequence-specific nucleic acid synthesis primers to the denatured polynucleotide at different sites, where at least one of the sequence-specific nucleic acid synthesis primers disrupts the region of stable secondary structure of the polynucleotide when annealed, and extending the annealed sequence-specific synthesis primers (by contacting the synthesis-primer annealed polynucleotide with a nucleic acid polymerase in a nucleic acid synthesis reaction).
  • the extension reaction leaves a nick (or gap) between the extension product of an upstream synthesis primer and the extension product of a downstream synthesis primer.
  • Any gap or nick is closed by ligating adjacent upstream and downstream extension products together, thus replicating the region of stable secondary structure in the polynucleotide.
  • the polynucleotide is single stranded RNA and the nucleic acid polymerase is a reverse transcriptase.
  • the RNA is a viral RNA, e.g., a hepatitis virus RNA.
  • a sequence-specific nucleic acid synthesis primer that disrupts the region of stable secondary structure of the polynucleotide is selected from: an LNA primer, a PNA primer, a RNA primer and a DNA primer.
  • the nucleic acid polymerase in the contacting step has 5' to 3' strand displacement activity, thereby generating a 5' branch on the downstream extension product.
  • the contacting step further comprises removing the 5' branch to produce the nick.
  • removing the 5' branch comprises contacting the extended product with a single strand-specific nuclease.
  • the single-strand specific nuclease is selected from: S1 nuclease, Mung bean nuclease, FEN1 nuclease, and RecJf nuclease.
  • the polynucleotide includes multiple regions of stable secondary structure.
  • sequence-specific nucleic acid synthesis primers specific for each of the multiple regions of stable secondary structure are employed.
  • No limitation with regard to the number of synthesis primers or regions of secondary structure in the polynucleotide is intended.
  • the polynucleotide may include 1, 2, 3 or more, 5 or more, 10 or more, or up to 20 or more regions of stable secondary structure. Synthesis primers specific for any subset of or all of the regions of stable secondary structure in a polynucleotide may be employed.
  • RNA having multiple regions of stable secondary structure (DNA polynucleotides may also be replicated using the methods described herein; RNA is employed only as an example). Determining such regions of stable secondary structure can be done by any convenient method, and can include the use of either or both in silico approaches and experimental approaches. No limitation in this regard is intended.
  • the target RNA is denatured and synthesis primers are annealed, where the annealed synthesis primers disrupt the re-formation of the secondary structure of the single stranded RNA.
  • the annealed synthesis primers are then extended by reverse transcriptase.
  • the dNTP mixture of the extension reaction can include a dNTP that reduces the secondary structure of the synthesized product.
  • the synthesized product is DNA
  • 7-deaza-dGTP may replace dGTP.
  • dITP inosine
  • the presence of multiple oligonucleotides on the same RNA strand means that an extended oligonucleotide (an upstream extension product) can meet another oligonucleotide or another extended strand located to its 3' (a downstream extension product). If the polymerase employed does not have displacement activity, it will stop at this junction, leaving a nick (or gap).
  • nucleotide polymerases having 5' to 3' strand displacement activity are used (e.g., reverse transcriptase, e.g., MMLV and Superscript™ RT as shown in Figure 1, or other nucleotide polymerases, which will depend on the polynucleotide being replicated and the desires of the user). These polymerases will strand displace when they encounter a 3' synthesis primer/extension product.
  • the displacement activity is generally relatively small (<100 nt, about 30 nucleotides)
  • RNA:DNA heteroduplex in which the downstream extension product is displaced leaving a 5' branch.
  • the 5' branch consists of single-stranded DNA (which might include the entire synthesis primer sequence).
  • the 5' branches can be removed using single strand specific nucleases, such as S1 or Mung Bean nuclease, 5' endonucleases such as flap endonuclease (FEN1, as used for the Invader Assay), or 5' exonucleases such as RecJf.
  • Once the 5' branches are digested/removed, a nick remains between juxtaposed DNA bases on the RNA backbone. These can be sealed by ligation, e.g., using a DNA ligase under appropriate conditions. If desired, the RNA can then be removed, e.g., using standard methods, e.g., RNaseH or NaOH. The single-stranded DNA can then be employed for subsequent manipulation/analysis as desired.
  • replicated polynucleotides can be processed and subjected to sequence analysis, where in certain embodiments, the polynucleotides are sequenced such that clonal information is maintained (i.e., the sequence of each individual polynucleotide in the sample is obtained).
  • polynucleotides produced as described can be subjected to a Reflex process (as described above), which allows clonal sequence information to be obtained from very long polynucleotides in a sample.
  • one or more additional domains may be incorporated into the polynucleotide (e.g., as domains present in an adapter or as a tail on a synthesis primer) either before, during or after the replication step noted above. No limitation with regard to domains or the method of attaching them to a polynucleotide of interest is intended.
  • Exemplary domains may include one or more of the following: MID, uMID, primer binding site, cloning/restriction enzyme site, reflex sequence, promoter sequence, and/or another sequence (e.g., a lead sequence present after a promoter sequence that provides a buffer for initiation of nucleic acid synthesis from the promoter).
  • the domains can be incorporated by ligation, or included on one or more synthesis primer(s) during nucleic acid synthesis.
  • the domains can be incorporated during subsequent reaction steps, e.g. via incorporation during PCR.
  • the additional domains are protected from endonucleases or exonucleases, e.g., by incorporation of phosphorothioate linkages or other blocking groups.
  • the domains consist of double-stranded DNA, and are resistant to endonucleases or exonucleases that preferentially target single-stranded DNA.
  • aspects of the present invention are drawn to methods of replicating a polynucleotide sequence, including the steps of annealing a replication adapter to a target polynucleotide that includes a region of interest (or multiple regions of interest) to be replicated and extending from the annealed replication adapter to replicate the region of interest.
  • the replication adapter includes a double stranded (or duplex) region and a 3' single-stranded region (the annealing region).
  • the duplex region includes a nucleic acid polymerase promoter site (e.g., RNA or DNA polymerases) and the 3' single-stranded annealing region includes a sequence that is complementary to a location that is 3' to one or more region(s) of interest in the polynucleotide (this is the region that is to be replicated, as shown below).
  • the annealing region can be designed to anneal to virtually any site in the target polynucleotide, and as such no limitation in this regard is intended.
  • the nucleic acid polymerase promoter site is oriented such that nucleic acid synthesis proceeds towards the 3' single-stranded annealing region. In this orientation, nucleic acid synthesis from the promoter will employ the annealed target polynucleotide as a template, thus replicating the region of interest (and any downstream region of interest) in the target polynucleotide.
  • the annealed adapter-polynucleotide complex is contacted with a nucleic acid polymerase specific for the nucleic acid polymerase promoter site in a nucleic acid synthesis reaction, thereby producing a replication product that includes the region(s) of interest (and thus replicating the region(s) of interest).
  • the double stranded region of the replication adapter includes one or more additional domains positioned between the nucleic acid polymerase promoter site and the 3' overhang region, wherein the replication product includes the one or more additional domain.
  • the one or more additional domains are selected from one or more of: MID, primer binding site, cloning/restriction enzyme site, reflex sequence, lead sequence.
  • the replication adapter includes a loop region connecting the complementary strands of the double-stranded region at a position opposite the 3' single-stranded annealing region.
  • the target polynucleotide is single-stranded RNA and wherein the annealed adapter-polynucleotide complex is subjected to a reverse transcription reaction to extend the 3' single stranded region.
  • the single-stranded polynucleotide is viral RNA.
  • the viral RNA is from a hepatitis virus.
  • the nucleic acid polymerase promoter site in the replication adapter is a T3 promoter site and the nucleic acid polymerase is T3 RNA polymerase.
  • the nucleic acid synthesis reaction includes inosine triphosphate (ITP), wherein the reaction product has reduced stable secondary structure as compared to a reaction product produced in the absence of ITP.
  • ITP inosine triphosphate
  • the method further includes treating the annealed adapter-polynucleotide complex with an exonuclease having 3' to 5' single-stranded exonuclease activity, wherein any region of the target polynucleotide that is 3' to the region of interest is removed.
  • embodiments of the replication method described above can be employed to tag each distinct viral RNA genome in a sample with a unique MID and append specific functional domains (e.g., sequencing primer binding sites).
  • the resulting tagged and replicated viral genomes may be subjected to subsequent analyses that exploit the attached functional domains and MID sequences, including (but not limited to) amplification reactions, sequence variant enrichment/isolation reactions, Reflex reactions, and/or sequencing (e.g., using next generation sequencing).
  • replication adapters disclosed herein and methods for their use in replicating polynucleotides allow for the sequencing of multiple regions of interest originating from single viral genome molecules.
  • embodiments of the subject invention allow the analysis and sequencing of single polynucleotide molecules (regardless of their origin).
  • In the exemplary process shown in Figures 2-6, the following features of viral RNA polymerases are exploited: (1) the ability to copy DNA or RNA sequences; and (2) the ability to cross gaps (or nicks) in the template strand.
  • Figure 2 shows a representative Hepatitis C genome map (single stranded RNA shown at the top) with the genomic regions indicated. The start/stop sites for each region are noted, as well as the length for each region. The exact lengths vary depending on the particular strain. Regions of interest for this embodiment are indicated (i.e., NS3 (helicase/protease), NS5A (membrane association/cofactor) and NS5B (RNA polymerase)). For the exemplary process of Figure 2, sequences present in two or three of these regions from a single molecule are desired (e.g., the full sequence of each region and/or only a portion of the sequence in each region may be desired, which will be determined by the user).
  • NS3 helicase/protease
  • NS5A membrane association/cofactor
  • NS5B RNA polymerase
  • a DNA vector-probe (also referred to as a replication adapter) is first prepared containing a T3 RNA polymerase promoter, an NGS primer (e.g. 454A), a diverse population code (or uMID), a reflex site, and a long probe corresponding to any portion of a viral genome of interest.
  • NGS primer e.g. 454A
  • uMID refers to having a sufficient excess in the number of multiplex identifiers in relation to the number of polynucleotides to be tagged in a sample so that, when the uMIDs are attached, each individual tagged molecule in the sample has a unique MID sequence.
  • In Step 1a of Figure 2, a clone of the desired polynucleotide is obtained.
  • In Step 1b of Figure 2, the duplex region of the replication adapter is produced, to which the annealing region is added, which can be achieved by any of a variety of convenient methods.
  • Figure 3 (which describes Step 2) provides exemplary ways of producing the probe-conjugated vector (or replication adapter). It is noted here that the probe-conjugated vector is synthesized with enough diversity in the population code (MID) to enable each viral genome in any sample to be individually 'tagged'. (As noted above, such a diverse population code is herein referred to as a uMID.)
  • MID population code
  • In Step 3, shown in Figure 4, the vector-probe conjugate is mixed with the RNA genomes and allowed to hybridize/anneal.
  • this single stranded RNA can be trimmed/removed, e.g., by treatment with ExoT (which removes ssRNA in a 3'-5' direction).
  • ExoT which removes ssRNA in a 3'-5' direction.
  • the hybridized complex can be extended, e.g., with reverse transcriptase.
  • In vitro transcription by T3 RNA polymerase (shown in Step 4 of Figure 4) produces replicated/amplified RNA copies of the hybridized viral genomic RNA, whilst tagging each transcript with an MID that is unique to each individual genome (other vector sequences are also added, e.g., sequence primer binding site 454A and the Reflex sequence).
  • the polymerase reaction includes an NTP mix that contains one or more NTPs that can reduce secondary structure in the resultant RNA.
  • NTP inosine triphosphate
  • in vitro transcription performed in the presence of inosine triphosphate (ITP) produces amplified RNA having reduced secondary structure (as noted above).
  • ITP inosine triphosphate
  • This reduced secondary structure of the single stranded RNA enables the reverse transcription of long cDNAs therefrom, thus enabling more robust process and analysis steps, e.g., the sequencing of different regions in the polynucleotide that are long distances apart.
  • the long cDNAs described above can serve as substrates for Reflex reactions, which allow the extraction of any desired region of interest whilst retaining the uMID (i.e., so that any obtained sequence can be correlated back to a single specific viral genome), exemplary embodiments of which are shown in Figure 5 (Step 5, option 1) and Figure 6 (Step 5, option 2).
  • Products of reflex reactions can be sequenced using NGS.
  • Sequencing data can be demultiplexed to determine individual viral sequences across multiple different regions.
  • kits and systems for practicing the subject methods as described above, such as vectors configured to generate replication adapters or synthesis primers for inhibiting stable secondary structure formation in a polynucleotide as described herein.
  • the various components of the kits may be present in separate containers or certain compatible components may be precombined into a single container, as desired.
  • kits may also include one or more other reagents for preparing or processing a polynucleotide sample according to the subject methods.
  • the reagents may include one or more matrices, solvents, sample preparation reagents, buffers, desalting reagents, enzymatic reagents, denaturing reagents, where calibration standards such as positive and negative controls may be provided as well.
  • the kits may include one or more containers such as vials or bottles, with each container containing a separate component for carrying out a sample processing or preparing step and/or for carrying out one or more steps of a nucleic acid variant isolation assay according to the present invention.
  • the subject kits typically further include instructions for using the components of the kit to practice the subject methods, e.g., to prepare nucleic acid samples for performing a replication reaction according to aspects of the subject methods.
  • the instructions for practicing the subject methods are generally recorded on a suitable recording medium.
  • the instructions may be printed on a substrate, such as paper or plastic, etc.
  • the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or sub-packaging) etc.
  • the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g., CD-ROM, diskette, etc.
  • the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided.
  • An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.
  • kits may also include one or more control samples and reagents, e.g., two or more control samples for use in testing the kit.
  • the methods and compositions provided herein find use in numerous genetic analyses.
  • the replication adapters allow production of samples in which each individual polynucleotide in a sample (e.g., each viral genome) is uniquely tagged such that it can be tracked in subsequent analyses.
  • the ability to replicate regions of a polynucleotide having stable secondary structure will also facilitate genetic analyses, e.g., of viral genomes.
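
The requirement noted above, that a uMID pool be in sufficient excess for each tagged molecule to carry a unique MID, can be illustrated with a rough calculation. The following Python sketch is an illustration only and is not part of the disclosed methods: it assumes that each molecule independently receives one tag drawn uniformly at random from a pool of 4^L random-sequence tags of length L, and the function names and the 0.99 target are hypothetical choices made for this example.

```python
import math


def pool_size(tag_length: int) -> int:
    """Number of distinct tags of a given length over the alphabet A, C, G, T."""
    return 4 ** tag_length


def prob_all_unique(n_molecules: int, n_tags: int) -> float:
    """Probability that n_molecules uniform random draws from n_tags are all distinct.

    Assumption (not from the source): tags are drawn independently and uniformly.
    """
    if n_molecules > n_tags:
        return 0.0
    # Sum log(1 - i / n_tags) for i = 0 .. n_molecules - 1 to avoid underflow.
    log_p = sum(math.log1p(-i / n_tags) for i in range(n_molecules))
    return math.exp(log_p)


def min_tag_length(n_molecules: int, target: float = 0.99) -> int:
    """Smallest tag length whose pool gives at least `target` probability of uniqueness."""
    length = 1
    while prob_all_unique(n_molecules, pool_size(length)) < target:
        length += 1
    return length
```

Under these assumptions, calling `min_tag_length(n)` with an estimate of the number of genomes in a sample gives a lower bound on the random tag length; real tag pools may be non-uniform, in which case a longer tag would be needed.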

Abstract

Aspects of the present invention are drawn to processes for replicating polynucleotide sequences having secondary structure(s) that reduce or inhibit the replication process. Compositions, kits and systems that find use in carrying out the replication processes described herein are also provided.

Description

METHODS FOR REPLICATING POLYNUCLEOTIDES WITH SECONDARY STRUCTURE
BACKGROUND
Genetic analysis of genomic RNA or DNA often includes the replication of large polynucleotide sequences. This genetic analysis can be complicated when polynucleotides, e.g., single stranded genomic polynucleotides, have stable secondary structures that hamper replication/amplification reactions.
SUMMARY OF THE INVENTION
Aspects of the present invention are drawn to processes for replicating polynucleotide sequences having secondary structure(s) that reduce or inhibit the replication process. Compositions, kits and systems that find use in carrying out the replication processes described herein are also provided.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is best understood from the following detailed description when read in conjunction with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to scale. Indeed, the dimensions of the various features are arbitrarily expanded or reduced for clarity. Included in the drawings are the following figures:
Figure 1: An exemplary schematic showing one embodiment of replicating a polynucleotide having stable secondary structure using multiple sequence-specific synthesis primers.
Figure 2: Schematics of exemplary starting structures used to produce a replication adapter.
Figure 3: Schematic showing examples of how to employ the structures in Figure 2 to produce a replication adapter.
Figure 4: Schematic showing exemplary methods for employing a replication adapter to amplify/replicate a region(s) of interest from a target polynucleotide (in this case a viral genomic RNA).
Figures 5 and 6: Schematics showing exemplary steps for processing and/or analyzing the amplified/replicated polynucleotides produced using replication adapters (as depicted in Figure 4).
DEFINITIONS
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Still, certain elements are defined for the sake of clarity and ease of reference.
Terms and symbols of nucleic acid chemistry, biochemistry, genetics, and molecular biology used herein follow those of standard treatises and texts in the field, e.g. Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); and the like.
"Amplicon" means the product of a polynucleotide amplification reaction. That is, it is a population of polynucleotides, usually double stranded, that are replicated from one or more starting sequences. The one or more starting sequences may be one or more copies of the same sequence, or they may be a mixture of different sequences. Amplicons may be produced by a variety of amplification reactions whose products are multiple replicates of one or more target nucleic acids. Generally, amplification reactions producing amplicons are "template-driven" in that the reactants, either nucleotides or oligonucleotides, have complements in a template polynucleotide, and base pairing with those complements is required for the creation of reaction products. In one aspect, template-driven reactions are primer extensions with a nucleic acid polymerase or oligonucleotide ligations with a nucleic acid ligase. Such reactions include, but are not limited to, polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplifications (NASBAs), rolling circle amplifications, and the like, disclosed in the following references that are incorporated herein by reference: Mullis et al, U.S. patents 4,683,195; 4,965,188; 4,683,202; 4,800,159 (PCR); Gelfand et al, U.S. patent 5,210,015 (real-time PCR with "TAQMAN™" probes); Wittwer et al, U.S. patent 6,174,670; Kacian et al, U.S. patent 5,399,491 ("NASBA"); Lizardi, U.S. patent 5,854,033; Aono et al, Japanese patent publ. JP 4-262799 (rolling circle amplification); and the like. In one aspect, amplicons of the invention are produced by PCRs. An amplification reaction may be a "real-time" amplification if a detection chemistry is available that permits a reaction product to be measured as the amplification reaction progresses, e.g. "real-time PCR" described below, or "real-time NASBA" as described in Leone et al, Nucleic Acids Research, 26: 2150-2155 (1998), and like references. As used herein, the term "amplifying" means performing an amplification reaction. A "reaction mixture" means a solution containing all the necessary reactants for performing a reaction, which may include, but not be limited to, buffering agents to maintain pH at a selected level during a reaction, salts, co-factors, scavengers, and the like.
The term "assessing" includes any form of measurement, and includes determining if an element is present or not. The terms "determining", "measuring", "evaluating", "assessing" and "assaying" are used interchangeably and include quantitative and qualitative determinations. Assessing may be relative or absolute. "Assessing the presence of" includes determining the amount of something present, and/or determining whether it is present or absent. As used herein, the terms "determining," "measuring," "assessing," and "assaying" are used interchangeably and include both quantitative and qualitative determinations.
"Complementary" or "substantially complementary" refers to the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single stranded nucleic acid.
Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the nucleotides of the other strand, usually at least about 90% to 95%, and more preferably from about 98 to 100%.
Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary. See, M. Kanehisa Nucleic Acids Res. 12:203 (1984), incorporated herein by reference.
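The percentage thresholds in this definition can be made concrete with a short calculation. The Python sketch below is a simplified illustration rather than the definition itself: it assumes two strands of equal length, both written 5' to 3', and compares them antiparallel without the insertions or deletions that an optimal alignment would allow. The function names and the 80% default threshold parameter are hypothetical choices for this example.

```python
# Watson-Crick pairs named in the definition: A with T (or U), and C with G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementary(strand_a: str, strand_b: str) -> float:
    """Percent of positions in strand_a that pair with strand_b read antiparallel.

    Simplifying assumption: equal-length strands, ungapped comparison.
    """
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse so position i of a faces its partner in b
    if len(a) != len(b) or not a:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((x, y) in PAIRS for x, y in zip(a, b))
    return 100.0 * paired / len(a)


def is_substantially_complementary(
    strand_a: str, strand_b: str, threshold: float = 80.0
) -> bool:
    """True if the ungapped percent complementarity meets the given threshold."""
    return percent_complementary(strand_a, strand_b) >= threshold
```

Because gaps are not modeled, this sketch can understate complementarity for strands that would pair well after an insertion or deletion; a full treatment would first align the strands.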
"Duplex" means at least two oligonucleotides and/or polynucleotides that are fully or partially complementary and that undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed. The terms "annealing" and "hybridization" are used interchangeably to mean the formation of a stable duplex. "Perfectly matched" in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick base pairing with a nucleotide in the other strand. A stable duplex can include Watson-Crick base pairing and/or non-Watson-Crick base pairing between the strands of the duplex (where base pairing means the forming of hydrogen bonds). In certain embodiments, a non-Watson-Crick base pair includes a nucleoside analog, such as deoxyinosine, 2,6-diaminopurine, PNAs, LNAs and the like. In certain embodiments, a non-Watson-Crick base pair includes a "wobble base", such as deoxyinosine, 8-oxo-dA, 8-oxo-dG and the like, where by "wobble base" is meant a nucleic acid base that can base pair with a first nucleotide base in a complementary nucleic acid strand but that, when employed as a template strand for nucleic acid synthesis, leads to the incorporation of a second, different nucleotide base into the synthesizing strand (wobble bases are described in further detail below). A "mismatch" in a duplex between two oligonucleotides or polynucleotides means that a pair of nucleotides in the duplex fails to undergo Watson-Crick bonding.
"Genetic locus," "locus," or "locus of interest" in reference to a genome or target polynucleotide, means a contiguous sub-region or segment of the genome or target polynucleotide. As used herein, genetic locus, locus, or locus of interest may refer to the position of a nucleotide, a gene or a portion of a gene in a genome, including mitochondrial DNA or other non-chromosomal DNA (e.g., bacterial plasmid), or it may refer to any contiguous portion of genomic sequence whether or not it is within, or associated with, a gene. A genetic locus, locus, or locus of interest can be from a single nucleotide to a segment of a few hundred or a few thousand nucleotides in length or more. In general, a locus of interest will have a reference sequence associated with it (see description of "reference sequence" below).
"Kit" refers to any delivery system for delivering materials or reagents for carrying out a method of the invention. In the context of reaction assays, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., probes, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay etc.) from one location to another. For example, kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials. Such contents may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme for use in an assay, while a second container contains probes.
"Ligation" means to form a covalent bond or linkage between the termini of two or more nucleic acids, e.g. oligonucleotides and/or polynucleotides, in a template-driven reaction. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically or chemically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5' carbon of a terminal nucleotide of one oligonucleotide with 3' carbon of another oligonucleotide. A variety of template-driven ligation reactions are described in the following references, which are incorporated by reference: Whiteley et al, U.S. patent 4,883,750; Letsinger et al, U.S. patent 5,476,930; Fung et al, U.S. patent 5,593,826; Kool, U.S. patent 5,426,180; Landegren et al, U.S. patent 5,871,921; Xu and Kool, Nucleic Acids Research, 27: 875-881 (1999); Higgins et al, Methods in Enzymology, 68: 50-71 (1979); Engler et al, The Enzymes, 15: 3-29 (1982); and Namsaraev, U.S. patent publication 2004/0110213.
"Multiplex Identifier" (MID) as used herein refers to a tag associated with a nucleic acid whose identity (e.g., sequence) can be used to differentiate nucleic acids in a sample. In certain embodiments, the MID on a nucleic acid is used to identify the source from which a nucleic acid is derived. For example, a nucleic acid sample may be a pool of nucleic acids derived from different sources, (e.g., nucleic acids derived from different individuals, different tissues or cells, or nucleic acids isolated at different times points), where the nucleic acids from each different source is tagged with a unique MID. Identification of the MID on a nucleic acid thus allows one to determine from which source the nucleic acid is derived. In other words, the MID provides a correlation between a nucleic acid and its source. In certain embodiments, MIDs are employed to uniquely tag each individual nucleic acid in a sample. Identification of the number of unique MIDs in a sample will provide a readout of how many individual nucleic acids are present in the sample (or from how many original nucleic acids a manipulated nucleic acid sample was derived; see, e.g., U.S. Patent No. 7,537,897, issued on May 26, 2009, incorporated herein by reference in its entirety). MIDs can range in length from 2 to 100 nucleotide bases or more. Exemplary nucleic acid tags that find use as MIDs are described in U.S. Patent 7,544,473, issued on June 6, 2009, and titled "Nucleic Acid Analysis Using Sequence Tokens", as well as U.S. Patent 7,393,665, issued on July 1, 2008, and titled
"Methods and Compositions for Tagging and Identifying Polynucleotides", both of which are incorporated herein by reference in their entirety for their description of nucleic acid tags and their use in identifying polynucleotides. "Nucleoside" as used herein includes the natural nucleosides, including 2'-deoxy and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992). "Analogs" in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John Wiley, New York, 1980); Uhlman and Peyman,
Chemical Reviews, 90: 543-584 (1990), or the like, with the proviso that they are capable of specific hybridization. Such analogs include synthetic nucleosides designed to enhance binding properties, reduce complexity, increase specificity, and the like. Polynucleotides comprising analogs with enhanced hybridization or nuclease resistance properties are described in Uhlman and Peyman (cited above); Crooke et al, Exp. Opin. Ther. Patents, 6: 855-870 (1996); Mesmaeker et al, Current Opinion in Structural Biology, 5: 343-355 (1995); and the like. Exemplary types of polynucleotides that are capable of enhancing duplex stability include oligonucleotide N3'— >P5' phosphoramidates (referred to herein as "amidates"), peptide nucleic acids (referred to herein as "PNAs"), oligo-2'-0- alkylribonucleotides, polynucleotides containing C-5 propynylpyrimidines, locked nucleic acids ("LNAs"), and like compounds. Such oligonucleotides are either available
commercially or may be synthesized using methods described in the literature.
"Polymerase chain reaction," or "PCR," means a reaction for the in vitro
amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. In other words, PCR is a reaction for making multiple copies or replicates of a target nucleic acid flanked by primer binding sites, such reaction comprising one or more repetitions of the following steps: (i) denaturing the target nucleic acid, (ii) annealing primers to the primer binding sites, and (iii) extending the primers by a nucleic acid polymerase in the presence of nucleoside triphosphates. Usually, the reaction is cycled through different temperatures optimized for each step in a thermal cycler instrument. Particular temperatures, durations at each step, and rates of change between steps depend on many factors well-known to those of ordinary skill in the art, e.g. exemplified by the references: McPherson et al, editors, PCR: A Practical Approach and PCR2: A Practical Approach (IRL Press, Oxford, 1991 and 1995, respectively). For example, in a conventional PCR using Taq DNA polymerase, a double stranded target nucleic acid may be denatured at a temperature >90°C, primers annealed at a temperature in the range 50-75°C, and primers extended at a temperature in the range 72-78°C. The term "PCR" encompasses derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, and the like. Reaction volumes range from a few hundred nanoliters, e.g. 200 nL, to a few hundred μί, e.g. 200 μΐ^. "Reverse transcription PCR," or "RT-PCR," means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified, e.g. Tecott et al, U.S. patent 5,168,038, which patent is incorporated herein by reference. "Real-time PCR" means a PCR for which the amount of reaction product, i.e. amplicon, is monitored as the reaction proceeds. 
There are many forms of real-time PCR that differ mainly in the detection chemistries used for monitoring the reaction product, e.g. Gelfand et al, U.S. patent 5,210,015 ("TAQMAN™"); Wittwer et al, U.S. patents 6,174,670 and 6,569,627 (intercalating dyes); Tyagi et al, U.S. patent 5,925,517 (molecular beacons); which patents are incorporated herein by reference. Detection chemistries for real-time PCR are reviewed in Mackay et al, Nucleic Acids Research, 30: 1292-1305 (2002), which is also incorporated herein by reference. "Nested PCR" means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon. As used herein, "initial primers" in reference to a nested amplification reaction mean the primers used to generate a first amplicon, and "secondary primers" mean the one or more primers used to generate a second, or nested, amplicon. "Multiplexed PCR" means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are
simultaneously carried out in the same reaction mixture, e.g. Bernard et al, Anal. Biochem. 273: 221-228 (1999) (two-color real-time PCR). Usually, distinct sets of primers are employed for each sequence being amplified.
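The three-step cycle and the example Taq temperature windows given above can be captured in a small data structure. The sketch below is illustrative only: the `Step` class, the `check_cycle` helper, and the 100°C ceiling and inclusive 90°C floor on denaturation are assumptions for this example (the text states only ">90°C"), and it does not model step durations or ramp rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str      # "denature", "anneal", or "extend"
    temp_c: float  # block temperature in degrees Celsius


# Inclusive temperature windows taken from the conventional Taq example in the text.
# The denaturation bounds (90-100 C) are an assumption; the text says only ">90 C".
WINDOWS = {
    "denature": (90.0, 100.0),
    "anneal": (50.0, 75.0),
    "extend": (72.0, 78.0),
}


def check_cycle(steps):
    """Return a list of human-readable problems found in one PCR cycle."""
    problems = []
    if [s.name for s in steps] != ["denature", "anneal", "extend"]:
        problems.append("expected denature, anneal, extend in that order")
    for s in steps:
        if s.name not in WINDOWS:
            problems.append(f"unknown step: {s.name}")
            continue
        low, high = WINDOWS[s.name]
        if not (low <= s.temp_c <= high):
            problems.append(f"{s.name} at {s.temp_c} C is outside {low}-{high} C")
    return problems
```

An empty list means the cycle falls inside the example windows; it does not mean the protocol is optimal for a given primer pair.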
"Quantitative PCR" means a PCR designed to measure the abundance of one or more specific target sequences in a sample or specimen. Quantitative PCR includes both absolute quantitation and relative quantitation of such target sequences. Quantitative measurements are made using one or more reference sequences that may be assayed separately or together with a target sequence. The reference sequence may be endogenous or exogenous to a sample or specimen, and in the latter case, may comprise one or more competitor templates. Typical endogenous reference sequences include segments of transcripts of the following genes: β-actin, GAPDH, p2-microglobulin, ribosomal RNA, and the like. Techniques for quantitative PCR are well-known to those of ordinary skill in the art, as exemplified in the following references that are incorporated by reference: Freeman et al, Biotechniques, 26: 112-126 (1999); Becker- Andre et al, Nucleic Acids Research, 17: 9437-9447 (1989);
Zimmerman et al, Biotechniques, 21: 268-279 (1996); Diviacco et al, Gene, 122: 3013-3020 (1992); Becker-Andre et al, Nucleic Acids Research, 17: 9437-9446 (1989); and the like. "Polynucleotide" or "oligonucleotide" is used interchangeably and each means a linear polymer of nucleotide monomers. Monomers making up polynucleotides and oligonucleotides are capable of specifically binding to a natural polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, wobble base pairing, or the like. As described in detail below, by "wobble base" is meant a nucleic acid base that can base pair with a first nucleotide base in a complementary nucleic acid strand but that, when employed as a template strand for nucleic acid synthesis, leads to the incorporation of a second, different nucleotide base into the synthesizing strand. Such monomers and their internucleosidic linkages may be naturally occurring or may be analogs thereof, e.g. naturally occurring or non-naturally occurring analogs. Non-naturally occurring analogs may include peptide nucleic acids (PNAs, e.g., as described in U.S. Patent 5,539,082, incorporated herein by reference), locked nucleic acids (LNAs, e.g., as described in U.S. Patent 6,670,461, incorporated herein by reference), phosphorothioate internucleosidic linkages, bases containing linking groups permitting the attachment of labels, such as fluorophores, or haptens, and the like. Whenever the use of an
oligonucleotide or polynucleotide requires enzymatic processing, such as extension by a polymerase, ligation by a ligase, or the like, one of ordinary skill would understand that oligonucleotides or polynucleotides in those instances would not contain certain analogs of internucleosidic linkages, sugar moieties, or bases at any or some positions.
Polynucleotides typically range in size from a few monomeric units, e.g. 5-40, when they are usually referred to as "oligonucleotides," to several thousand monomeric units.
Whenever a polynucleotide or oligonucleotide is represented by a sequence of letters (upper or lower case), such as "ATGCCTG," it will be understood that the nucleotides are in 5'→3' order from left to right and that "A" denotes deoxyadenosine, "C" denotes deoxycytidine, "G" denotes deoxyguanosine, and "T" denotes thymidine, "I" denotes deoxyinosine, "U" denotes uridine, unless otherwise indicated or obvious from context. Unless otherwise noted the terminology and atom numbering conventions will follow those disclosed in Strachan and Read, Human Molecular Genetics 2 (Wiley-Liss, New York, 1999). Usually polynucleotides comprise the four natural nucleosides (e.g. deoxyadenosine, deoxycytidine, deoxyguanosine, deoxythymidine for DNA or their ribose counterparts for RNA) linked by phosphodiester linkages; however, they may also comprise non-natural nucleotide analogs, e.g. including modified bases, sugars, or internucleosidic linkages. It is clear to those skilled in the art that where an enzyme has specific oligonucleotide or polynucleotide substrate requirements for activity, e.g. single stranded DNA, RNA/DNA duplex, or the like, then selection of appropriate composition for the oligonucleotide or polynucleotide substrates is well within the knowledge of one of ordinary skill, especially with guidance from treatises, such as Sambrook et al, Molecular Cloning, Second Edition (Cold Spring Harbor Laboratory, New York, 1989), and like references.
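Because sequences are written 5'→3' from left to right, the strand that pairs with a written sequence is its reverse complement when also written 5'→3'. The following minimal sketch shows this for the four natural DNA letters; the function name is chosen for this example, and letters such as "I" or "U" are deliberately rejected rather than guessed at.

```python
# Translation table for the four natural DNA letters, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_ALLOWED = set("ACGTacgt")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5'->3'."""
    unknown = set(seq) - _ALLOWED
    if unknown:
        raise ValueError(f"unsupported letters: {sorted(unknown)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGCCTG"))  # CAGGCAT
```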
"Primer" means an oligonucleotide, either natural or synthetically produced, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3' end along the template so that an extended duplex is formed. The sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Usually primers are extended by a DNA polymerase. Primers are generally of a length compatible with its use in synthesis of primer extension products, and are usually are in the range of between 8 to 100 nucleotides in length, such as 10 to 75, 15 to 60, 15 to 40, 18 to 30, 20 to 40, 21 to 50, 22 to 45, 25 to 40, and so on, more typically in the range of between 18-40, 20-35, 21-30 nucleotides long, and any length between the stated ranges. Typical primers can be in the range of between 10-50 nucleotides long, such as 15-45, 18-40, 20-30, 21-25 and so on, and any length between the stated ranges. In some embodiments, the primers are usually not more than about 10, 12, 15, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, or 70 nucleotides in length.
Primers are usually single-stranded for maximum efficiency in amplification, but may alternatively be double-stranded. If double-stranded, the primer is usually first treated to separate its strands before being used to prepare extension products. This denaturation step is typically effected by heat, but may alternatively be carried out using alkali, followed by neutralization. Thus, a "primer" is complementary to a template, and complexes by hydrogen bonding or hybridization with the template to give a primer/template complex for initiation of synthesis by a polymerase, which is extended by the addition of covalently bonded bases linked at its 3' end complementary to the template in the process of DNA synthesis.
A "primer pair" as used herein refers to first and second primers having nucleic acid sequence suitable for nucleic acid-based amplification of a target nucleic acid. Such primer pairs generally include a first primer having a sequence that is the same or similar to that of a first portion of a target nucleic acid, and a second primer having a sequence that is complementary to a second portion of a target nucleic acid to provide for amplification of the target nucleic acid or a fragment thereof. Reference to "first" and "second" primers herein is arbitrary, unless specifically indicated otherwise. For example, the first primer can be designed as a "forward primer" (which initiates nucleic acid synthesis from a 5' end of the target nucleic acid) or as a "reverse primer" (which initiates nucleic acid synthesis from a 5' end of the extension product produced from synthesis initiated from the forward primer). Likewise, the second primer can be designed as a forward primer or a reverse primer.
"Readout" means a parameter, or parameters, that are measured and/or detected and can be converted to a number or value. In some contexts, readout may refer to an actual numerical representation of such collected or recorded data. For example, a readout of fluorescence intensity signals from a microarray is the address and fluorescence intensity of a signal being generated at each hybridization site of the microarray; thus, such a readout may be registered or stored in various ways, for example, as an image of the microarray, as a table of numbers, or the like.
"Reflex site", "reflex sequence" and equivalents are used to indicate one or more sequences present in a polynucleotide that are employed to move a domain intra-molecularly from its initial location to a different location in the polynucleotide. The use of reflex sequences is described in detail in U.S. provisional applications 61/235,595 and 61/288,792, filed on August 20, 2009 and December 21, 2009, respectively, and entitled "Compositions and Methods for Intramolecular Nucleic Acid Rearrangement Using Reflex Sequences", incorporated herein by reference. In certain embodiments, a reflex sequence is chosen so as to be distinct from other sequences in the polynucleotide (i.e., with little sequence homology to other sequences likely to be present in the polynucleotide, e.g., genomic or sub-genomic sequences to be processed). As such, a reflex sequence should be selected so as to not hybridize to any sequence except its complement under the conditions employed in the reflex processes. The reflex sequence may be a synthetic or artificially generated sequence (e.g., added to a polynucleotide in an adapter domain) or a sequence present normally in a polynucleotide being assayed (e.g., a sequence present within a region of interest in a polynucleotide being assayed). In the reflex system, a complement to the reflex sequence is present (e.g., inserted in an adapter domain) on the same strand of the polynucleotide as the reflex sequence (e.g., the same strand of a double- stranded polynucleotide or on the same single stranded polynucleotide), where the complement is placed in a particular location so as to facilitate an intramolecular binding and polymerization event on such particular strand. Reflex sequences employed in the reflex process described herein can thus have a wide range of lengths and sequences. Reflex sequences may range from 5 to 200 nucleotide bases in length.
"Solid support", "support", and "solid phase support" are used interchangeably and refer to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support(s) will take the form of beads, resins, gels, microspheres, or other geometric configurations. Microarrays usually comprise at least one planar solid phase support, such as a glass microscope slide.
"Specific" or "specificity" in reference to the binding of one molecule to another molecule, such as a labeled target sequence for a probe, means the recognition, contact, and formation of a stable complex between the two molecules, together with substantially less recognition, contact, or complex formation of that molecule with other molecules. In one aspect, "specific" in reference to the binding of a first molecule to a second molecule means that to the extent the first molecule recognizes and forms a complex with another molecule in a reaction or sample, it forms the largest number of the complexes with the second molecule. Preferably, this largest number is at least fifty percent. Generally, molecules involved in a specific binding event have areas on their surfaces or in cavities giving rise to specific recognition between the molecules binding to each other. Examples of specific binding include antibody-antigen interactions, enzyme-substrate interactions, formation of duplexes or triplexes among polynucleotides and/or oligonucleotides, receptor-ligand interactions, and the like. As used herein, "contact" in reference to specificity or specific binding means two molecules are close enough that weak noncovalent chemical interactions, such as Van der Waal forces, hydrogen bonding, base-stacking interactions, ionic and hydrophobic interactions, and the like, dominate the interaction of the molecules.
As used herein, the term "Tm" is used in reference to the "melting temperature." The melting temperature is the temperature (e.g., as measured in °C) at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are known in the art (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi, H.T. & SantaLucia, J., Jr., Biochemistry 36, 10581-94 (1997)) include alternative methods of computation which take structural and environmental, as well as sequence characteristics into account for the calculation of Tm.
"Sample" means a quantity of material from a biological, environmental, medical, or patient source in which detection, measurement, or labeling of target nucleic acids is sought. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples. A sample may include a specimen of synthetic origin. Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste. Biological samples may include materials taken from a patient including, but not limited to cultures, blood, saliva, cerebral spinal fluid, pleural fluid, milk, lymph, sputum, semen, needle aspirates, and the like. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the present invention.
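Returning to the definition of Tm given above, the simplest of the known estimates can be written in a few lines. The sketch below uses the widely taught Wallace rule of thumb (2°C per A or T, 4°C per G or C), which is a substitute chosen for brevity and is not one of the methods in the cited references; it is only a rough guide for short oligonucleotides and ignores salt, strand concentration, and nearest-neighbor effects. The function name is an assumption for this example.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq) or not seq:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    # 2 C for each A/T pair and 4 C for each G/C pair.
    return 2 * at + 4 * gc


print(wallace_tm("ATGCCTG"))  # 22
```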
The terms "upstream" and "downstream" in describing nucleic acid molecule orientation and/or polymerization are used herein as understood by one of skill in the art. As such, "downstream" generally means proceeding in the 5' to 3' direction, i.e., the direction in which a nucleotide polymerase normally extends a sequence, and "upstream" generally means the converse. For example, a first primer that hybridizes "upstream" of a second primer on the same target nucleic acid molecule is located on the 5' side of the second primer (and thus nucleic acid polymerization from the first primer proceeds towards the second primer).
It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as "solely", "only" and the like in connection with the recitation of claim elements, or the use of a "negative" limitation.
DETAILED DESCRIPTION OF THE INVENTION
Before the present invention is described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range is also specifically disclosed. Each smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included or excluded in the range, and each range where either, neither or both limits are included in the smaller ranges is also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, some potential and preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. It is understood that the present disclosure supersedes any disclosure of an incorporated publication to the extent there is a contradiction.
It must be noted that as used herein and in the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a nucleic acid" includes a plurality of such nucleic acids and reference to "the compound" includes reference to one or more compounds and equivalents thereof known to those skilled in the art, and so forth.
The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, New York, Gait, "Oligonucleotide Synthesis: A Practical Approach" 1984, IRL Press, London, Nelson and Cox (2000), Lehninger, A., Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y. and Berg et al. (2002) Biochemistry, 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.
The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
Nucleic Acids
The replication processes (as described in detail below) can be employed for the manipulation and analysis of polynucleotide sequences of interest from virtually any polynucleotide source, including but not limited to genomic DNA, complementary DNA (cDNA), RNA (e.g., messenger RNA, ribosomal RNA, short interfering RNA, microRNA, viral genomic RNA, etc.), plasmid DNA, mitochondrial DNA, etc. Furthermore, as any organism can be used as a source of polynucleotides to be processed in accordance with the present invention, no limitation in that regard is intended. Exemplary organisms include, but are not limited to, plants, animals (e.g., reptiles, mammals, insects, worms, fish, etc.), bacteria, fungi (e.g., yeast), viruses, etc.
In certain embodiments, the polynucleotide sequences are enriched prior to the replication processes described herein. By enriched is meant that the polynucleotide is subjected to a process that reduces the complexity of the polynucleotides, generally by increasing the relative concentration of particular polynucleotide species in the sample (e.g., having a specific locus of interest, including a specific polynucleotide sequence, lacking a locus or sequence, being within a specific size range, etc.). There are a wide variety of ways to enrich polynucleotides having a specific characteristic(s) or sequence, and as such any convenient method to accomplish this may be employed. The enrichment (or complexity reduction) can take place at any of a number of steps in the process, and will be determined by the desires of the user. For example, enrichment can take place in individual parental samples (e.g., untagged polynucleotides prior to adaptor ligation) or in multiplexed samples (e.g., polynucleotides tagged with primer binding sites, MIDs and/or reflex sequences and pooled; MIDs are described in further detail below). In certain embodiments, a sample may be enriched during isolation from its source. For example, isolation of pathogen-derived polynucleotides (e.g., viral genomes, e.g., Hepatitis C) from a patient sample in which the patient-derived polynucleotides are removed can be considered an enrichment process (e.g., the process enriches for pathogen-derived polynucleotides).
In certain embodiments, polynucleotides in the polynucleotide sample are amplified prior to processing or analysis. In certain of these embodiments, the amplification reaction also serves to enrich a starting polynucleotide sample for a sequence or locus of interest. For example, a starting polynucleotide sample can be subjected to a polymerase chain reaction (PCR) that amplifies one or more regions of interest. In certain embodiments, the amplification reaction is an exponential amplification reaction, whereas in certain other embodiments, the amplification reaction is a linear amplification reaction. Any convenient method for performing amplification reactions on a starting polynucleotide sample can be used in practicing the subject invention. In certain embodiments, the nucleic acid polymerase employed in the amplification reaction is a polymerase that has proofreading capability (e.g., phi29 DNA Polymerase, Thermococcus litoralis DNA polymerase, Pyrococcus furiosus DNA polymerase, etc.).
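The difference between exponential and linear amplification mentioned above can be made concrete with an idealized model. In this sketch, which is a textbook simplification and not a description of any particular reaction, exponential amplification multiplies the copy number by (1 + efficiency) each cycle because products serve as templates, while linear amplification adds a fixed yield per cycle because only the original templates are copied. The function names and the single `efficiency` parameter are assumptions for this example.

```python
def _check(start_copies, cycles, efficiency):
    if start_copies < 0 or cycles < 0 or not (0.0 <= efficiency <= 1.0):
        raise ValueError("need start_copies >= 0, cycles >= 0, 0 <= efficiency <= 1")


def exponential_copies(start_copies, cycles, efficiency=1.0):
    """Idealized copy number when every product also acts as a template."""
    _check(start_copies, cycles, efficiency)
    return start_copies * (1.0 + efficiency) ** cycles


def linear_copies(start_copies, cycles, efficiency=1.0):
    """Idealized copy number when only the original templates are copied."""
    _check(start_copies, cycles, efficiency)
    return start_copies * (1.0 + efficiency * cycles)


print(exponential_copies(10, 3))  # 80.0
print(linear_copies(10, 3))       # 40.0
```

Real reactions plateau as reagents are consumed, so the exponential figure is an upper bound rather than a prediction.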
In certain embodiments, the polynucleotide sample being processed or analyzed is derived from a single source (e.g., a single organism, virus, tissue, cell, subject, etc.), whereas in other embodiments, the polynucleotide sample is a pool of polynucleotides extracted from a plurality of sources (e.g., a pool of polynucleotides from a plurality of organisms, tissues, cells, subjects, etc.), where by "plurality" is meant two or more. As such, in certain embodiments, a polynucleotide sample can contain polynucleotides from 2 or more sources, 3 or more sources, 5 or more sources, 10 or more sources, 50 or more sources, 100 or more sources, 500 or more sources, 1000 or more sources, 5000 or more sources, up to and including about 10,000 or more sources. It is noted here that in certain embodiments, a single isolated polynucleotide sample can include polynucleotides derived from a plurality of sources. For example, viral genomes isolated from a single patient sample can include genomes derived from many different viruses (e.g., the sample may contain a plurality of different clonal genomes of a virus of interest, e.g., different clones of HepC virus).
In certain embodiments, polynucleotide fragments are to be pooled with polynucleotide fragments derived from a plurality of sources (e.g., a plurality of organisms, tissues, cells, subjects, etc.), where by "plurality" is meant two or more. In such embodiments, the polynucleotides derived from each source include a multiplex identifier (MID) such that the source from which each tagged polynucleotide fragment was derived can be determined. In such embodiments, each polynucleotide sample source is correlated with a unique MID, where by unique MID is meant that each different MID employed can be differentiated from every other MID employed by virtue of at least one characteristic, e.g., the polynucleotide sequence of the MID. Any type of MID can be used, including but not limited to those described in co-pending U.S. Patent Application Serial Number 11/656,746, filed on January 22, 2007, and titled "Nucleic Acid Analysis Using Sequence Tokens", as well as U.S. Patent 7,393,665, issued on July 1, 2008, and titled "Methods and Compositions for Tagging and Identifying Polynucleotides", both of which are incorporated herein by reference in their entirety for their description of nucleic acid tags and their use in identifying polynucleotides. In certain embodiments, a set of MIDs employed to tag a plurality of samples need not have any particular common property (e.g., Tm, length, base composition, etc.), as the asymmetric tagging methods (and many tag readout methods, including but not limited to sequencing of the tag or measuring the length of the tag) can accommodate a wide variety of unique MID sets.
In certain embodiments, each individual polynucleotide (e.g., double-stranded or single-stranded, as appropriate to the methodological details employed) in a sample to be analyzed is tagged with a unique MID so that the fate of each polynucleotide can be tracked in subsequent processes (where, as noted above, unique MID is meant to indicate that each different MID employed can be differentiated from every other MID employed by virtue of at least one characteristic, e.g., the polynucleotide sequence of the MID). For example (and as described below), having each polynucleotide tagged with a unique MID allows analysis of the sequence of each individual polynucleotide using the reflex sequence methods described herein. This allows the linkage of sequence information for large polynucleotide fragments that cannot be sequenced in a single sequencing run. As discussed below, specially designed replication adapters can be used to tag each polynucleotide in a sample with a different MID.
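The requirement that every MID be distinguishable from every other MID can be checked mechanically before a set of tags is used. The following is a minimal sketch only, assuming that sequence (or, for tags of unequal length, length) is the distinguishing characteristic; the function names and the example tags are hypothetical and are not taken from the applications cited above.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def distinguishable(a: str, b: str, min_mismatches: int = 1) -> bool:
    """Assumption: tags of different length can be told apart by length alone."""
    if len(a) != len(b):
        return True
    return hamming(a, b) >= min_mismatches


def clashing_mids(mids, min_mismatches: int = 1):
    """Return every pair of MIDs that cannot be told apart."""
    return [
        (a, b)
        for a, b in combinations(mids, 2)
        if not distinguishable(a, b, min_mismatches)
    ]


# Hypothetical source-to-MID assignment (illustrative sequences only).
source_to_mid = {
    "source_1": "ACGTAC",
    "source_2": "TGCATG",
    "source_3": "GATTACA",
}
clashes = clashing_mids(list(source_to_mid.values()))
```

Raising `min_mismatches` above 1 is one possible way to keep tags distinguishable when the readout method is error-prone; that choice is left to the user.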
Using Multiple Sequence-Specific Synthesis Primers to Replicate a Polynucleotide Having Stable Secondary Structure
Single-stranded polynucleotides (e.g., RNA) can form a host of complex secondary structures, including hairpin loops, bulge loops, interior loops, or junctions. Secondary structures are often thermodynamically stable and can inhibit replication (e.g., reverse transcription (RT) (Harrison et al., 1998)). Certain templates, especially when long or highly structured, are refractory to known approaches for improving replication efficiency.
Oligonucleotide binding can be used to alter the secondary structure of a single-stranded DNA, or RNA, molecule. This approach can be used to prevent hairpin formation and/or other secondary structure formation in a polynucleotide, and thus facilitate its replication (e.g., reverse transcription).
For example, minimizing or reducing secondary structure in a single-stranded polynucleotide can be achieved by hybridization to multiple single-stranded synthesis primers (or oligonucleotides, e.g., DNA, or LNA). LNAs can be advantageous for use in highly thermostable regions of a polynucleotide (Fratczak et al., 2009) since the relative stability of duplexes is RNA:LNA > RNA:RNA > RNA:DNA. In certain embodiments, the positioning of oligonucleotides can be optimized in silico by prior analysis of RNA secondary structure. An analogous approach has previously been used to alter the secondary structure of single-stranded DNA into complex shapes (Rothemund, 2006). In another approach, a mixture of synthesis primers having random sequences may be employed. In certain other embodiments, both sequence-specific and random synthesis primers are employed.
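As an illustration of in silico primer positioning, the sketch below scans an RNA sequence for simple hairpins and designs a DNA synthesis primer complementary to one arm of a detected stem. It is a deliberately crude stand-in for dedicated RNA folding software: it assumes perfect Watson-Crick stems of a fixed length, ignores G-U wobble pairs and thermodynamics, and uses hypothetical function names and a made-up example sequence.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def rna_revcomp(seq: str) -> str:
    """Reverse complement of an RNA sequence, written 5'->3'."""
    return seq.translate(RNA_COMPLEMENT)[::-1]


def find_hairpins(rna: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (start, end) half-open intervals of perfect stem-loop hairpins."""
    hits = []
    n = len(rna)
    for i in range(n - 2 * stem - min_loop + 1):
        partner = rna_revcomp(rna[i:i + stem])  # sequence the 3' arm must have
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop                 # start of the candidate 3' arm
            if j + stem > n:
                break
            if rna[j:j + stem] == partner:
                hits.append((i, j + stem))
    return hits


def dna_primer_for(rna: str, start: int, end: int) -> str:
    """DNA oligonucleotide (5'->3') complementary to rna[start:end]."""
    return rna[start:end].translate(RNA_TO_DNA_COMPLEMENT)[::-1]


# Made-up RNA with one hairpin: 6-nt stem, 4-nt loop.
rna = "AAAA" + "GCAUCG" + "UUUU" + "CGAUGC" + "AAAA"
hairpins = find_hairpins(rna)
hp_start, hp_end = hairpins[0]
# Primer covering the loop and the 3' arm, so the stem cannot re-form.
primer = dna_primer_for(rna, hp_end - 10, hp_end)
```

Annealing the primer across the loop and one arm of the stem is only one possible placement; other placements may be preferable for a given template.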
Aspects of the present invention include methods of replicating a region in a polynucleotide having stable secondary structure. In certain embodiments, the method includes denaturing a polynucleotide having a region of stable secondary structure, annealing at least two different sequence-specific nucleic acid synthesis primers to the denatured polynucleotide at different sites, where at least one of the sequence-specific nucleic acid synthesis primers disrupts the region of stable secondary structure of the polynucleotide when annealed, and extending the annealed sequence-specific synthesis primers (by contacting the synthesis-primer annealed polynucleotide with a nucleic acid polymerase in a nucleic acid synthesis reaction). The extension reaction leaves a nick (or gap) between the extension product of an upstream synthesis primer and the extension product of a downstream synthesis primer. Any gap or nick is closed by ligating adjacent upstream and downstream extension products together, thus replicating the region of stable secondary structure in the polynucleotide.
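The anneal, extend and ligate sequence can be pictured with a small coordinate model. This is a sketch under stated assumptions, not a description of any particular reaction: the template is numbered 5' to 3', each primer site is a half-open interval on the template, the polymerase has no strand displacement activity and always extends until it reaches the next primer (so only nicks, not gaps, are modeled), and all names are hypothetical.

```python
def extension_products(primer_sites):
    """Template intervals covered by each extension product.

    Each primer is extended toward the template's 5' end until it reaches
    the 5' end of the next annealed primer (or the end of the template).
    """
    products = []
    boundary = 0  # template position where the current product must stop
    for start, end in sorted(primer_sites):
        if start < boundary:
            raise ValueError("primer sites overlap")
        products.append((boundary, end))
        boundary = end
    return products


def nick_positions(products):
    """Template positions at which adjacent extension products abut."""
    return [start for start, _end in products[1:]]


def ligate(products):
    """Join abutting products into one interval; fail if a gap remains."""
    for (_s1, e1), (s2, _e2) in zip(products, products[1:]):
        if e1 != s2:
            raise ValueError("products do not abut; a gap remains")
    return (products[0][0], products[-1][1])


# Two hypothetical primer sites on a 100-nt template.
products = extension_products([(60, 80), (20, 40)])
nicks = nick_positions(products)
replicated = ligate(products)
```

In this model the single ligated product spans the template from its 5' end to the 3'-most primer site, which is the sense in which the structured region between the primers is replicated.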
In certain embodiments, the polynucleotide is single stranded RNA and the nucleic acid polymerase is a reverse transcriptase. In certain embodiments, the RNA is a viral RNA, e.g., a hepatitis virus RNA. In certain embodiments, a sequence-specific nucleic acid synthesis primer that disrupts the region of stable secondary structure of the polynucleotide is selected from: an LNA primer, a PNA primer, an RNA primer and a DNA primer.
In certain embodiments, the nucleic acid polymerase in the contacting step has 5' to 3' strand displacement activity, thereby generating a 5' branch on the downstream extension product. In such embodiments, the contacting step further comprises removing the 5' branch to produce the nick. In certain embodiments, removing the 5' branch comprises contacting the extended product with a single strand-specific nuclease. In certain embodiments, the single strand-specific nuclease is selected from: S1 nuclease, Mung bean nuclease, FEN1 nuclease, and RecJf nuclease.
In certain embodiments, the polynucleotide includes multiple regions of stable secondary structure. In these embodiments, sequence-specific nucleic acid synthesis primers specific for each of the multiple regions of stable secondary structure are employed. No limitation with regard to the number of synthesis primers or regions of secondary structure in the polynucleotide is intended. As such, the polynucleotide may include 1, 2, 3 or more, 5 or more, 10 or more, or up to 20 or more regions of stable secondary structure. Synthesis primers specific for any subset of or all of the regions of stable secondary structure in a polynucleotide may be employed.
In an exemplary approach, as shown in Figure 1, multiple synthesis primers (or oligonucleotides) are annealed to an RNA having multiple regions of stable secondary structure (DNA polynucleotides may also be replicated using the methods described herein; RNA is employed only as an example). Determining such regions of stable secondary structure can be done by any convenient method, and can include the use of either or both in silico approaches and experimental approaches. No limitation in this regard is intended.
In Figure 1, the target RNA is denatured and synthesis primers are annealed, where the annealed synthesis primers disrupt the re-formation of the secondary structure of the single stranded RNA. The annealed synthesis primers are then extended by reverse transcriptase. In certain embodiments, the dNTP mixture of the extension reaction can include a dNTP that reduces the secondary structure of the synthesized product. In embodiments in which the synthesized product is DNA, 7-deaza-dGTP may replace dGTP. In embodiments in which the synthesized product is RNA (e.g., using an RNA polymerase in the synthesis step), dITP (inosine) may replace dGTP. The selection of the dNTP mix is up to the desires of the user, and thus no limitation in this regard is intended. As shown in Figure 1, the presence of multiple oligonucleotides on the same RNA strand means that an extended oligonucleotide (an upstream extension product) can meet another oligonucleotide or another extended strand located to its 3' (a downstream extension product). If the polymerase employed does not have displacement activity, it will stop at this junction, leaving a nick (or gap).
In other embodiments, nucleotide polymerases having 5' to 3' strand displacement activity are used (e.g., reverse transcriptase, e.g., MMLV and Superscript™ RT as shown in Figure 1, or other nucleotide polymerases, which will depend on the polynucleotide being replicated and the desires of the user). These polymerases will strand displace when they encounter a 3' synthesis primer/extension product. For MMLV and Superscript™ RT, the displacement activity is generally relatively small (<100 nt, about 30 nucleotides) (Invitrogen). As shown in Figure 1, this results in an RNA:DNA heteroduplex in which the downstream extension product is displaced leaving a 5' branch. The 5' branch consists of single-stranded DNA (which might include the entire synthesis primer sequence). The 5' branches (or flaps) can be removed using single strand-specific nucleases, such as S1 or Mung Bean nuclease, 5' endonucleases such as flap endonuclease (FEN1, as used for the Invader Assay), or 5' exonucleases such as RecJf.
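The effect of strand displacement on the junction between two extension products can be sketched with the same kind of coordinate bookkeeping. The model below is an assumption-laden illustration: the template is numbered 5' to 3', products are half-open template intervals, the displaced length is supplied by the caller rather than predicted, and the function name is hypothetical.

```python
def strand_displace(upstream, downstream, displaced_nt):
    """Model an upstream product invading a downstream product.

    upstream and downstream are abutting template intervals, with the
    upstream product extending toward the downstream one. Returns the
    upstream interval after displacement, the part of the downstream
    product still annealed, and the length of the 5' flap.
    """
    up_start, up_end = upstream
    down_start, down_end = downstream
    if up_start != down_end:
        raise ValueError("products must abut at the junction")
    if not 0 <= displaced_nt < down_end - down_start:
        raise ValueError("displacement must leave part of the product annealed")
    new_junction = down_end - displaced_nt
    return (new_junction, up_end), (down_start, new_junction), displaced_nt


# Upstream product covers template 40..80, downstream product 0..40;
# the polymerase displaces 30 nt, the approximate figure quoted above.
up_after, down_annealed, flap_len = strand_displace((40, 80), (0, 40), 30)
# After the 30-nt flap is removed by a nuclease, the nick sits here:
nick_after_flap_removal = up_after[0]
```

Once the flap is removed, the two products again abut on the template, so the ligation step applies exactly as in the non-displacing case.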
Once the 5' branches are digested/removed, a nick remains between juxtaposed DNA bases on the RNA backbone. These can be sealed by ligation, e.g., using a DNA ligase under appropriate conditions. If desired, the RNA can then be removed, e.g., using standard methods, e.g. RNaseH or NaOH. The single-stranded DNA can then be employed for subsequent manipulation/analysis as desired.
For example, replicated polynucleotides can be processed and subjected to sequence analysis, where in certain embodiments, the polynucleotides are sequenced such that clonal information is maintained (i.e., the sequence of each individual polynucleotide in the sample is obtained). As but one example, polynucleotides produced as described can be subjected to a Reflex process (as described above), which allows clonal sequence information to be obtained from very long polynucleotides in a sample.
As with the Reflex process, certain process steps in a workflow rely on the presence of one or more specific additional domains in the polynucleotide under study. As such, one or more additional domains may be incorporated into the polynucleotide (e.g., as domains present in an adapter or as a tail on a synthesis primer) either before, during or after the replication step noted above. No limitation with regard to domains or the method of attaching them to a polynucleotide of interest is intended. Exemplary domains may include one or more of the following: MID, uMID, primer binding site, cloning/restriction enzyme site, reflex sequence, promoter sequence, and/or another sequence (e.g., a lead sequence present after a promoter sequence that provides a buffer for initiation of nucleic acid synthesis from the promoter). As noted, the domains can be incorporated by ligation, or included on one or more synthesis primer(s) during nucleic acid synthesis. Alternatively, the domains can be incorporated during subsequent reaction steps, e.g. via incorporation during PCR. In certain embodiments, the additional domains are protected from endonucleases or exonucleases, e.g., by incorporation of phosphorothioate linkages or other blocking groups. In certain embodiments, the domains consist of double-stranded DNA, and are resistant to endonucleases or exonucleases that preferentially target single-stranded DNA.
Replicating a Target Region of a Polynucleotide Using a Replication Adapter
Aspects of the present invention are drawn to methods of replicating a polynucleotide sequence, including the steps of annealing a replication adapter to a target polynucleotide that includes a region of interest (or multiple regions of interest) to be replicated and extending from the annealed replication adapter to replicate the region of interest.
The replication adapter includes a double stranded (or duplex) region and a 3' single-stranded region (the annealing region). The duplex region includes a nucleic acid polymerase promoter site (e.g., RNA or DNA polymerases) and the 3' single-stranded annealing region includes a sequence that is complementary to a location that is 3' to one or more region(s) of interest in the polynucleotide (this is the region that is to be replicated, as shown below). As such, the annealing region can be designed to anneal to virtually any site in the target polynucleotide, and as such no limitation in this regard is intended.
In the replication adapter, the nucleic acid polymerase promoter site is oriented such that nucleic acid synthesis proceeds towards the 3' single-stranded annealing region. In this orientation, nucleic acid synthesis from the promoter will employ the annealed target polynucleotide as a template, thus replicating the region of interest (and any downstream region of interest) in the target polynucleotide.
In the extension reaction, the annealed adapter-polynucleotide complex is contacted with a nucleic acid polymerase specific for the nucleic acid polymerase promoter site in a nucleic acid synthesis reaction, thereby producing a replication product that includes the region(s) of interest (and thus replicating the region(s) of interest).
In certain embodiments, the double stranded region of the replication adapter includes one or more additional domains positioned between the nucleic acid polymerase promoter site and the 3' overhang region, wherein the replication product includes the one or more additional domain.
In certain embodiments, the one or more additional domains are selected from one or more of: MID, primer binding site, cloning/restriction enzyme site, reflex sequence, lead sequence.
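The domain layout of a replication adapter can be written down as a small data structure. The sketch below is illustrative only: the class and function names are hypothetical, and every sequence shown is an arbitrary placeholder rather than a real promoter, primer binding site, MID or reflex sequence. It assumes a DNA adapter annealing to an RNA target, with the overhang-bearing strand written 5' to 3' as promoter, additional domains, then annealing region.

```python
from dataclasses import dataclass

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def dna_revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return seq.translate(DNA_COMPLEMENT)[::-1]


def annealing_region_for(target_site_rna: str) -> str:
    """DNA sequence (5'->3') complementary to a site on an RNA target."""
    return target_site_rna.translate(RNA_TO_DNA_COMPLEMENT)[::-1]


@dataclass
class ReplicationAdapter:
    promoter: str    # polymerase promoter site (duplex region)
    domains: list    # (name, sequence) pairs, ordered promoter -> overhang
    annealing: str   # 3' single-stranded annealing region

    def duplex_region(self) -> str:
        return self.promoter + "".join(seq for _name, seq in self.domains)

    def top_strand(self) -> str:
        """Strand carrying the 3' single-stranded overhang."""
        return self.duplex_region() + self.annealing

    def bottom_strand(self) -> str:
        """Strand pairing with the duplex region only."""
        return dna_revcomp(self.duplex_region())


# Placeholder sequences, chosen only to make the example concrete.
adapter = ReplicationAdapter(
    promoter="AAGGTTCC",
    domains=[("primer_binding_site", "ACAC"), ("MID", "GATTACA"), ("reflex", "TTGG")],
    annealing=annealing_region_for("AUGGCUAC"),
)
```

Because the additional domains sit between the promoter and the overhang, any product synthesized from the promoter toward the annealing region carries them ahead of the copied target sequence.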
In certain embodiments, the replication adapter includes a loop region connecting the complementary strands of the double-stranded region at a position opposite the 3' single-stranded annealing region.
In certain embodiments, the target polynucleotide is single-stranded RNA and the annealed adapter-polynucleotide complex is subjected to a reverse transcription reaction to extend the 3' single stranded region.
In certain embodiments, the single-stranded polynucleotide is viral RNA.
In certain embodiments, the viral RNA is from a hepatitis virus.
In certain embodiments, the nucleic acid polymerase promoter site in the replication adapter is a T3 promoter site and the nucleic acid polymerase is T3 RNA polymerase.
In certain embodiments, the nucleic acid synthesis reaction includes inosine triphosphate (ITP), wherein the reaction product has reduced stable secondary structure as compared to a reaction product produced in the absence of ITP.
In certain embodiments, the method further includes treating the annealed adapter-polynucleotide complex with an exonuclease having 3' to 5' single-stranded exonuclease activity, wherein any region of the target polynucleotide that is 3' to the region of interest is removed.
As described below and shown schematically in Figures 3-6, embodiments of the replication method described above can be employed to tag each distinct viral RNA genome in a sample with a unique MID and append specific functional domains (e.g., sequencing primer binding sites). In such embodiments, the resulting tagged and replicated viral genomes may be subjected to subsequent analyses that exploit the attached functional domains and MID sequences, including (but not limited to) amplification reactions, sequence variant enrichment/isolation reactions, Reflex reactions, and/or sequencing (e.g., using next generation sequencing).
Thus, the replication adapters disclosed herein and methods for their use in replicating polynucleotides allow for the sequencing of multiple regions of interest originating from single viral genome molecules. In other words, embodiments of the subject invention allow the analysis and sequencing of single polynucleotide molecules (regardless of their origin).
In the exemplary process shown in Figures 2-6, the following features of viral RNA polymerases are exploited: (1) the ability to copy DNA or RNA sequences; and (2) the ability to cross gaps (or nicks) in the template strand.
Figure 2 shows a representative Hepatitis C genome map (single stranded RNA shown at the top) with the genomic regions indicated. The start/stop sites for each region are noted, as well as the length for each region. The exact lengths vary depending on a particular strain. Regions of interest for this embodiment are indicated (i.e., NS3 (helicase/protease), NS5A (membrane association/cofactor) and NS5B (RNA polymerase)). For the exemplary process of Figure 2, sequences present in two or three of these regions from a single molecule are desired (e.g., the full sequence of each region and/or only a portion of the sequence in each region may be desired, which will be determined by the user).
As shown in Figure 2A (Step 1b), a DNA vector-probe (also referred to as a replication adapter) is first prepared containing a T3 RNA polymerase promoter, an NGS primer (e.g. 454A), a diverse population code (or uMID), a reflex site, and a long probe corresponding to any portion of a viral genome of interest. "uMID", as used herein, refers to having a sufficient excess in the number of multiplex identifiers in relation to the number of polynucleotides to be tagged in a sample so that, when the uMIDs are attached, each individual tagged molecule in the sample has a unique MID sequence.
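How large an excess of identifiers counts as "sufficient" can be estimated with a birthday-problem calculation. The sketch below assumes, as a simplification not stated in the source, that each molecule receives an identifier drawn independently and uniformly at random from a pool of fully random sequences of a given length; the function names and the 0.99 target are hypothetical choices.

```python
import math


def p_all_unique(n_molecules: int, n_mids: int) -> float:
    """Probability that n_molecules all receive different identifiers."""
    if n_molecules > n_mids:
        return 0.0
    # Product of (1 - i / n_mids) for i = 0 .. n_molecules - 1, in log space.
    log_p = sum(math.log1p(-i / n_mids) for i in range(n_molecules))
    return math.exp(log_p)


def min_random_mid_length(n_molecules: int, p_target: float = 0.99,
                          alphabet: int = 4) -> int:
    """Shortest random identifier length meeting the uniqueness target."""
    length = 1
    while p_all_unique(n_molecules, alphabet ** length) < p_target:
        length += 1
    return length


# Identifier length needed to tag 1000 genomes uniquely with >= 99% probability.
needed = min_random_mid_length(1000)
```

The required pool grows roughly with the square of the number of molecules to be tagged, which is why a large excess of identifiers over molecules is needed.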
In Step 1a of Figure 2, a clone of the desired polynucleotide is obtained. In Step 1b of Figure 2, the duplex region of the replication adapter is produced, to which the annealing region is added, which can be achieved in any of a variety of convenient methods. Figure 3 (which describes Step 2) provides exemplary ways of producing the probe-conjugated vector (or replication adapter). It is noted here that the probe-conjugated vector is synthesized with enough diversity in the population code (MID) to enable each viral genome in any sample to be individually 'tagged'. (As noted above, such a diverse population code is herein referred to as a uMID).
In Step 3, shown in Figure 4, the vector-probe conjugate is mixed with the RNA genomes and allowed to hybridize/anneal. In embodiments where excess single stranded RNA is present 3' of the annealing site (noted as "Case 2" in Figure 4), this single stranded RNA can be trimmed/removed, e.g., by treatment with ExoT (which removes ssRNA in a 3'-5' direction). As an optional step (as shown in Figure 3b), the hybridized complex can be extended, e.g., with reverse transcriptase.
In vitro transcription by T3 RNA polymerase (shown in Step 4 of Figure 4) produces replicated/amplified RNA copies of the hybridized viral genomic RNA, whilst tagging each transcript with an MID that is unique to each individual genome (other vector sequences are also added, e.g., sequence primer binding site 454A and the Reflex sequence).
In certain embodiments, the polymerase reaction includes an NTP mix that contains one or more NTPs that can reduce secondary structure in the resultant RNA. For example, in vitro transcription performed in the presence of inosine triphosphate (ITP) produces amplified RNA having reduced secondary structure (as noted above). This reduced secondary structure of the single stranded RNA enables the reverse transcription of long cDNAs therefrom, thus enabling more robust processing and analysis steps, e.g., the sequencing of different regions in the polynucleotide that are long distances apart.
For example, the long cDNAs described above can serve as substrates for Reflex reactions, which allow the extraction of any desired region of interest whilst retaining the uMID (i.e., so that any obtained sequence can be correlated back to a single specific viral genome), exemplary embodiments of which are shown in Figure 5 (Step 5, option 1) and Figure 6 (Step 5, option 2). Products of reflex reactions can be sequenced using NGS.
Sequencing data can be demultiplexed to determine individual viral sequences across multiple different regions.
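The demultiplexing step amounts to grouping reads by their uMID so that sequences from different regions can be linked back to the same genome. The sketch below is a minimal illustration that assumes the uMID and the region of origin have already been extracted from each read; the function names, uMIDs and sequences are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads):
    """Group reads by uMID, then by region.

    reads: iterable of (umid, region, sequence) tuples.
    Returns {umid: {region: [sequence, ...]}}.
    """
    by_genome = defaultdict(lambda: defaultdict(list))
    for umid, region, sequence in reads:
        by_genome[umid][region].append(sequence)
    return {umid: dict(regions) for umid, regions in by_genome.items()}


def genomes_covering(demuxed, regions):
    """uMIDs for which every requested region was observed."""
    wanted = set(regions)
    return sorted(u for u, found in demuxed.items() if wanted <= set(found))


# Made-up reads: two genomes, identified only by their uMIDs.
reads = [
    ("ACGTACGT", "NS3", "GGAUCC"),
    ("ACGTACGT", "NS5B", "CCUAGG"),
    ("TTGGCCAA", "NS3", "GGAUCA"),
]
demuxed = demultiplex(reads)
linked = genomes_covering(demuxed, ["NS3", "NS5B"])
```

Here only the first genome has reads from both regions, so only its NS3 and NS5B sequences can be reported as originating from one molecule.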
Kits and Systems
Also provided by the subject invention are kits and systems for practicing the subject methods, as described above, such as vectors configured to generate replication adapters or synthesis primers for inhibiting stable secondary structure formation in a polynucleotide as described herein. The various components of the kits may be present in separate containers or certain compatible components may be precombined into a single container, as desired.
The subject systems and kits may also include one or more other reagents for preparing or processing a polynucleotide sample according to the subject methods. The reagents may include one or more matrices, solvents, sample preparation reagents, buffers, desalting reagents, enzymatic reagents, denaturing reagents, where calibration standards such as positive and negative controls may be provided as well. As such, the kits may include one or more containers such as vials or bottles, with each container containing a separate component for carrying out a sample processing or preparing step and/or for carrying out one or more steps of a nucleic acid variant isolation assay according to the present invention. In addition to above-mentioned components, the subject kits typically further include instructions for using the components of the kit to practice the subject methods, e.g., to prepare nucleic acid samples for performing a replication reaction according to aspects of the subject methods. The instructions for practicing the subject methods are generally recorded on a suitable recording medium. For example, the instructions may be printed on a substrate, such as paper or plastic, etc. As such, the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or sub-packaging) etc. In other embodiments, the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g. CD-ROM, diskette, etc. In yet other embodiments, the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided. An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.
In addition to the subject database, programming and instructions, the kits may also include one or more control samples and reagents, e.g., two or more control samples for use in testing the kit.
Utility
The methods and compositions provided herein find use in numerous genetic analyses. For example, the replication adapters allow production of samples in which each individual polynucleotide in a sample (e.g., each viral genome) is uniquely tagged such that it can be tracked in subsequent analyses. The ability to replicate regions of a polynucleotide having stable secondary structure will also facilitate genetic analyses, e.g., of viral genomes.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it is readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
Accordingly, the preceding merely illustrates the principles of the invention. It will be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the present invention, therefore, is not intended to be limited to the exemplary embodiments shown and described herein. Rather, the scope and spirit of the present invention are embodied by the appended claims.

Claims

That which is claimed is:
1. A method of replicating a region of stable secondary structure in a polynucleotide, the method comprising:
denaturing a polynucleotide having a region of stable secondary structure;
annealing at least two different sequence-specific nucleic acid synthesis primers to the denatured polynucleotide, wherein at least one of the sequence-specific nucleic acid synthesis primers disrupts the region of stable secondary structure of the polynucleotide when annealed;
contacting the synthesis-primer annealed polynucleotide with a nucleic acid polymerase in a nucleic acid synthesis reaction, wherein the annealed at least two different sequence-specific nucleic acid synthesis primers are extended, wherein a nick is produced between the extension product of the upstream synthesis primer and the extension product of the downstream synthesis primer; and
ligating the upstream and downstream extension products, thereby replicating the region of stable secondary structure in the polynucleotide.
2. The method of claim 1, wherein the polynucleotide is single stranded RNA and the nucleic acid polymerase is a reverse transcriptase.
3. The method of claim 2, wherein the RNA is a viral RNA.
4. The method of claim 3, wherein the viral RNA is from a hepatitis virus.
5. The method of claim 1, wherein the at least one sequence-specific nucleic acid synthesis primer that disrupts the region of stable secondary structure of the polynucleotide is selected from: an LNA primer, a PNA primer, an RNA primer and a DNA primer.
6. The method of claim 1, wherein the nucleic acid polymerase in the contacting step has 5' to 3' strand displacement activity, thereby generating a 5' branch on the downstream extension product, and wherein the contacting step further comprises removing the 5' branch to produce the nick.
7. The method of claim 6, wherein removing the 5' branch comprises contacting the extended product with a single strand-specific nuclease.
8. The method of claim 7, wherein the single strand-specific nuclease is selected from: S1 nuclease, Mung bean nuclease, FEN1 endonuclease, and RecJf exonuclease.
9. The method of claim 1, wherein the polynucleotide comprises multiple regions of stable secondary structure, wherein the at least two different sequence-specific nucleic acid synthesis primers comprise at least one sequence-specific nucleic acid synthesis primer that disrupts each of the multiple regions of stable secondary structure of the polynucleotide when annealed.
10. A method of replicating a polynucleotide sequence, the method comprising:
annealing a replication adapter to a target polynucleotide comprising a region of interest, wherein the replication adapter comprises a double stranded region and a 3' single-stranded annealing region, wherein the double stranded region comprises a nucleic acid polymerase promoter site and the 3' single-stranded annealing region comprises a sequence complementary to a site 3' to, or at the 3' end of, the region of interest, wherein the nucleic acid polymerase promoter site is oriented such that nucleic acid synthesis proceeds towards the 3' single-stranded annealing region; and
contacting the annealed adapter-polynucleotide complex with a nucleic acid polymerase specific for the nucleic acid polymerase promoter site in a nucleic acid synthesis reaction to produce a replication product, wherein the replication product comprises a copy of the region of interest, thereby replicating the region of interest.
11. The method of claim 10, wherein the double stranded region of the replication adapter comprises one or more additional domains positioned between the nucleic acid polymerase promoter site and the 3' overhang region, wherein the replication product comprises the one or more additional domain.
12. The method of claim 11, wherein the one or more additional domains are selected from one or more of: MID, uMID, primer binding site, cloning/restriction enzyme site, reflex sequence, and promoter sequence.
13. The method of claim 10, wherein the replication adapter comprises a loop region connecting the complementary strands of the double-stranded region at a position opposite the 3' single-stranded annealing region.
14. The method of claim 10, wherein the polynucleotide is single-stranded RNA and wherein the annealed adapter-polynucleotide complex is subjected to a reverse transcription reaction to extend the 3' single stranded region.
15. The method of claim 14, wherein the single-stranded RNA is a viral RNA.
16. The method of claim 15, wherein the viral RNA is from a hepatitis virus.
17. The method of claim 1, wherein the nucleic acid polymerase promoter site in the replication adapter is a T3 promoter site and the nucleic acid polymerase is T3 RNA polymerase.
18. The method of claim 17, wherein the nucleic acid synthesis reaction comprises inosine tri-phosphate (ITP), wherein the reaction product has reduced stable secondary structure as compared to a reaction product produced in the absence of ITP.
19. The method of claim 10, further comprising treating the annealed adapter-polynucleotide complex with an exonuclease having 3' to 5' single-stranded exonuclease activity, wherein any region of the target polynucleotide that is 3' to the region of interest is removed.
20. A kit comprising:
a replication adapter comprising a double stranded region and a 3' single-stranded annealing region, wherein the double stranded region comprises a nucleic acid polymerase promoter site and the 3' single-stranded annealing region comprises a sequence complementary to a site 3' to, or at the 3' end of, a region of interest in a target polynucleotide, wherein the nucleic acid polymerase promoter site is oriented such that nucleic acid synthesis proceeds towards the 3' single-stranded annealing region; and
secondary structure destabilizing nucleotides.
21. A kit comprising:
two different sequence-specific nucleic acid synthesis primers specific for a single stranded target polynucleotide, wherein at least one of the sequence-specific nucleic acid synthesis primers disrupts a region of stable secondary structure of the target polynucleotide when annealed.
22. A replicated polynucleotide as produced by the method of any one of claims 1 to 9.
23. A replicated polynucleotide as produced by the method of any one of claims 10 to 19.
PCT/IB2011/000830 2010-03-02 2011-02-28 Methods for replicating polynucleotides with secondary structure WO2011107887A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US30976810P 2010-03-02 2010-03-02
US61/309,768 2010-03-02

Publications (2)

Publication Number Publication Date
WO2011107887A2 (en) 2011-09-09
WO2011107887A3 (en) 2012-01-19

Family

ID=44259647

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2011/000830 WO2011107887A2 (en) 2010-03-02 2011-02-28 Methods for replicating polynucleotides with secondary structure

Country Status (1)

Country Link
WO (1) WO2011107887A2 (en)

Also Published As

Publication number Publication date
WO2011107887A3 (en) 2012-01-19

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 11723111; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 11723111; Country of ref document: EP; Kind code of ref document: A2)