US20050283320A1 - Method and system for profiling biological systems - Google Patents

Method and system for profiling biological systems

Info

Publication number
US20050283320A1
Authority
US
United States
Prior art keywords
cells
data sets
fluid
data
measurements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/141,253
Inventor
Noubar Afeyan
Jan Van Der Greef
Frederick Regnier
Aram Adourian
Eric Neumann
Matej Oresic
Elwin Verheij
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BG Medicine Inc
Original Assignee
BG Medicine Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BG Medicine Inc
Priority to US11/141,253
Assigned to NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETENSCHAPPELIJK ONDERZOEK TNO. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: VERHEIJ, ELWIN ROBBERT
Assigned to BG MEDICINE, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: NEUMANN, ERIC K., REGNIER, FREDERICK E., ORESIC, MATEJ, ADOURIAN, ARAM S., AFEYAN, NOUBAR B., GREEF, JAN VAN DER
Publication of US20050283320A1
Assigned to BG MEDICINE, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETENSCHAPPELIJK ONDERZOEK TNO
Assigned to SILICON VALLEY BANK. SECURITY AGREEMENT. Assignors: BG MEDICINE, INC.
Assigned to BG MEDICINE, INC. RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: SILICON VALLEY BANK
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01J ELECTRIC DISCHARGE TUBES OR DISCHARGE LAMPS
    • H01J49/00 Particle spectrometers or separator tubes
    • H01J49/0027 Methods for using particle spectrometers
    • H01J49/0036 Step by step routines describing the handling of the data generated during a measurement
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10T TECHNICAL SUBJECTS COVERED BY FORMER US CLASSIFICATION
    • Y10T436/00 Chemistry: analytical and immunological testing
    • Y10T436/24 Nuclear magnetic resonance, electron spin resonance or other spin effects or mass spectrometry

Definitions

  • the invention relates to the field of data processing and evaluation.
  • the invention relates to an analytical technology platform for separating and measuring multiple components of a biological sample, and statistical data processing methods for identifying components and revealing patterns and relationships between and among the various measured components.
  • biomarker-patterns may be necessary to characterize and diagnose homeostasis or disease states for such diseases.
  • NMR nuclear magnetic resonance
  • MS mass spectrometry
  • the present invention addresses limitations in current profiling techniques by providing a method and system (or collectively “technology platform”) utilizing hierarchical multivariate analysis of spectrometric data on one or more levels.
  • the present invention further provides a technology platform that facilitates the discernment of similarities, differences, and/or correlations not only between single biomolecular components of a sample or biological system, but also between patterns of biomolecular components of a single biomolecular component type.
  • biomolecule component type refers to a class of biomolecules generally associated with a level of a biological system.
  • gene transcripts are one example of a biomolecule component type; they are generally associated with gene expression in a biological system, and with the level of a biological system referred to as genomics or functional genomics.
  • Proteins are another example of a biomolecule component type and are generally associated with protein expression and modification, etc., and with the level of a biological system referred to as proteomics.
  • metabolites are a further example of a biomolecule component type and are generally associated with the level of a biological system referred to as metabolomics.
  • the present invention provides a method and system for profiling a biological system utilizing a hierarchical multivariate analysis of spectrometric data to generate a profile of a state of a biological system.
  • the states of a biological system that may be profiled by the invention include, but are not limited to, disease state, pharmacological agent response, toxicological state, biochemical regulation (e.g., apoptosis), age response, environmental response, and stress response.
  • the present invention may use data on a biomolecule component type (e.g., metabolites, proteins, gene transcripts, etc.) from multiple biological sample types (e.g., body fluids, tissue, cells) obtained from multiple sources (such as, for example, blood, urine, cerebrospinal fluid, epithelial cells, endothelial cells, different subjects, the same subject at different times, etc.).
  • the present invention may use spectrometric data obtained on one or more platforms including, but not limited to, MS, NMR, liquid chromatography (“LC”), gas chromatography (“GC”), high performance liquid chromatography (“HPLC”), capillary electrophoresis (“CE”), and any known form of hyphenated mass spectrometry in low or high resolution mode, such as LC-MS, GC-MS, CE-MS, LC-UV, MS-MS, MSⁿ, etc.
  • spectrometric data includes data from any spectrometric or chromatographic technique and the term “spectrometric measurement” includes measurements made by any spectrometric or chromatographic technique.
  • Spectrometric techniques include, but are not limited to, resonance spectroscopy, mass spectroscopy, and optical spectroscopy.
  • Chromatographic techniques include, but are not limited to, liquid phase chromatography, gas phase chromatography, and electrophoresis.
  • small molecule and “metabolite” are used interchangeably. Small molecules and metabolites include, but are not limited to, lipids, steroids, amino acids, organic acids, bile acids, eicosanoids, peptides, trace elements, and pharmacophore and drug breakdown products.
  • the present invention provides a method of spectrometric data processing utilizing multiple steps of a multivariate analysis to process data in a hierarchal procedure.
  • the method uses a first multivariate analysis on a plurality of data sets to discern one or more sets of differences and/or similarities between them and then uses a second multivariate analysis to determine a correlation (and/or anti-correlation, i.e., negative correlation) between at least one of these sets of differences (or similarities) and one or more of the plurality of data sets.
  • the method may further comprise developing a profile for a state of a biological system based on the correlation.
  • the term “data sets” refers to the spectrometric data associated with one or more spectrometric measurements.
  • a data set may comprise one or more NMR spectra.
  • a data set may comprise one or more UV emission or absorption spectra.
  • for MS measurements, a data set may comprise one or more mass spectra.
  • a data set may comprise one or more mass chromatograms.
  • a data set of a chromatographic-MS technique may comprise one or more total ion current (“TIC”) chromatograms or reconstructed TIC chromatograms.
  • data set includes both raw spectrometric data and data that has been preprocessed (e.g., to remove noise, baseline, detect peaks, to normalize, etc.).
  • data sets may refer to substantially all or a sub-set of the spectrometric data associated with one or more spectrometric measurements.
  • the data associated with the spectrometric measurements of different sample sources (e.g., experimental group samples v. control group samples) may be grouped into different data sets.
  • a first data set may refer to experimental group sample measurements and a second data set may refer to control group sample measurements.
  • data sets may refer to data grouped based on any other classification considered relevant.
  • data associated with the spectrometric measurements of a single sample source may be grouped into different data sets based, for example, on the instrument used to perform the measurement, the time a sample was taken, the appearance of the sample, etc. Accordingly, one data set (e.g., grouping of experimental group samples based on appearance) may comprise a sub-set of another data set (e.g., the experimental group data set).
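  • The grouping of measurements into data sets described above can be illustrated with a minimal sketch. The record fields (`group`, `instrument`, `appearance`), the identifiers, and the helper `group_into_data_sets` are hypothetical names chosen for illustration and do not appear in the source; the sketch only shows that one data set (experimental samples grouped by appearance) can be a sub-set of another (the experimental group data set).

```python
from collections import defaultdict

# Hypothetical measurement records; field names and values are illustrative only.
measurements = [
    {"id": "m1", "group": "experimental", "instrument": "NMR", "appearance": "clear"},
    {"id": "m2", "group": "experimental", "instrument": "LC-MS", "appearance": "cloudy"},
    {"id": "m3", "group": "control", "instrument": "NMR", "appearance": "clear"},
    {"id": "m4", "group": "experimental", "instrument": "NMR", "appearance": "cloudy"},
]


def group_into_data_sets(records, key):
    """Group measurement ids into data sets keyed by one classification field."""
    data_sets = defaultdict(set)
    for record in records:
        data_sets[record[key]].add(record["id"])
    return dict(data_sets)


# Data sets by sample source (experimental v. control).
by_group = group_into_data_sets(measurements, "group")

# Data sets by appearance, restricted to the experimental group only.
experimental = [r for r in measurements if r["group"] == "experimental"]
by_appearance = group_into_data_sets(experimental, "appearance")
```

  • Here `by_appearance["cloudy"]` is a sub-set of `by_group["experimental"]`, mirroring the example in the text.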
  • the present invention provides a method of spectrometric data processing utilizing multivariate analysis to process data at two or more hierarchal levels of correlation.
  • the method uses a multivariate analysis on a plurality of data sets to discern correlations (and/or anti-correlations) between data sets at a first level of correlation, and then uses the multivariate analysis to discern correlations (and/or anti-correlations) between data sets at a second level of correlation.
  • the method may further comprise developing a profile for a state of a biological system based on the correlations discerned at one or more levels of correlation.
  • the present invention provides a method of spectrometric data processing utilizing multiple steps of a multivariate analysis to process data sets in a hierarchal procedure, wherein one or more of the multivariate analysis steps further comprises processing data at two or more hierarchal levels of correlation.
  • the method comprises: (1) using a first multivariate analysis on a plurality of data sets to discern one or more sets of differences and/or similarities between them; (2) using a second multivariate analysis to determine a first level of correlation (and/or anti-correlation) between a first set of differences (or similarities) and one or more of the data sets; and (3) using the second multivariate analysis to determine a second level of correlation (and/or anti-correlation) between the first set of differences (or similarities) and one or more of the data sets.
  • the method of this aspect may also comprise developing a profile for a state of a biological system based on the correlations discerned at one or more levels of correlation.
  • the present invention provides systems adapted to practice the methods of the invention set forth above.
  • the system comprises a spectrometric instrument and a data processing device.
  • the system further comprises a database accessible by the data processing device.
  • the data processing device may comprise an analog and/or digital circuit adapted to implement the functionality of one or more of the methods of the present invention.
  • the data processing device may implement the functionality of the methods of the present invention as software on a general purpose computer.
  • a program may set aside portions of a computer's random access memory to provide control logic that affects the hierarchical multivariate analysis, data preprocessing and the operations with and on the measured interference signals.
  • the program may be written in any one of a number of high-level languages, such as FORTRAN, PASCAL, C, C++, or BASIC. Further, the program may be written in a script, macro, or functionality embedded in commercially available software, such as EXCEL or VISUAL BASIC. Additionally, the software could be implemented in an assembly language directed to a microprocessor resident on a computer.
  • the software could be implemented in Intel 80x86 assembly language if it were configured to run on an IBM PC or PC clone.
  • the software may be embedded on an article of manufacture including, but not limited to, “computer-readable program means” such as a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, or CD-ROM.
  • the present invention provides an article of manufacture where the functionality of a method of the present invention is embedded on a computer-readable medium, such as, but not limited to, a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, CD-ROM, or DVD-ROM.
  • FIG. 1A is a flow diagram of analyzing a plurality of data sets according to various embodiments of the present invention.
  • FIG. 1B is a flow diagram of analyzing a plurality of data sets according to various other embodiments of the present invention.
  • FIGS. 2A and 2B are flow diagrams of the analysis performed according to various embodiments of the present invention on a plurality of data sets of multiple biological sample types obtained from wildtype mice and APO E3 Leiden mice.
  • FIGS. 3A and 3B are examples of partial 400 MHz ¹H-NMR spectra for urine samples of wildtype mouse samples, FIG. 3A, and APO E3 mouse samples, FIG. 3B.
  • FIGS. 4A and 4B are examples of partial 400 MHz ¹H-NMR spectra for urine samples of wildtype mouse samples, FIG. 4A, and APO E3 mouse samples, FIG. 4B.
  • FIGS. 5A and 5B are examples of partial 400 MHz ¹H-NMR spectra for blood plasma samples of wildtype mouse samples, FIG. 5A, and APO E3 mouse samples, FIG. 5B.
  • FIGS. 6A and 6B are examples of partial 400 MHz ¹H-NMR spectra for blood plasma samples of wildtype mouse samples, FIG. 6A, and APO E3 mouse samples, FIG. 6B.
  • FIGS. 7A and 7B are examples of a blood plasma lipid profile obtained by an LC-MS spectrometric technique using ESI on APO E3 mouse blood plasma samples, FIG. 7A, and wildtype mouse samples, FIG. 7B.
  • FIG. 8 is an example of a PCA-DA score plot of the NMR data for the urine samples of data sets 1 and 2 of FIGS. 2A and 2B.
  • FIG. 9 is an example of a PCA-DA score plot of the NMR data for the urine samples of data set 1 (wildtype mouse) of FIGS. 2A and 2B.
  • FIG. 10 is an example of a PCA-DA score plot of the NMR data for the urine samples of data set 2 (APO E3 mouse) of FIGS. 2A and 2B.
  • FIG. 11 is an example of a PCA-DA score plot of the NMR data for the urine samples of both wildtype and APO E3 mice.
  • FIG. 12 is an example of a PCA-DA score plot of the NMR data for the blood plasma samples of data sets 3 and 4 of FIGS. 2A and 2B.
  • FIG. 13 is an example of a PCA-DA score plot of the LC-MS data on the blood plasma samples of data sets 5 and 6 of FIGS. 2A and 2B and human samples.
  • FIG. 14 is an example of a loading plot for axis D2 of FIG. 13.
  • FIG. 15 is an example of the comparison of normalized blood plasma lipid profiles obtained by an LC-MS spectrometric technique for wildtype mouse samples (thin solid line) and APO E3 mouse samples (thick solid line).
  • FIG. 16 is an example of the comparison of normalized blood plasma lipid profiles obtained by an LC-MS spectrometric technique for wildtype mouse samples (thin solid line) and APO E3 mouse samples (thick solid line).
  • FIG. 17 is an example of a canonical correlation score plot for spectrometric data for one biological sample type (blood plasma) from two different spectrometric techniques (NMR and LC-MS).
  • FIG. 18 is an example of a canonical correlation score plot for spectrometric data for one biological sample type (blood plasma) from the same general spectrometric technique but different instrument configurations.
  • FIG. 19 is a schematic representation of one embodiment of a system adapted to practice the methods of the invention.
  • Referring to FIG. 1A, a flow chart of one embodiment of a method according to the present invention is shown.
  • One or more of a plurality of data sets 110 are preferably subjected to a preprocessing step 120 prior to multivariate analysis.
  • Suitable forms of preprocessing include, but are not limited to, data smoothing, noise reduction, baseline correction, normalization and peak detection.
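  • The preprocessing forms listed above can be sketched as a small pipeline. This is only an illustration under simplifying assumptions: a moving average stands in for smoothing and noise reduction, a constant minimum-intensity offset stands in for baseline correction, and a thresholded local-maximum search stands in for peak detection. It is not the entropy-based peak detection or the partial linear fit technique that the source names as preferable, and all function names and parameter values are hypothetical.

```python
def smooth(signal, window=3):
    """Moving-average smoothing; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        segment = signal[max(0, i - half): i + half + 1]
        out.append(sum(segment) / len(segment))
    return out


def correct_baseline(signal):
    """Subtract a constant baseline estimated as the minimum intensity."""
    base = min(signal)
    return [x - base for x in signal]


def normalize(signal):
    """Scale the signal to unit total intensity."""
    total = sum(signal)
    return [x / total for x in signal] if total else list(signal)


def detect_peaks(signal, threshold):
    """Return indices of local maxima whose intensity exceeds the threshold."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def preprocess(signal, window=3, threshold=0.1):
    """Smooth, baseline-correct, and normalize a signal, then detect peaks."""
    cleaned = normalize(correct_baseline(smooth(signal, window)))
    return cleaned, detect_peaks(cleaned, threshold)


# Hypothetical raw intensities with two peaks on a constant offset.
raw = [2.0, 2.1, 2.0, 5.0, 9.0, 5.0, 2.0, 2.1, 2.0, 4.0, 6.0, 4.0, 2.0]
spectrum, peaks = preprocess(raw)
```

  • For this input the two detected peaks fall at indices 4 and 10, the centers of the two raised regions in `raw`.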
  • Preferable forms of data preprocessing include entropy-based peak detection (such as disclosed in pending U.S. patent application Ser. No. 09/920,993, filed Aug. 2, 2001, the entire contents of which are hereby incorporated by reference) and partial linear fit techniques (such as found in J. T. W. E.
  • a multivariate analysis is then performed at a first level of correlation 130 to discern differences (and/or similarities) between the data sets.
  • Suitable forms of multivariate analysis include, for example, principal component analysis (“PCA”), discriminant analysis (“DA”), PCA-DA, canonical correlation (“CC”), partial least squares (“PLS”), predictive linear discriminant analysis (“PLDA”), neural networks, and pattern recognition techniques.
  • PCA-DA is performed at a first level of correlation that produces a score plot (i.e., a plot of the data in terms of two principal components; see, e.g., FIGS. 8-12 which are described further below).
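  • One common reading of PCA-DA is a principal component analysis for dimensionality reduction followed by a discriminant analysis on the resulting scores. The source does not specify the exact procedure, so the sketch below is an assumption: it uses NumPy, a two-class Fisher linear discriminant, and synthetic data, and the function name `pca_da` and the simulated "wildtype" and "transgenic" arrays are hypothetical.

```python
import numpy as np


def pca_da(X, labels, n_components=2):
    """PCA to reduce dimensionality, then two-class Fisher discriminant analysis on the PCA scores."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt[:n_components].T
    classes = np.unique(labels)
    a = scores[labels == classes[0]]
    b = scores[labels == classes[1]]
    # Pooled within-class scatter in the reduced space.
    Sw = (a - a.mean(0)).T @ (a - a.mean(0)) + (b - b.mean(0)).T @ (b - b.mean(0))
    # Fisher direction: maximizes between-class over within-class separation.
    w = np.linalg.solve(Sw, b.mean(0) - a.mean(0))
    w /= np.linalg.norm(w)
    return scores, scores @ w


# Synthetic data: 10 samples per class, 6 variables, one variable shifted in class 1.
rng = np.random.default_rng(0)
wildtype = rng.normal(0.0, 0.1, size=(10, 6))
transgenic = rng.normal(0.0, 0.1, size=(10, 6))
transgenic[:, 2] += 1.0  # hypothetical elevated variable
X = np.vstack([wildtype, transgenic])
labels = np.array([0] * 10 + [1] * 10)

scores, discriminant = pca_da(X, labels)
```

  • Plotting the two columns of `scores` against each other gives a score plot with one point per sample; `discriminant` is the projection of each sample onto the direction that best separates the two classes.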
  • the same or a different multivariate analysis is performed on the data sets at a second level of correlation 140 based on the differences (and/or similarities) discerned from the first level of correlation.
  • the second level of correlation comprises a loading plot produced by a PCA-DA analysis.
  • This second level of correlation bears a hierarchical relationship to the first level in that loading plots provide information on the contributions of individual input vectors to the PCA-DA that in turn are used to produce a score plot.
  • each data set comprises a plurality of mass chromatograms
  • a point on a score plot represents mass chromatograms originating from one sample source.
  • a point on a loading plot represents the contribution of a particular mass (or range of masses) to the correlations between data sets.
  • a point on a score plot represents one NMR spectrum.
  • a point on the corresponding loading plot represents the contribution of a particular NMR chemical shift value (or range of values) to the correlations between data sets.
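  • The relationship between score plots and loading plots can be illustrated with plain PCA computed by singular value decomposition. This is a simplified stand-in for the PCA-DA described in the text, and the spectra, the bin layout, and the function name `pca_scores_and_loadings` are hypothetical. Each row of `scores` corresponds to one spectrum (a point on a score plot), and each row of `loadings` corresponds to one variable, such as a chemical shift bin (a point on a loading plot).

```python
import numpy as np


def pca_scores_and_loadings(X, n_components=2):
    """Return PCA scores (one row per spectrum) and loadings (one row per variable)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings


# Six hypothetical spectra, each binned into five chemical-shift ranges.
# Only the fourth bin (index 3) differs strongly between the first and last three spectra.
spectra = np.array([
    [1.0, 2.0, 0.5, 0.1, 3.0],
    [1.1, 2.0, 0.5, 0.2, 3.0],
    [0.9, 2.1, 0.5, 0.1, 3.1],
    [1.0, 2.0, 0.5, 2.1, 3.0],
    [1.1, 1.9, 0.5, 2.0, 3.0],
    [0.9, 2.0, 0.5, 2.2, 2.9],
])

scores, loadings = pca_scores_and_loadings(spectra)

# The variable with the largest absolute loading on the first component
# contributes most to the separation seen along that axis of the score plot.
top_bin = int(np.argmax(np.abs(loadings[:, 0])))
```

  • In this toy example `top_bin` is 3, the bin that was constructed to differ between the two groups, which is the kind of information a loading plot is used to read off.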
  • a profile may be developed 151 (“NO” to inspect spectra query 160).
  • the region in a score plot where the data points fall for a certain group of data sets may comprise a profile for the state of a biological system associated with that group.
  • the profile may comprise both the above region in a score plot and a specific level of contribution from one or more points in an associated loading plot.
  • a biological system may only fit into the profile of a state if spectrometric data sets from appropriate samples fall in a certain region of the score plot and if the mass chromatograms for a particular range of masses provide a significant contribution to the correlation observed in the score plot.
  • the data sets comprise NMR spectra
  • a biological system may only fit into the profile of a state if spectrometric data sets from appropriate samples fall in a certain region of the score plot and if a particular range of chemical shift values in the NMR spectra provide a significant contribution to the correlation observed in the score plot.
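  • The two-part profile criterion described above (a region of the score plot plus a significant loading contribution) can be sketched as a simple membership test. The rectangular region, the loading threshold, the `Profile` class, and the function `fits_profile` are all hypothetical simplifications; the source does not specify how the region or the significance of a contribution is defined.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical profile: a rectangular score-plot region plus a minimum loading contribution."""
    x_range: tuple          # allowed range on the first score-plot axis
    y_range: tuple          # allowed range on the second score-plot axis
    variable: int           # index of the mass range or chemical-shift bin of interest
    min_abs_loading: float  # smallest loading magnitude treated as significant


def fits_profile(profile, score_point, loadings):
    """Return True only if both the score-plot and the loading criteria are met."""
    x, y = score_point
    in_region = (profile.x_range[0] <= x <= profile.x_range[1]
                 and profile.y_range[0] <= y <= profile.y_range[1])
    contributes = abs(loadings[profile.variable]) >= profile.min_abs_loading
    return in_region and contributes


# Illustrative values only.
profile = Profile(x_range=(0.5, 2.0), y_range=(-1.0, 1.0), variable=3, min_abs_loading=0.5)
loadings = [0.02, -0.05, 0.0, 0.98, -0.03]

inside = fits_profile(profile, (1.1, 0.2), loadings)
outside = fits_profile(profile, (-1.0, 0.2), loadings)
```

  • A sample whose score point lies in the region is accepted only because variable 3 also carries a large loading; lowering that loading below the threshold would reject the same point.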
  • the method may further include a step of inspection 155 of one or more specific spectra of the data sets (“YES” to inspect spectra query 160) based on the correlations discerned in the analysis at the first level of correlation 130 and/or that at the second level of correlation 140.
  • a profile based on this inspection is then developed 152 .
  • the method inspects the mass chromatograms of those mass ranges showing a significant contribution to the correlation based on the loading plot. Inspection of these mass chromatograms, for example, may reveal what species of chemical compounds are associated with the profile. Such information may be of particular importance for biomarker identification and drug target identification.
  • Referring to FIG. 1B, a flow chart of another embodiment of a method according to the present invention is shown.
  • One or more of a plurality of data sets 210 are preferably subjected to a preprocessing step 220 prior to multivariate analysis.
  • a first multivariate analysis is then performed 230 on a plurality of data sets to discern one or more sets of differences and/or similarities between them.
  • the first multivariate analysis may be performed between sub-sets of the data sets. For example, the first multivariate analysis may be performed between data set 1 and data set 2, 231, and the first multivariate analysis may be performed separately between data set 2 and data set 3, 232.
  • the method uses a second multivariate analysis 240 to determine a correlation between at least one of the sets of differences (or similarities) discerned in the first multivariate analysis and one or more of the data sets.
  • This second multivariate analysis 240 bears a hierarchal relationship to the first 230 in that the differences between data sets are discerned in a hierarchal fashion. For example, the differences between data sets 1 and 2 (and data sets 2 and 3) are first discerned 231, 232 and then those differences are subjected to further multivariate analysis 240.
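  • The hierarchal flow of steps 231, 232, and 240 can be sketched with deliberately simple stand-ins. In this illustration the first-level analysis is reduced to the difference between mean spectra of two data sets, and the second-level analysis is reduced to a Pearson correlation; neither is claimed to be the multivariate analysis used in the source, and the data values and function names are hypothetical.

```python
def mean_spectrum(data_set):
    """Average the spectra of a data set, variable by variable."""
    n = len(data_set)
    return [sum(column) / n for column in zip(*data_set)]


def difference(set_a, set_b):
    """First-level stand-in: difference between the mean spectra of two data sets."""
    return [b - a for a, b in zip(mean_spectrum(set_a), mean_spectrum(set_b))]


def pearson(u, v):
    """Second-level stand-in: Pearson correlation between two vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


# Three hypothetical data sets, two spectra each, four variables per spectrum.
data_set_1 = [[1.0, 2.0, 3.0, 1.0], [1.2, 2.0, 3.0, 1.0]]
data_set_2 = [[1.0, 3.0, 3.0, 1.0], [1.2, 3.2, 3.0, 1.0]]
data_set_3 = [[1.0, 4.1, 3.0, 1.0], [1.2, 4.3, 3.0, 1.1]]

# First level (cf. 231, 232): differences between data sets 1 and 2, and 2 and 3.
diff_231 = difference(data_set_1, data_set_2)
diff_232 = difference(data_set_2, data_set_3)

# Second level (cf. 240): correlate the sets of differences with each other
# and correlate one set of differences with the spectra of a data set.
r_between_differences = pearson(diff_231, diff_232)
r_with_data_set_3 = [pearson(diff_231, spectrum) for spectrum in data_set_3]
```

  • In this toy data the same variable changes in both pairwise comparisons, so the two difference vectors are strongly correlated, and the spectra of data set 3 correlate positively with the first set of differences.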
  • a profile based on the correlations discerned in the second multivariate analysis 240 is developed 250 .
  • any of the multivariate analysis steps 231, 232, 240 may further comprise a step of performing the same or a different multivariate analysis at another level of correlation 260 (for example, such as described with respect to FIG. 1A) based on the differences (and/or similarities) discerned from the level of correlation used in a prior multivariate analysis step 231, 232, 240.
  • a profile based on the information from one or more of these levels of correlation may then be developed 250, 251 (“NO” to inspect spectra query 270).
  • the method may further include a step of inspection 255 of one or more specific spectra of the data sets (“YES” to inspect spectra query 270) based on the correlations discerned in the analysis at one or more levels of correlation and/or one or more multivariate analysis steps.
  • a profile based on this inspection then may be developed 252 .
  • the methods of the present invention may be used to develop profiles on any biomolecular component type. Such profiles facilitate the development of comprehensive profiles of different levels of a biological system, such as, for example, genome profiles, transcriptomic profiles, proteome profiles, and metabolome profiles. Further, such methods may be used for data analysis of spectrometric measurements (of, for example, plasma samples from a control and patient group), and may be used to evaluate whether any differences in single components or patterns of components exist between the two groups, in order to obtain a better insight into underlying biological mechanisms, to detect novel biomarkers/surrogate markers, and/or to develop intervention routes.
  • the present invention provides methods for developing profiles of metabolites and small molecules. Such profiles facilitate the development of comprehensive metabolome profiles.
  • the present invention provides methods for developing profiles of proteins, protein-complexes and the like. Such profiles facilitate the development of comprehensive proteome profiles.
  • the present invention provides methods for developing profiles of gene transcripts, mRNA and the like. Such profiles facilitate the development of comprehensive genome profiles.
  • the method is generally based on the following steps: (1) selection of biological samples, for example body fluids (plasma, urine, cerebral spinal fluid, saliva, synovial fluid etc.); (2) sample preparation based on the biochemical components to be investigated and the spectrometric techniques to be employed (e.g., investigation of lipids, proteins, trace elements, gene expression, etc.); (3) measurement of the high concentration components in the biological samples using methods such as mass spectrometry and NMR; (4) measurement of selected molecule subclasses using NMR-profiles and preferred MS-approaches to study compounds such as, for example, lipids, steroids, bile acids, eicosanoids, (neuro)peptides, vitamins, organic acids, neurotransmitters, amino acids, carbohydrates, ionic organics, nucleotides, inorganics, xenobiotics etc.; (5) raw data preprocessing; (6) data analysis using multivariate analysis according to any of the methods of the present invention (e.g., to identify patterns in measurements
  • the methods of the present invention may be used to develop profiles on a biomolecular component type obtained from a wide variety of biological sample types including, but not limited to, blood, blood plasma, blood serum, cerebrospinal fluid, bile acid, saliva, synovial fluid, pleural fluid, pericardial fluid, peritoneal fluid, feces, nasal fluid, ocular fluid, intracellular fluid, intercellular fluid, lymph, urine, tissue, liver cells, epithelial cells, endothelial cells, kidney cells, prostate cells, blood cells, lung cells, brain cells, adipose cells, tumor cells and mammary cells.
  • the present invention provides an article of manufacture where the functionality of a method of the present invention is embedded on a computer-readable medium, such as, but not limited to, a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, CD-ROM, or DVD-ROM.
  • the functionality of the method may be embedded on the computer-readable medium in any number of computer-readable instructions, or languages such as, for example, FORTRAN, PASCAL, C, C++, BASIC and assembly language.
  • the computer-readable instructions can, for example, be written in a script, macro, or functionally embedded in commercially available software (such as, e.g., EXCEL or VISUAL BASIC).
  • the present invention provides systems adapted to practice the methods of the present invention.
  • the system comprises one or more spectrometric instruments 1910 and a data processing device 1920 in electrical communication, wireless communication, or both.
  • the spectrometric instrument may comprise any instrument capable of generating spectrometric measurements useful in practicing the methods of the present invention. Suitable spectrometric instruments include, but are not limited to, mass spectrometers, liquid phase chromatographers, gas phase chromatographers, and electrophoresis instruments, and combinations thereof.
  • the system further comprises an external database 1930 storing data accessible by the data processing device, wherein the data processing device implements the functionality of one or more of the methods of the present invention using at least in part data stored in the external database.
  • the data processing device may comprise an analog and/or digital circuit adapted to implement the functionality of one or more of the methods of the present invention using at least in part information provided by the spectrometric instrument.
  • the data processing device may implement the functionality of the methods of the present invention as software on a general purpose computer.
  • such a program may set aside portions of a computer's random access memory to provide control logic that affects the spectrometric measurement acquisition, multivariate analysis of data sets, and/or profile development for a biological system.
  • the program may be written in any one of a number of high-level languages, such as FORTRAN, PASCAL, C, C++, or BASIC.
  • the program can be written in a script, macro, or functionality embedded in proprietary software or commercially available software, such as EXCEL or VISUAL BASIC.
  • the software could be implemented in an assembly language directed to a microprocessor resident on a computer.
  • the software can be implemented in Intel 80x86 assembly language if it is configured to run on an IBM PC or PC clone.
  • the software may be embedded on an article of manufacture including, but not limited to, a computer-readable program medium such as a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, or CD-ROM.
  • the APO E3 Leiden mouse model is a transgenic animal model described in “The Use of Transgenic Mice in Drug Discovery and Drug Development,” by P. L. B. Bruijnzeel, TNO Pharma, Oct. 24, 2000. Briefly, the APO E3-Leiden allele is identical to the APO E4 (Cys112→Arg) allele, but includes an in-frame repeat of 21 nucleotides in exon 4, resulting in a tandem repeat of codons 120-126 or 121-127. Transgenic mice expressing the APO E3-Leiden mutation are known to have hyperlipidemic phenotypes that under specific conditions lead to the development of atherosclerotic plaques. The model has a high predicted success rate in finding differences at the small molecule (metabolite) and protein levels, while the gene level is very well characterized.
  • APO E3 male mice were sacrificed after collection of urine in metabolic cages.
  • the APO E3 mice were created by insertion of a well-defined human gene cluster (APO E3-APC 1), and a very homogeneous population was generated by at least 20 inbred generations.
  • the following samples were available for analysis: (1) 10 wildtype and 10 APO E3 urine samples (about 0.5 ml/animal or more); (2) 10 wildtype and 10 APO E3 (heparin) plasma samples (about 350 µl/animal); (3) 10 wildtype and 10 APO E3 liver samples. From the plasma samples, 100 µl was used for NMR and the same samples were used for LC-MS; about 250 µl was available for protein work and duplicates. All samples were stored at −20° C. In total, 19 plasma samples were received. One sample, animal #6 (APO E3 Leiden group), was not present. After cleanup (described below), the portions reserved for proteomics research were transferred to −70° C.
  • Plasma sample extraction was accomplished with isopropanol (protein precipitation).
  • LC-MS lipid profile measurements of the plasma samples were obtained on an electrospray ionization (“ESI”) and atmospheric pressure chemical ionization (“APCI”) LC-MS system.
  • the resultant raw data was preprocessed with an entropy-based peak detection technique substantially similar to that disclosed in pending U.S. patent application Ser. No. 09/920,993, filed Aug. 2, 2001.
  • PCA principal component analysis
  • DA discriminant analysis
  • the raw data from the NMR measurements of the plasma samples was subjected to a pattern recognition analysis (“PARC”), which included preprocessing (such as a partial linear fit), peak detection and multivariate statistical analysis.
  • Urine samples were prepared and NMR measurements of the urine samples were obtained.
  • the raw NMR data on the urine samples was also subjected to a PARC analysis, which included preprocessing, peak detection and multivariate statistical analysis.
  • Mouse plasma samples were thawed at room temperature. Aliquots of 100 µl were transferred to clean eppendorf vials and stored at −70° C. The sample volume for sample #12 was low and only 50 µl was transferred. For NMR and LC-MS lipid analysis, 150 µl aliquots were transferred to clean eppendorf vials.
  • Plasma samples were cleaned up and handled substantially according to the following protocol: (1) add 0.6 ml of isopropanol; (2) vortex; (3) centrifuge at 10,000 rpm for 5 min.; (4) transfer 500 µl to clean tube for NMR analysis; (5) transfer 100 µl to clean eppendorf vial; (6) add 400 µl water and mix; and (7) transfer 200 µl to autosampler vial insert.
  • the remaining extract and pellet (precipitated protein) were stored at −20° C.
  • Human heparin plasma was obtained from a blood bank. In a glass tube, 1 ml of human plasma and 4 ml of isopropanol were mixed (vortexed). After centrifugation, 1 ml of extract was transferred to a tube and 4 ml of water was added. The resulting solution was transferred to 4 autosampler vials (1 ml).
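The dilution implied by the human reference plasma preparation can be checked with simple arithmetic. The following is a minimal sketch only: the helper name `dilution_factor` is hypothetical, and the calculation assumes volumes are additive (no contraction on mixing and no volume lost to the protein pellet).

```python
def dilution_factor(sample_volume, diluent_volume):
    """Fold dilution when a sample is mixed with a diluent (same units).

    Assumes additive volumes, which is only an approximation for
    isopropanol/water mixtures.
    """
    return (sample_volume + diluent_volume) / sample_volume

# Human reference plasma: 1 ml plasma + 4 ml isopropanol,
# then 1 ml of extract + 4 ml of water.
precipitation = dilution_factor(1.0, 4.0)
aqueous = dilution_factor(1.0, 4.0)
overall = precipitation * aqueous

# Mouse extract, protocol steps (5)-(6): 100 µl extract + 400 µl water.
mouse_aqueous = dilution_factor(100.0, 400.0)

print(f"human: {precipitation:.0f}x then {aqueous:.0f}x, overall {overall:.0f}x")
print(f"mouse aqueous step: {mouse_aqueous:.0f}x")
```

Under these assumptions the aqueous dilution step is the same (five-fold) for the mouse extracts and the human reference extract, which is consistent with using the human extract to monitor the same LC-MS conditions.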
  • Spectrometric measurements of plasma samples were made with a combination HPLC-time-of-flight MS instrument. Effluent emerging from the chromatograph was ionized by electrospray ionization (“ESI”) and atmospheric pressure chemical ionization (“APCI”). Typical instrument parameters used with the HPLC instrument are given in Table 1 and details of the gradient in Table 2; typical parameters for the ESI source are given in Table 3, and those for the APCI source are given in Table 4. TABLE 1 HPLC Parameters Column: Inertsil ODS3 5 µm, 100 × 3 mm i.d.
  • the injection sequence for samples was as follows.
  • the mouse plasma extracts were injected twice in a random order.
  • the human plasma extract was injected twice at the start of the sequence and after every 5 injections of the mouse plasma extracts to monitor the stability of the LC-MS conditions.
  • the random sequence was applied to prevent the detrimental effects of possible drift on the multivariate statistics.
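The injection scheme described above can be sketched as follows. This is an illustration only: the function name, the sample labels, and the fixed seed are hypothetical, and the sketch assumes that a single human reference injection follows every fifth mouse injection and that the two replicates of each mouse extract are randomized independently (the text does not specify either point).

```python
import random

def build_injection_sequence(mouse_samples, qc_label="human_QC",
                             replicates=2, qc_interval=5, seed=0):
    """Randomize replicate mouse injections and interleave reference injections."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    injections = [s for s in mouse_samples for _ in range(replicates)]
    rng.shuffle(injections)    # random order guards against instrument drift

    sequence = [qc_label, qc_label]  # reference extract injected twice at the start
    for count, sample in enumerate(injections, start=1):
        sequence.append(sample)
        if count % qc_interval == 0:  # reference after every 5 mouse injections
            sequence.append(qc_label)
    return sequence

# 19 mouse plasma samples were received (hypothetical labels).
samples = [f"mouse_{i:02d}" for i in range(1, 20)]
sequence = build_injection_sequence(samples)
print(len(sequence), sequence[:8])
```

Tracking the response of the repeated reference injections across the run gives a simple check on the stability of the LC-MS conditions, independent of the biological variation among the mouse samples.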
  • NMR spectrometric measurements of plasma samples were made with a 400 MHz 1H-NMR. Samples for the NMR were prepared and handled substantially in accord with the following protocol. Isopropanol plasma extracts (500 µl from 2.3.1) were dried under nitrogen, whereafter the residues were dissolved in deuterated methanol (MeOD). Deuterated methanol was selected because it gave the best NMR spectra when chloroform, water, methanol and dimethylsulfoxide (all deuterated) were compared.
  • A flow chart illustrating the analysis of the spectrometric data of this example according to one embodiment of the present invention is shown in FIGS. 2A and 2B.
  • the spectrometric data obtained was grouped into eight data sets 301-308.
  • the data sets were as follows: (1) data set 1 comprised 400 MHz
  • FIGS. 3A and 4A for data set 1
  • FIGS. 3B and 4B for data set 2
  • FIGS. 5B and 6B for data set 3
  • FIGS. 5A and 6A for data set 4
  • FIG. 7B for data set 5
  • FIG. 7A for data set 6 .
  • Various features were noted in the data of FIGS. 3A-7B .
  • peaks associated with hippuric acid 410 were observed in the wildtype mouse urine sample 1H-NMR spectra, while such peaks were substantially absent from the APO E3 mouse urine sample 1H-NMR spectra, indicating a possible biochemical process unique to the APO E3 mouse.
  • peaks associated with an unidentified component 420 were observed in the wildtype mouse urine sample 1H-NMR spectra, which were also substantially absent from corresponding 1H-NMR spectra of the APO E3 mouse urine samples.
  • two series of peaks 510, 520 were observed in the APO E3 mouse blood plasma sample 1H-NMR spectra, which were either substantially absent from the wildtype spectra 510 or substantially reduced 520.
  • the peaks associated with the first series of peaks 510 are substantially absent from the resonance shift region in wildtype spectra 610
  • while the second series of peaks 520 are present but reduced in the wildtype spectra 620.
  • peaks associated with lyso-phosphatidylcholines (“lyso-PC”) 710 were slightly reduced in intensity in the APO E3 mouse spectra relative to those for the wildtype, that peaks associated with phospholipids 720 were substantially equal in intensity between the APO E3 and wildtype spectra, and that peaks associated with triglycerides 730 were substantially increased in intensity in the APO E3 mouse spectra relative to those for the wildtype.
  • the raw data from data sets 1 to 8 was preprocessed 320 and a first multivariate analysis was performed between data sets 1 and 2, 3 and 4, 5 and 6, and 7 and 8, respectively, each at a first level of correlation 330, i.e., PCA-DA score plots.
  • Examples of the results of the first multivariate analysis at a first level of correlation are illustrated in FIGS. 8-11 for data sets 1 and 2; FIG. 12 for data sets 3 and 4; and FIG. 13 for data sets 5 and 6 (which includes data from human samples).
  • Data from the first multivariate analysis was then used to produce an analysis at a second level of correlation 340, i.e., PCA-DA loading plots.
  • An example of one such PCA-DA loading plot is shown in FIG. 14 .
  • In FIG. 8, a PCA-DA score plot of the NMR data for the urine samples of data sets 1 and 2 is shown.
  • the analysis groups NMR data for the APO E3 and wildtype groups into two substantially distinct regions in the score plot, an APO E3 region 810 and a wildtype region 820, indicating that urine samples alone may suffice to develop a profile that reflects the transgenic nature of the APO E3 mice and serve as a body fluid biomarker profile for distinguishing APO E3 mice from other types of mice.
  • In FIG. 9, a score plot of the NMR data for the urine samples of data set 1 is shown.
  • the analysis indicates that there are similarities and differences within the urine samples of data set 1 that correlate with urine color.
  • the analysis illustrates three distinct regions in the score plot correlated to deep brown urine 910, brown urine 920, and yellow urine 930.
  • FIG. 9 illustrates that there are three distinct subgroups of mouse urine profiles in the wildtype mouse cohort.
  • In FIG. 10, a score plot of the NMR data for the urine samples of data set 2 is shown.
  • the analysis indicates that there are similarities and differences within the urine samples of data set 2 that correlate with urine color.
  • the analysis illustrates three regions in the score plot, one correlated to brown urine 1010, and another to pale brown urine 1020 that slightly overlaps with a yellow urine correlated region 1030.
  • FIG. 10 illustrates that there are three subgroups of mouse urine profiles in the APO E3 mouse cohort.
  • In FIG. 11, a PCA-DA score plot of the NMR data for the urine samples of both wildtype and APO E3 mice is shown.
  • the analysis indicates that there are similarities and differences within the urine samples of data sets 1 and 2 even for urine with the same color.
  • the analysis illustrates three regions in the score plot, one correlated to yellow APO E3 mouse urine 1110, one to pale brown APO E3 mouse urine 1120, and another to yellow wildtype mouse urine 1130.
  • FIG. 11 illustrates that there are three distinct subgroups of mouse urine profiles which can be used as profiles to distinguish APO E3 animals from wildtype animals, and to distinguish animals producing yellow urine from those producing pale brown urine.
  • In FIG. 12, a PCA-DA score plot of the NMR data for the blood plasma samples of data sets 3 and 4 is shown. As illustrated, the analysis groups NMR data for the APO E3 and wildtype groups into two substantially distinct regions in the score plot, a wildtype region 1210 and an APO E3 region 1220, indicating that blood samples alone may suffice to develop a profile that distinguishes APO E3 mice from wildtype mice.
  • In FIG. 13, a PCA-DA score plot of the NMR data for the blood plasma samples of data sets 5, 6 and the human samples is shown.
  • the analysis groups the NMR data into regions corresponding to each organism type: a human region 1310, a wildtype region 1320 and an APO E3 region 1330.
  • FIG. 13 indicates that blood plasma samples may suffice to develop a profile that distinguishes organisms and genotypes.
  • information at a second level of correlation is obtained from the analysis illustrated in FIG. 13 to investigate, for example, the contribution of each metabolite measured by the NMR technique to the segregation of the data into three regions.
  • a loading plot is used to determine a second level of correlation.
  • An example of a loading plot for axis D2 of FIG. 13 is shown in FIG. 14.
  • the abscissa corresponds to masses (or mass-to-charge ranges). Points with positive values along the ordinate indicate component masses that are lower in abundance in the APO E3 mouse versus wildtype, and negative values indicate the reverse.
  • the circled ranges make a significant contribution to the correlations of, for example, FIG. 13.
  • the mass chromatograms associated with these regions were investigated 350 and the upper circled ranges 1401, 1403 were found to be associated with lyso-phosphatidylcholines (“lyso-PC”), and the lower ranges 1402, 1404 with triglycerides.
  • An example of the phosphatidylcholine mass chromatograms for both wildtype and APO E3 mouse is shown in FIG. 15, and an example of the lyso-phosphatidylcholine mass chromatograms for both wildtype and APO E3 mouse is shown in FIG. 16.
  • n refers to the number of residues
  • In FIG. 16, a series of peaks corresponding to lyso-phosphatidylcholines, where the designation x:y refers to x number of carbon atoms on the fatty acids and y carbon bonds, is shown for both wildtype (thin solid line) and APO E3 (thick solid line) plasma samples.
  • the chromatograms in FIG. 16 are each normalized such that the maximum intensity of peak 1610 is equal for all the spectra. As illustrated, it was observed that the peaks corresponding to arachidonic acid 1620 and linoleic acid 1630 were substantially reduced in the APO E3 mouse spectra relative to wildtype.
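Normalizing chromatograms to a common reference peak, as described for FIG. 16, might be sketched as below. The intensity values, the index window standing in for the reference peak, and the function name are all invented for illustration; they are not data from the example.

```python
def normalize_to_reference(chromatogram, ref_window):
    """Scale intensities so the maximum inside ref_window equals 1.0.

    ref_window is a (start, stop) pair of indices bracketing the
    reference peak (a stand-in for peak 1610).
    """
    start, stop = ref_window
    ref_max = max(chromatogram[start:stop])
    if ref_max <= 0:
        raise ValueError("reference window contains no positive signal")
    return [value / ref_max for value in chromatogram]

# Invented intensities: the reference peak sits at indices 1..3,
# and a second peak of interest sits at index 5.
wildtype = [0.0, 2.0, 8.0, 2.0, 0.0, 4.0, 1.0]
apo_e3 = [0.0, 1.0, 4.0, 1.0, 0.0, 1.0, 0.5]
ref = (1, 4)

wt_norm = normalize_to_reference(wildtype, ref)
apo_norm = normalize_to_reference(apo_e3, ref)
print(wt_norm[5], apo_norm[5])
```

After scaling, the reference peak has the same height in both traces, so a difference at another peak reflects a change relative to the reference rather than a difference in overall signal level.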
  • a second multivariate analysis was also performed (“YES” to query 360) comprising a canonical correlation.
  • This second multivariate analysis was performed on data sets 3, 4, 5, and 6, 371, to produce a canonical correlation score plot 381.
  • An example of the results of this second multivariate analysis is shown in FIG. 17 .
  • analysis 371 correlates data from two very different spectrometric techniques: data sets 3 and 4 from NMR, and 5 and 6 from LC-MS. Such an analysis, for example, may discern whether different information is being provided by such different techniques.
  • the canonical correlation groups both NMR and LC-MS results for the APO E3 mouse and wildtype mouse into two substantially distinct regions in the plot, a wildtype region 1710 and an APO E3 region 1720, indicating that both NMR and LC-MS techniques result in segregation into distinct regions; however, the LC-MS method yielded a more pronounced separation.
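A canonical correlation between two blocks of measurements on the same samples can be sketched with NumPy as follows. This is a textbook formulation (whitening each block, then a singular value decomposition), not necessarily the procedure used in the example. The synthetic `nmr` and `lcms` matrices, the small `ridge` term, and the assumption that rows of the two blocks correspond to the same animals are all illustrative.

```python
import numpy as np

def canonical_correlation(X, Y, ridge=1e-8):
    """Return canonical correlations and canonical variates of blocks X and Y.

    Rows are samples (assumed paired between X and Y); columns are variables.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    # Singular values are the canonical correlations, largest first.
    return s, Xc @ (Wx @ U), Yc @ (Wy @ Vt.T)

# Synthetic stand-ins: 10 "wildtype" and 10 "APO E3" samples, with a
# group-dependent shift in one variable of each block.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 10)
nmr = rng.normal(size=(20, 3))
nmr[:, 0] += 3.0 * group
lcms = rng.normal(size=(20, 4))
lcms[:, 1] += 5.0 * group

corrs, nmr_scores, lcms_scores = canonical_correlation(nmr, lcms)
print(corrs)
```

Plotting the first canonical variate of each block against the other gives a score plot in which samples that both techniques describe consistently fall close together.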
  • a second multivariate analysis was performed on data sets 5, 6, 7 and 8, 372, to produce a canonical correlation score plot 382.
  • An example of the results of this second multivariate analysis is shown in FIG. 18.
  • analysis 372 correlates data obtained with what is in many respects the same spectrometric technique, LC-MS, but with different instrument configurations: data sets 5 and 6 using ESI, and data sets 7 and 8 using APCI.
  • Such an analysis may discern whether different information is being provided by such different instrument configurations.
  • such a multivariate analysis may be used to discern whether different machines (that use the exact same instrumentation) provide different information. In cases where different machines provide significantly different information (on the same sample, using the same technique, parameters, and instrumentation) user or machine errors may be detected.
  • the canonical correlation groups both ESI LC-MS results (crosses, +) and APCI LC-MS results (asterisks, *) for the APO E3 mouse and wildtype mouse into two substantially distinct regions in the plot, a wildtype region 1810 and an APO E3 region 1820, indicating that both ESI LC-MS and APCI LC-MS techniques result in segregation into distinct regions.

Abstract

The present invention provides methods and systems for developing profiles of a biological system based on the discernment of similarities, differences, and/or correlations between biomolecular components, of a single biomolecular component type, of a plurality of biological samples. Preferably, the method comprises utilizing hierarchical multivariate analysis of spectrometric data at one or more levels of correlation.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • The present application claims the benefit of and priority to copending U.S. provisional application No. 60/312,145, filed Aug. 13, 2001, the entire disclosure of which is herein incorporated by reference.
  • FIELD OF THE INVENTION
  • The invention relates to the field of data processing and evaluation. In particular, the invention relates to an analytical technology platform for separating and measuring multiple components of a biological sample, and statistical data processing methods for identifying components and revealing patterns and relationships between and among the various measured components.
  • BACKGROUND
  • The characterization of complex mixtures has become important in a variety of research and application areas, including pharmaceuticals, biotechnological research, and nutraceutical (functional food) topics. One important area is the study of small molecules in pharmaceutical and biotechnology research, often referred to as metabolomics.
  • For example, an important challenge in the development of new drugs for complex (multi-factorial) diseases is the tracing and validation of biomarkers/surrogate markers. Moreover, it appears that instead of single biomarkers, biomarker-patterns may be necessary to characterize and diagnose homeostasis or disease states for such diseases.
  • In the discipline of metabolomics, the current art in the field of biological sample profiling is based either on measurement by nuclear magnetic resonance (“NMR”) or by mass spectrometry (“MS”) that focuses on a limited number of small molecule compounds. Both of these profiling approaches have limitations. The NMR approaches are limited in that they typically provide reliable profiles only of compounds present at high concentration. On the other hand, focused mass spectrometry based approaches do not require high concentrations but can provide profiles of only limited portions of the metabolome. What is needed is an approach that can address limitations in current profiling techniques and that facilitates the discernment of correlations between components or patterns of components (such as biomarker patterns).
  • SUMMARY OF THE INVENTION
  • The present invention addresses limitations in current profiling techniques by providing a method and system (or collectively “technology platform”) utilizing hierarchical multivariate analysis of spectrometric data on one or more levels. The present invention further provides a technology platform that facilitates the discernment of similarities, differences, and/or correlations not only between single biomolecular components of a sample or biological system, but also between patterns of biomolecular components of a single biomolecular component type.
  • As used herein, the term “biomolecule component type” refers to a class of biomolecules generally associated with a level of a biological system. For example, gene transcripts are one example of a biomolecule component type that is generally associated with gene expression in a biological system, and the level of a biological system referred to as genomics or functional genomics. Proteins are another example of a biomolecule component type and are generally associated with protein expression and modification, etc., and the level of a biological system referred to as proteomics. Further, metabolites are another example of a biomolecule component type and are generally associated with the level of a biological system referred to as metabolomics.
  • The present invention provides a method and system for profiling a biological system utilizing a hierarchical multivariate analysis of spectrometric data to generate a profile of a state of a biological system. The states of a biological system that may be profiled by the invention include, but are not limited to, disease state, pharmacological agent response, toxicological state, biochemical regulation (e.g., apoptosis), age response, environmental response, and stress response. The present invention may use data on a biomolecule component type (e.g., metabolites, proteins, gene transcripts, etc.) from multiple biological sample types (e.g., body fluids, tissue, cells) obtained from multiple sources (such as, for example, blood, urine, cerebrospinal fluid, epithelial cells, endothelial cells, different subjects, the same subject at different times, etc.). In addition, the present invention may use spectrometric data obtained on one or more platforms including, but not limited to, MS, NMR, liquid chromatography (“LC”), gas-chromatography (“GC”), high performance liquid chromatography (“HPLC”), capillary electrophoresis (“CE”), and any known form of hyphenated mass spectrometry in low or high resolution mode, such as LC-MS, GC-MS, CE-MS, LC-UV, MS-MS, MSn, etc.
  • As used herein, the term “spectrometric data” includes data from any spectrometric or chromatographic technique and the term “spectrometric measurement” includes measurements made by any spectrometric or chromatographic technique. Spectrometric techniques include, but are not limited to, resonance spectroscopy, mass spectroscopy, and optical spectroscopy. Chromatographic techniques include, but are not limited to, liquid phase chromatography, gas phase chromatography, and electrophoresis.
  • As used herein, the terms “small molecule” and “metabolite” are used interchangeably. Small molecules and metabolites include, but are not limited to, lipids, steroids, amino acids, organic acids, bile acids, eicosanoids, peptides, trace elements, and pharmacophore and drug breakdown products.
  • In one aspect, the present invention provides a method of spectrometric data processing utilizing multiple steps of a multivariate analysis to process data in a hierarchal procedure. In one embodiment, the method uses a first multivariate analysis on a plurality of data sets to discern one or more sets of differences and/or similarities between them and then uses a second multivariate analysis to determine a correlation (and/or anti-correlation, i.e., negative correlation) between at least one of these sets of differences (or similarities) and one or more of the plurality of data sets. The method may further comprise developing a profile for a state of a biological system based on the correlation.
  • As used herein, the term “data sets” refers to the spectrometric data associated with one or more spectrometric measurements. For example, where the spectrometric technique is NMR, a data set may comprise one or more NMR spectra. Where the spectrometric technique is UV spectroscopy, a data set may comprise one or more UV emission or absorption spectra. Similarly, where the spectrometric technique is MS, a data set may comprise one or more mass spectra. Where the spectrometric technique is a chromatographic-MS technique (e.g., LC-MS, GC-MS, etc.), a data set may comprise one or more mass chromatograms. Alternatively, a data set of a chromatographic-MS technique may comprise one or more total ion current (“TIC”) chromatograms or reconstructed TIC chromatograms. In addition, it should be realized that the term “data set” includes both raw spectrometric data and data that has been preprocessed (e.g., to remove noise, baseline, detect peaks, to normalize, etc.).
  • Moreover, as used herein, the term “data sets” may refer to substantially all or a sub-set of the spectrometric data associated with one or more spectrometric measurements. For example, the data associated with the spectrometric measurements of different sample sources (e.g., experimental group samples v. control group samples) may be grouped into different data sets. As a result, a first data set may refer to experimental group sample measurements and a second data set may refer to control group sample measurements. In addition, data sets may refer to data grouped based on any other classification considered relevant. For example, data associated with the spectrometric measurements of a single sample source (e.g., experimental group) may be grouped into different data sets based, for example, on the instrument used to perform the measurement, the time a sample was taken, the appearance of the sample, etc. Accordingly, one data set (e.g., grouping of experimental group samples based on appearance) may comprise a sub-set of another data set (e.g., the experimental group data set).
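The grouping of measurements into data sets by classification can be illustrated with a short sketch. The record fields (`spectrum_id`, `group`, `appearance`) and the sample values are hypothetical and serve only to show that one data set may be a sub-set of another.

```python
from collections import defaultdict

def group_into_data_sets(measurements, key):
    """Group measurement identifiers into data sets keyed by one classification."""
    data_sets = defaultdict(list)
    for m in measurements:
        data_sets[m[key]].append(m["spectrum_id"])
    return dict(data_sets)

# Hypothetical measurement records.
measurements = [
    {"spectrum_id": "s1", "group": "experimental", "appearance": "brown"},
    {"spectrum_id": "s2", "group": "experimental", "appearance": "yellow"},
    {"spectrum_id": "s3", "group": "control", "appearance": "yellow"},
]

# First grouping: by sample source (experimental v. control).
by_group = group_into_data_sets(measurements, "group")

# Second grouping: experimental samples only, split by appearance.
experimental = [m for m in measurements if m["group"] == "experimental"]
by_appearance = group_into_data_sets(experimental, "appearance")

print(by_group)
print(by_appearance)
```

Here each appearance-based data set is a sub-set of the experimental group data set, mirroring the relationship described in the paragraph above.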
  • In another aspect, the present invention provides a method of spectrometric data processing utilizing multivariate analysis to process data at two or more hierarchal levels of correlation. In one embodiment, the method uses a multivariate analysis on a plurality of data sets to discern correlations (and/or anti-correlations) between data sets at a first level of correlation, and then uses the multivariate analysis to discern correlations (and/or anti-correlations) between data sets at a second level of correlation. The method may further comprise developing a profile for a state of a biological system based on the correlations discerned at one or more levels of correlation.
  • In yet another aspect, the present invention provides a method of spectrometric data processing utilizing multiple steps of a multivariate analysis to process data sets in a hierarchal procedure, wherein one or more of the multivariate analysis steps further comprises processing data at two or more hierarchal levels of correlation. For example, in one embodiment, the method comprises: (1) using a first multivariate analysis on a plurality of data sets to discern one or more sets of differences and/or similarities between them; (2) using a second multivariate analysis to determine a first level of correlation (and/or anti-correlation) between a first set of differences (or similarities) and one or more of the data sets; and (3) using the second multivariate analysis to determine a second level of correlation (and/or anti-correlation) between the first set of differences (or similarities) and one or more of the data sets. The method of this aspect may also comprise developing a profile for a state of a biological system based on the correlations discerned at one or more levels of correlation.
  • In other aspects of the invention, the present invention provides systems adapted to practice the methods of the invention set forth above. In one embodiment, the system comprises a spectrometric instrument and a data processing device. In another embodiment, the system further comprises a database accessible by the data processing device. The data processing device may comprise an analog and/or digital circuit adapted to implement the functionality of one or more of the methods of the present invention.
  • In some embodiments, the data processing device may implement the functionality of the methods of the present invention as software on a general purpose computer. In addition, such a program may set aside portions of a computer's random access memory to provide control logic that affects the hierarchical multivariate analysis, data preprocessing and the operations with and on the measured interference signals. In such an embodiment, the program may be written in any one of a number of high-level languages, such as FORTRAN, PASCAL, C, C++, or BASIC. Further, the program may be written in a script, macro, or functionality embedded in commercially available software, such as EXCEL or VISUAL BASIC. Additionally, the software could be implemented in an assembly language directed to a microprocessor resident on a computer. For example, the software could be implemented in Intel 80×86 assembly language if it were configured to run on an IBM PC or PC clone. The software may be embedded on an article of manufacture including, but not limited to, “computer-readable program means” such as a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, or CD-ROM.
  • In a further aspect, the present invention provides an article of manufacture where the functionality of a method of the present invention is embedded on a computer-readable medium, such as, but not limited to, a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, CD-ROM, or DVD-ROM.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other features and advantages of the invention, as well as the invention itself, will be more fully understood from the description, drawings, and claims that follow. The drawings are not necessarily drawn to scale, and like reference numerals refer to the same parts throughout the different views.
  • FIG. 1A is a flow diagram of analyzing a plurality of data sets according to various embodiments of the present invention.
  • FIG. 1B is a flow diagram of analyzing a plurality of data sets according to various other embodiments of the present invention.
  • FIGS. 2A and 2B are flow diagrams of the analysis performed according to various embodiments of the present invention on a plurality of data sets of multiple biological sample types obtained from wildtype mice and APO E3 Leiden mice.
  • FIGS. 3A and 3B are examples of partial 400 MHz 1H-NMR spectra for urine samples of wildtype mouse samples, FIG. 3A and APO E3 mouse samples, FIG. 3B.
  • FIGS. 4A and 4B are examples of partial 400 MHz 1H-NMR spectra for urine samples of wildtype mouse samples, FIG. 4A and APO E3 mouse samples, FIG. 4B.
  • FIGS. 5A and 5B are examples of partial 400 MHz 1H-NMR spectra for blood plasma samples of wildtype mouse samples, FIG. 5A, and APO E3 mouse samples, FIG. 5B.
  • FIGS. 6A and 6B are examples of partial 400 MHz 1H-NMR spectra for blood plasma samples of wildtype mouse samples, FIG. 6A, and APO E3 mouse samples, FIG. 6B.
  • FIGS. 7A and 7B are examples of a blood plasma lipid profile obtained by a LC-MS spectrometric technique using ESI on APO E3 mouse blood plasma samples, FIG. 7A, and wildtype mouse samples, FIG. 7B.
  • FIG. 8 is an example of a PCA-DA score plot of the NMR data for the urine samples of data sets 1 and 2 of FIGS. 2A and 2B.
  • FIG. 9 is an example of a PCA-DA score plot of the NMR data for the urine samples of data set 1 (wildtype mouse) of FIGS. 2A and 2B.
  • FIG. 10 is an example of a PCA-DA score plot of the NMR data for the urine samples of data set 2 (APO E3 mouse) of FIGS. 2A and 2B.
  • FIG. 11 is an example of a PCA-DA score plot of the NMR data for the urine samples of both wildtype and APO E3 mice.
  • FIG. 12 is an example of a PCA-DA score plot of the NMR data for the blood plasma samples of data sets 3 and 4 of FIGS. 2A and 2B.
  • FIG. 13 is an example of a PCA-DA score plot of the LC-MS data on the blood plasma samples of data sets 5, 6 of FIGS. 2A and 2B and human samples.
  • FIG. 14 is an example of a loading plot for axis D2 of FIG. 13.
  • FIG. 15 is an example of the comparison of normalized blood plasma lipid profiles obtained by an LC-MS spectrometric technique for wildtype mouse samples (thin solid line) and APO E3 mouse samples (thick solid line).
  • FIG. 16 is an example of the comparison of normalized blood plasma lipid profiles obtained by an LC-MS spectrometric technique for wildtype mouse samples (thin solid line) and APO E3 mouse samples (thick solid line).
  • FIG. 17 is an example of a canonical correlation score plot for spectrometric data for one biological sample type (blood plasma) from two different spectrometric techniques (NMR and LC-MS).
  • FIG. 18 is an example of a canonical correlation score plot for spectrometric data for one biological sample type (blood plasma) from the same general spectrometric technique but different instrument configurations.
  • FIG. 19 is a schematic representation of one embodiment of a system adapted to practice the methods of the invention.
  • DETAILED DESCRIPTION
  • Referring to FIG. 1A, a flow chart of one embodiment of a method according to the present invention is shown. One or more of a plurality of data sets 110 are preferably subjected to a preprocessing step 120 prior to multivariate analysis. Suitable forms of preprocessing include, but are not limited to, data smoothing, noise reduction, baseline correction, normalization and peak detection. Preferable forms of data preprocessing include entropy-based peak detection (such as disclosed in pending U.S. patent application Ser. No. 09/920,993, filed Aug. 2, 2001, the entire contents of which are hereby incorporated by reference) and partial linear fit techniques (such as found in J. T. W. E. Vogels et al., “Partial Linear Fit: A New NMR Spectroscopy Processing Tool for Pattern Recognition Applications,” Journal of Chemometrics, vol. 10, pp. 425-38 (1996)). A multivariate analysis is then performed at a first level of correlation 130 to discern differences (and/or similarities) between the data sets. Suitable forms of multivariate analysis include, for example, principal component analysis (“PCA”), discriminant analysis (“DA”), PCA-DA, canonical correlation (“CC”), partial least squares (“PLS”), predictive linear discriminant analysis (“PLDA”), neural networks, and pattern recognition techniques. In one embodiment, PCA-DA is performed at a first level of correlation that produces a score plot (i.e., a plot of the data in terms of two principal components; see, e.g., FIGS. 8-12 which are described further below). Subsequently, the same or a different multivariate analysis is performed on the data sets at a second level of correlation 140 based on the differences (and/or similarities) discerned from the first level of correlation.
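The preprocessing step followed by a first-level PCA-DA might be sketched as below. This is a simplified stand-in, not the method of the cited references: the preprocessing is a crude minimum-subtraction baseline correction with total-intensity normalization (rather than entropy-based peak detection or partial linear fit), the discriminant step is a two-class Fisher discriminant computed on the PCA scores, and the spectra are synthetic.

```python
import numpy as np

def preprocess(spectra):
    """Subtract each spectrum's minimum (crude baseline) and scale to unit total."""
    corrected = spectra - spectra.min(axis=1, keepdims=True)
    return corrected / corrected.sum(axis=1, keepdims=True)

def pca_scores(X, n_components):
    """Project mean-centred data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def fisher_discriminant(scores, labels):
    """Two-class Fisher discriminant direction in principal-component space."""
    a, b = scores[labels == 0], scores[labels == 1]
    within = (np.cov(a, rowvar=False) * (len(a) - 1)
              + np.cov(b, rowvar=False) * (len(b) - 1))
    w = np.linalg.solve(within, b.mean(axis=0) - a.mean(axis=0))
    return w / np.linalg.norm(w)

# Synthetic spectra: 10 samples per group, 50 channels; group 1 carries
# an extra peak in channel 30.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
spectra = rng.uniform(1.0, 1.2, size=(20, 50))
spectra[labels == 1, 30] += 2.0

X = preprocess(spectra)
scores = pca_scores(X, n_components=3)
discriminant = scores @ fisher_discriminant(scores, labels)
print(discriminant.round(3))
```

Each value of `discriminant` corresponds to one sample; plotting two such discriminant axes against each other would give a score plot of the kind described above.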
  • For example, in one embodiment, where the first level comprises a PCA-DA score plot, the second level of correlation comprises a loading plot produced by a PCA-DA analysis. This second level of correlation bears a hierarchical relationship to the first level in that loading plots provide information on the contributions of individual input vectors to the PCA-DA that in turn are used to produce a score plot. For example, where each data set comprises a plurality of mass chromatograms, a point on a score plot represents mass chromatograms originating from one sample source. In comparison, a point on a loading plot represents the contribution of a particular mass (or range of masses) to the correlations between data sets. Similarly, where each data set comprises a plurality of NMR spectra, a point on a score plot represents one NMR spectrum. In comparison, a point on the corresponding loading plot represents the contribution of a particular NMR chemical shift value (or range of values) to the correlations between data sets.
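  • The relationship between score plots and loading plots described above can be illustrated with a minimal sketch. It is not the PCA-DA procedure of the invention: it computes only the first component of a plain PCA (no discriminant step) by power iteration, on a small hypothetical intensity matrix whose rows are samples and whose columns are variables (e.g., mass ranges or chemical shift bins). The function name and all numeric values are illustrative assumptions.

```python
import math

def first_component(data, iterations=200):
    """Return (scores, loadings) for the first principal component.

    data: list of rows, one per sample; each row holds one intensity
    per variable (hypothetical layout, e.g. one column per mass range).
    """
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Sample covariance matrix (p x p) of the mean-centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the dominant eigenvector (the loadings).
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # One score per sample: projection of the centered row onto the loadings.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return scores, v

# Hypothetical data: samples 0-1 and samples 2-3 differ mainly in variable 2.
data = [
    [10.0, 5.0, 1.0],
    [11.0, 5.2, 1.2],
    [10.5, 4.9, 6.0],
    [10.2, 5.1, 6.3],
]
scores, loadings = first_component(data)
print("scores (one per sample):", scores)
print("loadings (one per variable):", loadings)
```

  • In this sketch each score corresponds to one sample (a point on a score plot), while each loading corresponds to one variable (a point on a loading plot); the variable with the largest absolute loading is the one that contributes most to the separation of the two groups.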
  • Referring again to FIG. 1A, based on the correlations discerned in the analysis at the first level of correlation 130 and/or that at the second level of correlation 140 a profile may be developed 151 (“NO” to inspect spectra query 160). For example, the region in a score plot where the data points fall for a certain group of data sets may comprise a profile for the state of a biological system associated with that group. Further, the profile may comprise both the above region in a score plot and a specific level of contribution from one or more points in an associated loading plot. For example, where the data sets comprise mass chromatograms and/or mass spectra, a biological system may only fit into the profile of a state if spectrometric data sets from appropriate samples fall in a certain region of the score plot and if the mass chromatograms for a particular range of masses provide a significant contribution to the correlation observed in the score plot. Similarly, where the data sets comprise NMR spectra, a biological system may only fit into the profile of a state if spectrometric data sets from appropriate samples fall in a certain region of the score plot and if a particular range of chemical shift values in the NMR spectra provide a significant contribution to the correlation observed in the score plot.
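  • The two-part profile test described above (position in the score plot plus contribution in the loading plot) can be sketched as follows. The rectangular region, the threshold, the function name, and the numeric values are all hypothetical assumptions made for illustration; the source does not prescribe a region shape or a numeric criterion for a "significant" contribution.

```python
def fits_profile(score, region, loadings, variable_range, min_contribution):
    """Return True if a sample matches a (hypothetical) profile.

    score: (x, y) position of the sample in the score plot.
    region: (x_min, x_max, y_min, y_max) rectangle assumed for the profile.
    loadings: dict mapping a variable (e.g. a mass) to its loading.
    variable_range: (low, high) range of variables the profile requires.
    min_contribution: assumed threshold on the absolute loading.
    """
    x, y = score
    x_min, x_max, y_min, y_max = region
    in_region = x_min <= x <= x_max and y_min <= y <= y_max
    low, high = variable_range
    strongest = max(
        (abs(weight) for variable, weight in loadings.items()
         if low <= variable <= high),
        default=0.0,
    )
    return in_region and strongest >= min_contribution

# Hypothetical loadings keyed by mass; values are illustrative only.
example_loadings = {500.0: 0.40, 760.0: 0.05, 880.0: -0.50}
example_region = (0.5, 2.0, -1.0, 1.0)

inside = fits_profile((1.2, -0.4), example_region, example_loadings, (850, 900), 0.3)
outside = fits_profile((-1.0, 0.0), example_region, example_loadings, (850, 900), 0.3)
print(inside, outside)
```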
  • In addition, the method may further include a step of inspection 155 of one or more specific spectra of the data sets (“YES” to inspect spectra query 160) based on the correlations discerned in the analysis at the first level of correlation 130 and/or that at the second level of correlation 140. A profile based on this inspection is then developed 152. For example, where the spectra of the data sets comprise mass chromatograms, the method inspects the mass chromatograms of those mass ranges showing a significant contribution to the correlation based on the loading plot. Inspection of these mass chromatograms, for example, may reveal what species of chemical compounds are associated with the profile. Such information may be of particular importance for biomarker identification and drug target identification.
  • Referring to FIG. 1B, a flow chart of another embodiment of a method according to the present invention is shown. One or more of a plurality of data sets 210 are preferably subjected to a preprocessing step 220 prior to multivariate analysis. A first multivariate analysis is then performed 230 on a plurality of data sets to discern one or more sets of differences and/or similarities between them. The first multivariate analysis may be performed between sub-sets of the data sets. For example, the first multivariate analysis may be performed between data set 1 and data set 2, 231 and the first multivariate analysis may be performed separately between data set 2 and data set 3, 232. The method then uses a second multivariate analysis 240 to determine a correlation between at least one of the sets of differences (or similarities) discerned in the first multivariate analysis and one or more of the data sets. This second multivariate analysis 240 bears a hierarchal relationship to the first 230 in that the differences between data sets are discerned in a hierarchal fashion. For example, the differences between data sets 1 and 2 (and data sets 2 and 3) are first discerned 231, 232 and then those differences are subjected to further multivariate analysis 240. In one embodiment, a profile based on the correlations discerned in the second multivariate analysis 240 is developed 250.
  • In addition, any of the multivariate analysis steps 231, 232, 240 may further comprise a step of performing the same or a different multivariate analysis at another level of correlation 260 (for example, such as described with respect to FIG. 1A) based on the differences (and/or similarities) discerned from the level of correlation used in a prior multivariate analysis step 231, 232, 240. A profile based on the information from one or more of these levels of correlation may then be developed 250, 251 (“NO” to inspect spectra query 270). Alternatively, the method may further include a step of inspection 255 of one or more specific spectra of the data sets (“YES” to inspect spectra query 270) based on the correlations discerned in the analysis at one or more levels of correlation and/or one or more multivariate analysis steps. A profile based on this inspection then may be developed 252.
  • The methods of the present invention may be used to develop profiles on any biomolecular component type. Such profiles facilitate the development of comprehensive profiles of different levels of a biological system, such as, for example, genome profiles, transcriptomic profiles, proteome profiles, and metabolome profiles. Further, such methods may be used for data analysis of spectrometric measurements (for example, of plasma samples from a control group and a patient group) to evaluate whether any differences in single components or patterns of components exist between the two groups, in order to obtain better insight into underlying biological mechanisms, to detect novel biomarkers/surrogate markers, and/or to develop intervention routes.
  • In various embodiments, the present invention provides methods for developing profiles of metabolites and small molecules. Such profiles facilitate the development of comprehensive metabolome profiles. In other various embodiments, the present invention provides methods for developing profiles of proteins, protein-complexes and the like. Such profiles facilitate the development of comprehensive proteome profiles. In yet other various embodiments, the present invention provides methods for developing profiles of gene transcripts, mRNA and the like. Such profiles facilitate the development of comprehensive genome profiles.
  • In one version of these embodiments, the method is generally based on the following steps: (1) selection of biological samples, for example body fluids (plasma, urine, cerebrospinal fluid, saliva, synovial fluid etc.); (2) sample preparation based on the biochemical components to be investigated and the spectrometric techniques to be employed (e.g., investigation of lipids, proteins, trace elements, gene expression, etc.); (3) measurement of the high concentration components in the biological samples using mass spectrometry and NMR methods; (4) measurement of selected molecule subclasses using NMR-profiles and preferred MS-approaches to study compounds such as, for example, lipids, steroids, bile acids, eicosanoids, (neuro)peptides, vitamins, organic acids, neurotransmitters, amino acids, carbohydrates, ionic organics, nucleotides, inorganics, xenobiotics etc.; (5) raw data preprocessing; (6) data analysis using multivariate analysis according to any of the methods of the present invention (e.g., to identify patterns in measurements of single subclasses of molecules or in measurements of high concentration components using NMR or mass spectrometry); and (7) use of multivariate analysis to combine data sets from distinct experiments and find patterns of interest in the data. In addition, the method may further comprise a step of (8) acquiring data sets at a number of points in time to facilitate the monitoring of temporal changes in the multivariate patterns of interest.
  • The methods of the present invention may be used to develop profiles on a biomolecular component type obtained from a wide variety of biological sample types including, but not limited to, blood, blood plasma, blood serum, cerebrospinal fluid, bile acid, saliva, synovial fluid, pleural fluid, pericardial fluid, peritoneal fluid, feces, nasal fluid, ocular fluid, intracellular fluid, intercellular fluid, lymph, urine, tissue, liver cells, epithelial cells, endothelial cells, kidney cells, prostate cells, blood cells, lung cells, brain cells, adipose cells, tumor cells and mammary cells.
  • In another aspect, the present invention provides an article of manufacture where the functionality of a method of the present invention is embedded on a computer-readable medium, such as, but not limited to, a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, CD-ROM, or DVD-ROM. The functionality of the method may be embedded on the computer-readable medium in any number of computer-readable instructions, or languages such as, for example, FORTRAN, PASCAL, C, C++, BASIC and assembly language. Further, the computer-readable instructions can, for example, be written in a script, macro, or functionally embedded in commercially available software (such as, e.g., EXCEL or VISUAL BASIC).
  • In other aspects, the present invention provides systems adapted to practice the methods of the present invention. Referring to FIG. 19, in one embodiment, the system comprises one or more spectrometric instruments 1910 and a data processing device 1920 in electrical communication, wireless communication, or both. The spectrometric instrument may comprise any instrument capable of generating spectrometric measurements useful in practicing the methods of the present invention. Suitable spectrometric instruments include, but are not limited to, mass spectrometers, liquid phase chromatographs, gas phase chromatographs, and electrophoresis instruments, and combinations thereof. In another embodiment, the system further comprises an external database 1930 storing data accessible by the data processing device, wherein the data processing device implements the functionality of one or more of the methods of the present invention using at least in part data stored in the external database.
  • The data processing device may comprise an analog and/or digital circuit adapted to implement the functionality of one or more of the methods of the present invention using at least in part information provided by the spectrometric instrument. In some embodiments, the data processing device may implement the functionality of the methods of the present invention as software on a general purpose computer. In addition, such a program may set aside portions of a computer's random access memory to provide control logic that affects the spectrometric measurement acquisition, multivariate analysis of data sets, and/or profile development for a biological system. In such an embodiment, the program may be written in any one of a number of high-level languages, such as FORTRAN, PASCAL, C, C++, or BASIC. Further, the program can be written in a script, macro, or functionality embedded in proprietary software or commercially available software, such as EXCEL or VISUAL BASIC. Additionally, the software could be implemented in an assembly language directed to a microprocessor resident on a computer. For example, the software can be implemented in Intel 80x86 assembly language if it is configured to run on an IBM PC or PC clone. The software may be embedded on an article of manufacture including, but not limited to, a computer-readable program medium such as a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, or CD-ROM.
  • EXAMPLE Small Molecule Study of the APO E3 Mouse Model for Atherosclerosis
  • An example of the practice of various embodiments of the present invention is illustrated below in the context of a small molecule study of the APO E3 Leiden transgenic mouse model.
  • A. The APO E3 Leiden Mouse
  • The APO E3 Leiden mouse model is a transgenic animal model described in “The Use of Transgenic Mice in Drug Discovery and Drug Development,” by P. L. B. Bruijnzeel, TNO Pharma, Oct. 24, 2000. Briefly, the APO E3-Leiden allele is identical to the APO E4 (Cys 112→Arg) allele, but includes an in-frame repeat of 21 nucleotides in exon 4, resulting in a tandem repeat of codons 120-126 or 121-127. Transgenic mice expressing the APO E3-Leiden mutation are known to have hyperlipidemic phenotypes that under specific conditions lead to the development of atherosclerotic plaques. The model has a high predicted success rate in finding differences at the small molecule (metabolite) and protein levels, while the gene level is very well characterized.
  • In the present example, 10 wildtype and 10 APO E3 male mice were sacrificed after collection of urine in metabolic cages. The APO E3 mice were created by insertion of a well-defined human gene cluster (APO E3-APC 1), and a very homogeneous population was generated by at least 20 inbred generations.
  • The following samples were available for analysis: (1) 10 wildtype and 10 APO E3 urine samples (about 0.5 ml/animal or more); (2) 10 wildtype and 10 APO E3 (heparin) plasma samples (about 350 μl/animal); (3) 10 wildtype and 10 APO E3 liver samples. From the plasma samples, 100 μl were used for NMR, and the same samples were used for LC-MS; about 250 μl remained available for protein work and duplicates. All samples were stored at −20° C. In total, 19 plasma samples were received. One sample, animal #6 (APO-E3 Leiden group), was not present. After cleanup (described below), the portions reserved for proteomics research were transferred to −70° C.
  • B. Experimental Details, Plasma and Urine Samples
  • Plasma sample extraction was accomplished with isopropanol (protein precipitation). LC-MS lipid profile measurements of the plasma samples were obtained on an electrospray ionization (“ESI”) and atmospheric pressure chemical ionization (“APCI”) LC-MS system. The resultant raw data was preprocessed with an entropy-based peak detection technique substantially similar to that disclosed in pending U.S. patent application Ser. No. 09/920,993, filed Aug. 2, 2001. The preprocessed data was then subjected to principal component analysis (“PCA”) and/or discriminant analysis (“DA”) according to the methods of the present invention. The raw data from the NMR measurements of the plasma samples was subjected to a pattern recognition analysis (“PARC”), which included preprocessing (such as a partial linear fit), peak detection and multivariate statistical analysis.
  • Urine samples were prepared and NMR measurements of the urine samples were obtained. The raw NMR data on the urine samples was also subjected to a PARC analysis, which included preprocessing, peak detection and multivariate statistical analysis.
  • B.1. Mouse Blood Plasma Preparation and Cleanup
  • The mouse plasma samples were thawed at room temperature. Aliquots of 100 μl were transferred to clean eppendorf vials and stored at −70° C. The sample volume for sample #12 was low and only 50 μl was transferred. For NMR and LC-MS lipid analysis, 150 μl aliquots were transferred to clean eppendorf vials.
  • Plasma samples were cleaned up and handled substantially according to the following protocol: (1) add 0.6 ml of isopropanol; (2) vortex; (3) centrifuge at 10,000 rpm for 5 min.; (4) transfer 500 μl to clean tube for NMR analysis; (5) transfer 100 μl to clean eppendorf vial; (6) add 400 μl water and mix; and (7) transfer 200 μl to autosampler vial insert. The remaining extract and pellet (precipitated protein) were stored at −20° C.
  • B.2. Human Blood Plasma Preparation and Cleanup
  • Human heparin plasma was obtained from a blood bank. In a glass tube, 1 ml of human plasma and 4 ml of isopropanol were mixed (vortexed). After centrifugation, 1 ml of extract was transferred to a tube and 4 ml of water was added. The resulting solution was transferred to 4 autosampler vials (1 ml).
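  • As a worked check on the two cleanup protocols above, the following sketch computes the nominal overall dilution of the LC-MS extracts. It assumes that the mouse protocol starts from the 150 μl aliquot mentioned in B.1 and that volumes are simply additive (mixing effects of isopropanol and water are ignored); the function name is hypothetical.

```python
def dilution_factor(sample_ul, diluent_ul):
    """Nominal dilution factor, assuming additive volumes."""
    return (sample_ul + diluent_ul) / sample_ul

# Mouse plasma: assumed 150 ul plasma + 600 ul isopropanol,
# then 100 ul extract + 400 ul water.
mouse_dilution = dilution_factor(150, 600) * dilution_factor(100, 400)

# Human plasma: 1 ml plasma + 4 ml isopropanol,
# then 1 ml extract + 4 ml water.
human_dilution = dilution_factor(1000, 4000) * dilution_factor(1000, 4000)

print(mouse_dilution, human_dilution)
```

  • Under these assumptions both extracts reach the same nominal 25-fold dilution, which is consistent with the use of the human plasma extract as a reference for monitoring the LC-MS conditions.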
  • B.3. LC-MS of Blood Plasma Samples:
  • Spectrometric measurements of plasma samples were made with a combination HPLC-time-of-flight MS instrument. Effluent emerging from the chromatograph was ionized by electrospray ionization (“ESI”) and atmospheric pressure chemical ionization (“APCI”). Typical instrument parameters used with the HPLC instrument are given in Table 1 and details of the gradient in Table 2; typical parameters for the ESI source are given in Table 3, and those for the APCI source are given in Table 4.
    TABLE 1
    HPLC Parameters
    Column: Inertsil ODS3 5 μm, 100 × 3 mm i.d. (Chrompack); R2 guard column (Chrompack)
    Mobile phase A: 5% acetonitrile, 50 ml MeCN, water ad 1000 ml, 10 ml ammonium acetate solution (1 mol/l), 1 ml formic acid
    Mobile phase B: 30% isopropanol in acetonitrile, 300 ml isopropanol, acetonitrile ad 1000 ml, 10 ml ammonium acetate solution (1 mol/l), 1 ml formic acid
    Mobile phase C: 50% dichloromethane in isopropanol, 500 ml isopropanol, dichloromethane ad 1000 ml, 10 ml ammonium acetate solution (1 mol/l), 1 ml formic acid
    Temperature: ca. 20° C. (conditioned laboratory)
    Injection volume: 75 μl
  • TABLE 2
    HPLC Gradient
    Time (min)   Flow (ml/min)   % A   % B   % C
    0            0.7             70    30    0
    2            0.7             70    30    0
    15           0.7             5     95    0
    35           0.7             5     35    60
    40           0.7             5     35    60
    41           0.7             5     95    0
    45           0.7             70    30    0
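  • The gradient of Table 2 can be expressed as a small lookup, sketched below. Two assumptions are made that the table itself does not state: cells left blank in the table are taken as 0%, and the composition is taken to change linearly between listed time points. The names GRADIENT and composition_at are hypothetical.

```python
# (time in min, % A, % B, % C); blank table cells are assumed to mean 0 %.
GRADIENT = [
    (0, 70, 30, 0),
    (2, 70, 30, 0),
    (15, 5, 95, 0),
    (35, 5, 35, 60),
    (40, 5, 35, 60),
    (41, 5, 95, 0),
    (45, 70, 30, 0),
]

def composition_at(t):
    """Return (%A, %B, %C) at time t, assuming linear ramps between points."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, *c0), (t1, *c1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            frac = (t - t0) / (t1 - t0)
            return tuple(a + frac * (b - a) for a, b in zip(c0, c1))

print(composition_at(8.5))   # midway through the 2-15 min ramp
print(composition_at(25))    # midway through the 15-35 min ramp
```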
  • TABLE 3
    Electrospray (ESI) Parameters
    Mode: positive (+)
    Cap. Heater: 250° C.
    Spray voltage: 4 kV
    Sheath gas: 70 units
    Aux. Gas: 15 units
    Scan: 200 to 1750, 1 s/scan
  • TABLE 4
    Atmospheric Pressure Chemical Ionization (APCI) Parameters
    Mode: positive (+)
    Cap. Heater: 175° C.
    Vaporizer: 450° C.
    Corona: 5 μA
    Sheath gas: 70 units
    Aux. Gas: 0 units
    Scan: 200 to 1750, 1 s/scan
  • The injection sequence for samples was as follows. The mouse plasma extracts were injected twice in a random order. The human plasma extract was injected twice at the start of the sequence and after every 5 injections of the mouse plasma extracts to monitor the stability of the LC-MS conditions. The random sequence was applied to prevent the detrimental effects of possible drift on the multivariate statistics.
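  • The injection scheme described above can be sketched as follows. The sketch assumes that each of the 19 received mouse plasma extracts (animal #6 absent) is injected twice with all injections shuffled together, and that the human extract is injected once after every fifth mouse injection; the source text does not settle either detail. The function name, labels, and fixed seed are hypothetical.

```python
import random

def build_sequence(mouse_ids, qc_label="human plasma", block=5, seed=0):
    """Build a randomized injection sequence with periodic reference injections."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    injections = list(mouse_ids) * 2   # each mouse extract injected twice
    rng.shuffle(injections)            # random order guards against drift
    sequence = [qc_label, qc_label]    # reference injected twice at the start
    for count, sample in enumerate(injections, start=1):
        sequence.append(sample)
        if count % block == 0:         # reference after every 5 mouse injections
            sequence.append(qc_label)
    return sequence

# 19 plasma samples were received; animal #6 was not present.
mouse_ids = [f"mouse {i}" for i in range(1, 21) if i != 6]
sequence = build_sequence(mouse_ids)
print(len(sequence), sequence[:8])
```

  • Because the mouse injections are interleaved at random, any slow drift in instrument response is spread across both groups rather than being confounded with genotype, while the repeated reference injections make such drift visible.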
  • B.4. NMR of Plasma and Urine Samples:
  • NMR spectrometric measurements of plasma samples were made with a 400 MHz 1H-NMR. Samples for the NMR were prepared and handled substantially in accord with the following protocol. Isopropanol plasma extracts (500 μl from 2.3.1) were dried under nitrogen, whereafter the residues were dissolved in deuterated methanol (MeOD). Deuterated methanol was selected because it gave the best NMR spectra when chloroform, water, methanol and dimethylsulfoxide (all deuterated) were compared.
  • NMR spectrometric measurements of urine samples were also made with a 400 MHz 1H-NMR.
  • C. Spectrometric Measurements and Analysis
  • The following spectrometric measurements were made at metabolite/small molecule level:
      • NMR-measurements of urine, multiple measurements (preferably triplicate measurements) on a total of 40 samples;
      • NMR-measurement of plasma, multiple measurements (preferably triplicate measurements) on a total of 40 samples; and
      • LC/MS-measurement of plasma (plasmalipid profile), multiple measurements (preferably triplicate measurements) on a total of 40 samples.
  • A flow chart illustrating the analysis of the spectrometric data of this example according to one embodiment of the present invention is shown in FIGS. 2A and 2B.
  • Referring to FIG. 2A, the spectrometric data obtained was grouped into eight data sets 301-308. The data sets were as follows: (1) data set 1 comprised 400 MHz 1H-NMR spectra of wildtype mouse urine samples 301; (2) data set 2 comprised 400 MHz 1H-NMR spectra of APO E3 mouse urine samples 302; (3) data set 3 comprised 400 MHz 1H-NMR spectra of APO E3 mouse blood plasma samples 303; (4) data set 4 comprised 400 MHz 1H-NMR spectra of wildtype mouse blood plasma samples 304; (5) data set 5 comprised LC-MS spectra (using ESI) of wildtype mouse blood plasma lipid samples 305; (6) data set 6 comprised LC-MS spectra (using ESI) of APO E3 mouse blood plasma lipid samples 306; (7) data set 7 comprised LC-MS spectra (using APCI) of APO E3 mouse blood plasma lipid samples 307; and (8) data set 8 comprised LC-MS spectra (using APCI) of wildtype mouse blood plasma lipid samples 308. Examples of the spectrometric measurements obtained for each of these data sets are as follows: FIGS. 3A and 4A for data set 1; FIGS. 3B and 4B for data set 2; FIGS. 5B and 6B for data set 3; FIGS. 5A and 6A for data set 4; FIG. 7B for data set 5; and FIG. 7A for data set 6. Various features were noted in the data of FIGS. 3A-7B.
  • Referring to FIGS. 3A and 3B, it was noted that peaks associated with hippuric acid 410 were observed in the wildtype mouse urine sample 1H-NMR spectra, while such peaks were substantially absent from the APO E3 mouse urine sample 1H-NMR spectra, indicating a possible biochemical process unique to the APO E3 mouse. Referring to FIGS. 4A and 4B, in addition, peaks associated with an unidentified component 420 were observed in the wildtype mouse urine sample 1H-NMR spectra, which were also substantially absent from corresponding 1H-NMR spectra of the APO E3 mouse urine samples.
  • Referring to FIGS. 5A and 5B, two series of peaks 510, 520 were observed in the APO E3 mouse blood plasma sample 1H-NMR spectra, which were either substantially absent from the wildtype spectra 510 or substantially reduced 520. As shown in FIGS. 6A and 6B, the peaks associated with the first series of peaks 510 are substantially absent from the resonance shift region in wildtype spectra 610, while the second series of peaks 520 are present but reduced in the wildtype spectra 620.
  • Referring to FIGS. 7A and 7B, it was noted that peaks associated with lyso-phosphatidylcholines (“lyso-PC”) 710 were slightly reduced in intensity in the APO E3 mouse spectra relative to those for the wildtype, that peaks associated with phospholipids 720 were substantially equal in intensity between the APO E3 and wildtype spectra, and that peaks associated with triglycerides 730 were substantially increased in intensity in the APO E3 mouse spectra relative to those for the wildtype.
  • The raw data from data sets 1 to 8 was preprocessed 320 and a first multivariate analysis was performed between data sets 1 and 2, 3 and 4, 5 and 6, and 7 and 8, respectively, each at a first level of correlation 330, i.e., PCA-DA score plots. Examples of the results of the first multivariate analysis at a first level of correlation are illustrated in FIGS. 8-11 for data sets 1 and 2; FIG. 12 for data sets 3 and 4; and FIG. 13 for data sets 5 and 6 (which includes data from human samples). Data from the first multivariate analysis was then used to produce an analysis at a second level of correlation 340, i.e., PCA-DA loading plots. An example of one such PCA-DA loading plot is shown in FIG. 14.
  • Referring to FIG. 8, a PCA-DA score plot of the NMR data for the urine samples of data sets 1 and 2 is shown. As illustrated, the analysis groups the NMR data for the APO E3 and wildtype groups into two substantially distinct regions in the score plot, an APO E3 region 810 and a wildtype region 820, indicating that urine samples alone may suffice to develop a profile that reflects the transgenic nature of the APO E3 mice and serve as a body fluid biomarker profile for distinguishing APO E3 mice from other types of mice.
  • Referring to FIG. 9, a score plot of the NMR data for the urine samples of data set 1 is shown. As illustrated, the analysis indicates that there are similarities and differences within the urine samples of data set 1 that correlate with urine color. Specifically, the analysis illustrates three distinct regions in the score plot correlated to deep brown urine 910, brown urine 920, and yellow urine 930. FIG. 9 illustrates that there are three distinct subgroups of mouse urine profiles in the wildtype mouse cohort.
  • Similarly in FIG. 10, a score plot of the NMR data for the urine samples of data set 2 is shown. As illustrated, the analysis indicates that there are similarities and differences within the urine samples of data set 2 that correlate with urine color. Specifically, the analysis illustrates three regions in the score plot, one correlated to brown urine 1010, and another to pale brown urine 1020, that slightly overlaps with a yellow urine correlated region 1030. FIG. 10 illustrates that there are three subgroups of mouse urine profiles in the APO E3 mouse cohort.
  • Referring to FIG. 11, a PCA-DA score plot of the NMR data for the urine samples of both wildtype and APO E3 mice is shown. As illustrated, the analysis indicates that there are similarities and differences within the urine samples of data sets 1 and 2 even for urine with the same color. Specifically, the analysis illustrates three regions in the score plot, one correlated to yellow APO E3 mouse urine 1110, one to pale brown APO E3 mouse urine 1120, and another to yellow wildtype mouse urine 1130. FIG. 11 illustrates that there are three distinct subgroups of mouse urine profiles which can be used as profiles to distinguish APO E3 animals from wildtype animals, and to distinguish animals producing yellow urine from those producing pale brown urine.
  • Referring to FIG. 12, a PCA-DA score plot of the NMR data for the blood plasma samples of data sets 3 and 4 is shown. As illustrated, the analysis groups the NMR data for the APO E3 and wildtype groups into two substantially distinct regions in the score plot, a wildtype region 1210 and an APO E3 region 1220, indicating that blood samples alone may suffice to develop a profile that distinguishes APO E3 mice from wildtype mice.
  • Referring to FIG. 13, a PCA-DA score plot of the NMR data for the blood plasma samples of data sets 5, 6 and the human samples is shown. As illustrated, the analysis groups the NMR data into regions corresponding to each organism type: a human region 1310, a wildtype region 1320 and an APO E3 region 1330. FIG. 13 indicates that blood plasma samples may suffice to develop a profile that distinguishes organisms and genotypes. In one embodiment, information at a second level of correlation is obtained from the analysis illustrated in FIG. 13 to investigate, for example, the contribution of each metabolite measured by the NMR technique to the segregation of the data into three regions. In one version, a loading plot is used to determine a second level of correlation. An example of a loading plot for axis D2 of FIG. 13 is shown in FIG. 14.
  • Referring to FIGS. 14 and 2A, four ranges of numbers are circled 1401-1404. The abscissa corresponds to masses (or mass-to-charge ranges). Points with positive values along the ordinate indicate component masses that are lower in abundance in the APO E3 mouse versus wildtype, and negative values indicate the reverse. As can be seen in FIG. 14, the circled ranges make a significant contribution to the correlations of, for example, FIG. 13. The mass chromatograms associated with these regions were investigated 350, and the upper circled ranges 1401, 1403 were found to be associated with lyso-phosphatidylcholines (“lyso-PC”), and the lower ranges 1402, 1404 with triglycerides. An example of the phosphatidylcholine mass chromatograms for both wildtype and APO E3 mouse is shown in FIG. 15, and an example of the lyso-phosphatidylcholine mass chromatograms for both wildtype and APO E3 mouse is shown in FIG. 16.
  • Referring to FIG. 15, a series of peaks corresponding to phosphatidylcholines, where n refers to the number of residues, is shown for both wildtype (thin solid line) and APO E3 (thick solid line) plasma samples. The chromatograms in FIG. 15 are each normalized such that the maximum intensity of the n=3 peak 1510 is equal for all the spectra. It should be noted that although some n=1 is present, the majority of the signal corresponding to this peak location 1540 is not believed to arise from a phosphatidylcholine. As illustrated, it was observed that the peaks corresponding to n=5 1520, 1530 were substantially reduced in the APO E3 mouse spectra relative to wildtype.
  • Referring to FIG. 16, a series of peaks corresponding to lyso-phosphatidylcholines, where the designation x:y refers to x number of carbon atoms on the fatty acids and y carbon-carbon double bonds, is shown for both wildtype (thin solid line) and APO E3 (thick solid line) plasma samples. The chromatograms in FIG. 16 are each normalized such that the maximum intensity of peak 1610 is equal for all the spectra. As illustrated, it was observed that the peaks corresponding to arachidonic acid 1620, and linoleic acid 1630 were substantially reduced in the APO E3 mouse spectra relative to wildtype.
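  • The normalization used for the chromatograms of FIGS. 15 and 16 (scaling each trace so that a chosen reference peak has the same maximum intensity in every trace) can be sketched as follows. The traces, the index window of the reference peak, and the function name are hypothetical values chosen for illustration and are not data from the example.

```python
def normalize_to_reference(intensities, ref_window):
    """Scale a trace so that the maximum within ref_window equals 1.0.

    intensities: list of intensity values along the chromatogram.
    ref_window: (start, stop) index range assumed to contain the reference peak.
    """
    start, stop = ref_window
    ref_max = max(intensities[start:stop])
    if ref_max <= 0:
        raise ValueError("reference peak has no signal")
    return [value / ref_max for value in intensities]

# Hypothetical traces: reference peak at index 2, peak of interest at index 5.
wildtype_trace = [0.0, 2.0, 8.0, 2.0, 0.0, 4.0, 0.0]
apo_e3_trace = [0.0, 1.0, 4.0, 1.0, 0.0, 1.0, 0.0]

wildtype_norm = normalize_to_reference(wildtype_trace, (1, 4))
apo_e3_norm = normalize_to_reference(apo_e3_trace, (1, 4))
print(wildtype_norm[5], apo_e3_norm[5])
```

  • After scaling, the reference peak has the same height in both traces, so a remaining difference at another peak reflects a change relative to the reference rather than a difference in overall signal level.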
  • Referring again to FIGS. 2A and 2B, a second multivariate analysis was also performed (“YES” to query 360) comprising a canonical correlation. This second multivariate analysis was performed on data sets 3, 4, 5, and 6, 371, to produce a canonical correlation score plot 381. An example of the results of this second multivariate analysis is shown in FIG. 17. It should be noted that analysis 371 correlates data from two very different spectrometric techniques: data sets 3 and 4 from NMR, and 5 and 6 from LC-MS. Such an analysis, for example, may discern whether different information is being provided by such different techniques.
  • As illustrated in FIG. 17, the canonical correlation groups both NMR and LC-MS results for the APO E3 mouse and wildtype mouse into two substantially distinct regions in the plot, a wildtype region 1710 and an APO E3 region 1720, indicating that both NMR and LC-MS techniques result in segregation into distinct regions; however, the LC-MS method yielded a more pronounced separation.
  • A second multivariate analysis was performed on data sets 5, 6, 7 and 8, 372, to produce a canonical correlation score plot 382. An example of the results of this second multivariate analysis is shown in FIG. 18. It should be noted that analysis 372 correlates data obtained with what is in many respects the same spectrometric technique, LC-MS, but with different instrument configurations: data sets 5 and 6 using ESI, and 7 and 8 using APCI. Such an analysis, for example, may discern whether different information is being provided by such different instrument configurations. In addition, such a multivariate analysis may be used to discern whether different machines (that use the exact same instrumentation) provide different information. In cases where different machines provide significantly different information (on the same sample, using the same technique, parameters, and instrumentation), user or machine errors may be detected.
  • As illustrated in FIG. 18, the canonical correlation groups both ESI LC-MS results (crosses, +) and APCI LC-MS results (asterisks, *) for the APO E3 mouse and wildtype mouse into two substantially distinct regions in the plot, a wildtype region 1810 and an APO E3 region 1820, indicating that both ESI LC-MS and APCI LC-MS techniques result in segregation into distinct regions.
  • While the invention has been particularly shown and described with reference to specific embodiments, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is thus indicated by the appended claims and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced.

Claims (23)

1-44. (canceled)
45. A method of profiling a state of a biological system of an animal, the method comprising the steps of:
(a) providing a plurality of data sets derived from a biological sample type, the plurality of data sets comprising measurements of metabolomic components of a biological system;
(b) evaluating a plurality of the data sets with a multivariate analysis and correlating features among the plurality of the data sets to determine one or more sets of differences among at least a portion of the plurality of data sets;
(c) generating a profile of one or more biomarkers in response to one or more correlations, the profile characterizing a state of the biological system; and
(d) displaying at least a portion of the data relevant to the profile.
46. The method of claim 45, wherein the biological sample type is selected from the group consisting of blood, blood plasma, blood serum, cerebrospinal fluid, bile, saliva, synovial fluid, pleural fluid, pericardial fluid, peritoneal fluid, feces, nasal fluid, ocular fluid, intracellular fluid, intercellular fluid, lymph fluid, and urine.
47. The method of claim 45, wherein the biological sample type is selected from the group consisting of liver cells, epithelial cells, endothelial cells, kidney cells, prostate cells, blood cells, lung cells, brain cells, skin cells, adipose cells, tumor cells, and mammary cells.
48. The method of claim 45, wherein the plurality of data sets is derived from one biological sample type.
49. The method of claim 45, wherein the plurality of data sets are derived from biological samples taken at different times for the same organism.
50. The method of claim 45, wherein the plurality of data sets is derived from one biological sample type that is treated differently.
51. The method of claim 45, wherein the measurements of metabolomic components comprise different types of measurements of metabolomic components.
52. The method of claim 51, wherein the different types of measurements comprise different types of spectrometric measurements.
53. The method of claim 52, wherein the spectrometric measurements comprise data from different instrument configurations of the same spectrometric technique.
54. The method of claim 45, wherein the biological system is in a mammal.
55. An article of manufacture having a computer-readable medium with computer-readable instructions embodied thereon for performing the method of claim 45.
56. A method of profiling a state of a biological system of an animal, the method comprising the steps of:
(a) analyzing a sample of a biological system to provide a plurality of data sets, the plurality of data sets comprising measurements of metabolomic components of the biological system;
(b) evaluating a plurality of the data sets with a multivariate analysis and correlating features among the plurality of the data sets to determine one or more sets of differences among at least a portion of the plurality of data sets; and
(c) generating a profile of one or more biomarkers in response to one or more correlations, the profile characterizing a state of the biological system.
57. The method of claim 56, wherein the biological sample type is selected from the group consisting of blood, blood plasma, blood serum, cerebrospinal fluid, bile, saliva, synovial fluid, pleural fluid, pericardial fluid, peritoneal fluid, feces, nasal fluid, ocular fluid, intracellular fluid, intercellular fluid, lymph fluid, and urine.
58. The method of claim 56, wherein the biological sample type is selected from the group consisting of liver cells, epithelial cells, endothelial cells, kidney cells, prostate cells, blood cells, lung cells, brain cells, skin cells, adipose cells, tumor cells, and mammary cells.
59. The method of claim 56, wherein the plurality of data sets is derived from one biological sample type.
60. The method of claim 56, wherein the plurality of data sets are derived from biological samples taken at different times for the same organism.
61. The method of claim 56, wherein the plurality of data sets is derived from one biological sample type that is treated differently.
62. The method of claim 56, wherein the measurements of metabolomic components comprise different types of measurements of metabolomic components.
63. The method of claim 62, wherein the different types of measurements comprise different types of spectrometric measurements.
64. The method of claim 63, wherein the spectrometric measurements comprise data from different instrument configurations of the same spectrometric technique.
65. The method of claim 56, wherein the biological system is in a mammal.
66. An article of manufacture having a computer-readable medium with computer-readable instructions embodied thereon for performing the method of claim 56.
US11/141,253 2001-08-13 2005-05-31 Method and system for profiling biological systems Abandoned US20050283320A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/141,253 US20050283320A1 (en) 2001-08-13 2005-05-31 Method and system for profiling biological systems

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US31214501P 2001-08-13 2001-08-13
US10/218,880 US8068987B2 (en) 2001-08-13 2002-08-13 Method and system for profiling biological systems
US11/141,253 US20050283320A1 (en) 2001-08-13 2005-05-31 Method and system for profiling biological systems

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/218,880 Continuation US8068987B2 (en) 2001-08-13 2002-08-13 Method and system for profiling biological systems

Publications (1)

Publication Number Publication Date
US20050283320A1 (en) 2005-12-22

Family

ID=23210073

Family Applications (3)

Application Number Title Priority Date Filing Date
US10/218,880 Expired - Fee Related US8068987B2 (en) 2001-08-13 2002-08-13 Method and system for profiling biological systems
US11/141,253 Abandoned US20050283320A1 (en) 2001-08-13 2005-05-31 Method and system for profiling biological systems
US11/141,258 Abandoned US20050273275A1 (en) 2001-08-13 2005-05-31 Method and system for profiling biological systems

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/218,880 Expired - Fee Related US8068987B2 (en) 2001-08-13 2002-08-13 Method and system for profiling biological systems

Family Applications After (1)

Application Number Title Priority Date Filing Date
US11/141,258 Abandoned US20050273275A1 (en) 2001-08-13 2005-05-31 Method and system for profiling biological systems

Country Status (6)

Country Link
US (3) US8068987B2 (en)
EP (1) EP1425695A2 (en)
JP (2) JP2005500543A (en)
CA (1) CA2457432A1 (en)
IL (1) IL160324A0 (en)
WO (1) WO2003017177A2 (en)

Also Published As

Publication number Publication date
US20050273275A1 (en) 2005-12-08
EP1425695A2 (en) 2004-06-09
US8068987B2 (en) 2011-11-29
JP2005500543A (en) 2005-01-06
US20030134304A1 (en) 2003-07-17
IL160324A0 (en) 2004-07-25
CA2457432A1 (en) 2003-02-27
WO2003017177A3 (en) 2004-04-08
JP2009133867A (en) 2009-06-18
WO2003017177A2 (en) 2003-02-27

Legal Events

Date Code Title Description
AS Assignment

Owner name: BG MEDICINE, INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AFEYAN, NOUBAR B.;GREEF, JAN VAN DER;REGNIER, FREDERICK E.;AND OTHERS;REEL/FRAME:017077/0501;SIGNING DATES FROM 20050311 TO 20050413

Owner name: NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETEN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:VERHEIJ, ELWIN ROBBERT;REEL/FRAME:017069/0950

Effective date: 20050311

AS Assignment

Owner name: BG MEDICINE, INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETENSCHAPPELIIJK ONDERZOEK TNO;REEL/FRAME:017436/0426

Effective date: 20051205

AS Assignment

Owner name: SILICON VALLEY BANK, CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:BG MEDICINE, INC.;REEL/FRAME:020166/0868

Effective date: 20071109

AS Assignment

Owner name: BG MEDICINE, INC., MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:022012/0083

Effective date: 20081126

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION