WO2002103954A3 - Data mining platform for bioinformatics and other knowledge discovery - Google Patents

Data mining platform for bioinformatics and other knowledge discovery

Info

Publication number
WO2002103954A3
Authority
WO
WIPO (PCT)
Prior art keywords
data
mining platform
bioinformatics
data mining
module processes
Prior art date
Application number
PCT/US2002/019202
Other languages
French (fr)
Other versions
WO2002103954A2 (en)
Inventor
Isabelle Guyon
Edward Reiss
Rene Doursat
David Lewis
Jason Weston
Original Assignee
Biowulf Technologies Llc
Isabelle Guyon
Edward Reiss
Rene Doursat
David Lewis
Jason Weston
Priority date
Filing date
Publication date
Application filed by Biowulf Technologies Llc, Isabelle Guyon, Edward Reiss, Rene Doursat, David Lewis, Jason Weston
Priority to AU2002304006A (AU2002304006A1)
Priority to US10/481,068 (US7444308B2)
Publication of WO2002103954A2
Publication of WO2002103954A3
Priority to US11/928,641 (US7542947B2)
Priority to US13/079,198 (US8126825B2)

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/20 Heterogeneous data integration
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G16B40/30 Unsupervised data analysis

Abstract

The data mining platform comprises a plurality of system modules (500, 550), each formed from a plurality of components. Each module has an input data component (502, 552), a data analysis engine (504, 554) for processing the input data, an output data component (506, 556) for outputting the results of the data analysis, and a web server (510) to access and monitor the other modules within the unit and to provide communication to other units. Each module processes a different type of data. For example, a first module processes microarray (gene expression) data, while a second module processes biomedical literature on the Internet for information supporting relationships between genes and diseases and gene functionality.
PCT/US2002/019202 1998-05-01 2002-06-17 Data mining platform for bioinformatics and other knowledge discovery WO2002103954A2 (en)
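
The module structure described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the class names `Module` and `Unit`, the callables, and the toy data are all hypothetical, and the web server (510) is replaced by a simple in-process registry that only reports module status.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Module:
    """One system module: input component -> analysis engine -> output component."""
    name: str
    read_input: Callable[[], Any]       # input data component (hypothetical)
    analyze: Callable[[Any], Any]       # data analysis engine (hypothetical)
    write_output: Callable[[Any], Any]  # output data component (hypothetical)
    status: str = "idle"

    def run(self) -> Any:
        self.status = "running"
        result = self.write_output(self.analyze(self.read_input()))
        self.status = "done"
        return result


@dataclass
class Unit:
    """Assumed stand-in for the web server role: a registry that monitors modules."""
    modules: Dict[str, Module] = field(default_factory=dict)

    def register(self, module: Module) -> None:
        self.modules[module.name] = module

    def monitor(self) -> Dict[str, str]:
        return {name: m.status for name, m in self.modules.items()}


# Toy inputs: each module handles a different type of data.
expression = {"geneA": [2.0, 4.0], "geneB": [1.0, 1.0]}
abstracts = ["geneA is linked to disease X", "geneB has no known role"]

microarray = Module(
    name="microarray",
    read_input=lambda: expression,
    # Placeholder analysis: mean expression level per gene.
    analyze=lambda d: {g: sum(v) / len(v) for g, v in d.items()},
    write_output=lambda r: r,
)
literature = Module(
    name="literature",
    read_input=lambda: abstracts,
    # Placeholder analysis: keep documents that mention a disease.
    analyze=lambda docs: [d for d in docs if "disease" in d],
    write_output=lambda r: r,
)

unit = Unit()
unit.register(microarray)
unit.register(literature)
mean_expression = microarray.run()
disease_mentions = literature.run()
statuses = unit.monitor()
```

Keeping the three components as separate callables mirrors the abstract's split between input, analysis, and output, so a module for a new data type only needs to supply its own three functions.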

Priority Applications (4)

Application Number Publication Number Priority Date Filing Date Title
AU2002304006A AU2002304006A1 (en) 2001-06-15 2002-06-17 Data mining platform for bioinformatics and other knowledge discovery
US10/481,068 US7444308B2 (en) 2001-06-15 2002-06-17 Data mining platform for bioinformatics and other knowledge discovery
US11/928,641 US7542947B2 (en) 1998-05-01 2007-10-30 Data mining platform for bioinformatics and other knowledge discovery
US13/079,198 US8126825B2 (en) 1998-05-01 2011-04-04 Method for visualizing feature ranking of a subset of features for classifying data using a learning machine

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US29886701P 2001-06-15 2001-06-15
US29884201P 2001-06-15 2001-06-15
US29875701P 2001-06-15 2001-06-15
US60/298,842 2001-06-15
US60/298,867 2001-06-15
US60/298,757 2001-06-15

Related Parent Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
PCT/US2002/016012 Continuation-In-Part WO2002095534A2 (en) 1998-05-01 2002-05-20 Methods for feature selection in a learning machine

Related Child Applications (3)

Application Number Relation Publication Number Priority Date Filing Date Title
US10481068 A-371-Of-International 2002-06-17
US11/928,606 Continuation US7921068B2 (en) 1998-05-01 2007-10-30 Data mining platform for knowledge discovery from heterogeneous data types and/or heterogeneous data sources
US11/928,641 Continuation US7542947B2 (en) 1998-05-01 2007-10-30 Data mining platform for bioinformatics and other knowledge discovery

Publications (2)

Publication Number Publication Date
WO2002103954A2 (en) 2002-12-27
WO2002103954A3 (en) 2003-04-03

Family

ID=27404588

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/US2002/019202 WO2002103954A2 (en) 1998-05-01 2002-06-17 Data mining platform for bioinformatics and other knowledge discovery

Country Status (2)

Country Link
AU (1) AU2002304006A1 (en)
WO (1) WO2002103954A2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7243100B2 (en) * 2003-07-30 2007-07-10 International Business Machines Corporation Methods and apparatus for mining attribute associations
US7953677B2 (en) 2006-12-22 2011-05-31 International Business Machines Corporation Computer-implemented method, computer program and system for analyzing data records by generalizations on redundant attributes
US20100161607A1 (en) * 2008-12-22 2010-06-24 Jasjit Singh System and method for analyzing genome data
US10515715B1 (en) 2019-06-25 2019-12-24 Colgate-Palmolive Company Systems and methods for evaluating compositions
CN112116952B (en) * 2020-08-06 2024-02-09 温州大学 Gene selection method of gray wolf optimization algorithm based on diffusion and chaotic local search
CN112102937B (en) * 2020-11-13 2021-02-12 之江实验室 Patient data visualization method and system for chronic disease assistant decision making

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6470333B1 (en) * 1998-07-24 2002-10-22 Jarg Corporation Knowledge extraction system and method
US6266668B1 (en) * 1998-08-04 2001-07-24 Dryken Technologies, Inc. System and method for dynamic data-mining and on-line communication of customized information
US20020049704A1 (en) * 1998-08-04 2002-04-25 Vanderveldt Ingrid V. Method and system for dynamic data-mining and on-line communication of customized information
US20020052882A1 (en) * 2000-07-07 2002-05-02 Seth Taylor Method and apparatus for visualizing complex data sets
US20020119462A1 (en) * 2000-07-31 2002-08-29 Mendrick Donna L. Molecular toxicology modeling
US20020111742A1 (en) * 2000-09-19 2002-08-15 The Regents Of The University Of California Methods for classifying high-dimensional biological data
US20020120405A1 (en) * 2000-09-27 2002-08-29 Aled Edwards Protein data analysis
US20020083067A1 (en) * 2000-09-28 2002-06-27 Pablo Tamayo Enterprise web mining system and method
US20020133504A1 (en) * 2000-10-27 2002-09-19 Harry Vlahos Integrating heterogeneous data and tools
US20020095260A1 (en) * 2000-11-28 2002-07-18 Surromed, Inc. Methods for efficiently mining broad data sets for biological markers
US20020165845A1 (en) * 2001-05-02 2002-11-07 Gogolak Victor V. Method and system for web-based analysis of drug adverse effects

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
KEMP ET AL.: "Using the functional data model to integrate distributed biological data sources", PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE SYSTEMS, June 1996 (1996-06-01), pages 176 - 185, XP002958893 *
MOORE S.K.: "Harmonizing data, setting standards", GENOMICS INFORMATION SETS, IEEE SPECTRUM, vol. 38, no. 1, January 2001 (2001-01-01), pages 111 - 112, XP002958891 *
PAVLIDIS ET AL.: "Gene functional classification from heterogeneous data", PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL BIOLOGY, April 2001 (2001-04-01), pages 249 - 255, XP000988076 *
SYED ET AL.: "A study of support vectors on model independent example selection", PROCEEDINGS OF THE 5TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, July 1999 (1999-07-01), pages 272 - 276, XP002958894 *
WALKER R.L.: "Parallel clustering system using the methodologies of evolutionary computations", PROCEEDINGS OF THE 2001 CONGRESS ON EVOLUTIONARY COMPUTATION, 2001, pages 831 - 838, XP002958892 *
YANG ET AL.: "Data-driven theory refinement algorithms for bioinformatics", INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, July 1999 (1999-07-01), pages 4064 - 4068, XP010372571 *

Also Published As

Publication number Publication date
WO2002103954A2 (en) 2002-12-27
AU2002304006A1 (en) 2003-01-02

Similar Documents

Publication Publication Date Title
WO2003025762A1 (en) Network information processing system and information processing method
WO2004084002A3 (en) Systems and methods for providing access to data stored in different types of data repositories
WO2002093295A3 (en) In-channel marketing and product testing system
WO2003052664A3 (en) Method and system for targeted incentives
AU2729000A (en) Database system
CA2511344A1 (en) System, method, and computer program for interfacing an expert system to a clinical information system
EP1403795A4 (en) Information communication system
WO2003073232A3 (en) System and method for building and manipulating a centralized measurement value database
WO2002061527A3 (en) Online insurance sales platform
WO2003079145A3 (en) System and method for delivering data in a network
WO2003026217A1 (en) Network information processing system and information processing method
WO2001042882A3 (en) Timeshared electronic catalog system and method
EP0782083A3 (en) Data processing system
WO2007050341A3 (en) Hybrid peer-to-peer data communication and management
WO2003052578A3 (en) Method, device system and computer program for saving and retrieving print data in a network
EP1411485A3 (en) System and method for monitoring a structure
WO2006026443A3 (en) Dynamic physical interface between computer module and computer accessory and methods
EP1327939A3 (en) Ring bus system
WO2002103954A3 (en) Data mining platform for bioinformatics and other knowledge discovery
WO2000065523A8 (en) System and method for modeling genetic, biochemical, biophysical and anatomical information
WO2003079144A3 (en) System for standardizing updates of data on a plurality of electronic devices
TW200515244A (en) Semiconductor intellectual property technology transfer method and system
EP1235384A3 (en) Accounting system and method for storage devices
DE59607009D1 (en) DEVICE FOR SINGLE-CHANNEL TRANSMISSION OF DATA FROM TWO DATA SOURCES
TW200504549A (en) A system and method for managing module development

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC

122 Ep: pct application non-entry in european phase
ENP Entry into the national phase

Ref document number: 2006064415

Country of ref document: US

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 10481068

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 10481068

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP