Publication number: US7512626 B2
Publication type: Grant
Application number: US 11/174,698
Publication date: 31 Mar 2009
Filing date: 5 Jul 2005
Priority date: 5 Jul 2005
Fee status: Lapsed
Also published as: US20070011188
Publication number: 11174698, 174698, US 7512626 B2, US 7512626B2, US-B2-7512626, US7512626 B2, US7512626B2
Inventors: Milind Chitgupakar, Mark S. Ramsey, David A. Selby
Original assignee: International Business Machines Corporation
System and method for selecting a data mining modeling algorithm for data mining applications
US 7512626 B2
Abstract
A computing system and method for selecting a data mining modeling algorithm. The computing system comprises a computer readable medium and computing devices electrically coupled through an interface apparatus. A plurality of different data mining modeling algorithms and test data are stored on the computer readable medium. Each of the computing devices comprises a data subset from a plurality of data subsets. A technique is selected for generating a data mining model applied to each of the data subsets. Each of the different data mining modeling algorithms is run simultaneously to generate an associated data mining model on each of the computing devices. Each of the data mining models is compared to the test data to determine a best data model. A best data mining modeling algorithm from the different data mining modeling algorithms is selected in accordance with the best data mining model.
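The selection flow summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the two toy "algorithms" (`mean_model`, `linear_model`), the RMS error used as the comparison against test data, and the thread pool standing in for separate computing devices are all hypothetical choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


# Hypothetical stand-ins for "data mining modeling algorithms": each takes a
# data subset of (x, y) pairs and returns a model, i.e. a function x -> prediction.
def mean_model(subset):
    avg = mean(y for _, y in subset)
    return lambda x: avg


def linear_model(subset):
    xs = [x for x, _ in subset]
    ys = [y for _, y in subset]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in subset) / var if var else 0.0
    return lambda x: my + slope * (x - mx)


def rms_error(model, test_data):
    """Root mean squared error of a model against (x, known outcome) pairs."""
    return mean((model(x) - y) ** 2 for x, y in test_data) ** 0.5


def select_best_algorithm(algorithms, subsets, test_data):
    """Run each algorithm on its own subset concurrently, score every
    resulting model against the test data, and return the winning name."""
    names = list(algorithms)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # Threads stand in for the separate computing devices (assumption).
        futures = [pool.submit(algorithms[n], s) for n, s in zip(names, subsets)]
        models = {n: f.result() for n, f in zip(names, futures)}
    errors = {n: rms_error(m, test_data) for n, m in models.items()}
    best = min(errors, key=errors.get)
    return best, errors


if __name__ == "__main__":
    algorithms = {"mean": mean_model, "linear": linear_model}
    subsets = [[(0, 1), (1, 3), (2, 5)], [(3, 7), (4, 9), (5, 11)]]
    test_data = [(6, 13), (7, 15)]
    print(select_best_algorithm(algorithms, subsets, test_data))
```

Here the lowest error decides the winner; a different selected technique (for example a lift chart) would swap in a different scoring function without changing the overall flow.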
Claims (36)
1. A data mining method, comprising:
providing a computing system comprising a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein a plurality of different data mining modeling algorithms and test data are stored on said computer readable medium, wherein said test data comprises a known outcome associated with a marketing offer, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, and wherein each of said computing devices may not access each other computing device of said computing devices;
first receiving, by said computing system, a first steady data stream from a plurality of client spreadsheets;
first dividing, by the computing system, said first steady data stream into a first plurality of data subsets;
associating, by the computing system, each data subset of said first plurality of data subsets with a different customer number associated with a different customer;
first placing, by the computing system, a different data subset of said first plurality of data subsets in each said associated memory device, wherein said first receiving, said first dividing, and said first placing are performed simultaneously;
selecting a technique for generating a data mining model applied to each of said first plurality of data subsets, wherein said data mining model is used to predict future customer behavior based on past historical data, wherein said past historical data consists of a customer purchasing history and a customer returned items history;
running simultaneously, each of said different data mining modeling algorithms on a different associated data subset of said first plurality of data subsets using said selected technique to generate first data mining models on said computing devices, wherein a first associated data mining model of said first data mining models is stored on each of said computing devices;
comparing each of said first data mining models on each of said computing devices to said test data to determine a best selection data model of said first data mining models,
determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best selection data mining model, and wherein said best data mining modeling algorithm is a neural network algorithm;
first removing each different data subset of said first plurality of data subsets from each said associated memory device;
after said first removing, second receiving by said computing system, a second steady data stream differing from said first steady data stream;
second dividing, by the computing system, said second steady data stream into a second plurality of data subsets;
second placing, by the computing system, a different data subset of said second plurality of data subsets in each said associated memory device, wherein said second receiving, said second dividing, and said second placing are performed simultaneously;
simultaneously applying said best data mining modeling algorithm to each data subset of said second plurality of data subsets;
generating in response to said simultaneously applying, second data mining models on said computing devices, wherein a second associated data mining model of said second data mining models is stored on each of said computing devices, wherein an output of each of said second data mining models comprises a numerical description representing an expected behavior for customers, and wherein each of said second data mining models is associated with a mortgage company; and
comparing each of said second data mining models on each of said computing devices to each other data mining model of said second data mining models to determine a best data model of said second data mining models, wherein said best data model comprises a most predictive data model having a highest degree of correlation to an offer with respect to a customer of said customers as compared to all other data models of said second data mining models.
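The second half of claim 1 (removing the first subsets, dividing a second stream, applying the selected algorithm to every subset, and comparing the resulting models with each other) might look like the sketch below. Everything named here is an assumption made for illustration: the record layout `(customer_number, feature, outcome)`, the least-squares `fit_line` standing in for the selected algorithm, the plain dictionary standing in for per-device memory, and Pearson correlation as one possible reading of "highest degree of correlation to an offer". The loop is sequential for brevity, whereas the claim recites simultaneous application.

```python
from statistics import mean


def divide_stream(stream):
    """Split (customer_number, feature, outcome) records into one subset
    per customer number."""
    subsets = {}
    for customer_number, feature, outcome in stream:
        subsets.setdefault(customer_number, []).append((feature, outcome))
    return subsets


def fit_line(subset):
    """Hypothetical stand-in for the selected algorithm: least-squares line."""
    xs = [x for x, _ in subset]
    ys = [y for _, y in subset]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in subset) / var if var else 0.0
    return lambda x: my + slope * (x - mx)


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0


def second_phase(best_algorithm, device_memory, second_stream, offer_data):
    """Replace the first subsets with the second ones, build one model per
    subset, and pick the model whose predictions correlate best with the
    observed offer outcomes."""
    device_memory.clear()                                # "first removing"
    device_memory.update(divide_stream(second_stream))   # second dividing and placing
    models = {key: best_algorithm(s) for key, s in device_memory.items()}
    xs = [x for x, _ in offer_data]
    ys = [y for _, y in offer_data]
    scores = {key: pearson([m(x) for x in xs], ys) for key, m in models.items()}
    best_key = max(scores, key=scores.get)
    return best_key, scores
```

Keeping the device memory as an explicit argument makes the removal step visible: nothing from the first stream survives into the second round of model building.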
2. The data mining method of claim 1, wherein said test data comprises existing data related to a marketing offer accepted by a first plurality of candidates, and wherein each of said data mining models comprises an acceptance probability that said marketing offer will be accepted by a second plurality of candidates.
3. The data mining method of claim 2, wherein said best data mining model comprises a higher acceptance probability than said acceptance probabilities for any other of said data mining models.
4. The data mining method of claim 1, wherein said selected technique comprises a lift chart technique, and wherein said method further comprises:
determining by said lift chart technique an effectiveness of each of said generated data mining models.
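As an illustration of how a lift chart can express a model's effectiveness, the following sketch computes the lift for the top-scored fraction of records: the response rate among those records divided by the overall response rate. The function name `lift_at` and its parameters are hypothetical, and the claim does not specify how the lift chart is built, so this shows only one common formulation.

```python
def lift_at(scores, outcomes, fraction):
    """Lift of the top `fraction` of records when ranked by model score.

    scores   -- model scores, higher meaning more likely to respond
    outcomes -- known outcomes, 1 for a response and 0 otherwise
    fraction -- share of the ranked records to inspect, in (0, 1]
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal length")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    base_rate = sum(outcomes) / len(outcomes)
    # With no responders at all, lift is undefined.
    return top_rate / base_rate if base_rate else float("nan")
```

A lift above 1 for a small fraction means the model concentrates responders at the top of its ranking; plotting `lift_at` over a range of fractions gives the chart itself.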
5. The data mining method of claim 1, wherein said selected technique comprises a root mean squared technique, and wherein said method further comprises:
determining by said root mean squared technique, an error for each of said generated data mining models.
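One common reading of a root mean squared error, given here as an assumption since the claim does not state the formula, compares each model prediction with the known outcome in the test data. The symbols are introduced for this illustration only: n is the number of test records, y_i the known outcome for record i, and \hat{y}_i the model's prediction for it.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}
```

Under this reading, the model with the smallest error would be the one that best matches the test data.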
6. The data mining method of claim 1, wherein each of said data mining modeling algorithms are selected from the group consisting of a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, and a neural network algorithm.
7. The data mining method of claim 1, wherein said interface apparatus is a high speed switching apparatus.
8. The data mining method of claim 1, wherein said computing system comprises a system selected from the group consisting of a massively parallel processing system, a symmetric multiprocessing system, and a combination of a massively parallel processing system and a symmetric multiprocessing system.
9. The data mining method of claim 1, wherein said computing system further comprises a relational database software system.
10. A computing system comprising a processor coupled to a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein said computer readable medium comprises a plurality of different data mining modeling algorithms, test data, and instructions that when executed by the processor implement a data mining method, wherein said test data comprises a known outcome associated with a marketing offer, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, and wherein each of said computing devices may not access each other computing device of said computing devices, said method comprising the computer implemented steps of:
first receiving, by said computing system, a first steady data stream from a plurality of client spreadsheets;
first dividing, by the computing system, said first steady data stream into a first plurality of data subsets;
associating, by the computing system, each data subset of said first plurality of data subsets with a different customer number associated with a different customer;
first placing, by the computing system, a different data subset of said first plurality of data subsets in each said associated memory device, wherein said first receiving, said first dividing, and said first placing are performed simultaneously;
selecting a technique for generating a data mining model applied to each of said first plurality of data subsets, wherein said data mining model is used to predict future customer behavior based on past historical data, wherein said past historical data consists of a customer purchasing history and a customer returned items history;
running simultaneously, each of said different data mining modeling algorithms on a different associated data subset of said first plurality of data subsets using said selected technique to generate first data mining models on said computing devices, wherein a first associated data mining model of said first data mining models is stored on each of said computing devices;
comparing each of said first data mining models on each of said computing devices to said test data to determine a best selection data model of said first data mining models,
determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best selection data mining model, and wherein said best data mining modeling algorithm is a neural network algorithm;
first removing each different data subset of said first plurality of data subsets from each said associated memory device;
after said first removing, second receiving by said computing system, a second steady data stream differing from said first steady data stream;
second dividing, by the computing system, said second steady data stream into a second plurality of data subsets;
second placing, by the computing system, a different data subset of said second plurality of data subsets in each said associated memory device, wherein said second receiving, said second dividing, and said second placing are performed simultaneously;
simultaneously applying said best data mining modeling algorithm to each data subset of said second plurality of data subsets;
generating in response to said simultaneously applying, second data mining models on said computing devices, wherein a second associated data mining model of said second data mining models is stored on each of said computing devices, wherein an output of each of said second data mining models comprises a numerical description representing an expected behavior for customers, and wherein each of said second data mining models is associated with a mortgage company; and
comparing each of said second data mining models on each of said computing devices to each other data mining model of said second data mining models to determine a best data model of said second data mining models, wherein said best data model comprises a most predictive data model having a highest degree of correlation to an offer with respect to a customer of said customers as compared to all other data models of said second data mining models.
11. The computing system of claim 10, wherein said test data comprises existing data related to a marketing offer accepted by a first plurality of candidates, and wherein each of said data mining models comprises an acceptance probability that said marketing offer will be accepted by a second plurality of candidates.
12. The computing system of claim 11, wherein said best data mining model comprises a higher acceptance probability than said acceptance probabilities for any other of said data mining models.
13. The computing system of claim 10, wherein said selected technique comprises a lift chart technique, and wherein said method further comprises:
determining by said lift chart technique an effectiveness of each of said generated data mining models.
14. The computing system of claim 10, wherein said selected technique comprises a root mean squared technique, and wherein said method further comprises:
determining by said root mean squared technique, an error for each of said generated data mining models.
15. The computing system of claim 10, wherein each of said data mining modeling algorithms are selected from the group consisting of a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, and a neural network algorithm.
16. The computing system of claim 10, wherein said interface apparatus is a high speed switching apparatus.
17. The computing system of claim 10, wherein said computing devices electrically coupled through said interface apparatus is a computing system selected from the group consisting of a massively parallel processing system, a symmetric multiprocessing system, and a combination of a massively parallel processing system and a symmetric multiprocessing system.
18. The computing system of claim 10, wherein said computing devices electrically coupled through said interface apparatus comprise a relational database software system.
19. A process for integrating computing infrastructure, comprising integrating computer-readable code into a computing system, wherein the code in combination with the computing system comprises a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein a plurality of different data mining modeling algorithms and test data are stored on said computer readable medium, wherein said test data comprises a known outcome associated with a marketing offer, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, wherein each of said computing devices may not access each other computing device of said computing devices, and wherein the code in combination with the computing system is adapted to implement a method for performing the steps of:
first receiving, by said computing system, a first steady data stream from a plurality of client spreadsheets;
first dividing, by the computing system, said first steady data stream into a first plurality of data subsets;
associating, by the computing system, each data subset of said first plurality of data subsets with a different customer number associated with a different customer;
first placing, by the computing system, a different data subset of said first plurality of data subsets in each said associated memory device, wherein said first receiving, said first dividing, and said first placing are performed simultaneously;
selecting a technique for generating a data mining model applied to each of said first plurality of data subsets, wherein said data mining model is used to predict future customer behavior based on past historical data, wherein said past historical data consists of a customer purchasing history and a customer returned items history;
running simultaneously, each of said different data mining modeling algorithms on a different associated data subset of said first plurality of data subsets using said selected technique to generate first data mining models on said computing devices, wherein a first associated data mining model of said first data mining models is stored on each of said computing devices;
comparing each of said first data mining models on each of said computing devices to said test data to determine a best selection data model of said first data mining models,
determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best selection data mining model, and wherein said best data mining modeling algorithm is a neural network algorithm;
first removing each different data subset of said first plurality of data subsets from each said associated memory device;
after said first removing, second receiving by said computing system, a second steady data stream differing from said first steady data stream;
second dividing, by the computing system, said second steady data stream into a second plurality of data subsets;
second placing, by the computing system, a different data subset of said second plurality of data subsets in each said associated memory device, wherein said second receiving, said second dividing, and said second placing are performed simultaneously;
simultaneously applying said best data mining modeling algorithm to each data subset of said second plurality of data subsets;
generating in response to said simultaneously applying, second data mining models on said computing devices, wherein a second associated data mining model of said second data mining models is stored on each of said computing devices, wherein an output of each of said second data mining models comprises a numerical description representing an expected behavior for customers, and wherein each of said second data mining models is associated with a mortgage company; and
comparing each of said second data mining models on each of said computing devices to each other data mining model of said second data mining models to determine a best data model of said second data mining models, wherein said best data model comprises a most predictive data model having a highest degree of correlation to an offer with respect to a customer of said customers as compared to all other data models of said second data mining models.
20. The process of claim 19, wherein said test data comprises existing data related to a marketing offer accepted by a first plurality of candidates, and wherein each of said data mining models comprises an acceptance probability that said marketing offer will be accepted by a second plurality of candidates.
21. The process of claim 20, wherein said best data mining model comprises a higher acceptance probability than said acceptance probabilities for any other of said data mining models.
22. The process of claim 19, wherein said selected technique comprises a lift chart technique, and wherein said method further comprises:
determining by said lift chart technique an effectiveness of each of said generated data mining models.
23. The process of claim 19, wherein said selected technique comprises a root mean squared technique, and wherein said method further comprises:
determining by said root mean squared technique, an error for each of said generated data mining models.
24. The process of claim 19, wherein each of said data mining modeling algorithms are selected from the group consisting of a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, and a neural network algorithm.
25. The process of claim 19, wherein said interface apparatus is a high speed switching apparatus.
26. The process of claim 19, wherein said computing system comprises a system selected from the group consisting of a massively parallel processing system, a symmetric multiprocessing system, and a combination of a massively parallel processing system and a symmetric multiprocessing system.
27. The process of claim 19, wherein said computing system further comprises a relational database software system.
28. A computer program product, comprising a computer usable medium having a computer readable program code embodied therein, said computer readable program code comprising an algorithm adapted to implement a data mining method within a computing system, said computing system comprising a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein a plurality of different data mining modeling algorithms and test data are stored on said computer readable medium, wherein said test data comprises a known outcome associated with a marketing offer, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, and wherein each of said computing devices may not access each other computing device of said computing devices, said method comprising the steps of:
first receiving, by said computing system, a first steady data stream from a plurality of client spreadsheets;
first dividing, by the computing system, said first steady data stream into a first plurality of data subsets;
associating, by the computing system, each data subset of said first plurality of data subsets with a different customer number associated with a different customer;
first placing, by the computing system, a different data subset of said first plurality of data subsets in each said associated memory device, wherein said first receiving, said first dividing, and said first placing are performed simultaneously;
selecting a technique for generating a data mining model applied to each of said first plurality of data subsets, wherein said data mining model is used to predict future customer behavior based on past historical data, wherein said past historical data consists of a customer purchasing history and a customer returned items history;
running simultaneously, each of said different data mining modeling algorithms on a different associated data subset of said first plurality of data subsets using said selected technique to generate first data mining models on said computing devices, wherein a first associated data mining model of said first data mining models is stored on each of said computing devices;
comparing each of said first data mining models on each of said computing devices to said test data to determine a best selection data model of said first data mining models,
determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best selection data mining model, and wherein said best data mining modeling algorithm is a neural network algorithm;
first removing each different data subset of said first plurality of data subsets from each said associated memory device;
after said first removing, second receiving by said computing system, a second steady data stream differing from said first steady data stream;
second dividing, by the computing system, said second steady data stream into a second plurality of data subsets;
second placing, by the computing system, a different data subset of said second plurality of data subsets in each said associated memory device, wherein said second receiving, said second dividing, and said second placing are performed simultaneously;
simultaneously applying said best data mining modeling algorithm to each data subset of said second plurality of data subsets;
generating in response to said simultaneously applying, second data mining models on said computing devices, wherein a second associated data mining model of said second data mining models is stored on each of said computing devices, wherein an output of each of said second data mining models comprises a numerical description representing an expected behavior for customers, and wherein each of said second data mining models is associated with a mortgage company; and
comparing each of said second data mining models on each of said computing devices to each other data mining model of said second data mining models to determine a best data model of said second data mining models, wherein said best data model comprises a most predictive data model having a highest degree of correlation to an offer with respect to a customer of said customers as compared to all other data models of said second data mining models.
29. The computer program product of claim 28, wherein said test data comprises existing data related to a marketing offer accepted by a first plurality of candidates, and wherein each of said data mining models comprises an acceptance probability that said marketing offer will be accepted by a second plurality of candidates.
30. The computer program product of claim 29, wherein said best data mining model comprises a higher acceptance probability than said acceptance probabilities for any other of said data mining models.
31. The computer program product of claim 28, wherein said selected technique comprises a lift chart technique, and wherein said method further comprises:
determining by said lift chart technique an effectiveness of each of said generated data mining models.
32. The computer program product of claim 28, wherein said selected technique comprises a root mean squared technique, and wherein said method further comprises:
determining by said root mean squared technique, an error for each of said generated data mining models.
33. The computer program product of claim 28, wherein each of said data mining modeling algorithms are selected from the group consisting of a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an association algorithm, and a neural network algorithm.
34. The computer program product of claim 28, wherein said interface apparatus is a high speed switching apparatus.
35. The computer program product of claim 28, wherein said computing system comprises a system selected from the group consisting of a massively parallel processing system, a symmetric multiprocessing system, and a combination of a massively parallel processing system and a symmetric multiprocessing system.
36. The computer program product of claim 28, wherein said computing system further comprises a relational database software system.
Description
BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates to a system and associated method for selecting a data mining modeling algorithm.

2. Related Art

Selecting a technique to locate specific data from a large amount of data is typically very time consuming. Therefore there exists a need for a time efficient procedure to select a technique to locate specific data from a large amount of data.

SUMMARY OF THE INVENTION

The present invention provides a data mining method, comprising:

providing a computing system comprising a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein a plurality of different data mining modeling algorithms and test data are stored on said computer readable medium, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, wherein data has been divided by the computing system into a plurality of data subsets, and wherein each of said associated memory devices comprises a data subset from said plurality of data subsets;

selecting a technique for generating a data mining model applied to each of said data subsets;

running simultaneously, each of said different data mining modeling algorithms using said selected technique to generate an associated data mining model on each of said computing devices;

comparing each of said data mining models on each of said computing devices to said test data to determine a best data model of said data mining models; and

determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best data mining model.

The present invention provides a computing system comprising a processor coupled to a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein said computer readable medium comprises a plurality of different data mining modeling algorithms, test data, and instructions that when executed by the processor implement a data mining method, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, wherein data has been divided by the computing system into a plurality of data subsets, and wherein each of said associated memory devices comprises a data subset from said plurality of data subsets, said method comprising the computer implemented steps of:

selecting a technique for generating a data mining model applied to each of said data subsets;

running simultaneously, each of said different data mining modeling algorithms using said selected technique to generate an associated data mining model on each of said computing devices;

comparing each of said data mining models on each of said computing devices to said test data to determine a best data model of said data mining models; and

determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best data mining model.

The present invention provides a process for integrating computing infrastructure, comprising integrating computer-readable code into a computing system, wherein the code in combination with the computing system comprises a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein a plurality of different data mining modeling algorithms and test data are stored on said computer readable medium, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, wherein data has been divided by the computing system into a plurality of data subsets, and wherein each of said associated memory devices comprises a data subset from said plurality of data subsets, and wherein the code in combination with the computing system is adapted to implement a method for performing the steps of:

selecting a technique for generating a data mining model applied to each of said data subsets;

running simultaneously, each of said different data mining modeling algorithms using said selected technique to generate an associated data mining model on each of said computing devices;

comparing each of said data mining models on each of said computing devices to said test data to determine a best data model of said data mining models; and

determining, a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best data mining model.

The present invention provides a computer program product, comprising a computer usable medium having a computer readable program code embodied therein, said computer readable program code comprising an algorithm adapted to implement a data mining method within a computing system, said computing system comprising a computer readable medium and computing devices electrically coupled through an interface apparatus, wherein a plurality of different data mining modeling algorithms and test data are stored on said computer readable medium, wherein each of said computing devices comprises at least one central processing unit (CPU) and an associated memory device, wherein data has been divided by the computing system into a plurality of data subsets, and wherein each of said associated memory devices comprises a data subset from said plurality of data subsets, said method comprising the steps of:

selecting a technique for generating a data mining model applied to each of said data subsets;

running simultaneously, each of said different data mining modeling algorithms using said selected technique to generate an associated data mining model on each of said computing devices;

comparing each of said data mining models on each of said computing devices to said test data to determine a best data model of said data mining models; and

determining a best data mining modeling algorithm from said different data mining modeling algorithms in accordance with said selected technique, wherein said best data mining modeling algorithm is the data mining modeling algorithm that is associated with said best data mining model.

The present invention advantageously provides a system and associated method comprising a time efficient procedure to select a technique to locate specific data from a large amount of data.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a block diagram view of a database system for selecting a best data mining modeling algorithm for generating data mining models, in accordance with embodiments of the present invention.

FIG. 2 illustrates a block diagram comprising an algorithm for implementing the database system 2 of FIG. 1 for selecting a data mining modeling algorithm and producing a propensity to lapse data mining model, in accordance with embodiments of the present invention.

FIG. 3 illustrates a flowchart comprising an algorithm used by the database system 2 of FIG. 1 for selecting a “best” data mining modeling algorithm, generating data mining models using the “best” data mining modeling algorithm, and selecting a “best” data mining model, in accordance with embodiments of the present invention.

FIG. 4 illustrates a computer system used for implementing the database system of FIG. 1 for selecting a “best” data mining modeling algorithm to generate and select data mining models, in accordance with embodiments of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates a block diagram view of a database system 2 for determining a best data mining modeling algorithm for generating data mining models, in accordance with embodiments of the present invention. The database system 2 may alternatively be a computing system. A database system (e.g., database system 2) executes a data mining modeling algorithm(s) on data in accordance with a selected technique to create a plurality of data mining models. Data mining models may be used for, inter alia, predicting a customer(s) (i.e., a candidate) response and acceptance probability to a marketing offer(s) for a product or service from an entity (e.g., a business). The data mining modeling algorithm may comprise any type of data mining modeling algorithm including, inter alia, a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, and a neural network algorithm, etc. A data mining model is generated using existing customer data (e.g., customer behavioral data) such as, inter alia, purchasing history, returned-items history, payment history, promotional offers history, demographic data regarding the customer, etc. A data mining model may be used by an entity (e.g., a company offering products or services such as, inter alia, credit cards, consumer products, mortgages, etc.) to predict future customer behavior (i.e., propensity to respond to a product or service offer) based on an analysis of numerous customer attributes (e.g., purchasing history, returned-items history, payment history, promotional offers history, demographic data regarding the customer, etc.) from the past.
The accuracy of the prediction is tied to the ability of a data mining professional to generate and test numerous data mining models, using various data mining modeling algorithm(s), to determine both a “best” data mining modeling algorithm and “best” data mining model having a highest degree of correlation to a desired product offer or service offer with respect to a customer(s). Certain types of data mining modeling algorithms generate better (i.e., more predictive) data models from certain types of data. Therefore, the database system must select a “best” data mining modeling algorithm based on data type. The data mining modeling algorithm selection process comprises executing different types of data mining modeling algorithms on data subsets comprising a same type of data to generate data models. The generated data models are compared to test (or results) data comprising a known outcome using a selected technique (e.g., a lift chart technique as defined, infra, a root mean squared technique as defined, infra, etc) and a best (i.e., a most predictive) data model is selected. The data mining modeling algorithm that is associated with the “best” data model (i.e., data mining modeling algorithm that generated the “best” data model from the associated data subset) is considered the “best” data mining modeling algorithm. The test data comprises known data. For example, the test data may comprise, inter alia, data related to a specific marketing offer (e.g., product or service) accepted by a group of candidates. The “best” data mining modeling algorithm is now used to generate a plurality of data models from data comprising a specific data type.

The database system (e.g., database system 2 in FIG. 1) comprises existing customer data (e.g., data 6 in FIG. 1) divided or allocated into a first plurality of individual data subsets (e.g., data subsets 6A . . . 6F in FIG. 1) within individual computing devices or nodes (e.g., computing devices 20 . . . 25 in FIG. 1). Each of the first plurality of individual data subsets comprises an allocated portion of the total customer data. Each data subset of the first plurality of data subsets is defined as 1/N multiplied by the total data set, wherein N is the total number of nodes or individual computing devices. For example, a 100 node (i.e., 100 computing device) parallel system would allocate 1/100th of the total data set (e.g., data 6) to each node. The total data may be allocated among the nodes uniformly (as in the previous example), randomly (e.g., using a hash algorithm), or the data may be allocated among the nodes according to a business rule, such as, inter alia, a customer number. Once the total data is allocated and stored across the nodes of the database system, the first plurality of data subsets are available for access to generate data mining models. The first plurality of data subsets may be allocated among the nodes in the database system as the data is entered into the database system. A technique is selected for selecting a “best” data mining modeling algorithm and generating data mining models applied to each of a second plurality of data subsets and determining a “best” data mining model. The technique may comprise any technique including, inter alia, a lift chart technique as defined, infra, a root mean squared technique as defined, infra, etc. A coordinator node (e.g., administrator computing apparatus 29) applies a plurality of different types of data mining modeling algorithms to the first plurality of data subsets in each node simultaneously to generate data models.
The generated data models are compared to test (or results) data (e.g., test data 4 in FIG. 1) comprising a known outcome using the selected technique (e.g., a lift chart technique as defined, infra, a root mean squared technique as defined, infra, etc.) and a “best” (i.e., a most predictive) selection data model is selected. The data mining modeling algorithm that is associated with the “best” selection data model (i.e., data mining modeling algorithm that generated the “best” data model) is considered the “best” data mining modeling algorithm. The “best” data mining modeling algorithm is applied by the coordinator node (e.g., administrator computing apparatus 29) to a second plurality of data subsets (e.g., data subsets 8A-8F in FIG. 1) to simultaneously generate and compare numerous data mining models in accordance with the selected technique. The second plurality of data subsets are allocated across the nodes of the database system in a same manner as the first plurality of data subsets. The data mining modeling algorithm may comprise any type of data mining modeling algorithm including, inter alia, a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, and a neural network algorithm, etc. Each of the above mentioned data mining modeling algorithms is defined, infra. An output from the generated data mining models comprises a numerical description of an “expected behavior(s)” for a customer(s). By comparing results of these “expected behaviors” at a coordinator node (e.g., administrator computing apparatus 29) in accordance with the selected technique in the database system, a “best” data mining model may be selected. The “best” data mining model comprises a highest degree of correlation to a desired product or service offer with respect to a customer(s). The database system 2 comprises computing devices 20, 21, 22, 23, 24, and 25, electrically connected to an interface 15.
The interface 15 may comprise any type of interface known to a person of ordinary skill in the art including, inter alia, a local area network (LAN), etc. Additionally, the database system 2 comprises an administrator computing apparatus 29 electrically connected to the interface 15. Each of computing devices 20, 21, 22, and 23 comprises a single central processing unit (CPU) 5 and a memory unit 15. Each of computing devices 24 and 25 comprises a plurality of CPUs 5 connected to a memory unit 15 through a bus 7. The computing devices 24 and 25 are symmetric multiprocessing (SMP) computing devices. An SMP computing device is a computing device comprising multiple CPUs to complete individual processes simultaneously. The database system 2 may comprise an unlimited number of computing devices similar to: the computing devices 20 . . . 23, the computing devices 24 . . . 25, or a combination of computing devices similar to the computing devices 20 . . . 23 and the computing devices 24 . . . 25. The database system 2 may comprise only computing devices similar to the computing devices 20 . . . 23 (i.e., comprising a single CPU). As a first alternative, the database system 2 may comprise only computing devices similar to the computing devices 24 . . . 25 (i.e., SMP computing devices). As a second alternative, the database system 2 may comprise a combination of computing devices (unlimited number) similar to the computing devices 20 . . . 23 and the computing devices 24 . . . 25 as illustrated in FIG. 1. The database system 2 illustrated in FIG. 1 comprises a massively parallel processing (MPP) computer system comprising single CPU 5 computing devices (i.e., computing devices 20 . . . 23) and SMP computing devices (i.e., computing devices 24 . . . 25). An MPP computer system is a computer system that comprises separate CPUs running in parallel (i.e., simultaneously) to execute a single program.
The administrator computing apparatus 29 comprises a computer 14, an input device 17, an output device 12, a database managing software application 9, test data 4, and data mining modeling algorithms 33. The database managing software application 9 may comprise any type of database manager software including, inter alia, DB2 database management system by IBM, etc. The computer 14 may comprise any type of computer known to a person of ordinary skill in the art including, inter alia, a personal computer, a server computer, etc. The input device 17 may comprise any type of input device known to a person of ordinary skill in the art including, inter alia, a keyboard, a computer disc drive, a keypad, a network connection, etc. The output device 12 may comprise any type of output device known to a person of ordinary skill in the art including, inter alia, a monitor, a printer, etc. The administrator computing apparatus 29 may access and send instructions, programs and/or copies of the database managing software application 9 to each of the memory devices 15 within each of the computing devices 20 . . . 23 and 24 . . . 25. Each of the computing devices 20 . . . 23 and 24 . . . 25 may only access their own memory device 15 and may not access each other's memory devices 15. Streams of data 6 and 8 are inputted into the administrator computing apparatus 29 through the input device 17. The administrator computing apparatus 29 divides the streams of data 6 and 8 into a plurality of data subsets 6A . . . 6F and a plurality of data subsets 8A . . . 8F. The streams of data 6 and 8 may comprise steady streams of data. Alternatively, the streams of data 6 and 8 may comprise streams of data inputted through the input device 17 in intervals. The administrator computing apparatus 29, randomly or by use of a business rule, sends each of the data subsets 6A . . . 6F and each of the data subsets 8A . . . 8F to a different one of computing devices 20, 21, 22, 23, 24, or 25. 
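The allocation of data subsets to computing devices described above (uniform 1/N allocation, random allocation using a hash algorithm, or allocation by a business rule such as a customer number) may be sketched as follows. This is a minimal illustration only and is not taken from the patent: the function names, the `customer_number` field, and the choice of SHA-256 as the hash are assumptions made for the example.

```python
import hashlib


def partition_uniform(records, n_nodes):
    """Uniform allocation: node i receives records i, i+N, i+2N, ...

    Each node ends up with roughly 1/N of the total data set.
    """
    return [records[i::n_nodes] for i in range(n_nodes)]


def partition_by_hash(records, n_nodes, key):
    """Pseudo-random allocation: hash a key field to choose a node.

    SHA-256 is an assumed choice; the source only says "a hash algorithm".
    """
    subsets = [[] for _ in range(n_nodes)]
    for rec in records:
        digest = hashlib.sha256(str(rec[key]).encode("utf-8")).digest()
        node = int.from_bytes(digest[:8], "big") % n_nodes
        subsets[node].append(rec)
    return subsets


def partition_by_rule(records, n_nodes, rule):
    """Business-rule allocation: `rule` maps a record to an integer,
    which is reduced modulo the number of nodes."""
    subsets = [[] for _ in range(n_nodes)]
    for rec in records:
        subsets[rule(rec) % n_nodes].append(rec)
    return subsets


# Hypothetical customer data: 600 records spread over 6 computing devices.
records = [{"customer_number": i} for i in range(600)]
uniform = partition_uniform(records, 6)
hashed = partition_by_hash(records, 6, key="customer_number")
by_rule = partition_by_rule(records, 6, rule=lambda r: r["customer_number"])
```

With uniform allocation every node holds exactly 100 of the 600 records; with hash-based allocation the subset sizes are only approximately equal, which is the usual trade-off for an allocation that does not depend on the arrival order of the data.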
A technique is selected for selecting a “best” data mining modeling algorithm for generating data mining models from each of the data subsets 6A . . . 6F and determining a “best” data mining model. The technique may comprise any technique including, inter alia, a lift chart technique as defined, infra, a root mean squared technique as defined, infra, etc. The stream of selection data 6 is inputted into the administrator computing apparatus 29 through the input device 17. The administrator computing apparatus 29 divides the stream of selection data 6 into a plurality of data subsets 6A . . . 6F. The administrator computing apparatus 29 applies a plurality of different types of data mining modeling algorithms to each of data subsets 6A . . . 6F within each of computing devices 20 . . . 23 and 24 . . . 25 to simultaneously generate selection data models. The administrator computing apparatus 29 compares the generated selection data models to test (or results) data 4 comprising a known outcome using the selected technique (e.g., a lift chart technique as defined, infra, a root mean squared technique as defined, infra, etc.) and a best (i.e., a most predictive) selection data model is selected. The data mining modeling algorithm that is associated with the “best” selection data model (i.e., data mining modeling algorithm that generated the “best” data model) is considered the “best” data mining modeling algorithm. The data subsets 6A . . . 6F may now be removed from the database system 2. The “best” data mining modeling algorithm 33 is applied by the administrator computing apparatus 29 to each of data subsets 8A . . . 8F within each of computing devices 20 . . . 23 and 24 . . . 25 to simultaneously generate and compare numerous data mining models in accordance with the selected technique and select a best data mining model.
The “best” data mining modeling algorithm 33 may comprise any type of data mining modeling algorithm including, inter alia, a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, and a neural network algorithm. A decision tree algorithm comprises a method for dividing the data subsets into a tree with an objective of predicting an outcome by using a “divide and conquer” approach. A clustering algorithm comprises placing data subsets into groups otherwise known as clusters whereby all the customers are “similar”. A radial basis function algorithm comprises a method referred to as supervised learning (alternative examples in a same class as supervised learning may be time-series analysis, multivariate analysis, etc.). A linear regression algorithm comprises a method of fitting a line to a set of observations such as to minimize the scatter of the original pattern. An associations algorithm comprises a method used for discovering regularities in a data subset and generally predicts different things. A neural network algorithm comprises a computing method based on a parallel architecture. Neural networks comprise simple processing elements, a high degree of interconnection, simple scalar messages, and adaptive interaction between elements. The administrator computing apparatus 29 using a selected technique compares each of the generated data mining models to each other and a “best” data mining model is determined. The “best” data mining model comprises a highest degree of correlation to a desired product offer or service offer with respect to a customer(s). The “best” data mining model may be determined using a plurality of techniques including, inter alia, a lift chart technique, a root mean squared technique, etc.
A lift chart technique comprises calculating a measure of the effectiveness of a predictive model (i.e., data mining model) as a ratio between results obtained with and without the predictive model. For example, a lift chart technique comprises using a measurement comprising a determination of how much better (or worse) a data mining model's predicted results for a given case set would be in comparison to random selection. A lift is typically calculated by dividing a percentage of expected response predicted by the data mining model by the percentage of expected response predicted by a random selection. For example, if a normal density of response to a direct mail campaign for a product offer or service offer is 10 percent, a determination may be made by focusing on a top quartile of the case set predicted to respond to the campaign by the data mining model. If the determination comprises a density of response increasing to 30 percent, the lift would be calculated at 3, or 30/10. A root mean squared technique comprises a special form of error rate for a prediction involving continuous, ordered attributes. The mean-squared error is the measurement of variation between a predicted value and an actual value. Subtracting the two values and squaring the result provides the rate of squared error. The rate of squared error is averaged over all predictions for the same attribute to provide an estimate of variation for a given prediction. The result is squared to ensure that all errors are positive and can be added together when the average is taken. Additionally, the result is squared to weigh widely varying prediction values more heavily. For example, if a prediction for unit sales (in thousands) for one store is 50 and the actual unit sales (in thousands) for the store was 65, the squared error for that prediction would be 65−50=15, raised to the power of 2, or 225.
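The two worked examples above (a lift of 3 and a squared error of 225) may be reproduced with a short sketch. The function names are hypothetical and chosen for illustration. The description only states that squared errors are averaged; taking the square root of that average, as the last function does, is the conventional final step of a root mean squared error and is included here as an assumption about what the "root mean squared technique" computes.

```python
import math


def lift(model_response_rate, baseline_response_rate):
    """Ratio of the response rate obtained with the model to the
    response rate obtained by random selection."""
    return model_response_rate / baseline_response_rate


def squared_error(predicted, actual):
    """Squared difference between one prediction and its actual value."""
    return (actual - predicted) ** 2


def mean_squared_error(predictions, actuals):
    """Average of the squared errors over all predictions."""
    if not predictions or len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    total = sum(squared_error(p, a) for p, a in zip(predictions, actuals))
    return total / len(predictions)


def root_mean_squared_error(predictions, actuals):
    """Square root of the mean squared error (assumed final step)."""
    return math.sqrt(mean_squared_error(predictions, actuals))


# Direct mail example: 30 percent response in the top quartile vs. 10 percent baseline.
campaign_lift = lift(30, 10)            # 3.0

# Unit sales example: predicted 50 (thousands), actual 65 (thousands).
store_error = squared_error(50, 65)     # 225
```

Under either measure the comparison between models is straightforward: a higher lift, or a lower root mean squared error, indicates a more predictive data mining model.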

FIG. 2 illustrates a block diagram comprising an algorithm 19 for implementing the database system 2 of FIG. 1 for selecting a data mining modeling algorithm and producing a propensity to lapse data mining model, in accordance with embodiments of the present invention. Step 26 comprises a process for building a business understanding of the underlying business issues associated with lapsing one's policy/product in the customer's domain. Step 27 comprises using the information from step 26 to source a set of variables from the existing database/client spreadsheets (i.e., data 8). This is typically undertaken as an iterative process. A key to building a predictive model is finding evidence of attrition within the historical population (i.e., customer database). Step 28 comprises a data preparation phase requiring performing exploratory data analysis on the set of chosen variables and undertaking some necessary mathematical transformations. For example, a number of weeks a policy is in force may be determined by subtracting the date the policy entered into force from the current date. In step 31, a best data mining modeling algorithm is selected as described in the description of FIG. 1. In step 30, data mining models are generated using the best data mining modeling algorithm and the data models are evaluated so that the population (i.e., customer data) may be divided into several samples for training purposes. There are two reasons for dividing the population (i.e., customer data) into several samples. A first reason for dividing the population (i.e., customer data) into several samples is to reduce a run time, by reduction in data mining model complexity. A second reason for dividing the population (i.e., customer data) into several samples is to try to unbias the data samples.
Typically, the number of attrition cases in the population is small, and the attrition cases are therefore overwhelmed by the statistics of the portion of the population that may not accept a product or service offer. By choosing smaller populations to compare against each other, more representative data mining models may be generated. Typically, an entity may want to select as many training cases (i.e., samples) as possible when creating a data mining model, but time limitations typically reduce the actual number of training cases selected, so the selected training case set (i.e., samples) should closely represent the density and distribution of the production case set. A largest possible training case set may be selected to smooth a distribution of training case attributes. The process of creating such a representative set of data, called sampling, is best handled by selecting records completely at random. Such random sampling should provide a truly unbiased view of the data. As a result of step 30, a plurality of data mining models are generated. In step 32, data mining models that have been generated are stored and used for later comparison to each other to select a most effective data mining model (i.e., a “best” data mining model). A “best” data mining model may be selected using a plurality of techniques including, inter alia, a lift chart technique, a root mean squared technique, etc. as described in the description of FIG. 1. In step 34, a “best” data mining model is selected and deployed with respect to a product offer or service offer.
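The random sampling described above, in which the population is divided into several training samples by selecting records completely at random, may be sketched as follows. The function names and the use of a seed for repeatability are assumptions made for illustration; the description does not specify how the random selection is implemented.

```python
import random


def random_sample(population, sample_size, seed=None):
    """Draw `sample_size` distinct records completely at random."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


def split_into_samples(population, n_samples, seed=None):
    """Shuffle the population and deal it into `n_samples` disjoint
    samples of near-equal size, so that every record is used once."""
    rng = random.Random(seed)
    shuffled = list(population)   # copy so the caller's data is not reordered
    rng.shuffle(shuffled)
    return [shuffled[i::n_samples] for i in range(n_samples)]


# Hypothetical population of 1000 customer records, identified by index.
population = list(range(1000))
training_samples = split_into_samples(population, 4, seed=0)
one_sample = random_sample(population, 100, seed=1)
```

Shuffling before dealing records into samples avoids any bias that the original record order (for example, order of policy start date) might otherwise introduce into an individual sample.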

FIG. 3 illustrates a flowchart comprising an algorithm used by database system 2 of FIG. 1 for selecting a “best” data mining modeling algorithm, generating data mining models using the “best” data mining modeling algorithm, and selecting a “best” data mining model, in accordance with embodiments of the present invention. In step 35, a technique is selected for selecting a “best” data mining modeling algorithm and generating data mining models applied to each of the data subsets 8A . . . 8F to determine a “best” data mining model. The selected techniques may include, inter alia, a lift chart technique, a root mean squared technique, etc. as described and defined in the description of FIG. 1. In step 36, the administrator computing apparatus 29 transmits simultaneously, a different data mining modeling algorithm from a plurality of data mining modeling algorithms 33 to each of data subsets 6A . . . 6F within each of computing devices 20 . . . 23 and 24 . . . 25. In step 37, each different data mining modeling algorithm is run simultaneously, using the selected technique from step 35, on each of data subsets 6A . . . 6F within each of computing devices 20 . . . 23 and 24 . . . 25. In step 39, a plurality of selection data mining models are simultaneously generated. In step 42, the administrator computing apparatus 29 compares each of the generated selection data mining models to test data 4, and a “best” selection data mining model is selected. In step 44, a “best” data mining modeling algorithm is selected. The “best” data mining modeling algorithm is associated with the “best” selection data mining model. The “best” data mining modeling algorithm may comprise any type of data mining modeling algorithm including, inter alia, a decision tree algorithm, a clustering algorithm, a radial basis function algorithm, a linear regression algorithm, an associations algorithm, a neural network algorithm, etc. as described and defined in the description of FIG. 1.
In step 46, the “best” data mining modeling algorithm is applied to each of data subsets 8A . . . 8F within each of computing devices 20 . . . 23 and 24 . . . 25 to create a plurality of data mining models. In step 48, a “best” data mining model is selected. The “best” data mining model comprises a highest degree of correlation to a desired product offer or service offer with respect to a customer(s).
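The flow of FIG. 3 (steps 36 through 48) may be sketched as follows. This is an illustrative sketch under several assumptions and not the patented implementation: worker threads stand in for the separate computing devices, two toy algorithms (a mean predictor and a least-squares line) stand in for the data mining modeling algorithms 33, every algorithm is run on every selection subset, and mean squared error against the test data is used as the selected technique in both selection steps. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_mean(subset):
    """Toy algorithm: always predict the mean observed outcome."""
    mean_y = sum(y for _, y in subset) / len(subset)
    return lambda x: mean_y


def fit_line(subset):
    """Toy algorithm: ordinary least-squares line through (x, y) pairs."""
    n = len(subset)
    mean_x = sum(x for x, _ in subset) / n
    mean_y = sum(y for _, y in subset) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in subset)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in subset)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, test_data):
    """Selected technique (assumed): lower error means more predictive."""
    return sum((model(x) - y) ** 2 for x, y in test_data) / len(test_data)


def select_best_algorithm(algorithms, subsets, test_data):
    """Steps 36-44: fit each algorithm on each selection subset
    concurrently, score every model against the test data, and return
    the name of the algorithm that produced the best model."""
    jobs = [(name, fit, subset)
            for name, fit in algorithms.items() for subset in subsets]
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(lambda job: job[1](job[2]), jobs))
    scored = [(mean_squared_error(model, test_data), job[0])
              for job, model in zip(jobs, models)]
    best_error, best_name = min(scored)
    return best_name, best_error


def select_best_model(fit, subsets, test_data):
    """Steps 46-48: apply the best algorithm to the second plurality of
    subsets concurrently and keep the most predictive model."""
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(fit, subsets))
    return min(models, key=lambda m: mean_squared_error(m, test_data))


# Hypothetical data following y = 2x + 1.
selection_subsets = [[(x, 2 * x + 1) for x in range(s, s + 10)] for s in (0, 10, 20)]
model_subsets = [[(x, 2 * x + 1) for x in range(s, s + 10)] for s in (40, 50)]
test_data = [(x, 2 * x + 1) for x in range(30, 35)]

algorithms = {"mean": fit_mean, "line": fit_line}
best_name, best_error = select_best_algorithm(algorithms, selection_subsets, test_data)
best_model = select_best_model(algorithms[best_name], model_subsets, test_data)
```

On this linear data the least-squares line is selected over the mean predictor. A process pool or a cluster scheduler would be a closer analogue of physically separate nodes; threads are used here only to keep the sketch self-contained.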

FIG. 4 illustrates a computer system 90 used for implementing the database system 2 of FIG. 1 for selecting a “best” data mining modeling algorithm to generate and select data mining models, in accordance with embodiments of the present invention. The computer system 90 comprises a processor 91, an input device 92 coupled to the processor 91, an output device 93 coupled to the processor 91, and memory devices 94 and 95 each coupled to the processor 91. The input device 92 may be, inter alia, a keyboard, a mouse, etc. The output device 93 may be, inter alia, a printer, a plotter, a computer screen, a magnetic tape, a removable hard disk, a floppy disk, etc. The memory devices 94 and 95 may be, inter alia, a hard disk, a floppy disk, a magnetic tape, an optical storage such as a compact disc (CD) or a digital video disc (DVD), a dynamic random access memory (DRAM), a read-only memory (ROM), etc. The memory device 95 includes a computer code 97. The computer code 97 includes an algorithm for selecting a “best” data mining modeling algorithm to generate and select data mining models. The processor 91 executes the computer code 97. The memory device 94 includes input data 96. The input data 96 includes input required by the computer code 97. The output device 93 displays output from the computer code 97. Either or both memory devices 94 and 95 (or one or more additional memory devices not shown in FIG. 4) may comprise the database system 2 of FIG. 1 and may be used as a computer usable medium (or a computer readable medium or a program storage device) having a computer readable program code embodied therein and/or having other data stored therein, wherein the computer readable program code comprises the computer code 97. Generally, a computer program product (or, alternatively, an article of manufacture) of the computer system 90 may comprise said computer usable medium (or said program storage device).

Thus the present invention discloses a process for deploying or integrating computing infrastructure, comprising integrating computer-readable code into the computer system 90, wherein the code in combination with the computer system 90 is capable of performing a method used for selecting a “best” data mining modeling algorithm to generate and select data mining models.

While FIG. 4 shows the computer system 90 as a particular configuration of hardware and software, any configuration of hardware and software, as would be known to a person of ordinary skill in the art, may be utilized for the purposes stated supra in conjunction with the particular computer system 90 of FIG. 4. For example, the memory devices 94 and 95 may be portions of a single memory device rather than separate memory devices.

While embodiments of the present invention have been described herein for purposes of illustration, many modifications and changes will become apparent to those skilled in the art. Accordingly, the appended claims are intended to encompass all such modifications and changes as fall within the true spirit and scope of this invention.

Citas de patentes
Patente citada Fecha de presentación Fecha de publicación Solicitante Título
US618900521 Ago 199813 Feb 2001International Business Machines CorporationSystem and method for mining surprising temporal patterns
US6233566 *18 Mar 199915 May 2001Ultraprise CorporationSystem, method and computer program product for online financial products trading
US62789975 Feb 199921 Ago 2001International Business Machines CorporationSystem and method for constraint-based rule mining in large, dense data-sets
US643054722 Sep 19996 Ago 2002International Business Machines CorporationMethod and system for integrating spatial analysis and data mining analysis to ascertain relationships between collected samples and geology with remotely sensed data
US653937830 Nov 200125 Mar 2003Amazon.Com, Inc.Method for creating an information closure model
US6553366 *1 Oct 199922 Abr 2003Ncr CorporationAnalytic logical data model
US6567814 *26 Ago 199920 May 2003Thinkanalytics LtdMethod and apparatus for knowledge discovery in databases
US66118291 Oct 199926 Ago 2003Ncr CorporationSQL-based analytic algorithm for association
US662909513 Oct 199830 Sep 2003International Business Machines CorporationSystem and method for integrating data mining into a relational database management system
US665104822 Oct 199918 Nov 2003International Business Machines CorporationInteractive mining of most interesting rules with population constraints
US6675164 *8 Jun 20016 Ene 2004The Regents Of The University Of CaliforniaParallel object-oriented data mining system
US668769318 Dic 20003 Feb 2004Ncr CorporationArchitecture for distributed relational data mining systems
US66876951 Oct 19993 Feb 2004Ncr CorporationSQL-based analytic algorithms
US668769626 Jul 20013 Feb 2004Recommind Inc.System and method for personalized search, information filtering, and for generating recommendations utilizing statistical latent class models
US67183221 Oct 19996 Abr 2004Ncr CorporationSQL-based analytic algorithm for rule induction
US671833826 Jun 20016 Abr 2004International Business Machines CorporationStoring data mining clustering results in a relational database for querying and reporting
US6772166 *1 Oct 19993 Ago 2004Ncr CorporationSQL-based analytic algorithm for clustering
US6775831 *11 Feb 200010 Ago 2004Overture Services, Inc.System and method for rapid completion of data processing tasks distributed on a network
US7020631 * | 21 May 2001 | 28 Mar 2006 | The Chase Manhattan Bank | Method for mortgage and closed end loan portfolio management
US7039654 * | 12 Sep 2002 | 2 May 2006 | Asset Trust, Inc. | Automated bot development system
US7092941 * | 23 May 2002 | 15 Aug 2006 | Oracle International Corporation | Clustering module for data mining
US7143046 * | 28 Dec 2001 | 28 Nov 2006 | Lucent Technologies Inc. | System and method for compressing a data table using models
US7219099 * | 9 Apr 2003 | 15 May 2007 | Oracle International Corporation | Data mining model building using attribute importance
US7349919 * | 21 Nov 2003 | 25 Mar 2008 | International Business Machines Corporation | Computerized method, system and program product for generating a data mining model
US20010051947 * | 3 Apr 2001 | 13 Dec 2001 | International Business Machines Corporation | Spatial data mining method, spatial data mining apparatus and storage medium
US20020059202 * | 14 May 2001 | 16 May 2002 | Mirsad Hadzikadic | Incremental clustering classifier and predictor
US20020077790 | 18 Dec 2000 | 20 Jun 2002 | Mikael Bisgaard-Bohr | Analysis of retail transactions using gaussian mixture models in a data mining system
US20020083067 * | 27 Sep 2001 | 27 Jun 2002 | Pablo Tamayo | Enterprise web mining system and method
US20020128997 * | 15 May 2001 | 12 Sep 2002 | Rockwell Technologies, Llc | System and method for estimating the point of diminishing returns in data mining processing
US20020138492 * | 16 Nov 2001 | 26 Sep 2002 | David Kil | Data mining application with improved data mining algorithm selection
US20020194159 * | 8 Jun 2001 | 19 Dec 2002 | The Regents Of The University Of California | Parallel object-oriented data mining system
US20020198889 * | 26 Apr 2001 | 26 Dec 2002 | International Business Machines Corporation | Method and system for data mining automation in domain-specific analytic applications
US20030065663 * | 12 Sep 2001 | 3 Apr 2003 | Chu Chengwen Robert | Computer-implemented knowledge repository interface system and method
US20030088491 * | 7 Nov 2001 | 8 May 2003 | International Business Machines Corporation | Method and apparatus for identifying cross-selling opportunities based on profitability analysis
US20030217033 * | 17 May 2002 | 20 Nov 2003 | Zigmund Sandler | Database system and methods
US20040015386 * | 19 Jul 2002 | 22 Jan 2004 | International Business Machines Corporation | System and method for sequential decision making for customer relationship management
US20040128287 | 25 Nov 2003 | 1 Jul 2004 | International Business Machines Corporation | Self tuning database retrieval optimization using regression functions
US20040267729 * | 9 Apr 2004 | 30 Dec 2004 | Accenture Llp | Knowledge management tool
US20050114360 * | 24 Nov 2003 | 26 May 2005 | International Business Machines Corporation | Computerized data mining system, method and program product
US20060085422 * | 1 Oct 2004 | 20 Apr 2006 | International Business Machines Corporation | Technique for data mining using a web service
Other citations
Reference
1. Allinson et al.; Interactive and Semantic Data Visualization using Self-Organising Maps; pp. 1-9.
2. Feng et al.; Data mining techniques applied to predictive modeling of the knurling process; IIE Transactions (2004) 36, 263; 0740-817X print / 1545-8830 online; DOI: 10.1080/07408170490274214.
3. Hall et al.; Why are Neural Networks Sometimes Much More Accurate than Decision Trees: An Analysis on a Bio-Informatics Problem; 0-7803-7952-7/03 2003 IEEE; SMC '03 Conference Proceedings, vol. 3 of 5; pp. 2851-2856.
4. Hossain et al.; A study of re-sampling methods with regression modeling; Third International Conference on Data Mining, Data Mining III; pp. 83-91, 2002.
5. Jin et al.; A Super-Programming Approach for Mining Association Rules in Parallel on PC Clusters; IEEE Transactions on Parallel and Distributed Systems, vol. 15, No. 9, Sep. 2004; pp. 783-793.
6. Kamran Sartipi; Software Architecture Recovery based on Pattern Matching; 1063-6773/03 2003 IEEE; 4 pages.
7. Kumar et al.; High Performance Data Mining Tutorial PM-3; pp. 309-425.
8. Lee et al.; Neural-Based Approaches for Improving the Accuracy of Decision Trees; Y. Kambayashi, W. Winiwarter, M. Arikawa (Eds.); DaWaK 2002, LNCS 2454; pp. 114-123, 2002.
9. Mardia et al.; Association Rules for Web Data Mining in WHOWEDA; 0-7695-1022-1/01 2001 IEEE; pp. 227-233.
10. Mark Last; A Compact and Accurate Model for Classification; 1041-4347/04 2004 IEEE; IEEE Transactions on Knowledge and Data Engineering, vol. 16, No. 2, Feb. 2004; pp. 203-215.
11. Pan et al.; Hybrid Neural Network and C4.5 for Misuse Detection; 0-7803-7865-2/03 2003 IEEE; 2003 International Conf. Machine Learning and Cybernetics; vol. 4 of 5; pp. 2463-2467.
12. Ros et al.; Development of predictive models by adaptive fuzzy partitioning. Application to compounds active on the central nervous system; Chemometrics and Intelligent Laboratory Systems 67 (2003); pp. 29-50.
13. Sousa et al.; Modeling Charity Donations Using Target Selection for Revenue Maximization; 0-7803-7810-5/03 2003 IEEE; The 12th IEEE International Conf. on Fuzzy Systems; vol. 1; pp. 654-659.
14. Steingold et al.; Measuring Real-Time Predictive Models; 0-7695-1119-8/01 2001 IEEE; pp. 649-650.
15. Szupiluk et al.; Independent Component Analysis as Postprocessing Stage in Data Mining; AI-Meth-2003-Artificial Intelligence Methods; Nov. 5-7, 2003, Gliwice, Poland; pp. 311-314.
16. Vafaie et al.; Improving Performance of Inductive Models Through an Algorithm and Sample Combination Strategy; International Journal of Artificial Intelligence Tools; Dec. 2001, vol. 10, No. 4; pp. 555-572.
Cited by
Citing patent | Filing date | Publication date | Applicant | Title
US7730100 * | 13 Nov 2006 | 1 Jun 2010 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method, and storage medium
US8527324 * | 28 Dec 2006 | 3 Sep 2013 | Oracle Otc Subsidiary Llc | Predictive and profile learning salesperson performance system and method
US8713048 * | 24 Jun 2008 | 29 Apr 2014 | Microsoft Corporation | Query processing with specialized query operators
US20090319499 * | 24 Jun 2008 | 24 Dec 2009 | Microsoft Corporation | Query processing with specialized query operators
Classifications
U.S. Classification: 1/1, 706/12, 707/999.102, 707/999.006, 707/999.008, 707/999.101, 707/999.007
International Classification: G06F7/00
Cooperative Classification: Y10S707/99936, Y10S707/99942, Y10S707/99937, Y10S707/99943, Y10S707/99938, G06F17/30539
European Classification: G06F17/30S4P8D
Legal events
Date | Code | Event | Description
21 May 2013 | FP | Expired due to failure to pay maintenance fee | Effective date: 20130331
31 Mar 2013 | LAPS | Lapse for failure to pay maintenance fees |
12 Nov 2012 | REMI | Maintenance fee reminder mailed |
21 Jul 2005 | AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: CHITGUPAKAR, MILIND; RAMSEY, MARK S.; SELBY, DAVID A.; REEL/FRAME: 016549/0271; SIGNING DATES FROM 20050608 TO 20050619