WO1998049637A1 - Method and apparatus for searching a database of records - Google Patents

Method and apparatus for searching a database of records

Info

Publication number
WO1998049637A1
Authority
WO
WIPO (PCT)
Prior art keywords
records
search result
categories
search
database
Prior art date
Application number
PCT/US1998/008785
Other languages
French (fr)
Inventor
Marc F. Krellenstein
Original Assignee
Northern Light Technology, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northern Light Technology, Llc
Priority to DE69839604T (patent DE69839604D1)
Priority to CA002288745A (patent CA2288745C)
Priority to AU72717/98A (patent AU736428B2)
Priority to EP98920069A (patent EP0979470B1)
Priority to JP54740898A (patent JP2001522496A)
Publication of WO1998049637A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99934 Query formulation, input preparation, or translation
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access

Definitions

  • the invention relates generally to a method and apparatus for searching a database of records. More particularly, the invention relates to a method and search apparatus for searching a database comprising both Internet and premium content information.
  • the Internet attracts millions of users every day. It has been estimated that the number of Internet users would grow from 10 million at the end of 1995 to 170 million by the year 2000.
  • the primary attraction to the Internet is the promise of huge quantities of available information on any imaginable topic of interest.
  • the Internet is an excellent source of the type of information found in product brochures. However, the Internet is a remarkably poor source of editorial information, reference information and commentary.
  • quality information (i.e., premium content) is most often created and provided by companies who are compensated for the information (i.e., premium content owners).
  • the tradition of no cost information on the Internet has inhibited premium content owners from making their information available via the Internet.
  • Another reason has been the substantial financial and capital investment required to develop, market and maintain premium content on the Internet. Industry observers are unclear as to which business models will ultimately materialize to produce reasonable profits for premium content available on the Internet. As a result of these factors, the Internet is currently not considered a primary source of most recognized content on any topic.
  • some premium content owners have begun to make their information available on the Internet, typically in the form of subscription services. These services, however, have numerous problems and are therefore not always a good solution for Internet users.
  • One problem with subscription services is that a user must perform multiple searches and search multiple sites (often including multiple databases at sites) to obtain comprehensive information on the subject being searched. For a truly robust result, users often use a search engine, which can return volumes of information from the Internet. With no easy way to consolidate the returned information, users find the process too cumbersome and time consuming to be worthwhile.
  • Another problem is that users can incur high costs in signing up for multiple subscription services to satisfy their needs in each topic area of interest. While users typically have varying interests, many resist signing up for multiple subscriptions on multiple topics.
  • users are required to anticipate their desire to query on a particular topic in order to have all of the necessary subscriptions in advance. In reality, many user information interests are ad hoc and of short duration. Subscription services cannot satisfy this type of user information need.
  • the search can produce hundreds, even thousands, of hits (i.e., records).
  • the Alta Vista™ search engine returns hundreds of thousands of hits in response to a search under the topic "windows." This deluge of information is often just too much to review, cull, and select. This problem is exacerbated by the failure of the search engine to group the hits in the search result list in any meaningful way.
  • Windows™ 95 software product information would be included along with architectural windows and personal pages on the search result list.
  • many of the leading search engines view each html page as an independent hit, so a one-hundred page Web site can produce one-hundred hits on the search result list.
  • search engines do group hits by web site. Many leading search engines use primitive relevance ranking routines that result in search result lists with little or no relevance ranking. Poorly ranked search result lists are a significant problem for consumers. If a search produces one-hundred hits, the user must browse through twenty screens of information to find the most interesting information. It has been shown that most users give up after the first few screens. Thus, if highly relevant information is buried in a later screen, most users never know and conclude that the search was a failure.
  • the present invention features a method and apparatus for searching a database which can include Internet and premium content records.
  • the invention provides users with access to the wealth of information on the Internet and to premium content information not on the Internet.
  • the invention uses sophisticated categorization methods along with detailed relevancy criteria to provide a meaningful search result list in the form of a set of search result categories.
  • the user is presented with a small number of categories along with a list of the most relevant records.
  • Each category can include narrower categories and/or a list of the most relevant records.
  • the invention features a method for searching a database of records.
  • the database can include Internet and premium content records.
  • the database is searched and a search result list which includes a selected set of the records is generated.
  • a portion of the search result list is processed to dynamically create a set of search result categories.
  • the portion of the search result list can be the first two-hundred (or one-hundred) most relevant records within the selected set of records.
  • Each search result category is associated with a subset of the records within the search result list.
  • the invention uses a categorization (or clustering) methodology for retrieving records stored in the database to compile the search result list.
  • the methodology has three primary steps: identifying candidate categories, weighting candidate categories and displaying a set of search result categories selected from the candidate categories.
  • Each record within the search result list can have associated subject, type, source and language characteristics. Common characteristics associated with the records are identified, and records having common characteristics are grouped into candidate categories.
  • a list of candidate categories, being representative of possible search result categories, is compiled. Each candidate category is weighted as a function of the identified common characteristics of the records within that candidate category.
  • One or more candidate categories are selected as a function of the identified common characteristics of the records. For example, about five to ten search result categories can be selected from the candidate categories.
  • a graphical representation of the categories is provided for user display of the categories. The categories can be displayed as a plurality of folders on the user's display.
  • the invention features a search apparatus for searching a database of records.
  • the database comprises a plurality of records, including Internet records and premium content records.
  • the apparatus includes a search processor and a grouping processor.
  • the grouping processor includes a record processor; a candidate generator; a weighting processor; and a display processor.
  • Each of these elements is a software module. Alternatively, each element could possibly be a hardware module or a combined hardware/software module.
  • the search processor receives search instructions from a user. Responsive to a search instruction, the search processor searches the database to generate a search result list which includes a selected set of the records.
  • the grouping processor processes a portion of the search result list to dynamically create a set of search result categories. Each search result category is associated with a subset of the records in the search result list.
  • the apparatus performs a plurality of processing steps to dynamically create the search result categories.
  • the record processor identifies subject, type, source and language characteristics associated with each record within the search result list.
  • the candidate generator identifies common characteristics associated with the records within the search result list and compiles a list of candidate categories. Each candidate category is representative of a possible search result category.
  • the weighting processor weights each candidate category as a function of the identified common characteristics of the records within the candidate category.
  • the display processor selects a plurality of search result categories corresponding to those candidate categories having the highest weight.
  • the display processor provides a graphical representation of the search result categories for display on the user's monitor.
  • the invention provides an efficient method to view and navigate among large sets of records and offers advantages over long linear lists.
  • the invention uses categorization to guide the user through a multi-step search process in a humane and satisfying way.
  • a user can construct a complex query in small steps taken one at a time.
  • a user can rapidly perform the search in a few steps without having to review long linear lists of records.
  • FIG. 1 is a block diagram illustrating the functional elements of a search apparatus incorporating the principles of the invention.
  • FIG. 2 is a flow chart illustrating the sequence of steps used by the search apparatus in performing a search in accordance with the invention.
  • FIGS. 3A-3C are illustrations of a user's display during a search using the search apparatus.
  • the apparatus 10 includes a search processor 12 and a grouping processor 14.
  • the grouping processor comprises a record processor 16, a candidate generator 18, a weighting processor 20, and a display processor 22. These elements are software modules and have been so identified merely to illustrate the functionality of the invention.
  • the apparatus 10 communicates with a user 24 (i.e., a computer) and a database 26, which includes Internet and premium content records, via an I/O bus 28.
  • the apparatus 10 is capable of communicating with a plurality of remotely located users over a wide area network (e.g., the Internet).
  • FIG. 2 is a flow chart illustrating the sequence of steps used by the search apparatus in performing a search.
  • the search processor 12 receives search instructions (i.e., a query) from a user 24 via the bus 28 (step 30).
  • the search processor 12 searches the database 26 and generates a search result list corresponding to a selected set of the records (step 32).
  • the selected set of records are ranked according to relevancy criteria.
  • the relevancy criteria for ranking the records can include the following rules:
  • the grouping processor determines whether the query term is in the keywords. If the query term is in the keywords, the record ranks higher. If the number of records is less than a particular value (e.g., 20), the grouping processor
  • the grouping processor 14 processes a portion of the search result list to dynamically create a set of search result categories, wherein each search result category is associated with a subset of the records in the search result list.
  • the portion of the search result list processed can be the first two-hundred (or one-hundred) most relevant records within the selected set of records.
  • the grouping processor 14 performs a plurality of processing steps to dynamically create the set of search result categories.
  • the record processor 16 identifies various characteristics (e.g., subject, type, source and language) associated with each record in the search result list (step 36).
  • the candidate generator 18 identifies common characteristics associated with the records in the search result list and compiles a list of candidate categories (step 38).
  • the candidate generator 18 utilizes various rules, which are described below, to compile the list.
  • the weighting processor 20 weights each candidate category as a function of the identified common characteristics of the records within the candidate category (step 40). Also, the weighting processor 20 utilizes various weighting rules, which are described below, to weight the candidate categories.
  • the display processor 22 selects a plurality of search result categories (e.g., 5 to 10) corresponding to the candidate categories having the highest weight (step 42) and provides a graphical representation of the search result categories for display on the user's monitor (step 44).
  • the search result categories can be displayed as a plurality of icons on the monitor (e.g. folders).
  • the display processor also can provide a graphical representation of the number of records in the search result category, additional search result categories and a list of the most relevant records for display.
  • the user can select a search result category (step 46) and view additional search result categories (if the number of records is greater than a particular value) along with the list of records included in that category.
  • the user can provide additional search terms (i.e., a refine instruction) (step 48).
  • the search processor 12 searches the database 26 and generates another search result list corresponding to a refined set of the records (step 50).
  • the user can (effectively) refine the search simply by successively opening up additional search result categories.
  • FIGS. 3A-3C are sample illustrations of a user's display during a search using the search apparatus 10. These illustrations are merely exemplary and provided solely for explanation purposes. Therefore, the layout of the various keys, buttons and icons is immaterial.
  • the display 60 includes a search field 62 into which a user can enter search instructions and a search icon 64 for executing the search instructions.
  • the display also includes a hints icon 66 for providing search tips, miscellaneous function icons (e.g., a search icon 68, directories icon 70, a support icon 72 and a legal icon 74) and search icons (e.g., simple search 76, power search 78, health search 80, company search 82 and computer search 84).
  • the user enters search instructions (i.e., a query) into the field 62 and selects the search icon 64 (see FIG. 3B).
  • the search apparatus 10 searches the database 26 and dynamically creates a set of search result categories (86a-86n) along with a list of the most relevant records (88a-88m).
  • Each search result category (86a-86n) includes a subject caption
  • each record (88a-88m) includes a caption along with a "fee/free" indicator.
  • the user can view a category by selecting its icon or can view a particular record by selecting its icon. Alternatively, the user can perform a new search by selecting the start over icon 90 or can refine the query by entering text into the search field 62 and selecting the search icon 64.
  • the apparatus 10 creates another set of search result categories and another list of the most relevant records.
  • the user can repeat this process, further narrowing the search with each iteration, until the number of relevant records drops to a predetermined threshold (e.g., 20).
  • the apparatus 10 only provides the user with a list of the most relevant records.
  • the user can use a predetermined list of directories (92a-92y) to focus the searching process (see FIG. 3C).
  • the user enters search instructions into the field 62, selects one or more directories (e.g., directories 92a, 92b) and selects the search icon 64 (see FIG. 3B).
  • the search apparatus 10 searches the database 26, focusing on those records that satisfy the query and fall within the selected directories.
  • the apparatus provides a set of search result categories and most relevant records which are limited to those directories.
  • the grouping processor executes a categorization algorithm to dynamically create the set of search result categories.
  • the algorithm includes three primary steps: identifying candidate categories, weighting categories and displaying a plurality of categories with the highest weights.
  • the rules have been organized around a target number of seven (+/- 2) categories in the following embodiment, but are generally independent of that number.
  • nrecs means the first 200 records of the total number of records on the search result list, or the total size of the result list, whichever number is smaller.
  • nrecs refers interchangeably to that number of records or that group of records.
  • ncategories means the number of desired categories (+/- 2 categories).
  • internal domain ordering means an ordering of domains that emphasizes the relevant differentiation capabilities of the domains. The ordering can be as follows: type; subject; source; and language.
  • user domain ordering means an ordering of domains that emphasizes the user accessibility/apparent user value of the domains.
  • the ordering can be as follows: subject; source; type; and language.
  • level means the level in a domain hierarchy of the single value for that particular domain assigned to that category. Hierarchy levels are assumed to be numbered from 1 (all items, e.g., all subjects) through N (the lowest level of the hierarchy, with the normal 'top' level of 6 or so items being level 2).
  • the search processor searches the database and generates a search result list. The set of records in the list are ranked according to relevancy criteria described above. All subsequent processing is performed on nrecs.
  • if nrecs is less than 20 (or some other predetermined number), the only candidate category is the "all records" category, and the processor skips to category weighting (described below).
  • the set of available type, subject, language or source values is limited by any value or sub-trees of such value provided in the query (e.g., queries limited to a particular subject result in candidate categories that only include that subject or more specific subjects in that subject area). If no values for these fields are provided, the entire domains of these characteristics are available. It is assumed that any criteria specified for multiple fields are logically AND'd together in the query.
  • the grouping processor generates, as candidate categories, all type-subject combinations having more than 20% of nrecs and using all available nodes in the subject and type domains.
  • the grouping processor generates, as candidate categories, all subject-only groupings and consolidations from all available nodes in the subject domain that have 20% or more of nrecs.
  • the grouping processor generates, as candidate categories, all type-only groupings and consolidations from all available nodes in the type domain that have 20% or more of nrecs.
  • the grouping processor generates, as a candidate category, any domain in the language hierarchy that contains more than 20% but less than 80% of nrecs.
  • the grouping processor generates, as a candidate category, any web site that contains three or more records, or any other node in the source hierarchy that contains more than 20% of nrecs.
  • the grouping processor generates, as a candidate category, any top-level node in the source hierarchy for which one has not already been generated. This provides at least one set of candidate categories which are exhaustive not only of nrecs but of the entire search result list.
  • the grouping processor generates candidate categories with 20% or more of nrecs not already generated that consist of all pair-wise combinations of all available nodes of any two fields specified in the query (e.g., a query specifying language and source will have candidates generated for all language-source combinations with 20% or more of nrecs).
  • the grouping processor eliminates any categories with a value of "Unknown" for any domain in the category.
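The candidate-generation rules above can be sketched roughly as follows. This is a simplified illustration, not the patented implementation: it treats every domain as a flat list of values (no hierarchy, so no consolidation across nodes), it omits the source-hierarchy and query-specified pair-wise rules, and the record keys "type", "subject", "language" and "site" are hypothetical names chosen for the example.

```python
from collections import Counter

UNKNOWN = "Unknown"

def candidate_categories(result_list, limit=200, small_list=20):
    """Flat-domain sketch of the candidate-generation rules (assumed record layout)."""
    nrecs = result_list[:limit]          # first 200 records, or the whole list if shorter
    n = len(nrecs)
    if n < small_list:
        return [{"all": "all records"}]  # only the "all records" category

    candidates = []

    # Type-subject combinations holding more than 20% of nrecs (5 * count > n).
    pairs = Counter((r["type"], r["subject"]) for r in nrecs)
    for (rec_type, subject), count in pairs.items():
        if 5 * count > n:
            candidates.append({"type": rec_type, "subject": subject})

    # Subject-only and type-only groupings holding 20% or more of nrecs.
    for domain in ("subject", "type"):
        for value, count in Counter(r[domain] for r in nrecs).items():
            if 5 * count >= n:
                candidates.append({domain: value})

    # Language values holding more than 20% but less than 80% of nrecs.
    for value, count in Counter(r["language"] for r in nrecs).items():
        if n < 5 * count < 4 * n:
            candidates.append({"language": value})

    # Any web site contributing three or more records.
    for site, count in Counter(r["site"] for r in nrecs).items():
        if count >= 3:
            candidates.append({"site": site})

    # Eliminate categories with an "Unknown" value in any domain.
    return [c for c in candidates if UNKNOWN not in c.values()]
```

Integer comparisons are used for the percentage thresholds so that boundary cases (exactly 20% or 80%) do not depend on floating-point rounding.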
  • the algorithm weights categories.
  • the weighting rules indicate weights are applied cumulatively to categories (i.e., the final weight of each category is the sum of all the weights received).
  • One rule emphasizes the internal domain ordering and the level of precision within a domain. That rule provides that all categories receive a weight for each domain which is the product of the factor for that domain and the level of the value for that domain.
  • the factors are as follows: type (10), subject (6), source (3) and language (1).
  • Another rule emphasizes categories having a larger number of records. That rule provides that all categories receive a weight which is 20% of the percentage of nrecs contained in that category.
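As a rough illustration of the two rules just described, the sketch below computes the domain-level weight and the size weight for one category. The factors come from the text; the function names and the shape of the inputs (a mapping from domain to hierarchy level, and plain record counts) are assumptions made for the example.

```python
# Domain factors as listed in the text.
DOMAIN_FACTORS = {"type": 10, "subject": 6, "source": 3, "language": 1}

def domain_weight(levels):
    """Sum of factor * level over the domains in a category.

    levels maps each domain present in the category to the hierarchy
    level of the value assigned for that domain.
    """
    return sum(DOMAIN_FACTORS[domain] * level for domain, level in levels.items())

def size_weight(category_count, nrecs_count):
    """20% of the percentage of nrecs contained in the category.

    0.20 * (100 * count / nrecs) simplifies to 20 * count / nrecs.
    """
    return 20 * category_count / nrecs_count
```

For example, a type-subject category whose type value sits at level 2 and whose subject value sits at level 3 would receive 10 × 2 + 6 × 3 = 38 from the first rule, and a category holding 50 of 200 records (25%) would receive 5 from the second.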
  • Three rules emphasize the most relevant categories.
  • the first rule provides that all categories receive a weight equal to ten times the number of records in the category that are among the top ranked five records of nrecs.
  • the second rule provides that all categories receive a weight equal to five times the number of records found in the category that are among the second ranked five records of nrecs.
  • the third rule provides that all categories receive a weight equal to two times the number of records found in the category that are among the eleventh through twentieth ranked records of nrecs.
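The three relevance rules can be expressed as a small lookup over rank bands. In this sketch, "second ranked five records" is read as ranks 6 through 10, which is an interpretation rather than something the text states outright; the function name and the use of 1-based ranks are also assumptions.

```python
# (rank band, multiplier): top five, next five, eleventh through twentieth.
RANK_BANDS = [(range(1, 6), 10), (range(6, 11), 5), (range(11, 21), 2)]

def relevance_weight(category_ranks):
    """category_ranks: 1-based ranks within nrecs of the records in one category."""
    total = 0
    for band, multiplier in RANK_BANDS:
        total += multiplier * sum(1 for rank in category_ranks if rank in band)
    return total
```

A category containing the records ranked 1, 3, 7, 12, 15 and 40 would receive 2 × 10 + 1 × 5 + 2 × 2 = 29.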
  • the first rule provides that all categories containing a value at level two of the domain of that value, for which there are no categories for values below level two of that domain, receive a weight of 15. This applies for each domain contained within the category theme.
  • the second rule provides that all categories containing a value at level three of its domain, for which there are no categories for values below level three of that domain, receive a weight of 8. This applies for each domain contained within the category theme.
  • the third rule provides that, if the ncategories with the highest weights do not exhaustively cover the values of any one of the domains, and there are two or fewer categories that can be added with values from a single domain to exhaustively represent that domain, add 25 to each of those two categories. If, however, there are more than two categories (in the same or a different domain) that this applies to, select the categories for which the sum of the two identified categories have the highest weight. In case of a tie, select based on the internal domain ordering, and, if still a tie, select randomly.
  • the fourth rule provides that all categories that contain records, 70% or more of which are not found in other categories, receive a weight of 8. It is noted that other percentages and weighting values can be used.
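The fourth rule (rewarding categories whose records are mostly not found elsewhere) might be sketched as below. Records are represented by hypothetical identifiers, and the default threshold and weight mirror the 70% and 8 given in the text, which the text itself notes can be changed.

```python
def uniqueness_weight(category_ids, other_categories, threshold_percent=70, weight=8):
    """Return the bonus if enough of the category's records appear in no other category.

    category_ids: record identifiers in the category being weighted.
    other_categories: iterable of identifier collections for all other categories.
    """
    if not category_ids:
        return 0
    elsewhere = set().union(*other_categories)
    unique = [rid for rid in category_ids if rid not in elsewhere]
    # Integer comparison: unique / total >= threshold_percent / 100.
    if 100 * len(unique) >= threshold_percent * len(category_ids):
        return weight
    return 0
```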
  • Another rule emphasizes web site categories. That rule provides that all web site-only categories with 20% or more of nrecs receive a weight of 12. Yet another rule emphasizes themes specified within a query and provides that all categories containing a domain for which the user specified a value receive a weight of 10. Finally, a rule that emphasizes combination categories provides that all combination categories receive a weight of 8.
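A minimal sketch of these three fixed bonuses follows. It assumes a category is described by the set of domains it constrains and its share of nrecs as a percentage; the domain label "site" for web site categories is a hypothetical name, not one used in the text.

```python
def bonus_weight(domains, share_percent, query_domains):
    """Fixed bonuses for web site, query-specified and combination categories.

    domains: set of domain names the category constrains (assumed representation).
    share_percent: percentage of nrecs contained in the category.
    query_domains: set of domains for which the user specified a value.
    """
    bonus = 0
    if domains == {"site"} and share_percent >= 20:
        bonus += 12  # web site-only category with 20% or more of nrecs
    if domains & query_domains:
        bonus += 10  # category contains a domain specified in the query
    if len(domains) > 1:
        bonus += 8   # combination category
    return bonus
```

Because weights are cumulative, a type-subject category in a query that specified a subject would collect both the query bonus and the combination bonus, for 18 in total.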
  • the algorithm determines a plurality of search result categories from those candidate categories with the highest weights.
  • the processor selects the candidate categories with the highest weight. In case of ties, the user domain ordering is used to select the categories. If the lowest or two lowest weighted categories in ncategories represent a significant drop from the next highest weighted category, the ncategories are reduced by one or two. If, however, the two highest weighted categories not already in ncategories are insignificantly lower in weight than the lowest category already in ncategories, the ncategories are increased by one or two. It is noted that other percentages and weighting values can be used.
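The selection step could be sketched as follows. The text does not define what counts as a "significant" or "insignificant" difference, so the two ratio parameters here (drop_ratio and close_ratio) are purely illustrative assumptions, as is the simplification that each category carries a single primary domain for tie-breaking.

```python
USER_DOMAIN_ORDER = ["subject", "source", "type", "language"]

def select_categories(weighted, ncategories=7, drop_ratio=0.5, close_ratio=0.9):
    """weighted: list of (name, primary_domain, weight) tuples (assumed layout)."""
    # Highest weight first; ties resolved by the user domain ordering.
    ranked = sorted(weighted, key=lambda c: (-c[2], USER_DOMAIN_ORDER.index(c[1])))
    k = min(ncategories, len(ranked))

    # Shrink by two or one if the tail falls well below the category above it.
    for cut in (2, 1):
        if k - cut >= 1 and ranked[k - cut][2] < drop_ratio * ranked[k - cut - 1][2]:
            return ranked[:k - cut]

    # Grow by up to two if the next categories are nearly as heavy as the last kept.
    extra = 0
    while (extra < 2 and k + extra < len(ranked)
           and ranked[k + extra][2] >= close_ratio * ranked[k - 1][2]):
        extra += 1
    return ranked[:k + extra]
```

With a target of seven, this yields between five and nine categories, which matches the "seven, give or take two" range described earlier.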
  • categories with combinations of domains are named with the value of each domain separated by a hyphen.
  • the order of the two domains is determined by the user domain ordering.
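The naming convention for combination categories can be illustrated with a short helper. The user domain ordering is taken from the text; the function name and the dictionary representation of a category are assumptions.

```python
USER_DOMAIN_ORDER = ["subject", "source", "type", "language"]

def category_name(values):
    """Join a category's domain values with hyphens, in user domain order.

    values maps domain name to the value for that domain.
    """
    ordered = sorted(values, key=USER_DOMAIN_ORDER.index)
    return "-".join(values[domain] for domain in ordered)
```

For instance, `category_name({"type": "Articles", "subject": "Biology"})` gives `"Biology-Articles"`, since subject precedes type in the user domain ordering.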
  • For each search result category a count of the number of records in each category is displayed.
  • Web site categories are named by the domain of the web site, which may be a hot link to the default home page of the site. Regardless of whether any of the displayed categories are pure or combination web site categories, all occurrences of individual records within the record list of any category, other than a web site or web site combination category, are replaced with the web site category that contains those records.
  • each record within the database is classified by subject, type, source, and language characteristics (i.e., meta-data attributes).
  • the records can be classified by additional meta-data attributes (e.g., level of difficulty or popularity), query-based attributes, proper names, and run-time document analysis characteristics. Because such a task is too much to do manually, the search apparatus auto-classifies substantially all of the records into the proper categories.
  • Every record is assigned one or more types (e.g., article, book review, letter) and a single source value (e.g., PC Week, personal web pages) via a mostly automatic process (completely automatic for Internet data), although there is some editorial assignment for certain premium content data.
  • every record is generally assigned one or more subjects (e.g., molecular biology) and languages (e.g., French) via a mostly automatic process (completely automatic for Internet data).
  • a record is not assigned a subject and/or language. In such cases, the records are assigned a value of "unknown" for these particular meta-data items (or attributes, or fields, or domains).
  • subject, language and type but not for source
  • each hierarchy is fairly small, e.g., about six subject areas for the subject domain (including humanities and society, business, etc.), each of which are divided into three or four more, making 18 at level 2, each of which are divided into about 35, making about 600 at level 3, etc.
  • a classification system has been developed for automatically determining the four data attributes (i.e., subject, type, language and source) when such attributes are not editorially available from the publisher or record source.
  • the classification system includes two main components: (1) a query-based classification program; and (2) a set of individual programs.
  • the query-based classification program efficiently performs classification for selected attributes and attribute values, including 20,000+ subject terms.
  • the queries are executed against all of the records, and classification scores representing the strength of the match are computed for each record and query. Records are then classified to the two or three queries/attribute values for which that record has the highest classification score.
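The final assignment step (classifying a record to its two or three best-scoring attribute values) might look like the sketch below. How the classification scores are computed is not shown; the score dictionary, the function name and the rule of ignoring non-positive scores are assumptions for illustration.

```python
import heapq

def classify(scores, max_values=3):
    """Pick the attribute values with the highest classification scores for one record.

    scores maps attribute value (e.g., a subject term) to the record's score
    against that value's classification query.
    """
    matched = {value: score for value, score in scores.items() if score > 0}
    return heapq.nlargest(max_values, matched, key=matched.get)
```

For example, scores of 0.9 for "biochemistry", 0.7 for "molecular biology", 0.5 for "genetics" and 0.1 for "cooking" would classify the record to the first three values.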
  • the query-based classification program draws on the following sub-components: (1) a classification language for specifying classification queries; (2) a means of and sources for automatically producing classification queries used by the program; and (3) a number of manually constructed classification queries used by the program.
  • the means of and sources for automatically producing classification queries generates queries about 5 lines long. Each query is produced by analyzing an exemplar or model record for that attribute value (e.g., an encyclopedia article on biochemistry) and automatically extracting the most significant terms for the record. The resulting 'query' is used to match and retrieve other similar records (i.e., classifiable to the same value). Term significance is determined both by how frequent the term is within the record (i.e., more frequent equals more significant) and how infrequent it is in the particular body of exemplar records being used, e.g., the encyclopedia as a whole (i.e., less frequent equals more significant). Exactly what values of frequency/infrequency to use is empirically determined and set for each particular source of exemplar records. Multiple sources can be used. A number of related program tools have also been developed (e.g., for automatically matching encyclopedia articles to terms in the subject hierarchy).
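Term significance as described (more frequent in the exemplar record, less frequent in the body of exemplars) resembles a TF-IDF style score. The sketch below uses a standard TF-IDF formula as a stand-in; the text says the actual frequency values are determined empirically per source, so this formula, the function name and the inputs are all assumptions.

```python
import math
from collections import Counter

def build_query(exemplar_tokens, corpus_doc_freq, corpus_size, max_terms=10):
    """Extract the most significant terms from an exemplar record.

    exemplar_tokens: tokens of the exemplar record (e.g., an encyclopedia article).
    corpus_doc_freq: maps term to the number of exemplar records containing it.
    corpus_size: total number of records in the body of exemplars.
    """
    term_freq = Counter(exemplar_tokens)

    def significance(term):
        doc_freq = corpus_doc_freq.get(term, 1)
        # Frequent in this record and rare across the corpus scores highest.
        return term_freq[term] * math.log(corpus_size / doc_freq)

    # Sort by descending significance, then alphabetically for stable output.
    return sorted(term_freq, key=lambda t: (-significance(t), t))[:max_terms]
```

A term such as "the" that appears in every exemplar record scores zero regardless of how often it occurs in the article, while a term such as "enzyme" that is frequent in the article and rare elsewhere rises to the top of the generated query.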
  • the number of manually constructed classification queries is as follows: about 2,000 such subject queries; about 50 manual type queries; and about 6 manual language queries. Manual queries average about 15 lines in length, except for language queries, which are considerably longer.
  • the second component of the classification system is a set of individual programs and a higher level controlling program which are used to classify data to certain particular values (e.g., recipe, or "personal web page") of one of the data attributes ("type" in the case of recipe,
  • a classification database creates and maintains the data taxonomies, hierarchies, cross- references and associated classification queries.
  • the database includes a multi-user classification editor, and a means to generate reports and data files needed by other system components (e.g., the search engine) and is implemented using Microsoft Access, Microsoft forms, Data Access Objects, SQL and Visual Basic.
  • the database includes approximately 40 tables, 15 forms, 25 reports and 5,000 lines of Visual Basic, and it produces 12 intermediate files for other parts of the system.

Abstract

A method and search apparatus for searching a database of records organizes results of the search into a set of most relevant categories enabling a user to obtain with a few mouse clicks only those records that are most relevant. In response to a search instruction from the user, the search apparatus searches the database, which can include Internet records and premium content records, to generate a search result list corresponding to a selected set of the records. The search apparatus processes the search result list to dynamically create a set of search result categories. Each search result category is associated with a subset of the records within the search result list having one or more common characteristics. The categories can be displayed as a plurality of folders on the user's display. For the foregoing categorization method and apparatus to work, each record within the database is classified according to various meta-data attributes (e.g., subject, type, source, and language characteristics). Because such a task is too much to do manually, substantially all of the records are automatically classified by a classification system into the proper categories. The classification system automatically determines the various meta-data attributes when such attributes are not editorially available from source.

Description

METHOD AND APPARATUS FOR SEARCHING A DATABASE OF RECORDS
Field of the Invention
The invention relates generally to a method and apparatus for searching a database of records. More particularly, the invention relates to a method and search apparatus for searching a database comprising both Internet and premium content information.
Background of the Invention
The Internet attracts millions of users every day. It has been estimated that the number of Internet users would grow from 10 million at the end of 1995 to 170 million by the year 2000. The primary attraction to the Internet is the promise of huge quantities of available information on any imaginable topic of interest. Research has shown that the primary uses of the Internet by users include searching for information and browsing (a form of searching) for information.
Several companies offer search services to assist users in searching the massive, rapidly growing, and infinitely distributed data on the Internet. A large number of Internet users use a search service several times a week, and the top twenty percent of Internet users use a search engine several times a day. The Internet, however, is not without its shortcomings. While there are 250 gigabytes of textual information on the Internet accessible to the public, many Internet users are thwarted in their quest for information in the following ways: (1) quality information is often not on the Internet; (2) quality information exists but is dispersed across proprietary subscription-based sites; (3) search services produce too much or too little information; and (4) search services do not anticipate users' requests.
The Internet is an excellent source of the type of information found in product brochures. However, the Internet is a remarkably poor source of editorial information, reference information and commentary. One reason for this impediment is that quality information (i.e., premium content) is most often created and provided by companies who are compensated for the information (i.e., premium content owners). The tradition of no-cost information on the Internet has inhibited premium content owners from making their information available via the Internet. Another reason has been the substantial financial and capital investment required to develop, market and maintain premium content on the Internet. Industry observers are unclear as to which business models will ultimately materialize to produce reasonable profits for premium content available on the Internet. As a result of these factors, the Internet is currently not considered a primary source of most recognized content on any topic. Despite the foregoing reasons, some premium content owners have begun to make their information available on the Internet, typically in the form of subscription services. These services, however, have numerous problems and are therefore not always a good solution for Internet users.
One problem with subscription services is that a user must perform multiple searches and search multiple sites (often including multiple databases at sites) to obtain comprehensive information on the subject being searched. For a truly robust result, users often use a search engine, which can return volumes of information from the Internet. With no easy way to consolidate the returned information, users find the process too cumbersome and time consuming to be worthwhile. Another problem is that users can incur high costs in signing up for multiple subscription services to satisfy their needs in each topic area of interest. While users typically have varying interests, many resist signing up for multiple subscriptions on multiple topics. Yet another problem is that users are required to anticipate their desire to query on a particular topic in order to have all of the necessary subscriptions in advance. In reality, many user information interests are ad hoc and of short duration. Subscription services cannot satisfy this type of user information need.
When a user accesses one of the leading search engines, the search can produce hundreds, even thousands, of hits (i.e., records). For example, the Alta Vista™ search engine returns hundreds of thousands of hits in response to a search under the topic "windows." This deluge of information is often just too much to review, cull, and select. This problem is exacerbated by the failure of the search engine to group the hits in the search result list in any meaningful way. In the above example, Windows™ 95 software product information would be included along with architectural windows and personal pages on the search result list. Also, many of the leading search engines view each html page as an independent hit, so a one-hundred page Web site can produce one-hundred hits on the search result list. To address this problem, some search engines do group hits by web site. Many leading search engines use primitive relevance ranking routines that result in search result lists with little or no relevance ranking. Poorly ranked search result lists are a significant problem for consumers. If a search produces one-hundred hits, the user must browse through twenty screens of information to find the most interesting information. It has been shown that most users give up after the first few screens. Thus, if highly relevant information is buried in a later screen, most users never know and conclude that the search was a failure.
Two of the leading search engines, Excite™ and Yahoo™, manually classify and index the Internet. This approach produces high quality indexes and proper classification of Web sites in the directory structure. However, the editorial staffs of these companies find themselves in a losing race with the growth of the Internet. Even with staffs of hundreds of editors, these companies cannot visit enough Web sites and cannot revisit each site every time the site changes. Consequently, these companies are incapable of covering a large percentage of the Internet. As a result, searches using these search engines can often return "too little" useful information.
Summary of the Invention
The present invention features a method and apparatus for searching a database which can include Internet and premium content records. The invention provides users with access to the wealth of information on the Internet and to premium content information not on the Internet. The invention uses sophisticated categorization methods along with detailed relevancy criteria to provide a meaningful search result list in the form of a set of search result categories. The user is presented with a small number of categories along with a list of the most relevant records. Each category can include narrower categories and/or a list of the most relevant records. By organizing the search list results into a hierarchy, users can rapidly focus the search to those few records of interest without being overwhelmed by the results.
In one aspect, the invention features a method for searching a database of records. The database can include Internet and premium content records. In response to a search instruction from a user, the database is searched and a search result list which includes a selected set of the records is generated. A portion of the search result list is processed to dynamically create a set of search result categories. By way of example, the portion of the search result list can be the first two-hundred (or one-hundred) most relevant records within the selected set of records. Each search result category is associated with a subset of the records within the search result list.
The invention uses a categorization (or clustering) methodology for retrieving records stored in the database to compile the search result list. The methodology has three primary steps: identifying candidate categories, weighting candidate categories and displaying a set of search result categories selected from the candidate categories. Each record within the search result list can have associated subject, type, source and language characteristics. Common characteristics associated with the records are identified, and records having common characteristics are grouped into candidate categories. A list of candidate categories, being representative of possible search result categories, is compiled. Each candidate category is weighted as a function of the identified common characteristics of the records within that candidate category. One or more candidate categories are selected as a function of the identified common characteristics of the records. For example, about five to ten search result categories can be selected from the candidate categories. A graphical representation of the categories is provided for user display of the categories. The categories can be displayed as a plurality of folders on the user's display.
In another aspect, the invention features a search apparatus for searching a database of records. The database comprises a plurality of records, including Internet records and premium content records. The apparatus includes a search processor and a grouping processor. The grouping processor includes a record processor; a candidate generator; a weighting processor; and a display processor. Each of these elements is a software module. Alternatively, each element could be a hardware module or a combined hardware/software module. The search processor receives search instructions from a user. Responsive to a search instruction, the search processor searches the database to generate a search result list which includes a selected set of the records. The grouping processor processes a portion of the search result list to dynamically create a set of search result categories. Each search result category is associated with a subset of the records in the search result list.
The apparatus performs a plurality of processing steps to dynamically create the search result categories. The record processor identifies subject, type, source and language characteristics associated with each record within the search result list. The candidate generator identifies common characteristics associated with the records within the search result list and compiles a list of candidate categories. Each candidate category is representative of a possible search result category. The weighting processor weights each candidate category as a function of the identified common characteristics of the records within the candidate category. The display processor selects a plurality of search result categories corresponding to those candidate categories having the highest weight. The display processor provides a graphical representation of the search result categories for display on the user's monitor.
The invention provides an efficient method to view and navigate among large sets of records and offers advantages over long linear lists. The invention uses categorization to guide the user through a multi-step search process in a humane and satisfying way. A user can construct a complex query in small steps taken one at a time. Using the invention, a user can rapidly perform the search in a few steps without having to review long linear lists of records.
Brief Description of the Drawings
These and other features of the invention are more fully described below in the detailed description and accompanying drawings of which the figures illustrate an apparatus and method for searching a database comprising both Internet and premium content information.
FIG. 1 is a block diagram illustrating the functional elements of a search apparatus incorporating the principles of the invention.
FIG. 2 is a flow chart illustrating the sequence of steps used by the search apparatus in performing a search in accordance with the invention.
FIGS. 3A-3C are illustrations of a user's display during a search using the search apparatus.
Detailed Description
FIG. 1 is a block diagram illustrating the functional elements of a search apparatus incorporating the principles of the invention. The apparatus 10 includes a search processor 12 and a grouping processor 14. The grouping processor comprises a record processor 16, a candidate generator 18, a weighting processor 20, and a display processor 22. These elements are software modules and have been so identified merely to illustrate the functionality of the invention. The apparatus 10 communicates with a user 24 (i.e., a computer) and a database 26, which includes Internet and premium content records, via an I/O bus 28. The apparatus 10 is capable of communicating with a plurality of remotely located users over a wide area network (e.g., the Internet).
FIG. 2 is a flow chart illustrating the sequence of steps used by the search apparatus in performing a search. With reference to FIGS. 1 and 2, the search processor 12 receives search instructions (i.e., a query) from a user 24 via the bus 28 (step 30). The search processor 12 searches the database 26 and generates a search result list corresponding to a selected set of the records (step 32). The selected set of records are ranked according to relevancy criteria. In one embodiment, the relevancy criteria for ranking the records can include the following rules:
1. If there are more "hits" (a word in a record matching a word in the search criteria), the record ranks higher;
2. If the query term phrase is a hit versus the words separately being hits, the record ranks higher;
3. If the capitalization is the same as in the query term, the record ranks higher;
4. If the query term is in the title, the record ranks higher;
5. If the query term is in the abstract, the record ranks higher; and
6. If the query term is in the keywords, the record ranks higher.
If the number of records is less than a particular value (e.g., 20), the grouping processor 14 is bypassed (step 34). Otherwise, the grouping processor 14 processes a portion of the search result list to dynamically create a set of search result categories, wherein each search result category is associated with a subset of the records in the search result list. By way of example only, the portion of the search result list processed can be the first two-hundred (or one-hundred) most relevant records within the selected set of records.
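The six relevancy rules listed above can be sketched as a simple additive score. This is only an illustration under stated assumptions: the text says each condition makes a record "rank higher" but gives no weights, so the `Record` fields, the `relevancy_score` function and the choice of one point per satisfied rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical record layout; the source does not define one.
    title: str = ""
    abstract: str = ""
    keywords: list = field(default_factory=list)
    body: str = ""


def relevancy_score(record: Record, query: str) -> int:
    """Additive score over the six rules; one point per rule is an assumption."""
    words = query.split()
    text = " ".join([record.title, record.abstract, record.body])
    lowered = text.lower()
    tokens = lowered.split()
    score = 0
    # Rule 1: more word hits rank higher (case-insensitive token count).
    score += sum(tokens.count(w.lower()) for w in words)
    # Rule 2: the whole phrase matching beats the words matching separately.
    if len(words) > 1 and query.lower() in lowered:
        score += 1
    # Rule 3: same capitalization as the query term.
    if query in text:
        score += 1
    # Rules 4-6: query term appears in the title, abstract or keywords.
    if query.lower() in record.title.lower():
        score += 1
    if record.abstract and query.lower() in record.abstract.lower():
        score += 1
    if any(query.lower() == k.lower() for k in record.keywords):
        score += 1
    return score


software = Record(title="Windows 95 tips", body="Windows 95 is software")
repair = Record(title="Home repair", body="fixing windows and doors")
ranked = sorted([repair, software],
                key=lambda r: relevancy_score(r, "Windows 95"), reverse=True)
```

Sorting by the score in descending order puts the software record ahead of the home repair record for the query "Windows 95".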
The grouping processor 14 performs a plurality of processing steps to dynamically create the set of search result categories. The record processor 16 identifies various characteristics (e.g., subject, type, source and language) associated with each record in the search result list (step 36). The candidate generator 18 identifies common characteristics associated with the records in the search result list and compiles a list of candidate categories (step 38). The candidate generator 18 utilizes various rules, which are described below, to compile the list. The weighting processor 20 weights each candidate category as a function of the identified common characteristics of the records within the candidate category (step 40). Also, the weighting processor 20 utilizes various weighting rules, which are described below, to weight the candidate categories. The display processor 22 selects a plurality of search result categories (e.g., 5 to 10) corresponding to the candidate categories having the highest weight (step 42) and provides a graphical representation of the search result categories for display on the user's monitor (step 44). The search result categories can be displayed as a plurality of icons on the monitor (e.g. folders). When a particular search result category is selected by the user, the display processor also can provide a graphical representation of the number of records in the search result category, additional search result categories and a list of the most relevant records for display.
As noted above, the user can select a search result category (step 46) and view additional search result categories (if the number of records is greater than a particular value) along with the list of records included in that category. To narrow the search, the user can provide additional search terms (i.e., a refine instruction) (step 48). Upon receiving the additional terms, the search processor 12 searches the database 26 and generates another search result list corresponding to a refined set of the records (step 50). Alternatively, the user can (effectively) refine the search simply by successively opening up additional search result categories.
FIGS. 3A-3C are sample illustrations of a user's display during a search using the search apparatus 10. These illustrations are merely exemplary and provided solely for explanation purposes. Therefore, the layout of the various keys, buttons and icons is immaterial. With reference to FIGS. 3A-3C, the display 60 includes a search field 62 into which a user can enter search instructions and a search icon 64 for executing the search instructions. The display also includes a hints icon 66 for providing search tips, miscellaneous function icons (e.g., a search icon 68, directories icon 70, a support icon 72 and a legal icon 74) and search icons (e.g., simple search 76, power search 78, health search 80, company search 82 and computer search 84).
The user enters search instructions (i.e., a query) into the field 62 and selects the search icon 64 (see FIG. 3B). The search apparatus 10 searches the database 26 and dynamically creates a set of search result categories (86a-86n) along with a list of the most relevant records (88a-88m). Each search result category (86a-86n) includes a subject caption, and each record (88a-88m) includes a caption along with a "fee/free" indicator. The user can view a category by selecting its icon or can view a particular record by selecting its icon. Alternatively, the user can perform a new search by selecting the start over icon 90 or can refine the query by entering text into the search field 62 and selecting the search icon 64. If the user selects a category, the apparatus 10 creates another set of search result categories and another list of the most relevant records. The user can repeat this process, further narrowing the search with each iteration, until the number of relevant records drops to a predetermined threshold (e.g., 20). At that point, the apparatus 10 only provides the user with a list of the most relevant records.
The user can use a predetermined list of directories (92a-92y) to focus the searching process (see FIG. 3C). The user enters search instructions into the field 62, selects one or more directories (e.g., directories 92a, 92b) and selects the search icon 64 (see FIG. 3B). The search apparatus 10 searches the database 26, focusing on those records that satisfy the query and fall within the selected directories. The apparatus provides a set of search result categories and most relevant records which are limited to those directories.
The grouping processor executes a categorization algorithm to dynamically create the set of search result categories. The algorithm includes three primary steps: identifying candidate categories, weighting categories and displaying a plurality of categories with the highest weights. The rules have been organized around a target number of seven (plus or minus one or two) categories in the following embodiment, but are generally independent of that number.
One embodiment of the categorization algorithm employed by the grouping processor is presented logically hereinafter. It is noted that an actual implementation of the algorithm may omit steps, perform steps in parallel or perform steps in an arbitrary order. In describing the algorithm, the following terms are used. The term "nrecs" means the first 200 records of the total number of records on the search result list, or the total size of the result list, whichever number is smaller. The term nrecs refers interchangeably to that number of records or that group of records. The term "ncategories" means the number of desired categories (plus or minus one or two categories). The term "internal domain ordering" means an ordering of domains that emphasizes the relevant differentiation capabilities of the domains. The ordering can be as follows: type; subject; source; and language. The term "user domain ordering" means an ordering of domains that emphasizes the user accessibility/apparent user value of the domains. The ordering can be as follows: subject; source; type; and language. The term "level" means the level in a domain hierarchy of the single value for that particular domain assigned to that category. Hierarchy levels are assumed to be numbered from 1 (all items, e.g., all subjects) through N (the lowest level of the hierarchy, with the normal 'top' level of 6 or so items being level 2).
In response to a query, the search processor searches the database and generates a search result list. The set of records in the list are ranked according to the relevancy criteria described above. All subsequent processing is performed on nrecs. If nrecs is less than 20 (or some other predetermined number), the only candidate category is the "all records" category, and the processor skips to category weighting (described below).
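The terms defined above translate into a few constants and helpers. The following is a minimal sketch; the names `MAX_NRECS`, `MIN_FOR_GROUPING`, `select_nrecs` and `needs_grouping` are hypothetical, while the values 200 and 20 and the two domain orderings come from the text.

```python
MAX_NRECS = 200         # nrecs is capped at the first 200 ranked records
MIN_FOR_GROUPING = 20   # below this, only the "all records" category is used

# Orderings as given in the text.
INTERNAL_DOMAIN_ORDER = ["type", "subject", "source", "language"]
USER_DOMAIN_ORDER = ["subject", "source", "type", "language"]


def select_nrecs(ranked_results):
    """Return the first 200 records, or the whole list if it is shorter."""
    return ranked_results[:MAX_NRECS]


def needs_grouping(nrecs):
    """True when candidate categories should be generated for nrecs."""
    return len(nrecs) >= MIN_FOR_GROUPING
```

Because slicing never reads past the end of a list, `select_nrecs` covers both halves of the "whichever number is smaller" definition without an explicit `min`.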
For all candidate generation rules, the set of available type, subject, language or source values is limited by any value or sub-trees of such value provided in the query (e.g., queries limited to a particular subject result in candidate categories that only include that subject or more specific subjects in that subject area). If no values for these fields are provided, the entire domains of these characteristics are available. It is assumed that any criteria specified for multiple fields are logically AND'd together in the query.
The grouping processor generates, as candidate categories, all type-subject combinations having more than 20% of nrecs and using all available nodes in the subject and type domains. The grouping processor generates, as candidate categories, all subject-only groupings and consolidations from all available nodes in the subject domain that have 20% or more of nrecs. The grouping processor generates, as candidate categories, all type-only groupings and consolidations from all available nodes in the type domain that have 20% or more of nrecs. The grouping processor generates, as a candidate category, any domain in the language hierarchy that contains more than 20% but less than 80% of nrecs. The grouping processor generates, as a candidate category, any web site that contains three or more records, or any other node in the source hierarchy that contains more than 20% of nrecs. The grouping processor generates, as a candidate category, any top-level node in the source hierarchy for which one has not already been generated. This provides at least one set of candidate categories which are exhaustive not only of nrecs but of the entire search result list. The grouping processor generates candidate categories with 20% or more of nrecs not already generated that consist of all pair-wise combinations of all available nodes of any two fields specified in the query (e.g., a query specifying language and source will have candidates generated for all language-source combinations with 20% or more of nrecs). Finally, the grouping processor eliminates any categories with a value of "Unknown" for any domain in the category.
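A subset of the candidate generation rules above can be sketched as follows. The record layout (dictionaries with `subjects`, `types`, `language` and `site` keys) and the function name `generate_candidates` are assumptions made for illustration; the sketch covers only the type-subject, subject-only, type-only, language and web site rules plus the elimination of "Unknown" values, and it assumes each record already carries its inherited hierarchy values.

```python
from collections import defaultdict


def generate_candidates(nrecs):
    """Map each candidate category key to the indices of its member records."""
    members = defaultdict(set)
    for i, rec in enumerate(nrecs):
        for s in rec["subjects"]:
            members[(("subject", s),)].add(i)
        for t in rec["types"]:
            members[(("type", t),)].add(i)
            for s in rec["subjects"]:
                members[(("type", t), ("subject", s))].add(i)
        members[(("language", rec["language"]),)].add(i)
        members[(("site", rec["site"]),)].add(i)

    n = len(nrecs)
    candidates = {}
    for key, idx in members.items():
        # Drop any category with an "Unknown" value in any domain.
        if any(value == "Unknown" for _, value in key):
            continue
        share = len(idx) / n
        domains = [domain for domain, _ in key]
        if domains == ["type", "subject"]:
            keep = share > 0.20            # more than 20% of nrecs
        elif domains == ["language"]:
            keep = 0.20 < share < 0.80     # more than 20% but less than 80%
        elif domains == ["site"]:
            keep = len(idx) >= 3           # web site with three or more records
        else:
            keep = share >= 0.20           # subject-only or type-only, 20% or more
        if keep:
            candidates[key] = idx
    return candidates


recs = (
    [{"subjects": {"science"}, "types": {"article"},
      "language": "English", "site": "a.com"}] * 10
    + [{"subjects": {"business"}, "types": {"review"},
        "language": "English", "site": f"s{i}.com"} for i in range(5)]
    + [{"subjects": {"Unknown"}, "types": {"article"},
        "language": "French", "site": "b.com"}] * 5
)
cands = generate_candidates(recs)
```

In this toy data the five single-record sites fall below the three-record rule and the "Unknown" subject is eliminated, while both languages qualify because each holds between 20% and 80% of the records.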
Second, the algorithm weights categories. Weights are applied cumulatively to categories (i.e., the final weight of each category is the sum of all the weights received). One rule emphasizes the internal domain ordering and the level of precision within a domain. That rule provides that all categories receive a weight for each domain which is the product of the factor for that domain and the level of the value for that domain. The factors are as follows: type (10), subject (6), source (3) and language (1). Another rule emphasizes categories having a larger number of records. That rule provides that all categories receive a weight which is 20% of the percentage of nrecs contained in that category. Three rules emphasize the most relevant categories. The first rule provides that all categories receive a weight equal to ten times the number of records in the category that are among the five top ranked records of nrecs. The second rule provides that all categories receive a weight equal to five times the number of records found in the category that are among the sixth through tenth ranked records of nrecs. The third rule provides that all categories receive a weight equal to two times the number of records found in the category that are among the eleventh through twentieth ranked records of nrecs.
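The cumulative weighting rules in this paragraph reduce to a short sum. The sketch below is illustrative only: the function name `base_weight` and its argument layout are hypothetical, and it reads the second relevance rule as applying to ranks six through ten.

```python
# Domain factors as given in the text.
DOMAIN_FACTOR = {"type": 10, "subject": 6, "source": 3, "language": 1}


def base_weight(category_levels, member_ranks, nrecs_size):
    """Sum the domain, size and relevance weights for one candidate category.

    category_levels: {domain: hierarchy level of the category's value}
    member_ranks: 1-based relevancy ranks of the records in the category
    nrecs_size: number of records in nrecs
    """
    weight = 0.0
    # Domain rule: factor for the domain times the level of its value.
    for domain, level in category_levels.items():
        weight += DOMAIN_FACTOR[domain] * level
    # Size rule: 20% of the percentage of nrecs contained in the category.
    weight += 0.20 * (100.0 * len(member_ranks) / nrecs_size)
    # Relevance rules: 10x for ranks 1-5, 5x for ranks 6-10, 2x for ranks 11-20.
    weight += 10 * sum(1 for r in member_ranks if 1 <= r <= 5)
    weight += 5 * sum(1 for r in member_ranks if 6 <= r <= 10)
    weight += 2 * sum(1 for r in member_ranks if 11 <= r <= 20)
    return weight


example = base_weight({"subject": 3},
                      [1, 2, 7, 15, 40, 60, 80, 100, 120, 140],
                      nrecs_size=200)
```

For this example the terms are 18 (subject factor 6 times level 3), 1 (20% of 5 percent), 20 (two records in the top five), 5 (one record in ranks six through ten) and 2 (one record in ranks eleven through twenty), for a total of 46.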
Four weighting rules show spread in a domain, increasing overall coverage and minimizing duplication. The first rule provides that all categories containing a value at level two of the domain of that value, for which there are no categories for values below level two of that domain, receive a weight of 15. This applies for each domain contained within the category theme. The second rule provides that all categories containing a value at level three of its domain, for which there are no categories for values below level three of that domain, receive a weight of 8. This applies for each domain contained within the category theme. The third rule provides that, if the ncategories with the highest weights do not exhaustively cover the values of any one of the domains, and there are two or fewer categories that can be added with values from a single domain to exhaustively represent that domain, add 25 to each of those two categories. If, however, there are more than two categories (in the same or a different domain) that this applies to, select the categories for which the sum of the two identified categories have the highest weight. In case of a tie, select based on the internal domain ordering, and, if still a tie, select randomly. The fourth rule provides that all categories that contain records, 70% or more of which are not found in other categories, receive a weight of 8. It is noted that other percentages and weighting values can be used.
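As an illustration of the fourth spread rule, the sketch below awards the weight of 8 to categories whose records are at least 70% unique to them. The function name `uniqueness_bonus` and the representation of categories as sets of record identifiers are assumptions.

```python
def uniqueness_bonus(categories, threshold=0.70, bonus=8):
    """Return {category name: bonus} for the 70%-unique rule.

    categories: {name: set of record ids in that category}
    """
    result = {}
    for name, members in categories.items():
        # Union of the records held by every other category.
        others = set().union(*(m for n, m in categories.items() if n != name))
        unique = len(members - others)
        qualifies = bool(members) and unique / len(members) >= threshold
        result[name] = bonus if qualifies else 0
    return result


bonuses = uniqueness_bonus({
    "a": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    "b": {9, 10, 11},
    "c": {11, 12, 13, 14},
})
```

Here category "a" keeps 8 of its 10 records to itself and "c" keeps 3 of 4, so both receive the bonus, while every record in "b" also appears elsewhere.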
Another rule emphasizes web site categories. That rule provides that all web site-only categories with 20% or more of nrecs receive a weight of 12. Yet another rule emphasizes themes specified within a query and provides that all categories containing a domain for which the user specified a value receive a weight of 10. Finally, a rule that emphasizes combination categories provides that all combination categories receive a weight of 8.
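These three flat weights can be expressed in a few lines. The function name `flat_bonuses`, the use of a `"site"` label for web site categories and the argument layout are hypothetical; the weights 12, 10 and 8 and the 20% threshold come from the text.

```python
def flat_bonuses(domains, share_of_nrecs, user_specified_domains):
    """Sum the web site, query-theme and combination weights for a category.

    domains: list of domains in the category, e.g. ["type", "subject"]
    share_of_nrecs: fraction of nrecs in the category (0.0 to 1.0)
    user_specified_domains: set of domains given a value in the query
    """
    bonus = 0
    # Web site-only categories with 20% or more of nrecs.
    if domains == ["site"] and share_of_nrecs >= 0.20:
        bonus += 12
    # Categories containing a domain for which the user specified a value.
    if any(d in user_specified_domains for d in domains):
        bonus += 10
    # Combination categories (more than one domain).
    if len(domains) > 1:
        bonus += 8
    return bonus
```

Like the other weights, the result would simply be added to the running total for the category.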
Third, the algorithm determines a plurality of search result categories from those candidate categories with the highest weights. First, the processor selects the candidate categories with the highest weight. In case of ties, the user domain ordering is used to select the categories. If the lowest or two lowest weighted categories in ncategories represent a significant drop from the next highest weighted category, the ncategories are reduced by one or two. If, however, the two highest weighted categories not already in ncategories are insignificantly lower in weight than the lowest category already in ncategories, the ncategories are increased by one or two. It is noted that other percentages and weighting values can be used.
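The selection step can be sketched as below. The text does not quantify a "significant drop" or an "insignificantly lower" weight, so the `drop_ratio` and `near_ratio` parameters are purely hypothetical placeholders, as are the function name `select_categories` and the tuple layout of the candidates.

```python
USER_DOMAIN_ORDER = ["subject", "source", "type", "language"]


def select_categories(candidates, target=7, drop_ratio=0.5, near_ratio=0.95):
    """Pick about `target` categories, allowing one or two more or fewer.

    candidates: list of (name, first_domain, weight) tuples.
    """
    if not candidates:
        return []
    # Highest weight first; ties broken by the user domain ordering.
    ranked = sorted(candidates,
                    key=lambda c: (-c[2], USER_DOMAIN_ORDER.index(c[1])))
    count = min(target, len(ranked))
    lowest = ranked[count - 1][2]
    # Grow by two or one if the next categories are nearly as heavy.
    for extra in (2, 1):
        if (count + extra <= len(ranked)
                and ranked[count + extra - 1][2] >= near_ratio * lowest):
            return ranked[:count + extra]
    # Shrink by two or one if the tail sits below a sharp drop in weight.
    for cut in (2, 1):
        if (count - cut >= 1
                and ranked[count - cut][2] < drop_ratio * ranked[count - cut - 1][2]):
            return ranked[:count - cut]
    return ranked[:count]


weights = [100, 90, 80, 70, 60, 50, 10, 9]
cands = [(f"c{i}", "subject", w) for i, w in enumerate(weights)]
chosen = select_categories(cands)
```

With these weights the seventh category (10) is less than half of the sixth (50), so the sketch returns six categories rather than seven.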
In determining the name for each search result category, categories with combinations of domains (e.g., subject-type) are named with the value of each domain separated by a hyphen. The order of the two domains is determined by the user domain ordering. For each search result category, a count of the number of records in each category is displayed. Web site categories are named by the domain of the web site, which may be a hot link to the default home page of the site. Regardless of whether any of the displayed categories are pure or combination web site categories, all occurrences of individual records within the record list of any category, other than a web site or web site combination category, are replaced with the web site category that contains those records. For web site categories so embedded within a search results list that have three or fewer records, it is possible to show the records "in-line," i.e., the individual records themselves can be shown with the category name, eliminating the need to explicitly expand the category. Records within search result categories are displayed by default in relevancy order, or, at the user's option, in reverse date order (most recent first). Web site categories within record lists are ranked by the value of the highest ranked record in the web site.
The numerical percentages, the assigned weights and the detailed rules described above are exemplary and can change without departing from the spirit and scope of the invention.
For the categorization methodology to work, each record within the database, including Internet records and premium content records, is classified by subject, type, source, and language characteristics (i.e., meta-data attributes). In other embodiments, the records can be classified by additional meta-data attributes (e.g., level of difficulty or popularity), query-based attributes, proper names, and run-time document analysis characteristics.
Because such a task is too much to do manually, the search apparatus auto-classifies substantially all of the records into the proper categories.
Every record is assigned one or more types (e.g., article, book review, letter) and a single source value (e.g., PC Week, personal web pages) via a mostly automatic process (completely automatic for Internet data), although there is some editorial assignment for certain premium content data. Also, every record is generally assigned one or more subjects (e.g., molecular biology) and languages (e.g., French) via a mostly automatic process (completely automatic for Internet data). Occasionally, a record is not assigned a subject and/or language. In such cases, the records are assigned a value of "unknown" for these particular meta-data items (or attributes, or fields, or domains). In the case of subject, language and type (but not for source), it is possible for a record to have more than one value (e.g., because it really addresses two or more different subject areas, or because it contains text in more than one language).
Further, all of the values in these domains are arranged hierarchically (e.g., "molecular biology" belongs to "biology", and "book reviews" belongs to "reviews"). Although records are automatically or manually classified to only one (or perhaps two or more) fairly specific values for a given domain (e.g., "book review" for type, "molecular biology" for subject), they inherit all the values that are higher than those values in their respective domain hierarchies. For example, a record classified to "molecular biology" is also given the subject of "biology" (the parent of molecular biology) and the subject of "science" (which is the parent of biology). This can result in an additional 5 or 6 classification values for that record. The top levels of each hierarchy are fairly small, e.g., about six subject areas for the subject domain (including humanities and society, business, etc.), each of which is divided into three or four more, making 18 at level 2, each of which is divided into about 35, making about 600 at level 3, etc.
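The inheritance of higher-level values described above can be illustrated with a minimal Python sketch. The `PARENT` table and the function names are hypothetical and cover only the "molecular biology" example from the text; the actual hierarchy storage is not described in this document.

```python
# Hypothetical parent links for a tiny slice of the subject hierarchy.
# A value whose parent is None sits at the top level.
PARENT = {
    "molecular biology": "biology",
    "biology": "science",
    "science": None,
}


def inherited_values(value, parent=PARENT):
    """Return the value followed by each of its ancestors, most specific first."""
    chain = []
    while value is not None:
        chain.append(value)
        value = parent.get(value)  # unknown values are treated as top level
    return chain


def expand_record(assigned, parent=PARENT):
    """Add inherited ancestors to the directly assigned values, without duplicates."""
    expanded = []
    for value in assigned:
        for v in inherited_values(value, parent):
            if v not in expanded:
                expanded.append(v)
    return expanded
```

Under these assumptions, a record assigned only "molecular biology" ends up carrying three subject values, which is how a handful of direct assignments can grow into the additional classification values mentioned in the text.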
A classification system has been developed for automatically determining the four data attributes (i.e., subject, type, language and source) when such attributes are not editorially available from the publisher or record source. The classification system includes two main components: (1) a query-based classification program; and (2) a set of individual programs.
The query-based classification program efficiently performs classification for selected attributes and attribute values, including 20,000+ subject terms. One query is required for each attribute and attribute value (e.g., attribute = subject, attribute value = biochemistry). The queries are executed against all of the records, and classification scores representing the strength of the match are computed for each record and query. Records are then classified to the two or three queries/attribute values for which that record has the highest classification score.
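The scoring-and-selection step can be sketched in Python as follows. The document does not specify how a classification score is computed, so the weighted term-count formula here is an assumption made only for illustration, as are the function names, the query format (a mapping from term to weight), and the `top_k` parameter.

```python
from collections import Counter


def score(record_text, query_terms):
    """Assumed scoring: sum of (query term weight x occurrences in the record)."""
    counts = Counter(record_text.lower().split())
    return sum(weight * counts[term] for term, weight in query_terms.items())


def classify(record_text, queries, top_k=2):
    """Return up to top_k attribute values whose queries score highest.

    `queries` maps an attribute value (e.g. "biochemistry") to its query,
    represented here as a dict of term -> weight. Values with a zero score
    are not assigned at all.
    """
    scored = [(score(record_text, terms), value) for value, terms in queries.items()]
    scored = [(s, v) for s, v in scored if s > 0]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda sv: (-sv[0], sv[1]))
    return [v for _, v in scored[:top_k]]
```

For example, with one query per subject value, a record mentioning "enzyme" once and "protein" twice would be classified to whichever two subjects weight those terms most heavily, and to no subject whose query terms never appear.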
The query-based classification program draws on the following sub-components: (1) a classification language for specifying classification queries; (2) a means of and sources for automatically producing classification queries used by the program; and (3) a number of manually constructed classification queries used by the program.
The means of and sources for automatically producing classification queries generates queries about 5 lines long. Each query is produced by analyzing an exemplar or model record for that attribute value (e.g., an encyclopedia article on biochemistry) and automatically extracting the most significant terms for the record. The resulting 'query' is used to match and retrieve other similar records (i.e., classifiable to the same value). Term significance is determined both by how frequent the term is within the record (i.e., more frequent equals more significant) and how infrequent it is in the particular body of exemplar records being used, e.g., the encyclopedia as a whole (i.e., less frequent equals more significant). Exactly what values of frequency/infrequency to use is empirically determined and set for each particular source of exemplar records. Multiple sources can be used. A number of related program tools have also been developed (e.g., for automatically matching encyclopedia articles to terms in the subject hierarchy).
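The term-significance idea (frequent within the exemplar record, infrequent across the body of exemplar records) can be sketched in Python as below. The specific weighting, a term count multiplied by a smoothed logarithmic inverse document frequency, is an assumption borrowed from common practice and is not stated in this document; the tokenizer, function names, and `max_terms` cutoff are likewise hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and keep purely alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]


def build_query(exemplar, corpus, max_terms=5):
    """Pick the terms most frequent in the exemplar and rarest across the corpus."""
    tf = Counter(tokenize(exemplar))
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document

    def significance(term):
        # Assumed weighting: in-record count times smoothed log inverse document frequency.
        return tf[term] * math.log((1 + n_docs) / (1 + doc_freq[term]))

    # Most significant first; ties are broken alphabetically for determinism.
    ranked = sorted(tf, key=lambda t: (-significance(t), t))
    return ranked[:max_terms]
```

A term that appears in every exemplar record receives a significance of zero under this weighting, so common words fall to the end of the ranking even when they are frequent in the exemplar itself.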
The number of manually constructed classification queries is as follows: about 2,000 such subject queries; about 50 manual type queries; and about 6 manual language queries. Manual queries average about 15 lines in length, except for language queries, which are considerably longer.
The second component of the classification system is a set of individual programs and a higher level controlling program which are used to classify data to certain particular values (e.g., recipe, or "personal web page") of one of the data attributes ("type" in the case of recipe, "source" in the case of personal pages) when the query-based approach is considered inadequate. These programs comprise several thousand lines of Perl. They look not only for the presence of certain words but also for formatting cues (e.g., the particular format of a record of type recipe, or of type interview). There is also a set of testing tools for evaluating the results of these classifications.
A classification database creates and maintains the data taxonomies, hierarchies, cross-references and associated classification queries. The database includes a multi-user classification editor, and a means to generate reports and data files needed by other system components (e.g., the search engine) and is implemented using Microsoft Access, Microsoft forms, Data Access Objects, SQL and Visual Basic. The database includes approximately 40 tables, 15 forms, 25 reports and 5,000 lines of Visual Basic, and it produces 12 intermediate files for other parts of the system.
Equivalents
While the invention has been particularly shown and described with reference to specific preferred embodiments, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for searching a database of records, comprising: searching the database, in response to a search instruction, to generate a search result list which includes a selected set of the records; and processing at least a portion of the search result list to dynamically create a set of search result categories, each search result category being associated with a subset of the records within the search result list.
2. A method according to claim 1 further comprising ranking the records within the search results list according to preselected relevancy criteria.
3. A method according to claim 1 further comprising identifying subject, type, source and language characteristics associated with each record within the search result list.
4. A method according to claim 3 further comprising: identifying common characteristics associated with the records within the search result list; grouping records having common characteristics into candidate categories; and compiling a list of candidate categories, each candidate category being representative of a possible search result category.
5. A method according to claim 4 further comprising weighting each candidate category as a function of the identified common characteristics of the records within the candidate category.
6. A method according to claim 5 further comprising selecting candidate categories as a function of the identified common characteristics of the records.
7. A method according to claim 6 further comprising selecting between about five to ten search result categories from the candidate categories.
8. A method according to claim 3 further comprising grouping the search result categories in response to a user-selected value for one of the characteristics.
9. A method according to claim 1 wherein the database includes Internet records and premium content records.
10. A method according to claim 1 further comprising providing a graphical representation of the categories.
11. A method according to claim 1 further comprising identifying meta-data characteristics associated with records within the search result list.
12. A search apparatus for searching a database of records, comprising a search processor, responsive to a search instruction, for searching the database to generate a search result list which includes a selected set of the records; and a grouping processor for processing at least a portion of the search result list to dynamically create a set of search result categories, each search result category being associated with a subset of the records within the search result list.
13. An apparatus according to claim 12 further comprising means for ranking the records within the search result list according to preselected relevancy criteria.
14. An apparatus according to claim 12 further comprising a record processor for identifying subject, type, source and language characteristics associated with each record within the search result list.
15. An apparatus according to claim 14 further comprising a candidate generator for identifying common characteristics associated with the records within the search result list to compile a list of candidate categories, each candidate category being representative of a possible search result category.
16. An apparatus according to claim 15 further comprising a weighting processor for weighting each candidate category as a function of the identified common characteristics of the records within the candidate category.
17. An apparatus according to claim 16 further comprising means for selecting between about five to ten search result categories from the candidate categories.
18. An apparatus according to claim 12 further comprising a display processor for providing a graphical representation of the categories.
19. An apparatus according to claim 13 further comprising means for grouping the records within the search result list in response to a user-selected value for one of the characteristics.
20. An apparatus according to claim 12 further comprising means for generating, as a function of one of the categories, a refine instruction being representative of an additional instruction for searching the database for records associated with the category and the additional instruction.
21. An apparatus according to claim 14 further comprising means for ranking the identified common characteristics of the records into a hierarchical order.
22. An apparatus according to claim 12 wherein the database includes Internet records and premium content records.
23. A search apparatus comprising: a database for storing a plurality of records, including Internet records and premium content records; a search processor for searching the database, in response to a search instruction from a user, to generate a search result list which includes a selected set of the records; a grouping processor for processing at least a portion of the search result list to dynamically create a set of search result categories, each search result category being associated with a subset of the records within the search result list; and a display processor for providing a graphical representation of the categories to the user.
24. A method for automatically classifying a database of records, comprising: executing a query for each attribute value associated with each of a plurality of attributes against each record in the database; determining a classification score which represents the relative strength of the match for each query and each record; classifying each record under selected attribute values for each attribute for which the record has highest classification scores.
25. The method of claim 24 further comprising arranging the attribute values for each attribute hierarchically.
PCT/US1998/008785 1997-05-01 1998-04-29 Method and apparatus for searching a database of records WO1998049637A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
DE69839604T DE69839604D1 (en) 1997-05-01 1998-04-29 METHOD AND SYSTEM FOR SEARCHING A DATABASE
CA002288745A CA2288745C (en) 1997-05-01 1998-04-29 Method and apparatus for searching a database of records
AU72717/98A AU736428B2 (en) 1997-05-01 1998-04-29 Method and apparatus for searching a database of records
EP98920069A EP0979470B1 (en) 1997-05-01 1998-04-29 Method and apparatus for searching a database of records
JP54740898A JP2001522496A (en) 1997-05-01 1998-04-29 Method and apparatus for searching data in a database

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/846,850 1997-05-01
US08/846,850 US5924090A (en) 1997-05-01 1997-05-01 Method and apparatus for searching a database of records

Publications (1)

Publication Number Publication Date
WO1998049637A1 (en) 1998-11-05

Family

ID=25299114

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/008785 WO1998049637A1 (en) 1997-05-01 1998-04-29 Method and apparatus for searching a database of records

Country Status (8)

Country Link
US (1) US5924090A (en)
EP (1) EP0979470B1 (en)
JP (4) JP2001522496A (en)
AU (1) AU736428B2 (en)
CA (1) CA2288745C (en)
DE (1) DE69839604D1 (en)
ES (1) ES2306474T3 (en)
WO (1) WO1998049637A1 (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1006467A2 (en) * 1998-11-25 2000-06-07 Canon Research Centre France S.A. Method and device for the automatic classification of sites or users of a communication network
EP1014662A2 (en) * 1998-12-23 2000-06-28 Nortel Networks Corporation Access to documents of multimedia information with keywords
WO2000054177A2 (en) * 1999-03-05 2000-09-14 Accenture Llp Method and apparatus for creating an information summary
WO2000074294A2 (en) * 1999-05-31 2000-12-07 Webnara Co., Ltd. General-purpose robot agent and real-time search method
WO2001011441A2 (en) * 1999-08-10 2001-02-15 Seung Chul Joo Content service system and method using classification diagram
WO2000067161A3 (en) * 1999-05-04 2002-06-06 Lee H Grant Method and apparatus for categorizing and retrieving network pages and sites
EP1256887A1 (en) * 2001-05-09 2002-11-13 Requisite Technology Inc. Sequential subset catalog search engine
GB2380576A (en) * 1997-09-21 2003-04-09 Microsoft Corp Stardard user interface control display for a data provider
WO2003042777A2 (en) * 2001-09-14 2003-05-22 Kent Ridge Digital Labs Method and system for personalized information management
US6584462B2 (en) 1999-09-10 2003-06-24 Requisite Technology, Inc. Sequential subset catalog search engine
WO2004012431A1 (en) * 2002-07-29 2004-02-05 British Telecommunications Public Limited Company Improvements in or relating to information provision for call centres
US6697799B1 (en) * 1999-09-10 2004-02-24 Requisite Technology, Inc. Automated classification of items using cascade searches
US6907424B1 (en) * 1999-09-10 2005-06-14 Requisite Technology, Inc. Sequential subset catalog search engine
US7043492B1 (en) 2001-07-05 2006-05-09 Requisite Technology, Inc. Automated classification of items using classification mappings
US7152064B2 (en) 2000-08-18 2006-12-19 Exalead Corporation Searching tool and process for unified search using categories and keywords
WO2007028021A2 (en) * 2005-08-31 2007-03-08 Thomson Global Resources Systems, methods, and interfaces for reducing executions of overly broad user queries
WO2007028013A1 (en) * 2005-08-31 2007-03-08 Thomson Global Resources System and method presenting search results in a topical space
AU785452B2 (en) * 2001-05-09 2007-07-12 Requisite Software, Inc. Sequential subset catalog search engine
US7734680B1 (en) 1999-09-30 2010-06-08 Koninklijke Philips Electronics N.V. Method and apparatus for realizing personalized information from multiple information sources
US10223907B2 (en) 2008-11-14 2019-03-05 Apple Inc. System and method for capturing remote control device command signals

Families Citing this family (376)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6339767B1 (en) * 1997-06-02 2002-01-15 Aurigin Systems, Inc. Using hyperbolic trees to visualize data generated by patent-centric and group-oriented data processing
US5822720A (en) 1994-02-16 1998-10-13 Sentius Corporation System amd method for linking streams of multimedia data for reference material for display
US5724571A (en) * 1995-07-07 1998-03-03 Sun Microsystems, Inc. Method and apparatus for generating query responses in a computer-based document retrieval system
US6304874B1 (en) * 1996-07-09 2001-10-16 British Telecommunications Public Limited Company Access system for distributed storage
US7903029B2 (en) 1996-09-09 2011-03-08 Tracbeam Llc Wireless location routing applications and architecture therefor
US7714778B2 (en) 1997-08-20 2010-05-11 Tracbeam Llc Wireless location gateway and applications therefor
WO1998010307A1 (en) 1996-09-09 1998-03-12 Dennis Jay Dupray Location of a mobile station
US6236365B1 (en) 1996-09-09 2001-05-22 Tracbeam, Llc Location of a mobile station using a plurality of commercial wireless infrastructures
US6249252B1 (en) 1996-09-09 2001-06-19 Tracbeam Llc Wireless location using multiple location estimators
US9134398B2 (en) 1996-09-09 2015-09-15 Tracbeam Llc Wireless location using network centric location estimators
GB2331166B (en) * 1997-11-06 2002-09-11 Ibm Database search engine
US6295533B2 (en) * 1997-02-25 2001-09-25 At&T Corp. System and method for accessing heterogeneous databases
US6167397A (en) * 1997-09-23 2000-12-26 At&T Corporation Method of clustering electronic documents in response to a search query
US6272492B1 (en) * 1997-11-21 2001-08-07 Ibm Corporation Front-end proxy for transparently increasing web server functionality
US6542888B2 (en) * 1997-11-26 2003-04-01 International Business Machines Corporation Content filtering for electronic documents generated in multiple foreign languages
US6236991B1 (en) * 1997-11-26 2001-05-22 International Business Machines Corp. Method and system for providing access for categorized information from online internet and intranet sources
US6021411A (en) * 1997-12-30 2000-02-01 International Business Machines Corporation Case-based reasoning system and method for scoring cases in a case database
US7010536B1 (en) * 1998-01-30 2006-03-07 Pattern Intelligence, Inc. System and method for creating and manipulating information containers with dynamic registers
US6032145A (en) 1998-04-10 2000-02-29 Requisite Technology, Inc. Method and system for database manipulation
US6473659B1 (en) * 1998-04-10 2002-10-29 General Electric Company System and method for integrating a plurality of diagnostic related information
US6748376B1 (en) 1998-04-10 2004-06-08 Requisite Technology, Inc. Method and system for database manipulation
US6044375A (en) * 1998-04-30 2000-03-28 Hewlett-Packard Company Automatic extraction of metadata using a neural network
US6199061B1 (en) * 1998-06-17 2001-03-06 Microsoft Corporation Method and apparatus for providing dynamic help topic titles to a user
AU4723999A (en) 1998-06-29 2000-01-17 Sbc Technology Resources, Inc. Emergency facility information system and methods
US6401118B1 (en) * 1998-06-30 2002-06-04 Online Monitoring Services Method and computer program product for an online monitoring search engine
US7602424B2 (en) * 1998-07-23 2009-10-13 Scenera Technologies, Llc Method and apparatus for automatically categorizing images in a digital camera
US20010012062A1 (en) * 1998-07-23 2001-08-09 Eric C. Anderson System and method for automatic analysis and categorization of images in an electronic imaging device
US6363377B1 (en) 1998-07-30 2002-03-26 Sarnoff Corporation Search data processor
WO2000008539A1 (en) * 1998-08-03 2000-02-17 Fish Robert D Self-evolving database and method of using same
US6035294A (en) * 1998-08-03 2000-03-07 Big Fat Fish, Inc. Wide access databases and database systems
US7272604B1 (en) * 1999-09-03 2007-09-18 Atle Hedloy Method, system and computer readable medium for addressing handling from an operating system
NO984066L (en) * 1998-09-03 2000-03-06 Arendi As Computer function button
US7496854B2 (en) * 1998-11-10 2009-02-24 Arendi Holding Limited Method, system and computer readable medium for addressing handling from a computer program
US6115709A (en) * 1998-09-18 2000-09-05 Tacit Knowledge Systems, Inc. Method and system for constructing a knowledge profile of a user having unrestricted and restricted access portions according to respective levels of confidence of content of the portions
AU5822899A (en) 1998-09-18 2000-04-10 Tacit Knowledge Systems Method and apparatus for querying a user knowledge profile
US8380875B1 (en) 1998-09-18 2013-02-19 Oracle International Corporation Method and system for addressing a communication document for transmission over a network based on the content thereof
US6871220B1 (en) 1998-10-28 2005-03-22 Yodlee, Inc. System and method for distributed storage and retrieval of personal information
ATE242511T1 (en) 1998-10-28 2003-06-15 Verticalone Corp APPARATUS AND METHOD FOR AUTOMATICALLY COMPOSING AND TRANSMITTING TRANSACTIONS CONTAINING PERSONAL ELECTRONIC INFORMATION OR DATA
US6385602B1 (en) * 1998-11-03 2002-05-07 E-Centives, Inc. Presentation of search results using dynamic categorization
US8121891B2 (en) 1998-11-12 2012-02-21 Accenture Global Services Gmbh Personalized product report
US6195651B1 (en) * 1998-11-19 2001-02-27 Andersen Consulting Properties Bv System, method and article of manufacture for a tuned user application experience
US7076504B1 (en) 1998-11-19 2006-07-11 Accenture Llp Sharing a centralized profile
US8135413B2 (en) 1998-11-24 2012-03-13 Tracbeam Llc Platform and applications for wireless location and other complex services
US6366910B1 (en) * 1998-12-07 2002-04-02 Amazon.Com, Inc. Method and system for generation of hierarchical search results
US7672879B1 (en) 1998-12-08 2010-03-02 Yodlee.Com, Inc. Interactive activity interface for managing personal data and performing transactions over a data packet network
US6517587B2 (en) * 1998-12-08 2003-02-11 Yodlee.Com, Inc. Networked architecture for enabling automated gathering of information from Web servers
US6802042B2 (en) * 1999-06-01 2004-10-05 Yodlee.Com, Inc. Method and apparatus for providing calculated and solution-oriented personalized summary-reports to a user through a single user-interface
US8069407B1 (en) 1998-12-08 2011-11-29 Yodlee.Com, Inc. Method and apparatus for detecting changes in websites and reporting results to web developers for navigation template repair purposes
US7085997B1 (en) 1998-12-08 2006-08-01 Yodlee.Com Network-based bookmark management and web-summary system
US6370527B1 (en) * 1998-12-29 2002-04-09 At&T Corp. Method and apparatus for searching distributed networks using a plurality of search devices
EP1171828A1 (en) * 1999-01-08 2002-01-16 Micro-Integration Corporation Search engine database and interface
US20060010136A1 (en) * 1999-01-28 2006-01-12 Deangelo Michael System and method for creating and manipulating information containers with dynamic registers
US6330564B1 (en) * 1999-02-10 2001-12-11 International Business Machines Corporation System and method for automated problem isolation in systems with measurements structured as a multidimensional database
US6834276B1 (en) * 1999-02-25 2004-12-21 Integrated Data Control, Inc. Database system and method for data acquisition and perusal
US7966328B2 (en) 1999-03-02 2011-06-21 Rose Blush Software Llc Patent-related tools and methodology for use in research and development projects
US7716060B2 (en) * 1999-03-02 2010-05-11 Germeraad Paul B Patent-related tools and methodology for use in the merger and acquisition process
US6924828B1 (en) * 1999-04-27 2005-08-02 Surfnotes Method and apparatus for improved information representation
US6175830B1 (en) 1999-05-20 2001-01-16 Evresearch, Ltd. Information management, retrieval and display system and associated method
US8438487B1 (en) * 1999-05-24 2013-05-07 Catherine Lin-Hendel Method and system for one-click navigation and browsing of electronic media and their category structure as well as tracking the navigation and browsing thereof
US7752535B2 (en) 1999-06-01 2010-07-06 Yodlec.com, Inc. Categorization of summarized information
US6418434B1 (en) * 1999-06-25 2002-07-09 International Business Machines Corporation Two stage automated electronic messaging system
US7181438B1 (en) 1999-07-21 2007-02-20 Alberti Anemometer, Llc Database access system
US6444072B1 (en) * 1999-08-11 2002-09-03 Southpac Trust International Process for producing holographic material
TW519836B (en) * 1999-09-24 2003-02-01 United Video Properties Inc Interactive television program guide with enhanced user interface
WO2002000316A1 (en) * 1999-09-24 2002-01-03 Goldberg Sheldon F Geographically constrained network services
US6963863B1 (en) * 1999-09-28 2005-11-08 Thomas Bannon Network query and matching system and method
US7386599B1 (en) * 1999-09-30 2008-06-10 Ricoh Co., Ltd. Methods and apparatuses for searching both external public documents and internal private documents in response to single search request
US6442555B1 (en) * 1999-10-26 2002-08-27 Hewlett-Packard Company Automatic categorization of documents using document signatures
US6772338B1 (en) 1999-10-26 2004-08-03 Ricoh Co., Ltd. Device for transfering data between an unconscious capture device and another device
US20020069134A1 (en) * 1999-11-01 2002-06-06 Neal Solomon System, method and apparatus for aggregation of cooperative intelligent agents for procurement in a distributed network
US20030074301A1 (en) * 1999-11-01 2003-04-17 Neal Solomon System, method, and apparatus for an intelligent search agent to access data in a distributed network
US20020055903A1 (en) * 1999-11-01 2002-05-09 Neal Solomon System, method, and apparatus for a cooperative communications network
US20020046157A1 (en) * 1999-11-01 2002-04-18 Neal Solomon System, method and apparatus for demand-initiated intelligent negotiation agents in a distributed network
CA2389375C (en) * 1999-11-01 2005-12-20 Lockheed Martin Corporation System and method for the storage and access of electronic data in a web-based computer system
WO2001037134A1 (en) * 1999-11-16 2001-05-25 Searchcraft Corporation Method for searching from a plurality of data sources
US7249315B2 (en) 1999-11-23 2007-07-24 John Brent Moetteli System and method of creating and following URL tours
GB9928210D0 (en) * 1999-11-29 2000-01-26 Medical Data Service Gmbh Method
CN1141638C (en) * 1999-11-30 2004-03-10 国际商业机器公司 Establish information display tactics for different display unit
US6754660B1 (en) 1999-11-30 2004-06-22 International Business Machines Corp. Arrangement of information for display into a continuum ranging from closely related to distantly related to a reference piece of information
US6556225B1 (en) 1999-11-30 2003-04-29 International Business Machines Corp. Graphical display of path through three-dimensional organization of information
US6507343B1 (en) 1999-11-30 2003-01-14 International Business Machines Corp. Arrangement of information to allow three-dimensional navigation through information displays
US6593943B1 (en) 1999-11-30 2003-07-15 International Business Machines Corp. Information grouping configuration for use with diverse display devices
US6501469B1 (en) 1999-11-30 2002-12-31 International Business Machines Corp. Arrangement of information to allow three-dimensional navigation through information displays with indication of intended starting point
US6924797B1 (en) 1999-11-30 2005-08-02 International Business Machines Corp. Arrangement of information into linear form for display on diverse display devices
KR20000012520A (en) * 1999-12-09 2000-03-06 홍윤택 Internet service method of information having scenario
DE19959850A1 (en) * 1999-12-10 2001-06-13 Deutsche Telekom Ag Communication system and method for providing Internet access by telephone
US6850906B1 (en) 1999-12-15 2005-02-01 Traderbot, Inc. Real-time financial search engine and method
US20010032189A1 (en) * 1999-12-27 2001-10-18 Powell Michael D. Method and apparatus for a cryptographically assisted commercial network system designed to facilitate idea submission, purchase and licensing and innovation transfer
EP1120722A3 (en) * 2000-01-13 2004-01-14 Applied Psychology Research Limited Method and apparatus for generating categorization data
US8019757B2 (en) * 2000-01-14 2011-09-13 Thinkstream, Inc. Distributed globally accessible information network implemented to maintain universal accessibility
US20020038348A1 (en) * 2000-01-14 2002-03-28 Malone Michael K. Distributed globally accessible information network
US20020087546A1 (en) * 2000-01-31 2002-07-04 Michael Slater Apparatus, methods, and systems for digital photo management
US6868525B1 (en) * 2000-02-01 2005-03-15 Alberti Anemometer Llc Computer graphic display visualization system and method
JP4491893B2 (en) * 2000-02-03 2010-06-30 ソニー株式会社 Information sending device, information terminal device, and information providing method
WO2001075664A1 (en) * 2000-03-31 2001-10-11 Kapow Aps Method of retrieving attributes from at least two data sources
US7262778B1 (en) 2000-02-11 2007-08-28 Sony Corporation Automatic color adjustment of a template design
WO2001059587A2 (en) 2000-02-11 2001-08-16 Kapow Aps User interface, system and method for performing a web-based transaction
AU2001241604A1 (en) * 2000-02-18 2001-08-27 Homeportfolio, Inc. Attribute tagging and matching system and method for database management
US20030149940A1 (en) * 2000-02-25 2003-08-07 Fordyce Paul Mervyn Document assembly from a database
US10002167B2 (en) 2000-02-25 2018-06-19 Vilox Technologies, Llc Search-on-the-fly/sort-on-the-fly by a search engine directed to a plurality of disparate data sources
US6760720B1 (en) 2000-02-25 2004-07-06 Pedestrian Concepts, Inc. Search-on-the-fly/sort-on-the-fly search engine for searching databases
AU2001239997A1 (en) * 2000-03-02 2001-09-12 Mmc Webreporter Systems.Com, Inc. System and method for creating a book of reports over a computer network
WO2001067351A1 (en) 2000-03-09 2001-09-13 The Web Access, Inc. Method and apparatus for performing a research task by interchangeably utilizing a multitude of search methodologies
US6311194B1 (en) * 2000-03-15 2001-10-30 Taalee, Inc. System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising
US7137067B2 (en) * 2000-03-17 2006-11-14 Fujitsu Limited Device and method for presenting news information
US20020038299A1 (en) * 2000-03-20 2002-03-28 Uri Zernik Interface for presenting information
US6633903B1 (en) 2000-03-23 2003-10-14 Monkeymedia, Inc. Method and article of manufacture for seamless integrated searching
US20020073079A1 (en) * 2000-04-04 2002-06-13 Merijn Terheggen Method and apparatus for searching a database and providing relevance feedback
US6760721B1 (en) * 2000-04-14 2004-07-06 Realnetworks, Inc. System and method of managing metadata data
US6654749B1 (en) 2000-05-12 2003-11-25 Choice Media, Inc. Method and system for searching indexed information databases with automatic user registration via a communication network
US6567805B1 (en) 2000-05-15 2003-05-20 International Business Machines Corporation Interactive automated response system
US6879332B2 (en) * 2000-05-16 2005-04-12 Groxis, Inc. User interface for displaying and exploring hierarchical information
KR20010104871A (en) * 2000-05-16 2001-11-28 임갑철 System for internet site search service having a function of automatic sorting of search results
US6704729B1 (en) * 2000-05-19 2004-03-09 Microsoft Corporation Retrieval of relevant information categories
US6876997B1 (en) 2000-05-22 2005-04-05 Overture Services, Inc. Method and apparatus for indentifying related searches in a database search system
US8082355B1 (en) 2000-05-26 2011-12-20 Thomson Licensing Internet multimedia advertisement insertion architecture
US10641861B2 (en) 2000-06-02 2020-05-05 Dennis J. Dupray Services and applications for a communications network
US9875492B2 (en) 2001-05-22 2018-01-23 Dennis J. Dupray Real estate transaction system
US10684350B2 (en) 2000-06-02 2020-06-16 Tracbeam Llc Services and applications for a communications network
EP1299821A1 (en) 2000-06-09 2003-04-09 Thanh Ngoc Nguyen Method and apparatus for data collection and knowledge management
US20020091836A1 (en) * 2000-06-24 2002-07-11 Moetteli John Brent Browsing method for focusing research
US20030120653A1 (en) * 2000-07-05 2003-06-26 Sean Brady Trainable internet search engine and methods of using
US6625595B1 (en) * 2000-07-05 2003-09-23 Bellsouth Intellectual Property Corporation Method and system for selectively presenting database results in an information retrieval system
US7490092B2 (en) * 2000-07-06 2009-02-10 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US6618717B1 (en) 2000-07-31 2003-09-09 Eliyon Technologies Corporation Computer method and apparatus for determining content owner of a website
US20070027672A1 (en) * 2000-07-31 2007-02-01 Michel Decary Computer method and apparatus for extracting data from web pages
WO2002013047A2 (en) * 2000-08-04 2002-02-14 Athenahealth, Inc. Practice management and billing automation system
US7047229B2 (en) 2000-08-08 2006-05-16 America Online, Inc. Searching content on web pages
US7225180B2 (en) * 2000-08-08 2007-05-29 Aol Llc Filtering search results
US7359951B2 (en) 2000-08-08 2008-04-15 Aol Llc, A Delaware Limited Liability Company Displaying search results
DE10038494A1 (en) * 2000-08-08 2002-02-21 Abb Patent Gmbh Data processing method for equipment monitoring uses autocorrelation function to identify noise corruption
US7007008B2 (en) 2000-08-08 2006-02-28 America Online, Inc. Category searching
US20020026386A1 (en) * 2000-08-17 2002-02-28 Walden John C. Personalized storage folder & associated site-within-a-site web site
JP2002063180A (en) 2000-08-17 2002-02-28 Tsubasa System Co Ltd Method for retrieving used car retrieval support system
EP1182580A1 (en) * 2000-08-23 2002-02-27 Matsushita Electric Industrial Co., Ltd. Document retrieval and classification method and apparatus
KR100407081B1 (en) * 2000-08-24 2003-11-28 마쯔시다덴기산교 가부시키가이샤 Document retrieval and classification method and apparatus
IL140241A (en) * 2000-12-11 2007-02-11 Celebros Ltd Interactive searching system and method
US7016853B1 (en) 2000-09-20 2006-03-21 Openhike, Inc. Method and system for resume storage and retrieval
US20020040311A1 (en) * 2000-10-04 2002-04-04 John Douglass Web browser page rating system
US20020040384A1 (en) * 2000-10-04 2002-04-04 John Moetteli Communication method using customisable banners
US7233942B2 (en) * 2000-10-10 2007-06-19 Truelocal Inc. Method and apparatus for providing geographically authenticated electronic documents
US6804662B1 (en) * 2000-10-27 2004-10-12 Plumtree Software, Inc. Method and apparatus for query and analysis
US6711570B1 (en) 2000-10-31 2004-03-23 Tacit Knowledge Systems, Inc. System and method for matching terms contained in an electronic document with a set of user profiles
US6678694B1 (en) 2000-11-08 2004-01-13 Frank Meik Indexed, extensible, interactive document retrieval system
AU2002220172A1 (en) * 2000-11-15 2002-05-27 David M. Holbrook Apparatus and method for organizing and/or presenting data
US20070226640A1 (en) * 2000-11-15 2007-09-27 Holbrook David M Apparatus and methods for organizing and/or presenting data
US7444660B2 (en) 2000-11-16 2008-10-28 Meevee, Inc. System and method for generating metadata for video programming events
EP1346559A4 (en) * 2000-11-16 2006-02-01 Mydtv Inc System and methods for determining the desirability of video programming events
US20020083468A1 (en) * 2000-11-16 2002-06-27 Dudkiewicz Gil Gavriel System and method for generating metadata for segments of a video program
US20020152463A1 (en) * 2000-11-16 2002-10-17 Dudkiewicz Gil Gavriel System and method for personalized presentation of video programming events
US7016892B1 (en) * 2000-11-17 2006-03-21 Cnet Networks, Inc. Apparatus and method for delivering information over a network
US7062705B1 (en) 2000-11-20 2006-06-13 Cisco Technology, Inc. Techniques for forming electronic documents comprising multiple information types
US6983288B1 (en) 2000-11-20 2006-01-03 Cisco Technology, Inc. Multiple layer information object repository
US7103607B1 (en) 2000-11-20 2006-09-05 Cisco Technology, Inc. Business vocabulary data retrieval using alternative forms
US7139973B1 (en) * 2000-11-20 2006-11-21 Cisco Technology, Inc. Dynamic information object cache approach useful in a vocabulary retrieval system
US7007018B1 (en) 2000-11-20 2006-02-28 Cisco Technology, Inc. Business vocabulary data storage using multiple inter-related hierarchies
US6594670B1 (en) 2000-12-22 2003-07-15 Mathias Genser System and method for organizing search criteria match results
US6647396B2 (en) * 2000-12-28 2003-11-11 Trilogy Development Group, Inc. Classification based content management system
US20020087532A1 (en) * 2000-12-29 2002-07-04 Steven Barritz Cooperative, interactive, heuristic system for the creation and ongoing modification of categorization systems
US7174453B2 (en) 2000-12-29 2007-02-06 America Online, Inc. Message screening system
US6928433B2 (en) * 2001-01-05 2005-08-09 Creative Technology Ltd Automatic hierarchical categorization of music by metadata
US20040111386A1 (en) * 2001-01-08 2004-06-10 Goldberg Jonathan M. Knowledge neighborhoods
US7685224B2 (en) * 2001-01-11 2010-03-23 Truelocal Inc. Method for providing an attribute bounded network of computers
US6983270B2 (en) 2001-01-24 2006-01-03 Andreas Rippich Method and apparatus for displaying database search results
US20020107707A1 (en) * 2001-02-05 2002-08-08 Imagepaths.Com Llc System and method for providing personalized health interventions over a computer network
US7139762B2 (en) * 2001-02-27 2006-11-21 Microsoft Corporation System and method for filtering database records
US6938046B2 (en) * 2001-03-02 2005-08-30 Dow Jones Reuters Business Interactive, Llp Polyarchical data indexing and automatically generated hierarchical data indexing paths
US7158971B1 (en) 2001-03-07 2007-01-02 Thomas Layne Bascom Method for searching document objects on a network
US7386792B1 (en) 2001-03-07 2008-06-10 Thomas Layne Bascom System and method for collecting, storing, managing and providing categorized information related to a document object
US20030018659A1 (en) * 2001-03-14 2003-01-23 Lingomotors, Inc. Category-based selections in an information access environment
US7085753B2 (en) * 2001-03-22 2006-08-01 E-Nvent Usa Inc. Method and system for mapping and searching the Internet and displaying the results in a visual form
JP2002297185A (en) * 2001-03-29 2002-10-11 Pioneer Electronic Corp Device and method for information processing
US20020194161A1 (en) * 2001-04-12 2002-12-19 Mcnamee J. Paul Directed web crawler with machine learning
US6957206B2 (en) 2001-04-19 2005-10-18 Quantum Dynamics, Inc. Computer system and method with adaptive N-level structures for automated generation of program solutions based on rules input by subject matter experts
US20020169770A1 (en) * 2001-04-27 2002-11-14 Kim Brian Seong-Gon Apparatus and method that categorize a collection of documents into a hierarchy of categories that are defined by the collection of documents
US7536413B1 (en) 2001-05-07 2009-05-19 Ixreveal, Inc. Concept-based categorization of unstructured objects
US7627588B1 (en) 2001-05-07 2009-12-01 Ixreveal, Inc. System and method for concept based analysis of unstructured data
US7194483B1 (en) 2001-05-07 2007-03-20 Intelligenxia, Inc. Method, system, and computer program product for concept-based multi-dimensional analysis of unstructured information
USRE46973E1 (en) 2001-05-07 2018-07-31 Ureveal, Inc. Method, system, and computer program product for concept-based multi-dimensional analysis of unstructured information
US6920448B2 (en) 2001-05-09 2005-07-19 Agilent Technologies, Inc. Domain specific knowledge-based metasearch system and methods of using
US7519605B2 (en) 2001-05-09 2009-04-14 Agilent Technologies, Inc. Systems, methods and computer readable media for performing a domain-specific metasearch, and visualizing search results therefrom
US8082096B2 (en) 2001-05-22 2011-12-20 Tracbeam Llc Wireless location routing applications and architecture therefor
US7765378B1 (en) * 2001-06-01 2010-07-27 Sanbolic, Inc. Utilization of memory storage
US7840634B2 (en) * 2001-06-26 2010-11-23 Eastman Kodak Company System and method for managing images over a communication network
JP3907161B2 (en) * 2001-06-29 2007-04-18 インターナショナル・ビジネス・マシーンズ・コーポレーション Keyword search method, keyword search terminal, computer program
WO2003005235A1 (en) * 2001-07-04 2003-01-16 Cogisum Intermedia Ag Category based, extensible and interactive system for document retrieval
US7130861B2 (en) 2001-08-16 2006-10-31 Sentius International Corporation Automated creation and delivery of database content
US6947947B2 (en) * 2001-08-17 2005-09-20 Universal Business Matrix Llc Method for adding metadata to data
US20030050967A1 (en) * 2001-09-11 2003-03-13 Bentley William F. Apparatus and method for optimal selection of IP modules for design integration
US20030061013A1 (en) * 2001-09-11 2003-03-27 Bentley William F. Optimal selection of IP modules for design integration
US20030061161A1 (en) * 2001-09-21 2003-03-27 Black Daniel A. Business method for facilitating offsetting payables against receivables
AUPR796801A0 (en) * 2001-09-27 2001-10-25 Plugged In Communications Pty Ltd Computer user interface tool for navigation of data stored in directed graphs
AUPR796701A0 (en) * 2001-09-27 2001-10-25 Plugged In Communications Pty Ltd Database query system and method
EP1459530A4 (en) * 2001-11-16 2005-06-08 Mydtv Inc Systems and methods relating to determining the desirability of and recording programming events
JP2003157376A (en) * 2001-11-21 2003-05-30 Ricoh Co Ltd Network system, identification information management method, server device, program and recording medium
US7134082B1 (en) 2001-12-04 2006-11-07 Louisiana Tech University Research Foundation As A Division Of The Louisiana Tech University Foundation Method and apparatus for individualizing and updating a directory of computer files
DE10160607A1 (en) * 2001-12-10 2003-06-26 Oce Printing Systems Gmbh Production of printed document such as newspaper, from multiple files containing page data, by creating cluster file from associated input files and storing in memory before transmission to printer
US7315848B2 (en) 2001-12-12 2008-01-01 Aaron Pearse Web snippets capture, storage and retrieval system and method
US8589413B1 (en) 2002-03-01 2013-11-19 Ixreveal, Inc. Concept-based method and system for dynamically analyzing results from search engines
US6910037B2 (en) * 2002-03-07 2005-06-21 Koninklijke Philips Electronics N.V. Method and apparatus for providing search results in response to an information search request
EP1345132A1 (en) * 2002-03-14 2003-09-17 Hewlett-Packard Company Process for storing and retrieving a document within a knowledge base of documents
US20040078225A1 (en) * 2002-03-18 2004-04-22 Merck & Co., Inc. Computer assisted and/or implemented process and system for managing and/or providing continuing healthcare education status and activities
JP4352653B2 (en) * 2002-04-12 2009-10-28 三菱電機株式会社 Video content management system
US8127217B2 (en) 2002-04-19 2012-02-28 Kabushiki Kaisha Toshiba Document management system for transferring a plurality of documents
US20050141028A1 (en) * 2002-04-19 2005-06-30 Toshiba Corporation And Toshiba Tec Kabushiki Kaisha Document management system for automating operations performed on documents in data storage areas
US7376709B1 (en) * 2002-05-09 2008-05-20 Proquest Method for creating durable web-enabled uniform resource locator links
US7080059B1 (en) 2002-05-13 2006-07-18 Quasm Corporation Search and presentation engine
US7231395B2 (en) 2002-05-24 2007-06-12 Overture Services, Inc. Method and apparatus for categorizing and presenting documents of a distributed database
US8260786B2 (en) * 2002-05-24 2012-09-04 Yahoo! Inc. Method and apparatus for categorizing and presenting documents of a distributed database
US20040002993A1 (en) * 2002-06-26 2004-01-01 Microsoft Corporation User feedback processing of metadata associated with digital media files
US7136866B2 (en) * 2002-08-15 2006-11-14 Microsoft Corporation Media identifier registry
US8335779B2 (en) * 2002-08-16 2012-12-18 Gamroe Applications, Llc Method and apparatus for gathering, categorizing and parameterizing data
US20040049514A1 (en) * 2002-09-11 2004-03-11 Sergei Burkov System and method of searching data utilizing automatic categorization
US7076484B2 (en) * 2002-09-16 2006-07-11 International Business Machines Corporation Automated research engine
US6829599B2 (en) * 2002-10-02 2004-12-07 Xerox Corporation System and method for improving answer relevance in meta-search engines
US20040068514A1 (en) * 2002-10-04 2004-04-08 Parvathi Chundi System and method for biotechnology information access and data analysis
US20040083213A1 (en) * 2002-10-25 2004-04-29 Yuh-Cherng Wu Solution search
US7437703B2 (en) * 2002-10-25 2008-10-14 Sap Ag Enterprise multi-agent software system with services able to call multiple engines and scheduling capability
US6944612B2 (en) 2002-11-13 2005-09-13 Xerox Corporation Structured contextual clustering method and system in a federated search engine
US9805373B1 (en) 2002-11-19 2017-10-31 Oracle International Corporation Expertise services platform
US7640336B1 (en) 2002-12-30 2009-12-29 Aol Llc Supervising user interaction with online services
JP2004234157A (en) * 2003-01-29 2004-08-19 Sony Corp Information processor and method, and computer program
US20050149507A1 (en) * 2003-02-05 2005-07-07 Nye Timothy G. Systems and methods for identifying an internet resource address
US8117130B2 (en) * 2003-02-25 2012-02-14 Stragent, Llc Batch loading and self-registration of digital media files
US7945567B2 (en) * 2003-03-17 2011-05-17 Hewlett-Packard Development Company, L.P. Storing and/or retrieving a document within a knowledge base or document repository
US20040186833A1 (en) * 2003-03-19 2004-09-23 The United States Of America As Represented By The Secretary Of The Army Requirements-based knowledge discovery for technology management
US7823077B2 (en) 2003-03-24 2010-10-26 Microsoft Corporation System and method for user modification of metadata in a shell browser
US7769794B2 (en) 2003-03-24 2010-08-03 Microsoft Corporation User interface for a file system shell
US7627552B2 (en) 2003-03-27 2009-12-01 Microsoft Corporation System and method for filtering and organizing items based on common elements
US7240292B2 (en) 2003-04-17 2007-07-03 Microsoft Corporation Virtual address bar user interface control
US7421438B2 (en) 2004-04-29 2008-09-02 Microsoft Corporation Metadata editing control
US8533840B2 (en) * 2003-03-25 2013-09-10 DigitalDoors, Inc. Method and system of quantifying risk
US7925682B2 (en) * 2003-03-27 2011-04-12 Microsoft Corporation System and method utilizing virtual folders
US7523095B2 (en) * 2003-04-29 2009-04-21 International Business Machines Corporation System and method for generating refinement categories for a set of search results
US20040230564A1 (en) * 2003-05-16 2004-11-18 Horatiu Simon Filtering algorithm for information retrieval systems
US20040243536A1 (en) * 2003-05-28 2004-12-02 Integrated Data Control, Inc. Information capturing, indexing, and authentication system
US7729990B2 (en) * 2003-05-28 2010-06-01 Stephen Michael Marceau Check image access system
US20040243627A1 (en) * 2003-05-28 2004-12-02 Integrated Data Control, Inc. Chat stream information capturing and indexing system
US20040243494A1 (en) * 2003-05-28 2004-12-02 Integrated Data Control, Inc. Financial transaction information capturing and indexing system
US7403939B1 (en) 2003-05-30 2008-07-22 Aol Llc Resolving queries based on automatic determination of requestor geographic location
US7613687B2 (en) * 2003-05-30 2009-11-03 Truelocal Inc. Systems and methods for enhancing web-based searching
US7206780B2 (en) * 2003-06-27 2007-04-17 Sbc Knowledge Ventures, L.P. Relevance value for each category of a particular search result in the ranked list is estimated based on its rank and actual relevance values
US7617203B2 (en) * 2003-08-01 2009-11-10 Yahoo! Inc Listings optimization using a plurality of data sources
US8473532B1 (en) 2003-08-12 2013-06-25 Louisiana Tech University Research Foundation Method and apparatus for automatic organization for computer files
US20050044060A1 (en) * 2003-08-18 2005-02-24 Yuh-Cherng Wu Filtering process for information retrieval systems
US7644065B2 (en) * 2003-08-18 2010-01-05 Sap Aktiengesellschaft Process of performing an index search
US20050071310A1 (en) * 2003-09-30 2005-03-31 Nadav Eiron System, method, and computer program product for identifying multi-page documents in hypertext collections
US20050080770A1 (en) * 2003-10-14 2005-04-14 Microsoft Corporation System and process for presenting search results in a tree format
US8024335B2 (en) 2004-05-03 2011-09-20 Microsoft Corporation System and method for dynamically generating a selectable search extension
US7346494B2 (en) * 2003-10-31 2008-03-18 International Business Machines Corporation Document summarization based on topicality and specificity
US20050132305A1 (en) * 2003-12-12 2005-06-16 Guichard Robert D. Electronic information access systems, methods for creation and related commercial models
NL1025129C2 (en) * 2003-12-24 2005-07-04 Split Vision Systemen B V Method, computer system, computer program and computer program product for storing and recovering data files in a data memory.
US8706686B2 (en) * 2003-12-24 2014-04-22 Split-Vision Kennis B.V. Method, computer system, computer program and computer program product for storage and retrieval of data files in a data storage means
US7447678B2 (en) 2003-12-31 2008-11-04 Google Inc. Interface for a universal search engine
US8121997B2 (en) * 2004-02-09 2012-02-21 Limelight Networks, Inc. Universal search engine
US20050177555A1 (en) * 2004-02-11 2005-08-11 Alpert Sherman R. System and method for providing information on a set of search returned documents
RU2006133549A (en) * 2004-02-20 2008-05-20 Dow Jones Reuters Business Interactive, LLC (US) System and method of intellectual search and sample
US8595146B1 (en) 2004-03-15 2013-11-26 Aol Inc. Social networking permissions
WO2005111868A2 (en) * 2004-05-03 2005-11-24 Microsoft Corporation System and method for dynamically generating a selectable search extension
US20050250439A1 (en) * 2004-05-06 2005-11-10 Garthen Leslie Book radio system
US20050248453A1 (en) * 2004-05-10 2005-11-10 Fechter Cary E Multiple deterrent, emergency response and localization system and method
US20050261962A1 (en) * 2004-05-18 2005-11-24 Khai Gan Chuah Anonymous page recognition
WO2005114508A1 (en) * 2004-05-21 2005-12-01 Computer Associates Think, Inc. Maintaining a history of query results
US8370166B2 (en) * 2004-06-15 2013-02-05 Sap Aktiengesellschaft Script-based information retrieval
US7562069B1 (en) 2004-07-01 2009-07-14 Aol Llc Query disambiguation
US7519595B2 (en) * 2004-07-14 2009-04-14 Microsoft Corporation Method and system for adaptive categorial presentation of search results
US7698333B2 (en) 2004-07-22 2010-04-13 Factiva, Inc. Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module
US7483918B2 (en) 2004-08-10 2009-01-27 Microsoft Corporation Dynamic physical database design
US7567962B2 (en) * 2004-08-13 2009-07-28 Microsoft Corporation Generating a labeled hierarchy of mutually disjoint categories from a set of query results
US7516149B2 (en) 2004-08-30 2009-04-07 Microsoft Corporation Robust detector of fuzzy duplicates
US7321889B2 (en) * 2004-09-10 2008-01-22 Suggestica, Inc. Authoring and managing personalized searchable link collections
US7493301B2 (en) 2004-09-10 2009-02-17 Suggestica, Inc. Creating and sharing collections of links for conducting a search directed by a hierarchy-free set of topics, and a user interface therefor
US20060059135A1 (en) * 2004-09-10 2006-03-16 Eran Palmon Conducting a search directed by a hierarchy-free set of topics
WO2006033055A2 (en) * 2004-09-21 2006-03-30 Koninklijke Philips Electronics N.V. Method of providing compliance information
WO2006035196A1 (en) * 2004-09-30 2006-04-06 British Telecommunications Public Limited Company Information retrieval
US7707209B2 (en) * 2004-11-25 2010-04-27 Kabushiki Kaisha Square Enix Retrieval method for contents to be selection candidates for user
US8996486B2 (en) * 2004-12-15 2015-03-31 Applied Invention, Llc Data store with lock-free stateless paging capability
US7774308B2 (en) * 2004-12-15 2010-08-10 Applied Minds, Inc. Anti-item for deletion of content in a distributed datastore
US7590635B2 (en) * 2004-12-15 2009-09-15 Applied Minds, Inc. Distributed data store with an orderstamp to ensure progress
US11321408B2 (en) 2004-12-15 2022-05-03 Applied Invention, Llc Data store with lock-free stateless paging capacity
US8275804B2 (en) 2004-12-15 2012-09-25 Applied Minds, Llc Distributed data store with a designated master to ensure consistency
US7349896B2 (en) * 2004-12-29 2008-03-25 Aol Llc Query routing
US7272597B2 (en) 2004-12-29 2007-09-18 Aol Llc Domain expert search
US7571157B2 (en) 2004-12-29 2009-08-04 Aol Llc Filtering search results
US7818314B2 (en) 2004-12-29 2010-10-19 Aol Inc. Search fusion
US7418410B2 (en) 2005-01-07 2008-08-26 Nicholas Caiafa Methods and apparatus for anonymously requesting bids from a customer specified quantity of local vendors with automatic geographic expansion
GB0502259D0 (en) * 2005-02-03 2005-03-09 British Telecomm Document searching tool and method
US8423541B1 (en) 2005-03-31 2013-04-16 Google Inc. Using saved search results for quality feedback
US8195646B2 (en) 2005-04-22 2012-06-05 Microsoft Corporation Systems, methods, and user interfaces for storing, searching, navigating, and retrieving electronic information
US7734644B2 (en) * 2005-05-06 2010-06-08 Seaton Gras System and method for hierarchical information retrieval from a coded collection of relational data
US7665028B2 (en) 2005-07-13 2010-02-16 Microsoft Corporation Rich drag drop user interface
WO2007011140A1 (en) * 2005-07-15 2007-01-25 Chutnoon Inc. Method of extracting topics and issues and method and apparatus for providing search results based on topics and issues
KR100645614B1 (en) * 2005-07-15 2006-11-14 (주)첫눈 Search method and apparatus considering a worth of information
EP1952280B8 (en) * 2005-10-11 2016-11-30 Ureveal, Inc. System, method & computer program product for concept based searching & analysis
GB2431742A (en) * 2005-10-27 2007-05-02 Hewlett Packard Development Co A method of retrieving data from a data repository
US7707506B2 (en) * 2005-12-28 2010-04-27 Sap Ag Breadcrumb with alternative restriction traversal
US7676485B2 (en) * 2006-01-20 2010-03-09 Ixreveal, Inc. Method and computer program product for converting ontologies into concept semantic networks
US8386469B2 (en) * 2006-02-16 2013-02-26 Mobile Content Networks, Inc. Method and system for determining relevant sources, querying and merging results from multiple content sources
US7917511B2 (en) * 2006-03-20 2011-03-29 Cannon Structures, Inc. Query system using iterative grouping and narrowing of query results
US8880569B2 (en) * 2006-04-17 2014-11-04 Teradata Us, Inc. Graphical user interfaces for custom lists and labels
US8793244B2 (en) * 2006-04-17 2014-07-29 Teradata Us, Inc. Data store list generation and management
AU2007253724A1 (en) * 2006-05-19 2007-11-29 Jorn Lyseggen Source search engine
US20070291923A1 (en) * 2006-06-19 2007-12-20 Amy Hsieh Method and apparatus for the purchase, sale and facilitation of voice over internet protocol (VoIP) consultations
US7991769B2 (en) * 2006-07-07 2011-08-02 Yahoo! Inc. System and method for budgeted generalization search in hierarchies
US20080010250A1 (en) * 2006-07-07 2008-01-10 Yahoo! Inc. System and method for generalization search in hierarchies
US20080059485A1 (en) * 2006-08-23 2008-03-06 Finn James P Systems and methods for entering and retrieving data
US7606752B2 (en) 2006-09-07 2009-10-20 Yodlee Inc. Host exchange in bill paying services
CN101150529B (en) * 2006-09-21 2011-07-27 腾讯科技(深圳)有限公司 A method and system for mail search
US7644068B2 (en) * 2006-10-06 2010-01-05 International Business Machines Corporation Selecting records from a list with privacy protections
US7739247B2 (en) * 2006-12-28 2010-06-15 Ebay Inc. Multi-pass data organization and automatic naming
KR100934989B1 (en) 2007-01-31 2009-12-31 삼성전자주식회사 Content management method and apparatus
US8220023B2 (en) 2007-02-21 2012-07-10 Nds Limited Method for content presentation
US7705847B2 (en) 2007-03-05 2010-04-27 Oracle International Corporation Graph selection method
US8326852B2 (en) * 2007-03-13 2012-12-04 International Business Machines Corporation Determining query entities for an abstract database from a physical database table
US20080243777A1 (en) * 2007-03-29 2008-10-02 Osamuyimen Thompson Stewart Systems and methods for results list navigation using semantic componential-gradient processing techniques
US8949214B1 (en) * 2007-04-24 2015-02-03 Wal-Mart Stores, Inc. Mashup platform
US8019760B2 (en) * 2007-07-09 2011-09-13 Vivisimo, Inc. Clustering system and method
US20090177648A1 (en) * 2007-09-15 2009-07-09 Bond Andrew R Systems and methods for organizing and managing trusted health care reference information
US8600966B2 (en) * 2007-09-20 2013-12-03 Hal Kravcik Internet data mining method and system
KR20090033728A (en) * 2007-10-01 2009-04-06 삼성전자주식회사 Method and apparatus for providing content summary information
US7877344B2 (en) * 2007-10-10 2011-01-25 Northern Light Group, Llc Method and apparatus for extracting meaning from documents using a meaning taxonomy comprising syntactic structures
US9529974B2 (en) 2008-02-25 2016-12-27 Georgetown University System and method for detecting, collecting, analyzing, and communicating event-related information
US9746985B1 (en) 2008-02-25 2017-08-29 Georgetown University System and method for detecting, collecting, analyzing, and communicating event-related information
US8881040B2 (en) 2008-08-28 2014-11-04 Georgetown University System and method for detecting, collecting, analyzing, and communicating event-related information
US9489495B2 (en) 2008-02-25 2016-11-08 Georgetown University System and method for detecting, collecting, analyzing, and communicating event-related information
US8205157B2 (en) * 2008-03-04 2012-06-19 Apple Inc. Methods and graphical user interfaces for conducting searches on a portable multifunction device
US7711622B2 (en) 2008-03-05 2010-05-04 Stephen M Marceau Financial statement and transaction image delivery and access system
US8261334B2 (en) 2008-04-25 2012-09-04 Yodlee Inc. System for performing web authentication of a user by proxy
US9317599B2 (en) * 2008-09-19 2016-04-19 Nokia Technologies Oy Method, apparatus and computer program product for providing relevance indication
US9442933B2 (en) * 2008-12-24 2016-09-13 Comcast Interactive Media, Llc Identification of segments within audio, video, and multimedia items
US8713016B2 (en) 2008-12-24 2014-04-29 Comcast Interactive Media, Llc Method and apparatus for organizing segments of media assets and determining relevance of segments to a query
US11531668B2 (en) * 2008-12-29 2022-12-20 Comcast Interactive Media, Llc Merging of multiple data sets
US8555359B2 (en) 2009-02-26 2013-10-08 Yodlee, Inc. System and methods for automatically accessing a web site on behalf of a client
US8176043B2 (en) 2009-03-12 2012-05-08 Comcast Interactive Media, Llc Ranking search results
US8589374B2 (en) 2009-03-16 2013-11-19 Apple Inc. Multifunction device with integrated search and application selection
US9245243B2 (en) 2009-04-14 2016-01-26 Ureveal, Inc. Concept-based analysis of structured and unstructured data using concept inheritance
US8533223B2 (en) 2009-05-12 2013-09-10 Comcast Interactive Media, LLC. Disambiguation and tagging of entities
US9892730B2 (en) 2009-07-01 2018-02-13 Comcast Interactive Media, Llc Generating topic-specific language models
JP5635247B2 (en) * 2009-08-20 2014-12-03 富士通株式会社 Multi-chip module
US8694505B2 (en) 2009-09-04 2014-04-08 Microsoft Corporation Table of contents for search query refinement
US8954893B2 (en) * 2009-11-06 2015-02-10 Hewlett-Packard Development Company, L.P. Visually representing a hierarchy of category nodes
US8650195B2 (en) * 2010-03-26 2014-02-11 Palle M Pedersen Region based information retrieval system
US8452765B2 (en) * 2010-04-23 2013-05-28 Eye Level Holdings, Llc System and method of controlling interactive communication services by responding to user query with relevant information from content specific database
US10713312B2 (en) 2010-06-11 2020-07-14 Doat Media Ltd. System and method for context-launching of applications
US9069443B2 (en) 2010-06-11 2015-06-30 Doat Media Ltd. Method for dynamically displaying a personalized home screen on a user device
WO2011156605A2 (en) 2010-06-11 2011-12-15 Doat Media Ltd. A system and methods thereof for enhancing a user's search experience
US8423555B2 (en) 2010-07-09 2013-04-16 Comcast Cable Communications, Llc Automatic segmentation of video
US9538493B2 (en) 2010-08-23 2017-01-03 Finetrak, Llc Locating a mobile station and applications therefor
US8838582B2 (en) 2011-02-08 2014-09-16 Apple Inc. Faceted search results
WO2012112149A1 (en) * 2011-02-16 2012-08-23 Hewlett-Packard Development Company, L.P. Population category hierarchies
US9858342B2 (en) 2011-03-28 2018-01-02 Doat Media Ltd. Method and system for searching for applications respective of a connectivity mode of a user device
US9235620B2 (en) 2012-08-14 2016-01-12 Amadeus S.A.S. Updating cached database query results
EP2541473A1 (en) 2011-06-27 2013-01-02 Amadeus S.A.S. Method and system for a pre-shopping reservation system with increased search efficiency
US20130073586A1 (en) * 2011-05-02 2013-03-21 Amadeus S.A.S. Database system using batch-oriented computation
US10467289B2 (en) 2011-08-02 2019-11-05 Comcast Cable Communications, Llc Segmentation of video according to narrative theme
US8843469B2 (en) 2011-08-04 2014-09-23 International Business Machines Corporation Faceted and selectable tabs within ephemeral search results
US8504561B2 (en) 2011-09-02 2013-08-06 Microsoft Corporation Using domain intent to provide more search results that correspond to a domain
US20130290324A1 (en) * 2012-04-26 2013-10-31 Amadeus S.A.S. Categorizing and ranking travel-related database query results
EP2657893A1 (en) * 2012-04-26 2013-10-30 Amadeus S.A.S. System and method of categorizing and ranking travel option search results
AU2012378631A1 (en) * 2012-04-26 2014-11-13 Amadeus S.A.S. Database system using batch-oriented computation
US20140280042A1 (en) * 2013-03-13 2014-09-18 Sap Ag Query processing system including data classification
CN104462113B (en) * 2013-09-17 2018-10-23 腾讯科技(深圳)有限公司 Searching method, device and electronic equipment
US9443015B1 (en) * 2013-10-31 2016-09-13 Allscripts Software, Llc Automatic disambiguation assistance for similar items in a set
US10019520B1 (en) * 2013-12-13 2018-07-10 Joy Sargis Muske System and process for using artificial intelligence to provide context-relevant search engine results
WO2015175548A1 (en) * 2014-05-12 2015-11-19 Diffeo, Inc. Entity-centric knowledge discovery
US9659106B2 (en) * 2014-06-19 2017-05-23 Go Daddy Operating Company, LLC Software application customized for target market
US9575961B2 (en) 2014-08-28 2017-02-21 Northern Light Group, Llc Systems and methods for analyzing document coverage
US11544306B2 (en) 2015-09-22 2023-01-03 Northern Light Group, Llc System and method for concept-based search summaries
US11886477B2 (en) 2015-09-22 2024-01-30 Northern Light Group, Llc System and method for quote-based search summaries
US10534783B1 (en) 2016-02-08 2020-01-14 Microstrategy Incorporated Enterprise search
US10224026B2 (en) * 2016-03-15 2019-03-05 Sony Corporation Electronic device, system, method and computer program
US11226946B2 (en) 2016-04-13 2022-01-18 Northern Light Group, Llc Systems and methods for automatically determining a performance index
JP6533876B2 (en) * 2016-10-13 2019-06-19 楽天株式会社 Product information display system, product information display method, and program
US11106741B2 (en) 2017-06-06 2021-08-31 Salesforce.Com, Inc. Knowledge operating system
JP7067109B2 (en) * 2018-02-20 2022-05-16 村田機械株式会社 Data management system
CN113157996B (en) * 2020-01-23 2022-09-16 久瓴(上海)智能科技有限公司 Document information processing method and device, computer equipment and readable storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5463773A (en) * 1992-05-25 1995-10-31 Fujitsu Limited Building of a document classification tree by recursive optimization of keyword selection function

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5758257A (en) * 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US5721902A (en) * 1995-09-15 1998-02-24 Infonautics Corporation Restricted expansion of query terms using part of speech tagging
US5640553A (en) * 1995-09-15 1997-06-17 Infonautics Corporation Relevance normalization for documents retrieved from an information retrieval system in response to a query
US5742816A (en) * 1995-09-15 1998-04-21 Infonautics Corporation Method and apparatus for identifying textual documents and multi-media files corresponding to a search topic
US5737734A (en) * 1995-09-15 1998-04-07 Infonautics Corporation Query word relevance adjustment in a search of an information retrieval system
US5717914A (en) * 1995-09-15 1998-02-10 Infonautics Corporation Method for categorizing documents into subjects using relevance normalization for documents retrieved from an information retrieval system in response to a query
US5675788A (en) * 1995-09-15 1997-10-07 Infonautics Corp. Method and apparatus for generating a composite document on a selected topic from a plurality of information sources
US5659742A (en) * 1995-09-15 1997-08-19 Infonautics Corporation Method for storing multi-media information in an information retrieval system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BALDAZO R: "NAVIGATING WITH A WEB COMPASS", BYTE, vol. 21, no. 3, 1 March 1996 (1996-03-01), pages 97/98, XP000600179 *
WANG BALDONADO M ET AL: "SENSEMAKER: AN INFORMATION-EXPLORATION INTERFACE SUPPORTING THE CONTEXTUAL EVOLUTION OF A USER'S INTERESTS", CHI 97. HUMAN FACTORS IN COMPUTING SYSTEMS, ATLANTA, MAR. 22 - 27, 1997, 22 March 1997 (1997-03-22), PEMBERTON S (ED ), pages 11 - 18, XP000697112 *
WEISS R ET AL: "HYPURSUIT: A HIERARCHICAL NETWORK SEARCH ENGINE THAT EXPLOITS CONTENT-LINK HYPERTEXT CLUSTERING", HYPERTEXT '96. 7TH. ACM CONFERENCE ON HYPERTEXT, WASHINGTON, MAR. 16 - 20, 1996, no. CONF. 7, 16 March 1996 (1996-03-16), ASSOCIATION FOR COMPUTING MACHINERY, pages 180 - 193, XP000724328 *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2380576A (en) * 1997-09-21 2003-04-09 Microsoft Corp Standard user interface control display for a data provider
GB2380576B (en) * 1997-09-21 2003-12-17 Microsoft Corp Displaying on a display device a control window area used to filter data
EP1006467A2 (en) * 1998-11-25 2000-06-07 Canon Research Centre France S.A. Method and device for the automatic classification of sites or users of a communication network
EP1006467A3 (en) * 1998-11-25 2000-12-20 Canon Research Centre France S.A. Method and device for the automatic classification of sites or users of a communication network
EP1014662A2 (en) * 1998-12-23 2000-06-28 Nortel Networks Corporation Access to documents of multimedia information with keywords
EP1014662A3 (en) * 1998-12-23 2002-08-21 Nortel Networks Limited Access to documents of multimedia information with keywords
WO2000054177A2 (en) * 1999-03-05 2000-09-14 Accenture Llp Method and apparatus for creating an information summary
WO2000054177A3 (en) * 1999-03-05 2001-01-25 Ac Properties Bv Method and apparatus for creating an information summary
WO2000067161A3 (en) * 1999-05-04 2002-06-06 Lee H Grant Method and apparatus for categorizing and retrieving network pages and sites
WO2000074294A2 (en) * 1999-05-31 2000-12-07 Webnara Co., Ltd. General-purpose robot agent and real-time search method
WO2000074294A3 (en) * 1999-05-31 2002-02-14 Webnara Co Ltd General-purpose robot agent and real-time search method
WO2001011441A3 (en) * 1999-08-10 2001-12-06 Seung Chul Joo Content service system and method using classification diagram
WO2001011441A2 (en) * 1999-08-10 2001-02-15 Seung Chul Joo Content service system and method using classification diagram
US6584462B2 (en) 1999-09-10 2003-06-24 Requisite Technology, Inc. Sequential subset catalog search engine
US6697799B1 (en) * 1999-09-10 2004-02-24 Requisite Technology, Inc. Automated classification of items using cascade searches
US6907424B1 (en) * 1999-09-10 2005-06-14 Requisite Technology, Inc. Sequential subset catalog search engine
US7734680B1 (en) 1999-09-30 2010-06-08 Koninklijke Philips Electronics N.V. Method and apparatus for realizing personalized information from multiple information sources
US7152064B2 (en) 2000-08-18 2006-12-19 Exalead Corporation Searching tool and process for unified search using categories and keywords
AU785452B2 (en) * 2001-05-09 2007-07-12 Requisite Software, Inc. Sequential subset catalog search engine
EP1256887A1 (en) * 2001-05-09 2002-11-13 Requisite Technology Inc. Sequential subset catalog search engine
US7043492B1 (en) 2001-07-05 2006-05-09 Requisite Technology, Inc. Automated classification of items using classification mappings
WO2003042777A3 (en) * 2001-09-14 2005-04-07 Kent Ridge Digital Labs Method and system for personalized information management
WO2003042777A2 (en) * 2001-09-14 2003-05-22 Kent Ridge Digital Labs Method and system for personalized information management
US7542902B2 (en) 2002-07-29 2009-06-02 British Telecommunications Plc Information provision for call centres
WO2004012431A1 (en) * 2002-07-29 2004-02-05 British Telecommunications Public Limited Company Improvements in or relating to information provision for call centres
WO2007028021A3 (en) * 2005-08-31 2007-05-18 Thomson Global Resources Systems, methods, and interfaces for reducing executions of overly broad user queries
WO2007028013A1 (en) * 2005-08-31 2007-03-08 Thomson Global Resources System and method presenting search results in a topical space
WO2007028021A2 (en) * 2005-08-31 2007-03-08 Thomson Global Resources Systems, methods, and interfaces for reducing executions of overly broad user queries
US8024338B2 (en) 2005-08-31 2011-09-20 Brei James E Systems, methods, and interfaces for reducing executions of overly broad user queries
US8949259B2 (en) 2005-08-31 2015-02-03 Cengage Learning, Inc. Systems, methods, software, and interfaces for analyzing, mapping, and depicting search results in a topical space
US10223907B2 (en) 2008-11-14 2019-03-05 Apple Inc. System and method for capturing remote control device command signals

Also Published As

Publication number Publication date
EP0979470B1 (en) 2008-06-11
JP2008071372A (en) 2008-03-27
JP2009238241A (en) 2009-10-15
AU7271798A (en) 1998-11-24
CA2288745A1 (en) 1998-11-05
DE69839604D1 (en) 2008-07-24
CA2288745C (en) 2008-01-08
US5924090A (en) 1999-07-13
JP2001522496A (en) 2001-11-13
AU736428B2 (en) 2001-07-26
ES2306474T3 (en) 2008-11-01
EP0979470A1 (en) 2000-02-16
JP2008097641A (en) 2008-04-24

Similar Documents

Publication Publication Date Title
US5924090A (en) Method and apparatus for searching a database of records
US20220164401A1 (en) Systems and methods for dynamically creating hyperlinks associated with relevant multimedia content
US6944609B2 (en) Search results using editor feedback
EP1618496B1 (en) A system and method for generating refinement categories for a set of search results
Dumais et al. Optimizing search by showing results in context
Terveen et al. Constructing, organizing, and visualizing collections of topically related web resources
US20020073079A1 (en) Method and apparatus for searching a database and providing relevance feedback
US6920448B2 (en) Domain specific knowledge-based metasearch system and methods of using
CA2281645C (en) System and method for semiotically processing text
US20060288001A1 (en) System and method for dynamically identifying the best search engines and searchable databases for a query, and model of presentation of results - the search assistant
US20030061209A1 (en) Computer user interface tool for navigation of data stored in directed graphs
US20020069203A1 (en) Internet information retrieval method and apparatus
US7024405B2 (en) Method and apparatus for improved internet searching
CA2637239A1 (en) System for searching
US7013300B1 (en) Locating, filtering, matching macro-context from indexed database for searching context where micro-context relevant to textual input by user
Kazai et al. Construction of a test collection for the focussed retrieval of structured documents
Madjid et al. The effect of individual differences on searching the web
CA2396459A1 (en) Method and system for collecting topically related resources
Buckland et al. Partnerships in navigation: an information retrieval research agenda
Hu et al. World wide web search technologies
EP1672544A2 (en) Improving text search quality by exploiting organizational information
Nowick et al. A model search engine based on cluster analysis of user search terms
Al‐Hawamdeh et al. Paragraph‐based access to full‐text documents using a hypertext system
Mori et al. Bookmark‐agent: Sharing of bookmarks for search assists
Kato Effects of the Variety of Document Retrieval Methods on Interactive Information Access

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref document number: 2288745

Country of ref document: CA

Ref country code: JP

Ref document number: 1998 547408

Kind code of ref document: A

Format of ref document f/p: F

Ref country code: CA

Ref document number: 2288745

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 72717/98

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 1998920069

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1998920069

Country of ref document: EP

WWG Wipo information: grant in national office

Ref document number: 72717/98

Country of ref document: AU