WO2007103815B1 - Hyperspace index - Google Patents

Hyperspace index

Info

Publication number
WO2007103815B1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
field
identifier
patterns
pattern
Prior art date
Application number
PCT/US2007/063218
Other languages
French (fr)
Other versions
WO2007103815A2 (en)
WO2007103815A3 (en)
Inventor
Dillon K Inouye
Ronald P Millett
John C Higgins
Original Assignee
Perfect Search Corp
Dillon K Inouye
Ronald P Millett
John C Higgins
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Perfect Search Corp, Dillon K Inouye, Ronald P Millett, John C Higgins
Priority to US12/281,262 (US8176052B2)
Priority to EP07757830A (EP1999565A4)
Priority to US11/847,784 (US8266152B2)
Publication of WO2007103815A2
Publication of WO2007103815A3
Publication of WO2007103815B1
Priority to US13/312,022 (US20120096008A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99936 Pattern matching access

Abstract

A hyperspace index data structure. A data structure indexes identifiers corresponding to parameter patterns. The presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, while the absence of the indicator can be used to indicate that the corresponding parameter pattern is not present. A computing environment to implement the data structure is provided.
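
The presence/absence behaviour described in the abstract can be illustrated with a small Python sketch. It is not the patented implementation: the SHA-256-based hashing, the 1024-bit field size, the two-level layout, and the names `identifier`, `Field`, `build`, and `candidates` are all assumptions made for illustration. A set bit means the pattern may be present (false positives are possible), while a clear bit means the pattern is definitely absent, so the search can skip everything below that field.

```python
import hashlib
from typing import Dict, List


def identifier(pattern: str, num_bits: int) -> int:
    """Map a parameter pattern to a bit position (hypothetical hashing choice)."""
    digest = hashlib.sha256(pattern.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_bits


class Field:
    """One field of the index: a bit array plus optional lower-level fields."""

    def __init__(self, num_bits: int = 1024):
        self.num_bits = num_bits              # assumed to be a multiple of 8
        self.bits = bytearray(num_bits // 8)
        self.children: List["Field"] = []     # fields hierarchically below this one
        self.records: List[str] = []          # record names, used at the lowest level

    def add(self, pattern: str) -> None:
        pos = identifier(pattern, self.num_bits)
        self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, pattern: str) -> bool:
        pos = identifier(pattern, self.num_bits)
        return bool(self.bits[pos // 8] & (1 << (pos % 8)))


def build(groups: Dict[str, List[str]]) -> Field:
    """Build a two-level index: one top field, one lower field per record."""
    top = Field()
    for name, patterns in groups.items():
        leaf = Field()
        leaf.records.append(name)
        for pattern in patterns:
            top.add(pattern)    # the top field covers every record
            leaf.add(pattern)   # the lower field covers only this record
        top.children.append(leaf)
    return top


def candidates(field: Field, pattern: str) -> List[str]:
    """Return records that may contain the pattern, pruning on absent bits."""
    if not field.may_contain(pattern):
        return []               # absence is definitive: skip all lower fields
    if not field.children:
        return list(field.records)
    found: List[str] = []
    for child in field.children:
        found.extend(candidates(child, pattern))
    return found
```

For example, `candidates(build({"doc1": ["alpha", "beta"], "doc2": ["gamma"]}), "alpha")` always includes `"doc1"`; it may occasionally include other records because of hash collisions, which corresponds to the false positives the abstract allows for.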

Claims

AMENDED CLAIMS received by the International Bureau on 05 May 2008 (05.05.08)
1. In a computing environment, a data structure for indexing identifiers, wherein the identifiers correspond to parameter patterns, and wherein the presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, and wherein absence of the indicator in the data structure can be used to indicate that the corresponding parameter pattern is not present in the set of parameter patterns, the data structure comprising: a first field, wherein the first field comprises a first plurality of binary bits or small parameter pattern records, each binary bit, when set, or small parameter pattern record being an identifier corresponding to a parameter pattern from among a first set of parameter patterns, and wherein when the bit is set or small parameter pattern record included, the identifier is included in the first field indicating that a corresponding parameter pattern may possibly be included in the first set of parameter patterns, but where an indicator may also be a false positive indication of a parameter pattern being included in the first set of parameter patterns; and one or more additional fields hierarchically below the first data field, wherein the one or more additional field hierarchically below the first data field are divisions of higher level fields such that a field hierarchically below a higher level field contains indicators for fewer documents or records than a higher level field, wherein each of the additional fields comprises at least one of identifiers for a subset of the first set of parameter patterns represented by identifiers in the first field, identifiers for a parameter pattern related to one or more of the parameter patterns in the first set of parameter patterns or a subset of parameter patterns or related parameter patterns represented by one of the other one or more additional fields hierarchically above the field including the identifiers, wherein the absence of 
an identifier corresponding to a parameter pattern for a particular field indicates that the parameter pattern is not represented by an indicator in fields hierarchically below the field irrespective of the presence of identifiers corresponding to the parameter pattern being present fields hierarchically above the field not including an identifier corresponding to the parameter pattern, and wherein the absence of an identifier corresponding to a parameter pattern at any level in all of the hierarchical threads indicates that the parameter pattern is not present in the set of parameter patterns.
2. The data structure of claim 1, wherein identifiers are derived from a calculated hash of a corresponding parameter pattern.
3. The data structure of claim 1, wherein location information of an identifier for parameter pattern in the first field can be preserved in hierarchically lower fields by deriving identifiers for the parameter pattern in the lower fields using the same method that was used to derive the identifier for the parameter pattern in the first field.
4. The data structure of claim 1, wherein parameter patterns comprise search specifiers.
5. The data structure of claim 4, wherein the search specifiers comprise Boolean combinations of search terms.
6. The data structure of claim 4, wherein the search specifiers comprise at least one of a question or an answer.
7. The data structure of claim 1, wherein the parameter patterns further comprise ranking information combined with search specifiers.
8. The data structure of claim 7, wherein the ranking information includes personalized ranking through customized criteria.
9. The data structure of claim 1, wherein the parameter patterns include an ordering according to a priority.
10. The data structure of claim 1, wherein the parameter patterns further comprise ranking information combined with search terms, wherein the ranking information includes at least one of user preference ranking, job appropriate ranking, profession appropriate ranking, linguistic appropriate ranking, stylistic appropriate ranking, and ranking based on previous searches.
11. The data structure of claim 1, wherein fields are organized into hierarchical threads according to a ranking such that referencing of the data structure for identifier entries can be accomplished by following higher ranked hierarchical threads first.
12. The data structure of claim 11, wherein fields are organized such that fields beyond a given threshold are not relevant.
13. The data structure of claim 1, wherein the one or more additional fields are sized with a selected granularity to set a level of search resolution.
14. The data structure of claim 1, wherein the subsets of parameter patterns or associated parameter patterns are sub-divided into fields according to characteristics of objects represented in the field.
15. The data structure of claim 1, wherein each possible identifier corresponding to a parameter pattern corresponds to a binary representation of the parameter pattern such that the first field comprises a full text index.
16. The data structure of claim 1, wherein a possible identifier corresponding to a parameter pattern corresponds to a unique identifier or substantially unique identifier of the parameter pattern.
17. The data structure of claim 1, wherein the possible identifier comprises a frequency pointer that include an indication of the frequency of indicators for the parameter pattern in lower portions of an index.
18. The data structure of claim 1, wherein the possible identifier comprises one or more child index pointers pointing to portions of an index that may contain identifiers for the parameter pattern.
19. The data structure of claim 1, wherein the possible identifier comprises a short circuit pointer which allows for lower level fields to be bypassed such that a search can be focused to a particular location or record of a data space.
20. The data structure of claim 1, wherein the first field and the one or more fields hierarchically below the first field are a combination of abbreviated indexes, indexes of small records accessed by an identifier number as a key and full text indexes.
21. The data structure of claim 1, the data structure comprising fields at a lowest hierarchical level, and wherein the fields at the lowest hierarchical level comprise at least one of records or documents ordered from high to low relevance in the fields.
22. The data structure of claim 21, wherein relevance is determined by at least one of general reliability or relevance, or static reliability or relevance.
23. The data structure of claim 1, the data structure comprising fields at a lowest hierarchical level, and wherein the fields at the lowest hierarchical level are distributed according to relevance in proportions across a set of multiple clusters of servers.
24. In a computing environment, a method for indexing identifiers into a data structure, wherein the identifiers correspond to parameter patterns, and wherein the presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, and wherein absence of the indicator in the data structure can be used to indicate that the corresponding parameter pattern is not present in the set of parameter patterns, the method comprising: including a first field identifier for a first parameter in a first field, wherein the first field comprises a first plurality of binary bits or small parameter pattern records, each binary bit, when set, or small parameter pattern record being an identifier corresponding to a parameter pattern from among a first set of parameter patterns, and wherein when the bit is set or small parameter pattern record included, the identifier is included in the first field indicating that a corresponding parameter pattern may possibly be included in the first set of parameter patterns, but where an indicator may also be a false positive indication of a parameter pattern being included in the first set of parameter patterns; and including one or more lower field identifiers for the first parameter pattern or a parameter pattern related to the first parameter pattern in one or more additional fields hierarchically below the first data field, wherein the one or more additional field hierarchically below the first data field are divisions of higher level fields such that a field hierarchically below a higher level field contains indicators for fewer documents or records than a higher level field, wherein each of the additional fields comprises at least one of identifiers for a subset of the first set of parameter patterns represented by identifiers in the first field, identifiers for a parameter pattern related to one or more of the parameter patterns in the first set of parameter 
patterns or a subset of parameter patterns or related parameter patterns represented by one of the other one or more additional fields hierarchically above the field including the identifiers, wherein the absence of an identifier corresponding to a parameter pattern for a particular field indicates that the parameter pattern is not represented by an indicator in fields hierarchically below the field irrespective of the presence of identifiers corresponding to the parameter pattern being present fields hierarchically above the field not including an identifier corresponding to the parameter pattern, and wherein the absence of an identifier corresponding to a parameter pattern at any level in all of the hierarchical threads indicates that the parameter pattern is not present in the set of parameter patterns.
25. The method of claim 24, further comprising generating a hash code for the parameter pattern, and wherein the first field identifier and the one or more lower field identifiers are derived from the generated hash code.
26. The method of claim 24, wherein the first parameter pattern is a parameter pattern for a data object, the method further comprising: including additional identifiers for additional parameter patterns for the data object in the first field; and including additional lower identifiers for the additional parameter patterns or related additional parameter patterns in the one or more additional fields hierarchically below the first data field.
27. The method of claim 26, further comprising allocating memory for the first field and the one or more additional fields hierarchically below the first data field based on the number of data objects, the number of parameter patterns for each data object, and one or more multipliers.
28. The method of claim 27, wherein allocating memory is done for each field individually based on the number of data objects, the number of parameter patterns for each data object, and a multiplier, for each field.
29. The method of claim 24, further comprising organizing a hierarchical arrangement of fields by priority.
30. In a computing environment, a method of locating identifiers in an index, wherein the identifiers correspond to parameter patterns, and wherein the presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, and wherein absence of the indicator in the data structure can be used to indicate that the corresponding parameter pattern is not present in the set of parameter patterns, the method comprising: referencing a first field for a first identifier corresponding to a first parameter pattern, wherein the first field comprises a first plurality of binary bits or small parameter pattern records, each binary bit, when set, or small parameter pattern record being an identifier corresponding to a parameter pattern from among a first set of parameter patterns, and wherein when the bit is set or small parameter pattern record included, the identifier is included in the first field indicating that a corresponding parameter pattern may possibly be included in the first set of parameter patterns, but where an indicator may also be a false positive indication of a parameter pattern being included in the first set of parameter patterns; and if the first identifier is in the first field, referencing one or more additional fields hierarchically below the first field, wherein the one or more additional field hierarchically below the first field are divisions of higher level fields such that a field hierarchically below a higher level field contains indicators for fewer documents or records than a higher level field, wherein each of the additional fields comprises at least one of identifiers for a subset of the first set of parameter patterns represented by identifiers in the first field, identifiers for a parameter pattern related to one or more of the parameter patterns in the first set of parameter patterns or a subset of parameter patterns or related parameter patterns represented by one of the other one or more additional fields hierarchically above the field including the identifiers, wherein the absence of an identifier corresponding to a parameter pattern for a particular field indicates that the parameter pattern is not represented by an indicator in fields hierarchically below the field irrespective of the presence of identifiers corresponding to the parameter pattern being present fields hierarchically above the field not including an identifier corresponding to the parameter pattern, and wherein the absence of an identifier corresponding to a parameter pattern at any level in all of the hierarchical threads indicates that the parameter pattern is not present in the set of parameter patterns.
31. The method of claim 30, wherein referencing one or more additional fields hierarchically below the first data field for one or more identifiers corresponding to the parameter pattern or one or more related parameter patterns comprises referencing fields in a hierarchical thread, from higher levels in the thread to lower levels in the threads and stopping referencing fields in a thread when an identifier for a parameter pattern being referenced for does not appear in a field in the thread.
32. The method of claim 30, wherein referencing a first field for a first identifier corresponding to a first parameter pattern, comprises referencing parameter patterns for more relevant parameter patterns first.
33. The method of claim 32, wherein relevance is determined by at least one of linguistic analysis, date, or reliability of a document or record.
34. In a computing environment, a method of indexing parameter patterns for later search and retrieval, the method comprising: operating on one or more parameter patterns to generate an identifier code for each parameter pattern; sorting the identifier codes generated for the parameter patterns; and correlating offsets with sorted identifier codes where each offset is correlated to one of: a sorted identifier code, wherein the offset represents a portion of one or more identifier codes correlated to the offset, or to an indicator that indicates that a particular offset does not correspond to an identifier code for a parameter pattern.
35. The method of claim 34, wherein a given offset comprises a predetermined number of the most significant bits of an identifier code correlated to the given offset.
36. The method of claim 34, wherein sorting the identifier codes generated for the parameter patterns comprises sorting the identifier codes in an ascending order numerically.
37. The method of claim 34, wherein the indicator that indicates that a particular offset does not correspond to an identifier code for a parameter pattern is a negative number.
38. The method of claim 34, wherein the indicator that indicates that a particular offset does not correspond to an identifier code for a parameter pattern includes an indicator for locating a next identifier code corresponding to a parameter pattern.
39. In a computing environment, a method of checking an index for entries, the method comprising: operating on a parameter pattern to create an identifier code; generating an offset from the identifier code; comparing the offset to a set of correlated offsets wherein the set of correlated offsets are correlated with sorted identifier codes where each offset is correlated to one of: a sorted identifier code, wherein the offset represents a portion of one or more pre-calculated identifier codes correlated to the offset, or to an indicator that indicates that a particular offset does not correspond to a pre-calculated identifier code for a parameter pattern; if a correlated offset matches the offset, and the correlated offset is correlated to a portion of one or more pre-calculated identifier codes, then comparing the identifier code to at least one of the one or more pre-calculated identifier codes; but if a correlated offset matches the offset, and the correlated offset is correlated to an indicator that indicates that a particular offset does not correspond to a pre-calculated identifier code for a parameter pattern, then returning an indication that the parameter pattern is not included in a set of parameter patterns.
40. The method of claim 39, wherein comparing the identifier code to at least one of the one or more pre-calculated identifier codes comprises: continuing comparing the identifier code to pre-calculated identifier codes until either the identifier code matches a pre-calculated identifier code, or until a compared pre-calculated identifier code is greater than the identifier code; if the identifier code matches a pre-calculated identifier code, then returning an indication that the identifier code matches a pre-calculated identifier code; but if a compared pre-calculated identifier code is greater than the identifier code, then returning an indication that the parameter pattern is not included in a set of parameter patterns.
41. The method of claim 39, wherein generating an offset comprises selecting a predetermined number of the most significant bits of an identifier code correlated to the given offset.
42. In a computing environment, a data structure for indexing parameter patterns, the data structure comprising: a first field, the first field comprising pre-calculated identifier codes for parameter patterns, wherein the pre-calculated identifier codes are sorted in the first field; a second field, the second field comprising an enumeration of each of the pre-calculated identifier codes; and a third field, the third field comprising one or more offsets, wherein each offset is correlated to one of the enumerations in the second field or to an indicator that indicates that a particular offset does not correspond to an identifier code for a parameter pattern.
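
The sorted-code and offset scheme of claims 34 to 42 can be sketched in Python as follows. This is only one possible reading of the claim language, offered as an illustration. The 32-bit code width, the 8-bit offset, the SHA-256-derived codes, and the names `identifier_code`, `offset_of`, `build_index`, and `contains` are assumptions that do not come from the patent. The sketch uses a negative marker for empty offsets (as in claim 37) and does not model the "next identifier code" indicator of claim 38.

```python
import hashlib
from typing import List, Tuple

CODE_BITS = 32    # width of an identifier code (assumption)
OFFSET_BITS = 8   # most significant bits used as the offset (assumption)
EMPTY = -1        # negative marker: no identifier code shares this offset


def identifier_code(pattern: str) -> int:
    """Derive a 32-bit identifier code from a parameter pattern (hypothetical)."""
    digest = hashlib.sha256(pattern.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def offset_of(code: int) -> int:
    """Take the most significant OFFSET_BITS bits of the code."""
    return code >> (CODE_BITS - OFFSET_BITS)


def build_index(patterns: List[str]) -> Tuple[List[int], List[int]]:
    """Sort the codes and correlate each offset with the first matching code."""
    codes = sorted({identifier_code(p) for p in patterns})
    table = [EMPTY] * (1 << OFFSET_BITS)
    for position, code in enumerate(codes):
        slot = offset_of(code)
        if table[slot] == EMPTY:
            table[slot] = position   # position of the first code with this prefix
    return codes, table


def contains(codes: List[int], table: List[int], pattern: str) -> bool:
    """Check the index for a pattern using the offset table, then a short scan."""
    code = identifier_code(pattern)
    position = table[offset_of(code)]
    if position == EMPTY:
        return False                 # no code has this offset: pattern not indexed
    while position < len(codes):
        if codes[position] == code:
            return True              # identifier code matches
        if codes[position] > code:
            return False             # passed the spot where the code would sit
        position += 1
    return False
```

With `codes, table = build_index(["red", "green", "blue"])`, a call such as `contains(codes, table, "green")` returns `True`. Because the codes are sorted, the scan in `contains` stops as soon as it sees a larger code, which mirrors the stopping condition of claim 40.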
PCT/US2007/063218 2006-03-03 2007-03-02 Hyperspace index WO2007103815A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US12/281,262 US8176052B2 (en) 2006-03-03 2007-03-02 Hyperspace index
EP07757830A EP1999565A4 (en) 2006-03-03 2007-03-02 Hyperspace index
US11/847,784 US8266152B2 (en) 2006-03-03 2007-08-30 Hashed indexing
US13/312,022 US20120096008A1 (en) 2006-03-03 2011-12-06 Hyperspace index

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US77921406P 2006-03-03 2006-03-03
US60/779,214 2006-03-03

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US11/847,784 Continuation-In-Part US8266152B2 (en) 2006-03-03 2007-08-30 Hashed indexing
US13/312,022 Continuation US20120096008A1 (en) 2006-03-03 2011-12-06 Hyperspace index

Publications (3)

Publication Number Publication Date
WO2007103815A2 WO2007103815A2 (en) 2007-09-13
WO2007103815A3 WO2007103815A3 (en) 2008-05-02
WO2007103815B1 WO2007103815B1 (en) 2008-07-10

Family

ID=38475735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/063218 WO2007103815A2 (en) 2006-03-03 2007-03-02 Hyperspace index

Country Status (3)

Country Link
US (3) US8176052B2 (en)
EP (1) EP1999565A4 (en)
WO (1) WO2007103815A2 (en)

Families Citing this family (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070162481A1 (en) 2006-01-10 2007-07-12 Millett Ronald P Pattern index
US8266152B2 (en) 2006-03-03 2012-09-11 Perfect Search Corporation Hashed indexing
US8176052B2 (en) * 2006-03-03 2012-05-08 Perfect Search Corporation Hyperspace index
JP5193518B2 (en) * 2007-07-13 2013-05-08 株式会社東芝 Pattern search apparatus and method
US7912840B2 (en) 2007-08-30 2011-03-22 Perfect Search Corporation Indexing and filtering using composite data stores
US7774353B2 (en) 2007-08-30 2010-08-10 Perfect Search Corporation Search templates
US7774347B2 (en) 2007-08-30 2010-08-10 Perfect Search Corporation Vortex searching
US7984019B2 (en) * 2007-12-28 2011-07-19 Knowledge Computing Corporation Method and apparatus for loading data files into a data-warehouse system
US7840546B2 (en) * 2008-01-07 2010-11-23 Knowledge Computing Corporation Method and apparatus for conducting data queries using consolidation strings and inter-node consolidation
US8032495B2 (en) 2008-06-20 2011-10-04 Perfect Search Corporation Index compression
US8037050B2 (en) * 2008-08-02 2011-10-11 Knowledge Computing Corporation Methods and apparatus for performing multi-data-source, non-ETL queries and entity resolution
US8386436B2 (en) * 2008-09-30 2013-02-26 Rainstor Limited System and method for data storage
US8131738B2 (en) * 2008-12-30 2012-03-06 International Business Machines Corporation Search engine service utilizing hash algorithms
WO2010135430A1 (en) * 2009-05-19 2010-11-25 Vmware, Inc. Shortcut input/output in virtual machine systems
JP5341209B2 (en) * 2009-12-25 2013-11-13 インターナショナル・ビジネス・マシーンズ・コーポレーション System, method and program for checking pointer consistency in hierarchical database
US9317536B2 (en) * 2010-04-27 2016-04-19 Cornell University System and methods for mapping and searching objects in multidimensional space
US8527546B2 (en) 2010-11-25 2013-09-03 International Business Machines Corporation Generating a checkpoint image for use with an in-memory database
US9811373B2 (en) * 2011-02-09 2017-11-07 Nec Corporation Analysis engine control device
US9015142B2 (en) * 2011-06-10 2015-04-21 Google Inc. Identifying listings of multi-site entities based on user behavior signals
US9155320B2 (en) * 2011-07-06 2015-10-13 International Business Machines Corporation Prefix-based leaf node storage for database system
US9715434B1 (en) * 2011-09-30 2017-07-25 EMC IP Holding Company LLC System and method for estimating storage space needed to store data migrated from a source storage to a target storage
US8738595B2 (en) 2011-11-22 2014-05-27 Navteq B.V. Location based full text search
US8745022B2 (en) 2011-11-22 2014-06-03 Navteq B.V. Full text search based on interwoven string tokens
US9009149B2 (en) * 2011-12-06 2015-04-14 The Trustees Of Columbia University In The City Of New York Systems and methods for mobile search using Bag of Hash Bits and boundary reranking
US8996467B2 (en) 2011-12-29 2015-03-31 Druva Inc. Distributed scalable deduplicated data backup system
US8700634B2 (en) * 2011-12-29 2014-04-15 Druva Inc. Efficient deduplicated data storage with tiered indexing
US8700661B2 (en) 2012-04-12 2014-04-15 Navteq B.V. Full text search using R-trees
US9898505B2 (en) * 2012-09-27 2018-02-20 Nec Corporation Method, apparatus and program for transforming into binary data
US9262423B2 (en) * 2012-09-27 2016-02-16 Microsoft Technology Licensing, Llc Large scale file storage in cloud computing
CN103345469B (en) * 2013-05-24 2016-08-03 联动优势科技有限公司 The storage of set of numbers, querying method and device thereof
CN105630803B (en) * 2014-10-30 2019-07-05 国际商业机器公司 The method and apparatus that Document image analysis establishes index
US10467215B2 (en) 2015-06-23 2019-11-05 Microsoft Technology Licensing, Llc Matching documents using a bit vector search index
US10242071B2 (en) 2015-06-23 2019-03-26 Microsoft Technology Licensing, Llc Preliminary ranker for scoring matching documents
US11392568B2 (en) 2015-06-23 2022-07-19 Microsoft Technology Licensing, Llc Reducing matching documents for a search query
US10733164B2 (en) 2015-06-23 2020-08-04 Microsoft Technology Licensing, Llc Updating a bit vector search index
US11281639B2 (en) 2015-06-23 2022-03-22 Microsoft Technology Licensing, Llc Match fix-up to remove matching documents
US10565198B2 (en) 2015-06-23 2020-02-18 Microsoft Technology Licensing, Llc Bit vector search index using shards
US10229143B2 (en) 2015-06-23 2019-03-12 Microsoft Technology Licensing, Llc Storage and retrieval of data from a bit vector search index
US10210210B2 (en) 2015-10-21 2019-02-19 International Business Machines Corporation Adaptive multi-index access plan for database queries
US11449554B2 (en) * 2015-10-22 2022-09-20 Mcafee, Llc Extensible search solution for asset information
US11200217B2 (en) 2016-05-26 2021-12-14 Perfect Search Corporation Structured document indexing and searching
CN107025263A (en) * 2017-01-16 2017-08-08 中国银联股份有限公司 Sentence analytic method for database statement
US10915576B2 (en) * 2019-03-26 2021-02-09 Sap Se High performance bloom filter

Family Cites Families (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4817036A (en) * 1985-03-15 1989-03-28 Brigham Young University Computer system and method for data base indexing and information retrieval
US4961139A (en) 1988-06-30 1990-10-02 Hewlett-Packard Company Data base management system for real-time applications
US4961197A (en) * 1988-09-07 1990-10-02 Hitachi, Ltd. Semiconductor laser device
JPH04186447A (en) 1990-11-21 1992-07-03 Canon Inc Directory management system
US5699441A (en) 1992-03-10 1997-12-16 Hitachi, Ltd. Continuous sign-language recognition apparatus and input apparatus
US5530854A (en) * 1992-09-25 1996-06-25 At&T Corp Shared tuple method and system for generating keys to access a database
US5701459A (en) 1993-01-13 1997-12-23 Novell, Inc. Method and apparatus for rapid full text index creation
US5544352A (en) 1993-06-14 1996-08-06 Libertech, Inc. Method and apparatus for indexing, searching and displaying data
US5664179A (en) 1995-06-27 1997-09-02 Mci Corporation Modified skip list database structure and method for access
US5960194A (en) * 1995-09-11 1999-09-28 International Business Machines Corporation Method for generating a multi-tiered index for partitioned data
US5737734A (en) * 1995-09-15 1998-04-07 Infonautics Corporation Query word relevance adjustment in a search of an information retrieval system
US5761652A (en) * 1996-03-20 1998-06-02 International Business Machines Corporation Constructing balanced multidimensional range-based bitmap indices
US6216213B1 (en) * 1996-06-07 2001-04-10 Motorola, Inc. Method and apparatus for compression, decompression, and execution of program code
US6253188B1 (en) 1996-09-20 2001-06-26 Thomson Newspapers, Inc. Automated interactive classified ad system for the internet
US5799312A (en) * 1996-11-26 1998-08-25 International Business Machines Corporation Three-dimensional affine-invariant hashing defined over any three-dimensional convex domain and producing uniformly-distributed hash keys
US5852822A (en) * 1996-12-09 1998-12-22 Oracle Corporation Index-only tables with nested group keys
US6076051A (en) 1997-03-07 2000-06-13 Microsoft Corporation Information retrieval utilizing semantic representation of text
US6128613A (en) 1997-06-26 2000-10-03 The Chinese University Of Hong Kong Method and apparatus for establishing topic word classes based on an entropy cost function to retrieve documents represented by the topic words
US6018733A (en) * 1997-09-12 2000-01-25 Infoseek Corporation Methods for iteratively and interactively performing collection selection in full text searches
US6026398A (en) 1997-10-16 2000-02-15 Imarket, Incorporated System and methods for searching and matching databases
US6070164A (en) * 1998-05-09 2000-05-30 Information Systems Corporation Database method and apparatus using hierarchical bit vector index structure
US6216123B1 (en) * 1998-06-24 2001-04-10 Novell, Inc. Method and system for rapid retrieval in a full text indexing system
US6658626B1 (en) 1998-07-31 2003-12-02 The Regents Of The University Of California User interface for displaying document comparison information
US6584458B1 (en) * 1999-02-19 2003-06-24 Novell, Inc. Method and apparatuses for creating a full text index accommodating child words
US6516320B1 (en) * 1999-03-08 2003-02-04 Pliant Technologies, Inc. Tiered hashing for data access
US7181438B1 (en) * 1999-07-21 2007-02-20 Alberti Anemometer, Llc Database access system
US6879976B1 (en) * 1999-08-19 2005-04-12 Azi, Inc. Data indexing using bit vectors
US6772141B1 (en) 1999-12-14 2004-08-03 Novell, Inc. Method and apparatus for organizing and using indexes utilizing a search decision table
US6473729B1 (en) 1999-12-20 2002-10-29 Xerox Corporation Word phrase translation using a phrase index
AUPQ475799A0 (en) 1999-12-20 2000-01-20 Youramigo Pty Ltd An internet indexing system and method
US6678686B1 (en) 1999-12-28 2004-01-13 Ncr Corporation Method and apparatus for evaluating index predicates on complex data types using virtual indexed streams
US6584465B1 (en) * 2000-02-25 2003-06-24 Eastman Kodak Company Method and system for search and retrieval of similar patterns
US6947931B1 (en) * 2000-04-06 2005-09-20 International Business Machines Corporation Longest prefix match (LPM) algorithm implementation for a network processor
US6675163B1 (en) 2000-04-06 2004-01-06 International Business Machines Corporation Full match (FM) search algorithm implementation for a network processor
US6718325B1 (en) 2000-06-14 2004-04-06 Sun Microsystems, Inc. Approximate string matcher for delimited strings
US7660819B1 (en) 2000-07-31 2010-02-09 Alion Science And Technology Corporation System for similar document detection
US7328211B2 (en) * 2000-09-21 2008-02-05 Jpmorgan Chase Bank, N.A. System and methods for improved linguistic pattern matching
US6804664B1 (en) * 2000-10-10 2004-10-12 Netzero, Inc. Encoded-data database for fast queries
US7113943B2 (en) 2000-12-06 2006-09-26 Content Analyst Company, Llc Method for document comparison and selection
JP2002222210A (en) 2001-01-25 2002-08-09 Hitachi Ltd Document search system, method therefor, and search server
US6938046B2 (en) * 2001-03-02 2005-08-30 Dow Jones Reuters Business Interactive, Llp Polyarchical data indexing and automatically generated hierarchical data indexing paths
US6785677B1 (en) * 2001-05-02 2004-08-31 Unisys Corporation Method for execution of query to search strings of characters that match pattern with a target string utilizing bit vector
US6748401B2 (en) 2001-10-11 2004-06-08 International Business Machines Corporation Method and system for dynamically managing hash pool data structures
KR100483321B1 (en) 2001-10-17 2005-04-15 한국과학기술원 The Device and Method for Similarity Search Using Hyper-rectangle Based Multidimensional Data Segmentation
US6985904B1 (en) * 2002-02-28 2006-01-10 Oracle International Corporation Systems and methods for sharing of execution plans for similar database statements
US6993533B1 (en) 2002-03-25 2006-01-31 Bif Technologies Corp. Relational database drill-down convention and reporting tool
US7266553B1 (en) * 2002-07-01 2007-09-04 Microsoft Corporation Content data indexing
US7653796B2 (en) 2003-02-20 2010-01-26 Panasonic Corporation Information recording medium and region management method for a plurality of recording regions each managed by independent file system
US20040225497A1 (en) 2003-05-05 2004-11-11 Callahan James Patrick Compressed yet quickly searchable digital textual data format
US7299221B2 (en) * 2003-05-08 2007-11-20 Oracle International Corporation Progressive relaxation of search criteria
JPWO2005008753A1 (en) 2003-05-23 2006-11-16 株式会社ニコン Template creation method and apparatus, pattern detection method, position detection method and apparatus, exposure method and apparatus, device manufacturing method, and template creation program
US7296011B2 (en) * 2003-06-20 2007-11-13 Microsoft Corporation Efficient fuzzy match for evaluating data records
US20050022017A1 (en) 2003-06-24 2005-01-27 Maufer Thomas A. Data structures and state tracking for network protocol processing
US7467138B2 (en) 2003-10-28 2008-12-16 International Business Machines Corporation Algorithm for sorting bit sequences in linear complexity
US20050108394A1 (en) 2003-11-05 2005-05-19 Capital One Financial Corporation Grid-based computing to search a network
US20050131872A1 (en) * 2003-12-16 2005-06-16 Microsoft Corporation Query recognizer
US20060106793A1 (en) * 2003-12-29 2006-05-18 Ping Liang Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation
US7542971B2 (en) 2004-02-02 2009-06-02 Fuji Xerox Co., Ltd. Systems and methods for collaborative note-taking
US8055672B2 (en) 2004-06-10 2011-11-08 International Business Machines Corporation Dynamic graphical database query and data mining interface
US7836044B2 (en) 2004-06-22 2010-11-16 Google Inc. Anticipated query generation and processing in a search engine
US20060036649A1 (en) * 2004-08-12 2006-02-16 Simske Steven J Index extraction from documents
GB2418999A (en) 2004-09-09 2006-04-12 Surfcontrol Plc Categorizing uniform resource locators
JP2006091994A (en) 2004-09-21 2006-04-06 Toshiba Corp Device, method and program for processing document information
EP1846815A2 (en) * 2005-01-31 2007-10-24 Textdigger, Inc. Method and system for semantic search and retrieval of electronic documents
US7640363B2 (en) 2005-02-16 2009-12-29 Microsoft Corporation Applications for remote differential compression
US7685203B2 (en) 2005-03-21 2010-03-23 Oracle International Corporation Mechanism for multi-domain indexes on XML documents
US20060265396A1 (en) 2005-05-19 2006-11-23 Trimergent Personalizable information networks
US7467155B2 (en) * 2005-07-12 2008-12-16 Sand Technology Systems International, Inc. Method and apparatus for representation of unstructured data
US7548929B2 (en) 2005-07-29 2009-06-16 Yahoo! Inc. System and method for determining semantically related terms
US20070033165A1 (en) * 2005-08-02 2007-02-08 International Business Machines Corporation Efficient evaluation of complex search queries
US7840774B2 (en) * 2005-09-09 2010-11-23 International Business Machines Corporation Compressibility checking avoidance
JP2009508273A (en) 2005-09-14 2009-02-26 オー−ヤ!,インク. Apparatus and method for indexing and searching networked information
US7676517B2 (en) 2005-10-14 2010-03-09 Microsoft Corporation Search results injected into client applications
US20070162481A1 (en) 2006-01-10 2007-07-12 Millett Ronald P Pattern index
US20070175674A1 (en) 2006-01-19 2007-08-02 Intelliscience Corporation Systems and methods for ranking terms found in a data product
US20070203898A1 (en) * 2006-02-24 2007-08-30 Jonathan Lurie Carmona Search methods and systems
US8176052B2 (en) 2006-03-03 2012-05-08 Perfect Search Corporation Hyperspace index
US8266152B2 (en) 2006-03-03 2012-09-11 Perfect Search Corporation Hashed indexing
US7853555B2 (en) * 2006-04-19 2010-12-14 Raytheon Company Enhancing multilingual data querying
US8250075B2 (en) 2006-12-22 2012-08-21 Palo Alto Research Center Incorporated System and method for generation of computer index files
US7774347B2 (en) 2007-08-30 2010-08-10 Perfect Search Corporation Vortex searching
US7774353B2 (en) 2007-08-30 2010-08-10 Perfect Search Corporation Search templates
US7912840B2 (en) 2007-08-30 2011-03-22 Perfect Search Corporation Indexing and filtering using composite data stores
US8032495B2 (en) 2008-06-20 2011-10-04 Perfect Search Corporation Index compression

Also Published As

Publication number Publication date
US20120096008A1 (en) 2012-04-19
WO2007103815A2 (en) 2007-09-13
EP1999565A2 (en) 2008-12-10
US20090307184A1 (en) 2009-12-10
US8176052B2 (en) 2012-05-08
US20080059462A1 (en) 2008-03-06
US7644082B2 (en) 2010-01-05
WO2007103815A3 (en) 2008-05-02
EP1999565A4 (en) 2012-01-11

Similar Documents

Publication Publication Date Title
WO2007103815B1 (en) Hyperspace index
KR100946055B1 (en) Heterogeneous indexing for annotation systems
US6236988B1 (en) Data retrieval system
CN102648468B (en) Table search device, table search method, and table search system
Rodriguez et al. Using WordNet to complement training information in text categorization
US8266152B2 (en) Hashed indexing
US20180075090A1 (en) Computer-Implemented System And Method For Identifying Similar Documents
US4495566A (en) Method and means using digital data processing means for locating representations in a stored textual data base
US8250075B2 (en) System and method for generation of computer index files
US8108411B2 (en) Methods and systems for merging data sets
US8140517B2 (en) Database query optimization using weight mapping to qualify an index
CA2493443A1 (en) Systems and methods of building and using custom word lists
CN102456016B (en) Method and device for sequencing search results
US7949657B2 (en) Detecting zero-result search queries
US7774353B2 (en) Search templates
CN115080684B (en) Network disk document indexing method and device, network disk and storage medium
US6735584B1 (en) Accessing a database using user-defined attributes
US20090234837A1 (en) Search query
US6901396B1 (en) Packed radix search tree implementation
US20090063454A1 (en) Vortex searching
Garron et al. Applying Latent Semantic Indexing on the TREC 2010 Legal Dataset.
US20050256823A1 (en) Memory, method, and program product for organizing data using a compressed trie table
Teufel Natural language documents: Indexing and retrieval in an information system
SPEEDY et al. MARY ANN SELLY, Director, Decision Support Software, Inc., McLean, Virginia: The Analytical Hierarchy Process and the Personal Computer
EP1227410A1 (en) Accessing a database using user-defined attributes

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase (Ref document number: 11847784; Country of ref document: US)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 2007757830; Country of ref document: EP)
WWE WIPO information: entry into national phase (Ref document number: 12281262; Country of ref document: US)
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)