WO2007103815B1 - Hyperspace index - Google Patents
Hyperspace index
- Publication number
- WO2007103815B1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- field
- identifier
- patterns
- pattern
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2264—Multidimensional index structures
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99936—Pattern matching access
Abstract
A hyperspace index data structure. A data structure indexes identifiers corresponding to parameter patterns. The presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, while the absence of the indicator can be used to indicate that the corresponding parameter pattern is not present. A computing environment to implement the data structure is provided.
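As a rough illustration of the idea in the abstract (elaborated in claim 1 below), the following sketch models each field as a small bit array in which a pattern's identifier is a bit position derived from a hash. A set bit means the pattern may be present (false positives are possible); a clear bit means it is definitely absent from that field and from everything beneath it. The names `Field` and `identifier`, the choice of SHA-256, and the field sizes are assumptions made for this example only; the patent does not prescribe them.

```python
import hashlib


def identifier(pattern: str, size: int) -> int:
    """Derive a bit position for a parameter pattern (assumed: SHA-256 mod size)."""
    digest = hashlib.sha256(pattern.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size


class Field:
    """One field of the index: a bit array plus optional child fields."""

    def __init__(self, size: int, children=None):
        self.size = size
        self.bits = bytearray(size)  # one byte per bit, for readability
        self.children = children or []

    def add(self, pattern: str, path=()):
        """Set the identifier here and along one hierarchical thread of children."""
        self.bits[identifier(pattern, self.size)] = 1
        if path:
            self.children[path[0]].add(pattern, path[1:])

    def may_contain(self, pattern: str) -> bool:
        """True means 'possibly present'; False means 'definitely absent'."""
        return self.bits[identifier(pattern, self.size)] == 1

    def search(self, pattern: str, path=()):
        """Return paths to lowest-level fields that may hold the pattern."""
        if not self.may_contain(pattern):
            return []  # absent here, so no field below needs to be referenced
        if not self.children:
            return [path]
        hits = []
        for i, child in enumerate(self.children):
            hits.extend(child.search(pattern, path + (i,)))
        return hits


# Hypothetical usage: a first field with two lower-level fields.
root = Field(1024, [Field(256), Field(256)])
root.add("red AND car", path=(0,))
root.add("blue AND boat", path=(1,))
print(root.search("red AND car"))
```

Because identifiers are hashes, a search can return a path that does not actually hold the pattern; it never misses a path that does, which is the asymmetry the abstract describes.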
Claims
1. In a computing environment, a data structure for indexing identifiers, wherein the identifiers correspond to parameter patterns, and wherein the presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, and wherein absence of the indicator in the data structure can be used to indicate that the corresponding parameter pattern is not present in the set of parameter patterns, the data structure comprising: a first field, wherein the first field comprises a first plurality of binary bits or small parameter pattern records, each binary bit, when set, or small parameter pattern record being an identifier corresponding to a parameter pattern from among a first set of parameter patterns, and wherein when the bit is set or small parameter pattern record included, the identifier is included in the first field indicating that a corresponding parameter pattern may possibly be included in the first set of parameter patterns, but where an indicator may also be a false positive indication of a parameter pattern being included in the first set of parameter patterns; and one or more additional fields hierarchically below the first data field, wherein the one or more additional field hierarchically below the first data field are divisions of higher level fields such that a field hierarchically below a higher level field contains indicators for fewer documents or records than a higher level field, wherein each of the additional fields comprises at least one of identifiers for a subset of the first set of parameter patterns represented by identifiers in the first field, identifiers for a parameter pattern related to one or more of the parameter patterns in the first set of parameter patterns or a subset of parameter patterns or related parameter patterns represented by one of the other one or more additional fields hierarchically above the field including the identifiers, wherein the absence of 
an identifier corresponding to a parameter pattern for a particular field indicates that the parameter pattern is not represented by an indicator in fields hierarchically below the field irrespective of the presence of identifiers corresponding to the parameter pattern being present fields hierarchically above the field not including an identifier corresponding to the parameter pattern, and wherein the absence of an
identifier corresponding to a parameter pattern at any level in all of the hierarchical threads indicates that the parameter pattern is not present in the set of parameter patterns.
2. The data structure of claim 1, wherein identifiers are derived from a calculated hash of a corresponding parameter pattern.
3. The data structure of claim 1, wherein location information of an identifier for parameter pattern in the first field can be preserved in hierarchically lower fields by deriving identifiers for the parameter pattern in the lower fields using the same method that was used to derive the identifier for the parameter pattern in the first field.
4. The data structure of claim 1, wherein parameter patterns comprise search specifiers.
5. The data structure of claim 4, wherein the search specifiers comprise Boolean combinations of search terms.
6. The data structure of claim 4, wherein the search specifiers comprise at least one of a question or an answer.
7. The data structure of claim 1, wherein the parameter patterns further comprise ranking information combined with search specifiers.
8. The data structure of claim 7, wherein the ranking information includes personalized ranking through customized criteria.
9. The data structure of claim 1, wherein the parameter patterns include an ordering according to a priority.
10. The data structure of claim 1, wherein the parameter patterns further comprise ranking information combined with search terms, wherein the ranking information includes at least one of user preference ranking, job appropriate ranking, profession appropriate ranking, linguistic appropriate ranking, stylistic appropriate ranking, and ranking based on previous searches.
11. The data structure of claim 1, wherein fields are organized into hierarchical threads according to a ranking such that referencing of the data structure for identifier entries can be accomplished by following higher ranked hierarchical threads first.
12. The data structure of claim 11, wherein fields are organized such that fields beyond a given threshold are not relevant.
13. The data structure of claim 1, wherein the one or more additional fields are sized with a selected granularity to set a level of search resolution.
14. The data structure of claim 1, wherein the subsets of parameter patterns or associated parameter patterns are sub-divided into fields according to characteristics of objects represented in the field.
15. The data structure of claim 1, wherein each possible identifier corresponding to a parameter pattern corresponds to a binary representation of the parameter pattern such that the first field comprises a full text index.
16. The data structure of claim 1, wherein a possible identifier corresponding to a parameter pattern corresponds to a unique identifier or substantially unique identifier of the parameter pattern.
17. The data structure of claim 1, wherein the possible identifier comprises a frequency pointer that include an indication of the frequency of indicators for the parameter pattern in lower portions of an index.
18. The data structure of claim 1, wherein the possible identifier comprises one or more child index pointers pointing to portions of an index that may contain identifiers for the parameter pattern.
19. The data structure of claim 1, wherein the possible identifier comprises a short circuit pointer which allows for lower level fields to be bypassed such that a search can be focused to a particular location or record of a data space.
20. The data structure of claim 1, wherein the first field and the one or more fields hierarchically below the first field are a combination of abbreviated indexes, indexes of small records accessed by an identifier number as a key and full text indexes.
21. The data structure of claim 1, the data structure comprising fields at a lowest hierarchical level, and wherein the fields at the lowest hierarchical level comprise at least one of records or documents ordered from high to low relevance in the fields.
22. The data structure of claim 21, wherein relevance is determined by at least one of general reliability or relevance, or static reliability or relevance.
23. The data structure of claim 1, the data structure comprising fields at a lowest hierarchical level, and wherein the fields at the lowest hierarchical level are distributed according to relevance in proportions across a set of multiple clusters of servers.
24. In a computing environment, a method for indexing identifiers into a data structure, wherein the identifiers correspond to parameter patterns, and wherein the presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, and wherein absence of the indicator in the data structure can be used to indicate that the corresponding parameter pattern is not present in the set of parameter patterns, the method comprising: including a first field identifier for a first parameter in a first field, wherein the first field comprises a first plurality of binary bits or small parameter pattern records, each binary bit, when set, or small parameter pattern record being an identifier corresponding to a parameter pattern from among a first set of parameter patterns, and wherein when the bit is set or small parameter pattern record included, the identifier is included in the first field indicating that a corresponding parameter pattern may possibly be included in the first set of parameter patterns, but where an indicator may also be a false positive indication of a parameter pattern being included in the first set of parameter patterns; and including one or more lower field identifiers for the first parameter pattern or a parameter pattern related to the first parameter pattern in one or more additional fields hierarchically below the first data field, wherein the one or more additional field hierarchically below the first data field are divisions of higher level fields such that a field hierarchically below a higher level field contains indicators for fewer documents or records than a higher level field, wherein each of the additional fields comprises at least one of identifiers for a subset of the first set of parameter patterns represented by identifiers in the first field, identifiers for a parameter pattern related to one or more of the parameter patterns in the first set of parameter 
patterns or a subset of parameter patterns or related parameter patterns represented by one of the
other one or more additional fields hierarchically above the field including the identifiers, wherein the absence of an identifier corresponding to a parameter pattern for a particular field indicates that the parameter pattern is not represented by an indicator in fields hierarchically below the field irrespective of the presence of identifiers corresponding to the parameter pattern being present fields hierarchically above the field not including an identifier corresponding to the parameter pattern, and wherein the absence of an identifier corresponding to a parameter pattern at any level in all of the hierarchical threads indicates that the parameter pattern is not present in the set of parameter patterns.
25. The method of claim 24, further comprising generating a hash code for the parameter pattern, and wherein the first field identifier and the one or more lower field identifiers are derived from the generated hash code.
26. The method of claim 24, wherein the first parameter pattern is a parameter pattern for a data object, the method further comprising: including additional identifiers for additional parameter patterns for the data object in the first field; and including additional lower identifiers for the additional parameter patterns or related additional parameter patterns in the one or more additional fields hierarchically below the first data field.
27. The method of claim 26, further comprising allocating memory for the first field and the one or more additional fields hierarchically below the first data field based on the number of data objects, the number of parameter patterns for each data object, and one or more multipliers.
28. The method of claim 27, wherein allocating memory is done for each field individually based on the number of data objects, the number of parameter patterns for each data object, and a multiplier, for each field.
29. The method of claim 24, further comprising organizing a hierarchical arrangement of fields by priority.
30. In a computing environment, a method of locating identifiers in an index, wherein the identifiers correspond to parameter patterns, and wherein the presence of an identifier in the data structure indicates that the corresponding parameter pattern may be present in a set of parameter patterns, and wherein absence of the indicator in the data structure can be used to indicate that the corresponding parameter pattern is not present in the set of parameter patterns, the method comprising: referencing a first field for a first identifier corresponding to a first parameter pattern, wherein the first field comprises a first plurality of binary bits or small parameter pattern records, each binary bit, when set, or small parameter pattern record being an identifier corresponding to a parameter pattern from among a first set of parameter patterns, and wherein when the bit is set or small parameter pattern record included, the identifier is included in the first field indicating that a corresponding parameter pattern may possibly be included in the first set of parameter patterns, but where an indicator may also be a false positive indication of a parameter pattern being included in the first set of parameter patterns; and if the first identifier is in the first field, referencing one or more additional fields hierarchically below the first field, wherein the one or more additional field hierarchically below the first field are divisions of higher level fields such that a field hierarchically below a higher level field contains indicators for fewer documents or records than a higher level field, wherein each of the additional fields comprises at least one of identifiers for a subset of the first set of parameter patterns represented by identifiers in the first field, identifiers for a parameter pattern related to one or more of the parameter patterns in the first set of parameter patterns or a subset of parameter patterns or related parameter patterns represented by one 
of the other one or more additional fields hierarchically above the field including the identifiers, wherein the absence of an identifier corresponding to a parameter pattern for a particular field indicates that the parameter pattern is not represented by an indicator in fields hierarchically below the field irrespective of the presence of identifiers corresponding to the parameter pattern being present fields hierarchically above the field not including an identifier corresponding to the parameter pattern, and wherein the absence of an identifier corresponding to a
parameter pattern at any level in all of the hierarchical threads indicates that the parameter pattern is not present in the set of parameter patterns.
31. The method of claim 30, wherein referencing one or more additional fields hierarchically below the first data field for one or more identifiers corresponding to the parameter pattern or one or more related parameter patterns comprises referencing fields in a hierarchical thread, from higher levels in the thread to lower levels in the threads and stopping referencing fields in a thread when an identifier for a parameter pattern being referenced for does not appear in a field in the thread.
32. The method of claim 30, wherein referencing a first field for a first identifier corresponding to a first parameter pattern, comprises referencing parameter patterns for more relevant parameter patterns first.
33. The method of claim 32, wherein relevance is determined by at least one of linguistic analysis, date, or reliability of a document or record.
34. In a computing environment, a method of indexing parameter patterns for later search and retrieval, the method comprising: operating on one or more parameter patterns to generate a identifier code for each parameter pattern; sorting the identifier codes generated for the parameter patterns; and correlating offsets with sorted identifier codes where each offset is correlated to one of: a sorted identifier code, wherein the offset represents a portion of one or more identifier codes correlated to the offset, or to an indicator that indicates that a particular offset does not correspond to an identifier code for a parameter pattern.
35. The method of claim 34, wherein a given offset comprises a predetermined number of the most significant bits of an identifier code correlated to the given offset.
36. The method of claim 34, wherein sorting the identifier codes generated for the parameter patterns comprises sorting the identifier codes in an ascending order numerically.
37. The method of claim 34, wherein the indicator that indicates that a particular offset does not correspond to identifier code for a parameter pattern is a negative number.
38. The method of claim 34, wherein the indicator that indicates that a particular offset does not correspond to an identifier code for a parameter pattern includes an indicator for locating a next identifier code corresponding to a parameter pattern.
39. In a computing environment, a method of checking an index for entries, the method comprising: operating on a parameter pattern to create a identifier code; generating an offset from the identifier code; comparing the offset to a set of correlated offsets wherein the set of correlated offsets are correlated with sorted identifier codes where each offset is correlated to one of: a sorted identifier code, wherein the offset represents a portion of one or more pre-calculated identifier codes correlated to the offset, or to an indicator that indicates that a particular offset does not correspond to a pre-calculated identifier code for a parameter pattern; if a correlated offset matches the offset, and the correlated offset is correlated to a portion of one or more pre-calculated identifier codes, then comparing the identifier code to at least one of the one or more pre-calculated identifier codes; but if a correlated offset matches the offset, and the correlated offset is correlated to an indicator that indicates that a particular offset does not correspond to a pre-calculated identifier code for a parameter pattern, then returning an indication that the parameter pattern is not included in a set of parameter patterns.
40. The method of claim 39, wherein comparing the identifier code to at least one of the one or more pre-calculated identifier codes comprises: continuing comparing the identifier code to pre-calculated identifier codes until either the identifier code matches a pre-calculated identifier code, or until a compared pre-calculated identifier code is greater than the identifier code;
if the identifier code matches a pre-calculated identifier code, then returning an indication that the identifier code matches a pre-calculated identifier code; but if a compared pre-calculated identifier code is greater than the identifier code, then returning an indication that the parameter pattern is not included in a set of parameter patterns.
41. The method of claim 39, wherein generating an offset comprises selecting a predetermined number of the most significant bits of an identifier code correlated to the given offset.
42. In a computing environment, a data structure for indexing parameter patterns, the data structure comprising: a first field, the first field comprising pre-calculated identifier codes for parameter patterns, wherein the pre-calculated identifier codes are sorted in the first field; a second field, the second field comprising an enumeration of each of the pre-calculated identifier codes; and a third field, the third field comprising one or more offsets, wherein each offset is correlated to one of the enumerations in the second field or to an indicator that indicates that a particular offset does not correspond to a identifier code for a parameter pattern.
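The offset-based scheme of claims 34 to 42 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: identifier codes are taken to be 32-bit values cut from a SHA-256 digest, the offset is the 8 most significant bits of a code (claims 35 and 41), and the "no code here" indicator is -1 (a negative number, as in claim 37). The names `identifier_code`, `offset_of`, `build_index`, and `may_contain` are hypothetical.

```python
import hashlib

CODE_BITS = 32    # assumed width of an identifier code
PREFIX_BITS = 8   # assumed number of most significant bits used as the offset
EMPTY = -1        # negative marker: no identifier code shares this offset


def identifier_code(pattern: str) -> int:
    """Operate on a parameter pattern to produce an identifier code."""
    digest = hashlib.sha256(pattern.encode("utf-8")).digest()
    return int.from_bytes(digest[:CODE_BITS // 8], "big")


def offset_of(code: int) -> int:
    """The offset is the top PREFIX_BITS bits of the identifier code."""
    return code >> (CODE_BITS - PREFIX_BITS)


def build_index(patterns):
    """Sort the codes, then map each offset to the position of its first code."""
    codes = sorted({identifier_code(p) for p in patterns})
    table = [EMPTY] * (1 << PREFIX_BITS)
    for position, code in enumerate(codes):
        slot = offset_of(code)
        if table[slot] == EMPTY:
            table[slot] = position
    return codes, table


def may_contain(codes, table, pattern) -> bool:
    """Check the index; True means the pattern's code was found."""
    code = identifier_code(pattern)
    position = table[offset_of(code)]
    if position == EMPTY:
        return False  # the offset is marked empty, so the pattern is not indexed
    while position < len(codes):
        if codes[position] == code:
            return True
        if codes[position] > code:
            return False  # passed the place where the code would have been
        position += 1
    return False


codes, table = build_index(["alpha", "beta", "gamma"])
print(may_contain(codes, table, "beta"))
```

In this sketch the list positions in `codes` play the role of the enumeration in claim 42, and `table` plays the role of the offset field. A match still only means the pattern may be present, since two different patterns can share an identifier code.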
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/281,262 US8176052B2 (en) | 2006-03-03 | 2007-03-02 | Hyperspace index |
EP07757830A EP1999565A4 (en) | 2006-03-03 | 2007-03-02 | Hyperspace index |
US11/847,784 US8266152B2 (en) | 2006-03-03 | 2007-08-30 | Hashed indexing |
US13/312,022 US20120096008A1 (en) | 2006-03-03 | 2011-12-06 | Hyperspace index |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US77921406P | 2006-03-03 | 2006-03-03 | |
US60/779,214 | 2006-03-03 |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/847,784 Continuation-In-Part US8266152B2 (en) | 2006-03-03 | 2007-08-30 | Hashed indexing |
US13/312,022 Continuation US20120096008A1 (en) | 2006-03-03 | 2011-12-06 | Hyperspace index |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2007103815A2 (en) | 2007-09-13 |
WO2007103815A3 (en) | 2008-05-02 |
WO2007103815B1 (en) | 2008-07-10 |
Family
ID=38475735
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2007/063218 WO2007103815A2 (en) | 2006-03-03 | 2007-03-02 | Hyperspace index |
Country Status (3)
Country | Link |
---|---|
US (3) | US8176052B2 (en) |
EP (1) | EP1999565A4 (en) |
WO (1) | WO2007103815A2 (en) |
Families Citing this family (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070162481A1 (en) | 2006-01-10 | 2007-07-12 | Millett Ronald P | Pattern index |
US8266152B2 (en) | 2006-03-03 | 2012-09-11 | Perfect Search Corporation | Hashed indexing |
US8176052B2 (en) * | 2006-03-03 | 2012-05-08 | Perfect Search Corporation | Hyperspace index |
JP5193518B2 (en) * | 2007-07-13 | 2013-05-08 | 株式会社東芝 | Pattern search apparatus and method |
US7912840B2 (en) | 2007-08-30 | 2011-03-22 | Perfect Search Corporation | Indexing and filtering using composite data stores |
US7774353B2 (en) | 2007-08-30 | 2010-08-10 | Perfect Search Corporation | Search templates |
US7774347B2 (en) | 2007-08-30 | 2010-08-10 | Perfect Search Corporation | Vortex searching |
US7984019B2 (en) * | 2007-12-28 | 2011-07-19 | Knowledge Computing Corporation | Method and apparatus for loading data files into a data-warehouse system |
US7840546B2 (en) * | 2008-01-07 | 2010-11-23 | Knowledge Computing Corporation | Method and apparatus for conducting data queries using consolidation strings and inter-node consolidation |
US8032495B2 (en) | 2008-06-20 | 2011-10-04 | Perfect Search Corporation | Index compression |
US8037050B2 (en) * | 2008-08-02 | 2011-10-11 | Knowledge Computing Corporation | Methods and apparatus for performing multi-data-source, non-ETL queries and entity resolution |
US8386436B2 (en) * | 2008-09-30 | 2013-02-26 | Rainstor Limited | System and method for data storage |
US8131738B2 (en) * | 2008-12-30 | 2012-03-06 | International Business Machines Corporation | Search engine service utilizing hash algorithms |
WO2010135430A1 (en) * | 2009-05-19 | 2010-11-25 | Vmware, Inc. | Shortcut input/output in virtual machine systems |
JP5341209B2 (en) * | 2009-12-25 | 2013-11-13 | インターナショナル・ビジネス・マシーンズ・コーポレーション | System, method and program for checking pointer consistency in hierarchical database |
US9317536B2 (en) * | 2010-04-27 | 2016-04-19 | Cornell University | System and methods for mapping and searching objects in multidimensional space |
US8527546B2 (en) | 2010-11-25 | 2013-09-03 | International Business Machines Corporation | Generating a checkpoint image for use with an in-memory database |
US9811373B2 (en) * | 2011-02-09 | 2017-11-07 | Nec Corporation | Analysis engine control device |
US9015142B2 (en) * | 2011-06-10 | 2015-04-21 | Google Inc. | Identifying listings of multi-site entities based on user behavior signals |
US9155320B2 (en) * | 2011-07-06 | 2015-10-13 | International Business Machines Corporation | Prefix-based leaf node storage for database system |
US9715434B1 (en) * | 2011-09-30 | 2017-07-25 | EMC IP Holding Company LLC | System and method for estimating storage space needed to store data migrated from a source storage to a target storage |
US8738595B2 (en) | 2011-11-22 | 2014-05-27 | Navteq B.V. | Location based full text search |
US8745022B2 (en) | 2011-11-22 | 2014-06-03 | Navteq B.V. | Full text search based on interwoven string tokens |
US9009149B2 (en) * | 2011-12-06 | 2015-04-14 | The Trustees Of Columbia University In The City Of New York | Systems and methods for mobile search using Bag of Hash Bits and boundary reranking |
US8996467B2 (en) | 2011-12-29 | 2015-03-31 | Druva Inc. | Distributed scalable deduplicated data backup system |
US8700634B2 (en) * | 2011-12-29 | 2014-04-15 | Druva Inc. | Efficient deduplicated data storage with tiered indexing |
US8700661B2 (en) | 2012-04-12 | 2014-04-15 | Navteq B.V. | Full text search using R-trees |
US9898505B2 (en) * | 2012-09-27 | 2018-02-20 | Nec Corporation | Method, apparatus and program for transforming into binary data |
US9262423B2 (en) * | 2012-09-27 | 2016-02-16 | Microsoft Technology Licensing, Llc | Large scale file storage in cloud computing |
CN103345469B (en) * | 2013-05-24 | 2016-08-03 | 联动优势科技有限公司 | The storage of set of numbers, querying method and device thereof |
CN105630803B (en) * | 2014-10-30 | 2019-07-05 | 国际商业机器公司 | The method and apparatus that Document image analysis establishes index |
US10467215B2 (en) | 2015-06-23 | 2019-11-05 | Microsoft Technology Licensing, Llc | Matching documents using a bit vector search index |
US10242071B2 (en) | 2015-06-23 | 2019-03-26 | Microsoft Technology Licensing, Llc | Preliminary ranker for scoring matching documents |
US11392568B2 (en) | 2015-06-23 | 2022-07-19 | Microsoft Technology Licensing, Llc | Reducing matching documents for a search query |
US10733164B2 (en) | 2015-06-23 | 2020-08-04 | Microsoft Technology Licensing, Llc | Updating a bit vector search index |
US11281639B2 (en) | 2015-06-23 | 2022-03-22 | Microsoft Technology Licensing, Llc | Match fix-up to remove matching documents |
US10565198B2 (en) | 2015-06-23 | 2020-02-18 | Microsoft Technology Licensing, Llc | Bit vector search index using shards |
US10229143B2 (en) | 2015-06-23 | 2019-03-12 | Microsoft Technology Licensing, Llc | Storage and retrieval of data from a bit vector search index |
US10210210B2 (en) | 2015-10-21 | 2019-02-19 | International Business Machines Corporation | Adaptive multi-index access plan for database queries |
US11449554B2 (en) * | 2015-10-22 | 2022-09-20 | Mcafee, Llc | Extensible search solution for asset information |
US11200217B2 (en) | 2016-05-26 | 2021-12-14 | Perfect Search Corporation | Structured document indexing and searching |
CN107025263A (en) * | 2017-01-16 | 2017-08-08 | 中国银联股份有限公司 | Sentence analytic method for database statement |
US10915576B2 (en) * | 2019-03-26 | 2021-02-09 | Sap Se | High performance bloom filter |
Family Cites Families (84)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4817036A (en) * | 1985-03-15 | 1989-03-28 | Brigham Young University | Computer system and method for data base indexing and information retrieval |
US4961139A (en) | 1988-06-30 | 1990-10-02 | Hewlett-Packard Company | Data base management system for real-time applications |
US4961197A (en) * | 1988-09-07 | 1990-10-02 | Hitachi, Ltd. | Semiconductor laser device |
JPH04186447A (en) | 1990-11-21 | 1992-07-03 | Canon Inc | Directory management system |
US5699441A (en) | 1992-03-10 | 1997-12-16 | Hitachi, Ltd. | Continuous sign-language recognition apparatus and input apparatus |
US5530854A (en) * | 1992-09-25 | 1996-06-25 | At&T Corp | Shared tuple method and system for generating keys to access a database |
US5701459A (en) | 1993-01-13 | 1997-12-23 | Novell, Inc. | Method and apparatus for rapid full text index creation |
US5544352A (en) | 1993-06-14 | 1996-08-06 | Libertech, Inc. | Method and apparatus for indexing, searching and displaying data |
US5664179A (en) | 1995-06-27 | 1997-09-02 | Mci Corporation | Modified skip list database structure and method for access |
US5960194A (en) * | 1995-09-11 | 1999-09-28 | International Business Machines Corporation | Method for generating a multi-tiered index for partitioned data |
US5737734A (en) * | 1995-09-15 | 1998-04-07 | Infonautics Corporation | Query word relevance adjustment in a search of an information retrieval system |
US5761652A (en) * | 1996-03-20 | 1998-06-02 | International Business Machines Corporation | Constructing balanced multidimensional range-based bitmap indices |
US6216213B1 (en) * | 1996-06-07 | 2001-04-10 | Motorola, Inc. | Method and apparatus for compression, decompression, and execution of program code |
US6253188B1 (en) | 1996-09-20 | 2001-06-26 | Thomson Newspapers, Inc. | Automated interactive classified ad system for the internet |
US5799312A (en) * | 1996-11-26 | 1998-08-25 | International Business Machines Corporation | Three-dimensional affine-invariant hashing defined over any three-dimensional convex domain and producing uniformly-distributed hash keys |
US5852822A (en) * | 1996-12-09 | 1998-12-22 | Oracle Corporation | Index-only tables with nested group keys |
US6076051A (en) | 1997-03-07 | 2000-06-13 | Microsoft Corporation | Information retrieval utilizing semantic representation of text |
US6128613A (en) | 1997-06-26 | 2000-10-03 | The Chinese University Of Hong Kong | Method and apparatus for establishing topic word classes based on an entropy cost function to retrieve documents represented by the topic words |
US6018733A (en) * | 1997-09-12 | 2000-01-25 | Infoseek Corporation | Methods for iteratively and interactively performing collection selection in full text searches |
US6026398A (en) | 1997-10-16 | 2000-02-15 | Imarket, Incorporated | System and methods for searching and matching databases |
US6070164A (en) * | 1998-05-09 | 2000-05-30 | Information Systems Corporation | Database method and apparatus using hierarchical bit vector index structure |
US6216123B1 (en) * | 1998-06-24 | 2001-04-10 | Novell, Inc. | Method and system for rapid retrieval in a full text indexing system |
US6658626B1 (en) | 1998-07-31 | 2003-12-02 | The Regents Of The University Of California | User interface for displaying document comparison information |
US6584458B1 (en) * | 1999-02-19 | 2003-06-24 | Novell, Inc. | Method and apparatuses for creating a full text index accommodating child words |
US6516320B1 (en) * | 1999-03-08 | 2003-02-04 | Pliant Technologies, Inc. | Tiered hashing for data access |
US7181438B1 (en) * | 1999-07-21 | 2007-02-20 | Alberti Anemometer, Llc | Database access system |
US6879976B1 (en) * | 1999-08-19 | 2005-04-12 | Azi, Inc. | Data indexing using bit vectors |
US6772141B1 (en) | 1999-12-14 | 2004-08-03 | Novell, Inc. | Method and apparatus for organizing and using indexes utilizing a search decision table |
US6473729B1 (en) | 1999-12-20 | 2002-10-29 | Xerox Corporation | Word phrase translation using a phrase index |
AUPQ475799A0 (en) | 1999-12-20 | 2000-01-20 | Youramigo Pty Ltd | An internet indexing system and method |
US6678686B1 (en) | 1999-12-28 | 2004-01-13 | Ncr Corporation | Method and apparatus for evaluating index predicates on complex data types using virtual indexed streams |
US6584465B1 (en) * | 2000-02-25 | 2003-06-24 | Eastman Kodak Company | Method and system for search and retrieval of similar patterns |
US6947931B1 (en) * | 2000-04-06 | 2005-09-20 | International Business Machines Corporation | Longest prefix match (LPM) algorithm implementation for a network processor |
US6675163B1 (en) | 2000-04-06 | 2004-01-06 | International Business Machines Corporation | Full match (FM) search algorithm implementation for a network processor |
US6718325B1 (en) | 2000-06-14 | 2004-04-06 | Sun Microsystems, Inc. | Approximate string matcher for delimited strings |
US7660819B1 (en) | 2000-07-31 | 2010-02-09 | Alion Science And Technology Corporation | System for similar document detection |
US7328211B2 (en) * | 2000-09-21 | 2008-02-05 | Jpmorgan Chase Bank, N.A. | System and methods for improved linguistic pattern matching |
US6804664B1 (en) * | 2000-10-10 | 2004-10-12 | Netzero, Inc. | Encoded-data database for fast queries |
US7113943B2 (en) | 2000-12-06 | 2006-09-26 | Content Analyst Company, Llc | Method for document comparison and selection |
JP2002222210A (en) | 2001-01-25 | 2002-08-09 | Hitachi Ltd | Document search system, method therefor, and search server |
US6938046B2 (en) * | 2001-03-02 | 2005-08-30 | Dow Jones Reuters Business Interactive, Llp | Polyarchical data indexing and automatically generated hierarchical data indexing paths |
US6785677B1 (en) * | 2001-05-02 | 2004-08-31 | Unisys Corporation | Method for execution of query to search strings of characters that match pattern with a target string utilizing bit vector |
US6748401B2 (en) | 2001-10-11 | 2004-06-08 | International Business Machines Corporation | Method and system for dynamically managing hash pool data structures |
KR100483321B1 (en) | 2001-10-17 | 2005-04-15 | 한국과학기술원 | The Device and Method for Similarity Search Using Hyper-rectangle Based Multidimensional Data Segmentation |
US6985904B1 (en) * | 2002-02-28 | 2006-01-10 | Oracle International Corporation | Systems and methods for sharing of execution plans for similar database statements |
US6993533B1 (en) | 2002-03-25 | 2006-01-31 | Bif Technologies Corp. | Relational database drill-down convention and reporting tool |
US7266553B1 (en) * | 2002-07-01 | 2007-09-04 | Microsoft Corporation | Content data indexing |
US7653796B2 (en) | 2003-02-20 | 2010-01-26 | Panasonic Corporation | Information recording medium and region management method for a plurality of recording regions each managed by independent file system |
US20040225497A1 (en) | 2003-05-05 | 2004-11-11 | Callahan James Patrick | Compressed yet quickly searchable digital textual data format |
US7299221B2 (en) * | 2003-05-08 | 2007-11-20 | Oracle International Corporation | Progressive relaxation of search criteria |
JPWO2005008753A1 (en) | 2003-05-23 | 2006-11-16 | 株式会社ニコン | Template creation method and apparatus, pattern detection method, position detection method and apparatus, exposure method and apparatus, device manufacturing method, and template creation program |
US7296011B2 (en) * | 2003-06-20 | 2007-11-13 | Microsoft Corporation | Efficient fuzzy match for evaluating data records |
US20050022017A1 (en) | 2003-06-24 | 2005-01-27 | Maufer Thomas A. | Data structures and state tracking for network protocol processing |
US7467138B2 (en) | 2003-10-28 | 2008-12-16 | International Business Machines Corporation | Algorithm for sorting bit sequences in linear complexity |
US20050108394A1 (en) | 2003-11-05 | 2005-05-19 | Capital One Financial Corporation | Grid-based computing to search a network |
US20050131872A1 (en) * | 2003-12-16 | 2005-06-16 | Microsoft Corporation | Query recognizer |
US20060106793A1 (en) * | 2003-12-29 | 2006-05-18 | Ping Liang | Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation |
US7542971B2 (en) | 2004-02-02 | 2009-06-02 | Fuji Xerox Co., Ltd. | Systems and methods for collaborative note-taking |
US8055672B2 (en) | 2004-06-10 | 2011-11-08 | International Business Machines Corporation | Dynamic graphical database query and data mining interface |
US7836044B2 (en) | 2004-06-22 | 2010-11-16 | Google Inc. | Anticipated query generation and processing in a search engine |
US20060036649A1 (en) * | 2004-08-12 | 2006-02-16 | Simske Steven J | Index extraction from documents |
GB2418999A (en) | 2004-09-09 | 2006-04-12 | Surfcontrol Plc | Categorizing uniform resource locators |
JP2006091994A (en) | 2004-09-21 | 2006-04-06 | Toshiba Corp | Device, method and program for processing document information |
EP1846815A2 (en) * | 2005-01-31 | 2007-10-24 | Textdigger, Inc. | Method and system for semantic search and retrieval of electronic documents |
US7640363B2 (en) | 2005-02-16 | 2009-12-29 | Microsoft Corporation | Applications for remote differential compression |
US7685203B2 (en) | 2005-03-21 | 2010-03-23 | Oracle International Corporation | Mechanism for multi-domain indexes on XML documents |
US20060265396A1 (en) | 2005-05-19 | 2006-11-23 | Trimergent | Personalizable information networks |
US7467155B2 (en) * | 2005-07-12 | 2008-12-16 | Sand Technology Systems International, Inc. | Method and apparatus for representation of unstructured data |
US7548929B2 (en) | 2005-07-29 | 2009-06-16 | Yahoo! Inc. | System and method for determining semantically related terms |
US20070033165A1 (en) * | 2005-08-02 | 2007-02-08 | International Business Machines Corporation | Efficient evaluation of complex search queries |
US7840774B2 (en) * | 2005-09-09 | 2010-11-23 | International Business Machines Corporation | Compressibility checking avoidance |
JP2009508273A (en) | 2005-09-14 | 2009-02-26 | オー−ヤ!,インク. | Apparatus and method for indexing and searching networked information |
US7676517B2 (en) | 2005-10-14 | 2010-03-09 | Microsoft Corporation | Search results injected into client applications |
US20070162481A1 (en) | 2006-01-10 | 2007-07-12 | Millett Ronald P | Pattern index |
US20070175674A1 (en) | 2006-01-19 | 2007-08-02 | Intelliscience Corporation | Systems and methods for ranking terms found in a data product |
US20070203898A1 (en) * | 2006-02-24 | 2007-08-30 | Jonathan Lurie Carmona | Search methods and systems |
US8176052B2 (en) | 2006-03-03 | 2012-05-08 | Perfect Search Corporation | Hyperspace index |
US8266152B2 (en) | 2006-03-03 | 2012-09-11 | Perfect Search Corporation | Hashed indexing |
US7853555B2 (en) * | 2006-04-19 | 2010-12-14 | Raytheon Company | Enhancing multilingual data querying |
US8250075B2 (en) | 2006-12-22 | 2012-08-21 | Palo Alto Research Center Incorporated | System and method for generation of computer index files |
US7774347B2 (en) | 2007-08-30 | 2010-08-10 | Perfect Search Corporation | Vortex searching |
US7774353B2 (en) | 2007-08-30 | 2010-08-10 | Perfect Search Corporation | Search templates |
US7912840B2 (en) | 2007-08-30 | 2011-03-22 | Perfect Search Corporation | Indexing and filtering using composite data stores |
US8032495B2 (en) | 2008-06-20 | 2011-10-04 | Perfect Search Corporation | Index compression |
- 2007
  - 2007-03-02: US application US12/281,262 (US8176052B2), status: Active
  - 2007-03-02: US application US11/681,673 (US7644082B2), status: Active
  - 2007-03-02: WO application PCT/US2007/063218 (WO2007103815A2), status: Search and Examination
  - 2007-03-02: EP application EP07757830A (EP1999565A4), status: Withdrawn
- 2011
  - 2011-12-06: US application US13/312,022 (US20120096008A1), status: Abandoned
Also Published As
Publication number | Publication date |
---|---|
US20120096008A1 (en) | 2012-04-19 |
WO2007103815A2 (en) | 2007-09-13 |
EP1999565A2 (en) | 2008-12-10 |
US20090307184A1 (en) | 2009-12-10 |
US8176052B2 (en) | 2012-05-08 |
US20080059462A1 (en) | 2008-03-06 |
US7644082B2 (en) | 2010-01-05 |
WO2007103815A3 (en) | 2008-05-02 |
EP1999565A4 (en) | 2012-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2007103815B1 (en) | | Hyperspace index |
KR100946055B1 (en) | | Heterogeneous indexing for annotation systems |
US6236988B1 (en) | | Data retrieval system |
CN102648468B (en) | | Table search device, table search method, and table search system |
Rodriguez et al. | | Using WordNet to complement training information in text categorization |
US8266152B2 (en) | | Hashed indexing |
US20180075090A1 (en) | | Computer-Implemented System And Method For Identifying Similar Documents |
US4495566A (en) | | Method and means using digital data processing means for locating representations in a stored textual data base |
US8250075B2 (en) | | System and method for generation of computer index files |
US8108411B2 (en) | | Methods and systems for merging data sets |
US8140517B2 (en) | | Database query optimization using weight mapping to qualify an index |
CA2493443A1 (en) | | Systems and methods of building and using custom word lists |
CN102456016B (en) | | Method and device for sequencing search results |
US7949657B2 (en) | | Detecting zero-result search queries |
US7774353B2 (en) | | Search templates |
CN115080684B (en) | | Network disk document indexing method and device, network disk and storage medium |
US6735584B1 (en) | | Accessing a database using user-defined attributes |
US20090234837A1 (en) | | Search query |
US6901396B1 (en) | | Packed radix search tree implementation |
US20090063454A1 (en) | | Vortex searching |
Garron et al. | | Applying Latent Semantic Indexing on the TREC 2010 Legal Dataset. |
US20050256823A1 (en) | | Memory, method, and program product for organizing data using a compressed trie table |
Teufel | | Natural language documents: Indexing and retrieval in an information system |
SPEEDY et al. | | MARY ANN SELLY, Director, Decision Support Software, Inc., McLean, Virginia: The Analytical Hierarchy Process and the Personal Computer |
EP1227410A1 (en) | | Accessing a database using user-defined attributes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | WWE | Wipo information: entry into national phase | Ref document number: 11847784; Country of ref document: US |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
 | DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | WWE | Wipo information: entry into national phase | Ref document number: 2007757830; Country of ref document: EP |
 | WWE | Wipo information: entry into national phase | Ref document number: 12281262; Country of ref document: US |
 | DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |