UST900006I4 - Information storage and retrieval system and method - Google Patents

Information storage and retrieval system and method

Info

Publication number
UST900006I4
Authority
US
United States
Prior art keywords
documents
search
stored
occurrence
data base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06K: GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K17/00: Methods or arrangements for effecting co-operative working between equipments covered by two or more of main groups G06K1/00 - G06K15/00, e.g. automatic card files incorporating conveying and reading operations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9017: Indexing; Data structures therefor; Storage structures using directory or table look-up
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99933: Query processing, i.e. searching
    • Y10S707/99935: Query augmenting and refining, e.g. inexact access

Definitions

  • A probability calculator determines the likelihood of the occurrence of a search symbol at least once in a given stored document in the system's data base.
  • The system and method cause the query text to be scanned so as to determine which search symbols are contained therein.
  • The term "overlap" designates a search symbol that occurs both in the query and in a given stored document.
  • The particular document in the system's data base having the smallest joint probability of occurrence of overlap search symbols is designated as having the highest relevance potential within the data base to a given query.
  • The stored document having the next larger joint probability of occurrence of overlap search symbols has the next highest relevance potential. In this manner, any selected number of relevant stored documents may be outputted by the system and method, in the order of relevance, as either identification numbers of potentially pertinent documents or as the documents per se in full text form.
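
The definitions above do not say how the probability calculator arrives at its values. As a hedged illustration only, one simple model that fits the wording is sketched below. The symbols f_s, n_d, p, O and J, and the independence assumption behind both formulas, are assumptions introduced here and are not taken from the publication.

```latex
% Assumed model (not stated in the source):
%   f_s = relative frequency of search symbol s across the whole data base
%   n_d = number of word positions in stored document d
% Probability that s occurs at least once in d, assuming independent positions:
\[
  p(s, d) = 1 - (1 - f_s)^{n_d}
\]
% O(q, d) = set of overlap symbols, i.e. search symbols found in both
% the query q and the stored document d.
% Joint probability of the overlap, assuming independent symbols:
\[
  J(q, d) = \prod_{s \in O(q, d)} p(s, d)
\]
% Ranking rule from the text: the smaller J(q, d), the higher the
% relevance potential of document d for query q.
```

Under these assumptions, a symbol with f_s = 0.01 in a 100-word document gives p ≈ 0.634, while a rarer symbol with f_s = 0.001 gives p ≈ 0.095. A document sharing both symbols with the query has J ≈ 0.060, so it ranks ahead of a document of the same length sharing only the common symbol (J ≈ 0.634). Rare shared symbols therefore count for more than common ones.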

Abstract

AN INFORMATION STORAGE AND RETRIEVAL SYSTEM AND METHOD CAPABLE OF HANDLING QUERIES IN SENTENCE FORM AND PRESENTING RESPONSES EITHER AS IDENTIFICATION NUMBERS OF POTENTIALLY PERTINENT DOCUMENTS OR AS DOCUMENTS IN FULL TEXT FORM. STORED DOCUMENTS CONTAINING SO-CALLED SEARCH SYMBOLS ARE KNOWN TO THE SYSTEM. A PROBABILITY CALCULATOR DETERMINES THE LIKELIHOOD OF THE OCCURRENCE OF A SEARCH SYMBOL AT LEAST ONCE IN A GIVEN STORED DOCUMENT IN THE SYSTEM'S DATA BASE. THE SYSTEM AND METHOD CAUSES THE QUERY TEXT TO BE SCANNED SO AS TO DETERMINE WHICH SEARCH SYMBOLS ARE CONTAINED THEREIN. THE TERM OVERLAP IS USED TO DESIGNATE A SEARCH SYMBOL IN THE QUERY AND ALSO IN A GIVEN STORED DOCUMENT. THE PARTICULAR DOCUMENT IN THE SYSTEM'S DATA BASE HAVING THE SMALLEST JOINT PROBABILITY OF OCCURRENCE OF OVERLAP SEARCH SYMBOLS IS DESIGNATED AS HAVING THE HIGHEST RELEVANCE POTENTIAL WITHIN THE DATA BASE TO A GIVEN QUERY. THE STORED DOCUMENT HAVING THE NEXT LARGER JOINT PROBABILITY OF OCCURRENCE OF OVERLAP SEARCH SYMBOLS HAS THE NEXT HIGHEST RELEVANCE POTENTIAL. IN THIS MANNER, ANY SELECTED NUMBER OF RELEVANT STORED DOCUMENTS MAY BE OUTPUTTED BY THE SYSTEM AND METHOD, IN THE ORDER OF RELEVANCE, AS EITHER IDENTIFICATION NUMBERS OF POTENTIALLY PERTINENT DOCUMENTS OR AS THE DOCUMENTS PER SE IN FULL FORM.

Description

DEFENSIVE PUBLICATION
UNITED STATES PATENT OFFICE
Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection and reproduction may be purchased for 30 cents a sheet. Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.
PUBLISHED JULY 18, 1972
T900,006
INFORMATION STORAGE AND RETRIEVAL SYSTEM AND METHOD
Matthews P. Perriens, Rockville, Md., and John H. Williams, Jr., Annandale, Va., assignors to International Business Machines Corporation, Armonk, N.
Continuation of application Ser. No. 736,837, June 13, 1968. This application Apr. 19, 1971, Ser. No. 135,467
Int. Cl. G06f 1/00, 7/00, 15/00
U.S. Cl. 340-172.5
3 Sheets Drawing. 21 Pages Specification
[Drawing: block diagram of the system; labels largely illegible in this extraction.]
An information storage and retrieval system and method capable of handling queries in sentence form and presenting responses either as identification numbers of potentially pertinent documents or as documents in full text form. Stored documents containing so-called search symbols are known to the system. A probability calculator determines the likelihood of the occurrence of a search symbol at least once in a given stored document in the system's data base. The system and method causes the query text to be scanned so as to determine which search symbols are contained therein. The term overlap is used to designate a search symbol in the query and also in a given stored document. The particular document in the system's data base having the smallest joint probability of occurrence of overlap search symbols is designated as having the highest relevance potential within the data base to a given query. The stored document having the next larger joint probability of occurrence of overlap search symbols has the next highest relevance potential. In this manner, any select number of relevant stored documents may be outputted by the system and method, in the order of relevance, as either identification numbers of potentially pertinent documents or as the documents per se in full text form.
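
The ranking procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the publication: the text does not say how the probability calculator works, so the sketch assumes that a symbol's chance of appearing at least once in a document is 1 - (1 - f)^n, where f is the symbol's relative frequency in the whole data base and n is the document length in words, and that symbols occur independently. The function name `rank_documents`, its parameters, and the sample data are all hypothetical.

```python
from collections import Counter
from math import prod


def rank_documents(documents, query, search_symbols, top_k=3):
    """Rank stored documents by ascending joint probability of overlap symbols.

    documents: dict mapping a document identification number to its text.
    search_symbols: set of words the system treats as search symbols.
    Returns up to top_k document ids, most relevant first.
    """
    # Naive whitespace tokenization (an assumption made for this sketch).
    doc_tokens = {d: text.lower().split() for d, text in documents.items()}
    total_words = sum(len(toks) for toks in doc_tokens.values())

    # Assumed statistic: relative frequency of each symbol in the data base.
    counts = Counter(
        w for toks in doc_tokens.values() for w in toks if w in search_symbols
    )
    freq = {s: counts[s] / total_words for s in search_symbols}

    # Scan the query to find which search symbols it contains.
    query_symbols = set(query.lower().split()) & search_symbols

    scored = []
    for doc_id, toks in doc_tokens.items():
        # Overlap: search symbols present in both the query and this document.
        overlap = query_symbols & set(toks)
        if not overlap:
            continue  # no overlap, so the document is never output
        n = len(toks)
        # Joint probability of the overlap, assuming independent symbols.
        joint = prod(1 - (1 - freq[s]) ** n for s in overlap)
        scored.append((joint, doc_id))

    scored.sort()  # smallest joint probability = highest relevance potential
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical sample data, for illustration only.
docs = {
    "doc1": "the cat sat on the mat",
    "doc2": "the cat chased the dog",
    "doc3": "stocks fell sharply today",
}
symbols = {"cat", "dog", "mat", "stocks"}
ranking = rank_documents(docs, "did the cat chase a dog", symbols)
print(ranking)
```

In this sample, doc2 shares two symbols with the query, one of them rare, so its joint probability is smaller than that of doc1, which shares only "cat". doc3 has no overlap and is never output. Returning document ids corresponds to the "identification numbers" output mode; looking the ids up in `docs` would give the full-text mode.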
July 18, 1972. M. P. Perriens et al. T900,006. INFORMATION STORAGE AND RETRIEVAL SYSTEM AND METHOD. Original Filed June 13, 1968. 3 Sheets.
[Sheet 1, FIG. 1: block diagram of the system. Legible labels (grouping uncertain): INPUT, CONCORDANCE, SUBSET LIST, JOINT PROBABILITY LIST, GENERATOR, PROBABILITY CALCULATOR, LIST INVERTER, LIST REARRANGER, DOCUMENT, SCANNER, SELECTOR, OUTPUT.]
[Sheet 2, FIG. 2: computer block diagram. Legible labels: INPUT, OUTPUT, INPUT OUTPUT CHANNEL, CORE STORAGE UNIT, ARITHMETIC UNIT, CONTROL UNIT.]
[Sheet 2, FIG. 3 onward: a concordance table relating search symbols to document numbers, followed by document and list tables. Table entries are not recoverable from this extraction.]
US900006D 1971-04-19 1971-04-19 Information storage and retrieval system and method Pending UST900006I4 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13546771A 1971-04-19 1971-04-19

Publications (1)

Publication Number Publication Date
UST900006I4 (en) 1972-07-18

Family

ID=22468234

Family Applications (1)

Application Number Title Priority Date Filing Date
US900006D Pending UST900006I4 (en) 1971-04-19 1971-04-19 Information storage and retrieval system and method

Country Status (1)

Country Link
US (1) UST900006I4 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5072367A (en) * 1987-10-01 1991-12-10 International Business Machines Corporation System using two passes searching to locate record having only parameters and corresponding values of an input record

Similar Documents

Publication Publication Date Title
US6523030B1 (en) Sort system for merging database entries
Yarlott et al. Identifying the discourse function of news article paragraphs
JPS6175957A (en) Mechanical translation processor
UST900006I4 (en) Information storage and retrieval system and method
Terasawa et al. Locality sensitive pseudo-code for document images
Hutchinson Protecting privacy in the archives: Supervised machine learning and born-digital records
US3293615A (en) Current addressing system
Freeman AUDACIOUS: An Experiment with an On-line, Interactive Reference Retrieval System Using the Universal Decimal Classification as the Index Language in the Field of Nuclear Science
Miller et al. A multi-level file structure for information processing
Jaster et al. The state of the art of coordinate indexing
Oettinger et al. Linguistic and machine methods for compiling and updating the harvard automatic dictionary
Bowman et al. A chemically oriented information storage and retrieval system. I. storage and verification of structural information
Watson et al. The use of the ASSASSIN system by Central Electricity Generating Board Technical Information Unit
Warheit The direct access search system
Dataset Document Visual Question Answering with CIVQA
Josselson Research in Machine Translation
Zunde Automatic indexing
Costello Jr Computer requirements for inverted coordinate indexes
Sharpe et al. Machine-independent system for processing medical text
Malone Dictionary of Literary Biography
Adler Judaica Automation in Israel—An Overview
Stone Standards for computer-aided content analysis: The Pisa conventions and recommendations
Gutenmakher et al. INFORMATION MACHINE
Taube et al. Communications Theory and Storage and Retrieval Systems
Lindsley Application of IBM TEXT‐PAC to the ASIS file management exercise