WO1999023583A1 - Data storage and retrieval using unique identifiers - Google Patents
Data storage and retrieval using unique identifiers
- Publication number
- WO1999023583A1 PCT/US1998/023125 US9823125W WO9923583A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- identification number
- group
- document
- unique
- directories
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
Definitions
- THIS INVENTION relates to data storage and retrieval. It relates in particular to a method of storing a plurality of documents in a database and to a method of retrieving data from a database. Further, it relates to an arrangement of data in a database.
- a typical application of the storage of documents in the form of digital images is in the medical field, e.g. the storage of a claim document submitted by a doctor to a medical aid fund.
- a digital image of the original claim document is stored in a storage medium, e.g. a CD ROM or the like.
- Selected information is also manually read from the claim document and entered into an independent storage medium, thereby creating an abridgement of the original document. Retrieval of the information from the independent storage medium is generally fairly rapid. However, when full details of the claim are required, the digital image of the original document usually has to be consulted.
- an arrangement of data in a database including a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in said group of file locations divided by NFG files.
- the path to a selected document may be derived from the unique primary identification number.
- the number of file locations may be a preselected number of file locations which corresponds to a number of documents which are capable of being stored in at least a particular section of the database.
- the file locations may include digital images of the documents which are captured by means of a conventional scanner.
- the file locations may be defined by a preselected number of base directories, each group of the file locations including NFG base directories and each base directory being designated by a primary identification number.
- the number of base directories in each group of file locations may be less than about 1000, preferably less than about 250.
- Each secondary identification number may be associated with a directory used in a conventional computer system, the directory being designated by the secondary identification number and each base directory being defined by a sub-directory of said directory.
- the primary identification number is typically the document number and the documents are preferably sequentially numbered.
- each unique secondary identification number may be associated with, typically being the name or label of, a directory of a conventional directory/sub-directory arrangement used in conventional computer systems and each unique primary identification number may be associated with, typically being the name or label of, a sub-directory of said directory.
- each file location may be a sub-directory in which a digital image of the document is stored and which is labelled or named with the unique primary identification number associated with the document.
- the unique primary identification number is typically the document number.
- the database is typically arranged in a hierarchical or so-called "root" structure of directories in which the file locations are each defined by a sub-directory at a base level L B in the hierarchical structure.
- Each group of directories at one level above the base level (level L B + 1) in the structure may include NFGL B + 1 sub-directories, each of which has a secondary identification number designated by the absolute value of the document number of the first document in the group divided by NFGL B + 1.
- Each level L B + n may include a plurality of groups of directories, each group of directories including NFGL B + n directories at an immediately lower level
- Each group of directories at level L B + n may include a unique secondary identification number which is defined by the absolute value of the unique secondary identification number of the first sub-directory in the group of directories at level L B + n - 1 divided by NFGL B + n.
- the number of groups of directories at level L B + n is typically between about 2 and about 10 times the number of groups of directories at level L B + n - 1.
- an even number of groups of directories NFGL B + n is provided at each level L B + n.
- a method of storing a plurality of documents in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of each document to be stored in the database; determining the secondary identification number by taking the absolute value of the primary identification number of the document to be stored and dividing it by NFG; and storing the document in a file location in the form of a directory which is identified from the unique primary and secondary identification numbers.
- the database may be arranged in a hierarchical directory structure.
- the method may include iteratively identifying an associated directory of a group of directories at one level higher (level L B + n + 1) by dividing the unique secondary identification number of the directory at level L B + n by the number of directories NFGL B + n + 1 in the group.
- the method may include scanning an original copy of the document to obtain a digital image thereof, and storing the digital image of the document in the file location.
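The storing steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `store_document`, the file name `image.bin` and the default group size of 10 are hypothetical, and "absolute value" is read as the integer part of the quotient, since the description's own worked example reduces 1.5 to 1.

```python
from pathlib import Path


def store_document(root: Path, primary_id: int, image: bytes, nfg: int = 10) -> Path:
    """Store a scanned image in a directory named after its document number.

    Assumed layout (hypothetical): <root>/<secondary id>/<primary id>/image.bin
    """
    if primary_id < 0 or nfg <= 0:
        raise ValueError("primary_id must be non-negative and nfg positive")

    # Secondary identification number: integer part of primary_id / NFG.
    secondary_id = primary_id // nfg

    # The file location is a sub-directory labelled with the primary ID,
    # placed inside the directory labelled with the secondary ID.
    directory = root / str(secondary_id) / str(primary_id)
    directory.mkdir(parents=True, exist_ok=True)

    target = directory / "image.bin"  # hypothetical file name
    target.write_bytes(image)
    return target
```

Because the directory names are computed from the document number alone, no separate index of storage locations needs to be kept for this step.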
- a method of identifying a path to one of a plurality of file locations in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database; and dividing the unique primary identification number by the number of file locations NFG in the group and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document.
- the database may be arranged in a hierarchical directory structure.
- the method may include iteratively identifying an associated directory of a group of directories at one level higher (level L B + n + 1) by dividing the unique secondary identification number of the directory at level L B + n by the number of directories NFGL B + n + 1 in the group.
- a method of retrieving data from a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database, and dividing the unique primary identification number by the number of file locations NFG in the group, and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document; and reading the data stored in the directory via the path.
- the unique primary identification number may be associated with a name of a legal entity e.g. a natural person, a business, or the like.
- the method may include searching for the name of the legal entity in a conventional manner and retrieving the unique primary identification number thereby to identify a name and path of the directory in which the document has been stored.
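The name-based lookup can be illustrated with a short Python sketch. It assumes a plain dictionary stands in for the "conventional" name search, and the names `client_index` and `path_for_client`, the sample client and the group size of 10 are all hypothetical; the division follows the path-identification method (primary identification number divided by NFG, integer part kept).

```python
def path_for_client(name: str, client_index: dict, nfg: int = 10) -> str:
    """Look up a client's document number and derive the directory path.

    client_index is a hypothetical mapping from entity name to the
    unique primary identification number of the stored document.
    """
    primary_id = client_index[name]      # raises KeyError for unknown names
    secondary_id = primary_id // nfg     # integer part of primary_id / NFG
    return f"{secondary_id}/{primary_id}"


client_index = {"Example Client": 15}    # illustrative data only
print(path_for_client("Example Client", client_index))  # prints 1/15
```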
- a data management installation which includes reading means for reading data from data storage means which includes a digital image of a plurality of documents arranged in an hierarchical structure in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number, and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files; input means for receiving the primary identification number of a document to be retrieved from the storage means; and processing means arranged to identify a path in the hierarchical structure to the file location in which the document has been stored, the path being derived from a unique primary identification number of the document.
- the installation may include interface means for interfacing the installation to a conventional data management installation which selectively accesses abridgements of documents e.g. abridgements of medical aid claims or the like which have been stored in the form of a digital image.
- the installation may be arranged to receive a document number from the abridgement, the document number being translated into a unique primary identification number thereby to permit a digital image of the entire original document to be retrieved.
- the storage means may be a plurality of CD ROMs which define the database, the reading means being a so-called "CD jukebox".
- Figure 1 shows a schematic diagram of a data management installation in accordance with the invention;
- Figure 2 shows a schematic diagram of an arrangement of data in a database, also in accordance with the invention, of the installation of Figure 1; and
- Figure 3 shows a sub-section of a directory structure which is arranged in a similar fashion to that of Figure 2.
- reference numeral 10 generally indicates a data management installation in accordance with the invention.
- the installation 10 includes a data capturing sub-section 12, a data storage sub-section 14, and a data retrieval sub-section 16.
- the installation 10 is configured or arranged to store a digital image of each of a substantial number of documents for subsequent retrieval from a database arrangement as described in more detail below.
- the data capturing sub-system 12 includes a conventional digital scanner 18 which scans a substantial number of documents 20 and feeds a digital image of each document 20 into storage means 22 as shown by arrow 24. Once the documents 20 have been scanned, they are physically stored in a warehouse or discarded as indicated by arrow 26.
- the documents 20 are numbered sequentially and each number defines a unique primary identification number which is associated with a particular document 20.
- the installation 10 includes interface means, as indicated by arrow 28, for interfacing the installation 10 to a conventional computer system.
- the documents 20 are typically claim forms received by a medical aid company from various medical practitioners.
- the data management installation 10 stores each document 20 in a unique fashion in an arrangement of data in a database in the storage means 22.
- the database is arranged in an hierarchical or so-called "root" structure 32 (see Figure 2).
- the structure 32 of the arrangement includes 100 directories or file locations, only a few of which are referenced in the drawings by reference numeral 34.
- the file locations or directories 34 are divided into a number of groups of directories 36.1, 36.2, 36.3 and so on, each comprising 10 directories 34, this being the number of files in the group (NFG).
- Each directory 34 has as its title or name the unique primary identification number of a document 20 intended to be stored therein. Thus, each document 20 is stored in a specific location to facilitate subsequent retrieval thereof.
- the directories 34 are located at a base level L B in the hierarchical structure 32 as indicated by arrow 38.
- the group of directories 36.1 is associated with a directory 40 at a level L B + 1, which is one level higher in the hierarchical structure 32.
- further directories 42 to 58 are provided at level L B + 1, each of which is associated with 10 file locations (NFG) or directories 34 at level L B, each file location or directory 34 bearing the name or label of the unique identification number of the document 20 to be stored therein.
- the directories 40 to 58 at level L B + 1 are grouped into two groups of directories 60, 62, at a level L B + 2, each group having 5 (NFGL B + 2) sub-directories.
- the groups of directories 60, 62 are grouped or branch out from a further directory 64 which bears a label "1-100" and which is thus representative of the range of documents 20 having unique primary identification numbers between 1 and 100 which are associated with the directory.
- the directory 64 has 2 (NFGL B + 3) directories in its group.
- the various names of the directories 64, 60, 62, 40 to 58, and 34 are in the form of reference numerals which are allocated in a specific fashion.
- the name of the directory 40 defines a unique secondary identification number which is defined by the absolute value of the first unique primary identification number 34.1 in the group of directories 36.1 divided by the total number of directories or file locations (NFG) in the group of directories 36.1.
- the name of the directory 40 is then defined by the absolute value of 0 divided by 10, which is 0, as shown in Figure 2.
- for the directory 42, its name or label is defined by the absolute value of the first unique primary identification number 34.2 in a second group of directories 36.2 divided by the number of files or directories (NFG) in the particular group, i.e. the absolute value of 10 divided by 10, which is equal to 1.
- the unique secondary identification numbers which define the names of the directories 44 to 58 are determined.
- the label or name of the group of directories 60 is defined by the absolute value of the unique secondary identification number "0", which is the name of the first group of directories 40 at an immediately lower level L B + n - 1, divided by the number of groups of directories at an immediately lower level, i.e. 5, thus providing a result of 0 as shown in Figure 2.
- the name of the group of directories 62 is derived from the first unique secondary identification number, which is the file name of the group of directories 50, i.e. 5, divided by 5 (NFGL B + 2), which equals 1.
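The naming arithmetic for the Figure 2 example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper `group_labels` and the variable names are hypothetical, the group sizes (10 at the base level, 5 at level L B + 2) are taken from the example, and "absolute value" is treated as integer division.

```python
NFG = 10          # file locations per group at base level L B (Figure 2 example)
NFG_LEVEL_2 = 5   # directories per group at level L B + 2 (Figure 2 example)


def group_labels(first_ids, group_size):
    """Label each group with the integer part of (first ID in group / group size)."""
    return [first // group_size for first in first_ids]


# First primary identification number in each of the ten base-level groups.
first_doc_numbers = list(range(0, 100, NFG))            # 0, 10, ..., 90

# Names of the ten directories 40 to 58 at level L B + 1.
level_1_labels = group_labels(first_doc_numbers, NFG)   # 0, 1, ..., 9

# First level L B + 1 label in each group of five, then the names of
# the two directories 60 and 62 at level L B + 2.
first_level_1 = level_1_labels[::NFG_LEVEL_2]           # 0, 5
level_2_labels = group_labels(first_level_1, NFG_LEVEL_2)

print(level_1_labels)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(level_2_labels)  # [0, 1]
```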
- the hierarchical structure may comprise a plurality of different levels.
- the number of different levels depends upon the number of documents which are to be stored in the hierarchy. Further, the fewer the levels, i.e. the flatter the hierarchical structure is, the simpler the path is to the particular directory in which the document is stored, and thus retrieval times may be reduced in comparison to a very pointed hierarchical structure in which a number of levels are included.
- a plurality of mutually independent hierarchical structures, one of which is shown in Figure 3, may be used.
- the number of file locations or directories 34 at the base level L B in the hierarchical structure 38 is typically less than about 1000 and, more preferably, less than about 250.
- the hierarchical structure 38 may thus include a plurality of levels extending above base level L B, each level including a group of directories at a level L B + n having NFGL B + n directories in the group.
- each directory in a group of directories branches out or extends into NFGL B + n groups of directories at an immediately lower level L B + n - 1.
- the name or secondary identification number of each group of sub-directories is then determined in a similar fashion as described above.
- the number of groups of directories at level L B + n is typically between about 2 and about 10 times the number of groups of directories at level L B + n - 1.
- the number of levels L B + n is dependent upon the number of documents at the base level L B in the hierarchical structure.
- a digital image of each document 20 is stored on a plurality of compact discs 70 as shown in Figure 1.
- the compact discs 70 may form part of a library of information on various transactions or claims which have been submitted to the medical aid via the various doctors.
- Certain of the compact discs may be loaded in a CD jukebox 72 to provide a near line facility and other compact discs may be loaded in a CD tower 74 to provide an on-line facility as shown by arrows 76, 78 respectively.
- the database is stored on magnetic media 80.
- the installation 10 includes computing means 82 (see Figure 1) which is arranged to generate a variety of user-friendly screens to assist in instructing the computing means 82 to perform various retrieval functions.
- the computing means 82 is programmed so that an indexed field window 84 prompts a user to enter a client name 86 via a keyboard (not shown).
- the computing means 82 retrieves the unique primary identification number 88 which is associated with the client name 86.
- the unique primary identification number is then fed to a unique key of documents screen 90 which has a search prompt 92 which may be activated with a mouse to initiate retrieval of a selected document from the database.
- the path to the particular file location or directory 34 in which the document has been stored is derived directly from the unique primary identification number which defines the name of the file location or directory 34 in which the document 20 has been stored.
- the relevant directories in the groups of directories at the various levels L B + n in the hierarchical structure 32 must be determined.
- the name of the actual directory 34 in which the document 20 has been stored is determined as indicated above and the particular directory in the group of directories is then determined by taking the absolute value of the unique primary identification number or document number divided by the number of file locations or directories 34 (NFG) at base level L B.
- the path to the relevant document may be reconstructed and thus retrieval time may be reduced.
- the unique primary identification number, i.e. "10", is divided by the number of directories NFG in the group at the base level L B, i.e. "10", and the absolute value thereof is taken, i.e. directory 42 labelled "1" is identified at level L B + 1.
- the absolute value is taken of the unique primary identification number or document number 15 divided by the number of files in the group of directories 36, i.e. the result is the absolute value (here, the integer part) of 1.5, which is 1.
- a particular sub-directory at level L B + n is determined and thus the path to the document may be determined.
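The path reconstruction described above can be sketched as a repeated integer division up the levels. The function names `directory_ids` and `document_path` are hypothetical, and the root label "1-100" and group sizes (10, then 5) are taken from the Figure 2 example; this is one possible reading of the described method, not a definitive implementation.

```python
from pathlib import PurePosixPath


def directory_ids(primary_id: int, group_sizes: list) -> list:
    """Return the directory labels at levels L B + 1, L B + 2, ... in that order.

    Each label is the integer part of the label one level below divided by
    the group size of the level being computed.
    """
    if primary_id < 0:
        raise ValueError("document numbers are assumed to be non-negative")
    ids = []
    current = primary_id
    for size in group_sizes:
        current = current // size
        ids.append(current)
    return ids


def document_path(primary_id: int, group_sizes: list, root: str = "1-100") -> PurePosixPath:
    """Build the path from the top directory down to the document's directory."""
    ids = directory_ids(primary_id, group_sizes)
    # Highest level first, base-level directory (the document number) last.
    parts = [root] + [str(i) for i in reversed(ids)] + [str(primary_id)]
    return PurePosixPath(*parts)


print(document_path(15, [10, 5]))  # 1-100/0/1/15
print(document_path(10, [10, 5]))  # 1-100/0/1/10
```

Adding a further level only requires appending its group size to `group_sizes`, which mirrors the iterative step from level L B + n to level L B + n + 1.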
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002307226A CA2307226A1 (en) | 1997-11-03 | 1998-10-30 | Data storage and retrieval using unique identifiers |
AU12074/99A AU1207499A (en) | 1997-11-03 | 1998-10-30 | Data storage and retrieval using unique identifiers |
APAP/P/2000/001798A AP2000001798A0 (en) | 1997-11-03 | 1998-10-30 | Image data storage and retrieval. |
EP98955217A EP1034490A4 (en) | 1997-11-03 | 1998-10-30 | Data storage and retrieval using unique identifiers |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
ZA97/9873 | 1997-11-03 | | |
ZA979873 | 1997-11-03 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1999023583A1 (en) | 1999-05-14 |
Family
ID=25586686
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1998/023125 WO1999023583A1 (en) | 1997-11-03 | 1998-10-30 | Data storage and retrieval using unique identifiers |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1034490A4 (en) |
AP (1) | AP2000001798A0 (en) |
AU (1) | AU1207499A (en) |
CA (1) | CA2307226A1 (en) |
WO (1) | WO1999023583A1 (en) |
ZA (1) | ZA989947B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5058162A (en) * | 1990-08-09 | 1991-10-15 | Hewlett-Packard Company | Method of distributing computer data files |
US5204958A (en) * | 1991-06-27 | 1993-04-20 | Digital Equipment Corporation | System and method for efficiently indexing and storing a large database with high data insertion frequency |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS56162162A (en) * | 1980-05-16 | 1981-12-12 | Toshiba Corp | Data storing device having variable data structure |
1998
- 1998-10-30 AU: application AU12074/99A, patent AU1207499A (en); status: not active (Abandoned)
- 1998-10-30 AP: application APAP/P/2000/001798A, patent AP2000001798A0 (en); status: unknown
- 1998-10-30 ZA: application ZA989947A, patent ZA989947B (en); status: unknown
- 1998-10-30 EP: application EP98955217A, patent EP1034490A4 (en); status: not active (Withdrawn)
- 1998-10-30 WO: application PCT/US1998/023125, patent WO1999023583A1 (en); status: not active (Application Discontinuation)
- 1998-10-30 CA: application CA002307226A, patent CA2307226A1 (en); status: not active (Abandoned)
Non-Patent Citations (2)
Title |
---|
HSIAO Y.S. et al., "Adaptive Hashing", INFORM. SYSTEMS, 1988, Vol. 13, No. 1, pages 111-127. * |
See also references of EP1034490A4 * |
Also Published As
Publication number | Publication date |
---|---|
CA2307226A1 (en) | 1999-05-14 |
AP2000001798A0 (en) | 2000-06-30 |
EP1034490A4 (en) | 2001-02-07 |
AU1207499A (en) | 1999-05-24 |
EP1034490A1 (en) | 2000-09-13 |
ZA989947B (en) | 1999-05-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1 Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | ENP | Entry into the national phase | Ref document number: 2307226 Country of ref document: CA Ref country code: CA Ref document number: 2307226 Kind code of ref document: A Format of ref document f/p: F |
| | WWE | Wipo information: entry into national phase | Ref document number: 12074/99 Country of ref document: AU |
| | NENP | Non-entry into the national phase | Ref country code: KR |
| | WWE | Wipo information: entry into national phase | Ref document number: 1998955217 Country of ref document: EP |
| | REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
| | WWP | Wipo information: published in national office | Ref document number: 1998955217 Country of ref document: EP |
| | WWW | Wipo information: withdrawn in national office | Ref document number: 1998955217 Country of ref document: EP |