WO1999023583A1 - Data storage and retrieval using unique identifiers - Google Patents

Data storage and retrieval using unique identifiers

Info

Publication number
WO1999023583A1
Authority
WO
WIPO (PCT)
Prior art keywords
identification number
group
document
unique
directories
Prior art date
Application number
PCT/US1998/023125
Other languages
French (fr)
Inventor
Marius Van Tonder
Original Assignee
Top Info Outsourcing Services (Proprietary) Limited
Handelman, Joseph, H.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Top Info Outsourcing Services (Proprietary) Limited and Handelman, Joseph, H.
Priority to CA002307226A (patent CA2307226A1)
Priority to AU12074/99A (patent AU1207499A)
Priority to APAP/P/2000/001798A (patent AP2000001798A0)
Priority to EP98955217A (patent EP1034490A4)
Publication of WO1999023583A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices

Definitions

  • THIS INVENTION relates to data storage and retrieval. It relates in particular to a method of storing a plurality of documents in a database and to a method of retrieving data from a database. Further, it relates to an arrangement of data in a database.
  • a typical application of the storage of documents in the form of digital images is in the medical field e.g. the storage of a claim document submitted by a doctor to a medical aid fund.
  • a digital image of the original claim document is stored in a storage medium e.g. a CD ROM or the like.
  • Selected information is also manually read from the claim document and entered into an independent storage medium thereby creating an abridgement of the original document. Retrieval of the information from the independent storage medium is generally fairly rapid. However, in the event of full details on the claim being required, the digital image of the original document is usually required.
  • an arrangement of data in a database including a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in said group of file locations divided by NFG files.
  • the path to a selected document may be derived from the unique primary identification number.
  • the number of file locations may be a preselected number of file locations which corresponds to a number of documents which are capable of being stored in at least a particular section of the database.
  • the file locations may include digital images of the documents which are captured by means of a conventional scanner.
  • the file locations may be defined by a preselected number of base directories, each group of the file locations including NFG base directories and each base directory being designated by a primary identification number.
  • the number of base directories in each group of file locations may be less than about 1000, preferably less than about 250.
  • Each secondary identification number may be associated with a directory used in a conventional computer system, the directory being designated by the secondary identification number and each base directory being defined by a sub-directory of said directory.
  • the primary identification number is typically the document number and the documents are preferably sequentially numbered.
  • each unique secondary identification number may be associated with, typically being the name or label of, a directory of a conventional directory/sub-directory arrangement used in conventional computer systems, and each unique primary identification number may be associated with, typically being the name or label of, a sub-directory of said directory.
  • each file location may be a sub-directory in which a digital image of the document is stored and which is labelled or named with the unique primary identification number associated with the document.
  • the unique primary identification number is typically the document number.
  • the database is typically arranged in a hierarchical or so-called "root" structure of directories in which the file locations are each defined by a sub-directory at a base level LB in the hierarchical structure.
  • Each group of directories at one level above the base level (level LB + 1) in the structure may include NFGLB + 1 sub-directories, each of which has a secondary identification number designated by the absolute value of the document number of the first document in the group divided by NFGLB + 1.
  • Each level LB + n may include a plurality of groups of directories, each group of directories including NFGLB + n directories at an immediately lower level LB + n - 1.
  • Each group of directories at level LB + n may include a unique secondary identification number which is defined by the absolute value of the unique secondary identification number of the first sub-directory in the group of directories at level LB + n - 1 divided by NFGLB + n.
  • the number of groups of directories at level LB + n is typically between about 2 and about 10 times the number of groups of directories at level LB + n - 1.
  • Preferably, an even number of groups of directories NFGLB + n is provided at each level LB + n.
  • a method of storing a plurality of documents in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of each document to be stored in the database; determining the secondary identification number by taking the absolute value of the primary identification number of the document to be stored and dividing it by NFG; and storing the document in a file location in the form of a directory which is identified from the unique primary and secondary identification numbers.
  • the database may be arranged in a hierarchical directory structure.
  • the method may include iteratively identifying an associated directory of a group of directories at one level higher (level LB + n + 1) by dividing the unique secondary identification number of the directory at level LB + n by the number of directories NFGLB + n + 1 in the group.
  • the method may include scanning an original copy of the document to obtain a digital image thereof, and storing the digital image of the document in the file location.
  • a method of identifying a path to one of a plurality of file locations in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database; and dividing the unique primary identification number by the number of file locations NFG in the group and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document.
  • the database may be arranged in a hierarchical directory structure.
  • the method may include iteratively identifying an associated directory of a group of directories at one level higher (level LB + n + 1) by dividing the unique secondary identification number of the directory at level LB + n by the number of directories NFGLB + n + 1 in the group.
  • a method of retrieving data from a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database, and dividing the unique primary identification number by the number of file locations NFG in the group, and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document; and reading the data stored in the directory via the path.
  • the unique primary identification number may be associated with a name of a legal entity e.g. a natural person, a business, or the like.
  • the method may include searching for the name of the legal entity in a conventional manner and retrieving the unique primary identification number thereby to identify a name and path of the directory in which the document has been stored.
  • a data management installation which includes reading means for reading data from data storage means which includes a digital image of a plurality of documents arranged in an hierarchical structure in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number, and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files; input means for receiving the primary identification number of a document to be retrieved from the storage means; and processing means arranged to identify a path in the hierarchical structure to the file location in which the document has been stored, the path being derived from a unique primary identification number of the document.
  • the installation may include interface means for interfacing the installation to a conventional data management installation which selectively accesses abridgements of documents e.g. abridgements of medical aid claims or the like which have been stored in the form of a digital image.
  • the installation may be arranged to receive a document number from the abridgement, the document number being translated into a unique primary identification number thereby to permit a digital image of the entire original document to be retrieved.
  • the storage means may be a plurality of CD ROMs which define the database, the reading means being a so-called "CD jukebox".
  • Figure 1 shows a schematic diagram of a data management installation in accordance with the invention;
  • Figure 2 shows a schematic diagram of an arrangement of data in a database, also in accordance with the invention, of the installation of Figure 1; and
  • Figure 3 shows a sub-section of a directory structure which is arranged in a similar fashion to that of Figure 2.
  • reference numeral 10 generally indicates a data management installation in accordance with the invention.
  • the installation 10 includes a data capturing sub-section 12, a data storage sub-section 14, and a data retrieval sub-section 16.
  • the installation 10 is configured or arranged to store a digital image of each of a substantial number of documents for subsequent retrieval from a database arrangement as described in more detail below.
  • the data capturing sub-section 12 includes a conventional digital scanner 18 which scans a substantial number of documents 20 and feeds a digital image of each document 20 into storage means 22 as shown by arrow 24. Once the documents 20 have been scanned, they are physically stored in a warehouse or discarded as indicated by arrow 26.
  • the documents 20 are numbered sequentially and each number defines a unique primary identification number which is associated with a particular document 20.
  • the installation 10 includes interface means, as indicated by arrow 28, for interfacing the installation 10 to a conventional computer system
  • the documents 20 are typically claim forms received by a medical aid company from various medical practitioners.
  • the data management installation 10 stores each document 20 in a unique fashion in an arrangement of data in a database in the storage means 22.
  • the database is arranged in a hierarchical or so-called "root" structure 32 (see Figure 2).
  • the structure 32 of the arrangement includes 100 directories or file locations, only a few of which are referenced in the drawings by reference numeral 34.
  • the file locations or directories 34 are divided into a number of groups of directories 36.1, 36.2, 36.3 and so on, each comprising 10 directories 34, 10 being the number of files in the group (NFG).
  • Each directory 34 has as its title or name the unique primary identification number of a document 20 intended to be stored therein. Thus, each document 20 is stored in a specific location to facilitate subsequent retrieval thereof.
  • the directories 34 are located at a base level LB in the hierarchical structure 32 as indicated by arrow 38.
  • the group of directories 36.1 is associated with a directory 40 at a level LB + 1, which is one level higher in the hierarchical structure 32.
  • further directories 42 to 58 are provided at level LB + 1, each of which is associated with 10 file locations (NFG) or directories 34 at level LB, each file location or directory 34 bearing the name or label of the unique identification number of the document 20 to be stored therein.
  • the directories 40 to 58 at level LB + 1 are grouped into two groups of directories 60, 62 at a level LB + 2, each group having 5 (NFGLB + 2) sub-directories.
  • the groups of directories 60, 62 are grouped or branch out from a further directory 64 which bears a label "1-100" and which is thus representative of the range of documents 20 having unique primary identification numbers between 1 and 100 which are associated with the directory.
  • the directory 64 has 2 (NFGLB + 3) directories in its group.
  • the various names of the directories 64, 60, 62, 40, to 58, and 34 are in the form of reference numerals which are allocated in a specific fashion.
  • the name of the directory 40 defines a unique secondary identification number which is defined by the absolute value of the first unique primary identification number 34.1 in the group of directories 36.1 divided by the total number of directories or file locations NFG in the group of directories 36.1.
  • the name of the directory 40 is then defined by the absolute value of 0 divided by 10, which is 0 as shown in Figure 2.
  • for the directory 42, its name or label is defined by the absolute value of the first unique primary identification number 34.2 in a second group of directories 36.2 divided by the number of files or directories NFG in the particular group, i.e. the absolute value of 10 divided by 10, which is equal to 1.
  • the unique secondary identification numbers which define the names of the directories 44 to 58 are determined in a similar fashion.
  • the label or name of the group of directories 60 is defined by the absolute value of the unique secondary identification number "0", which is the name of the first directory 40 at the immediately lower level LB + n - 1, divided by the number of directories in each group at that lower level, i.e. 5, thus providing a result of 0 as shown in Figure 2.
  • the name of the group of directories 62 is derived from the first unique secondary identification number in that group, which is the name of the directory 50, i.e. 5, divided by 5 (NFGLB + 2), which equals 1.
  • the hierarchical structure may comprise a plurality of different levels.
  • the number of different levels depends upon the number of documents which are to be stored in the hierarchy. Further, the fewer the levels, i.e. the flatter the hierarchical structure is, the simpler the path to the particular directory in which the document is stored, and thus retrieval times may be reduced in comparison to a very pointed hierarchical structure in which a number of levels are included.
  • a plurality of hierarchical structures which are independent of each other, one of which is shown in Figure 3, may be used.
  • the number of file locations or directories 34 at the base level LB in the hierarchical structure 38 is typically less than about 1000 and, more preferably, less than about 250.
  • the hierarchical structure 38 may thus include a plurality of levels extending above base level LB, each level including a group of directories at a level LB + n having NFGLB + n directories in the group.
  • each directory in a group of directories branches out or extends into NFGLB + n groups of directories at an immediately lower level LB + n - 1.
  • the name or secondary identification number of each group of sub-directories is then determined in a similar fashion as described above.
  • the number of groups of directories at level LB + n is typically between about 2 and about 10 times the number of groups of directories at level LB + n - 1.
  • the number of levels LB + n is dependent upon the number of documents at the base level LB in the hierarchical structure.
  • a digital image of each document 20 is stored on a plurality of compact discs 70 as shown in Figure 1.
  • the compact discs 70 may form part of a library of information on various transactions or claims which have been submitted to the medical aid via the various doctors.
  • Certain of the compact discs may be loaded in a CD jukebox 72 to provide a near line facility and other compact discs may be loaded in a CD tower 74 to provide an on-line facility as shown by arrows 76, 78 respectively.
  • the database is stored on magnetic media 80.
  • the installation 10 includes computing means 82 (see Figure 1) which is arranged to generate a variety of user-friendly screens to assist in instructing the computing means 82 to perform various retrieval functions.
  • the computing means 82 is programmed in such a fashion so that an indexed field window 84 prompts a user to enter a client name 86 via a keyboard (not shown).
  • the computing means 82 retrieves the unique primary identification number 88 which is associated with the client name 86.
  • the unique primary identification number is then fed to a unique key of documents screen 90 which has a search prompt 92 which may be activated with a mouse to initiate retrieval of a selected document from the database.
  • the path to the particular file location or directory 34 in which the document has been stored is derived directly from the unique primary identification number which defines the name of the file location or directory 34 in which the document 20 has been stored.
  • the relevant directories in the groups of directories at the various levels LB + n in the hierarchical structure 32 must be determined.
  • the name of the actual directory 34 in which the document 20 has been stored is determined as indicated above and the particular directory in the group of directories is then determined by taking the absolute value of the unique primary identification number or document number divided by the number of file locations or directories 34 NFG at base level LB.
  • the path to the relevant document may be reconstructed and thus retrieval time may be reduced.
  • the unique primary identification number, i.e. "10", is divided by the number of directories NFG in each group at the base level LB, i.e. "10", and the absolute value thereof is taken, so that directory 42 labelled "1" is identified at level LB + 1.
  • the absolute value is taken of the unique primary identification number or document number 15 divided by the number of files in the group of directories 36, i.e. the result is the absolute value (here meaning the integer part) of 1.5, which is 1.
  • a particular sub-directory at level LB + n is determined and thus the path to the document may be determined.
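The Figure 2 walk-through above can be checked with a short sketch. It assumes that the "absolute value" of a quotient means its integer part (as the 1.5 to 1 example suggests) and uses the group sizes stated for Figure 2: 10 base directories per directory at level LB + 1, and 5 directories per group at level LB + 2. The function and constant names are illustrative and do not come from the specification.

```python
# Group sizes taken from the Figure 2 description (assumed ordering: lowest level first).
GROUP_SIZES = [10, 5]


def directory_labels(document_number, group_sizes=GROUP_SIZES):
    """Return directory labels from the highest level down to the base directory."""
    labels = [document_number]  # the base directory is named after the document number
    current = document_number
    for size in group_sizes:
        current = current // size  # integer part of the quotient
        labels.append(current)
    return list(reversed(labels))


print(directory_labels(10))  # [0, 1, 10]
print(directory_labels(15))  # [0, 1, 15]
print(directory_labels(50))  # [1, 5, 50]
```

Documents 10 and 15 both resolve to the directory labelled 1 at level LB + 1 and the group labelled 0 at level LB + 2, while document 50 falls under the directory labelled 5 and the group labelled 1, in line with the labels described for directories 42, 50, 60 and 62.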

Abstract

An arrangement of data in a database is provided. The arrangement includes a selected number of file locations (34), each file location (34) including a document which includes a unique primary identification number (34-1). The arrangement further includes a plurality of groups (36) of the file locations, each group (36) including a specific number of locations NFG for files in the group and including a unique secondary identification number (40) which is defined by the absolute value of the first unique primary identification number of a file location in said group of file locations divided by NFG files. The number of file locations is a preselected number of file locations which corresponds to a number of documents which are capable of being stored in at least a particular section of the database.

Description

DATA STORAGE AND RETRIEVAL USING UNIQUE IDENTIFIERS
FIELD OF THE INVENTION
THIS INVENTION relates to data storage and retrieval. It relates in particular to a method of storing a plurality of documents in a database and to a method of retrieving data from a database. Further, it relates to an arrangement of data in a database.
DESCRIPTION OF THE PRIOR ART
The storage of data in the form of digital images of documents has played an increasing role in recent years with improvements in computer technology. A typical application of the storage of documents in the form of digital images is in the medical field, e.g. the storage of a claim document submitted by a doctor to a medical aid fund. When such claims are received by the medical aid fund, a digital image of the original claim document is stored in a storage medium, e.g. a CD ROM or the like. Selected information is also manually read from the claim document and entered into an independent storage medium, thereby creating an abridgement of the original document. Retrieval of the information from the independent storage medium is generally fairly rapid. However, when full details of the claim are needed, the digital image of the original document is usually required. This normally entails obtaining a document reference from the abridgement and retrieving the original document by means of an independent computing or data management system using conventional search techniques. The result is that, due to the substantial size of the database in which the digital images are stored, the retrieval of the digital image of the original document may take an unacceptably long period of time. It is an object of this invention to offer a solution to this problem. It is, however, to be appreciated that not only medical records but also insurance records, or any other bulk record systems which are conventionally stored in large electronic databases, are to be borne in mind for the purposes of this specification.
SUMMARY OF THE INVENTION
According to the invention, there is provided an arrangement of data in a database, the arrangement including a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in said group of file locations divided by NFG files.
Accordingly, the path to a selected document may be derived from the unique primary identification number.
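The arrangement defines a group's secondary identification number from the first primary identification number in the group, whereas the methods compute it from the number of whichever document is being stored or retrieved. The sketch below illustrates why the two agree, assuming sequentially numbered documents, a group size NFG of 10, and that "absolute value" means the integer part of the quotient. All names are illustrative.

```python
NFG = 10  # assumed number of file locations per group


def secondary_id(primary_id, nfg=NFG):
    """Integer part of primary_id / NFG (assumed reading of 'absolute value')."""
    return primary_id // nfg


# One group of NFG sequentially numbered documents: 20, 21, ..., 29.
group = range(20, 30)
first = group[0]

# Every member of the group yields the same secondary number as the first member.
assert all(secondary_id(p) == secondary_id(first) for p in group)
print(secondary_id(first))  # 2
```

Because the result depends only on the document number and NFG, no lookup table is needed to find the group to which a document belongs.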
The number of file locations may be a preselected number of file locations which corresponds to a number of documents which are capable of being stored in at least a particular section of the database.
The file locations may include digital images of the documents which are captured by means of a conventional scanner.
The file locations may be defined by a preselected number of base directories, each group of the file locations including NFG base directories and each base directory being designated by a primary identification number. The number of base directories in each group of file locations may be less than about 1000, preferably less than about 250.
Each secondary identification number may be associated with a directory used in a conventional computer system, the directory being designated by the secondary identification number and each base directory being defined by a sub-directory of said directory. The primary identification number is typically the document number and the documents are preferably sequentially numbered.
Thus, each unique secondary identification number may be associated with, typically being the name or label of, a directory of a conventional directory/sub-directory arrangement used in conventional computer systems, and each unique primary identification number may be associated with, typically being the name or label of, a sub-directory of said directory. Accordingly, each file location may be a sub-directory in which a digital image of the document is stored and which is labelled or named with the unique primary identification number associated with the document. The unique primary identification number is typically the document number.
The database is typically arranged in a hierarchical or so-called "root" structure of directories in which the file locations are each defined by a sub-directory at a base level LB in the hierarchical structure. Each group of directories at one level above the base level (level LB + 1) in the structure may include NFGLB + 1 sub-directories, each of which has a secondary identification number designated by the absolute value of the document number of the first document in the group divided by NFGLB + 1.
Each level LB + n may include a plurality of groups of directories, each group of directories including NFGLB + n directories at an immediately lower level LB + n - 1. Each group of directories at level LB + n may include a unique secondary identification number which is defined by the absolute value of the unique secondary identification number of the first sub-directory in the group of directories at level LB + n - 1 divided by NFGLB + n.
The number of groups of directories at level LB + n is typically between about 2 and about 10 times the number of groups of directories at level LB + n - 1. Preferably, an even number of groups of directories NFGLB + n is provided at each level LB + n.
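The level-by-level definition above can be written as a recurrence. In this sketch the symbols are introduced for illustration only: p is the unique primary identification number, s_n is the identification number at level LB + n, and N_n stands for NFGLB + n. Reading "absolute value" as the integer part of the quotient is an assumption, based on the specification's own example in which 1.5 becomes 1.

```latex
s_0 = p, \qquad
s_n = \left\lfloor \frac{s_{n-1}}{N_n} \right\rfloor \quad (n \ge 1)
```

With N_1 = 10 and N_2 = 5, document number 15 gives s_1 = 1 and s_2 = 0.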
Further in accordance with the invention, there is provided a method of storing a plurality of documents in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of each document to be stored in the database; determining the secondary identification number by taking the absolute value of the primary identification number of the document to be stored and dividing it by NFG; and storing the document in a file location in the form of a directory which is identified from the unique primary and secondary identification numbers.
The database may be arranged in a hierarchical directory structure.
Accordingly, the method may include iteratively identifying an associated directory of a group of directories at one level higher (level LB + n + 1) by dividing the unique secondary identification number of the directory at level LB + n by the number of directories NFGLB + n + 1 in the group. The method may include scanning an original copy of the document to obtain a digital image thereof, and storing the digital image of the document in the file location.
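A minimal sketch of the storing method on an ordinary file system follows. It assumes a single grouping level with NFG = 10, treats "absolute value" as the integer part of the quotient, and uses a hypothetical file name `image.bin` for the scanned image; none of these specifics come from the specification.

```python
import tempfile
from pathlib import Path

NFG = 10  # assumed number of file locations per group


def store_document(root: Path, primary_id: int, image: bytes, nfg: int = NFG) -> Path:
    """Store a scanned image in <root>/<secondary id>/<primary id>/."""
    secondary_id = primary_id // nfg  # integer part of primary_id / NFG
    location = root / str(secondary_id) / str(primary_id)
    location.mkdir(parents=True, exist_ok=True)
    target = location / "image.bin"  # hypothetical file name for the digital image
    target.write_bytes(image)
    return target


# Store document 15 in a throwaway directory and show where it landed.
with tempfile.TemporaryDirectory() as tmp:
    saved = store_document(Path(tmp), 15, b"scanned page")
    print(saved.relative_to(tmp).as_posix())  # 1/15/image.bin
```

Further levels would be added by dividing the secondary number again by the group size of the next level up, in the iterative manner described above.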
Still further in accordance with the invention, there is provided a method of identifying a path to one of a plurality of file locations in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database; and dividing the unique primary identification number by the number of file locations NFG in the group and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document.
The database may be arranged in a hierarchical directory structure.
Accordingly, the method may include iteratively identifying an associated directory of a group of directories at one level higher (level LB + n + 1) by dividing the unique secondary identification number of the directory at level LB + n by the number of directories NFGLB + n + 1 in the group.
Further in accordance with the invention, there is provided a method of retrieving data from a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database, and dividing the unique primary identification number by the number of file locations NFG in the group, and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document; and reading the data stored in the directory via the path.
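The retrieval method can be sketched as follows: the path is computed from the primary identification number by dividing it by NFG, as in the path-identification method, and the directory is then read directly without any search. The sketch assumes one grouping level, NFG = 10, and an integer-part reading of "absolute value"; the function names and the file name `page1.img` are hypothetical.

```python
import tempfile
from pathlib import Path

NFG = 10  # assumed number of file locations per group


def document_directory(root: Path, primary_id: int, nfg: int = NFG) -> Path:
    """Derive the directory path from the primary identification number alone."""
    secondary_id = primary_id // nfg  # integer part of primary_id / NFG
    return root / str(secondary_id) / str(primary_id)


def read_document(root: Path, primary_id: int, nfg: int = NFG) -> dict:
    """Read every file stored in the document's directory, keyed by file name."""
    directory = document_directory(root, primary_id, nfg)
    if not directory.is_dir():
        raise FileNotFoundError(f"no file location for document {primary_id}")
    return {f.name: f.read_bytes() for f in sorted(directory.iterdir()) if f.is_file()}


# Build a tiny throwaway database, then retrieve document 23 without searching.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    location = document_directory(root, 23)
    location.mkdir(parents=True)
    (location / "page1.img").write_bytes(b"scanned claim form")
    print(read_document(root, 23))  # {'page1.img': b'scanned claim form'}
```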
The unique primary identification number may be associated with a name of a legal entity e.g. a natural person, a business, or the like. The method may include searching for the name of the legal entity in a conventional manner and retrieving the unique primary identification number thereby to identify a name and path of the directory in which the document has been stored.
Further in accordance with the invention, there is provided a data management installation which includes reading means for reading data from data storage means which includes a digital image of a plurality of documents arranged in an hierarchical structure in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number, and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files; input means for receiving the primary identification number of a document to be retrieved from the storage means; and processing means arranged to identify a path in the hierarchical structure to the file location in which the document has been stored, the path being derived from a unique primary identification number of the document.
The installation may include interface means for interfacing the installation to a conventional data management installation which selectively accesses abridgements of documents e.g. abridgements of medical aid claims or the like which have been stored in the form of a digital image.
The installation may be arranged to receive a document number from the abridgement, the document number being translated into a unique primary identification number thereby to permit a digital image of the entire original document to be retrieved.
The storage means may be a plurality of CD ROMs which define the database, the reading means being a so-called "CD jukebox".
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is now described, by way of example, with reference to the accompanying diagrammatic drawings.
DESCRIPTION OF THE PREFERRED EMBODIMENT
In the drawings, Figure 1 shows a schematic diagram of a data management installation in accordance with the invention;
Figure 2 shows a schematic diagram of an arrangement of data in a database, also in accordance with the invention, of the installation of Figure 1; and Figure 3 shows a sub-section of a directory structure which is arranged in a similar fashion to that of Figure 2.
Referring to the drawings, reference numeral 10 generally indicates a data management installation in accordance with the invention. The installation 10 includes a data capturing sub-section 12, a data storage sub-section 14, and a data retrieval sub-section 16. The installation 10 is configured or arranged to store a digital image of each of a substantial number of documents for subsequent retrieval from a database arrangement as described in more detail below.
The data capturing sub-section 12 includes a conventional digital scanner 18 which scans a substantial number of documents 20 and feeds a digital image of each document 20 into storage means 22 as shown by arrow 24. Once the documents 20 have been scanned, they are physically stored in a warehouse or discarded as indicated by arrow 26.
The documents 20 are numbered sequentially and each number defines a unique primary identification number which is associated with a particular document 20. The installation 10 includes interface means, as indicated by arrow 28, for interfacing the installation 10 to a conventional computer system 30 which is configured to access abridgements of the documents 20 in its storage means (not shown). In particular, the documents 20 are typically claim forms received by a medical aid company from various medical practitioners.
Conventionally, selected information from each document 20 is manually entered into the conventional computer system 30 to create an abridgement of the document 20 for subsequent retrieval by an operator. However, in certain circumstances, merely accessing the abridgement is insufficient to attend adequately to a query pertaining to the document and, accordingly, the operator of a conventional system would then physically retrieve the document 20 to obtain comprehensive information on a particular transaction or claim.
Unlike conventional systems, the data management installation 10 stores each document 20 in a unique fashion in an arrangement of data in a database in the storage means 22. In particular, the database is arranged in a hierarchical or so-called "root" structure 32 (see Figure 2). The structure 32 of the arrangement includes 100 directories or file locations, only a few of which are referenced in the drawings by reference numeral 34. The file locations or directories 34 are divided into a number of groups of directories 36.1, 36.2, 36.3 and so on, each comprising 10 directories 34, 10 being the number of files in the group (NFG). Each directory 34 has as its title or name the unique primary identification number of a document 20 intended to be stored therein. Thus, each document 20 is stored in a specific location to facilitate subsequent retrieval thereof.
The directories 34 are located at a base level LB in the hierarchical structure 32 as indicated by arrow 38. The group of directories 36.1 is associated with a directory 40 at a level LB + 1, which is one level higher in the hierarchical structure 32. In a similar fashion, further directories 42 to 58 are provided at level LB + 1, each of which is associated with 10 file locations (NFG) or directories 34 at level LB, each file location or directory 34 bearing the name or label of the unique identification number of the document 20 to be stored therein. Further, as in the case of the directories 34 which are grouped into groups of directories 36, the directories 40 to 58 at level LB + 1 are grouped into two groups of directories 60, 62 at a level LB + 2, each group having 5 (NFGLB + 2) sub-directories. Further, in a similar fashion, the groups of directories 60, 62 branch out from a further directory 64 which bears a label "1-100" and which is thus representative of the range of documents 20 having unique primary identification numbers between 1 and 100 which are associated with the directory. The directory 64 has 2 (NFGLB + 3) directories in its group.
The various names of the directories 64, 60, 62, 40 to 58, and 34 are in the form of reference numerals which are allocated in a specific fashion. In particular, the name of the directory 40 defines a unique secondary identification number which is defined by the absolute value of the first unique primary identification number 34.1 in the group of directories 36.1 divided by the total number of directories or file locations NFG in the group of directories 36.1. For example, as the first document is stored in file location 34.1, it bears a unique primary identification number 0, and the name of the directory 40 is then defined by the absolute value of 0 divided by 10, which is 0 as shown in Figure 2. In the case of the directory 42, its name or label is defined by the absolute value of the first unique primary identification number 34.2 in a second group of directories 36.2 divided by the number of files or directories NFG in the particular group, i.e. the absolute value of 10 divided by 10, which is equal to 1. In a similar fashion, the unique secondary identification numbers which define the names of the directories 44 to 58 are determined.
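As an illustrative sketch only, the naming rule for the directories 40 to 58 may be written in Python. The function name `group_directory_names` is hypothetical, and floor division is assumed for the "absolute value" of the quotient.

```python
def group_directory_names(total_locations, nfg):
    """Name each group by the first primary ID in the group divided by NFG.

    Returns a dict mapping the first primary identification number of each
    group to the secondary identification number naming that group.
    """
    names = {}
    for first_id in range(0, total_locations, nfg):
        # first_id is the first primary identification number in the group
        names[first_id] = first_id // nfg
    return names

# 100 file locations numbered from 0, in groups of NFG = 10 (as in Figure 2)
names = group_directory_names(100, 10)
print(names[0], names[10], names[90])   # prints: 0 1 9
```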
In a similar fashion, the label or name of the group of directories 60 is defined by the absolute value of the unique secondary identification number "0", which is the name of the first group of directories 40 at an immediately lower level LB + n - 1, divided by the number of groups of directories at an immediately lower level, i.e. 5, thus providing a result of 0 as shown in Figure 2. In a similar fashion, the name of the group of directories 62 is derived from the first unique secondary identification number, which is the file name of the group of directories 50, i.e. 5 divided by 5 (NFGLB + 2), which equals 1.
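The grouping at level LB + 2 may be sketched in the same way. The helper `group_children` below is hypothetical. It relies on the observation that, for consecutively numbered directories, every member of a group gives the same floor quotient as the first member of that group, so dividing each name by the group size reproduces the rule stated above.

```python
from collections import defaultdict

def group_children(child_names, group_size):
    """Map each parent directory name to the child directory names it holds."""
    parents = defaultdict(list)
    for name in child_names:
        parents[name // group_size].append(name)
    return dict(parents)

level_lb1 = list(range(10))                # names 0 to 9 of the directories 40 to 58
level_lb2 = group_children(level_lb1, 5)   # 5 sub-directories per group
print(level_lb2)   # prints: {0: [0, 1, 2, 3, 4], 1: [5, 6, 7, 8, 9]}
```

The two keys correspond to the groups of directories 60 and 62 respectively.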
It is to be appreciated that, in other embodiments of the invention, the hierarchical structure may comprise a plurality of different levels. The number of different levels depends upon the number of documents which are to be stored in the hierarchy. Further, the fewer the levels, i.e. the flatter the hierarchical structure is, the simpler the path is to the particular directory in which the document is stored, and thus retrieval times may be reduced in comparison to a very pointed hierarchical structure in which a number of levels are included. When a substantial number of documents are to be stored, a plurality of hierarchical structures (one of which is shown in Figure 3) which are independent of each other may be used. Preferably, due to software and hardware limitations of certain computing systems, the number of file locations or directories 34 at the base level LB in the hierarchical structure 38 is typically less than about 1000 and, more preferably, less than about 250. The hierarchical structure 38 may thus include a plurality of levels extending above base level LB, each level including a group of directories at a level LB + n having NFGLB + n directories in the group. Thus, from a top down point of view, each directory in a group of directories branches out or extends into NFGLB + n groups of directories at an immediately lower level LB + n - 1. The name or secondary identification number of each group of sub-directories is then determined in a similar fashion as described above. When arranging the database, the number of groups of directories at level LB + n is typically between about 2 and about 10 times the number of groups of directories at level LB + n - 1. Thus, it is evident that the number of levels LB + n is dependent upon the number of documents at the base level LB in the hierarchical structure.
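The dependence of the number of levels on the number of documents may be illustrated with a small Python sketch. The function `directories_per_level` is hypothetical and simply counts how many directories each level would need for a given list of group sizes; it is offered as a planning aid under those assumptions rather than as a step of the described method.

```python
def directories_per_level(num_documents, group_sizes):
    """Count the directories needed at each level above the base level.

    group_sizes[0] is NFG at the base level; later entries are the assumed
    group sizes at successively higher levels.
    """
    counts = []
    remaining = num_documents
    for size in group_sizes:
        remaining = (remaining + size - 1) // size   # ceiling division
        counts.append(remaining)
    return counts

# The structure of Figure 2: 100 documents, group sizes 10, 5 and 2
print(directories_per_level(100, [10, 5, 2]))   # prints: [10, 2, 1]
```

A final count of 1 indicates that the chosen group sizes converge on a single top directory, such as the directory 64.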
Once the hierarchical structure 38 has been established and the data has been arranged in the database as described above, a digital image of each document 20 is stored on a plurality of compact discs 70 as shown in Figure 1. The compact discs 70 may form part of a library of information on various transactions or claims which have been submitted to the medical aid via the various doctors. Certain of the compact discs may be loaded in a CD jukebox 72 to provide a near-line facility and other compact discs may be loaded in a CD tower 74 to provide an on-line facility, as shown by arrows 76, 78 respectively. In other embodiments of the invention, the database is stored on magnetic media 80.
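By way of a hedged illustration, storing a digital image in a directory derived from its document number might look as follows on an ordinary file system. The function `store_image`, the file name `image.bin` and the default group sizes (10 and 5, taken from Figure 2) are assumptions made for the sketch; the specification does not prescribe a file name or a storage API.

```python
import tempfile
from pathlib import Path

def store_image(root, primary_id, image_bytes, group_sizes=(10, 5)):
    """Write image_bytes into the directory derived from primary_id."""
    ids = []
    current = primary_id
    for size in group_sizes:
        current //= size          # secondary ID one level higher
        ids.append(current)
    # Highest level first, then the base directory named after the document.
    parts = [str(i) for i in reversed(ids)] + [str(primary_id)]
    directory = Path(root).joinpath(*parts)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / "image.bin"   # hypothetical file name
    target.write_bytes(image_bytes)
    return target

with tempfile.TemporaryDirectory() as tmp:
    stored = store_image(tmp, 15, b"scanned image bytes")
    relative = stored.relative_to(tmp).as_posix()
    print(relative)   # prints: 0/1/15/image.bin
```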
In order to retrieve a digital image of a specific document 20 from the database, the installation 10 includes computing means 82 (see Figure 1) which is arranged to generate a variety of user-friendly screens to assist in instructing the computing means 82 to perform various retrieval functions. The computing means 82 is programmed so that an indexed field window 84 prompts a user to enter a client name 86 via a keyboard (not shown). The computing means 82, in a conventional fashion, then retrieves the unique primary identification number 88 which is associated with the client name 86. The unique primary identification number is then fed to a unique key of documents screen 90 which has a search prompt 92 which may be activated with a mouse to initiate retrieval of a selected document from the database.
In order to facilitate retrieval of the digital image of the selected document from the database, the path to the particular file location or directory 34 in which the document has been stored is derived directly from the unique primary identification number which defines the name of the file location or directory 34 in which the document 20 has been stored. In order to determine this path, the relevant directories in the groups of directories at the various levels LB + n in the hierarchical structure 32 must be determined. The name of the actual directory 34 in which the document 20 has been stored is the unique primary identification number itself, and the particular directory in the group of directories at level LB + 1 is then determined by taking the absolute value of the unique primary identification number or document number divided by the number of file locations or directories 34 NFG at base level LB. Once the name of the particular directory at each intermediate level in the hierarchical structure 32 has been determined, the path to the relevant document may be reconstructed and thus retrieval time may be reduced. For example, to identify the path to document number 10 in the hierarchical structure 32, the unique primary identification number, i.e. "10", is divided by the number of file locations NFG in each group at the base level LB, i.e. "10", and the absolute value thereof is taken, i.e. directory 42 labelled "1" is identified at level LB + 1. In a similar fashion, if document 15 is to be retrieved, the absolute value is taken of the unique primary identification number or document number 15 divided by the number of files in the group of directories 36, i.e. the result is the absolute value of 1.5, that is, its integer part, which is 1. Likewise, a particular sub-directory at level LB + n is determined and thus the path to the document may be determined.
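The two worked examples may be reproduced with the following Python sketch. The function `document_path`, the use of "/" as a path separator and the use of the label "1-100" as the top-level directory name are assumptions made for illustration; the group sizes 10 and 5 are those of Figure 2.

```python
from pathlib import PurePosixPath

def document_path(primary_id, group_sizes=(10, 5), root="1-100"):
    """Reconstruct the directory path to a document from its number alone."""
    ids = []
    current = primary_id
    for size in group_sizes:
        current //= size   # integer part of the quotient, as in 1.5 -> 1
        ids.append(current)
    # Top directory, then the intermediate levels from highest to lowest,
    # then the base directory named after the document number.
    parts = [root] + [str(i) for i in reversed(ids)] + [str(primary_id)]
    return PurePosixPath(*parts)

for doc in (10, 15):
    print(doc, document_path(doc))
# prints:
# 10 1-100/0/1/10
# 15 1-100/0/1/15
```

No lookup table is consulted at any level: every component of the path is computed from the document number, which is the property relied upon in the passage above.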
The inventors believe that the invention, as illustrated, provides a data management system 10 which has enhanced retrieval characteristics of a document from a database as the documents are stored in a particular fashion and the actual path to the directory in which the document has been stored is derived from a unique primary identification number allocated to the document.

Claims

1. An arrangement of data in a database, the arrangement including a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in said group of file locations divided by NFG files.
2. An arrangement of data as claimed in Claim 1, in which the number of file locations is a preselected number of file locations which corresponds to a number of documents which are capable of being stored in at least a particular section of the database.
3. An arrangement of data as claimed in Claim 2, in which the file locations include digital images of the documents which are captured by means of a conventional scanner.
4. An arrangement of data as claimed in Claim 1, in which the file locations are defined by a preselected number of base directories, each group of the file locations including NFG base directories and each base directory being designated by a primary identification number.
5. An arrangement of data as claimed in Claim 4, in which the number of base directories in each group of file locations is less than 1000.
6. An arrangement of data as claimed in Claim 5, in which each secondary identification number is associated with a directory used in a conventional computer system, the directory being designated by the secondary identification number and each base directory being defined by a sub-directory of said directory.
7. An arrangement of data as claimed in Claim 6, in which the primary identification number is the document number and the documents are sequentially numbered.
8. An arrangement of data as claimed in Claim 7, in which the database is arranged in a hierarchical structure of directories and in which the file locations are each defined by a sub-directory at a base level LB in the hierarchical structure, each group of directories at one level above the base level (level LB + 1) in the structure including NFGLB + 1 sub-directories each of which has a secondary identification number designated by the absolute value of the document number of the first document in the group divided by NFGLB + 1.
9. An arrangement of data as claimed in Claim 8, in which each level LB + n includes a plurality of groups of directories, each group of directories including NFGLB + n directories at an immediately lower level LB + n - 1, each group of directories at level LB + n including a unique secondary identification number which is defined by the absolute value of the unique secondary identification number of the first sub-directory in the group of directories at level LB + n - 1 divided by NFGLB + n.
10. An arrangement of data as claimed in Claim 9, in which the number of groups of directories at level LB + n is between 2 and 10 times the number of groups of directories at level LB + n - 1.
11. An arrangement of data as claimed in Claim 10, in which an even number of groups of directories NFGLB + n is provided at each level LB + n.
12. A method of storing a plurality of documents in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of each document to be stored in the database; determining the secondary identification number by taking the absolute value of the primary identification number of the document to be stored and dividing it by NFG; and storing the document in a file location in the form of a directory which is identified from the unique primary and secondary identification numbers.
13. A method as claimed in Claim 12, in which the database is arranged in a hierarchical directory structure, and the method includes iteratively identifying an associated directory of a group of directories at one level higher (level LB + n + 1) by dividing the unique secondary identification number of the directory at level LB + n by the number of directories NFGLB + n + 1 in the group.
14. A method as claimed in Claim 13, which includes scanning an original copy of the document to obtain a digital image thereof, and storing the digital image of the document in the file location.
15. A method of identifying a path to one of a plurality of file locations in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database; and dividing the unique primary identification number by the number of file locations NFG in the group and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document.
16. A method as claimed in Claim 15, in which the database is arranged in a hierarchical directory structure, and the method includes iteratively identifying an associated directory of a group of directories at one level higher (level LB + n + 1) by dividing the unique secondary identification number of the directory at level LB + n by the number of directories NFGLB + n + 1 in the group.
17. A method of retrieving data from a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database, and dividing the unique primary identification number by the number of file locations NFG in the group, and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document; and reading the data stored in the directory via the path.
18. A method as claimed in Claim 17, in which the unique primary identification number is associated with a name of a legal entity, the method including searching for the name of the legal entity in a conventional manner and retrieving the unique primary identification number thereby to identify a name and path of the directory in which the document has been stored.
19. A data management installation which includes reading means for reading data from data storage means which includes a digital image of a plurality of documents arranged in an hierarchical structure in a database of data which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number, and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files; input means for receiving the primary identification number of a document to be retrieved from the storage means; and processing means arranged to identify a path in the hierarchical structure to the file location in which the document has been stored, the path being derived from a unique primary identification number of the document.
20. An installation as claimed in Claim 19, which includes interface means for interfacing the installation to a conventional data management installation which selectively accesses abridgements of documents.
21. An installation as claimed in Claim 20, which is arranged to receive a document number from the abridgement, the document number being translated into a unique primary identification number thereby to permit a digital image of the entire original document to be retrieved.
22. An installation as claimed in Claim 19, in which the storage means is a plurality of CD ROMs which define the database, the reading means being a so-called "CD jukebox".
PCT/US1998/023125 1997-11-03 1998-10-30 Data storage and retrieval using unique identifiers WO1999023583A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CA002307226A CA2307226A1 (en) 1997-11-03 1998-10-30 Data storage and retrieval using unique identifiers
AU12074/99A AU1207499A (en) 1997-11-03 1998-10-30 Data storage and retrieval using unique identifiers
APAP/P/2000/001798A AP2000001798A0 (en) 1997-11-03 1998-10-30 Image data storage and retrieval.
EP98955217A EP1034490A4 (en) 1997-11-03 1998-10-30 Data storage and retrieval using unique identifiers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ZA97/9873 1997-11-03
ZA979873 1997-11-03

Publications (1)

Publication Number Publication Date
WO1999023583A1 (en) 1999-05-14

Family

ID=25586686

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/023125 WO1999023583A1 (en) 1997-11-03 1998-10-30 Data storage and retrieval using unique identifiers

Country Status (6)

Country Link
EP (1) EP1034490A4 (en)
AP (1) AP2000001798A0 (en)
AU (1) AU1207499A (en)
CA (1) CA2307226A1 (en)
WO (1) WO1999023583A1 (en)
ZA (1) ZA989947B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5058162A (en) * 1990-08-09 1991-10-15 Hewlett-Packard Company Method of distributing computer data files
US5204958A (en) * 1991-06-27 1993-04-20 Digital Equipment Corporation System and method for efficiently indexing and storing a large database with high data insertion frequency

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS56162162A (en) * 1980-05-16 1981-12-12 Toshiba Corp Data storing device having variable data structure

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HSIAO Y.S. et al., "Adaptive Hashing", INFORM. SYSTEMS, 1988, Vol. 13, No. 1, pages 111-127. *
See also references of EP1034490A4 *

Also Published As

Publication number Publication date
CA2307226A1 (en) 1999-05-14
AP2000001798A0 (en) 2000-06-30
EP1034490A4 (en) 2001-02-07
AU1207499A (en) 1999-05-24
EP1034490A1 (en) 2000-09-13
ZA989947B (en) 1999-05-13

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 2307226

Country of ref document: CA

Ref country code: CA

Ref document number: 2307226

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 12074/99

Country of ref document: AU

NENP Non-entry into the national phase

Ref country code: KR

WWE Wipo information: entry into national phase

Ref document number: 1998955217

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1998955217

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1998955217

Country of ref document: EP