WO2002019075A2 - System and method for client document certification and validation by remote host - Google Patents

Publication number
WO2002019075A2
Authority
WO
WIPO (PCT)
Prior art keywords
document
client
checksum
host
registration
Prior art date
Application number
PCT/US2001/040965
Other languages
French (fr)
Other versions
WO2002019075A3 (en)
Inventor
David A. Benaron
Ilian H. Parachikov
Original Assignee
Spectros Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spectros Corporation
Priority to AU2001267087A
Publication of WO2002019075A2
Publication of WO2002019075A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G06F21/645 Protecting data integrity, e.g. using checksums, certificates or signatures using a third party

Definitions

  • the present invention relates to systems and methods for establishing and verifying the stability of a digital document with respect to a fixed time and date by a remote host, without need for transmission of the client document. More particularly, the system relates to the provision of a publicly-accessible and secure third-party internet-enabled host site that provides a downloadable agent program to any client system, with the agent program generating a document certification checksum within the client system that is then uploaded to and archived by the host as a verifiable record, with the host creating a certificate and/or a registration number, downloaded to the client.
  • the certificate or registration number allows the document to be time-, date-, and content- validated by any client with access to both the registered document and to its corresponding registration number.
  • the present invention relies upon the presence of an informational network connection between a secure, independent host computer and a client computer that wishes to obtain or validate a time-date-content certification for a document stored within it.
  • the client begins with an agent program, either resident within the client system, downloaded via the network, or made available to the client via some other route.
  • the agent program executes within the client system and analyzes the document or documents to be certified and, based upon the results of this analysis, produces a unique checksum identifier.
  • the checksum is generated in such a manner, and is of sufficient length, that any alteration of the target data is highly likely to produce large and detectable changes in the checksum. Thus, it is unlikely that alteration of the document from its form at the time of the initial analysis would go undetected.
  • the checksum generated as described above is then sent by the agent program from the client to a remote, secure, and third-party host over a network for permanent entry into a registry.
  • the host computer stores, at the minimum, this checksum, the time and/or date of registration, and a unique registration number that allows for the future identification of the file, as well as the validation of the integrity of the current document with respect to the document as initially registered.
  • the host may create a certification document that is downloaded to the client for storage. This certification document contains at least a registration number, and may contain any other information as required for document validation, such as encoding schemes, agent version numbers, original file name.
  • This certification document may also contain information needed for the conduct of business, such as billing information, limits on the number of prepaid or gratis validation queries to the host, and other information helpful to the conduct of business.
  • the certification document may be stored by the agent program as a registration certificate, or may be used to mark the document as a registered document, such as by renaming the document, or a read-only copy of the document, to include the registration number in the document name.
  • This certificate can be viewed and validated, as needed, by any client having electronic access to both the original document and the registration number (either contained in a registration certificate file or embedded within the document name). The validation will fail if the document has been altered in any way since registration.
  • a salient feature of the present invention is that only the checksum is transmitted to the registration system, rather than the target document.
  • the security of the target document is maintained since the electronic document is not transmitted to the registering host system, even in part or in an encoded form. No level of eavesdropping will reveal the contents of the registered document to an unintended eavesdropper.
  • during file verification, the contents of the file remain secure, as they are not required to be transmitted over the network.
  • the registration or verification processes can be anonymous and/or initiated by any client. In contrast to many document registration systems and methods, there does not need to be any identification of the submitting or verifying clients.
  • any party in possession of an electronic document can obtain the downloadable agent program software and access the host site to register a document, all without revealing the registering party's identity.
  • a party in possession of an electronic document and either the corresponding registration certificate or registration number can verify the integrity of the file by accessing the host registration site over a network, all without revealing the verifying party's identity. If either the registration or verification process engenders a transaction fee, identity information can be collected, and possibly de-linked from the registered file, once any fees are paid.
  • the registration can be used to validate not only the integrity of content of a document at a later date, but also the identity of the signer or signers as well.
  • a final feature is that this approach is unbiased and verifiable. Registration and verification are performed by a non-interested third-party host. Thus, registration and verification are performed without conflict of interest, and can be performed at a sufficiently secure level so as to be likely to be legally recognized.
  • an object of the present invention is to provide a legally-binding system and method to produce a verifiable certificate and/or verifiable registration number for any electronic document, or set of documents, accessible to a remote and anonymous client by providing an independent host system and long-term, archived registration certifying the time, date, checksum, and registration number of the registered documents.
  • This checksum is selected so as to make vanishingly small the possibility of alteration of the document without alteration of the checksum.
  • a second object is that the integrity of the document, and either the registration certificate or the registration number, can be anonymously verified by any client with access to the document, and access to either the registration certificate or the registration number.
  • a third object is that the document registered or verified can be in any digital information format or electronic form, provided only that it can be accessed in an unchanged digital form for later retrieval and verification.
  • Such documents are intended to include a file, a folder, a hard drive or floppy disk, a spreadsheet, a database, an image, a scanned image of a paper document, a sound and/or audio recording, or any group of the above, or any group of one or more documents for which a certificate is desired.
  • the certificate checksum is stable with respect to an unchanged document, and does not necessarily require public or private keywords in order to encrypt/decrypt the file or to generate a valid checksum.
  • a corollary object is that the checksum is of sufficient length, and is calculated in such a way, that even a small change in the document (such as an alteration of one bit or character) will be extremely likely to produce a substantial change in the checksum, and thus have a near certainty of detection during the verification process.
  • the certificate can be validated on any client system with access to the registered document, whether it be the original client system or otherwise, by allowing any client system to recalculate the checksum using an agent program, to upload the certificate number, date, time, and checksum to the secure host, and to request a look-up and comparison of the original values stored within the registry.
  • This allows any document to be time-, date-, and content-validated by any client system with access to the registered document and to its corresponding registration certificate, and in which the downloadable agent resides or is accessible.
  • the downloadable agent program can perform other key services, such as scanning for viruses or embedded macros, thus certifying that the document is also uninfected, or for performing a compression or encryption function, if desired.
  • the encryption must be exactly reversible or else the checksum will no longer match.
  • a final object is that this process may require payment of a transaction fee, and the identity of the payer may be optionally linked with the registry data, if desired.
  • the time and date stamping program can be downloaded to nearly any system, as the calculation can be operating system independent.
  • a downloaded agent program will always be the most advanced version of the agent program, though the system will maintain backward version compatibility.
  • Another unexpected advantage of the present invention is that it may be preferable to have an electronic version of a notarized or witnessed paper document than it would be to have the paper form.
  • Paper documents suffer from a weak point that they can be later altered without much sign of the tampering. For example, if a notary public notarizes a document, this document can be later altered without affecting the appearance of the document in an obvious manner. However, if a conventional notary public notarizes a document, and then registers a digital scanned version of the electronic document using the present invention, then the document will remain unaltered in a verifiable manner. This raises the possibility of net-based notarization and certification of documents with advantages over the conventional methods of notarization.
  • a method for registering a document at a particular time and date in which a client system containing the electronic document to be validated can download from a secure third-party host site, or run from disk, an agent program, such that the agent program computes a unique checksum using the document to be certified, uploads that checksum to the secure host, which registers the checksum, time and/or date of registration, and other relevant data in a permanent private data record, and generates a certificate which is sent back to the client for storage.
  • the host is an internet-connected Windows-based server
  • the client is an internet-connected Windows-based personal computer that downloads a Visual Basic-based agent to perform the checksum.
  • After upload of the checksum to the host, the host generates a certificate and downloads this certificate to the client. Further, once the certificate is generated, any client having access to both the certified document and its corresponding validation certificate (or registration number) can query the host computer via the internet, download the agent program, re-compute a checksum, and compare that checksum to that stored in the host system registry, in order to verify the authenticity of the registered document and its accompanying certificate for time, date, and content.
  • a system for operating a web site and performing the above functions from a remote client is also described.
  • FIG. 1 is a schematic diagram of a host/client system in accordance with the invention.
  • Figure 2A-D show images from a functioning site with host and client program threads performed in accordance with the invention.
  • Figure 3 is a flow chart illustrating the key features of data flow for a host and client configured during document registration performed in accordance with the present invention.
  • Figure 4A-B show two images of a document, previously witnessed and notarized by a notary public, after scanning and digitization using a flatbed scanner, which allowed for subsequent document registration. Even though these images appear by eye to be identical, they yield different checksum identifiers upon registration.
  • Host A computer system that is connected to a network, such as the internet or a local intranet, within which is resident a permanent registry of registered files.
  • an agent program downloadable to other systems for performing checksum calculations can be resident within the host.
  • a host is a secure internet-connected server.
  • Client A system that may be connected to a network at times, and which can use that connection to download an agent program from the host computer, then use that agent to generate a unique checksum, then upload the checksum to the host for registration, and then finally download a document certificate for the registered document.
  • An example of a client is a personal computer connected at times to an internet service provider via a dial-up modem.
  • a document can be text, graphics, spreadsheets, databases, drawings, digital images, audio or video clips, scanned or pdf (Adobe Systems, San Jose, CA) images of paper documents, raw data, groups of files (such as file folders, drives, or tapes), or any other digital material stored as a file.
  • Document Integrity The stability of a document, with regard to maintenance of the exact and unchanged contents of the document, with respect to the contents that the document possessed at the specific time and date of registration, stable even to changes as small as one data bit. Document integrity can also include lack of changes to the document creator, registrant, signatory or signatories, or other related aspects of the document, if desired.
  • Checksum A checksum is a unique string of binary values or text characters calculated from an electronic file or a set of binary data, such that the loss of, the addition of, or the alteration of even one data bit will most likely result in a detectable change in the checksum.
  • a checksum is determined based upon a mathematical function applied to some or all of the digital information contained in the target document file. The mathematical function is selected such that changes to the target document produce large and inversely-unpredictable changes in the identifier.
  • There are multiple methods known in the art to generate a checksum and these vary in their complexity, speed, and degree of certainty that a corruption or change in the analyzed data will result in a detectable change in the checksum.
  • Registration A process by which one or more documents are analyzed, the results of this analysis entered into the host register, and a certificate is issued by the host confirming registration.
  • Validation A process by which one or more registered documents are analyzed, the results of this analysis are compared to values previously entered into the host register, and a certificate is issued confirming or denying the integrity of the registered documents since the time and/or date of initial registration.
  • Certification During registration, certification is a process by which one or more documents are analyzed, and issued one or more certificates containing data related to the registration process, such confirmation of registration or registration numbers. During validation, certification is a process by which one or more documents are analyzed, and issued one or more certificates relating to the integrity of the documents, with respect to the content of the documents at the time of registration.
  • Registered Document A document that has been registered and certified by the host computer.
  • Validated Document A document that has previously been registered and certified, and for which a recalculation of the checksum has been requested and successfully matched to the registered checksum.
  • a recalculation of the checksum is performed by the client requesting validation, and the recalculated checksum is uploaded for comparison to the registered checksum previously stored by the host, as well as optionally compared to the certificate checksum previously stored in the document's registration certificate.
  • validation affirms the integrity and stability of the registered document with respect to date, time, and content, as compared to the document when originally registered.
  • validation results in the production of a certificate, either confirming or denying the integrity of the document as compared to the document at registration.
  • Validation can be performed by any client, provided only that the document and its registration number (either contained within the document itself, or attached in the document's registration certificate) are accessible to the validating client.

Description of a Preferred Embodiment

  • client system 120 is connected to host server 130 through network 134 to allow connection of client system 120 to host server 130 when desired.
  • network 134 is a local area network connected to both client system 120 and host 130 via cables 136 and 138.
  • network 134 is likely to be the internet, connected to client system 120 and host 130 by dial-in, DSL (digital subscriber line) phone connection, or by T-1 trunk line, connecting to the client and host over phone cables 136 and 138, respectively.
  • web browser 125 may be required for client system 120 to access the internet; suitable browsers include, by way of example, Netscape 4.73 (Netscape, Mountain View, CA) and Internet Explorer 5.0 (Microsoft, Redmond, WA).
  • client system 120 has one or more target documents 121 to be registered, stored on disk drive, tape, floppy drive, CD-ROM, DVD, or other storage media, for which registration and certification is desired.
  • client system 120 accesses internet network 134 using internet browser 125 and the user of client system 120 selects dialog button 202 (Figure 2A) to begin registration of a document. The user then selects dialog button 203 (Figure 2B) to enable a download of agent program 145.
  • downloaded agent program 145 is written in Microsoft Visual Basic Pro 6.0 (Microsoft, Redmond, WA), and is stored in memory 123 of client system 120.
  • agent program 145 may be written in any computer language appropriate for the client computer, such as Java++, Visual Basic, Visual C++, Java, or others. If written in Java, agent program 145 could run as a script operating within the framework of the browser. Once downloaded, agent program 145 runs automatically. Agent program 145 is also configured so as to be fugitive: it remains in memory 123 of client system 120 only so long as is needed to perform its functions, and is then deleted from client system 120 without transfer to disk. However, agent program 145 may also be configured to be saved by the client system 120 for future use. In cases in which agent program 145 is stored, agent program 145 may be loaded by disk or other transfer, rather than downloaded via the network 134, and then checked for version updates during access to the host 130 over the internet or other network system.
  • Agent program 145 continues by prompting the user to identify the file, folder, drive, or other storage location for which registration and certification is requested, as shown in Figure 2C. Agent program 145 will calculate a checksum value for, and then register, the file, folder, drive, or other storage medium selected. Alternatively, agent program 145 can generate a checksum value for, and then register, each document in the selected folder or drive. Once the target documents are selected, the user selects dialog button 208 in Figure 2C to complete the registration process.
  • Checksum 155, a key element of the registration process, is now described in detail.
  • a checksum is a unique string of binary values and/or text characters calculated from an electronic document file or from a set of binary data.
  • Checksum 155 is calculated such that the loss of, the addition of, or the alteration of even one data bit will result in a detectable change in the checksum.
  • Such checksums are often used internally by a computer, or externally by modem and internet programs, to detect errors in transmission, reception, or storage of a file.
  • There are multiple methods known in the art to generate a checksum and these vary in their complexity, speed, and degree of certainty that a corruption or change in the analyzed data will result in a detectable change in the checksum.
  • the goal of using a checksum is to provide a digital measure of the integrity of a file, such that any alteration, deletion, or insertion of data will almost certainly result in a large change in the checksum.
  • Such overwhelming odds of altering the checksum by modifying the file make it very difficult to intentionally alter a file, yet retain the checksum at its original value.
  • the checksum should be long, and the method of calculation difficult to reverse (e.g., difficult to predict what changes in the document could be made such that, taken together, the checksum remains unchanged).
  • the difficulty of tampering to yield an altered file with an identical checksum rises exponentially with the number of bytes in the checksum string (a string is a set of characters, numbers, or symbols strung together into a single "word").
  • the difficulty of tampering also rises as a power of the number of characters in the alphabet used to create each byte in the checksum string. For example, if there were only 16 characters in the checksum, and there were 64 choices for each character, there would be 64^16 possibilities for the checksum, a number that would require a computer checking a new sum every microsecond, on average, over a trillion years to reproduce a given checksum.
  • checksum 155 is 16 characters in length and uses a 64 character alphabet composed of upper case letters, lower case letters, the digits 0 through 9, and two repeated alphabet characters ("a" and "b") to bring the total to 64 characters, based upon the calculated integer values of the checksum for each character.
  • this checksum length and/or character set may be expanded or contracted, provided only that the length and character set used in the checksum are chosen such that the chance of modifying the registered document without producing a corresponding change in the checksum is reduced to nearly zero, balanced by the need for a reasonable calculation time.
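  • The size of the checksum space discussed above can be checked with a few lines of arithmetic. The sketch below assumes 16 checksum characters with 64 choices each and one trial per microsecond; the constant names and the 365-day year are illustrative assumptions, not part of the described system.

```python
# Illustrative arithmetic only; the names and the 365-day year are assumptions.
CHECKSUM_LENGTH = 16   # characters in the checksum string
ALPHABET_SIZE = 64     # choices for each character

# Each of the 16 positions can independently take any of the 64 values.
possibilities = ALPHABET_SIZE ** CHECKSUM_LENGTH

MICROSECONDS_PER_YEAR = 1_000_000 * 60 * 60 * 24 * 365

# On average, an exhaustive search succeeds after trying half of the space.
average_years = (possibilities / 2) / MICROSECONDS_PER_YEAR

print(f"{possibilities:.3e} possible checksums")
print(f"about {average_years:.2e} years on average at one trial per microsecond")
```

  • Lengthening the checksum by one character multiplies the search space by the alphabet size, which is why length is the more effective of the two parameters to increase.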
  • A preferred method of determination of checksum 155 is now discussed in detail. Many encoding schemes are known, and involve mathematical procedures in which each data bit in the document influences one or more bytes in the character string.
  • a simple single-pass checksum is generated by sequentially loading each 8-bit byte in the document file (decimal value 0-255) and performing a mathematical operation on each. This operation has multiple components. To begin with, the first byte in the document is added, after a division and a multiplication, to the first byte of the checksum, and is also added after different multiplication and division operations to other character bytes, as is determined by the variable called "Influence", to be discussed later. Each time a new character is read, this "base byte" is increased by one.
  • the base byte for the second byte in the document is the second byte in the checksum.
  • When the base byte exceeds the length of the checksum, the base byte resets to point at the first character of the checksum.
  • the sixteenth byte of the document has a base byte of the sixteenth byte of the checksum, while the seventeenth byte of the document has a base byte of the first byte of the checksum. This rotating base byte continues incrementing and resetting as above, until every byte of the document has been read, processed, and entered into the checksum.
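  • The rotating base byte can be expressed as a modulo operation. The following is a minimal sketch, assuming a 16-character checksum and 1-based positions as in the description; the function name base_byte is hypothetical.

```python
CHECKSUM_LENGTH = 16  # assumed, matching the 16-character checksum example


def base_byte(doc_location: int) -> int:
    """Return the 1-based checksum position used as the base byte
    for the document byte at 1-based position doc_location."""
    return (doc_location - 1) % CHECKSUM_LENGTH + 1


# The 16th document byte maps to the 16th checksum byte,
# and the 17th wraps around to the 1st.
print(base_byte(16), base_byte(17))
```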
  • the first division operation that each byte in the document undergoes involves a truncated division.
  • This truncated division uses an operation termed FRAC, which yields the portion of the division result to the right of decimal, with the integer portion rejected.
  • For example, FRAC(10/7) is equal to the result of 10 divided by 7, or 1.42857142, with the integer portion removed, or 0.42857142.
  • the discarding of the integer component by the FRAC function, and the degree of changes in the FRAC function result with small changes in the document byte ensures that the checksum irreversibly holds less information than the document, with the result that the checksum is simple to generate, but difficult to reverse calculate. This makes it more difficult to alter a file in a manner that would produce no evidence of change in the checksum.
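  • As a rough illustration of the FRAC operation, the sketch below keeps only the fractional part of a division. The function name frac and its two-argument form are assumptions made for this example. Because the integer portion is discarded, different inputs can produce the same result, which is the loss of information the description relies on.

```python
import math


def frac(numerator: float, divisor: float) -> float:
    """Fractional part of numerator / divisor, for non-negative inputs."""
    quotient = numerator / divisor
    return quotient - math.floor(quotient)


# 10/7 = 1.428571..., so the fractional part is 0.428571...
print(frac(10, 7))

# 3/7 has the same fractional part, so FRAC alone cannot tell 3 from 10.
print(abs(frac(3, 7) - frac(10, 7)) < 1e-12)
```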
  • the present embodiment includes values termed "Influence" and "Offset."
  • the value of "Influence" determines how many characters in the checksum will be affected by each byte in the document. If each byte in the document affected only the base byte, it would be relatively simple to alter a character in one place in the document, and then to correct any changes in the checksum by altering one more character. By having each byte in the document influence not only a base byte in the checksum but also several other bytes, the checksum becomes more resistant to tampering. Thus, if "Influence" equals 8, then each byte in the document will affect its base byte, plus 8 additional bytes in the checksum. Again, the calculations are selected such that there is an irreversible calculation involving each of the bytes under the influence of a character in the original document.
  • the checksum is calculated by performing a FRAC division of the decimal value of each byte in the document (with a value between 0 and 255 for each byte) by a series of odd numbers (3, 5, 7, 9, ...), with the number of divisions performed equal to Influence + 1.
  • each result from the FRAC function (a result between 0 and 1) undergoes a multiplication by the maximum value allowed for each checksum character, followed by an integer truncation of the decimal portion of the value.
  • thus, instead of ranging between 0 and 1, the result ranges between 0 and the maximum value for each byte.
  • the final integer result after multiplication of the FRAC result now swings wildly but consistently with changes in the value of the document byte and with changes in the value of the odd number divisors, as shown in the following table when the byte value is 33 (the ASCII equivalent of a "!" character):
  • Step 1 Determine the constants.
  • the constants are as follows:
  • Step 2 Begin with DocLocation, the integer location of each byte in the document, equal to 1. This selects initially for the first byte in the document.
  • Step 3 Set DocValue equal to the binary value of the document at byte DocLocation. Each DocValue will range from 0 to 255 for a typical computer using an 8-bit byte.
  • Step 4 Set Offset equal to 0, which will store the result of the calculation in the base byte, and then increment Offset by 1 up to a maximum value equal to Influence. For each value of Offset, do the following operation:
  • Step 5 Return to Steps 2-4, until all bytes in the document have been read.
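  • The steps above can be sketched in code, with the caveat that the table of constants and the exact per-Offset operation are not reproduced in this text. The sketch therefore assumes a 16-character checksum, Influence equal to 8, a multiplier of 64, a divisor of 3 + 2 × Offset, accumulation by addition modulo 64, and an alphabet ordering chosen for illustration. All names are hypothetical, and this is not presented as the inventors' exact algorithm.

```python
import string

# All constants below are assumptions; the source's table of constants is missing.
CHECKSUM_LENGTH = 16
INFLUENCE = 8
# 26 upper case + 26 lower case + 10 digits + repeated "a" and "b" = 64 symbols.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + "1234567890" + "ab"
MAX_VALUE = len(ALPHABET)  # 64


def compute_checksum(data: bytes) -> str:
    """Single-pass checksum in the spirit of Steps 1-5 (illustrative only)."""
    sums = [0] * CHECKSUM_LENGTH
    for doc_location, doc_value in enumerate(data):   # Steps 2, 3 and 5
        base = doc_location % CHECKSUM_LENGTH         # rotating base byte
        for offset in range(INFLUENCE + 1):           # Step 4
            divisor = 3 + 2 * offset                  # odd divisors 3, 5, 7, ...
            # Integer form of int(FRAC(doc_value / divisor) * MAX_VALUE),
            # used here to avoid floating-point rounding differences.
            contribution = (doc_value % divisor) * MAX_VALUE // divisor
            target = (base + offset) % CHECKSUM_LENGTH
            # Assumed accumulation rule: add and wrap within the alphabet.
            sums[target] = (sums[target] + contribution) % MAX_VALUE
    return "".join(ALPHABET[value] for value in sums)


print(compute_checksum(b"hello"))
print(compute_checksum(b"hellp"))  # one changed byte alters the checksum
```

  • The integer expression is mathematically the same as truncating FRAC(DocValue / divisor) multiplied by 64, since the fractional part of a division equals the remainder divided by the divisor.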
  • Checksum 155 is then stored in client memory 123, uploaded by client system 120 to host 130 via network connections 134, 136, and 138, and saved in host registry 147.
  • Checksum 155 may be composed of N multiple checksums if multiple files are scanned, such as a scan of a folder.
  • Host 130 may additionally perform additional encoding steps on the values received, before storing such data in registry 147.
  • Checksum 155 may be omitted from the certificate, as keeping the registered sum private and internal to the host system may make fabrication of a falsified file incrementally more difficult.
  • Upon receiving data sent by client system 120, and after possible additional checksum calculations, host 130 assigns a certificate number, and records the checksum, date, time, registration number, encoding scheme, and any other identifying data requested, into registry 147.
  • digital certificate 166 is generated and contains some or all of the data saved in the registry, and is stored in host memory 149. Certificate 166 is then downloaded to client system 120 via network 134 and cables 136 and 138 for storage in document and certificate folder 175.
  • Host 130 may place contents of the registry in backup archive 178 via connection 179, for long-term retrievable storage.
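  • A host-side registry of this kind might be modeled as below. This is a minimal in-memory sketch under stated assumptions: the class and field names (RegistryRecord, HostRegistry, encoding_scheme) are hypothetical, registration numbers are simple sequential integers, and a real host would use durable, backed-up storage rather than a dictionary.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class RegistryRecord:
    """One registry entry: the minimum data the host is described as storing."""
    registration_number: int
    checksum: str
    registered_at: datetime
    encoding_scheme: str


class HostRegistry:
    """Hypothetical in-memory stand-in for the host registry."""

    def __init__(self) -> None:
        self._records: Dict[int, RegistryRecord] = {}
        self._numbers = itertools.count(1)  # assumed sequential numbering

    def register(self, checksum: str, encoding_scheme: str = "v1") -> RegistryRecord:
        """Store the uploaded checksum with a timestamp and a new registration number."""
        record = RegistryRecord(
            registration_number=next(self._numbers),
            checksum=checksum,
            registered_at=datetime.now(timezone.utc),
            encoding_scheme=encoding_scheme,
        )
        self._records[record.registration_number] = record
        return record  # the returned record plays the role of the certificate

    def lookup(self, registration_number: int) -> Optional[RegistryRecord]:
        """Return the stored record, or None if the number is unknown."""
        return self._records.get(registration_number)
```

  • Only the checksum and metadata enter the registry; the document itself never reaches the host in this sketch, consistent with the description.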
  • the agent program may rename the document file in the client system 120, make the registered documents read only, move the file or certificate to new directories, or take other actions to both indicate which file has been registered and/or attempt to prevent unintentional modification or erasure of the registered version of the program.
  • An example of renaming will be illustrated later in Example 1.
  • a user of client system 120, or of new client system 180, connected to the network via cable 182 may wish to re-certify or verify the integrity of the registered copies of documents 121, and corresponding digital certificate 166 if one is available.
  • client system 120 or new client 180 must have access to copies of previously-registered documents 121, and to any corresponding certificates in folder 175.
  • Client system 120 or client 180 then accesses host 130 via network 134 and cables 138 and 136 or 182, to download agent 145.
  • client system 120 will be verifying previously registered files, but the methods could apply equally well to a new client system 180 employing the same actions and method.
  • To verify the integrity of a file previously registered, the user accesses agent program 145, such as by downloading agent program 145 into client memory 123. The user selects validation by selecting command button 205 (shown in Figure 2A). Agent 145, operating from within client system 120, then prompts the user to select the name of the document or documents to be validated from client system 120. If the document has a certificate available, agent program 145 will search for and find, or alternatively ask the user for, the name of certificate 166 in certificate folder 175, and agent program 145 automatically resubmits certificate 166 to verify that it agrees with the stored registration data in archive 178 of host 130.
  • agent program 145 will first upload the registration number to host 130, and host 130 will download the encoding scheme used for the registration. A copy of the previously-registered document, accessible from the client system, will be opened and the current checksum will be recalculated by agent 145.
  • Knowledge of the encoding scheme is essential for agent program 145 to accurately recreate the checksum, and may include information regarding the length of the checksum, the size of the character set used, the agent and algorithm used at the time of the initial registration, and so on, to provide for backward compatibility.
  • Agent program 145 then recalculates a current checksum, using the same length, character set, and algorithm as used for the initial registration, and sends the recalculated checksum to host 130 for comparison with the archived value.
  • Host 130 retrieves the archived registration data via network connection 134 and cables 138 and 136 or 182.
  • Host 130 retrieves the registered certificate data stored in archive 178, via connecting link 179 between host 130 and archive 178, using the registration number provided when the document was initially registered, and then compares the newly sent checksum and registration certificate (if any) to that stored within the registry.
  • If the information matches, indicating that the file has not been altered since it was registered, host 130 sends a message to agent program 145 running on client system 120 that the integrity of the file has been validated, namely that the document is without alteration since the specific date and time of registration. If the information does not match, indicating that the file has been altered since it was registered, host 130 first rechecks the data, to ensure that an error has not been made, and then sends a message to the agent running on client system 120 that the file is invalid due to alteration of the file.
  • These messages may be presented to the user in many ways, such as by using a Windows 2000 message box function or by running a Java-based notification routine in a browser used to access the internet.
  • host 130 may also indicate what portions of the certificate have been altered, if determinable. In addition, other messages may be generated, such as that the document is intact but the certificate has been altered, or that both the certificate and the document have been altered.
  • the program flow, and the direction of data transfer, are illustrated in Figure 3.
  • client system 120 uploads a request to host 130 for downloading of agent 145, as shown at flow chart step 187.
  • Host 130 responds by downloading agent 145, as shown at flow chart step 191. Steps 187 and 191 are optional, as they can be ignored if the agent is available to client system 120 without need for download.
  • client system 120 uploads data for a certificate, including calculated checksum or a registration number, or both, to host 130, as shown at flow chart step 193.
  • host 130 stores checksum 155 using a unique host-generated registration number, while in the case of validation host 130 requires an uploading of the registration number in addition to checksum 155 in order to perform the validation process.
  • host 130 transmits the results of the requested registration or validation process in a downloaded certificate, as shown as flow chart step 195.
  • agent program 145 may request that the process be confirmed or verified, which may add two or more optional steps, as shown in program flow chart steps 197 and 199.
  • A primary goal of this invention is to establish a secure method and system for an anonymous user to easily, reliably, and certifiably time stamp an electronic document, as well as to verify its integrity with respect to the original registration, independent of the identity of the creator or future verifier.
  • a client checking the validity of a document can do so in anonymity, unless it is selected otherwise during the registration of the certificate.
  • Any user can feel comfortable obtaining a certificate, even if the data on the client's computer is highly sensitive, if the data cannot be transmitted due to security or intellectual property issues, or if the user does not wish to reveal his or her identity.
  • This combination of verifiable document registry and validation, without requiring user-identifying signatures, and/or without transfer of the document data, is unique and allows for a high level of security during document registration and validation.
  • a contract may be posted at a public site. Users downloading this contract can perform a validation check, based upon a registration number provided on a public or private internet web page. The validation of the document confirms that the document has not been altered since it was posted at the web site. This may allow, for example, for multiple users to complete the closing of a contract, with each of the users at different sites, and with assurance that the documents each completes will remain unchanged and without tampering should future reference to the completed documents become desired. A similar level of assurance may be offered when downloading software.
  • A user would download software from a public site, then validate the software using a public or private posting of the registration number at a user-accessible web site page.
  • a confirmation of the validity of the file indicates that the file has not been altered since it was posted on the web site by the authoring software company.
  • data identifying the client can be collected but delinked from the registered files in such a manner so as to continue to provide anonymity, when desired.
  • With respect to notarization, the establishment of the date, time, and integrity of a document is but one aspect of the notarization function.
  • A second aspect is that the identification of the signatory client is confirmed through the requirement for positive identification.
  • digital signatures are reaching the point of being legally valid and binding.
  • Digital signatures offer the opportunity to execute agreements, contracts, and the like at a distance, such as over the internet.
  • the inclusion of digital signatures in the registration process can add value to the present method, such as by allowing for the registration of notarized documents, after electronic scanning, to ensure that such documents remain unchanged over time with respect to the date, time, identity of participants, and even place of registration.
  • Documents may require serial or simultaneous signatures, and a registered document allows each party to sign while keeping the original document intact.
  • a web site may be provided to facilitate simultaneous signature at multiple locations. Such a web site would include features such as collection of digital ID's, but attachment of these ID's to the target document only upon collection of all signatures. This would provide a unique business opportunity to provide for internet or network based notarization.
  • some users may wish for a paper document to be saved elsewhere. For example, a user with an email-PC (e-mail only personal computer) may not have any disk drives for data storage, or limited disk space.
  • the document to be notarized or signed may reside at the host computer site, within registry 147 or archive 178. Again, the creation of a web-based notary system provides a unique business opportunity.
  • Example 1 Registering a Document Resident Accessible by a Client
  • a registration system was written in Visual Basic 6.0.
  • a live host program and an executable agent program were installed on a computer (700 MHz Pentium Pro Inspiron 7500, Dell, Round Rock, TX).
  • client system 120 and host 130 had graphical user interfaces written in Visual Basic Pro 6.0 (Microsoft Corporation, Redmond, WA), with host 130 running on a continuous basis.
  • Agent program 145 required an execution request from a user in order to begin running.
  • the host and agent software communicated by buffer, in a similar manner of input/output as would occur as upload and download buffers when the programs operate in different computer systems linked by a network such as the internet.
  • Client system 120 had several stored documents for which registration and certification were desired by the user.
  • client system 120 opened up a browser, in this case Netscape Navigator 4.73 (Netscape, Mountain View, CA), and the user entered a network address for host 130.
  • agent program 145 was then copied by host 130 to memory 123 of client system 120, and agent program 145 started and ran automatically.
  • The user of client system 120 then viewed opening screen 208 of agent 145, and the user selected dialog button 202, marked "Register a New Document" (shown in Figure 2A), and then button 203, marked "Download Now" (shown in Figure 2B).
  • both agent program 145 and host 130 had graphical user interfaces written and compiled in Visual Basic Pro 6.0 (Microsoft Corporation, Redmond, WA).
  • agent program 145 prompted the user to identify the document, document folder, or storage disk for which certification was requested.
  • The user of client system 120 selected two documents, named 'Declaration.txt' and 'Declaration.doc'. These two documents were created using the text of the first two sentences of the Declaration of Independence, including capitalization and punctuation, as provided at the National Archives and Records Administration, transcription files, Washington, DC.
  • the first document was saved in text format (although any format, including image files, could have been registered), while the second document was saved as a Word 2000 file (Microsoft Corporation).
  • the exact text used was as follows (quotation marks added here but not included in the document file during analysis):
  • agent program 145 proceeded to determine checksum 155 for each document, in sequence.
  • The checksum for 'Declaration.txt' was "bSVodXSE9XtGcjD5", and the file was assigned a registration number of AAA-10072-000616. Using this registration number, a copy of the document file was created, made read-only, and given the name "Declaration.Reg_AAA-10072-000616.txt". Note that the extension '.txt' remains at the end of the file name, despite the insertion of a registration number, allowing the file type to remain stable. A certificate was also stored, in this case in the same directory. Alternatively, the security certificate could have been stored in another directory, such as the Windows security certificate folder.
  • The name of certificate 209 was made similar to the name of the registered read-only document, as "Declaration.Reg_AAA-10072-000616.txt.crt". Certificate 209 was then displayed to the user, as shown in Figure 2D.
  • The agent version 211, the encoding scheme 212, and registration number 215 were saved in certificate 209, while checksum identifier 155 was also saved in the document name.
  • the encoding scheme and the agent version will allow program 145 to be backward compatible, as encoding schemes and agent programs may mature and evolve over time.
  • Use of the registration number in the document file name will allow for retrieval of the encoding scheme and agent version archived within registry memory 147 or archive 178 of host 130, without need of the registration certificate.
  • The text file registered in Example 1 was then validated. Again, the agent program was downloaded. In this case, the verify option was selected by clicking on dialog button 205 (Figure 2A) to begin validation of the document, and then selecting dialog button 203 (Figure 2B) to enable a download of agent program 145.
  • registry 147 of host computer 130 already contained an entry for registration number AAA-10072-000616.
  • the following values had been saved: checksum length, maximum checksum value for each character, influence value, date and time of registration, and the original calculated checksum.
  • The values of these variables were then read from the registry and were downloaded to client system 120 for use by program 145 in the calculation of checksum 155.
  • the user may have elected to select both the document file as well as its corresponding certificate, allowing the certificate to be tested for tampering.
  • information needed for the recalculation of checksum 155 could have been read directly from the certificate, instead of over the internet from registry 147.
  • When agent program 145 recalculated the checksum for "Declaration.txt", a checksum value of "bSVodXSE9XtGcjD5" was once again obtained and transmitted to host 130. The host checked the registry, confirmed that the file had not been altered, and displayed the message "REGISTRATION VALID," as well as the certificate information, via program 145. If the checksum had not matched, host 130 would have indicated that the file and/or the certificate had been altered (see Example 4 for details).
  • any file with a registration number in its file name can be transmitted to any third party user, who may check its integrity at any time.
  • this system provides a method for reliable verification for personal as well as distributed electronic documents, such as text documents, images, or programs. For security reasons, it may be decided that the checksum may be retained only by the host, and not included in any certificate. This could potentially make the job of altering a certificate more difficult.
  • the registration number may be any combination of letters, numbers or symbols that would allow for the registration record within the host registry to be accessed at a later date and time.
  • As a demonstration of the effect of tampering, the text file registered in Example 1 was opened using a text editor, and the starting sentence was altered from "When in the Course of " to "Then in the Course ". This change left the total number of characters in the file unchanged. The result was saved in the file 'DeclarationAlter.txt'. As a result of this single-character alteration, the calculated checksum changed from "bSVodXSE9XtGcjD5" to "bs6TLIF5yXtGcjD5". As the value of influence was set at 8, a single character change in the document affects the base byte plus 8 additional bytes.
  • Example 4 Verification of an Altered File
  • The first altered file from Example 3 was then validated using the registration number of the unaltered file, as described in the method shown in Example 2.
  • The checksum of the altered document did not match the checksum of the unaltered document.
  • the host then reloaded the agent program to ensure that a transmission error was not responsible for the failure. After failing on a second attempt, instead of confirming that the file was valid, as occurred in Example 2, the host instead returned the following message:
  • a document can be electronically notarized using this method.
  • A notary may elect to notarize a paper document using a conventional ink stamp and record book, but then digitize the resulting document using an electronic scanner, and register the electronic version of the document using the method and system of the present invention.
  • The registration number of the document could arguably be entered into the notary's written register as additional proof of document stability.
  • the scanning of paper documents allows for the digital transmission of any contract or notarized document, rather than transmission by conventional certified or express mail delivery.
  • Registration of notarized documents also resolves the issue that a document witnessed by a notary identifies only the signer(s), but does not attest as to the accuracy of the content of the document, which is easily tampered with. Digitization and registration can be used to provide a secure and traceable method for the transmission of notarized documents, or for any signed contract, in order to ensure that the documents remain in their original form.
  • a document was created in Word 2000 (Office 2000, Small Business Edition, Microsoft Corporation, Bothell, WA) and then printed (LaserJet 4, Hewlett-Packard, Palo Alto, CA).
  • This printed copy was notarized using a conventional ink stamp and witnessed signature, and was then digitized five times without moving the paper document between images, using a flatbed document scanner (MFC 9100C, Brother International Corporation, Bridgewater, NJ).
  • The resulting images were captured using imaging software (Visioneer PaperPort 6.1, ScanSoft, Los Gatos, CA), and registered using the method of the present invention, with Influence set equal to 1. Two of these scanned images are shown (Figures 4A and 4B).
  • Image 355 in Figure 4A and image 357 in Figure 4B are indistinguishable by eye.
  • Each of the remaining three scanned images (not shown) is similarly indistinguishable.
  • Notary seals 366 and 368 can be seen clearly in each of the images.
  • The resulting checksum differs significantly for each image, with checksum values of "GWdUbbveY ⁇ qeNFDg" for image 355, "ubqSQbX7WuaLTOb0" for image 357, and "neCF0XWh9qUsSHpw", "VnYBaksOSsvell 3s", and "oIDk5HevWDF4EqXU" for the remaining three images, respectively. Because of these changes, rescanning of the document with the intent to alter the document, even with every attempt made to scan in exactly the same manner, is likely to fail.
  • a document to be notarized or a contract to be signed may be more conveniently provided in a scanned or other electronic digital form, such as if the form is to be distributed via email.
  • A registered copy of the blank document would ensure that the contract had not been surreptitiously modified or corrupted during transmission.
  • Exemplary types of electronic documents include doc files from Word or pdf files from Adobe Acrobat. If proof of identity can be provided electronically, rather than in person by a notary, then the entire process may be completed electronically, without the need of a human notary public.
  • Providing electronic identifying documentation, such as digital signatures, would allow for the document to be electronically distributed, electronically signed, and then electronically registered to freeze the document in its signed form, without changes or corruption.
  • a document distributed in electronic form has the advantage that the same form can be rapidly distributed to multiple individuals, each of whom may separately electronically sign the document, have the document electronically notarized, and then returned in a signed, notarized, and registered form.
  • a guarantee may be provided that each electronically signed copy has been stable since notarization, provided that the registration can be verified.
  • a document such as a contract
  • Signatures may, in fact, be electronically collected in an asynchronous manner, with the web site waiting until all signatures have been provided, and then attaching all signatures simultaneously to ensure that signing occurs simultaneously, followed by freezing the document in a registered, verifiable form immediately after contract closure.
  • The registration number, or an adjunct password or keyword, may also be used to control distribution of a document, such as access to a web site or access to a web page for contract closure over the web.
  • The registration number of the final document may be distributed separately, or provided on a web site with limited access, such that only users with access to both the registration number and the registered document will be permitted to read and/or validate the document.
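
The validation path described in the examples above (recover the registration number from the registered file name, recalculate the checksum on the client, and compare it with the archived value) can be sketched roughly as follows. This is a minimal illustration only: SHA-256 stands in for the checksum algorithm, the registry is modeled as an in-memory dictionary, and the names `registration_number_from_name` and `validate`, the file-name pattern, and the "REGISTRATION INVALID" and "REGISTRATION NOT FOUND" messages are hypothetical. Only the "REGISTRATION VALID" message and the file-name layout come from the examples.

```python
import hashlib
import re

# Assumed layout, taken from Example 1: <name>.Reg_<registration number>.<extension>
REG_PATTERN = re.compile(r"\.Reg_([A-Za-z0-9-]+)\.[^.]+$")


def registration_number_from_name(filename):
    """Extract the registration number embedded in a registered file name, if any."""
    match = REG_PATTERN.search(filename)
    return match.group(1) if match else None


def validate(filename, data, registry):
    """Compare a freshly calculated checksum with the archived one.

    `registry` maps registration numbers to archived checksums and stands in
    for the host-side registry; only the checksum, never `data`, would be sent.
    """
    number = registration_number_from_name(filename)
    if number is None or number not in registry:
        return "REGISTRATION NOT FOUND"  # hypothetical message
    current = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumed stand-in
    if current == registry[number]:
        return "REGISTRATION VALID"
    return "REGISTRATION INVALID"  # hypothetical message


if __name__ == "__main__":
    document = b"When in the Course of human events"
    registry = {"AAA-10072-000616": hashlib.sha256(document).hexdigest()}
    name = "Declaration.Reg_AAA-10072-000616.txt"
    print(validate(name, document, registry))
    print(validate(name, b"Then in the Course of human events", registry))
```

In a networked deployment the dictionary lookup would be replaced by an exchange with the host, but the comparison itself would remain the same.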

Abstract

A system providing a publicly-accessible, independent, and secure host internet site that provides a downloadable agent program to any anonymous client PC, with the agent program generating within the client PC a registration checksum based upon the document to be registered. The checksum is then uploaded to and archived by the host as a verifiable record, without transmission of the document to be registered, and without a requirement for identification of the client. In addition, the host generates a registration number, and may generate a registration certificate, both downloadable to the client. This registration can be used to verify the integrity of the document with respect to date, time, and content of the document at any later date and time and by any client with access to both the document and the registration number. If desired, a transaction fee may be charged to the registering client, as well as to either the original registering client or the validating client at the time of validation. Last, if a digital signature is provided by the registering client, this can be appended in a verifiable manner to the host registry. A method for the system is also described.

Description

SYSTEM AND METHOD FOR CLIENT DOCUMENT CERTIFICATION AND VALIDATION BY REMOTE HOST
Field of the Invention
The present invention relates to systems and methods for establishing and verifying the stability of a digital document with respect to a fixed time and date by a remote host, without need for transmission of the client document. More particularly, the system relates to the provision of a publicly-accessible and secure third-party internet-enabled host site that provides a downloadable agent program to any client system, with the agent program generating a document certification checksum within the client system that is then uploaded to and archived by the host as a verifiable record, with the host creating a certificate and/or a registration number, downloaded to the client. The certificate or registration number allows the document to be time-, date-, and content-validated by any client with access to both the registered document and to its corresponding registration number.
Background of the Invention
When a paper document is to be certified as to the time and date of creation, it is often co-signed by a witness, notarized, or both. This holds true for laboratory notebooks, confidential disclosures, drawings, depositions, and the like, each of which can be witnessed or notarized in a simple way. There are, however, an increasing number of documents that do not exist in paper form, such as databases, spreadsheets, encrypted computer files, and the like. Such documents are difficult to witness, and it is often impractical or impossible to verify their integrity or reasonably establish their date of creation.
One reason for a lack of methods to certify electronic documents is that it is simple for one skilled in the art of computer tampering to alter an existing digital document, or to change the date and time and then claim a spurious date of creation. For this and other reasons, courts in general do not recognize the legal validity of the existence or the absence of tampering with an electronic document based solely on a time and date stamp assigned to it by operating systems such as DOS, Mac OS, Linux, or Windows.
Document validation and certification schemes for electronic documents are known in the art. The great majority of these approaches focus on establishing a high degree of certainty that a document has been transmitted in an intact, uncorrupted, unmodified form during passage from a sending to receiving client, or on providing a verifiable signature to prove that a document received by a receiving client indeed originated from a known and identified sending client (e.g., US 5,748,738, US 5,956,404, 5,872,848, 5,982,506). In fact, some approaches actually record the sender's handwritten signature (US 5,818,955) and use this signature to establish the identity of the sender. Others alter the name or appearance (e.g., US 5,781,629) in order to provide a unique identifier to each file. Other approaches involve encryption of the file. However, in some cases, a user may not wish to transmit a file to another computer. Many documents that a client may wish to time stamp are sufficiently sensitive that sending the document out to another computer is not acceptable. Further, transmission of large files (such as the contents of a large database, disk drive, or tape backup) would be time consuming, and would expose the sender to eavesdropping, raising significant data security concerns.
Currently, if a private individual or corporation wishes to time stamp a document resident on their computer, there is no simple network-based or internet-based service to do this. None of the approaches noted above can validate that an untransmitted document residing within a client system remains unaltered at a later date, nor certify the form of a document at a specific date and time, in a reliable way. Such approaches validate a file only during, or in conjunction with, a file transfer, and require both a sending and a receiving client, and use an intermediary system to perform the transfer or to assure both clients that the document is valid.
None of the above systems, agents, or methods suggest how to certify a document on a client system in such a manner so as to provide not only a legally-binding certification, equivalent in weight to that of a notary, to guarantee the time and date of the certification and registration, but also the added benefit of ensuring that tampering has not occurred with regard to content since the time of registration, nor do they work well for virtually any digital document, whether a text file, an image, a scanned copy of a paper document, or other stored files. A real-time system for document certification without transfer of the file has not been taught, nor has such a tool been successfully commercialized.
Summary and Objects of the Invention
The present invention relies upon the presence of an informational network connection between a secure, independent host computer and a client computer that wishes to obtain or validate a time-date-content certification for a document stored within it. The client begins with an agent program, either resident within the client system, downloaded via the network, or made available to the client via some other route. The agent program executes within the client system and analyzes the document or documents to be certified and, based upon the results of this analysis, produces a unique checksum identifier. The checksum is generated in such a manner, and of sufficient length, that any alteration of the target data is highly likely to produce large and detectable changes in the checksum. Thus, it is unlikely that alteration of the document from its form at the time of the initial analysis would go undetected.
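
As an illustration of the sensitivity property described above, the following minimal Python sketch shows a one-character alteration producing a different checksum. It assumes SHA-256 from Python's standard `hashlib` module as a stand-in; the checksum algorithm of the present invention is not reproduced here, and the function name `checksum` and the sample strings are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest standing in for the document checksum (assumed: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


original = b"When in the Course of human events"
altered = b"Then in the Course of human events"  # one character changed, same length

first = checksum(original)
second = checksum(altered)

# Count how many positions of the two digests differ.
differing_positions = sum(a != b for a, b in zip(first, second))

print("checksums differ:", first != second)
print("differing hex positions:", differing_positions, "of", len(first))
```

With a cryptographic hash the change spreads across the whole digest, which is a stronger diffusion than a scheme in which one altered character affects only a bounded number of checksum bytes.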
The checksum generated as described above is then sent by the agent program from the client to a remote, secure, and third-party host over a network for permanent entry into a registry. In this registry, the host computer stores, at the minimum, this checksum, the time and/or date of registration, and a unique registration number that allows for the future identification of the file, as well as the validation of the integrity of the current document with respect to the document as initially registered. In addition, the host may create a certification document that is downloaded to the client for storage. This certification document contains at least a registration number, and may contain any other information as required for document validation, such as encoding schemes, agent version numbers, and the original file name. This certification document may also contain information needed for the conduct of business, such as billing information, limits on the number of validation queries to the host that are prepaid or gratis, and other information helpful to the conduct of business. The certification document may be stored by the agent program as a registration certificate, or may be used to mark the document as a registered document, such as by renaming the document, or a read-only copy of the document, to include the registration number in the document name. This certificate can be viewed and validated, as needed, by any client having electronic access to both the original document and the registration number (either contained in a registration certificate file or embedded within the document name). The validation will fail if the document has been altered in any way since registration. A salient feature of the present invention is that only the checksum is transmitted to the registration system, rather than the target document.
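
A rough sketch of the minimum registry record described above (checksum, time of registration, and a unique registration number) might look like the following. Everything here is an assumption for illustration: the class names `RegistryEntry` and `Registry`, the `REG-000001` numbering format, the in-memory storage, and the use of SHA-256 on the client side. Note that only the checksum string crosses from the client code to the registry.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RegistryEntry:
    registration_number: str  # unique, host-generated
    checksum: str             # calculated on the client; the document never arrives
    registered_at: datetime   # time and date of registration


class Registry:
    """In-memory stand-in for the host registry (hypothetical)."""

    def __init__(self):
        self._entries = {}
        self._counter = 0

    def register(self, checksum: str) -> RegistryEntry:
        self._counter += 1
        number = f"REG-{self._counter:06d}"  # hypothetical numbering scheme
        entry = RegistryEntry(number, checksum, datetime.now(timezone.utc))
        self._entries[number] = entry
        return entry

    def lookup(self, registration_number: str):
        return self._entries.get(registration_number)


if __name__ == "__main__":
    host = Registry()
    document = b"contents that stay on the client"
    entry = host.register(hashlib.sha256(document).hexdigest())
    print(entry.registration_number, entry.registered_at.isoformat())
```

A certificate returned to the client could then be a serialized copy of such an entry plus any validation metadata, such as the encoding scheme and agent version.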
The security of the target document is maintained since the electronic document is not transmitted to the registering host system, even in part or in an encoded form. No level of eavesdropping will reveal the contents of the registered document to an unintended eavesdropper. Similarly, during file verification, the contents of the file remain secure as they are not required to be transmitted over the network. A second feature is that the registration or verification processes can be anonymous and/or initiated by any client. In contrast to many document registration systems and methods, there does not need to be any identification of the submitting or verifying clients. Thus, any party in possession of an electronic document can obtain the downloadable agent program software and access the host site to register a document, all without revealing the registering party's identity. Similarly, a party in possession of an electronic document and either the corresponding registration certificate or registration number can verify the integrity of the file by accessing the host registration site over a network, all without revealing the verifying party's identity. If either the registration or verification process engenders a transaction fee, identity information can be collected, and possibly de-linked from the registered file, once any fees are paid. Conversely, if a digital signature verifying the identity of the user is made available, then the registration can be used to validate not only the integrity of content of a document at a later date, but also the identity of the signer or signers as well. A final feature is that this approach is unbiased and verifiable. Registration and verification are performed by a non-interested third-party host. Thus, registration and verification are performed without conflict of interest, and can be performed at a sufficiently secure level so as to be likely to be legally recognized.
Accordingly, an object of the present invention is to provide a legally-binding system and method to produce a verifiable certificate and/or verifiable registration number for any electronic document, or set of documents, accessible to a remote and anonymous client by providing an independent host system and long-term, archived registration certifying the time, date, checksum, and registration number of the registered documents. This checksum is selected so as to make vanishingly small the possibility of alteration of the document without alteration of the checksum.
A second object is that the integrity of the document, and either the registration certificate or the registration number, can be anonymously verified by any client with access to the document, and access to either the registration certificate or the registration number. A third object is that the document registered or verified can be in any digital information format or electronic form, provided only that it can be accessed in an unchanged digital form for later retrieval and verification. Such documents are intended to include a file, a folder, a hard drive or floppy disk, a spreadsheet, a database, an image, a scanned image of a paper document, a sound and/or audio recording, or any group of the above, or any group of one or more documents for which a certificate is desired. Another object is that the certificate checksum is stable with respect to an unchanged document, and does not necessarily require public or private keywords in order to encrypt/decrypt the file or to generate a valid checksum. A corollary object is that the checksum is of sufficient length, and is calculated in such a way, that even a small change in the document (such as an alteration of one bit or character) will be extremely likely to produce a substantial change in the checksum, and thus have a near certainty of detection during the verification process.
Another object is that the certificate can be validated on any client system with access to the registered document, whether it be the original client system or otherwise, by allowing any client system to recalculate the checksum using an agent program, to upload the certificate number, date, time, and checksum to the secure host, and to request a look-up and comparison of the original values stored within the registry. This allows any document to be time-, date-, and content-validated by any client system with access to the registered document and to its corresponding registration certificate, and in which the downloadable agent resides or is accessible. Another object is that the downloadable agent program can perform other key services, such as scanning for viruses or embedded macros, thus certifying that the document is also uninfected, or for performing a compression or encryption function, if desired. If the registration is performed prior to encryption, the encryption must be exactly reversible or else the checksum will no longer match. A final object is that this process may require payment of a transaction fee, and the identity of the payer may be optionally linked with the registry data, if desired.
The systems and methods as described have several advantages. One advantage is the time and date stamping program can be downloaded to nearly any system, as the calculation can be operating system independent. In addition, a downloaded agent program will always be the most advanced version of the agent program, though the system will maintain backward version compatibility.
Another unexpected advantage of the present invention is that it may be preferable to have an electronic version of a notarized or witnessed paper document than it would be to have the paper form. Paper documents suffer from a weak point that they can be later altered without much sign of the tampering. For example, if a notary public notarizes a document, this document can be later altered without affecting the appearance of the document in an obvious manner. However, if a conventional notary public notarizes a document, and then registers a digital scanned version of the electronic document using the present invention, then the document will remain unaltered in a verifiable manner. This raises the possibility of net-based notarization and certification of documents with advantages over the conventional methods of notarization.
There is provided a method for registering a document at a particular time and date in which a client system containing the electronic document to be validated can download from a secure third-party host site, or run from disk, an agent program, such that the agent program computes a unique checksum using the document to be certified, uploads that checksum to the secure host, which registers the checksum, time and/or date of registration, and other relevant data in a permanent private data record, and generates a certificate which is sent back to the client for storage. In one example, the host is an internet-connected Windows-based server, while the client is an internet-connected Windows-based personal computer that downloads a Visual Basic-based agent to perform the checksum. After upload of the checksum to the host, the host generates a certificate and downloads this certificate to the client. Further, once the certificate is generated, any client having access to both the certified document and its corresponding validation certificate (or registration number) can query the host computer via the internet, download the agent program, re-compute a checksum, and compare that checksum to that stored in the host system registry, in order to verify the authenticity of the registered document and its accompanying certificate for time, date, and content. A system for operating a web site and performing the above functions from a remote client is also described.
The breadth of uses and advantages of the present invention are best understood by example, and by a detailed explanation of the workings of a constructed system, now in operation and tested. These and other advantages of the invention will become apparent when viewed in light of accompanying drawings, examples, and detailed description.
Brief Description of the Drawings
The following drawings are provided:
Figure 1 is a schematic diagram of a host/client system in accordance with the invention.
Figure 2A-D show images from a functioning site with host and client program threads performed in accordance with the invention.
Figure 3 is a flow chart illustrating the key features of data flow for a host and client configured during document registration performed in accordance with the present invention.
Figure 4A-B show two images of a document, previously witnessed and notarized by a notary public, after scanning and digitization using a flatbed scanner, which allowed for subsequent document registration. Even though these images appear by eye to be identical, they yield different checksum identifiers upon registration.
Definitions
For the purposes of this invention, the following definitions are provided:
Host: A computer system that is connected to a network, such as the internet or a local intranet, within which is resident a permanent registry of registered files. In addition, an agent program downloadable to other systems for performing checksum calculations can be resident within the host. One example of a host is a secure internet-connected server.
Client: A system that may be connected to a network at times, and which can use that connection to download an agent program from the host computer, then use that agent to generate a unique checksum, then upload the checksum to the host for registration, and then finally download a document certificate for the registered document. One example of a client is a personal computer connected at times to an internet service provider via a dial-up modem.
Document: Any digital document stored within the reach of a client. A document can be text, graphics, spreadsheets, databases, drawings, digital images, audio or video clips, scanned or pdf (Adobe Systems, San Jose, CA) images of paper documents, raw data, groups of files (such as file folders, drives, or tapes), or any other digital material stored as a file.
Document Integrity: The stability of a document, with regard to maintenance of the exact and unchanged contents of the document, with respect to the contents that the document possessed at the specific time and date of registration, stable even to changes as small as one data bit. Document integrity can also include lack of changes to the document creator, registrant, signatory or signatories, or other related aspects of the document, if desired.
Checksum: A checksum is a unique string of binary values or text characters calculated from an electronic file or a set of binary data, such that the loss of, the addition of, or the alteration of even one data bit will most likely result in a detectable change in the checksum. A checksum is determined based upon a mathematical function applied to some or all of the digital information contained in the target document file. The mathematical function is selected such that changes to the target document produce large and inversely-unpredictable changes in the identifier. There are multiple methods known in the art to generate a checksum, and these vary in their complexity, speed, and degree of certainty that a corruption or change in the analyzed data will result in a detectable change in the checksum.
Registration: A process by which one or more documents are analyzed, the results of this analysis are entered into the host register, and a certificate is issued by the host confirming registration.
Validation: A process by which one or more registered documents are analyzed, the results of this analysis are compared to values previously entered into the host register, and a certificate is issued confirming or denying the integrity of the registered documents since the time and/or date of initial registration.
Certification: During registration, certification is a process by which one or more documents are analyzed, and issued one or more certificates containing data related to the registration process, such as confirmation of registration or registration numbers. During validation, certification is a process by which one or more documents are analyzed, and issued one or more certificates relating to the integrity of the documents, with respect to the content of the documents at the time of registration.
Registered Document: A document that has been registered and certified by the host computer.
Validated Document: A document that has previously been registered and certified, and for which a recalculation of the checksum is requested, and successfully matched to, the registered checksum. A recalculation of the checksum is performed by the client requesting validation, and the recalculated checksum is uploaded for comparison to the registered checksum previously stored by the host, as well as optionally compared to the certificate checksum previously stored in the document's registration certificate. If successful, validation affirms the integrity and stability of the registered document with respect to date, time, and content, as compared to the document when originally registered. Whether successful or not, validation results in the production of a certificate, either confirming or denying the integrity of the document as compared to the document at registration. Validation can be performed by any client, provided only that the document and its registration number (either contained within the document itself, or attached in the document's registration certificate) are accessible to the validating client.
Description of a Preferred Embodiment
One embodiment of the system and method will now be described with reference to Figure 1. In this system, client system 120 is connected to host server 130 through network 134 to allow connection of client system 120 to host server 130 when desired. In this embodiment, network 134 is a local area network connected to both client system 120 and host 130 via cables 136 and 138. Alternatively, network 134 is likely to be the internet, connected to client system 120 and host 130 by dial-in, DSL (digital subscriber line) phone connection, or by T-1 trunk line, connecting to the client and host over phone cables 136 and 138, respectively. If the internet is to be accessed, web browser 125 may be required for client system 120 to access the internet, with browsers such as Netscape 4.73 (Netscape, Mountain View, CA) or Internet Explorer 5.0 (Microsoft, Redmond, WA) being suitable examples. To begin, client system 120 has one or more target documents 121 to be registered, stored on disk drive, tape, floppy drive, CD-ROM, DVD, or other storage media, for which registration and certification is desired. In this embodiment, client system 120 accesses internet network 134 using internet browser 125 and the user of client system 120 selects dialog button 202 (Figure 2A) to begin registration of a document. The user then selects dialog button 203 (Figure 2B) to enable a download of agent program 145. In this embodiment, downloaded agent program 145 is written in Microsoft Visual Basic Pro 6.0 (Microsoft, Redmond, WA), and is stored in memory 123 of client system 120. In other embodiments, agent program 145 may be written in any computer language appropriate for the client computer, such as Java++, Visual Basic, Visual C++, Java, or others. If written in Java, agent program 145 could run as a script operating within the framework of the browser. Once downloaded, agent program 145 runs automatically.
Agent program 145 is also configured so as to be fugitive, and remains in memory 123 of client system 120 only so long as to perform the needed functions, and is then deleted from client system 120 without transfer to disk. However, agent program 145 may also be configured to be saved by the client system 120 for future use. In such cases in which agent program 145 is stored, agent program 145 may be loaded by disk or other transfer, rather than downloaded via the network 134, and then checked for version updates during access to the host 130 over the internet or other network system.
Agent program 145 continues by prompting the user to identify the file, folder, drive, or other storage location for which registration and certification is requested, as shown in Figure 2C. Agent program 145 will calculate a checksum value for, and then register, the file, folder, drive, or other storage medium selected. Alternatively, agent program 145 can generate a checksum value for, and then register, each document in the selected folder or drive. Once the target documents are selected, the user selects dialog button 208 in Figure 2C to complete the registration process.
Checksum 155, a key element of the registration process, is now described in detail. A checksum is a unique string of binary values and/or text characters calculated from an electronic document file or from a set of binary data. Checksum 155 is calculated such that the loss of, the addition of, or the alteration of even one data bit will result in a detectable change in the checksum. Such checksums are often used internally by a computer, or externally by modem and internet programs, to detect errors in transmission, reception, or storage of a file. There are multiple methods known in the art to generate a checksum, and these vary in their complexity, speed, and degree of certainty that a corruption or change in the analyzed data will result in a detectable change in the checksum.
In the case of the present invention, the goal of using a checksum is to provide a digital measure of the integrity of a file, such that any alteration, deletion, or insertion of data will almost certainly result in a large change in the checksum. Such overwhelming odds of altering the checksum by modifying the file make it very difficult to intentionally alter a file, yet retain the checksum at its original value. In order to achieve this degree of security, the checksum should be long, and the method of calculation difficult to reverse (e.g., difficult to predict what changes in the document could be made such that, taken together, the checksum remains unchanged). The difficulty of tampering to yield an altered file with an identical checksum rises exponentially with the number of bytes in the checksum string (a string is a set of characters, numbers, or symbols strung together into a single "word"). The difficulty of tampering rises as a power function of the number of characters in the alphabet used to create each byte in the checksum string. For example, if there were only 16 characters in the checksum, and there were 64 choices for each character, there would be 64^16 (about 7.9 x 10^28) possibilities for the checksum, a number which would require a computer checking a new sum every microsecond an average of over a trillion years to reproduce a given checksum. With this in mind, in the preferred embodiment, checksum 155 is 16 characters in length and uses a 64 character alphabet composed of upper case letters, lower case letters, the digits 1 through 0, and two repeated alphabet characters ("a" and "b") to bring the total to 64 characters, based upon the calculated integer values of the checksum for each character.
However, this checksum length and/or character set may be expanded or contracted, provided only that the length and character set used in the checksum are chosen such that the chance of modifying the registered document without producing a corresponding change in the checksum is reduced to nearly zero, balanced by the need for a reasonable calculation time.
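The size of this search space can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 16-character, 64-symbol example from the text and a hypothetical attacker who tests one candidate checksum per microsecond; the constant names are illustrative only.

```python
# Assumptions taken from the example in the text: 16 characters, 64 symbols
# per character, and one candidate checksum tested per microsecond.
SUM_LENGTH = 16
ALPHABET_SIZE = 64
TRIALS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600

possibilities = ALPHABET_SIZE ** SUM_LENGTH      # 64^16, which equals 2^96
average_trials = possibilities // 2              # on average, half the space is searched
average_years = average_trials / TRIALS_PER_SECOND / SECONDS_PER_YEAR

print(f"{possibilities:.3e} possible checksums")
print(f"roughly {average_years:.2e} years of searching on average")
```

Because the preferred alphabet repeats the characters "a" and "b", the number of visually distinct 16-character strings is somewhat smaller (62^16), which does not change the order of magnitude of the estimate.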
A preferred method of determination of checksum 155 is now discussed in detail. Many encoding schemes are known, and involve mathematical procedures in which each data bit in the document influences one or more bytes in the character string. In the present embodiment, a simple single-pass checksum is generated by sequentially loading each 8-bit byte in the document file (decimal value 0-255) and performing a mathematical operation on each. This operation has multiple components. To begin with, the first byte in the document is added, after a division and a multiplication, to the first byte of the checksum, and is also added after different multiplication and division operations to other character bytes, as is determined by the variable called "Influence", to be discussed later. Each time a new character is read, this "base byte" is increased by one. For example, the base byte for the second byte in the document is the second byte in the checksum. When the base byte exceeds the length of the checksum, then the base byte resets to point at the first character of the checksum. For example, with a checksum sixteen bytes in length, the sixteenth byte of the document has a base byte of the sixteenth byte of the checksum, while the seventeenth byte of the document has a base byte of the first byte of the checksum. This rotating base byte continues incrementing and resetting as above, until every byte of the document has been read, processed, and entered into the checksum.
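The rotating base byte amounts to modular arithmetic on the document position. A minimal sketch follows, using 1-based positions as in the text; the function name base_byte is hypothetical and not taken from the described agent program.

```python
def base_byte(doc_location: int, sum_length: int = 16) -> int:
    """Return the 1-based checksum position used as the base byte
    for the document byte at 1-based position doc_location."""
    return (doc_location - 1) % sum_length + 1


# Document bytes 1..16 map to checksum bytes 1..16, then the cycle restarts.
for location in (1, 2, 16, 17, 33):
    print(location, "->", base_byte(location))
```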
In this embodiment, the first division operation that each byte in the document undergoes involves a truncated division. This truncated division uses an operation termed FRAC, which yields the portion of the division result to the right of the decimal point, with the integer portion rejected. For example, 10 FRAC 7 is equal to the result of 10 divided by 7, or 1.42857142, with the integer portion removed, or 0.42857142. The discarding of the integer component by the FRAC function, and the degree of changes in the FRAC function result with small changes in the document byte, ensures that the checksum irreversibly holds less information than the document, with the result that the checksum is simple to generate, but difficult to reverse calculate. This makes it more difficult to alter a file in a manner that would produce no evidence of change in the checksum.
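The FRAC operation can be illustrated as follows. This is a sketch only: the function name frac is hypothetical, and exact rational arithmetic is used here (an assumption, chosen to avoid floating-point rounding) rather than whatever numeric type the original agent used.

```python
from fractions import Fraction


def frac(dividend: int, divisor: int) -> Fraction:
    """Fractional part of dividend / divisor, with the integer part discarded."""
    quotient = Fraction(dividend, divisor)
    return quotient - (quotient.numerator // quotient.denominator)


print(float(frac(10, 7)))   # the fractional part of 1.42857142...

# Several different dividends share the same fractional part, which is why
# the operation cannot be uniquely reversed.
print(frac(3, 7) == frac(10, 7) == frac(17, 7))
```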
In addition to the FRAC calculation, the present embodiment includes values termed "Influence" and "Offset." The value of "Influence" determines how many characters in the checksum will be affected by each byte in the document. If each byte in the document affects only the base byte, then it would be relatively simple to alter a character in one place in the document, and then to correct any changes in checksum by altering one more character. By having each byte in the document influence a base byte in the checksum, but also to influence several other bytes, the checksum becomes more resistant to tampering. Thus, if "Influence" equals 8, then each byte in the document will affect its base byte, plus 8 additional bytes in the checksum. Again, the calculations are selected such that there is an irreversible calculation involving each of the bytes under the influence of a character in the original document.
In this embodiment, the checksum is calculated by performing a FRAC division of the decimal value of each byte in the document (with a value between 0 and 255 for each byte) by a series of odd numbers (3, 5, 7, 9, ...), with the number of divisions performed equal to Influence + 1. The first division (Offset = 0) is added to the starting character of the checksum, and each time a division is made the location of the character in the checksum to which the result is added is increased by 1 (Offset = 1, 2, 3, ... up to Influence).
The multiplication step that each result from the FRAC function (result between 0 and 1) undergoes is a multiplication by the maximum value allowed for each checksum character, followed by an integer truncation of the decimal portion of the value. As a result, instead of ranging between 0 and 1, the result ranges between 0 and the maximum value for each byte. Note that the final integer result after multiplication of the FRAC result now swings wildly but consistently with changes in the value of the document byte and with changes in the value of the odd number divisors, as shown in the following table when the byte value is 33 (the ASCII code for the "!" character):
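[Table image imgf000014_0001: integer results for a document byte value of 33 with successive odd-number divisors; not reproduced in this text.]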
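Since the table itself is not available in this text, the kind of values it describes can be regenerated with a short sketch. The following assumes MaxValue = 64 and Influence = 8 (illustrative choices, not confirmed as the values behind the original table) and uses integer arithmetic that is mathematically equivalent to truncating FRAC(byte / divisor) * MaxValue; the function names are hypothetical.

```python
def added_value(doc_value: int, divisor: int, max_value: int = 64) -> int:
    """Integer part of FRAC(doc_value / divisor) * max_value, computed exactly."""
    return (doc_value % divisor) * max_value // divisor


def divisor_table(doc_value: int, influence: int = 8, max_value: int = 64):
    """One (offset, divisor, added value) row per odd divisor 3, 5, 7, ..."""
    rows = []
    for offset in range(influence + 1):
        divisor = offset * 2 + 3
        rows.append((offset, divisor, added_value(doc_value, divisor, max_value)))
    return rows


for offset, divisor, value in divisor_table(33):
    print(f"offset={offset}  divisor={divisor:2d}  added={value:2d}")
```

The printed values jump between small and large numbers from one divisor to the next, which is the "swinging" behavior the text describes; they are outputs of this sketch, not a transcription of the original table.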
The steps for determining the checksum, using this method, are outlined below:
Step 1: Determine the constants. In this example, the constants are as follows:
(a) Set Influence equal to the number of bytes in the checksum that will be influenced by any single byte in the document.
(b) Set MaxValue equal to the maximum integer value that any byte in the checksum is allowed to have. If MaxValue = 256, then each byte in the checksum may vary from 0 to 256. If MaxValue = 64, then each byte can vary from 0 to 64.
(c) Set DocLength equal to the number of bytes in the document to be registered.
(d) Set SumLength equal to the number of characters in the checksum.
Step 2: Begin with DocLocation, the integer location of each byte in the document, equal to 1. This selects initially for the first byte in the document. Incrementing DocLocation by 1 after each loop of Steps 2-5, up to a maximum value of DocLength, allows each of the bytes in the original document to contribute to the final checksum value. For each value of DocLocation, do steps 3-5.
Step 3: Set DocValue equal to the binary value of the document at byte DocLocation. Each DocValue will range from 0 to 255 for a typical computer using an 8-bit byte.
Step 4: Set Offset equal to 0, which will store the result of the calculation in the base byte, and then increment Offset by 1 up to a maximum value equal to Influence. For each value of Offset, do the following operation:
(a) Set Divisor equal to Offset * 2 + 3, effectively beginning Divisor at 3, and increasing as odd numbers 3, 5, 7, and so on, until Offset is equal to Influence.
(b) Set SumLocation, the location of the current byte in the checksum, equal to DocLocation + Offset. If this value exceeds SumLength, the length of the checksum, subtract SumLength repeatedly until SumLocation is less than or equal to SumLength.
(c) Set CurrentValue equal to the current value of the checksum at byte SumLocation.
(d) Set AddedValue equal to the value of the FRAC calculation of DocValue divided by Divisor, which will be a fraction between 0 and 1, multiplied by MaxValue, to yield a number between 0 and MaxValue.
(e) Add CurrentValue and AddedValue to yield SumValue. If SumValue exceeds MaxValue, the maximum value for each character in the checksum, then MaxValue is repeatedly subtracted from SumValue until SumValue is less than or equal to MaxValue. The integer of SumValue is now saved in the checksum at SumLocation.
(f) Increment Offset, and repeat steps (a) through (f) until Offset would exceed Influence, and therefore the base byte plus Influence other bytes of the checksum have been affected by the current byte in the document.
Step 5: Return to Steps 2-4, until all bytes in the document have been read.
The final values of the checksum at all bytes, 1 through SumLength, are converted to letter and number characters, using the rule that the integer value "Position" of the checksum at each byte is replaced with the character appearing as character number "Position" in the conversion string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890ab", and the resultant string is saved as checksum value 155.
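The steps above can be put together as a short sketch. This is an illustrative reading of Steps 1 through 5, not the original Visual Basic agent: it assumes SumLength = 16, Influence = 8, and MaxValue = 64, uses 0-based indexing internally, truncates each added value to an integer immediately after the multiplication, and maps a checksum value of 0 (which the text does not address) to the last character of the conversion string. The name sketch_checksum is hypothetical.

```python
# Conversion string as described in the text: upper case, lower case,
# digits 1 through 0, then a repeated "a" and "b" (64 characters in total).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890ab"


def sketch_checksum(data: bytes, sum_length: int = 16,
                    influence: int = 8, max_value: int = 64) -> str:
    sums = [0] * sum_length                        # checksum bytes start at 0
    for doc_location, doc_value in enumerate(data):            # Steps 2, 3, 5
        for offset in range(influence + 1):                    # Step 4
            divisor = offset * 2 + 3                           # (a) 3, 5, 7, ...
            sum_location = (doc_location + offset) % sum_length  # (b) wrap around
            current = sums[sum_location]                       # (c)
            # (d) integer part of FRAC(doc_value / divisor) * max_value
            added = (doc_value % divisor) * max_value // divisor
            total = current + added                            # (e)
            while total > max_value:
                total -= max_value
            sums[sum_location] = total
    # Assumption: value v selects character number v (1-based); a value of 0
    # wraps around to the last character of the conversion string.
    return "".join(ALPHABET[(v - 1) % len(ALPHABET)] for v in sums)


original = sketch_checksum(b"hello world")
altered = sketch_checksum(b"hello worle")
print(original, altered, original != altered)
```

A simple scheme like this illustrates the mechanics only; it should not be assumed to offer the collision resistance of a modern cryptographic hash function.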
Alternative methods of calculating a checksum may be used, and fall within the spirit of the invention if small changes in the document to be registered reliably produce changes in checksum 155. For example, the file name of the document may be used in the determination of checksum 155, or a simple addition, without the FRAC function, may be used to improve checksum speed (though the irreversibility of the calculation will likely suffer). Checksum 155 is then stored in client memory 123, uploaded by client system 120 to host 130 via network connections 134, 136, and 138, and saved in host registry 147. Checksum 155 may be composed of N multiple checksums if multiple files are scanned, such as a scan of a folder. Host 130 may additionally perform additional encoding steps on the values received, before storing such data in registry 147. Checksum 155 may be omitted from the certificate, as keeping the registered sum private and internal to the host system may make fabrication of a falsified file incrementally more difficult.
Upon receiving data sent by client system 120, and after possible additional checksum calculations, host 130 assigns a certificate number, and records the checksum, date, time, registration number, encoding scheme, and any other identifying data requested, into registry 147. In addition, digital certificate 166 is generated and contains some or all of the data saved in the registry, and is stored in host memory 149. Certificate 166 is then downloaded to client system 120 via network 134 and cables 136 and 138 for storage in document and certificate folder 175. Host 130 may place contents of the registry in backup archive 178 via connection 179, for long-term retrievable storage.
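The host-side bookkeeping described here can be sketched as a small in-memory registry. Everything in this example is an assumption made for illustration: the class and field names, the sequential registration numbers, the encoding-scheme label, and the use of a dictionary as the certificate are hypothetical and not taken from the described system, which would use persistent, archived storage.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from itertools import count


@dataclass(frozen=True)
class RegistryRecord:
    registration_number: int
    checksum: str
    registered_at: datetime      # date and time of registration
    encoding_scheme: str         # lets a later agent repeat the same calculation


class HostRegistry:
    """In-memory stand-in for the host registry (hypothetical)."""

    def __init__(self) -> None:
        self._records: dict[int, RegistryRecord] = {}
        self._numbers = count(1)

    def register(self, checksum: str, encoding_scheme: str = "frac-16x64") -> dict:
        record = RegistryRecord(
            registration_number=next(self._numbers),
            checksum=checksum,
            registered_at=datetime.now(timezone.utc),
            encoding_scheme=encoding_scheme,
        )
        self._records[record.registration_number] = record
        return asdict(record)    # the "certificate" returned to the client


host = HostRegistry()
certificate = host.register("AbCdEfGhIjKlMnOp")
print(certificate["registration_number"], certificate["registered_at"].isoformat())
```

Only the checksum and metadata reach the host in this sketch; the document bytes never leave the client, which matches the privacy property the text emphasizes.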
The agent program may rename the document file in the client system 120, make the registered documents read only, move the file or certificate to new directories, or take other actions to both indicate which file has been registered and/or attempt to prevent unintentional modification or erasure of the registered version of the program. An example of renaming will be illustrated later in Example 1.
At a later date, a user of client system 120, or of new client system 180, connected to the network via cable 182, may wish to re-certify or verify the integrity of the registered copies of documents 121, and corresponding digital certificate 166 if one is available. To do so, client system 120 or new client 180 must have access to copies of previously-registered documents 121, and to any corresponding certificates in folder 175. Client system 120 or client 180 then accesses host 130 via network 134 and cables 138 and 136 or 182, to download agent 145. In the remaining discussion of the preferred embodiment, client system 120 will be verifying previously registered files, but the methods could apply equally well to a new client system 180 employing the same actions and method.
To verify the integrity of a file previously registered, the user accesses agent program 145, such as by downloading agent program 145 into client memory 123. The user selects validation by selecting command button 205 (shown in Figure 2A). Agent 145, operating from within client system 120, then prompts the user to select the name of the document or documents to be validated from client system 120. If the document has a certificate available, agent program 145 will search for and find, or alternatively ask the user for, the name of certificate 166 in certificate folder 175, and agent program 145 automatically resubmits certificate 166 to verify that it agrees with the stored registration data in archive 178 of host 130. Alternatively, if no certificate is present, but a registration number is available (such as if the registration number appears in the document file name, or is available from other records), agent program 145 will first upload the registration number to host 130, and host 130 will download the encoding scheme used for the registration. A copy of the previously-registered document, accessible from the client system, will be opened and the current checksum will be recalculated by agent 145. Knowledge of the encoding scheme is essential for agent program 145 to accurately recreate the checksum, and may include information regarding the length of the checksum, the size of the character set used, the agent and algorithm used at the time of the initial registration, and so on, to provide for backward compatibility. Agent program 145 then recalculates a current checksum, using the same length, character set, and algorithm as used for the initial registration, and sends the recalculated checksum for comparison to the value archived on host 130. Host 130 retrieves the archived registration data via network connection 134 and cables 138 and 136 or 182.
Host 130 retrieves the registered certificate data stored in archive 178, via connecting link 179 between host 130 and archive 178, using the registration number provided when the document was initially registered, and then compares the newly sent checksum and registration certificate (if any) to that stored within the registry.
If the information matches, indicating that the file has not been altered since it was registered, host 130 sends a message to agent program 145 running on client system 120 that the integrity of the file has been validated, namely that the document is without alteration since the specific date and time of registration. If the information does not match, indicating that the file has been altered since it was registered, host 130 first rechecks the data, to ensure that an error has not been made, and then sends a message to the agent running on client system 120 that the file is invalid due to alteration of the file. These messages may be presented to the user in many ways, such as by using a Windows 2000 message box function or by running a Java-based notification routine in a browser used to access the internet. If a certificate is present, host 130 may also indicate what portions of the certificate have been altered, if determinable. In addition, other messages may be generated, such as that the document is intact but the certificate has been altered, or that both the certificate and the document have been altered. The program flow, and the direction of data transfer, are illustrated in Figure 3. To begin a registration or validation, client system 120 uploads a request to host 130 for downloading of agent 145, as shown at flow chart step 187. Host 130 responds by downloading agent 145, as shown at flow chart step 191. Steps 187 and 191 are optional, as they can be ignored if the agent is available to client system 120 without need for download. Next, client system 120 uploads data for a certificate, including a calculated checksum or a registration number, or both, to host 130, as shown at flow chart step 193.
In the case of registration, host 130 stores checksum 155 using a unique host-generated registration number, while in the case of validation host 130 requires an uploading of the registration number in addition to checksum 155 in order to perform the validation process. In either case, host 130 transmits the results of the requested registration or validation process in a downloaded certificate, as shown at flow chart step 195. Optionally, agent program 145 may request that the process be confirmed or verified, which may add two or more optional steps, as shown in program flow chart steps 197 and 199.
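The comparison performed during validation, and the different messages that can result, might be sketched as follows. This is a minimal illustration under stated assumptions: the registry is a plain dictionary keyed by registration number, the message strings are invented, and the function names host_validate and validation_message are hypothetical.

```python
from typing import Optional

# Hypothetical stand-in for the host registry: registration number -> checksum.
REGISTRY = {1001: "AbCdEfGhIjKlMnOp"}


def validation_message(stored: str, recomputed: str,
                       certificate_checksum: Optional[str] = None) -> str:
    """Compare the recomputed (and optional certificate) checksum to the stored one."""
    document_ok = recomputed == stored
    certificate_ok = certificate_checksum is None or certificate_checksum == stored
    if document_ok and certificate_ok:
        return "valid: document unaltered since registration"
    if document_ok:
        return "document intact, but certificate altered"
    if certificate_ok:
        return "invalid: document altered since registration"
    return "invalid: document and certificate altered"


def host_validate(registration_number: int, recomputed: str,
                  certificate_checksum: Optional[str] = None) -> str:
    stored = REGISTRY.get(registration_number)
    if stored is None:
        return "unknown registration number"
    return validation_message(stored, recomputed, certificate_checksum)


print(host_validate(1001, "AbCdEfGhIjKlMnOp"))
print(host_validate(1001, "XXXXXXXXXXXXXXXX", "AbCdEfGhIjKlMnOp"))
```

As in the registration flow, the client uploads only a registration number and checksum values, so the validating party need not reveal the document or its own identity.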
Note that in neither the registration nor verification processes has there been a requirement for the document to be transmitted, either in whole or in part, to host 130 or to another client. A primary goal of this invention is to establish a secure method and system for an anonymous user to easily, reliably, and certifiably time stamp an electronic document, as well as be able to verify its integrity with respect to the original registration, independent of the identity of the creator or future verifier. Similarly, a client checking the validity of a document can do so in anonymity, unless it is selected otherwise during the registration of the certificate. Thus, both users who register documents and users who validate certificates can feel free to use such a system without giving away their identity (other than for billing purposes) or their data. In this manner, any user can feel comfortable obtaining a certificate, even if the data on the client's computer is highly sensitive or if the data cannot be transmitted due to security or intellectual property issues, or if the user does not wish to reveal his or her identity. This combination of verifiable document registry and validation, without requiring user-identifying signatures, and/or without transfer of the document data, is unique and allows for a high level of security during document registration and validation.
However, situations are conceivable in which the publication or transfer of a registered file may be advantageous. For example, a contract may be posted at a public site. Users downloading this contract can perform a validation check, based upon a registration number provided on a public or private internet web page. The validation of the document confirms that the document has not been altered since it was posted at the web site. This may allow, for example, for multiple users to complete the closing of a contract, with each of the users at different sites, and with assurance that the documents each completes will remain unchanged and without tampering should future reference to the completed documents become desired. A similar level of assurance may be offered when downloading software. Here, a user would download software from a public site, then validate the software using a public or private posting of the registration number at a user-accessible web site page. A confirmation of the validity of the file indicates that the file has not been altered since it was posted on the web site by the authoring software company. It may also be desirable to identify the registering or validating user and client system, such as to charge a client for registration or validation, or to charge the initial registering client when other new clients desire to verify copies of the previously registered document. In such cases, data identifying the client can be collected but delinked from the registered files in such a manner so as to continue to provide anonymity, when desired.
With respect to notarization, the establishment of the date, time, and integrity of a document is but one aspect of the notarization function. A second aspect is that the identification of the signatory client is confirmed through the requirement for positive identification. In this respect, digital signatures are reaching the point of being legally valid and binding. Digital signatures offer the opportunity to execute agreements, contracts, and the like at a distance, such as over the internet. The inclusion of digital signatures in the registration process can add value to the present method, such as by allowing for the registration of notarized documents, after electronic scanning, to ensure that such documents remain unchanged over time with respect to the date, time, identity of participants, and even place of registration. Further, as documents may require serial or simultaneous signatures, a registered document allows for each party to sign, while keeping the original document intact. Once all signatures have been provided, the entire document may be re-registered, in order to lock in the signatures and to reestablish a new document that is to remain stable with respect to time, date, and signatories. A web site may be provided to facilitate simultaneous signature at multiple locations. Such a web site would include features such as collection of digital ID's, but attachment of these ID's to the target document only upon collection of all signatures. This would provide a unique business opportunity to provide for internet or network based notarization. Last, some users may wish for a paper document to be saved elsewhere. For example, a user with an email-PC (e-mail only personal computer) may not have any disk drives for data storage, or may have limited disk space. For such users, the document to be notarized or signed may reside at the host computer site, within registry 147 or archive 178.
Again, the creation of a web-based notary system provides a unique business opportunity.
Examples
The breadth of uses of the present invention is best understood by example, six of which are provided below. These examples are by no means intended to be inclusive of all uses and applications of the system, but merely to serve as case studies by which a person skilled in the art can better appreciate the methods of utilizing, and the scope of, such a system.
Example 1: Registering a Document Resident Accessible by a Client
To test and demonstrate this method in use, a registration system was written in Visual Basic 6.0. A live host program and an executable agent program were installed on a computer (700 MHz Pentium Pro Inspiron 7500, Dell, Round Rock, TX). In this example, both client system 120 and host 130 had graphical user interfaces written in Visual Basic Pro 6.0 (Microsoft Corporation, Redmond, WA), with host 130 running on a continuous basis. Agent program 145 required an execution request from a user in order to begin running. The host and agent software communicated by buffer, in a similar manner of input/output as would occur as upload and download buffers when the programs operate in different computer systems linked by a network such as the internet.
In this example, a user was operating client system 120 for the first time, and both the registration and validation processes were configured to be free of charge. Client system 120 had several stored documents for which registration and certification were desired by the user.
To begin, client system 120 opened up a browser, in this case Netscape Navigator 4.73 (Netscape, Mountain View, CA), and the user entered a network address for host 130. In this system, agent program 145 was then copied by host 130 to memory 123 of client system 120, and agent program 145 started and ran automatically. The user of client system 120 then viewed opening screen 208 of agent 145, and the user selected dialog button 202, marked "Register a New Document" (shown in Figure 2A) and then button 203, marked "Download Now" (shown in Figure 2B). In this example, both agent program 145 and host 130 had graphical user interfaces written and compiled in Visual Basic Pro 6.0 (Microsoft Corporation, Redmond, WA). Next, agent program 145 prompted the user to identify the document, document folder, or storage disk for which certification was requested. In this example, the user of client system 120 selected two documents, named 'Declaration.txt' and 'Declaration.doc'. These two documents were created using the text of the first two sentences of the Declaration of Independence, including capitalization and punctuation, as provided at the National Archives and Records Administration, transcription files, Washington, DC. The first document was saved in text format (although any format, including image files, could have been registered), while the second document was saved as a Word 2000 file (Microsoft Corporation). The exact text used was as follows (quotation marks added here but not included in the document file during analysis):
"When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation. We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness."
Beginning with the file 'Declaration.txt', agent program 145 proceeded to determine checksum 155 for each document, in sequence. The calculation of the checksum proceeded as follows, adhering to the numbered steps as described in the preferred embodiment: In step 1, the following constants were set: SumLength = 16, SumMax = 64, Influence = 8 characters, and DocLength of approximately 618 characters. In step 2, DocLocation = 1, allowing the first character in the document to be read in step 3. The first character read had DocValue = 87, or the ASCII-standard value for the letter "W". In step 4, Offset began at 0, yielding SumLocation = 1 and Divisor = 3. As the Frac calculation can be expressed as Integer[Frac(DocValue/Divisor)*SumMax], AddedValue = Int[Frac(87/3) * 64], or AddedValue = 0.
As the beginning value in each location of the checksum was set to zero, the sum of the new AddedValue and the old checksum value CurrentValue at SumLocation of 1 yielded SumValue = CurrentValue + AddedValue, or SumValue was equal to zero. Step 4 looped as Offset was increased from 0 up to 8 (the maximum value of 8 determined by the value of the variable, Influence). For example, in the second loop of step 4, Offset = 1, yielding SumLocation = 2, Divisor = 5, and an ASCII value for the first character of 87. Thus, AddedValue = Int[Frac(87/5)*64] = 25. Note that the Fractional calculation is not reversible, making it difficult to predict the effect of changes in the file upon the checksum. This is helpful, as it makes tampering with the file far more difficult to achieve in the absence of alterations to the checksum. Offset continued to increase up to a maximum of 8, the value of Influence.
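The two worked values above can be checked with a short Python sketch. It covers only the AddedValue expression Int[Frac(DocValue/Divisor)*SumMax]; the rule that generates Divisor from Offset and the mapping of checksum values to printable characters are not restated in this passage, so they are not attempted. The function name added_value is hypothetical, and integer arithmetic is substituted for the fractional calculation to avoid floating-point rounding.

```python
def added_value(doc_value, divisor, sum_max=64):
    """Return Int[Frac(doc_value / divisor) * sum_max] using integer math."""
    # Frac(a / d) equals (a mod d) / d, so multiply the remainder by
    # sum_max first and then take the floor of the division.
    return (doc_value % divisor) * sum_max // divisor


w = ord("W")                # ASCII value of the first character, 87
first = added_value(w, 3)   # Offset = 0, Divisor = 3 -> 0
second = added_value(w, 5)  # Offset = 1, Divisor = 5 -> 25
```

Because only the fractional part of the quotient survives, many different character values map to the same AddedValue, which is why the calculation cannot be run backward.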
Once Offset had looped through its cycles, DocLocation was increased by one, and the cycle was repeated. Once DocLocation had been increased up to the DocLength, the size of the document to be registered, Offset took one final loop. In this loop, with DocLocation equal to DocLength, the byte read was the final character in the document. As the position of the final character of the document, with DocLength = 618, exceeded the length of the checksum, at SumLength = 16, the base byte for DocLocation was determined by repeatedly subtracting SumLength from DocLocation, until the value was less than or equal to SumLength, or in this case when SumLocation equaled 8. For this final character, DocValue = 46, the ASCII-standard value of the final period (".").
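The wrap-around of a document position onto the checksum can be sketched as follows. The function name base_sum_location is hypothetical, and the sample positions are chosen only for illustration; they are not taken from the document analyzed in this example.

```python
def base_sum_location(doc_location, sum_length=16):
    """Map a 1-based document position onto a 1-based checksum position."""
    location = doc_location
    # Subtract the checksum length repeatedly until the position fits.
    while location > sum_length:
        location -= sum_length
    return location


samples = {pos: base_sum_location(pos) for pos in (1, 16, 17, 40)}
# samples == {1: 1, 16: 16, 17: 1, 40: 8}
```

The loop is equivalent to the closed form (doc_location - 1) % sum_length + 1, which keeps positions in the range 1 to SumLength rather than 0 to SumLength - 1.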
At the end of this analysis, the checksum for 'Declaration.txt' was "bSVodXSE9XtGcjD5", and the file was assigned a registration number of AAA-10072-000616. Using this registration number, a copy of the document file was created, made read-only, and was given the name "Declaration.Reg_AAA-10072-000616.txt". Note that the extension '.txt' remains at the end of the file name, despite the insertion of a registration number, allowing the file type to remain stable. A certificate was also stored, in this case in the same directory. Alternatively, the security certificate could have been stored in another directory, such as the Windows security certificate folder. In this example, the name of the certificate 209 was made similar to the name of the registered read-only document as "Declaration.Reg_AAA-10072-000616.txt.crt". Certificate 209 was then displayed to the user, as shown in Figure 2D.
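The naming convention used for the registered copy and its certificate can be reproduced with a short Python sketch. The function names registered_name and certificate_name are hypothetical; only the resulting file names come from this example.

```python
import os


def registered_name(filename, reg_number):
    """Insert '.Reg_<number>' ahead of the extension so the file type is kept."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}.Reg_{reg_number}{ext}"


def certificate_name(registered_filename):
    """Name the certificate after the registered file by appending '.crt'."""
    return registered_filename + ".crt"


copy_name = registered_name("Declaration.txt", "AAA-10072-000616")
cert_name = certificate_name(copy_name)
```

Placing the registration number before the extension, rather than after it, is what lets the operating system continue to open the registered copy with the original application.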
Note that the agent version 211, the encoding scheme 212, and registration number 215 were saved in certificate 209, while checksum identifier 155 was also saved in the document name. When verification is desired at a later date, the encoding scheme and the agent version will allow program 145 to be backward compatible, as encoding schemes and agent programs may mature and evolve over time. Similarly, use of the registration number in the document file name will allow for retrieval of the encoding scheme and agent version archived within registry memory 147 or archive 178 of host 130, without need of the registration certificate.
Next, the second file to be registered was processed. When the file 'Declaration.doc' was analyzed, the checksum differed substantially, and was: "Vmwuy412vRZD4UAA". This demonstrates that when a text file is saved as a Word 2000 document, the changes performed by Windows alter the file sufficiently for checksum 155 to change at every character.
Example 2: Verification of an Intact File
The text file registered in Example 1 was then validated. Again, the agent program was downloaded. In this case, the verify option was selected by clicking on dialog button 205 (Figure 2A) to begin validation of the document, and then selecting dialog button 203 (Figure 2B) to enable a download of agent program 145. The document file "Declaration.Reg_AAA-10072-000616.txt" was selected for validation. In this case, as the registration number was already contained in the file name, no certificate was required to begin the validation process.
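Recovering the registration number from the file name can be sketched with a regular expression. The pattern below assumes the three-letter, five-digit, six-digit layout seen in this example; that layout, the constant REG_PATTERN, and the function name are assumptions for illustration, since a registration number could take other forms.

```python
import re

# Assumed layout: "<stem>.Reg_<AAA-00000-000000>.<extension>"
REG_PATTERN = re.compile(r"\.Reg_([A-Z]{3}-\d{5}-\d{6})\.")


def extract_registration_number(filename):
    """Return the registration number embedded in a file name, or None."""
    match = REG_PATTERN.search(filename)
    return match.group(1) if match else None


found = extract_registration_number("Declaration.Reg_AAA-10072-000616.txt")
missing = extract_registration_number("Declaration.txt")
```

The same pattern also matches a certificate name ending in ".txt.crt", so either file can be used to look up the registry entry.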
From the prior registration of the document selected, registry 147 of host computer 130 already contained an entry for registration number AAA-10072-000616. In this register, the following values had been saved: checksum length, maximum checksum value for each character, influence value, date and time of registration, and the original calculated checksum. The values of these variables were then read from the registry and were downloaded to client system 120 for use by program 145 in the calculation of checksum 155. Alternatively, the user may have elected to select both the document file as well as its corresponding certificate, allowing the certificate to be tested for tampering. In addition, with a certificate accessible to client system 120, information needed for the recalculation of checksum 155 could have been read directly from the certificate, instead of over the internet from registry 147.
When agent program 145 recalculated the checksum for "Declaration.txt", a checksum value of "bSVodXSE9XtGcjD5" was once again obtained, and transmitted to host 130. The host checked the registry, confirmed that the file had not been altered, and displayed the message "REGISTRATION VALID," as well as the certificate information, via program 145. If the checksum had not matched, host 130 would have indicated that the file and/or the certificate had been altered (see Example 4 for details).
It is relevant to note that any file with a registration number in its file name can be transmitted to any third party user, who may check its integrity at any time. Thus, this system provides a method for reliable verification for personal as well as distributed electronic documents, such as text documents, images, or programs. For security reasons, it may be decided that the checksum may be retained only by the host, and not included in any certificate. This could potentially make the job of altering a certificate more difficult. It is apparent that the registration number may be any combination of letters, numbers or symbols that would allow for the registration record within the host registry to be accessed at a later date and time.
Example 3: Effect of a Single Character Change or Insertion
As a demonstration of the effect of tampering, the text file registered in Example 1 was opened using a text editor, and the starting sentence was altered from "When in the Course of ..." to "Then in the Course ...". This change left the total number of characters in the file unchanged. The result was saved in the file 'DeclarationAlter.txt'. As a result of this single character alteration, the calculated checksum changed from "bSVodXSE9XtGcjD5" to "bs6TLIF5yXtGcjD5". As the value of influence was set at 8, a single character change in the document affects the base byte plus 8 additional bytes. In this example, only 8 bytes changed, as the first byte of the checksum, the letter "b", remained the same by random chance. Thus, even changes as minor as an alteration in capitalization, punctuation, or the change of one digit or other single character, will have a profound effect on the checksum.
As a second example, a character was inserted into the text file by adding an "s" to the word "course" in the first sentence, changing "When in the course of ..." to "When in the courses of ...". The result was saved in the file "DeclarationInsert.txt". In this case, the insertion of a single character into the file resulted in an entirely different checksum of "wH0S9Mv6aNSwX0v9", and not a single character in the checksum remained unchanged.
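The extent of each change can be confirmed by comparing the checksums position by position. The helper below is a small illustrative sketch; the function name changed_positions is hypothetical, while the three checksum strings are the ones reported in Examples 1 and 3.

```python
def changed_positions(before, after):
    """Return the 1-based positions at which two checksums differ."""
    if len(before) != len(after):
        raise ValueError("checksums must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(before, after)) if a != b]


original = "bSVodXSE9XtGcjD5"
substituted = changed_positions(original, "bs6TLIF5yXtGcjD5")
inserted = changed_positions(original, "wH0S9Mv6aNSwX0v9")
# substituted == [2, 3, 4, 5, 6, 7, 8, 9]; inserted covers all 16 positions
```

The substitution disturbs only the checksum positions within reach of the altered character, whereas the insertion shifts every later character of the document and therefore disturbs the whole checksum.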
Example 4: Verification of an Altered File
The first altered file from Example 3 was then validated using the registration number of the unaltered file, as described in the method shown in Example 2. In this case, the checksum of the altered document did not match the checksum of the unaltered document. The host indicated that the file did not match using the message:
"VERIFICATION FAILED ON FIRST ATTEMPT. RETRYING."
The host then reloaded the agent program to ensure that a transmission error was not responsible for the failure. After failing on a second attempt, instead of confirming that the file was valid, as occurred in Example 2, the host instead returned the following message:
"VERIFICATION FAILED. FILE HAS BEEN ALTERED."
Example 5: Payment for Registration and Validation
Methods for e-commerce, including the collection of fees, are known. With the advent of the digital signature, documents can be electronically signed and remain legally recognized. Use of the present method would allow for contracts or other legal documents to be both signed electronically as well as guaranteed to be without modification or alteration. For example, a tax filing could be saved and time stamped, to ensure that nothing in the document has been altered since the date on the certificate. This verifiable registration, coupled with the ability to provide a digital signature, would allow for documents to be signed with the knowledge that the document has not been altered since the time and date of the signing.
Example 6: Electronic Notarization and Contract Closure
A document can be electronically notarized using this method. For example, a notary may elect to notarize a paper document using a conventional ink stamp and record book, but then digitize the resulting document using an electronic scanner, and register the electronic version of the document using the method and system of the present invention. The registration number of the document could arguably be entered into the notary's written register as additional proof of document stability. The scanning of paper documents allows for the digital transmission of any contract or notarized document, rather than transmission by conventional certified or express mail delivery. Registration of notarized documents also resolves the issue that a document witnessed by a notary identifies only the signer(s), but does not attest as to the accuracy of the content of the document, which is easily tampered with. Digitization and registration can be used to provide a secure and traceable method for the transmission of notarized documents, or for any signed contract, in order to ensure that the documents remain in their original form.
In this example, a document was created in Word 2000 (Office 2000, Small Business Edition, Microsoft Corporation, Bothell, WA) and then printed (LaserJet 4, Hewlett-Packard, Palo Alto, CA). This printed copy was notarized using a conventional ink stamp and witnessed signature, and was then digitized five times without moving the paper document between images, using a flatbed document scanner (MFC 9100C, Brother International Corporation, Bridgewater, NJ). The resulting images were captured using imaging software (Visioneer PaperPort 6.1, ScanSoft, Los Gatos, CA), and registered using the method of the present invention, with Influence set equal to 1. Two of these scanned images are shown (Figures 4A and 4B). Image 355 in Figure 4A and image 357 in Figure 4B are indistinguishable by eye. In addition, each of the remaining three scanned images (not shown) is similarly indistinguishable. Also, notary seals 366 and 368 can be seen clearly in each of the images. Nevertheless, there is sufficient low-level random noise in the scanned image that the resulting checksum differs significantly for each image, with checksum values of "GWdUbbveYόqeNFDg" for image 355, "ubqSQbX7WuaLTOb0" for image 357, and "neCF0XWh9qUsSHpw", "VnYBaksOSsvell 3s", and "oIDk5HevWDF4EqXU" for the remaining three images, respectively. Because of these differences, rescanning of the document with the intent to alter the document, even with every attempt made to scan in exactly the same manner, is likely to fail.
A document to be notarized or a contract to be signed may be more conveniently provided in a scanned or other electronic digital form, such as if the form is to be distributed via email. A registered copy of the blank document would ensure that the contract had not been surreptitiously modified or corrupted during transmission. Exemplary types of electronic documents include doc files from Word or pdf files from Adobe Acrobat. If proof of identity can be provided electronically, rather than in person by a notary, then the entire process may be completed electronically, without the need of a human notary public. Providing electronic identifying documentation, such as digital signatures, would allow for the document to be electronically distributed, electronically signed, and then electronically registered to freeze the document in its signed form, without changes or corruption. A document distributed in electronic form has the advantage that the same form can be rapidly distributed to multiple individuals, each of whom may separately electronically sign the document, have the document electronically notarized, and then return it in a signed, notarized, and registered form. As each copy is registered at the time of notarization, a guarantee may be provided that each electronically signed copy has been stable since notarization, provided that the registration can be verified.
Last, it may be advisable to have a document, such as a contract, posted and available on a web site. This would allow, for example, multiple signatories to an agreement to meet via a web site "chat room", and then to provide their digital signatures and/or electronic proof of identity, resulting in a simultaneous alteration and update of all copies when changes are made, and also resulting in a simultaneous closure of a contract or agreement through the web, with all parties effectively signing at the same time. Such signatures may, in fact, be electronically collected in an asynchronous manner, with the web site waiting until all signatures have been provided, and then attaching all signatures simultaneously to ensure that signing occurs simultaneously, followed by freezing the document in a registered, verifiable form immediately after contract closure. The registration number, or an adjunct password or keyword, may also be used to control distribution of a document, such as access to a web site or access to a web page for contract closure over the web. In such cases, the registration number of the final document may be distributed separately, or provided on a web site with limited access, such that only users with access to both the registration number and the registered document will be permitted to read and/or validate the document.
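The asynchronous collection described above, in which signatures are held until every party has signed and are then attached together, can be sketched as follows. This is an illustrative outline only: the class SignatureCollector, its methods, and the plain-text signature block are hypothetical, and the signatures are placeholder strings rather than real digital signatures.

```python
class SignatureCollector:
    """Hold signatures until every required signer has provided one."""

    def __init__(self, required_signers):
        self._required = set(required_signers)
        self._collected = {}

    def submit(self, signer, signature):
        """Record one signature; signers outside the agreement are rejected."""
        if signer not in self._required:
            raise KeyError(f"unexpected signer: {signer}")
        self._collected[signer] = signature

    def is_complete(self):
        return set(self._collected) == self._required

    def attach_all(self, document):
        """Return the document with all signatures attached, or None if any is missing."""
        if not self.is_complete():
            return None
        block = "\n".join(
            f"{name}: {self._collected[name]}" for name in sorted(self._collected)
        )
        return document + b"\n" + block.encode("utf-8")


collector = SignatureCollector(["buyer", "seller"])
collector.submit("buyer", "sig-1")
pending = collector.attach_all(b"CONTRACT")   # None: seller has not signed
collector.submit("seller", "sig-2")
closed = collector.attach_all(b"CONTRACT")    # all signatures attached at once
```

The closed document returned by the final call is the version that would then be registered, so that the checksum covers the contract together with every signature.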

Claims

What is claimed is:
1. A method of registering a client document, present as a file in a client computer system, with a remote and secure host computer, without transmitting the document, comprising:
(a) posting a downloadable agent computer program on a networked host computer;
(b) downloading the agent program to the client computer system;
(c) determining a unique checksum identifier using the agent program, said checksum based on a mathematical function of the digital information contained in a client document file, and said function selected such that changes to the client file produce large and inversely-unpredictable changes in the identifier;
(d) uploading the unique checksum identifier to the host;
(e) storing the unique checksum identifier and a unique registration number determined by the host for each document to be registered, and the time and date of the registration process, in a permanent host register computer;
(f) downloading at least the registration number to the client; and,
(g) storing the registration number by the client in such a manner so as to allow verification, at a future date, of the stability and integrity of the target document since the date and time of the registration.
2. A method as in claim 1 wherein the downloading and uploading between the client and host is over a network.
3. A method as in claim 2 wherein the network is the Internet.
4. A method of validating a client document registered in accordance with claim 1 contained in a computer system, without transmitting the registered client document beyond said client computer system, comprising:
(a) requesting, via the computer system, a download of an agent program residing on a host computer;
(b) downloading the agent program from the host computer to the computer system;
(c) determining a current and unique checksum identifier using the agent program, based on a mathematical function of the digital information contained in the registered document file, said function selected such that the changes to the target file are likely to produce large and inversely-unpredictable changes in the identifier;
(d) uploading at least said current identifier checksum and a registration number to the host;
(e) comparing at the host the current identifier to a permanent registry of identifiers registered by the host, and generating a validation certificate containing confirmation or denial of prior registration, and confirmation or denial that the document remains unchanged since said registration; and
(f) downloading a validation certificate to a client computer system.
5. A method as in claim 4 including the step of generating an output message related to the validity and integrity of the document with respect to said previous registration of the document.
6. A method as in claim 4 wherein the downloading and uploading are over a network.
7. A method as in claim 4 wherein the downloading and uploading are over the Internet.
8. A method of registering a target document present as a file in a client computer with a remote and secure host computer, without transmission of the document from said client computer, comprising:
(a) providing an agent computer program to the client computer;
(b) determining a current and unique checksum identifier using the agent program, based on a mathematical function of the digital information contained in the target document file;
(c) uploading at least said current identifier to the host for archiving and storage; and
(d) downloading at least a registration number to the client for use in validation of the integrity of the document at a later date and time.
9. A method of claim 8 in which the agent computer program is on a networked host computer; and in which the host generates a registration number, time and date for permanent storage in a host register in such a manner as to allow verification at a future date of the integrity of the current target document since the date and time of the initial registration, said host further generating a registration certificate confirming the registration.
10. The method of claim 7 in which the client computer alters the name of the document file to include the downloaded registration number.
11. A method of validating the integrity of a document registered in accordance with claim 8 without transmitting the registered document beyond the client computer comprising:
(a) providing an agent program to a client computer;
(b) determining a current and unique checksum identifier using the agent program and based on a mathematical function of the digital information contained in a previously registered document file;
(c) comparing over the Internet the current checksum identifier with the identifier generated during an initial registration of the document; and
(d) generating a message related to the integrity of the current document with respect to the previously registered document.
12. The method of claims 1, 2 or 3, further including the steps of:
(a) obtaining from the client a digital signature identification, and
(b) linking the confirmed identity of the registering client in a permanent register by the host.
13. The method of claims 1 or 4, further including the steps of:
(a) obtaining billing information from the registering or validating client; and
(b) assessing a transaction fee for the registration or validation process.
14. The method of claims 1 or 4, further including the steps of:
(a) obtaining and storing billing information from the registering client;
(b) retrieving billing information collected from the registering client which was previously archived in the host registry; and
(c) using said billing information to assess transaction fees to the original registering client for validation of a registered document via any client.
15. The method of claim 14 further including the steps of:
(a) retrieving the billing amount and billing limit previously archived in the host registry; and
(b) incrementing said billing amount with each registration, and comparing said billing amount to said billing limit, and determining if an additional billing can be made based upon said comparison.
16. A host system for registering a document from a remote client, without transmission of the document, comprising:
(a) a remote and secure host computer,
(b) a computer network for downloading an agent program to a client computer, said program determining an identifier checksum based upon a mathematical function of the digital contents of the document accessible to said client, and said checksum sensitive to minute changes in the document, and for uploading the unique checksum identifier to the host,
(c) a registry of registered documents for allowing the integrity of a document, through recalculation of the original checksum, to be compared with a registered checksum, and,
(d) output means for confirming to the client registration of a target document, including provision of a registration number.
17. The method of claim 1 or 4, further including a method of notarizing a document present as a file in the client system, including the steps of:
(a) providing access by a receiving client to a source client document,
(b) electronically attaching said source client document, or physically attaching to said source document and then physically scanning, a receiving client signature and evidence of positive identification of said receiving client, to produce a notarized electronic document,
(c) registering the executed document with the host, and transmitting a certificate to the receiving client affirming the success or failure of document signature and notarization.
18. The method of claim 1 or 4, further including a method of notarizing a document, present as a file in a client computer system, further including the steps of:
(a) providing access to an electronic document from a source client to a receiving client,
(b) electronically or physically providing a signature from said receiving client to said document, and electronically or physically providing a notarization stamp and signature from a notary to said document,
(c) registering an electronic copy of the signed and notarized document, and
(d) returning a registered copy of the signed and notarized document to the source client.
19. A system for registering a document comprising: a first computer-based client system having a document stored therein and for generating a checksum identifier using an agent program, a second computer system for storing checksum identifiers and agent programs, means for exchanging information between the first and second computer systems whereby the agent program can be downloaded to the first computer system for generating the checksum and the first computer system uploads the checksum to the second computer system for storage.
20. A system for validating a document registered in accordance with claim 19 in which said first computer system includes means for generating a new checksum identifier using the same agent program and transmitting said new checksum identifier to the second system and said second system including means for comparing said new checksum identifier with the registered checksum identifier and indicating whether or not they are identical.
PCT/US2001/040965 2000-08-30 2001-06-13 System and method for client document certification and validation by remote host WO2002019075A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001267087A AU2001267087A1 (en) 2000-08-30 2001-06-13 System and method for client document certification and validation by remote host

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US65117400A 2000-08-30 2000-08-30
US09/651,174 2000-08-30

Publications (2)

Publication Number Publication Date
WO2002019075A2 (en) 2002-03-07
WO2002019075A3 (en) 2003-04-17

Family

ID=24611862

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/040965 WO2002019075A2 (en) 2000-08-30 2001-06-13 System and method for client document certification and validation by remote host

Country Status (2)

Country Link
AU (1) AU2001267087A1 (en)
WO (1) WO2002019075A2 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2405227A (en) * 2003-08-16 2005-02-23 Ibm Authenticating publication date of a document
US6986041B2 (en) 2003-03-06 2006-01-10 International Business Machines Corporation System and method for remote code integrity in distributed systems
EP1672503A1 (en) * 2004-12-20 2006-06-21 Microsoft Corporation Method and computer-readable medium for loading the contents of a data file
EP1672502A1 (en) * 2004-12-20 2006-06-21 Microsoft Corporation Method and computer-readable medium for verifying and saving an electronic document
WO2008065341A2 (en) 2006-12-01 2008-06-05 David Irvine Distributed network system
EP2626819A1 (en) * 2011-12-14 2013-08-14 Dominator IP Co., Ltd. Method and system for documentation of digital archives

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5136646A (en) * 1991-03-08 1992-08-04 Bell Communications Research, Inc. Digital document time-stamping with catenate certificate
EP0667577A1 (en) * 1994-02-11 1995-08-16 Seerp Westra Procedure for data file authentication
US5933498A (en) * 1996-01-11 1999-08-03 Mrj, Inc. System for controlling access and distribution of digital property
WO2000039659A1 (en) * 1998-12-28 2000-07-06 Koninklijke Philips Electronics N.V. Transmitting reviews with digital signatures

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SCHNEIER B: "ONE-WAY HASH FUNCTIONS" DR. DOBB'S JOURNAL, M&T PUBL., REDWOOD CITY, CA,, US, vol. 16, no. 9, 1 September 1991 (1991-09-01), pages 148-151, XP002044823 ISSN: 1044-789X *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6986041B2 (en) 2003-03-06 2006-01-10 International Business Machines Corporation System and method for remote code integrity in distributed systems
GB2405227A (en) * 2003-08-16 2005-02-23 Ibm Authenticating publication date of a document
US7464104B2 (en) 2004-12-20 2008-12-09 Microsoft Corporation Method and computer-readable medium for loading the contents of a data file
EP1672502A1 (en) * 2004-12-20 2006-06-21 Microsoft Corporation Method and computer-readable medium for verifying and saving an electronic document
US7337358B2 (en) 2004-12-20 2008-02-26 Microsoft Corporation Method and computer-readable medium for verifying and saving an electronic document
EP1672503A1 (en) * 2004-12-20 2006-06-21 Microsoft Corporation Method and computer-readable medium for loading the contents of a data file
AU2005237165B2 (en) * 2004-12-20 2010-11-18 Microsoft Technology Licensing, Llc Method and computer-readable medium for verifying and saving an electronic document
US7890801B2 (en) 2004-12-20 2011-02-15 Microsoft Corporation Method and computer-readable medium for verifying and saving an electronic document
WO2008065341A2 (en) 2006-12-01 2008-06-05 David Irvine Distributed network system
EP2472430A1 (en) 2006-12-01 2012-07-04 David Irvine Self encryption
EP2626819A1 (en) * 2011-12-14 2013-08-14 Dominator IP Co., Ltd. Method and system for documentation of digital archives
EP2626819A4 (en) * 2011-12-14 2014-04-09 Dominator Ip Co Ltd Method and system for documentation of digital archives
JP2015506028A (en) * 2011-12-14 2015-02-26 ドミネーター アイピー カンパニー リミテッド Digital document evidence storage method and system

Also Published As

Publication number Publication date
AU2001267087A1 (en) 2002-03-13
WO2002019075A3 (en) 2003-04-17

Similar Documents

Publication Publication Date Title
US20230139312A1 (en) Website Integrity and Date of Existence Verification
JP5030654B2 (en) Secure and efficient method of logging and data exchange synchronization
CN110785760B (en) Method and system for registering digital documents
US10754848B2 (en) Method for registration of data in a blockchain database and a method for verifying data
US7644280B2 (en) Method and system for linking certificates to signed files
US7039805B1 (en) Electronic signature method
JP2020517200A (en) Block chain-based document management method using UTXO-based protocol and document management server using this method
US9009477B2 (en) Archiving electronic content having digital signatures
US6327656B2 (en) Apparatus and method for electronic document certification and verification
US8261066B2 (en) System and method for secure storage, transfer and retrieval of content addressable information
EP0940945A2 (en) A method and apparatus for certification and safe storage of electronic documents
US20050228999A1 (en) Audit records for digitally signed documents
US20060095795A1 (en) Document management apparatus and document management method, and storage medium storing program
US6735694B1 (en) Method and system for certifying authenticity of a web page copy
CN109257340A (en) A kind of website falsification-proof system and method based on block chain
US20020048372A1 (en) Universal signature object for digital data
US20030182552A1 (en) Method of managing digital signature, apparatus for processing digital signature, and a computer readable medium for recording program of managing digital signature
WO2005064846A1 (en) System and method for generating a digital certificate
JP4093723B2 (en) Electronic signature method and apparatus for structured document
US11924342B2 (en) Computer-implemented methods for evidencing the existence of a digital document, anonymously evidencing the existence of a digital document, and verifying the data integrity of a digital document
US7689900B1 (en) Apparatus, system, and method for electronically signing electronic transcripts
US20050183142A1 (en) Identification of Trusted Relationships in Electronic Documents
WO2002019075A2 (en) System and method for client document certification and validation by remote host
WO2023086444A2 (en) Computer-implemented methods for evidencing the existence of a digital document, anonymously evidencing the existence of a digital document, and verifying the data integrity of a digital document
US11550931B1 (en) Data certification system and process for centralized user file encapsulation, encryption, notarization, and verification using a blockchain

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP