CA1208797A - Document and data handling and retrieval system - Google Patents
Document and data handling and retrieval system
- Publication number
- CA1208797A CA000455458A
- Authority
- CA
- Canada
- Prior art keywords
- signals
- digitalized
- code
- document
- patterns
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/21—Intermediate information storage
- H04N1/2166—Intermediate information storage for mass storage, e.g. in document filing systems
- H04N1/217—Interfaces allowing access to a single user
- H04N1/2175—Interfaces allowing access to a single user with local image input
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/41—Bandwidth or redundancy reduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2201/00—Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
- H04N2201/0077—Types of the still picture apparatus
- H04N2201/0081—Image reader
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2201/00—Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
- H04N2201/0077—Types of the still picture apparatus
- H04N2201/0082—Image hardcopy reproducer
Abstract
A system for accepting documents and handling the data contained therein includes a reader and digitizer for producing and storing fragmented or digitized images simulating the characters and graphics on the document. Each document is marked with a unique identifying code. Selected portions of the data are converted to a machine code and stored and portions of the digitized material are also stored. Converted and unconverted segments can be recalled for complementing and verifying the machine code. Access is available to either in storage.
Description
Background Of The Invention
In many organizations throughout the world, both governmental and private, a major problem is one of handling documents for a variety of purposes. The documents are of current as well as historic interest and the documents may contain information printed or typewritten by machine, printed or written by hand or pictures, drawings and other forms of representation commonly referred to today as "graphics". It is very often necessary to access selected information for various purposes within a short time, and the information must be accessed from a large volume of such information in the form of the documents. Not all of the information contained in the documents may be of importance. In addition, that which is of interest may be of greater or lesser degrees of importance, depending upon the documents and upon the nature of the organization.
Much information, particularly that which is of historical interest only, is being filed in the form of microfilm, microfiche or similar forms which are produced by what can be generically described as photographic techniques. In other cases, the information contained in the documents is converted to an encoded form which can be accomplished by such machines as optical character readers (OCR), despite the considerable expense of such machines, depending on the nature of the printing or typing in the original document; but other information must be entered into a system by a manual keypunch operation, a technique which has gained wide acceptance because of the unsatisfactory nature of alternative techniques, but which nevertheless has serious drawbacks because of the inherent problem of errors occurring simply because of the human process of retyping the information. A discussion of various data preparation devices and techniques is to be found in the Encyclopedia of Computer Science and Engineering, Second Edition, Van Nostrand Reinhold Company, New York (1983) beginning at page 4[?]0. This text includes a review of the historical development of data preparation and also discusses the expense and difficulty of recycling information within a system to reduce the error percentage.
In most circumstances, it is not necessarily desirable to eliminate human intervention, nor can this be done as a practical matter. For example, if documents coming into an organization are to be handled and entered into a system, it is necessary for some human operator to review each document, determine its relevance and make some decisions. It is, however, desirable to remove the human process of retyping or keypunching the data because of the above-discussed error entry problems. On the other hand, machine data entry preparation, such as OCR, in addition to the expense has the disadvantage that very often the total content of each document must be entered, an approach which is wasteful of mass storage, compounds the difficulty of locating and utilizing relevant information at a later time, and usually would necessitate reworking the data, depending upon its form and ultimate use.
Brief Summary Of The Invention
Accordingly, it is an object of the present invention to provide an organized system for selectively entering the contents of documents into a storage facility from which the material can be selectively extracted for various purposes.
A further object is to provide a unique technique for cross referencing the stored pattern and the original document, the combination of storage and correlation being usable, in many circumstances, to completely eliminate further handling of the original document itself between organizational units.
A further object is to provide such a system in which the total amount of storage required is reduced because of the selectivity of the storage technique, and wherein the saved memory capacity is usable to correlate the filed data with the source documents, permitting carrying of this link forward to provide a bibliography of source information in, for example, report preparation.
A still further object is to provide efficient techniques for verification and complementing of the stored data in a codification and storage procedure.
Yet another object is to provide such a system which is capable of automatic selection of desired data from a plurality of similar documents, i.e., like format but containing variations in data content, by programmable machine, thus allowing the human involvement to be at a higher level and to be greatly reduced.
Briefly described, the invention includes an apparatus for gathering data from a plurality of source documents comprising means for sequentially receiving documents, optically scanning each document and forming series of digitalized electrical signals representative of digital patterns substantially identical to patterns on each document from which an image of each document can be reproduced; means for providing in a predetermined area of each document a set of characters, by imprinting or signing by program, uniquely identifying each document and for producing electrical signals representative of said characters; and buffer means for storing the series of digitalized signals along with said signals representative of said characters.
In another aspect, the invention includes an apparatus for selectively storing information derived from source documents comprising means for receiving source documents, optically scanning each document and forming series of digitalized electrical signals representative of digitalized patterns of material on each document from which an image of each document can be reproduced, buffer means for storing the series of digitalized signals, means for recalling from said buffer means groups of said digitalized signals and for producing on a viewable screen an image of digitalized patterns of the document from which said signals were formed; manually operable control means for selecting a plurality of locations in said document to identify selected segments of the patterns therein and for adding to said selected segments address information to control subsequent disposition of said segments; and a mass data file for receiving said segments in digitalized form and said address information. The added address information can also be used to perform such functions as complementing the link to the source information for use in a specific report or an excerpt from a report which permits this link to be carried forward, providing, e.g., an automated audit trail.
A still further aspect of the invention includes a method of inputting and preparing data from source documents comprising the steps of scanning each source document and forming signals representative of digitalized patterns derived from images of characters and graphics thereon, temporarily storing the signals representative of the digitalized patterns, selecting segments of the stored signals for further processing, converting signals representative of digitalized patterns of characters in only the selected segments into a machine code, displaying the digitalized patterns from the storage of signals for each character not successfully converted into machine code or ambiguous character along with a display of converted characters both before and after each unconverted character, manually entering a code for the digitalized pattern, and storing the machine code and digitalized pattern signal for subsequent use.
In order that the manner in which the foregoing and other objects are attained in accordance with the invention can be understood in detail, particularly advantageous embodiments thereof will be described with reference to the accompanying drawings, which form a part of this specification, and wherein:
Fig. 1 is a schematic block diagram of a system in accordance with the invention;
Fig. 2 is a plan view of a typical source document illustrating possible placement areas for printed identification;
Fig. 3 is a schematic block diagram of a second embodiment of a system in accordance with the invention;
Fig. 4 is a flow diagram illustrating the sequences of steps in document and data handling in accordance with the method of the invention; and
Fig. 5 is a plan view of a typical text page illustrating a further technique involving document preparation.
Detailed Description Of The Preferred Embodiments
For purposes of the following disclosure it will be desirable to use terms in a specific way to avoid confusion. In the processes to be described the term "digitalization" and various forms thereof will be used to refer to a fragmenting of a symbol or pattern into an arrangement of light and dark elements, i.e., elements having contrast variations, and to signals representing those contrast variations or elements. An example of this would be an arrangement of dots which follow the lines of a letter of the alphabet such as might be produced by a dot matrix printer, and one or more series of electrical signals representing, respectively, the light and dark areas of the dots and their background.
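The notion of "digitalization" defined above can be illustrated with a minimal sketch. It assumes a scanned page arrives as rows of grayscale samples (0 for black, 255 for white) and that a fixed threshold separates dark from light elements; the function names, the threshold value and the sample data are all hypothetical and are not taken from the specification.

```python
from typing import List

Pattern = List[List[int]]  # 1 = dark element, 0 = light element


def digitalize(gray: List[List[int]], threshold: int = 128) -> Pattern:
    """Fragment a grayscale scan into light and dark elements.

    Assumption: samples below the threshold count as dark.
    """
    return [[1 if sample < threshold else 0 for sample in row] for row in gray]


def render(pattern: Pattern) -> str:
    """Draw the pattern as text, '#' for dark and '.' for light elements."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in pattern)


# Hypothetical 3x3 scan of a small "T"-like shape.
scan = [
    [10, 20, 15],
    [240, 30, 250],
    [245, 25, 235],
]
t_pattern = digitalize(scan)
```

Note that nothing in this step recognizes the symbol; the pattern is stored as an arrangement of elements only, which is the distinction the text draws between digitalization and encoding.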
Further terms to be used involve "code," i.e., "machine code," "coding" or "encoding" which refers to forming a code representation of a symbol using, for example, ASCII code, such that the code can readily be stored in a machine, in a magnetic medium or in some other form of memory and can be manipulated using conventional DP techniques.
In this connection, it will be observed that the digitalized pattern can also be stored, but the storage does not depend on recognition of the symbol nor need the symbol be actually recognizable. Encoding and storing the code, however, depends on recognition of the symbol as one of a predefined set and selection of an assortment of code elements which has been assigned to represent that symbol (or group of symbols). As used herein, the specific selection of the type of code will largely be ignored because that selection will depend primarily on the context in which the invention is used.
The term "data" will be used in a rather general sense to include information in human readable as well as machine readable or machine stored form.
"Character" is used to mean alpha-numeric symbols as well as other symbols such as mathematical operators, generally including any symbol having a recognizable and definable meaning to some group. That term differs from "graphics" which is used to mean drawings, graphs, etc.
Turning now to the drawings, it will be seen that the diagram of Fig. 1 shows a digitizer system for receiving source documents indicated generally at 10 which are fed into a device which will be referred to as a document reader 11 which performs the function of optically scanning each document page delivered to it, the reader having as an integral part thereof a digitizer 12 which performs the function of producing digitalized signals representative of digital patterns approximating the characters and graphics appearing on the source documents delivered thereto. This portion of the apparatus is known and is currently available on the market, a useful device being the Memorex PS100 OEM High-Speed Optical Page Scanner available from Memorex Corporation, Santa Clara, CA. The digitalized signals representative of the digital patterns of the characters and graphics appearing on the pages are supplied on a channel 14 to a store 16 which can be any convenient form of memory capable of accepting the signals and retaining them in an extractable and changeable form. Various forms of high density memory are usable for this purpose including hard magnetic disk, video disk, floppy disk and bubble.
Of particular significance is the inclusion in the apparatus shown in Fig. 1 of a printing device 18 which is illustrated as being at the input end of the apparatus and which, preferably, is in a very early portion of the system although it need not be the first element in a mechanical sense. The automatic printing may also take place prior to inputting the documents for digitalization, in a separate step. A printer which can be used for this purpose is the Centronics Model 154 sold by Centronics Data Computer Corp., One Wall Street, Hudson, NH. The purpose of printer 18 is to provide on each document which passes through the system a printed legend which uniquely identifies that document and which has useful information contained therein by which the document can be stored in an organized fashion and recovered quickly, if necessary. The printer can readily be arranged to print on any normally unused portion of the document so as to not interfere with the text appearing thereon. For example, as shown in Fig. 2, a typical document 10 has a region 20 in which, for a certain organization, text normally appears, the marginal areas usually being blank. In an organization which commonly attaches documents by hole-punching at the top, the legend can be printed in a zone indicated generally at 22 near the bottom of the page or, if the document is normally stored by punching along the left-hand margin of the document, the legend can conveniently be printed either at the top or in zone 22 or in a zone 23. As will be recognized, the direction of feed of the document will determine whether a serial or parallel printer need be used for the zones indicated.
In either case, each zone includes a portion A and a portion B, one of which is printed with a sequence of human readable symbols indicating filing information such as, for example, a serial number which is different for each document and which can include date information and, if more than one system of the type shown in Fig. 1 is in use, an indication of which system handled the document or additional information needed for an operation. The other of portions A and B includes substantially the same information in a machine readable form which can employ symbols commonly readable by substantially all OCR systems such as the basic mathematical operators, periods and the like. With this imprinted legend, the document can readily be located manually or it can be located by a machine sorting system. If the system is used with documents which are frequently covered on one side with text, the printer can be arranged to print the legend on the back as indicated at 24. In that case, the zone again has portions A and B as described.
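A two-part legend of this kind might be generated along the following lines. This is only a sketch under stated assumptions: the field layout (system number, date, serial number), the field widths, and the machine readable encoding (each decimal digit written as four symbols drawn from "+" and ".") are all invented here for illustration; the specification says only that the machine readable portion can use symbols such as mathematical operators and periods.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Legend:
    system_id: int   # which system handled the document (hypothetical field)
    day: date        # date information
    serial: int      # serial number, different for each document

    def portion_a(self) -> str:
        """Human readable portion, e.g. '03-19840530-000042'."""
        return f"{self.system_id:02d}-{self.day:%Y%m%d}-{self.serial:06d}"

    def portion_b(self) -> str:
        """Machine readable portion: each digit as 4 symbols, '+' = 1, '.' = 0."""
        digits = self.portion_a().replace("-", "")
        groups = (format(int(d), "04b").replace("0", ".").replace("1", "+") for d in digits)
        return " ".join(groups)


def decode_portion_b(text: str) -> str:
    """Recover the digit string from the hypothetical '+'/'.' encoding."""
    return "".join(str(int(g.replace(".", "0").replace("+", "1"), 2)) for g in text.split())


legend = Legend(system_id=3, day=date(1984, 5, 30), serial=42)
```

Because both portions carry the same information, either one is enough to locate the document, by eye from portion A or by a sorting machine from portion B.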
Imprinting of the legend, control of the feed and further functions are accomplished by a control unit 25 which is coupled to the printer, the reader and the digitizer and which not only controls the normal functions of the apparatus, such as providing the manufacturer-specified input signals, but also controls the printer to inscribe the proper sequence of symbols on each page and, in addition, incorporates with the digitized signal an address relating to the imprinted legend so that the signal on channel 14 to the storage unit 16 includes not only the digitalized signal representative of the digital patterns of the material appearing on the page in the region 20 but also includes address information such as that given in zone 22, 23 or 24. It is also important to recognize that the system provides for the handling of each source document only once unless unusual circumstances arise. Thus, after the document has been printed, read and the information therefrom digitized and supplied to store 16, the document is delivered to a storage facility 27 from which it would normally not be removed, although it can continue to remain available if desired, depending upon the nature of the organization employing the system and the nature of the documents themselves. As a minor exception to the "single handling" principle, it may be desirable to run the documents through a separate stack feeder and printer before the reader if differences in feed speed become significant because of the choice of certain printers or readers.
Instead of printing of the legend, documents, e.g. transmitted by electronic mail, may be signed by an internal program of the apparatus.
It is desirable to have bidirectional communication between the digitizer and the store so that the store can supply on a channel 15 information to the digitizer about available space remaining and can also be used for handshake and parity error checking purposes.
With the apparatus thus far described, it is possible to provide a central processing facility in which incoming documents are handled only once for the purpose of passing them through the reader and digitizer apparatus, after which they are placed in storage or, conceivably, destroyed after a preselected lapse of time. All of the data from each document is available in a store so that it can be subsequently accessed, using the address information, so that the digitized signals can be employed to reproduce, on a viewable screen, a reproduction of the digital patterns approximating the data on each original source document. The stored data is conveniently a mixture of machine code and digitalized pattern signals.
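The role of the store at this stage, holding each page's digitalized pattern under the address taken from its legend and reporting how much room is left, can be sketched as follows. The class name, the page-count capacity model and the error handling are assumptions made for illustration; the specification does not describe the store's interface.

```python
from typing import Dict, List

Pattern = List[List[int]]  # digitalized page: 1 = dark, 0 = light


class BufferStore:
    """Hypothetical in-memory stand-in for store 16."""

    def __init__(self, capacity_pages: int) -> None:
        self._capacity = capacity_pages
        self._pages: Dict[str, Pattern] = {}

    def space_remaining(self) -> int:
        """What the store could report back to the digitizer."""
        return self._capacity - len(self._pages)

    def put(self, address: str, pattern: Pattern) -> None:
        """File a page under its legend address; addresses must be unique."""
        if address in self._pages:
            raise ValueError(f"address already in use: {address}")
        if self.space_remaining() == 0:
            raise RuntimeError("store is full")
        self._pages[address] = pattern

    def recall(self, address: str) -> Pattern:
        """Return the stored pattern so it can be reproduced on a screen."""
        return self._pages[address]


store = BufferStore(capacity_pages=2)
store.put("03-19840530-000042", [[1, 0], [0, 1]])
```

Keying the store on the legend address is what lets a later step find the page without touching the paper original again.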
It is also the philosophy of the invention to strategically position equipment of the described approach at one of several places within an operation, or in one place with several pieces of equipment. This is for the purpose of organizing the operation in such a way that all in- and outgoing documents are collected in whatever way the operation requires. This also permits collection and immediate transmission of data required to be viewed at one or various remote locations. The organization department of an operation, for instance, could enter a new procedure with drawings via the above approach, with the new procedure being retrievable based on known passwords with no paper, i.e., a procedure which needs to be distributed to the various organizational elements. Furthermore, there is no need for the paper (i.e., procedure) to be filed for later access. The same approach could be taken for the purpose of change control in an engineering operation where either the latest change of a drawing or its history could be retrieved and viewed. This approach requires a new organizational element with experienced and knowledgeable personnel.
Fig. 1 also shows apparatus for recalling and further processing the data stored in store 16. This apparatus, in a relatively simple form, includes a work station 30 having a viewable screen symbolically indicated at 31 on which groups of digital patterns can be displayed from store 16. Work station 30 can comprise, for example, a relatively simple personal computer with a similarly simple recall program to extract data from store 16 and display it on screen 31, the work station being capable of bidirectional communication with store 16 on channels 32 and 33. The existence of work station 30 at this point in the system is, however, extremely important as will be realized when recognizing that the data stored in store 16 is still in a digital pattern form and has not yet been sorted or selected nor is it yet in final storage. Work station 30 is provided with a manual control symbolically indicated at 35 which can comprise a simple form of cursor control, a simplified keyboard, a "mouse" or a lightpen, any of which are capable of positioning two or more cursors at selected locations in textual or graphic material displayed on screen 31. Thus, if it is assumed that lines 34 on screen 31 represent lines of characters which have been extracted from store 16, control 35 can be used to place cursors at, for example, the positions on the lines identified by the X symbols in Fig. 1. These cursors are used to identify those segments lying between the X's as being segments which are to be preserved for further processing or use. The X's can be used to select all of the data on the screen, none of the data on the screen, or any amount in between, the sequence of entering the material into storage being according to the sequence of activating the cursors. The selected data segments are then transferred on line 36 to mass storage facilities symbolized by the tape storage unit 37 or disc storage 38.
A computer which is quite suitable for this purpose is the Apple "LISA" computer system made by Apple Computer Inc., 10260 Bandley Dr., Cupertino, CA and described in the February 1983 issue of BYTE Magazine, p. 33 et seq.
The cursor control and entry commands can also be used to designate which segments are to be stored in digitalized form for subsequent conversion to machine code and which segments (especially graphics) are simply to be retained in digitalized form as a more suitable format than in machine code.
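Segment selection by cursor pairs might be modelled as in the sketch below. For readability the displayed page is represented as lines of ordinary characters, although in the system described the screen shows digital patterns that have not yet been recognized; the `Segment` type, the inclusive cursor convention and the `convert` flag are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Cursor = Tuple[int, int]  # (line index, column index) on the displayed page


@dataclass(frozen=True)
class Segment:
    start: Cursor   # first cursor of the pair
    end: Cursor     # second cursor of the pair, inclusive
    convert: bool   # True: convert to machine code later; False: keep as pattern only


def extract(lines: List[str], seg: Segment) -> str:
    """Return the material lying between the two cursors of a segment."""
    (l0, c0), (l1, c1) = seg.start, seg.end
    if l0 == l1:
        return lines[l0][c0:c1 + 1]
    middle = lines[l0 + 1:l1]
    return "\n".join([lines[l0][c0:], *middle, lines[l1][:c1 + 1]])


page = ["INVOICE 1234", "TOTAL DUE 56.00", "THANK YOU"]

# Segments are kept in the order in which the cursors were activated,
# which need not be the order in which they appear on the page.
selections = [
    Segment(start=(1, 0), end=(2, 4), convert=False),
    Segment(start=(0, 8), end=(0, 11), convert=True),
]
stored = [extract(page, s) for s in selections]
```

Keeping the list in activation order mirrors the statement that material enters storage in the sequence in which the cursors were activated, independent of page layout.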
In addition to selecting segments of the data, the manual control 35 can also be employed to attach an address or keyword to the data segment indicating the nature of the subject matter or the organizational unit to which the subject matter should be directed, or both.
In some circumstances, control 35 would necessarily be in the nature of a keyboard to provide a larger amount of control, but the actual size is not particularly significant. It is greatly preferred that only meaningful portions of the data in digital pattern form are preserved for mass storage and subsequent use. Additionally, the selected segments are identified in such a way that they can be acquired by symbols attached to them at work station 30 which will be referred to hereinafter as "passwords." It is obviously necessary for a trained individual to operate station 30, someone who is capable of viewing and comprehending at least the significance of the information being displayed and of attaching the desired code, using a lexicon for this purpose which is supplied by the organization. In a corporate environment, for example, the lexicon might include such organizational units as "accounting," "sales," "research and development," and the like, and can also include subunits such as will serve to identify specific research and development projects or topics within the R&D department.
The exact nature of the passwords is, of course, not significant to the invention itself and will vary from one organization to another.
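Attaching passwords from an organizational lexicon and later acquiring segments by password could look roughly like this. The `MassFile` class, its method names and the segment identifiers are hypothetical; only the example lexicon entries come from the text.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Set


class MassFile:
    """Hypothetical index of stored segments by password."""

    def __init__(self, lexicon: Iterable[str]) -> None:
        self._lexicon: Set[str] = set(lexicon)
        self._by_password: DefaultDict[str, List[str]] = defaultdict(list)

    def file(self, segment_id: str, passwords: Iterable[str]) -> None:
        """Attach passwords to a segment; only lexicon entries are accepted."""
        wanted = list(passwords)
        unknown = [p for p in wanted if p not in self._lexicon]
        if unknown:
            raise ValueError(f"not in lexicon: {unknown}")
        for p in wanted:
            self._by_password[p].append(segment_id)

    def retrieve(self, password: str) -> List[str]:
        """Return the identifiers of all segments filed under a password."""
        return list(self._by_password.get(password, []))


mass_file = MassFile(["accounting", "sales", "research and development"])
mass_file.file("000042/seg1", ["accounting"])
mass_file.file("000042/seg2", ["accounting", "sales"])
```

Restricting passwords to a supplied lexicon is one way to keep operators at different work stations consistent; whether the real system enforced this is not stated.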
Fig. 3 shows a further embodiment of an apparatus in accordance with the invention in which the further step of conversion to a code is accomplished. Those portions of the system which have already been described will not be described again, their functions being substantially the same. In the embodiment of Fig. 3, work station 30 is shown as having, in addition to screen 31, a more complete keyboard 40, although the apparatus can still include a lightpen, mouse, joy stick or the like for cursor control.
In addition, the apparatus includes a converter/compiler 42 which is capable of converting digitalized signals representing digitalized patterns into a machine code such as ASCII, BCD or some other code. As will be recognized, the volume of the material is reduced at work station 30 by selecting data segments to be stored. However, in addition to storing the digitalized signals in a data file 46, the signals are supplied on a channel 48 to converter 42 to be converted into a code which can be processed and handled by a conventional computing device. The encoded data is then supplied on a channel 50 to a separate portion of data file 46.
As is well recognized, pattern converters of good quality are capable of recognizing a large percentage of the patterns presented to them, which patterns will then be successfully converted into the machine code and stored. It can be expected, however, that certain patterns will not be recognized or will be recognized as being ambiguous symbols, such as "5" and "S", "H" and "4", and the like. A character which requires one or several conversion attempts by the program before conversion succeeds is also treated as an ambiguous symbol. The digital patterns representing these ambiguous and unrecognized characters are stored in data file 46 and the machine code for all recognized symbols is stored in file 46, but those which are not recognized or which are thought to be ambiguous are replaced in storage, preferably in a separate portion thereof, with a code signifying a special identifying symbol such as a rectangle substituting for the character which is presenting the problem, plus a return address.
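The substitution of a rectangle plus a return address for each problem character can be sketched as below. The recognizer is passed in as a function because the specification does not describe how converter 42 works; here it is assumed to return `None` for any pattern it cannot recognize or finds ambiguous. Patterns are represented as opaque strings purely for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

PLACEHOLDER = "\u25AF"  # a rectangle standing in for a problem character


def convert(
    patterns: List[str],
    recognize: Callable[[str], Optional[str]],
) -> Tuple[str, Dict[int, str]]:
    """Convert patterns to code, leaving a rectangle where recognition fails.

    Returns the code string and a map from return address (position in the
    code string) to the stored pattern that still needs attention.
    """
    code: List[str] = []
    pending: Dict[int, str] = {}
    for address, pattern in enumerate(patterns):
        symbol = recognize(pattern)
        if symbol is None:
            code.append(PLACEHOLDER)
            pending[address] = pattern
        else:
            code.append(symbol)
    return "".join(code), pending


# Hypothetical recognizer: a lookup table that does not know one pattern.
known = {"pat_T": "T", "pat_A": "A", "pat_L": "L"}
code, pending = convert(["pat_T", "pat_smudge", "pat_T", "pat_A", "pat_L"], known.get)
```

The return address is what ties each rectangle back to its stored pattern, so the two can be shown side by side when the operator resolves it.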
After a set of data has been stored, the symbols substituted for those characters which have been identified as unrecognized or ambiguous are returned to the screen along with a concurrent display of the digital pattern stored in file 46 for the same character. Preferably, the machine code symbol indicating the problem character is accompanied by a predetermined number of characters in either DP code or as digital patterns on either side of that symbol, e.g., 4 or 5 such characters, allowing the unknown character to be presented on the screen 31 in a context from which it can be recognized if the pattern is not. Also, the digitalized patterns for, e.g., 3 to 5 characters on either side of the problem character can be displayed to place the character in context. The human operator, presumably capable of identifying the symbol, then inserts the appropriate symbol in machine code using keyboard 40, this inserted symbol replacing the rectangle and complementing the converted data stored in file 46.
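The context display and manual complementing step might be sketched as follows, again with hypothetical names. The code string is assumed to contain a rectangle at each unresolved position, and the context width of four characters follows the "4 or 5" given in the text.

```python
PLACEHOLDER = "\u25AF"  # rectangle marking an unresolved character


def context(code: str, address: int, width: int = 4) -> str:
    """Return the problem character with up to `width` characters on each side."""
    lo = max(0, address - width)
    return code[lo:address + width + 1]


def complement(code: str, address: int, symbol: str) -> str:
    """Replace the rectangle at `address` with the operator's keyed symbol."""
    if code[address] != PLACEHOLDER:
        raise ValueError("no unresolved character at this address")
    if len(symbol) != 1:
        raise ValueError("exactly one symbol is expected")
    return code[:address] + symbol + code[address + 1:]


converted = "INVOICE T\u25AFTAL 56.00"
address = converted.index(PLACEHOLDER)
shown = context(converted, address)          # what the operator sees
resolved = complement(converted, address, "O")
```

Showing the neighbours lets the operator read "T?TAL" as a word even when the isolated pattern is too degraded to identify on its own.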
As will be recognized from the above, human intervention is again necessary only for resolving ambiguities or similar problems which occur as a result of incomplete conversion. This intervention, of course, could take place in a separate step.
After the conversion has been completed and verified, the original data file can be retained or erased as a matter of organizational policy. Bearing in mind that the source documents are still available, identified by the unique codes previously described, and recognizing further that these unique codes still accompany the selected segments which have now been converted from their digitalized patterns to machine codes, it is a simple matter, if considered necessary, to return to the source document for purposes of finding support for the chosen segment. Thus, retaining the digital pattern file may not be necessary. However, when the source documents contain graphic information which is not convertible into machine code in the same fashion, retention of the digital pattern signals in memory is necessary.
Fig. 4 illustrates a sequence of events which is substantially the same as that which has been described above. However, it should be recognized in reviewing the sequence illustrated in Fig. 4 that more than one work station 30 can easily, and would preferably, be involved. Thus, the same work station would not necessarily be used for segment selection and machine code conversion complementing. Indeed, it is entirely possible to have several work stations performing each task if justified by the volume of documents handled.
To briefly review Fig. 4, the source documents are supplied to apparatus to be read, copied, digitalized, after which a decision is made as to whether the data selection can or cannot be made by a program. If the documents are of substantially identical format or have other characteristics which permit automatic handling for this purpose, the segments are selected by program and the selected segments are converted into machine code. If the segment selection cannot be made by program, the segments are selected by the manual control techniques discussed above and the selected segments are then converted.
If the conversion was successful, the converted data is supplied to machine code storage. If not, the ambiguous or unrecognized portions are compared with the digitalized patterns and the machine code data is complemented manually, the additional code segments being supplied to machine code storage. In each case, the digital patterns are stored.
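The decision flow just reviewed can be summarized in a short sketch. Every step is passed in as a function, since the specification leaves their implementations open; the names, the convention that program selection returns `None` when it cannot be applied, and the `(code, ok)` result of conversion are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

Segment = str  # stand-in for a selected block of digitalized patterns


def handle_page(
    page: str,
    select_by_program: Callable[[str], Optional[List[Segment]]],
    select_manually: Callable[[str], List[Segment]],
    convert: Callable[[Segment], Tuple[str, bool]],
    complement_manually: Callable[[Segment, str], str],
) -> List[Tuple[Segment, str]]:
    """Follow the two decisions of the flow: who selects, and did conversion succeed."""
    segments = select_by_program(page)
    if segments is None:                 # selection cannot be made by program
        segments = select_manually(page)
    filed: List[Tuple[Segment, str]] = []
    for segment in segments:
        code, ok = convert(segment)
        if not ok:                       # ambiguous or unrecognized portions remain
            code = complement_manually(segment, code)
        filed.append((segment, code))    # the pattern is stored in each case
    return filed


# Hypothetical stand-ins: the program cannot select, one conversion fails.
result = handle_page(
    "TOTAL|T0TAL",
    select_by_program=lambda page: None,
    select_manually=lambda page: page.split("|"),
    convert=lambda seg: (seg, "0" not in seg),
    complement_manually=lambda seg, code: code.replace("0", "O"),
)
```

Because each stage is a separate callable, selection and complementing could run at different work stations, as the text suggests.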
The stored patterns and machine codes are then available through the passwords previously described and can be used for further processing.
An additional concept is illustrated in Fig. 5 which involves preparing documents for data segment selection in an automatic sense but employing documents which are not, in any given batch, in a sufficiently similar format to permit selection by area designation in a program. Fig. 5 illustrates a section of a typical page of information in which a central portion thereof is to be selected for retention. The selected segment is identified by manually placing identification marks which are unique and different from the remainder of the text likely to appear on the page, the marks being chosen to be machine recognizable. In the example illustrated, marks 56 and 5[?] have been placed on the page, indicating the beginning and end points of the selected segment.
Thus, the phrase beginning "of only ..." and ending "... of given page" will be retained and the remainder will not. Marks such as those shown in Fig. 5, or any other uniquely distinctive marks which can be placed in a position between words, can be added to the page by an individual with a simple marker such as a rubber stamp.
Then, when the information is read and digitalized and the decision is made about whethex the data selection can ~Z~7~3 be made by program, the answer to tha~ last question is ~yes~ because the machine i~ capable of recogni~ing these ~ymbols which have been previously added to the page.
While this initial preparation step is necessary in order to accomplish automatic selection, the time involved is not significantly greater than that required for machine ~election, as discussed in con~ection with Fig. 1, and the proces~ does not occupy a terminal which can then be used .or other purposes.
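As a minimal sketch of the pre-marking technique of Fig. 5, the following Python function returns the text lying between each pair of beginning and end marks. The function name select_marked and the mark strings "[<" and ">]" are assumptions chosen for illustration; the disclosure requires only that the marks be unique and machine recognizable. The page is represented as already-recognized text for simplicity.

```python
def select_marked(text, begin_mark="[<", end_mark=">]"):
    """Return the segments between each begin/end mark pair, in page order."""
    segments = []
    pos = 0
    while True:
        start = text.find(begin_mark, pos)
        if start == -1:
            break  # no further marked segments on this page
        stop = text.find(end_mark, start + len(begin_mark))
        if stop == -1:
            break  # unmatched begin mark: leave the rest for manual selection
        segments.append(text[start + len(begin_mark):stop].strip())
        pos = stop + len(end_mark)
    return segments
```

A page carrying no marks yields an empty list, which corresponds to the case in which selection cannot be made by program.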
From the foregoing, it will be apparent that a system in accordance with the invention permits increased flexibility and efficiency in the extraction of data from documents. When dealing with batch type documents (meaning a large number of documents having data items arranged in a uniform format) the data collection can be fully automatic, the data can be collected in any desired sequence, independent of the document format, and the human effort can be reduced by as much as [?]5%. Also, the format need not be marked on the documents, i.e., they need not have the customary "boxes" and labels. When dealing with single-type documents, collection flexibility still exists, although an operator is used to select either by a work station such as 30 (Figs. 1, 3) or by a pre-marking (Fig. 5) for automatic collection. Human effort reduction is in the order of more than 50%.
Conversion and verification into machine code such as ASCII is accomplished using retrievable digitalized images and there is consistent retrieval code correlation between the original document, the stored digitalized image of selected segments and the machine coded store. Even those unrecognized portions (e.g., handwritten) which are beyond the capability of today's OCR equipment can be handled in a mixed document.
Because of the dual storage, documents with graphics can be stored and retrieved, as well as important signatures; and can be "filed" in the most effective way for the particular subject matter, i.e., with or without conversion to machine code, etc.
The material can also be designated for either immediate transmission to an organizational unit (in the sense of "electronic mail") or for subsequent access by one or more units. Designation can also be effected by a cross-reference issued by a data processing program. Then imprinting of each document may not be necessary. A degree of security can be provided, if appropriate, by limiting access to certain designated units, all of this being a function of the passwords assigned. Documents of wide interest can be made available to everyone almost immediately because the text can be accessed rather than needing to circulate an original or make numerous copies for distribution.
Decentralization of the organization has no negative effect on this kind of information availability.
ments and upon the nature of the organization.
Much information, particularly that which is of historical interest only, is being filed in the form of microfilm, microfiche or similar forms which are produced by what can be generically described as photographic techniques. In other cases, the information contained in the documents is converted to an encoded form, which can be accomplished by such machines as optical character readers (OCR), despite the considerable expense of such machines, depending on the nature of the printing or typing in the original document; but other information must be entered into a system by a manual keypunch operation, a technique which has gained wide acceptance because of the unsatisfactory nature of alternative techniques, but which nevertheless has serious drawbacks because of the inherent problem of errors occurring simply because of the human process of retyping the information. A discussion of various data preparation devices and techniques is to be found in the Encyclopedia of Computer Science and Engineering, Second Edition, Van Nostrand Reinhold Company, New York (1983) beginning at page 4[?]0. This text includes a review of the historical development of data preparation and also discusses the expense and difficulty of recycling information within a system to reduce the error percentage.
In most circumstances, it is not necessarily desirable to eliminate human intervention, nor can this be done as a practical matter. For example, if documents coming into an organization are to be handled and entered into a system, it is necessary for some human operator to review each document, determine its relevance and make some decisions. It is, however, desirable to remove the human process of retyping or keypunching the data because of the above-discussed error entry problems. On the other hand, machine data entry preparation, such as OCR, in addition to the expense has the disadvantage that very often the total content of each document must be entered, an approach which is wasteful of mass storage, compounds the difficulty of locating and utilizing relevant information at a later time, and usually would necessitate reworking the data, depending upon its form and ultimate use.
Brief Summary Of The Invention
Accordingly, it is an object of the present invention to provide an organized system for selectively entering the contents of documents into a storage facility from which the material can be selectively extracted for various purposes.
A further object is to provide a unique technique for cross referencing the stored pattern and the original document, the combination of storage and correlation being usable, in many circumstances, to completely eliminate further handling of the original document itself between organizational units.
A further object is to provide such a system in which the total amount of storage required is reduced because of the selectivity of the storage technique, and wherein the saved memory capacity is usable to correlate the filed data with the source documents, permitting carrying of this link forward to provide a bibliography of source information in, for example, report preparation.
A still further object is to provide efficient techniques for verification and complementing of the stored data in a codification and storage procedure.
Yet another object is to provide such a system which is capable of automatic selection of desired data from a plurality of similar documents, i.e., like format but containing variations in data content, by programmable machine, thus allowing the human involvement to be at a higher level and to be greatly reduced.
Briefly described, the invention includes an apparatus for gathering data from a plurality of source documents comprising means for sequentially receiving documents, optically scanning each document and forming series of digitalized electrical signals representative of digital patterns substantially identical to patterns on each document from which an image of each document can be reproduced; means for providing in a predetermined area of each document a set of characters, by imprinting or signing by program, uniquely identifying each document and for producing electrical signals representative of said characters; and buffer means for storing the series of digitalized signals along with said signals representative of said characters.
In another aspect, the invention includes an apparatus for selectively storing information derived from source documents comprising means for receiving source documents, optically scanning each document and forming series of digitalized electrical signals representative of digitalized patterns of material on each document from which an image of each document can be reproduced; buffer means for storing the series of digitalized signals; means for recalling from said buffer means groups of said digitalized signals and for producing on a viewable screen an image of digitalized patterns of the document from which said signals were formed; manually operable control means for selecting a plurality of locations in said document to identify selected segments of the patterns therein and for adding to said selected segments address information to control subsequent disposition of said segments; and a mass data file for receiving said segments in digitalized form and said address information. The added address information can also be used to perform such functions as complementing the link to the source information for use in a specific report or an excerpt from a report which permits this link to be carried forward, providing, e.g., an automated audit trail.
A still further aspect of the invention includes a method of inputting and preparing data from source documents comprising the steps of scanning each source document and forming signals representative of digitalized patterns derived from images of characters and graphics thereon, temporarily storing the signals representative of the digitalized patterns, selecting segments of the stored signals for further processing, converting signals representative of digitalized patterns of characters in only the selected segments into a machine code, displaying the digitalized patterns from the storage of signals for each character not successfully converted into machine code or ambiguous character along with a display of converted characters both before and after each unconverted character, manually entering a code for the digitalized pattern, and storing the machine code and digitalized pattern signal for subsequent use.
In order that the manner in which the foregoing and other objects are attained in accordance with the invention can be understood in detail, particularly advantageous embodiments thereof will be described with reference to the accompanying drawings, which form a part of this specification, and wherein:
Fig. 1 is a schematic block diagram of a system in accordance with the invention;
Fig. 2 is a plan view of a typical source document illustrating possible placement areas for printed identification;
Fig. 3 is a schematic block diagram of a second embodiment of a system in accordance with the invention;
Fig. 4 is a flow diagram illustrating the sequences of steps in document and data handling in accordance with the method of the invention; and
Fig. 5 is a plan view of a typical text page illustrating a further technique involving document preparation.
Detailed Description Of The Preferred Embodiments
For purposes of the following disclosure it will be desirable to use terms in a specific way to avoid confusion. In the processes to be described the term "digitalization" and various forms thereof will be used to refer to a fragmenting of a symbol or pattern into an arrangement of light and dark elements, i.e., elements having contrast variations, and to signals representing those contrast variations or elements. An example of this would be an arrangement of dots which follow the lines of a letter of the alphabet such as might be produced by a dot matrix printer, and one or more series of electrical signals representing, respectively, the light and dark areas of the dots and their background.
Further terms to be used involve "code," i.e., "machine code," "coding" or "encoding" which refers to forming a code representation of a symbol using, for example, ASCII code, such that the code can readily be stored in a machine, in a magnetic medium or in some other form of memory and can be manipulated using conventional DP techniques.
In this connection, it will be observed that the digitalized pattern can also be stored, but the storage does not depend on recognition of the symbol nor need the symbol be actually recognizable. Encoding and storing the code, however, depends on recognition of the symbol as one of a predefined set and selection of an assortment of code elements which has been assigned to represent that symbol (or group of symbols). As used herein, the specific selection of the type of code will largely be ignored because that selection will depend primarily on the context in which the invention is used.
The term "data" will be used in a rather general sense to include information in human readable as well as machine readable or machine stored form.
"Character" is used to mean alpha-numeric symbols as well as other symbols such as mathematical operators, generally including any symbol having a recognizable and definable meaning to some group. That term differs from "graphics" which is used to mean drawings, graphs, etc.
Turning now to the drawings, it will be seen that the diagram of Fig. 1 shows a digitizer system for receiving source documents indicated generally at 10 which are fed into a device which will be referred to as a document reader 11 which performs the function of optically scanning each document page delivered to it, the reader having as an integral part thereof a digitizer 12 which performs the function of producing digitalized signals representative of digital patterns approximating the characters and graphics appearing on the source documents delivered thereto. This portion of the apparatus is known and is currently available on the market, a useful device being the Memorex PS100 OEM High-Speed Optical Page Scanner available from Memorex Corporation, Santa Clara, CA. The digitalized signals representative of the digital patterns of the characters and graphics appearing on the pages are supplied on a channel 14 to a store 16 which can be any convenient form of memory capable of accepting the signals and retaining them in an extractable and changeable form. Various forms of high density memory are usable for this purpose including hard magnetic disk, video disk, floppy disk and bubble.
Of particular significance is the inclusion in the apparatus shown in Fig. 1 of a printing device 18 which is illustrated as being at the input end of the apparatus and which, preferably, is in a very early portion of the system although it need not be the first element in a mechanical sense. The automatic printing may also take place prior to inputting the documents for digitalization, in a separate step. A printer which can be used for this purpose is the Centronics Model 154 sold by Centronics Data Computer Corp., One Wall Street, Hudson, NH. The purpose of printer 18 is to provide on each document which passes through the system a printed legend which uniquely identifies that document and which has useful information contained therein by which the document can be stored in an organized fashion and recovered quickly, if necessary. The printer can readily be arranged to print on any normally unused portion of the document so as to not interfere with the text appearing thereon. For example, as shown in Fig. 2, a typical document 10 has a region 20 in which, for a certain organization, text normally appears, the marginal areas usually being blank. In an organization which commonly attaches documents by hole-punching at the top, the legend can be printed in a zone indicated generally at 22 near the bottom of the page or, if the document is normally stored by punching along the left-hand margin of the document, the legend can conveniently be printed either at the top or in zone 22 or in a zone 23. As will be recognized, the direction of feed of the document will determine whether a serial or parallel printer need be used for the zones indicated.
In either case, each zone includes a portion A and a portion B, one of which is printed with a sequence of human readable symbols indicating filing information such as, for example, a serial number which is different for each document and which can include date information and, if more than one system of the type shown in Fig. 1 is in use, an indication of which system handled the document or additional information needed for an operation. The other of portions A and B includes substantially the same information in a machine readable form which can employ symbols commonly readable by substantially all OCR systems such as the basic mathematical operators, periods and the like. With this imprinted legend, the document can readily be located manually or it can be located by a machine sorting system. If the system is used with documents which are frequently covered on one side with text, the printer can be arranged to print the legend on the back as indicated at 24. In that case, the zone again has portions A and B as described.
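The two-part legend can be sketched as follows, purely as an illustration. The layout of the human readable portion (system number, date, six-digit serial number) and the encoding of the machine readable portion (each decimal digit written as four symbols, "+" for 1 and "." for 0) are assumptions made for this example; the disclosure states only that the machine readable portion can employ symbols such as basic mathematical operators and periods. The names make_legend and read_machine are likewise hypothetical.

```python
from datetime import date

# Assumed mapping from binary digits to widely OCR-readable symbols.
SYMBOLS = {"0": ".", "1": "+"}


def make_legend(serial, day, system_id):
    """Return (human readable portion, machine readable portion) of a legend."""
    human = f"{system_id}-{day:%Y%m%d}-{serial:06d}"
    digits = "".join(ch for ch in human if ch.isdigit())
    # Each decimal digit becomes a group of four symbols (its 4-bit binary form).
    machine = " ".join(
        "".join(SYMBOLS[bit] for bit in format(int(d), "04b")) for d in digits
    )
    return human, machine


def read_machine(machine):
    """Decode the machine readable portion back into its digit string."""
    inverse = {symbol: bit for bit, symbol in SYMBOLS.items()}
    return "".join(
        str(int("".join(inverse[s] for s in group), 2)) for group in machine.split()
    )
```

Because both portions carry the same digits, either one suffices to locate the document, manually or by a machine sorting system.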
Imprinting of the legend, control of the feed and further functions are accomplished by a control unit 25 which is coupled to the printer, the reader and the digitizer and which not only controls the normal functions of the apparatus, such as providing the manufacturer-specified input signals, but also controls the printer to inscribe the proper sequence of symbols on each page and, in addition, incorporates with the digitized signal an address relating to the imprinted legend so that the signal on channel 14 to the storage unit 16 includes not only the digitalized signal representative of the digital patterns of the material appearing on the page in the region 20 but also includes address information such as that given in zone 22, 23 or 24. It is also important to recognize that the system provides for the handling of each source document only once unless unusual circumstances arise. Thus, after the document has been printed, read and the information therefrom digitized and supplied to store 16, the document is delivered to a storage facility 27 from which it would normally not be removed, although it can continue to remain available if desired, depending upon the nature of the organization employing the system and the nature of the documents themselves. As a minor exception to the "single handling" principle, it may be desirable to run the documents through a separate stack feeder and printer before the reader if differences in feed speed become significant because of the choice of certain printers or readers.
Instead of printing of the legend, documents, e.g., those transmitted by electronic mail, may be signed by an internal program of the apparatus.
It is desirable to have bidirectional communication between the digitizer and the store so that the store can supply on a channel 15 information to the digitizer about available space remaining and can also be used for handshake and parity error checking purposes.
With the apparatus thus far described, it is possible to provide a central processing facility in which incoming documents are handled only once for the purpose of passing them through the reader and digitizer apparatus, after which they are placed in storage or, conceivably, destroyed after a preselected lapse of time. All of the data from each document is available in a store so that it can be subsequently accessed, using the address information, so that the digitized signals can be employed to reproduce, on a viewable screen, a reproduction of the digital patterns approximating the data on each original source document. The stored data is conveniently a mixture of machine code and digitalized pattern signals.
It is also the philosophy of the invention to strategically position equipment of the described approach at one of several places within an operation, or in one place with several pieces of equipment. This is for the purpose of organizing the operation in such a way that all in- and outgoing documents are collected in whatever way the operation requires. This also permits collection and immediate transmission of data required to be viewed at one or various remote locations. The organization department of an operation, for instance, could enter a new procedure with drawings via the above approach, with the new procedure being retrievable based on known passwords with no paper, i.e., a procedure which needs to be distributed to the various organizational elements. Furthermore, there is no need for the paper (i.e., procedure) to be filed for later access. The same approach could be taken for the purpose of change control in an engineering operation where either the latest change of a drawing or its history could be retrieved and viewed. This approach requires a new organizational element with experienced and knowledgeable personnel.
Fig. 1 also shows apparatus for recalling and further processing the data stored in store 16. This apparatus, in a relatively simple form, includes a work station 30 having a viewable screen symbolically indicated at 31 on which groups of digital patterns can be displayed from store 16. Work station 30 can comprise, for example, a relatively simple personal computer with a similarly simple recall program to extract data from store 16 and display it on screen 31, the work station being capable of bidirectional communication with store 16 on channels 32 and 33. The existence of work station 30 at this point in the system is, however, extremely important as will be realized when recognizing that the data stored in store 16 is still in a digital pattern form and has not yet been sorted or selected nor is it yet in final storage. Work station 30 is provided with a manual control symbolically indicated at 35 which can comprise a simple form of cursor control, a simplified keyboard, a "mouse" or a lightpen, any of which are capable of positioning two or more cursors at selected locations in textual or graphic material displayed on screen 31. Thus, if it is assumed that lines 34 on screen 31 represent lines of characters which have been extracted from store 16, control 35 can be used to place cursors at, for example, the positions on the lines identified by the X symbols in Fig. 1. These cursors are used to identify those segments lying between the X's as being segments which are to be preserved for further processing or use. The X's can be used to select all of the data on the screen, none of the data on the screen, or any amount in between, the sequence of entering the material into storage being according to the sequence of activating the cursors. The selected data segments are then transferred on line 36 to mass storage facilities symbolized by the tape storage unit 37 or disc storage 38.
A computer which is quite suitable for this purpose is the Apple "LISA" computer system made by Apple Computer Inc., 10260 Bandley Dr., Cupertino, CA and described in the February 1983 issue of BYTE Magazine, p. 33 et seq.
The cursor control and entry commands can also be used to designate which segments are to be stored in digitalized form for subsequent conversion to machine code and which segments (especially graphics) are simply to be retained in digitalized form as a more suitable format than in machine code.
In addition to selecting segments of the data, the manual control 35 can also be employed to attach an address or keyword to the data segment indicating the nature of the subject matter or the organizational unit to which the subject matter should be directed, or both. In some circumstances, control 35 would necessarily be in the nature of a keyboard to provide a larger amount of control, but the actual size is not particularly significant. It is greatly preferred that only meaningful portions of the data in digital pattern form are preserved for mass storage and subsequent use. Additionally, the selected segments are identified in such a way that they can be acquired by symbols attached to them at work station 30 which will be referred to hereinafter as "passwords." It is obviously necessary for a trained individual to operate station 30, someone who is capable of viewing and comprehending at least the significance of the information being displayed and of attaching the desired code, using a lexicon for this purpose which is supplied by the organization. In a corporate environment, for example, the lexicon might include such organizational units as "accounting," "sales," "research and development," and the like, and can also include subunits such as will serve to identify specific research and development projects or topics within the R&D department. The exact nature of the passwords is, of course, not significant to the invention itself and will vary from one organization to another.
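The selection and password steps performed at work station 30 can be sketched as follows. This is a simplified illustration under stated assumptions: displayed lines are represented as strings rather than digital patterns, each cursor pair is given as a hypothetical (line index, start column, end column) triple, and the names LEXICON and select_segments do not appear in the disclosure. The lexicon entries are the example units named above.

```python
# Example lexicon supplied by the organization (entries taken from the text).
LEXICON = {"accounting", "sales", "research and development"}


def select_segments(lines, cursor_pairs, passwords, legend):
    """Build segment records in the order in which the cursors were activated.

    cursor_pairs: (line_index, start_col, end_col) triples, in activation order.
    passwords:    keywords to attach; each must belong to the lexicon.
    legend:       the unique legend of the source document, carried forward.
    """
    unknown = set(passwords) - LEXICON
    if unknown:
        raise ValueError(f"passwords not in lexicon: {sorted(unknown)}")
    return [
        {
            "legend": legend,
            "passwords": sorted(passwords),
            "pattern": lines[line][start:end],
        }
        for line, start, end in cursor_pairs
    ]
```

Keeping the legend in every record preserves the link between each stored segment and its source document, and the list order reflects the sequence of cursor activation rather than the position on the page.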
Fig. 3 shows a further embodiment of an apparatus in accordance with the invention in which the further step of conversion to a code is accomplished. Those portions of the system which have already been described will not be described again, their functions being substantially the same. In the embodiment of Fig. 3, work station 30 is shown as having, in addition to screen 31, a more complete keyboard 40, although the apparatus can still include a lightpen, mouse, joy stick or the like for cursor control.
In addition, the apparatus includes a converter/compiler 42 which is capable of converting digitalized signals representing digitalized patterns into a machine code such as ASCII, BCD or some other code. As will be recognized, the volume of the material is reduced at work station 30 by selecting data segments to be stored. However, in addition to storing the digitalized signals in a data file 46, the signals are supplied on a channel 48 to converter 42 to be converted into a code which can be processed and handled by a conventional computing device. The encoded data is then supplied on a channel 50 to a separate portion of data file 46.
As is well recognized, pattern converters of good quality are capable of recognizing a large percentage of the patterns presented to them, which patterns will then be successfully converted into the machine code and stored. It can be expected, however, that certain patterns will not be recognized or will be recognized as being ambiguous symbols, such as "5" and "S", "H" and "4", and the like. An ambiguous symbol is also a character which requires one or several conversion attempts by the program before conversion succeeds. The digital patterns representing these ambiguous and unrecognized characters are stored in data file 46, and the machine code for all recognized symbols is stored in file 46, but those which are not recognized or which are thought to be ambiguous are replaced in storage, preferably in a separate portion thereof, with a code signifying a special identifying symbol such as a rectangle substituting for the character which is presenting the problem, plus a return address.
After a set of data has been stored, the symbols substituted for those characters which have been identified as unrecognized or ambiguous are returned to the screen along with a concurrent display of the digital pattern stored in file 46 for the same character. Preferably, the machine code symbol indicating the problem character is accompanied by a predetermined number of characters in either DP code or as digital patterns on either side of that symbol, e.g., 4 or 5 such characters, allowing the unknown character to be presented on the screen 31 in a context from which it can be recognized if the pattern is not. Also, the digitalized patterns for, e.g., 3 to 5 characters on either side of the problem character can be displayed to place the character in context. The human operator, presumably capable of identifying the symbol, then inserts the appropriate symbol in machine code using keyboard 40, this inserted symbol replacing the rectangle and complementing the converted data stored in file 46.
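The substitution and complementing procedure can be sketched in Python as follows. The sketch assumes, for simplicity, that each character pattern is handed to a recognizer that returns either a character or None; the names convert, context and complement, the use of a list index as the return address, and the choice of the Unicode white rectangle as the substitute symbol are illustrative assumptions only.

```python
PLACEHOLDER = "\u25ad"  # rectangle substituted for a problem character


def convert(patterns, recognize):
    """Convert patterns to code, substituting a rectangle where recognition fails.

    Returns the code list and the return addresses of the problem characters.
    """
    code, pending = [], []
    for address, pattern in enumerate(patterns):
        char = recognize(pattern)
        if char is None:
            code.append(PLACEHOLDER)
            pending.append(address)  # return address for later complementing
        else:
            code.append(char)
    return code, pending


def context(code, address, width=4):
    """Return the problem symbol with up to `width` characters on either side."""
    return "".join(code[max(0, address - width):address + width + 1])


def complement(code, address, char):
    """Replace the rectangle at `address` with the operator's keyed character."""
    code[address] = char
```

In use, each address in the pending list is presented with its context alongside the stored digital pattern, and the operator's keyed character is written back at the same address.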
As will be recognized from the above, human intervention is again necessary only for resolving ambiguities or similar problems which occur as a result of incomplete conversion. This intervention, of course, could take place in a separate step.
After the conversion has been completed and verified, the original data file can be retained or erased as a matter of organizational policy. Bearing in mind that the source documents are still available, identified by the unique codes previously described, and recognizing further that these unique codes still accompany the selected segments which have now been converted from their digitalized patterns to machine codes, it is a simple matter, if considered necessary, to return to the source document for purposes of finding support for the chosen segment. Thus, retaining the digital pattern file may not be necessary. However, when the source documents contain graphic information which is not convertible into machine code in the same fashion, retention of the digital pattern signals in memory is necessary.
Fig. 4 illustrates a sequence of events which i~ substantially ~he same as that which has been describ~-ed above. However, it should be recognized in reviewing the sPquence illustrated in Fig. 4 that more than one work station 30 can easily, and would preferably, be involved. Thus, the same work station would not neces-sarily be used for segment selection and machine code conversion complementing. Indeed, it is en~irely pos-sible to have several work stations performing each task i justified b~ the volume of documents handledO
To briefly review Fig. 4, the source documents are supplied to apparatus to be read, co~ied, digi~alized aftex which a decision is made a~ to whether the- data selection can or cannot be made by a progr~m. If the documents are of substantially identical format or have other characteristics which permit automatic handling for this purpose, the segmen~s are selected by program and r ~L20l~3 7~7 the selected segments are converted into machine code.
If the segment selection cannot be made by program, the 6egments are selected by the manual control techniques discussed above and the ~elected segments are then ~onverted.
If the conversion was successful, the converted data is supplied to machine code storage~ If not, the ambiguous or unrecognized portions are compared with the digitalized patterns and the machine code data is comple mented manually, the additional code segments being supplie~ to machine code storage. In each case, the digital patterns are stored.
The stored patterns an~ machine ccdes are then available through the passwords previously described and can be used for further processing~
An additional concept is illustrated in Fig. 5 which involves preparing documents for data segment selection in an automatic sense but employing documents which are not, in any given batch~ in a sufficiently similar format to permit selection by area designation in a program~ Fig. S illustrates a section o a typical page of information in which a central portion thereof is to be selected for retention. The selected segment is identified by manually placing identification marks which are unique and different from the remainder of the text likely to appear on the page, the marks ~eing chosen to be machine recognizable. In the example illustrated, marks 56 and 5? have been placed on the page, indicating the ~eginning and end points o the selected segment.
Thus, the phrase beginning nof only ... n and ending ~ o given page" will be retained and the remainder will notc Marks such as those shown in Fig. S, or any other uniquely distinctive marks which can be placed in a position between words, can be added to the page by an individual with a simple marker such as a rubber stamp.
Then, when the information is read and digitalized and the decision is made about whether the data selection can be made by program, the answer to that last question is "yes" because the machine is capable of recognizing these symbols which have been previously added to the page.
While this initial preparation step is necessary in order to accomplish automatic selection, the time involved is not significantly greater than that required for machine selection, as discussed in connection with Fig. 1, and the process does not occupy a terminal which can then be used for other purposes.
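As a rough illustration of selection by pre-marking, the Python sketch below extracts the text lying between a begin mark and an end mark. The mark strings `[[` and `]]` and the function names are assumptions chosen for the example; the patent only requires that the marks be distinctive and machine recognizable, and a real system would operate on recognized symbols in the scanned image rather than on a plain string.

```python
from typing import List

BEGIN_MARK = "[["  # hypothetical stand-in for the stamped begin mark
END_MARK = "]]"    # hypothetical stand-in for the stamped end mark


def select_marked_segments(text: str,
                           begin: str = BEGIN_MARK,
                           end: str = END_MARK) -> List[str]:
    """Return every segment enclosed by a begin mark and the next end mark."""
    segments = []
    pos = 0
    while True:
        start = text.find(begin, pos)
        if start == -1:
            break
        stop = text.find(end, start + len(begin))
        if stop == -1:
            break  # begin mark without a matching end mark: nothing to retain
        segments.append(text[start + len(begin):stop].strip())
        pos = stop + len(end)
    return segments


def can_select_by_program(text: str) -> bool:
    """The selection can be made by program only if marked segments are found."""
    return bool(select_marked_segments(text))
```

A page without marks yields an empty list, which corresponds to the case where the selection cannot be made by program and falls back to manual selection.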
From the foregoing, it will be apparent that a system in accordance with the invention permits increased flexibility and efficiency in the extraction of data from documents. When dealing with batch type documents (meaning a large number of documents having data items arranged in a uniform format) the data collection can be fully automatic, the data can be collected in any desired sequence, independent of the document format, and the human effort can be reduced by as much as ~5%. Also, the format need not be marked on the documents, i.e., they need not have the customary "boxes" and labels. When dealing with single-type documents, collection flexibility still exists, although an operator is used to select either by a work station such as 30 (Figs. 1, 3) or by a pre-marking (Fig. 5) for automatic collection. Human effort reduction is in the order of more than 50%.
Conversion and verification into machine code such as ASCII is accomplished using retrievable digitalized images and there is consistent retrieval code correlation between the original document, the stored digitalized image of selected segments and the machine coded store. Even those unrecognized portions (e.g., handwritten) which are beyond the capability of today's OCR equipment can be handled in a mixed document.
Because of the dual storage, documents with graphics can be stored and retrieved, as well as important signatures, and can be "filed" in the most effective way for the particular subject matter, i.e., with or without conversion to machine code, etc.
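The dual storage idea can be pictured with the following Python sketch, in which one retrieval code keys both the digitalized image and, where conversion took place, the machine code. The class names `StoredSegment` and `DualStore`, the method names, and the sample retrieval codes are hypothetical and serve only to illustrate the correlation between the two stores.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredSegment:
    image: bytes                        # digitalized pattern, always kept
    machine_code: Optional[str] = None  # e.g. ASCII text; None if never converted


class DualStore:
    """Keeps image and machine code under one shared retrieval code."""

    def __init__(self) -> None:
        self._items: Dict[str, StoredSegment] = {}

    def file(self, retrieval_code: str, image: bytes,
             machine_code: Optional[str] = None) -> None:
        """File a segment with or without conversion to machine code."""
        self._items[retrieval_code] = StoredSegment(image, machine_code)

    def image_of(self, retrieval_code: str) -> bytes:
        return self._items[retrieval_code].image

    def text_of(self, retrieval_code: str) -> Optional[str]:
        return self._items[retrieval_code].machine_code
```

A signature or a graphic would be filed with an image only, while a typed invoice could be filed with both forms under the same code.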
The material can also be designated for either immediate transmission to an organizational unit (in the sense of "electronic mail") or for subsequent access by one or more units. Designation can also be effected by a cross-reference issued by a data processing program. Then imprinting of each document may not be necessary. A degree of security can be provided, if appropriate, by limiting access to certain designated units, all of this being a function of the passwords assigned. Access to documents of wide interest can be made almost immediately available to everyone at once because the text can be accessed rather than needing to circulate an original or make numerous copies for distribution.
Decentralization of the organization has no negative effect on this kind of information availability.
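One possible reading of password-controlled access is sketched below in Python. The class `PasswordIndex`, its methods, and the unit and password names used with it are assumptions for illustration; the patent does not specify how passwords are mapped to organizational units.

```python
from typing import Dict, List, Set


class PasswordIndex:
    """Maps passwords to retrieval codes and limits which units may use them."""

    def __init__(self) -> None:
        self._codes: Dict[str, List[str]] = {}
        self._units: Dict[str, Set[str]] = {}

    def assign(self, password: str, retrieval_code: str) -> None:
        """File a stored item under a password."""
        self._codes.setdefault(password, []).append(retrieval_code)

    def grant(self, password: str, unit: str) -> None:
        """Designate an organizational unit as allowed to use a password."""
        self._units.setdefault(password, set()).add(unit)

    def retrieve(self, password: str, unit: str) -> List[str]:
        """Return retrieval codes for the password, or nothing if not permitted."""
        if unit not in self._units.get(password, set()):
            return []
        return list(self._codes.get(password, []))
```

Granting the same password to every unit models a document of wide interest, while granting it to a single unit models restricted access.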
Claims (29)
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. An apparatus for gathering data from a plurality of source documents comprising: means for sequentially receiving documents, optically scanning each document and forming a series of digitalized signals representative of digital patterns closely approximating patterns on each document from which an image of each document can be reproduced; means for imprinting in a predetermined area of each document a set of characters uniquely identifying each document and for producing electrical signals representative of said characters; buffer means for storing the series of digitalized signals along with said signals representative of said characters; means for recalling from said buffer means groups of said digitalized signals and for producing on a viewable screen an image of the digitalized patterns of the document from which said signals were formed; manually operable control means for selecting a plurality of locations in said document image to identify selected segments of the digital patterns therein and for adding to said selected segments address information to control subsequent disposition of said segments; and a mass data file for receiving and storing said segments in digitalized form and said address information.
2. An apparatus according to claim 1 wherein the set of characters printed by said means for imprinting includes concurrently printed subsets of characters in machine readable form and in human readable form including essentially the same information.
3. An apparatus according to claim 2 wherein said predetermined area is an edge of each document.
4. An apparatus according to claim 1 wherein said mass data file comprises video disc means for receiving and storing.
5. An apparatus according to claim 1 and further comprising means for recalling said digitalized signals from said buffer means and for converting said signals into signals forming a machine code; and means for storing said code signals.
6. An apparatus according to claim 5 and comprising means for identifying signals representative of unrecognized and ambiguous digitalized patterns and for including a distinctive marker code correlated with such code signals in said means for storing; means for recalling code signal groups including said marker codes and displaying on a screen groups of code-generated symbols each including an unrecognized signal; means for displaying the unrecognized digitalized pattern concurrently with the symbols generated from said groups of code signals on said screen for human review; and means for manually entering a symbol code to replace said marker code and the signals representing said unrecognized or ambiguous symbols.
7. An apparatus according to claim 6 wherein said manually operable control means includes a manipulatable control for positioning cursors at said locations.
8. An apparatus according to claim 6 wherein said manually operable control means includes a lightpen, and said screen includes means responsive to said lightpen to establish an identifying element at locations contacted by said pen.
9. An apparatus according to claim 5 wherein said mass data file comprises video disc means for receiving and storing.
10. An apparatus according to claim 1 wherein said manually operable control means includes a manipulatable control for positioning cursors at said locations.
11. An apparatus according to claim 1 wherein said manually operable control means includes a lightpen, and said screen includes means responsive to said lightpen to establish an identifying element at locations contacted by said pen.
12. An apparatus for selectively storing information derived from source documents comprising means for receiving source documents, optically scanning each document and forming a series of digitalized electrical signals representative of a digitalization of patterns on each document from which an image of each document can be reproduced, buffer means for storing the series of digitalized signals, means for recalling from said buffer means groups of said digitalized signals and for producing on a viewable screen an image of digitalized patterns of the document from which said signals were formed; manually operable control means for selecting a plurality of locations in said document to identify selected segments of the patterns therein and for adding to said selected segments address information to control subsequent disposition of said segments; and a mass data file for receiving said segments in digitalized form and said address information.
13. An apparatus according to claim 12 and further comprising means for recalling said digitalized signals from said buffer means and for converting said signals into signals forming a machine code; and means for storing said code signals.
14. An apparatus according to claim 13 comprising means for identifying signals representative of unrecognized and ambiguous digitalized patterns and for including a distinctive marker code correlated with such code signals in said means for storing; means for recalling code signal groups including said marker codes and displaying on a screen groups of code-generated symbols each including an unrecognized or ambiguous signal; means for displaying the unrecognized digitalized pattern concurrently with the symbols generated from said groups of code signals on said screen for human review; and means for manually entering a symbol code to replace said marker code and the signals representing said unrecognized or ambiguous symbols.
15. An apparatus according to claim 14 wherein said manually operable control means includes a manipulatable control for positioning cursors at said locations.
16. An apparatus according to claim 14 wherein said manually operable control means includes a lightpen, and said screen includes means responsive to said lightpen to establish an identifying element at locations contacted by said pen.
17. A method of inputting and preparing data from source documents comprising the steps of scanning each source document and forming signals representative of digitalized patterns derived from images of characters and graphics thereon, temporarily storing the signals representative of the digitalized patterns, selecting segments of the stored signals for further processing, converting signals representative of digitalized patterns of characters in only the selected segments into a machine code, displaying the digitalized patterns from the storage of signals for each character not successfully converted into machine code, manually entering a code for the digitalized pattern, and storing the machine code and digitalized pattern signal for subsequent use.
18. A method according to claim 17 and further including the step of adding to sets of the digitalized pattern signals a signal set representative of a password selected in accordance with a predetermined lexicon by which said sets of digitalized pattern signals can be subsequently accessed.
19. A method according to claim 17 and including displaying digitalized patterns of characters appearing both before and after each unconverted character.
20. A method according to claim 18 wherein the step of selecting includes manually positioning a cursor at the beginning and end of each segment to be converted.
21. A method according to claim 18 wherein the step of selecting includes instructing a processor to select segments from selected areas of the source documents.
22. An apparatus according to claim 17 and further including adding to sets of machine code signals a signal set representative of a password selected in accordance with a predetermined lexicon.
23. A method according to claim 17 wherein the step of selecting includes manually positioning a cursor at the beginning and end of each segment to be converted.
24. A method according to claim 17 wherein the step of selecting includes instructing a processor to select segments from selected areas of the source documents.
25. A method according to claim 17 and preceded by the step of imprinting in a predetermined zone on each document a set of characters uniquely identifying that document in human and machine readable form.
26. A method of preparing source documents and inputting selected data therefrom into a storage and retrieval system comprising: manually marking each source document with machine readable distinctive marks identifying one or more segments of material thereon which are to be stored, scanning each source document and forming signals representative of digitalized patterns derived from images of characters and graphics in the selected segments, temporarily storing the signals representative of the digitalized patterns, converting signals representative of digitalized patterns of characters in the selected segments into a machine code, displaying the digitalized patterns from the storage of signals for each character not successfully converted into machine code; manually entering a code for the digitalized pattern, and storing the machine code and digitalized pattern signal for subsequent use.
27. A method according to claim 26, in which the display of converted characters is left and right to unconverted or ambiguous character.
28. A method according to claim 20 wherein the manual positioning of a cursor includes applying a lightpen to the display to locate the desired cursor positions.
29. A method according to claim 23, wherein the manual positioning of a cursor includes applying a lightpen to the display to locate the desired cursor positions.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US06/499,500 US4553261A (en) | 1983-05-31 | 1983-05-31 | Document and data handling and retrieval system |
US499,500 | 1983-05-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
CA1208797A true CA1208797A (en) | 1986-07-29 |
Family
ID=23985495
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA000455458A Expired CA1208797A (en) | 1983-05-31 | 1984-05-30 | Document and data handling and retrieval system |
Country Status (7)
Country | Link |
---|---|
US (1) | US4553261A (en) |
EP (1) | EP0144361B1 (en) |
AU (1) | AU2961284A (en) |
CA (1) | CA1208797A (en) |
DE (1) | DE3475255D1 (en) |
IT (1) | IT1176221B (en) |
WO (1) | WO1984004864A1 (en) |
Families Citing this family (178)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AUPQ131399A0 (en) * | 1999-06-30 | 1999-07-22 | Silverbrook Research Pty Ltd | A method and apparatus (NPAGE02) |
JPS6249483A (en) * | 1985-08-28 | 1987-03-04 | Hitachi Ltd | Character inputting system for real time handwritten character recognition |
JPH0757002B2 (en) * | 1982-10-05 | 1995-06-14 | キヤノン株式会社 | Image processing device |
JPS59128666A (en) * | 1983-01-14 | 1984-07-24 | Fuji Xerox Co Ltd | Issuing device of slip, form or the like |
JPS603056A (en) * | 1983-06-21 | 1985-01-09 | Toshiba Corp | Information rearranging device |
US4726065A (en) * | 1984-01-26 | 1988-02-16 | Horst Froessl | Image manipulation by speech signals |
JPS60163156A (en) * | 1984-02-04 | 1985-08-26 | Casio Comput Co Ltd | Document forming and editing system |
JPS60196856A (en) * | 1984-03-20 | 1985-10-05 | Olympus Optical Co Ltd | Picture retrieval registering system |
JPH0750483B2 (en) * | 1985-05-22 | 1995-05-31 | 株式会社日立製作所 | How to store additional information about document images |
US5265242A (en) * | 1985-08-23 | 1993-11-23 | Hiromichi Fujisawa | Document retrieval system for displaying document image data with inputted bibliographic items and character string selected from multiple character candidates |
US4716542A (en) * | 1985-09-26 | 1987-12-29 | Timberline Software Corporation | Method and apparatus for single source entry of analog and digital data into a computer |
DE3642220A1 (en) * | 1985-12-11 | 1987-06-19 | Sharp Kk | DEVICE FOR RECORDING AND RETURNING RECORDED INFORMATION |
FR2595487B1 (en) * | 1986-03-06 | 1990-01-19 | Truong Trong Thi Andre | AUTOMATIC DOCUMENT ARCHIVING DEVICE |
JPH0785248B2 (en) * | 1986-03-14 | 1995-09-13 | 株式会社東芝 | Data Isle System |
US4887129A (en) * | 1986-05-12 | 1989-12-12 | Shenoy Vittal U | Editing copying machine |
US4760606A (en) * | 1986-06-30 | 1988-07-26 | Wang Laboratories, Inc. | Digital imaging file processing system |
US4813077A (en) * | 1986-07-30 | 1989-03-14 | Scan-Optics, Inc. | Sales transaction record processing system and method |
US4734789A (en) * | 1987-02-02 | 1988-03-29 | Xerox Corporation | Editing copying machine |
US4888812A (en) * | 1987-12-18 | 1989-12-19 | International Business Machines Corporation | Document image processing system |
US4910537A (en) * | 1988-02-26 | 1990-03-20 | Kabushiki Kaisha Toshiba | Image forming apparatus |
JP2589999B2 (en) * | 1988-03-18 | 1997-03-12 | 株式会社竹中工務店 | Graphic input / output device |
US6247031B1 (en) * | 1988-04-30 | 2001-06-12 | Minolta Co., Ltd. | Image filing system for memorizing images read from a given document together with small characterizing image |
US5058185A (en) * | 1988-06-27 | 1991-10-15 | International Business Machines Corporation | Object management and delivery system having multiple object-resolution capability |
US5153936A (en) * | 1988-06-27 | 1992-10-06 | International Business Machines Corporation | Dual density digital image system |
US5089956A (en) * | 1988-11-29 | 1992-02-18 | International Business Machines Corporation | Method of distributing related documents to identified end users in an information processing system |
US5101345A (en) * | 1988-11-29 | 1992-03-31 | International Business Machines Inc. | Method of filing stapled documents with a staple relationship involving one or more application programs |
US5179718A (en) * | 1988-11-29 | 1993-01-12 | International Business Machines Corporation | Method of filing having a directed relationship through defining a staple relationship within the context of a folder document |
US5353132A (en) * | 1989-02-06 | 1994-10-04 | Canon Kabushiki Kaisha | Image processing device |
JPH032979A (en) * | 1989-05-31 | 1991-01-09 | Toshiba Corp | Method and device for correction of image |
US5133024A (en) * | 1989-10-24 | 1992-07-21 | Horst Froessl | Image data bank system with selective conversion |
EP0424803B1 (en) * | 1989-10-24 | 1997-07-16 | FROESSL, Horst | Method for at least partially transforming image data into text with provision for subsequent storage or further processing |
JPH03202967A (en) * | 1989-12-28 | 1991-09-04 | Toshiba Corp | Electronic filing device |
US5344132A (en) * | 1990-01-16 | 1994-09-06 | Digital Image Systems | Image based document processing and information management system and apparatus |
US5191525A (en) * | 1990-01-16 | 1993-03-02 | Digital Image Systems, Corporation | System and method for extraction of data from documents for subsequent processing |
US5396588A (en) * | 1990-07-03 | 1995-03-07 | Froessl; Horst | Data processing using digitized images |
US5109439A (en) * | 1990-06-12 | 1992-04-28 | Horst Froessl | Mass document storage and retrieval system |
US5444840A (en) * | 1990-06-12 | 1995-08-22 | Froessl; Horst | Multiple image font processing |
US5224181A (en) * | 1990-10-10 | 1993-06-29 | Fuji Xerox Co., Ltd. | Image processor |
JP2735698B2 (en) * | 1991-01-21 | 1998-04-02 | 富士通株式会社 | Interface verification processing method |
US5258855A (en) * | 1991-03-20 | 1993-11-02 | System X, L. P. | Information processing methodology |
US6683697B1 (en) * | 1991-03-20 | 2004-01-27 | Millenium L.P. | Information processing methodology |
US5267047A (en) * | 1991-04-30 | 1993-11-30 | International Business Machines Corporation | Apparatus and method of operation for a facsimilie subsystem in an image archiving system |
JPH0644320A (en) * | 1991-05-14 | 1994-02-18 | Sony Corp | Information retrieval system |
US5926565A (en) * | 1991-10-28 | 1999-07-20 | Froessl; Horst | Computer method for processing records with images and multiple fonts |
EP0538812A2 (en) * | 1991-10-21 | 1993-04-28 | FROESSL, Horst | Multiple editing and non-edit approaches for image font processing of records |
US5875263A (en) * | 1991-10-28 | 1999-02-23 | Froessl; Horst | Non-edit multiple image font processing of records |
US5544045A (en) * | 1991-10-30 | 1996-08-06 | Canon Inc. | Unified scanner computer printer |
JP3191057B2 (en) * | 1991-11-22 | 2001-07-23 | 株式会社日立製作所 | Method and apparatus for processing encoded image data |
US8352400B2 (en) | 1991-12-23 | 2013-01-08 | Hoffberg Steven M | Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore |
US5586240A (en) * | 1992-03-11 | 1996-12-17 | Genesis Software, Inc. | Image generation and retrieval system integrated with arbitrary application using layered interface |
US5579407A (en) * | 1992-04-21 | 1996-11-26 | Murez; James D. | Optical character classification |
US5235654A (en) * | 1992-04-30 | 1993-08-10 | International Business Machines Corporation | Advanced data capture architecture data processing system and method for scanned images of document forms |
US5710844A (en) * | 1992-05-27 | 1998-01-20 | Apple Computer | Method for searching and displaying results in a pen-based computer system |
US5764818A (en) * | 1992-05-27 | 1998-06-09 | Apple Computer, Inc. | Method for locating and displaying information in a pointer-based computer system |
US5477343A (en) * | 1992-06-18 | 1995-12-19 | Anacomp, Inc. | Micrographic reader with digitized image |
US5987149A (en) | 1992-07-08 | 1999-11-16 | Uniscore Incorporated | Method for scoring and control of scoring open-ended assessments using scorers in diverse locations |
US5721788A (en) | 1992-07-31 | 1998-02-24 | Corbis Corporation | Method and system for digital image signatures |
US5437554A (en) | 1993-02-05 | 1995-08-01 | National Computer Systems, Inc. | System for providing performance feedback to test resolvers |
US5613019A (en) * | 1993-05-20 | 1997-03-18 | Microsoft Corporation | System and methods for spacing, storing and recognizing electronic representations of handwriting, printing and drawings |
US6587587B2 (en) | 1993-05-20 | 2003-07-01 | Microsoft Corporation | System and methods for spacing, storing and recognizing electronic representations of handwriting, printing and drawings |
US5526447A (en) * | 1993-07-26 | 1996-06-11 | Cognitronics Imaging Systems, Inc. | Batched character image processing |
DE69432480T2 (en) * | 1993-11-18 | 2004-03-18 | Digimarc Corp., Tualatin | IDENTIFICATION / CERTIFICATION CODING METHOD AND DEVICE |
US6983051B1 (en) | 1993-11-18 | 2006-01-03 | Digimarc Corporation | Methods for audio watermarking and decoding |
US5841978A (en) | 1993-11-18 | 1998-11-24 | Digimarc Corporation | Network linking method using steganographically embedded data objects |
US6580819B1 (en) | 1993-11-18 | 2003-06-17 | Digimarc Corporation | Methods of producing security documents having digitally encoded data and documents employing same |
US5822436A (en) * | 1996-04-25 | 1998-10-13 | Digimarc Corporation | Photographic products and methods employing embedded information |
US5748783A (en) * | 1995-05-08 | 1998-05-05 | Digimarc Corporation | Method and apparatus for robust information coding |
US6611607B1 (en) | 1993-11-18 | 2003-08-26 | Digimarc Corporation | Integrating digital watermarks in multimedia content |
US6944298B1 (en) | 1993-11-18 | 2005-09-13 | Digimare Corporation | Steganographic encoding and decoding of auxiliary codes in media signals |
US6122403A (en) | 1995-07-27 | 2000-09-19 | Digimarc Corporation | Computer system linked by using information in data objects |
US6424725B1 (en) | 1996-05-16 | 2002-07-23 | Digimarc Corporation | Determining transformations of media signals with embedded code signals |
US5636292C1 (en) * | 1995-05-08 | 2002-06-18 | Digimarc Corp | Steganography methods employing embedded calibration data |
US5841886A (en) * | 1993-11-18 | 1998-11-24 | Digimarc Corporation | Security system for photographic identification |
US6449377B1 (en) | 1995-05-08 | 2002-09-10 | Digimarc Corporation | Methods and systems for watermark processing of line art images |
US5768426A (en) | 1993-11-18 | 1998-06-16 | Digimarc Corporation | Graphics processing system employing embedded code signals |
US6516079B1 (en) | 2000-02-14 | 2003-02-04 | Digimarc Corporation | Digital watermark screening and detecting strategies |
US5710834A (en) * | 1995-05-08 | 1998-01-20 | Digimarc Corporation | Method and apparatus responsive to a code signal conveyed through a graphic image |
US5862260A (en) * | 1993-11-18 | 1999-01-19 | Digimarc Corporation | Methods for surveying dissemination of proprietary empirical data |
US6614914B1 (en) | 1995-05-08 | 2003-09-02 | Digimarc Corporation | Watermark embedder and reader |
US7044395B1 (en) | 1993-11-18 | 2006-05-16 | Digimarc Corporation | Embedding and reading imperceptible codes on objects |
US5748763A (en) | 1993-11-18 | 1998-05-05 | Digimarc Corporation | Image steganography system featuring perceptually adaptive and globally scalable signal embedding |
USRE40919E1 (en) * | 1993-11-18 | 2009-09-22 | Digimarc Corporation | Methods for surveying dissemination of proprietary empirical data |
US7171016B1 (en) | 1993-11-18 | 2007-01-30 | Digimarc Corporation | Method for monitoring internet dissemination of image, video and/or audio files |
US5832119C1 (en) * | 1993-11-18 | 2002-03-05 | Digimarc Corp | Methods for controlling systems using control signals embedded in empirical data |
US6408082B1 (en) | 1996-04-25 | 2002-06-18 | Digimarc Corporation | Watermark detection using a fourier mellin transform |
CA2134255C (en) * | 1993-12-09 | 1999-07-13 | Hans Peter Graf | Dropped-form document image compression |
US5748780A (en) * | 1994-04-07 | 1998-05-05 | Stolfo; Salvatore J. | Method and apparatus for imaging, image processing and data compression |
US6968057B2 (en) | 1994-03-17 | 2005-11-22 | Digimarc Corporation | Emulsion products and imagery employing steganography |
US6869023B2 (en) | 2002-02-12 | 2005-03-22 | Digimarc Corporation | Linking documents through digital watermarking |
US7039214B2 (en) | 1999-11-05 | 2006-05-02 | Digimarc Corporation | Embedding watermark components during separate printing stages |
US6522770B1 (en) | 1999-05-19 | 2003-02-18 | Digimarc Corporation | Management of documents and other objects using optical devices |
US20020082043A1 (en) * | 1994-05-19 | 2002-06-27 | Kari-Pekka Wilska | Device for personal communications, data collection and data processing, and a circuit card |
US5502637A (en) * | 1994-06-15 | 1996-03-26 | Thomson Shared Services, Inc. | Investment research delivery system |
US6560349B1 (en) | 1994-10-21 | 2003-05-06 | Digimarc Corporation | Audio monitoring using steganographic information |
US5848413A (en) * | 1995-01-13 | 1998-12-08 | Ricoh Company, Ltd. | Method and apparatus for accessing and publishing electronic documents |
US5873077A (en) * | 1995-01-13 | 1999-02-16 | Ricoh Corporation | Method and apparatus for searching for and retrieving documents using a facsimile machine |
US7486799B2 (en) | 1995-05-08 | 2009-02-03 | Digimarc Corporation | Methods for monitoring audio and images on the internet |
US6728390B2 (en) | 1995-05-08 | 2004-04-27 | Digimarc Corporation | Methods and systems using multiple watermarks |
US6721440B2 (en) | 1995-05-08 | 2004-04-13 | Digimarc Corporation | Low visibility watermarks using an out-of-phase color |
US6760463B2 (en) | 1995-05-08 | 2004-07-06 | Digimarc Corporation | Watermarking methods and media |
US7006661B2 (en) | 1995-07-27 | 2006-02-28 | Digimarc Corp | Digital watermarking systems and methods |
US6408331B1 (en) | 1995-07-27 | 2002-06-18 | Digimarc Corporation | Computer linking methods using encoded graphics |
US6411725B1 (en) | 1995-07-27 | 2002-06-25 | Digimarc Corporation | Watermark enabled video objects |
US6577746B1 (en) | 1999-12-28 | 2003-06-10 | Digimarc Corporation | Watermark-based object linking and embedding |
US6788800B1 (en) | 2000-07-25 | 2004-09-07 | Digimarc Corporation | Authenticating objects using embedded data |
US6965682B1 (en) | 1999-05-19 | 2005-11-15 | Digimarc Corp | Data transmission by watermark proxy |
US6829368B2 (en) | 2000-01-26 | 2004-12-07 | Digimarc Corporation | Establishing and interacting with on-line media collections using identifiers in media signals |
JP4356847B2 (en) | 1995-11-10 | 2009-11-04 | 万太郎 矢島 | Field definition information generation method, line and field definition information generation device |
US6381341B1 (en) | 1996-05-16 | 2002-04-30 | Digimarc Corporation | Watermark encoding method exploiting biases inherent in original signal |
US5848386A (en) * | 1996-05-28 | 1998-12-08 | Ricoh Company, Ltd. | Method and system for translating documents using different translation resources for different portions of the documents |
US5956468A (en) * | 1996-07-12 | 1999-09-21 | Seiko Epson Corporation | Document segmentation system |
HUP0100603A2 (en) * | 1997-01-13 | 2001-06-28 | John Overton | Universal system for image archiving and method for universally tracking images |
US6192165B1 (en) * | 1997-12-30 | 2001-02-20 | Imagetag, Inc. | Apparatus and method for digital filing |
US6804376B2 (en) | 1998-01-20 | 2004-10-12 | Digimarc Corporation | Equipment employing watermark-based authentication function |
AUPP424798A0 (en) | 1998-06-19 | 1998-07-16 | Canon Kabushiki Kaisha | Apparatus and method for copying selected region(s) of documents |
US7103640B1 (en) | 1999-09-14 | 2006-09-05 | Econnectix, Llc | Network distributed tracking wire transfer protocol |
US7233978B2 (en) * | 1998-07-08 | 2007-06-19 | Econnectix, Llc | Method and apparatus for managing location information in a network separate from the data to which the location information pertains |
US7966078B2 (en) | 1999-02-01 | 2011-06-21 | Steven Hoffberg | Network media appliance system and method |
US6625297B1 (en) | 2000-02-10 | 2003-09-23 | Digimarc Corporation | Self-orienting watermarks |
US6674923B1 (en) * | 2000-03-28 | 2004-01-06 | Eastman Kodak Company | Method and system for locating and accessing digitally stored images |
US7027614B2 (en) | 2000-04-19 | 2006-04-11 | Digimarc Corporation | Hiding information to reduce or offset perceptible artifacts |
US6804377B2 (en) | 2000-04-19 | 2004-10-12 | Digimarc Corporation | Detecting information hidden out-of-phase in color channels |
US20080005275A1 (en) * | 2000-06-02 | 2008-01-03 | Econnectix, Llc | Method and apparatus for managing location information in a network separate from the data to which the location information pertains |
US6810232B2 (en) | 2001-03-05 | 2004-10-26 | Ncs Pearson, Inc. | Test processing workflow tracking system |
US6751351B2 (en) | 2001-03-05 | 2004-06-15 | Nsc Pearson, Inc. | Test question response verification system |
US6675133B2 (en) | 2001-03-05 | 2004-01-06 | Ncs Pearsons, Inc. | Pre-data-collection applications test processing system |
US6961482B2 (en) * | 2001-03-05 | 2005-11-01 | Ncs Pearson, Inc. | System for archiving electronic images of test question responses |
AUPR399601A0 (en) * | 2001-03-27 | 2001-04-26 | Silverbrook Research Pty. Ltd. | An apparatus and method(ART108) |
US6999204B2 (en) * | 2001-04-05 | 2006-02-14 | Global 360, Inc. | Document processing using color marking |
DK1456810T3 (en) | 2001-12-18 | 2011-07-18 | L 1 Secure Credentialing Inc | Multiple image security features to identify documents and methods of producing them |
EP1459246B1 (en) | 2001-12-24 | 2012-05-02 | L-1 Secure Credentialing, Inc. | Method for full color laser marking of id documents |
US7728048B2 (en) | 2002-12-20 | 2010-06-01 | L-1 Secure Credentialing, Inc. | Increasing thermal conductivity of host polymer used with laser engraving methods and compositions |
US7694887B2 (en) | 2001-12-24 | 2010-04-13 | L-1 Secure Credentialing, Inc. | Optically variable personalized indicia for identification documents |
EP1459239B1 (en) | 2001-12-24 | 2012-04-04 | L-1 Secure Credentialing, Inc. | Covert variable information on id documents and methods of making same |
US20040008223A1 (en) * | 2002-03-16 | 2004-01-15 | Catherine Britton | Electronic healthcare management form navigation |
US7590932B2 (en) * | 2002-03-16 | 2009-09-15 | Siemens Medical Solutions Usa, Inc. | Electronic healthcare management form creation |
US7824029B2 (en) | 2002-05-10 | 2010-11-02 | L-1 Secure Credentialing, Inc. | Identification card printer-assembler for over the counter card issuing |
US20040064472A1 (en) * | 2002-09-27 | 2004-04-01 | Oetringer Eugen H. | Method and system for information management |
US7804982B2 (en) | 2002-11-26 | 2010-09-28 | L-1 Secure Credentialing, Inc. | Systems and methods for managing and detecting fraud in image databases used with identification documents |
US7712673B2 (en) | 2002-12-18 | 2010-05-11 | L-1 Secure Credentialing, Inc. | Identification document with three dimensional image of bearer |
ATE491190T1 (en) | 2003-04-16 | 2010-12-15 | L 1 Secure Credentialing Inc | THREE-DIMENSIONAL DATA STORAGE |
US7707039B2 (en) | 2004-02-15 | 2010-04-27 | Exbiblio B.V. | Automatic modification of web pages |
US8442331B2 (en) | 2004-02-15 | 2013-05-14 | Google Inc. | Capturing text from rendered documents using supplemental information |
US7812860B2 (en) | 2004-04-01 | 2010-10-12 | Exbiblio B.V. | Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device |
US20060041484A1 (en) | 2004-04-01 | 2006-02-23 | King Martin T | Methods and systems for initiating application processes by data capture from rendered documents |
US10635723B2 (en) | 2004-02-15 | 2020-04-28 | Google Llc | Search engines and systems with handheld document data capture devices |
US7744002B2 (en) | 2004-03-11 | 2010-06-29 | L-1 Secure Credentialing, Inc. | Tamper evident adhesive and identification document including same |
US8621349B2 (en) | 2004-04-01 | 2013-12-31 | Google Inc. | Publishing techniques for adding value to a rendered document |
US20060081714A1 (en) | 2004-08-23 | 2006-04-20 | King Martin T | Portable scanning device |
US9116890B2 (en) | 2004-04-01 | 2015-08-25 | Google Inc. | Triggering actions in response to optically or acoustically capturing keywords from a rendered document |
US7894670B2 (en) | 2004-04-01 | 2011-02-22 | Exbiblio B.V. | Triggering actions in response to optically or acoustically capturing keywords from a rendered document |
US20060098900A1 (en) | 2004-09-27 | 2006-05-11 | King Martin T | Secure data gathering from rendered documents |
US20080313172A1 (en) | 2004-12-03 | 2008-12-18 | King Martin T | Determining actions involving captured information and electronic content associated with rendered documents |
US9143638B2 (en) | 2004-04-01 | 2015-09-22 | Google Inc. | Data capture from rendered documents using handheld device |
US7990556B2 (en) | 2004-12-03 | 2011-08-02 | Google Inc. | Association of a portable scanner with input/output and storage devices |
WO2008028674A2 (en) | 2006-09-08 | 2008-03-13 | Exbiblio B.V. | Optical scanners, such as hand-held optical scanners |
US8146156B2 (en) | 2004-04-01 | 2012-03-27 | Google Inc. | Archive of text captures from rendered documents |
US8793162B2 (en) | 2004-04-01 | 2014-07-29 | Google Inc. | Adding information or functionality to a rendered document via association with an electronic counterpart |
US8713418B2 (en) | 2004-04-12 | 2014-04-29 | Google Inc. | Adding value to a rendered document |
US8620083B2 (en) | 2004-12-03 | 2013-12-31 | Google Inc. | Method and system for character recognition |
US8874504B2 (en) | 2004-12-03 | 2014-10-28 | Google Inc. | Processing techniques for visual capture data from a rendered document |
US8489624B2 (en) | 2004-05-17 | 2013-07-16 | Google, Inc. | Processing techniques for text capture from a rendered document |
US9460346B2 (en) | 2004-04-19 | 2016-10-04 | Google Inc. | Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device |
US8346620B2 (en) | 2004-07-19 | 2013-01-01 | Google Inc. | Automatic modification of web pages |
US20060241977A1 (en) * | 2005-04-22 | 2006-10-26 | Fitzgerald Loretta A | Patient medical data graphical presentation system |
US20110096174A1 (en) * | 2006-02-28 | 2011-04-28 | King Martin T | Accessing resources based on capturing information from a rendered document |
US20070250532A1 (en) * | 2006-04-21 | 2007-10-25 | Eastman Kodak Company | Method for automatically generating a dynamic digital metadata record from digitized hardcopy media |
US7930226B1 (en) | 2006-07-24 | 2011-04-19 | Intuit Inc. | User-driven document-based data collection |
US8396331B2 (en) * | 2007-02-26 | 2013-03-12 | Microsoft Corporation | Generating a multi-use vocabulary based on image data |
US8638363B2 (en) | 2009-02-18 | 2014-01-28 | Google Inc. | Automatically capturing information, such as capturing information using a document-aware device |
US7882091B2 (en) * | 2008-01-09 | 2011-02-01 | Stephen Schneider | Record tagging, storage and filtering system and method |
WO2010105246A2 (en) | 2009-03-12 | 2010-09-16 | Exbiblio B.V. | Accessing resources based on capturing information from a rendered document |
US8447066B2 (en) | 2009-03-12 | 2013-05-21 | Google Inc. | Performing actions based on capturing information from rendered documents, such as documents under copyright |
US9081799B2 (en) | 2009-12-04 | 2015-07-14 | Google Inc. | Using gestalt information to identify locations in printed information |
US9323784B2 (en) | 2009-12-09 | 2016-04-26 | Google Inc. | Image search using text-based elements within the contents of images |
US8204805B2 (en) | 2010-10-28 | 2012-06-19 | Intuit Inc. | Instant tax return preparation |
US9558521B1 (en) | 2010-07-29 | 2017-01-31 | Intuit Inc. | System and method for populating a field on a form including remote field level data capture |
US9058352B2 (en) | 2011-09-22 | 2015-06-16 | Cerner Innovation, Inc. | System for dynamically and quickly generating a report and request for quotation |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US2877951A (en) * | 1956-12-31 | 1959-03-17 | Ibm | Character sensing system |
US3181119A (en) * | 1960-11-30 | 1965-04-27 | Control Data Corp | Reading machine output controller responsive to reject signals |
US3271738A (en) * | 1963-08-13 | 1966-09-06 | Ibm | Operator assisted character reading system |
US3273130A (en) * | 1963-12-04 | 1966-09-13 | Ibm | Applied sequence identification device |
US3582886A (en) * | 1967-10-03 | 1971-06-01 | Ibm | Scanning address generator for computer-controlled character reader |
US3553646A (en) * | 1967-10-03 | 1971-01-05 | Ibm | Format control in a character recognition system |
GB1243969A (en) * | 1967-11-15 | 1971-08-25 | Emi Ltd | Improvements relating to pattern recognition devices |
US3540012A (en) * | 1967-12-26 | 1970-11-10 | Sperry Rand Corp | Crt display editing circuit |
US3536950A (en) * | 1968-07-15 | 1970-10-27 | Ibm | Calibration error detection and correction in a document reading system |
US3629828A (en) * | 1969-05-07 | 1971-12-21 | Ibm | System having scanner controlled by video clipping level and recognition exception routines |
US3571797A (en) * | 1969-06-02 | 1971-03-23 | Ibm | Area-format control in a character-recognition system |
US3753240A (en) * | 1971-03-08 | 1973-08-14 | Dynamic Information Systems | Data entry and retrieval composite display system |
US3781799A (en) * | 1972-01-03 | 1973-12-25 | Ibm | Control system employing microprogram discrete logic control routines |
US4001787A (en) * | 1972-07-17 | 1977-01-04 | International Business Machines Corporation | Data processor for pattern recognition and the like |
GB1487507A (en) * | 1975-12-29 | 1977-10-05 | Ibm | Information retrieval system |
US4121196A (en) * | 1977-05-02 | 1978-10-17 | The United States Of America As Represented By The Secretary Of The Army | Data base update scheme |
US4273440A (en) * | 1977-08-30 | 1981-06-16 | Horst Froessl | Method and apparatus for data collection and preparation |
US4264808A (en) * | 1978-10-06 | 1981-04-28 | Ncr Corporation | Method and apparatus for electronic image processing of documents for accounting purposes |
US4408181A (en) * | 1979-04-10 | 1983-10-04 | Tokyo Shibaura Denki Kabushiki Kaisha | Document data filing/retrieval system |
DE3101543A1 (en) * | 1981-01-20 | 1982-08-26 | Licentia Patent-Verwaltungs-Gmbh, 6000 Frankfurt | "OFFICE COMMUNICATION SYSTEM" |
Date | Application | Publication | Status |
---|---|---|---|
1983-05-31 | US06/499,500 | US4553261A | Not active: Expired - Lifetime |
1984-05-28 | DE8484901956T | DE3475255D1 | Not active: Expired |
1984-05-28 | EP84901956A | EP0144361B1 | Not active: Expired |
1984-05-28 | AU29612/84A | AU2961284A | Not active: Abandoned |
1984-05-28 | PCT/CH1984/000085 | WO1984004864A1 | Active: IP Right Grant |
1984-05-30 | CA000455458A | CA1208797A | Not active: Expired |
1984-05-31 | IT21201/84A | IT1176221B | Active |
Also Published As
Publication number | Publication date |
---|---|
US4553261A (en) | 1985-11-12 |
IT8421201A0 (en) | 1984-05-31 |
EP0144361B1 (en) | 1988-11-17 |
IT8421201A1 (en) | 1985-12-01 |
IT1176221B (en) | 1987-08-18 |
AU2961284A (en) | 1984-12-18 |
DE3475255D1 (en) | 1988-12-22 |
EP0144361A1 (en) | 1985-06-19 |
WO1984004864A1 (en) | 1984-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA1208797A (en) | Document and data handling and retrieval system | |
US5926565A (en) | Computer method for processing records with images and multiple fonts | |
US5339412A (en) | Electronic filing system using a mark on each page of the document for building a database with respect to plurality of multi-page documents | |
US6023528A (en) | Non-edit multiple image font processing of records | |
US5903904A (en) | Iconic paper for alphabetic, japanese and graphic documents | |
US5133024A (en) | Image data bank system with selective conversion | |
JP4118349B2 (en) | Document selection method and document server | |
EP0435316B1 (en) | Image information recording apparatus | |
US5179649A (en) | Method for generating title information entry format and method for filing images in image filing device | |
US6697056B1 (en) | Method and system for form recognition | |
JP3478681B2 (en) | Document information management system | |
US5134669A (en) | Image processing system for documentary data | |
US5001769A (en) | Image processing system | |
CA2128583C (en) | Source verification using images | |
JP4260790B2 (en) | Filing / retrieval apparatus and filing / retrieval method | |
EP0629078B1 (en) | Apparatus for processing and reproducing image information | |
JP4053100B2 (en) | Document information management system and document information management method | |
US9454696B2 (en) | Dynamically generating table of contents for printable or scanned content | |
US4837737A (en) | System for detecting origin of proprietary documents generated by an apparatus for processing information such as words, figures and pictures | |
EP1672473A2 (en) | Stamp sheet | |
EP0833276A2 (en) | Image forming apparatus and image forming system | |
JP3604483B2 (en) | Document information management system and document information management method | |
KR860001012B1 (en) | Ideographic coder | |
US5719960A (en) | System for dispatching task orders into a user network and method | |
US5854860A (en) | Image filing apparatus having a character recognition function |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MKEX | Expiry |