US 20060062461 A1
A handwritten Chinese character input method and system is provided to allow users to enter Chinese characters into a data processor by adding fewer than three strokes and making one selection movement, such as a mouse click or a stylus or finger tap. The system is interactive, predictive, and intuitive to use. By adding the first one or two strokes used to write a Chinese character, or in some cases no strokes at all, users can find a desired character in a list of characters. The list is context sensitive. It varies depending on the prior character entered. Compared to other existing systems, this system can save users considerable time and effort in entering handwritten characters.
1. A Chinese character handwriting input system, comprising:
recognition means for recognizing a category of handwriting stroke from a predefined number of stroke categories;
recognition means for recognizing one or more categories of handwriting stroke from a predefined number of stroke categories;
collection means for organizing a list of characters that commonly start with said more than one recognized category of handwriting stroke, said list of characters being displayed in a predefined sequence, wherein said predefined sequence is based on any of:
number of strokes necessary to write out a character;
use frequency of a character; and
contextual relation to the last character entered; and
selection means for selecting a desired character from said list of characters.
2. The system of
wildcard entry means for matching any stroke category.
3. A Chinese character handwriting input system, comprising:
recognition means for recognizing a category of handwriting stroke from a predefined number of stroke categories; and
collection means for organizing a list of characters that commonly start with one or more recognized categories of handwriting stroke, said list of characters being displayed in a predefined sequence, wherein said predefined sequence is based on any of:
number of strokes necessary to write out a character;
use frequency of a character; and
contextual relation to the last character entered;
selection means for selecting a desired character from said list of characters;
wherein said predetermined number of stroke categories comprise more than five basic categories.
4. A method for inputting handwritten Chinese characters, comprising the steps of:
adding a stroke into a pattern recognition system;
categorizing said added stroke into one of a predefined number of stroke categories;
finding characters based on frequency of character use; and
displaying a list of found characters.
5. The method of
if a desired character is in said list, selecting said desired character from said list;
if a desired character is not visible in said list, adding another stroke; and
displaying another list of found characters.
6. The method
displaying a numeric or iconic representation for a stroke that is added; and
displaying full stroke numeric or iconic representation for a character that is selected.
7. The method of
if a desired character is in said list, either of selecting said desired character from said list or adding another stroke and displaying another list of found characters.
8. The method of
retaining an ink trail of each stroke that is added until a character is selected.
9. The method of
color coding each ink trail either to indicate a level of confidence or differentiation in said categorization step.
10. The method of
prompting a user to clarify between ambiguous stroke interpretations and/or to remedy a stroke's misinterpretation.
11. The method of
providing means for removing one or more strokes of an input stroke sequence in reverse order.
12. The method of
providing means for matching any of Latin letters, punctuation symbols, and emoticons with predefined or user-defined stroke sequences.
13. The method of
selecting a character from said list with a user gesture;
wherein said user gesture allows said user to begin entry of strokes for a next character.
14. The method of
providing user-defined gestures for any of stroke categories, sequences of strokes, and character components.
15. The method of
providing means for explicit selection of stroke categories.
16. The method of
displaying character components that start with one or more recognized stroke categories;
wherein selecting a character component results in the display of only the characters containing or starting with said selected component.
17. The method of
allowing alternative stroke sequences for character or character component entry.
18. The method of
finding said characters based on context.
This application is a continuation of U.S. patent application Ser. No. 10/205,950, filed Jul. 25, 2002.
1. Technical Field
This invention relates generally to text input technology. More particularly, the invention relates to a method and system that allows users to input handwritten Chinese characters to a data processor by entering the first few strokes required to write a character, so that users can perform character input tasks in a fast, predictive way.
2. Description of the Prior Art
Around the globe, over 1.2 billion people speak Chinese. This includes the People's Republic of China, Taiwan, Singapore, and a large community of overseas Chinese in Asia and North America. Chinese character strokes and symbols are so different and so complicated that they can be sorted and grouped in a wide variety of ways. One can analytically sort out as many as 35-40 strokes of 4-10 symbols or more per Chinese character, depending on how they are grouped. Because of this unique structure of the Chinese language, computer users cannot input Chinese characters using alphabetic keyboards as easily as they can input Western languages.
A number of methods and systems for inputting Chinese characters to screen, such as the Three Corners method, Goo Coding System, 5-Stroke method, Changjie's Input scheme, etc., have been developed. However, none of these input methods provides an easy to use, standardized input/output scheme that speeds up the retrieval and typewriting process by taking full advantage of computer technology.
Several other methods and systems for inputting handwritten Chinese characters are also known. For example, Apple Computer and the Institute of System Science in Singapore (Apple-ISS) have developed a system which features an application for dictation and a handwriting input method for Chinese. This system incorporates a dictionary assistance service wherein, when a first character is recognized, the device displays a list of phrases based on the first character and the user may select the proper phrase without inputting any stroke. This technique effectively increases the input speed.
Another example is Synaptics' QuickStroke system which incorporates a prediction function based on a highly sophisticated neural network engine. This is not a graphics capture application where the users have to write out the entire character before the software can recognize which character is intended. Instead, it can recognize a character after only three to six strokes of the character have been written. It can be used with a standard mouse, Synaptics TouchPad™, or a Synaptics pen input TouchPad.
Another example is Zi Corporation's text input solutions based on an intelligent indexing engine which intuitively predicts and displays desired candidates. The solutions also include powerful personalization and learning capabilities—providing prediction of user-created terms and frequently used vocabulary.
It would be advantageous to provide a handwritten Chinese character input method and system to allow users to enter Chinese characters to a data processor by drawing just the first few strokes and one selection movement such as mouse clicking or stylus or finger tapping.
A handwritten Chinese character input method and system is provided to allow users to enter Chinese characters to a data processor by drawing just the first few strokes and making one selection movement, such as a mouse click or a stylus or finger tap. The system is interactive, predictive, and intuitive to use. By adding the first one or two strokes used to write a Chinese character, users can find a desired character in a list of characters. The list is context sensitive, so in some cases no strokes are needed. It varies depending on the prior character entered. The system puts the handwritten-stroke-to-category mapping on top of the stroke category matching technology, including an optional “Match any stroke category” key or gesture. Compared to other existing systems, this system can save users considerable time and effort in entering handwritten characters.
In one preferred embodiment, the handwritten Chinese character input system includes: (1) recognition means for recognizing a category of handwriting stroke from a list of stroke categories; (2) collection means for organizing a list of characters that commonly start with one or more recognized categories of handwriting strokes, the list of characters being displayed in a predetermined sequence; and (3) selection means for selecting a desired character from the list of characters.
In a typical embodiment, the strokes are classified into five basic categories, each having one or more sub-categories. The collection means contains predefined stroke order information. It also contains a display means to display a list of most frequently used characters when no strokes are entered, while strokes are being entered, and/or after a character is selected. The list of most frequently used characters is context sensitive. It varies depending upon the last Chinese character entered. The predetermined sequence may be based on any of: (1) number of strokes necessary to write out a character; (2) use frequency of a character; and (3) contextual relation to the last character entered.
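The three ordering criteria can be illustrated with a minimal sketch. It assumes one possible way of combining them (context first, then use frequency, then stroke count); the text only says the sequence may be based on any of the three. The `Candidate` type, the `rank_candidates` function, and the frequency values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    char: str
    stroke_count: int  # strokes needed to write out the character
    frequency: int     # hypothetical usage count, illustrative only


def rank_candidates(candidates, follows_last=frozenset()):
    """Order candidates for display.

    Assumed priority: characters that commonly follow the last entered
    character come first, then higher use frequency, then fewer strokes.
    """
    return sorted(
        candidates,
        key=lambda c: (c.char not in follows_last, -c.frequency, c.stroke_count),
    )


candidates = [
    Candidate("人", 2, 900),
    Candidate("八", 2, 300),
    Candidate("入", 2, 400),
]

# Suppose the context says that "入" commonly follows the last character.
ranked = rank_candidates(candidates, follows_last={"入"})
print([c.char for c in ranked])
```

With no context supplied, the same function falls back to frequency and then stroke count, which corresponds to the list shown when no prior character has been entered.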
The selection means is associated with any of: (1) mouse clicking; (2) stylus tapping; (3) finger tapping; and (4) button/key pressing.
The system also contains “stroke entry means,” such as an LCD touchscreen, stylus or finger pad, trackball, data glove, or other touch-sensitive (possibly flexible) surface.
The system may further include means for displaying a numeric or iconic representation of each stroke that is entered and a full numeric or iconic representation of strokes for a Chinese character that is selected.
According to the preferred embodiment, a method for inputting handwritten Chinese characters includes the following steps:
The method may further comprise the steps of:
As an alternative, the method may comprise the steps of:
The Stroke Recognition Interface 20 has three basic areas: a Message Display Area 28, a Selection List Area 26, and a Stroke Input Area 22.
Message Display Area 28 is the place where the selected characters are displayed. It represents an email or SMS message, or whatever application intends to use the generated text.
Selection List Area 26 is the place to display the most common character choices for the strokes currently entered on the stroke input window. This area may also list common characters that follow the last character in the Message Display Area 28, that also begin with the strokes entered in the Stroke Input Area 22.
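How the Selection List Area might be populated can be sketched as a prefix lookup over stored stroke orders, with characters that commonly follow the last entered character moved to the front. This is only an illustration under stated assumptions: the category numbering (1 horizontal, 2 vertical, 3 left-falling, 4 dot or right-falling, 5 turning), the tiny `STROKE_ORDER` table, the `FOLLOWERS` table, and the `selection_list` function are all hypothetical and not taken from the described system.

```python
# Hypothetical stroke-order table: character -> sequence of stroke categories.
STROKE_ORDER = {
    "十": (1, 2),
    "土": (1, 2, 1),
    "木": (1, 2, 3, 4),
    "大": (1, 3, 4),
    "天": (1, 1, 3, 4),
    "人": (3, 4),
}

# Hypothetical context table: last character -> characters that often follow it.
FOLLOWERS = {"今": ["天"]}


def selection_list(strokes, last_char=None):
    """Return characters whose stroke order starts with the entered strokes.

    Characters that commonly follow `last_char` are listed first; the
    remaining order is the table order (sorted() is stable).
    """
    prefix = tuple(strokes)
    matches = [ch for ch, seq in STROKE_ORDER.items()
               if seq[:len(prefix)] == prefix]
    preferred = FOLLOWERS.get(last_char, [])
    return sorted(matches, key=lambda ch: ch not in preferred)


print(selection_list([1]))                  # one horizontal stroke entered
print(selection_list([1], last_char="今"))  # same stroke, with context
print(selection_list([1, 2]))               # a second stroke narrows the list
```

An empty stroke list matches every entry, which mirrors the case where the list is shown before any stroke has been drawn.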
Stroke Input Area 22 is the heart of the Stroke Recognition Interface 20. The user begins drawing a character onscreen in this area, using an Input Device 24 such as a stylus, a finger, or a mouse, depending on the input device and display device used. The display device echoes and retains each stroke (an “ink trail”) until the character is selected.
Stroke Recognition Interface 20 may further include a Stroke Number Display Area to display the interface's interpretation, either numeric or iconic, of the strokes entered by the user. When a character is selected, the full stroke representation, either by numbers or by icons, is displayed here. This area is optional, but could be useful for helping users learn stroke orders and stroke categories.
The system may further include: the capability to match Latin letters, punctuation symbols, and emoticons with user-defined stroke sequences; user-defined gestures for predefined stroke categories, and unique gestures representing entire components, sequences, or symbols; learning of and adaptation to the user's handwriting style, skew, or cursive; an optional training session with known characters; optional prompting of the user to clarify between ambiguous stroke interpretations, and/or a means to enter explicit strokes (e.g. via stroke category keys), and/or to remedy a stroke misinterpretation; optional indication of the level of confidence of stroke interpretations, e.g. color-coding each “ink trail” or a smiley-face that frowns when it is uncertain; means to display all strokes that make up a character (e.g. drag & drop from a text editor to the Stroke [Number] Display Area); as well as the ability to delete the last stroke(s), and the corresponding ink trail(s), in reverse order by some means.
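The retained ink trails and the reverse-order deletion of strokes can be modeled with a simple stack. The following is a minimal sketch, not the described implementation; the `StrokeBuffer` class, its method names, and the representation of an ink trail as a list of (x, y) points are assumptions made for illustration.

```python
class StrokeBuffer:
    """Holds the strokes of the character currently being entered."""

    def __init__(self):
        self._strokes = []  # list of (category, ink_trail) pairs, oldest first

    def add(self, category, ink_trail):
        """Record a recognized stroke category together with its ink trail."""
        self._strokes.append((category, ink_trail))

    def delete_last(self, count=1):
        """Remove up to `count` strokes, most recent first, and return them."""
        removed = []
        for _ in range(min(count, len(self._strokes))):
            removed.append(self._strokes.pop())
        return removed

    def categories(self):
        """Stroke categories entered so far, in entry order."""
        return [category for category, _ in self._strokes]

    def clear(self):
        """Discard all strokes and ink trails, e.g. once a character is selected."""
        self._strokes.clear()


buf = StrokeBuffer()
buf.add(1, [(0, 5), (10, 5)])   # a horizontal stroke
buf.add(2, [(5, 0), (5, 10)])   # a vertical stroke
buf.add(3, [(5, 5), (0, 10)])   # a left-falling stroke
removed = buf.delete_last(2)    # undo the two most recent strokes
print(buf.categories(), [category for category, _ in removed])
```

Keeping the ink trail next to its category means that removing a stroke also removes what should be erased from the display.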
The apparatus may have a function to actively display the interface's interpretation, either numeric or iconic, of the strokes entered by the user. Therefore, the method described above may further comprise the steps of:
As an alternative, Step 54 may be replaced by:
One of the major advantages of the recognition system according to this invention is the great reduction of ambiguities arising in the subtle distinction between certain subtypes of the stroke categories. To reduce ambiguities, the subtypes are further defined. For example, a horizontal line with a slight hook upwards is stroke 1; a horizontal line with a slight hook down is stroke 5; a horizontal line angled upwards is stroke 1; a curved line that starts right diagonally and then evens out to horizontal or curves up is stroke 4; and so on.
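The subtype definitions given as examples above amount to a lookup from a recognized stroke shape to a stroke category. A minimal sketch follows, covering only the four examples named in the text. The subtype identifiers and the `categorize` function are hypothetical, and how a raw ink trail is reduced to a subtype is outside the scope of this sketch.

```python
# Subtype -> stroke category, limited to the examples given in the text.
# The subtype names are hypothetical labels for the described shapes.
SUBTYPE_TO_CATEGORY = {
    "horizontal_hook_up": 1,
    "horizontal_hook_down": 5,
    "horizontal_angled_up": 1,
    "right_diagonal_then_flat_or_curved_up": 4,
}


def categorize(subtype):
    """Return the stroke category for a subtype, or None if it is unknown.

    A None result could be used to prompt the user to clarify the stroke.
    """
    return SUBTYPE_TO_CATEGORY.get(subtype)


print(categorize("horizontal_hook_down"))
print(categorize("unrecognized_shape"))
```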
One technique for resolving, or at least limiting, ambiguities, is the use of limited wildcards. These are stroke keys that match with any stroke that fits one type of ambiguity. For example, if the stroke may fit into either stroke category 4 or stroke category 5, the limited wildcard would match both 4 and 5.
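A limited wildcard can be sketched by treating each entered stroke as a set of acceptable categories rather than a single category. The example below is illustrative only: the category numbering, the small `STROKE_ORDER` table, and the names `ANY`, `matches`, and `find` are assumptions. A full wildcard ("match any stroke category") is simply the set of all categories.

```python
ANY = frozenset({1, 2, 3, 4, 5})        # matches any stroke category
FOUR_OR_FIVE = frozenset({4, 5})        # limited wildcard for a 4-vs-5 ambiguity

# Hypothetical stroke-order table: character -> sequence of stroke categories.
STROKE_ORDER = {
    "六": (4, 1, 3, 4),
    "刀": (5, 3),
    "十": (1, 2),
    "万": (1, 5, 3),
}


def matches(entered, stroke_order):
    """True if every entered stroke accepts the character's stroke at that position."""
    if len(entered) > len(stroke_order):
        return False
    return all(category in allowed
               for allowed, category in zip(entered, stroke_order))


def find(entered):
    """Characters whose stroke order begins with the entered (possibly wildcard) strokes."""
    return [ch for ch, seq in STROKE_ORDER.items() if matches(entered, seq)]


print(find([FOUR_OR_FIVE]))                  # first stroke is either 4 or 5
print(find([frozenset({1}), FOUR_OR_FIVE]))  # stroke 1, then 4 or 5
print(find([ANY, frozenset({3})]))           # any first stroke, then stroke 3
```

Representing an unambiguous stroke as a one-element set keeps the matching code identical for exact strokes, limited wildcards, and the full wildcard.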
Often the difference between a stroke of one type and a similar stroke of another type is too subtle for a computer to differentiate. This gets even more confusing when the user is sloppy and curves his straight strokes, or straightens his curved strokes, or gets the angle slightly off.
To account for all of the variation of an individual user, the system may learn the specific idiosyncrasies of its one user, and adapt to fit that person's handwriting style.
The specifics of the exaggeration needed may be determined as appropriate. Key to this aspect of the invention is that the user has to make diagonal strokes very diagonal, straight strokes very straight, curved strokes very curved, and angled strokes very angled.
The result on paper is a character that would look somewhat artificial and a caricature of its intended character. However, this greatly simplifies the disambiguation process for finding the strokes, which then helps the disambiguation of characters.
In the following paragraphs in conjunction with a series of pictorial diagrams, the operation process is described.
In a typical embodiment, the stroke entry means is a handwriting input area displayed on a touchscreen on a PDA. Each entered stroke is recognized as one of a set of stroke categories. The graphical keys, each assigned to a stroke category, are optionally available to display and enter strokes, as an alternative input means. One of the graphical keys represents “match any stroke category”.
The method described above may be carried out by a computer usable medium containing instructions in computer readable form. In other words, the method may be incorporated in a computer program, a logic device, mobile device, or firmware and/or may be downloaded from a network, e.g. a Web site over the Internet. It may be applied in all sorts of text entry.
Although the invention is described herein with reference to some preferred embodiments, one skilled in the art will readily appreciate that other applications may be substituted for those set forth herein without departing from the spirit and scope of the present invention.
Accordingly, the invention should only be limited by the claims included below.