US5917890A - Disambiguation of alphabetic characters in an automated call processing environment - Google Patents

Disambiguation of alphabetic characters in an automated call processing environment

Info

Publication number
US5917890A
US5917890A
Authority
US
United States
Prior art keywords
alphabetic character
character
uttered
candidate
received signal
Prior art date
1995-12-29
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US08/581,716
Inventor
Lynne Shapiro Brotman
James M. Farber
Benjamin J. Stern
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AT&T Corp
Original Assignee
AT&T Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
1995-12-29
Publication date
1999-06-29
Application filed by AT&T Corp
Priority to US08/581,716
Assigned to AT&T Corp (assignment of assignors interest; see document for details; assignor: Stern, Benjamin J.)
Application granted
Publication of US5917890A
Current legal status: Expired - Lifetime

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/26: Devices for calling a subscriber
    • H04M1/27: Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271: Devices whereby a plurality of signals may be stored simultaneously, controlled by voice recognition
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M2201/00: Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40: Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/487: Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493: Interactive information services, e.g. directory enquiries; arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; centralised arrangements for recording messages for absent or busy subscribers; centralised arrangements for recording messages
    • H04M3/53: Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/533: Voice mail systems

Abstract

Automated capture of an uttered alphabetic character is provided by using an input, beyond the uttered alphabetic character, to disambiguate an incorrectly captured character. The input is an indication of a telephone key representing the uttered alphabetic character. The indication can be a dual tone multifrequency signal or an utterance of the number of the telephone key. Alternatively, the input is an indication that the incorrectly captured alphabetic character differs from the uttered alphabetic character.

Description

BACKGROUND OF THE INVENTION
The present invention relates to automated call processing, and, more particularly, is directed to capturing alphabetic characters in an automated call processing environment.
Automated call processing has achieved widespread usage. Applications include call routing, voice mail, directory assistance, order processing, information dissemination and so forth.
However, existing telephone-based services in which a caller interacts with a computer do not capture alphabetic character strings with a high degree of accuracy when the strings comprise letters selected from an unlimited or very large domain, such as names. Since the set of character strings cannot be defined in advance, each string must be spelled as it is captured.
Automatically capturing alphabetic spelled character strings using only voice input is not feasible presently because letter recognition accuracy is too low with available voice recognition technology. For example, it is difficult to automatically distinguish "B" from "P".
Methods of automatically capturing alphabetic spelled character strings using only dual tone multifrequency (DTMF) input from a twelve-key keypad on a telephone set are cumbersome, as each telephone key does not uniquely map to a single alphabetic character. Consequently, multiple inputs per letter are required for disambiguation, e.g., to indicate "K" press "5" twice or press "5", "2". These methods are also error-prone due to the problem of the user accidentally pressing the wrong key or multiple keys and being unaware of the error, the so-called "fat finger" effect.
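For concreteness, here is a minimal sketch of that multiple-press scheme, assuming the standard twelve-key layout of the period (with no Q or Z on the keypad); the names are illustrative, not from the patent.

```python
# Illustrative sketch of multi-press DTMF letter entry, assuming the
# standard 12-key layout of the era (Q and Z absent from the keypad).
MULTIPRESS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY",
}

def multipress_letter(key: str, presses: int) -> str:
    """Letter selected by pressing `key` `presses` times in a row."""
    letters = MULTIPRESS[key]
    return letters[(presses - 1) % len(letters)]

assert multipress_letter("5", 2) == "K"   # "K": press "5" twice, as in the text
```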
SUMMARY OF THE INVENTION
Automated capture of an uttered alphabetic character is provided in accordance with the principles of this invention by using an input, beyond the uttered alphabetic character, to disambiguate an incorrectly captured character.
In an exemplary embodiment of this invention, at least one uttered alphabetic character is captured by receiving a signal indicative of the uttered alphabetic character, automatically finding a first candidate alphabetic character corresponding to the received signal, inquiring whether the first candidate alphabetic character is the uttered alphabetic character, and receiving an input for use in disambiguating the received signal when the first candidate alphabetic character differs from the uttered alphabetic character.
The input is an indication of a telephone key representing the uttered alphabetic character. The indication can be a dual tone multifrequency signal or an utterance of the number of the telephone key. Alternatively, the input is an indication that the first candidate alphabetic character differs from the uttered alphabetic character.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating a configuration in which the present invention is applied;
FIG. 2 is a flowchart of a method of automatically capturing an uttered alphabetic character; and
FIG. 3 is a flowchart of another method of automatically capturing an uttered alphabetic character.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention is related to the invention of U.S. patent application Ser. No. 08/580,702, filed Dec. 29, 1995, the disclosure of which is hereby incorporated by reference.
In an automated call processing scenario, for example, a caller, also referred to herein as a user of the automated call processing system, is assumed to have decided that he or she wishes to enter his or her name or other alphabetic information into the system, for a purpose such as placing an order or receiving information. In this scenario, the user has available only a conventional telephone set, i.e., any telephone set unable to directly transmit alphabetic information across the telephone network, and communicates via this telephone set with the system.
Referring now to the drawings, and in particular to FIG. 1, there is illustrated a system 900 in which the present invention is applied. As mentioned, a user is assumed to have access to only a conventional telephone set 910 which communicates with the system 900 using conventional telecommunications facilities such as wired or wireless telecommunications systems known to one of ordinary skill in the art.
The system 900 comprises communications interface (COMM INTFC) 920, speech generation module (SPEECH GEN) 930, speech recognition module (SPEECH RECOG) 940, storage interface (STORAGE INTFC) 950, storage medium 960, memory 970, processor 980 and communications links therebetween.
Communications interface 920 is adapted to receive calls from a user telephone set 910, to supply synthesized speech from speech generation module 930 to the telephone set 910, to forward signals from the telephone set 910 to speech recognition module 940, and to exchange information with processor 980. The system shown in FIG. 1 includes a communications bus and separate communications lines for carrying voiceband signals between the communications interface 920 and each of speech generation module 930 and speech recognition module 940, but one of ordinary skill in the art will appreciate that other configurations are also suitable.
Speech generation module 930 is adapted to receive control commands from processor 980, to generate a voiceband signal in response thereto, and to deliver the generated signal to communications interface 920. Preferably, speech generation module 930 generates synthesized speech in a frequency band of approximately 300-3,300 Hz. In some embodiments, speech generation module 930 may also function to transmit ("play") pre-stored phrases in response to commands from processor 980; module 930 includes appropriate signal storage facilities in these cases.
Speech recognition module 940 is adapted (i) to receive from communications interface 920 a voiceband signal which can be a speech signal or a DTMF signal generated in response to depression of a key on the telephone set 910, (ii) to process this signal as described in detail below and in response to commands from processor 980, and (iii) to deliver the results of its processing to processor 980. As will be appreciated, in some embodiments speech recognition module 940 includes storage for holding predetermined signals and/or for holding speech signals from telephone set 910 for the duration of a call.
Storage interface 950 is adapted to deliver information to and retrieve information from storage medium 960 in accordance with commands from processor 980. The storage medium 960 may be any appropriate medium, such as magnetic disk, optical disk, tape or transistor arrays.
Memory 970 may be implemented by using, for example, ROM and RAM, and is adapted to store information used by processor 980.
Processor 980 is adapted to execute programs for interacting with the user of telephone set 910 in accordance with a control program typically stored on storage medium 960 and also loaded into memory 970. Processor 980 may also communicate with other systems via communications links (not shown), for example, to retrieve user-specific information from a remote database and/or to deliver information captured from the user of telephone set 910 to a remote database.
In a typical call processing operation, the user employs telephone set 910 to place a call to system 900. Communications interface 920 receives the call and notifies processor 980 of an incoming call event. Processor 980, in accordance with its control program, instructs speech generation module 930 to generate a speech signal. Speech generation module 930 generates the requested speech signal and delivers the generated signal to communications interface 920, which forwards it to the telephone set 910.
In response to the generated speech signal, the user enters information to system 900 via telephone set 910. As described in detail below, the information can be a speech signal or a DTMF signal generated in response to depression of a key on the telephone set 910.
Communications interface 920 (a) receives the user-generated signal; (b) notifies processor 980 that a signal has been received; and (c) delivers the signal to speech recognition module 940. The module 940 processes the signal in accordance with the present invention, as described in detail below, and delivers the result of its processing to processor 980. Based on this result, processor 980 proceeds through its control program, generally instructing speech generation module 930 to request information from the user or to deliver information to the user, and receiving processed user input from speech recognition module 940.
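The exchange just described can be sketched end to end; the stubs below are hypothetical stand-ins for the FIG. 1 elements, not AT&T's implementation.

```python
# Minimal, self-contained sketch of the call flow described above.
# Each stub is a hypothetical stand-in for a FIG. 1 element.
def speech_gen(text: str) -> str:            # speech generation module 930
    return f"[voiceband] {text}"

def speech_recog(signal: str) -> str:        # speech recognition module 940
    return signal.removeprefix("[voiceband] ")

def handle_call(user_signal: str) -> str:    # processor 980's control program
    print("COMM INTFC 920: incoming call, processor 980 notified")
    print(speech_gen("Please spell your name, beginning with the first letter."))
    print("COMM INTFC 920: user signal received, forwarded to SPEECH RECOG 940")
    return speech_recog(user_signal)         # result delivered to processor 980

print(handle_call("[voiceband] en"))         # caller utters the letter "N"
```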
Entry of alphabetic information according to the present invention, from the user to the system, will now be described.
FIG. 2 illustrates a flowchart for a method of automatically capturing an uttered alphabetic character. The character capture method illustrated in FIG. 2 generally involves the user uttering a character, and the system presenting what it has determined as the first candidate character to the user. If the first candidate character is correct, that is, the first candidate character is the character uttered by the user, then the system goes on to capture the next character. If the first candidate character is incorrect, then the system asks for an input to aid in disambiguating the uttered character. The input is preferably a DTMF signal for a telephone key. The DTMF signal input narrows the range of possible characters, and in combination with the uttered character, typically results in a correctly identified character.
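Under the assumption that the recognizer can return candidate letters ranked by similarity to the utterance, the FIG. 2 flow reduces to the short loop sketched below; console input() stands in for the caller's spoken replies and key presses, and every name is illustrative rather than the patent's.

```python
# Sketch of the FIG. 2 method. `ranking` stands in for the recognizer's
# output: candidate letters ordered by similarity to the utterance.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY",
    "0": "QZ",   # "Q" and "Z" assigned to the "0" key, as suggested below
}

def confirm(letter: str) -> bool:
    reply = input(f"I understood {letter}. Is this correct? (yes/no) ")
    return reply.strip().lower().startswith("y")

def capture_character(ranking: list[str]) -> str | None:
    first = ranking[0]                                   # step 130: best match
    if confirm(first):                                   # steps 140-160
        return first
    key = input("Please touch the key having the desired character: ")  # 170-180
    subset = [c for c in ranking                         # steps 190-210: key's
              if c in KEYPAD.get(key, "") and c != first]  # letters, minus the
    for candidate in subset:                             # refused 1st candidate
        if confirm(candidate):                           # steps 220-260
            return candidate
    return None                                          # step 270: re-utter

# Example: the user said "N" but the recognizer ranked "M" first:
# capture_character(["M", "N", "O"]) -> refuse M, press "6", confirm N
```

Because `ranking` is already ordered by similarity, filtering it down to the key's letters leaves the step 210 ordering intact.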
The flowchart illustrated in FIG. 2 encompasses the actions of the elements shown in FIG. 1. For example, a control and processing program executed by processor 980 will be apparent to one of ordinary skill in the art in view of FIG. 2.
An advantage of the present method is that the user utters only the desired character, and not additional information, when the speech recognition portion of the automated system is capable of correctly capturing the character. In other words, additional user input is required only when the automated capture is inaccurate, to compensate for the inadequacy in voice recognition technology. Therefore, voice recognition technology which has imperfect letter recognition accuracy may now be utilized to provide highly accurate character capture.
At step 110 of FIG. 2, the system prompts the user to utter an alphabetic character. For example, a typical system prompt may be, "Please spell your name, beginning with the first letter." After the first character has been correctly captured, the system prompt may change to, "Please say the next character, or the word 'DONE' to go on," or to, "Please say the next character, or press the pound sign to go on."
The user responds by uttering an alphabetic character, such as "N". At step 120, the system receives a signal indicative of the uttered alphabetic character. In this example, the signal represents the utterance "en".
At step 130, the system accesses a set of stored signals representing spoken alphabetic characters. The set generally comprises signals representing the utterances "ay", "bee", "see", "dee", and so on. Some alphabetic characters may have multiple stored signals, such as "zee" and "zed" for "Z". The system compares the received signal with the stored signals, selects the stored signal which best matches the received signal, and finds the alphabetic character corresponding to the best matching stored signal. In this example, the system is assumed to select the stored signal for "em" as the best matching signal.
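A toy rendering of this matching step, assuming each stored signal reduces to a feature vector and using Euclidean nearest-template distance as a stand-in for whatever scoring module 940 actually applies; all values are invented for illustration.

```python
# Toy best-match step over stored-signal templates (hypothetical vectors).
import math

STORED = {                       # letter -> stored-signal template(s)
    "M": [(0.90, 0.10)],
    "N": [(0.85, 0.20)],
    "Z": [(0.10, 0.95), (0.15, 0.90)],   # multiple signals: "zee" and "zed"
}

def best_match(received, exclude=frozenset()):
    def score(letter):           # best (smallest) distance over the templates
        return min(math.dist(t, received) for t in STORED[letter])
    return min((l for l in STORED if l not in exclude), key=score)

assert best_match((0.88, 0.12)) == "M"          # "en" misheard as "em"
assert best_match((0.88, 0.12), {"M"}) == "N"   # with "M" excluded, "N" wins
```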
At step 140, the system inquires whether the alphabetic character corresponding to the best matching stored signal, that is, the first candidate character, is the uttered character. The inquiry is generated using speech generation technology known to one of ordinary skill in the art. Preferably, the system inquiry includes information for assuring that the user correctly understands the alphabetic character presented by the system. For example, the system may inquire, "I understood M as in Mary. Is this correct?" In this example, the additional information is a word "Mary" associated with the best matching alphabetic character "M", where the spelling of the word begins with the best matching alphabetic character. The user replies with, typically, a "yes" or "no" answer, which can be processed by presently available voice recognition technology with a relatively high level of accuracy. At step 150, the system receives the user's reply.
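The wording of the step 140 inquiry can be driven by a small letter-to-word table, where each word begins with its letter; in the sketch below only "Mary" and "Nancy" come from the text, and the other entries are made-up placeholders.

```python
# Confirmation-prompt sketch; each word starts with its letter so the caller
# can verify the candidate. Entries other than Mary/Nancy are hypothetical.
CONFIRM_WORDS = {"B": "Boy", "M": "Mary", "N": "Nancy", "P": "Peter"}

def confirmation_prompt(letter: str) -> str:
    return f"I understood {letter} as in {CONFIRM_WORDS[letter]}. Is this correct?"

print(confirmation_prompt("M"))   # I understood M as in Mary. Is this correct?
```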
At step 160, the system uses the reply to determine whether the alphabetic character corresponding to the best matching stored signal is the uttered character. If the character selected by the system matches the uttered character, then the system has correctly captured an alphabetic character and goes on to ask for the next character at step 110.
If the character selected by the system does not match the uttered character, then, at step 170, the system prompts the user to enter an input for correctly disambiguating the received signal as the desired character.
Preferably, the user has a telephone set which provides DTMF, and so the system prompt is, "Please touch the telephone key having the desired character." Provision is made for characters which do not correspond to a telephone key, namely, "Q" and "Z", such as assigning them to the "0" key.
Alternatively, the user may be prompted to speak the number of the telephone key having the desired character. This alternative is useful when the caller has a pulse telephone set, i.e., does not have the capability to enter DTMF. Presently available voice recognition technology has a high level of accuracy for recognition of single digits, as compared with the level of accuracy for recognition of spoken alphabetic characters.
At step 180, the system receives the input entered by the user for disambiguating the uttered alphabetic character. For example, the input may be the DTMF generated by depression of the "6" telephone key, corresponding to the letters "M", "N", "O".
In another embodiment, if the character selected by the system does not match the uttered character, then, at step 170, the system prompts the user to speak a word beginning with the uttered character. For example, a system prompt may be, "Please say Nancy if the character is N or Mary if the character is M." Then, at step 180, the system receives the input. In this embodiment, the system is changed to expect an utterance corresponding to one of, e.g., "Nancy" or "Mary". That is, for the input used for disambiguation, the expectations of the speech recognizer regarding the nature of the input change relative to the nature of the input expected initially.
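A sketch of that grammar switch, assuming a recognizer whose active vocabulary can be replaced per prompt; the word-to-letter table is taken from the example prompt above.

```python
# Sketch of the recognizer's expectation change for disambiguation: the
# active vocabulary drops from 26 letter names to the two prompted words.
LETTER_GRAMMAR = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")     # initial expectation,
                                                       # shown only for contrast
DISAMBIGUATION_GRAMMAR = {"NANCY": "N", "MARY": "M"}   # from the prompt above

def recognize_disambiguation(word: str) -> str | None:
    return DISAMBIGUATION_GRAMMAR.get(word.upper())

assert recognize_disambiguation("Nancy") == "N"
```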
At step 190, the system uses the input to select a subset of the stored signals representing spoken alphabetic characters. In this example, the subset is the stored signals for the letters "M", "N", "O".
At step 200, the system eliminates the stored signals for the first candidate character, which has already been presented to the user, from the subset, if the first candidate character is in the subset. In this example, the system eliminates the stored signal(s) for the letter "M".
At step 210, the system orders the remaining stored signals in the subset by similarity to the received signal. In this example, the received signal is "en", the remaining stored signals in the subset are "en" and "oh", and the ordered subset is {"en", "oh"}.
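Taken together, steps 190 through 210 reduce to a filter-and-sort; a self-contained sketch with toy feature vectors (illustrative values, not the patent's signal representation):

```python
# Steps 190-210 as a sketch: the key's letters, minus the refused first
# candidate, ordered by similarity to the received signal.
import math

TEMPLATES = {"M": (0.90, 0.10), "N": (0.85, 0.20), "O": (0.20, 0.90)}

def ordered_subset(key_letters: str, refused: str, received) -> list[str]:
    remaining = [c for c in key_letters if c in TEMPLATES and c != refused]
    return sorted(remaining, key=lambda c: math.dist(TEMPLATES[c], received))

assert ordered_subset("MNO", "M", (0.88, 0.12)) == ["N", "O"]   # {"en", "oh"}
```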
At step 220, the system selects the best matching signal in the stored subset as the second candidate character, namely, "en", and at step 230, the system inquires whether the second candidate alphabetic character is the uttered character.
At step 240, the system receives the user's reply.
At step 250, the system uses the reply to determine whether the second candidate alphabetic character is the uttered character. If the second candidate character selected by the system matches the uttered character, then the system has correctly captured an alphabetic character and goes on to ask for the next character at step 110.
If the second candidate character selected by the system does not match the uttered character, then, at step 260, the system eliminates the just refused second candidate character from the subset, and determines whether anything is left in the subset. If something is left, then the system goes to step 220 and tries the best remaining character. In this example, if the user refused "en", then "oh" would still remain in the subset, and would be presented to the user.
If nothing is left in the subset, then the system has been unable to correctly capture the uttered character. At step 270, the system prompts the user to re-utter the character, and returns to step 120 to re-try capturing the character. If this is the situation, then at step 130, the system eliminates the signals corresponding to the already refused characters from the stored signals when selecting the best matching stored signal.
Referring now to FIG. 3, there is illustrated a flowchart for another method of automatically capturing an uttered alphabetic character.
The character capture method illustrated in FIG. 3 generally involves the user uttering a character, and the system presenting what it has determined as the best matching character to the user. If the presented character is correct, that is, the presented character is the character uttered by the user, then the system goes on to capture the next character. If the presented character is incorrect, then the system presents its next best matching character to the user and inquires whether this character is correct. In this method, the responses of the user are inputs to aid in disambiguating the uttered alphabetic character.
An advantage of this method is that the user has a very simple structured interaction with the system, that is, the user either accepts or rejects the characters presented by the system. This method also permits voice recognition technology which has imperfect letter recognition accuracy to be utilized to provide highly accurate character capture.
Steps 410 and 420 of FIG. 3 are similar to steps 110 and 120 of FIG. 2, and, for brevity, will not be discussed in detail.
At step 430, the system accesses a set of stored signals representing spoken alphabetic characters, as described above with respect to step 130 of FIG. 2. Preferably, the system selects the stored signals which match the received signal to within a predetermined threshold as the best matching subset, and then orders the selected stored signals by similarity to the received signal to generate an ordered best matching subset.
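A sketch of this preferred variant follows; the distance threshold and feature vectors are assumptions for illustration, since the patent does not specify either.

```python
# FIG. 3, step 430 (preferred variant): keep every stored signal within a
# threshold of the received signal, ordered by similarity.
import math

TEMPLATES = {"M": (0.90, 0.10), "N": (0.85, 0.20), "O": (0.20, 0.90)}
THRESHOLD = 0.5   # illustrative; the patent leaves the threshold unspecified

def ordered_best_matching_subset(received) -> list[str]:
    scored = sorted((math.dist(v, received), k) for k, v in TEMPLATES.items())
    return [k for d, k in scored if d <= THRESHOLD]

assert ordered_best_matching_subset((0.88, 0.12)) == ["M", "N"]   # "O" too far
```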
Alternatively, the system compares the received signal with the stored signals, selects the stored signal which best matches the received signal, finds the alphabetic character corresponding to the best matching stored signal, and performs a table look up for the alphabetic character to obtain an ordered best matching subset.
As yet another alternative, the system compares the received signal with the stored signals, selects the stored signal which best matches the received signal, and finds the alphabetic character corresponding to the best matching stored signal. In this alternative, a best matching subset is generated only when the best matching alphabetic character is rejected by the user.
Steps 440-460 of FIG. 3 are similar to steps 140-160 of FIG. 2, and, for brevity, will not be discussed in detail.
If the best matching character selected by the system does not match the uttered character, then, at step 470, the system determines whether anything is left in the best matching subset. If something is left in the subset, then at step 480, the system selects the next entry in the subset as the best matching alphabetic character, and loops back to step 440 to check whether the user accepts this new best matching alphabetic character.
It will be appreciated that if a best matching subset has not yet been determined, then at step 470, it is necessary to determine the best matching subset, and eliminate the just rejected character from the best matching subset.
If nothing is left in the subset, then the system has been unable to correctly capture the uttered character. At step 490, the system prompts the user to re-utter the character, and returns to step 420 to re-try capturing the character. If this is the situation, then at step 430, the system eliminates the signals corresponding to the already refused characters from the stored signals when selecting the best matching stored signal.
Although illustrative embodiments of the present invention, and various modifications thereof, have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to these precise embodiments and the described modifications, and that various changes and further modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention as defined in the appended claims.

Claims (5)

What is claimed is:
1. A method of capturing at least one uttered alphabetic character, comprising the steps of:
receiving a signal indicative of an uttered alphabetic character;
accessing a set of stored signals representing spoken alphabetic characters;
automatically finding a first candidate alphabetic character from the set of stored signals corresponding to the received signal;
inquiring from a user whether the first candidate alphabetic character is the uttered alphabetic character; and
receiving an input for use in disambiguating the received signal when the first candidate alphabetic character differs from the uttered alphabetic character, wherein the input is an indication of a telephone key representing the uttered alphabetic character and wherein the telephone key indication is an utterance; and
disambiguating the received signal, by automatically finding an alternate candidate alphabetic character that accurately captures the uttered character; wherein the step of automatically finding the alternate candidate alphabetic character includes using the input to compare the stored signals with the received signal until one of the stored signals matches the received signal.
2. A method of accurately capturing at least one uttered alphabetic character, comprising the steps of:
receiving a signal indicative of an uttered alphabetic character;
accessing a set of stored signals representing spoken alphabetic characters;
automatically finding a first candidate alphabetic character corresponding to the received signal;
inquiring from a user whether the first candidate alphabetic character is the uttered alphabetic character;
receiving an input for use in disambiguating the received signal when the first candidate alphabetic character differs from the uttered alphabetic character;
automatically finding a second candidate alphabetic character corresponding to the received signal in accordance with the input, wherein the step of automatically finding the second candidate alphabetic character includes comparing the stored signals with the received signal; eliminating a stored signal representing the first candidate alphabetic character from the stored signals and inquiring whether the second candidate alphabetic character accurately captures the uttered character; wherein
automatically finding an alternative candidate alphabetic character; wherein the step of automatically finding includes using the input to compare a selected group of the stored signals representing the spoken alphabetic characters indicated by the input with the received signal until one of the stored signals matches the received signal.
3. Apparatus for capturing at least one uttered alphabetic character, comprising:
means for receiving a signal indicative of an uttered alphabetic character;
means for accessing a set of stored signals representing spoken alphabetic characters;
means for automatically finding a first candidate alphabetic character corresponding to the received signal;
means for inquiring from a user whether the first candidate alphabetic character is the uttered alphabetic character;
means for receiving an input for use in disambiguating the received signal when the first candidate alphabetic character differs from the uttered alphabetic character, wherein the input is an indication of a telephone key representing the uttered alphabetic character and wherein the telephone key indication is an utterance; disambiguating the received signal, by automatically finding an alternate candidate alphabetic character that accurately captures the uttered character; wherein the step of automatically finding the alternate candidate alphabetic character includes using the input to compare a selected group of the stored signals with the received signal until one of the stored signals matches the received signal.
4. The apparatus of claim 3, wherein the utterance represents a number corresponding to the telephone key.
5. Apparatus for capturing at least one uttered alphabetic character, comprising:
means for receiving a signal indicative of an uttered alphabetic character;
means for automatically finding a first candidate alphabetic character corresponding to the received signal;
means for accessing a set of stored signals representing spoken alphabetic characters;
means for inquiring from a user whether the first candidate alphabetic character is the uttered alphabetic character;
means for receiving an input for use in disambiguating the received signal when the first candidate alphabetic character differs from the uttered alphabetic character; and
means for automatically finding a second candidate alphabetic character corresponding to the received signal in accordance with the input; wherein the means for automatically finding the second candidate alphabetic character includes means for comparing the stored signals with the received signal and wherein a stored signal representing the first candidate alphabetic character is eliminated from the stored signals indicated by the input; disambiguating the received signal, by automatically finding an alternate candidate alphabetic character that accurately captures the uttered character; wherein the step of automatically finding the alternate candidate alphabetic character includes the step of using the input to compare a selected group of the stored signals representing the spoken alphabetic characters indicated by the input with the received signal until one of the stored signals matches the received signal.
US08/581,716 1995-12-29 1995-12-29 Disambiguation of alphabetic characters in an automated call processing environment Expired - Lifetime US5917890A (en)

Priority Applications (1)

Application Number: US08/581,716 (US5917890A). Priority Date: 1995-12-29. Filing Date: 1995-12-29. Title: Disambiguation of alphabetic characters in an automated call processing environment.

Applications Claiming Priority (1)

Application Number: US08/581,716 (US5917890A). Priority Date: 1995-12-29. Filing Date: 1995-12-29. Title: Disambiguation of alphabetic characters in an automated call processing environment.

Publications (1)

Publication Number: US5917890A. Publication Date: 1999-06-29.

Family

ID=24326289

Family Applications (1)

Application Number: US08/581,716 (US5917890A, Expired - Lifetime). Priority Date: 1995-12-29. Filing Date: 1995-12-29. Title: Disambiguation of alphabetic characters in an automated call processing environment.

Country Status (1)

Country: US. Publication: US5917890A (en).

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0971522A1 (en) * 1998-07-07 2000-01-12 AT & T Corp. A method of initiating a call feature request
US6269153B1 (en) * 1998-07-29 2001-07-31 Lucent Technologies Inc. Methods and apparatus for automatic call routing including disambiguating routing decisions
US6314166B1 (en) * 1997-05-06 2001-11-06 Nokia Mobile Phones Limited Method for dialling a telephone number by voice commands and a telecommunication terminal controlled by voice commands
US20010056345A1 (en) * 2000-04-25 2001-12-27 David Guedalia Method and system for speech recognition of the alphabet
US6436093B1 (en) 2000-06-21 2002-08-20 Luis Antonio Ruiz Controllable liquid crystal matrix mask particularly suited for performing ophthamological surgery, a laser system with said mask and a method of using the same
US6464692B1 (en) 2000-06-21 2002-10-15 Luis Antonio Ruiz Controllable electro-optical patternable mask, system with said mask and method of using the same
US20020173956A1 (en) * 2001-05-16 2002-11-21 International Business Machines Corporation Method and system for speech recognition using phonetically similar word alternatives
US20020196911A1 (en) * 2001-05-04 2002-12-26 International Business Machines Corporation Methods and apparatus for conversational name dialing systems
US20020196163A1 (en) * 1998-12-04 2002-12-26 Bradford Ethan Robert Explicit character filtering of ambiguous text entry
US20030016675A1 (en) * 1997-09-19 2003-01-23 Siemens Telecom Networks Flexible software architecture for a call processing system
US20030115057A1 (en) * 2001-12-13 2003-06-19 Junqua Jean-Claude Constraint-based speech recognition system and method
US20030163319A1 (en) * 2002-02-22 2003-08-28 International Business Machines Corporation Automatic selection of a disambiguation data field for a speech interface
EP1372139A1 (en) * 2002-05-15 2003-12-17 Pioneer Corporation Speech recognition apparatus and program with error correction
US6714631B1 (en) 2002-10-31 2004-03-30 Sbc Properties, L.P. Method and system for an automated departure strategy
US20040088285A1 (en) * 2002-10-31 2004-05-06 Sbc Properties, L.P. Method and system for an automated disambiguation
WO2004053836A1 (en) * 2002-12-10 2004-06-24 Kirusa, Inc. Techniques for disambiguating speech input using multimodal interfaces
US20050043947A1 (en) * 2001-09-05 2005-02-24 Voice Signal Technologies, Inc. Speech recognition using ambiguous or phone key spelling and/or filtering
US20050049858A1 (en) * 2003-08-25 2005-03-03 Bellsouth Intellectual Property Corporation Methods and systems for improving alphabetic speech recognition accuracy
US20050125230A1 (en) * 2003-12-09 2005-06-09 Gregory Haas Method and apparatus for entering alphabetic characters
US20050131686A1 (en) * 2003-12-16 2005-06-16 Canon Kabushiki Kaisha Information processing apparatus and data input method
US20050159948A1 (en) * 2001-09-05 2005-07-21 Voice Signal Technologies, Inc. Combined speech and handwriting recognition
US20050159957A1 (en) * 2001-09-05 2005-07-21 Voice Signal Technologies, Inc. Combined speech recognition and sound recording
US20050216276A1 (en) * 2004-03-23 2005-09-29 Ching-Ho Tsai Method and system for voice-inputting chinese character
US20050234722A1 (en) * 2004-02-11 2005-10-20 Alex Robinson Handwriting and voice input with automatic correction
US20050283358A1 (en) * 2002-06-20 2005-12-22 James Stephanick Apparatus and method for providing visual indication of character ambiguity during text entry
US20060190256A1 (en) * 1998-12-04 2006-08-24 James Stephanick Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
US20060247915A1 (en) * 1998-12-04 2006-11-02 Tegic Communications, Inc. Contextual Prediction of User Words and User Actions
US7143043B1 (en) * 2000-04-26 2006-11-28 Openwave Systems Inc. Constrained keyboard disambiguation using voice recognition
WO2005119642A3 (en) * 2004-06-02 2006-12-28 America Online Inc Multimodal disambiguation of speech recognition
US20070005358A1 (en) * 2005-06-29 2007-01-04 Siemens Aktiengesellschaft Method for determining a list of hypotheses from a vocabulary of a voice recognition system
US20080103774A1 (en) * 2006-10-30 2008-05-01 International Business Machines Corporation Heuristic for Voice Result Determination
US20080120102A1 (en) * 2006-11-17 2008-05-22 Rao Ashwin P Predictive speech-to-text input
US7444286B2 (en) 2001-09-05 2008-10-28 Roth Daniel L Speech recognition using re-utterance recognition
US7679534B2 (en) 1998-12-04 2010-03-16 Tegic Communications, Inc. Contextual prediction of user words and user actions
US7809574B2 (en) 2001-09-05 2010-10-05 Voice Signal Technologies Inc. Word recognition using choice lists
US20120253823A1 (en) * 2004-09-10 2012-10-04 Thomas Barton Schalk Hybrid Dialog Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle Interfaces Requiring Minimal Driver Processing
US20130238338A1 (en) * 2012-03-06 2013-09-12 Verizon Patent And Licensing, Inc. Method and apparatus for phonetic character conversion
US8606582B2 (en) 2004-06-02 2013-12-10 Tegic Communications, Inc. Multimodal disambiguation of speech recognition
US9830912B2 (en) 2006-11-30 2017-11-28 Ashwin P Rao Speak and touch auto correction interface
US9922640B2 (en) 2008-10-17 2018-03-20 Ashwin P Rao System and method for multimodal utterance detection

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US31188A (en) * 1861-01-22 Drawer-alarm
US3928724A (en) * 1974-10-10 1975-12-23 Andersen Byram Kouma Murphy Lo Voice-actuated telephone directory-assistance system
US4355302A (en) * 1980-09-12 1982-10-19 Bell Telephone Laboratories, Incorporated Spelled word recognizer
US4649563A (en) * 1984-04-02 1987-03-10 R L Associates Method of and means for accessing computerized data bases utilizing a touch-tone telephone instrument
US4782509A (en) * 1984-08-27 1988-11-01 Cognitronics Corporation Apparatus and method for obtaining information in a wide-area telephone system with multiple local exchanges and multiple information storage sites
US4593157A (en) * 1984-09-04 1986-06-03 Usdan Myron S Directory interface and dialer
US4608460A (en) * 1984-09-17 1986-08-26 Itt Corporation Comprehensive automatic directory assistance apparatus and method thereof
US4650927A (en) * 1984-11-29 1987-03-17 International Business Machines Corporation Processor-assisted communication system using tone-generating telephones
US5384833A (en) * 1988-04-27 1995-01-24 British Telecommunications Public Limited Company Voice-operated service
US5163084A (en) * 1989-08-11 1992-11-10 Korea Telecommunication Authority Voice information service system and method utilizing approximately matched input character string and key word
US5392338A (en) * 1990-03-28 1995-02-21 Danish International, Inc. Entry of alphabetical characters into a telephone system using a conventional telephone keypad
US5131045A (en) * 1990-05-10 1992-07-14 Roth Richard G Audio-augmented data keying
US5125022A (en) * 1990-05-15 1992-06-23 Vcs Industries, Inc. Method for recognizing alphanumeric strings spoken over a telephone network
US5303299A (en) * 1990-05-15 1994-04-12 Vcs Industries, Inc. Method for continuous recognition of alphanumeric strings spoken over a telephone network
US5127043A (en) * 1990-05-15 1992-06-30 Vcs Industries, Inc. Simultaneous speaker-independent voice recognition and verification over a telephone network
US5638425A (en) * 1992-12-17 1997-06-10 Bell Atlantic Network Services, Inc. Automated directory assistance system using word recognition and phoneme processing method
US5454063A (en) * 1993-11-29 1995-09-26 Rossides; Michael T. Voice input system for data retrieval

Cited By (81)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6314166B1 (en) * 1997-05-06 2001-11-06 Nokia Mobile Phones Limited Method for dialling a telephone number by voice commands and a telecommunication terminal controlled by voice commands
US20030016675A1 (en) * 1997-09-19 2003-01-23 Siemens Telecom Networks Flexible software architecture for a call processing system
US6504912B1 (en) * 1998-07-07 2003-01-07 At&T Corp. Method of initiating a call feature request
US6801602B2 (en) * 1998-07-07 2004-10-05 At&T Corp. Method of initiating a call feature request
EP0971522A1 (en) * 1998-07-07 2000-01-12 AT & T Corp. A method of initiating a call feature request
US20030068017A1 (en) * 1998-07-07 2003-04-10 Glossbrenner Kenneth C. Method of initiating a call feature request
US6269153B1 (en) * 1998-07-29 2001-07-31 Lucent Technologies Inc. Methods and apparatus for automatic call routing including disambiguating routing decisions
US20060190256A1 (en) * 1998-12-04 2006-08-24 James Stephanick Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
US20020196163A1 (en) * 1998-12-04 2002-12-26 Bradford Ethan Robert Explicit character filtering of ambiguous text entry
US7720682B2 (en) 1998-12-04 2010-05-18 Tegic Communications, Inc. Method and apparatus utilizing voice input to resolve ambiguous manually entered text input
US7712053B2 (en) 1998-12-04 2010-05-04 Tegic Communications, Inc. Explicit character filtering of ambiguous text entry
US7679534B2 (en) 1998-12-04 2010-03-16 Tegic Communications, Inc. Contextual prediction of user words and user actions
US8938688B2 (en) 1998-12-04 2015-01-20 Nuance Communications, Inc. Contextual prediction of user words and user actions
US7881936B2 (en) 1998-12-04 2011-02-01 Tegic Communications, Inc. Multimodal disambiguation of speech recognition
US9626355B2 (en) 1998-12-04 2017-04-18 Nuance Communications, Inc. Contextual prediction of user words and user actions
US20060247915A1 (en) * 1998-12-04 2006-11-02 Tegic Communications, Inc. Contextual Prediction of User Words and User Actions
US8782568B2 (en) 1999-12-03 2014-07-15 Nuance Communications, Inc. Explicit character filtering of ambiguous text entry
US20100174529A1 (en) * 1999-12-03 2010-07-08 Ethan Robert Bradford Explicit Character Filtering of Ambiguous Text Entry
US8972905B2 (en) 1999-12-03 2015-03-03 Nuance Communications, Inc. Explicit character filtering of ambiguous text entry
US8381137B2 (en) 1999-12-03 2013-02-19 Tegic Communications, Inc. Explicit character filtering of ambiguous text entry
US8990738B2 (en) 1999-12-03 2015-03-24 Nuance Communications, Inc. Explicit character filtering of ambiguous text entry
US20010056345A1 (en) * 2000-04-25 2001-12-27 David Guedalia Method and system for speech recognition of the alphabet
US7143043B1 (en) * 2000-04-26 2006-11-28 Openwave Systems Inc. Constrained keyboard disambiguation using voice recognition
US6736806B2 (en) 2000-06-21 2004-05-18 Luis Antonio Ruiz Controllable liquid crystal matrix mask particularly suited for performing ophthalmological surgery, a laser system with said mask and a method of using the same
US6770068B2 (en) 2000-06-21 2004-08-03 Antonio Ruiz Controllable electro-optical patternable mask, system with said mask and method of using the same
US6436093B1 (en) 2000-06-21 2002-08-20 Luis Antonio Ruiz Controllable liquid crystal matrix mask particularly suited for performing ophthalmological surgery, a laser system with said mask and a method of using the same
US6464692B1 (en) 2000-06-21 2002-10-15 Luis Antonio Ruiz Controllable electro-optical patternable mask, system with said mask and method of using the same
US20020196911A1 (en) * 2001-05-04 2002-12-26 International Business Machines Corporation Methods and apparatus for conversational name dialing systems
US6925154B2 (en) * 2001-05-04 2005-08-02 International Business Machines Corporation Methods and apparatus for conversational name dialing systems
US6910012B2 (en) * 2001-05-16 2005-06-21 International Business Machines Corporation Method and system for speech recognition using phonetically similar word alternatives
US20020173956A1 (en) * 2001-05-16 2002-11-21 International Business Machines Corporation Method and system for speech recognition using phonetically similar word alternatives
US20050043947A1 (en) * 2001-09-05 2005-02-24 Voice Signal Technologies, Inc. Speech recognition using ambiguous or phone key spelling and/or filtering
US20050159957A1 (en) * 2001-09-05 2005-07-21 Voice Signal Technologies, Inc. Combined speech recognition and sound recording
US20050159948A1 (en) * 2001-09-05 2005-07-21 Voice Signal Technologies, Inc. Combined speech and handwriting recognition
US7809574B2 (en) 2001-09-05 2010-10-05 Voice Signal Technologies Inc. Word recognition using choice lists
US7526431B2 (en) 2001-09-05 2009-04-28 Voice Signal Technologies, Inc. Speech recognition using ambiguous or phone key spelling and/or filtering
US7505911B2 (en) 2001-09-05 2009-03-17 Roth Daniel L Combined speech recognition and sound recording
US7444286B2 (en) 2001-09-05 2008-10-28 Roth Daniel L Speech recognition using re-utterance recognition
US7467089B2 (en) 2001-09-05 2008-12-16 Roth Daniel L Combined speech and handwriting recognition
US20030115057A1 (en) * 2001-12-13 2003-06-19 Junqua Jean-Claude Constraint-based speech recognition system and method
US7124085B2 (en) * 2001-12-13 2006-10-17 Matsushita Electric Industrial Co., Ltd. Constraint-based speech recognition system and method
US7769592B2 (en) 2002-02-22 2010-08-03 Nuance Communications, Inc. Automatic selection of a disambiguation data field for a speech interface
US20030163319A1 (en) * 2002-02-22 2003-08-28 International Business Machines Corporation Automatic selection of a disambiguation data field for a speech interface
EP1575031A3 (en) * 2002-05-15 2010-08-11 Pioneer Corporation Voice recognition apparatus
EP1372139A1 (en) * 2002-05-15 2003-12-17 Pioneer Corporation Speech recognition apparatus and program with error correction
EP1575031A2 (en) * 2002-05-15 2005-09-14 Pioneer Corporation Voice recognition apparatus
US20050283358A1 (en) * 2002-06-20 2005-12-22 James Stephanick Apparatus and method for providing visual indication of character ambiguity during text entry
US8583440B2 (en) 2002-06-20 2013-11-12 Tegic Communications, Inc. Apparatus and method for providing visual indication of character ambiguity during text entry
US20040161094A1 (en) * 2002-10-31 2004-08-19 Sbc Properties, L.P. Method and system for an automated departure strategy
US7443960B2 (en) 2002-10-31 2008-10-28 At&T Intellectual Property I, L.P. Method and system for an automated departure strategy
US7146383B2 (en) 2002-10-31 2006-12-05 Sbc Properties, L.P. Method and system for an automated disambiguation
US20060259478A1 (en) * 2002-10-31 2006-11-16 Martin John M Method and system for an automated disambiguation
US20060193449A1 (en) * 2002-10-31 2006-08-31 Martin John M Method and System for an Automated Departure Strategy
US7062018B2 (en) 2002-10-31 2006-06-13 Sbc Properties, L.P. Method and system for an automated departure strategy
US20040088285A1 (en) * 2002-10-31 2004-05-06 Sbc Properties, L.P. Method and system for an automated disambiguation
US6714631B1 (en) 2002-10-31 2004-03-30 Sbc Properties, L.P. Method and system for an automated departure strategy
US7684985B2 (en) 2002-12-10 2010-03-23 Richard Dominach Techniques for disambiguating speech input using multimodal interfaces
WO2004053836A1 (en) * 2002-12-10 2004-06-24 Kirusa, Inc. Techniques for disambiguating speech input using multimodal interfaces
US20040172258A1 (en) * 2002-12-10 2004-09-02 Dominach Richard F. Techniques for disambiguating speech input using multimodal interfaces
USRE44418E1 (en) 2002-12-10 2013-08-06 Waloomba Tech Ltd., L.L.C. Techniques for disambiguating speech input using multimodal interfaces
US20050049858A1 (en) * 2003-08-25 2005-03-03 Bellsouth Intellectual Property Corporation Methods and systems for improving alphabetic speech recognition accuracy
CN1707409B (en) * 2003-09-19 2010-12-08 America Online Services Co. (美国在线服务公司) Contextual prediction of user words and user actions
US20050125230A1 (en) * 2003-12-09 2005-06-09 Gregory Haas Method and apparatus for entering alphabetic characters
US20050131686A1 (en) * 2003-12-16 2005-06-16 Canon Kabushiki Kaisha Information processing apparatus and data input method
US20050234722A1 (en) * 2004-02-11 2005-10-20 Alex Robinson Handwriting and voice input with automatic correction
US7319957B2 (en) 2004-02-11 2008-01-15 Tegic Communications, Inc. Handwriting and voice input with automatic correction
US20050216276A1 (en) * 2004-03-23 2005-09-29 Ching-Ho Tsai Method and system for voice-inputting Chinese character
US9786273B2 (en) 2004-06-02 2017-10-10 Nuance Communications, Inc. Multimodal disambiguation of speech recognition
WO2005119642A3 (en) * 2004-06-02 2006-12-28 America Online Inc Multimodal disambiguation of speech recognition
US8606582B2 (en) 2004-06-02 2013-12-10 Tegic Communications, Inc. Multimodal disambiguation of speech recognition
US20120253823A1 (en) * 2004-09-10 2012-10-04 Thomas Barton Schalk Hybrid Dialog Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle Interfaces Requiring Minimal Driver Processing
US20070005358A1 (en) * 2005-06-29 2007-01-04 Siemens Aktiengesellschaft Method for determining a list of hypotheses from a vocabulary of a voice recognition system
US8700397B2 (en) 2006-10-30 2014-04-15 Nuance Communications, Inc. Speech recognition of character sequences
US20080103774A1 (en) * 2006-10-30 2008-05-01 International Business Machines Corporation Heuristic for Voice Result Determination
US8255216B2 (en) * 2006-10-30 2012-08-28 Nuance Communications, Inc. Speech recognition of character sequences
US20080120102A1 (en) * 2006-11-17 2008-05-22 Rao Ashwin P Predictive speech-to-text input
US7904298B2 (en) * 2006-11-17 2011-03-08 Rao Ashwin P Predictive speech-to-text input
US9830912B2 (en) 2006-11-30 2017-11-28 Ashwin P Rao Speak and touch auto correction interface
US9922640B2 (en) 2008-10-17 2018-03-20 Ashwin P Rao System and method for multimodal utterance detection
US20130238338A1 (en) * 2012-03-06 2013-09-12 Verizon Patent And Licensing, Inc. Method and apparatus for phonetic character conversion
US9418649B2 (en) * 2012-03-06 2016-08-16 Verizon Patent And Licensing Inc. Method and apparatus for phonetic character conversion

Similar Documents

Publication Publication Date Title
US5917890A (en) Disambiguation of alphabetic characters in an automated call processing environment
US5917889A (en) Capture of alphabetic or alphanumeric character strings in an automated call processing environment
US5832063A (en) Methods and apparatus for performing speaker independent recognition of commands in parallel with speaker dependent recognition of names, words or phrases
EP0890249B1 (en) Apparatus and method for reducing speech recognition vocabulary perplexity and dynamically selecting acoustic models
US5430827A (en) Password verification system
EP0477688B1 (en) Voice recognition telephone dialing
US7260537B2 (en) Disambiguating results within a speech based IVR session
US6925154B2 (en) Methods and apparatus for conversational name dialing systems
US8185539B1 (en) Web site or directory search using speech recognition of letters
EP1354311B1 (en) Voice-enabled user interface for voicemail systems
US20060217978A1 (en) System and method for handling information in a voice recognition automated conversation
US20010016813A1 (en) Distributed recognition system having multiple prompt-specific and response-specific speech recognizers
US8374862B2 (en) Method, software and device for uniquely identifying a desired contact in a contacts database based on a single utterance
CA2104850C (en) Speech password system
US5752230A (en) Method and apparatus for identifying names with a speech recognition program
US6246987B1 (en) System for permitting access to a common resource in response to speaker identification and verification
US6731737B2 (en) Directory assistance system
US7624016B2 (en) Method and apparatus for robustly locating user barge-ins in voice-activated command systems
US6223156B1 (en) Speech recognition of caller identifiers using location information
US7552221B2 (en) System for communicating with a server through a mobile communication device
US6236967B1 (en) Tone and speech recognition in communications systems
US20050114139A1 (en) Method of operating a speech dialog system
JPH11184670A (en) System and method for accessing network, and recording medium
US20010056345A1 (en) Method and system for speech recognition of the alphabet
US6049768A (en) Speech recognition system with implicit checksum

Legal Events

Date Code Title Description
AS Assignment
Owner name: AT&T CORP., NEW YORK
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:STERN, BENJAMIN J.;REEL/FRAME:009664/0575
Effective date: 19981120
STCF Information on status: patent grant
Free format text: PATENTED CASE
FPAY Fee payment
Year of fee payment: 4
FPAY Fee payment
Year of fee payment: 8
FPAY Fee payment
Year of fee payment: 12