WO2002049004A2 - Method and arrangement for speech recognition for a small device (Verfahren und Anordnung zur Spracherkennung für ein Kleingerät)
- Publication number
- WO2002049004A2 (PCT/EP2001/014616)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- letter
- server
- character string
- small device
- network
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Definitions
- The invention relates to a method for speech recognition for a small device connected to a telecommunications or data network, according to the preamble of claim 1, and to a corresponding arrangement or device.
- The invention is based on the idea of moving at least the memory-intensive steps of recognizing a spoken letter sequence off the small device. These steps are handled by a central server in the telecommunications or data network, which has practically unlimited capacity for them.
- The small device preferably performs only a simple character string recognition, which requires little computing power and memory and is therefore feasible on the microcontrollers and DSPs (digital signal processors) of such devices.
- In the preferred embodiment, the work is divided between the small device as client and the central server as follows: the small device converts the spoken letter or character strings into a provisional written letter or character string and transmits it to the server; the server checks and, if necessary, corrects the string and transmits the checked string back to the small device; the small device then performs simple further processing, such as confirmation of the received word.
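The client/server round trip described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `Request`, `Response`, `server_check` and `round_trip` are hypothetical, and the server-side check is reduced to a placeholder that picks the same-length lexicon entry differing in the fewest positions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Request:
    provisional: str  # output of the simple on-device recognizer


@dataclass
class Response:
    checked: str      # string after the server-side check
    corrected: bool   # True if the server changed the string


def server_check(request: Request, lexicon: List[str]) -> Response:
    """Placeholder check (assumption): keep known words, otherwise pick the
    same-length lexicon entry that differs in the fewest positions."""
    if request.provisional in lexicon:
        return Response(request.provisional, False)
    same_length = [w for w in lexicon if len(w) == len(request.provisional)]
    if not same_length:
        return Response(request.provisional, False)
    best = min(
        same_length,
        key=lambda w: sum(a != b for a, b in zip(w, request.provisional)),
    )
    return Response(best, True)


def round_trip(provisional: str, lexicon: List[str],
               confirm: Callable[[str], bool]) -> Optional[str]:
    """Send the provisional string, receive the checked one, ask the user."""
    response = server_check(Request(provisional), lexicon)
    return response.checked if confirm(response.checked) else None
```

The `confirm` callback stands in for the confirmation step on the small device; returning `None` models a word the user rejects.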
- The essential server-side part of the method is based in particular on one or more letter confusion matrices or a letter language model. Because the server's resources are practically unlimited, it can use complex algorithms and extensive context databases.
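One way a letter confusion matrix could drive the server-side correction is a weighted edit distance in which substitutions between easily confused spoken letters are cheap. The sketch below is an assumption for illustration: the cost values in `CONFUSION_COST` are invented, and the function names are hypothetical rather than taken from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical substitution costs: letters that sound alike when spelled
# aloud are cheap to swap. All other substitutions cost 1.0.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("m", "n"): 0.2, ("n", "m"): 0.2,
    ("b", "p"): 0.3, ("p", "b"): 0.3,
    ("d", "t"): 0.3, ("t", "d"): 0.3,
}


def substitution_cost(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), 1.0)


def weighted_distance(provisional: str, candidate: str,
                      indel_cost: float = 1.0) -> float:
    """Edit distance with confusion-weighted substitutions (row-by-row DP)."""
    prev = [j * indel_cost for j in range(len(candidate) + 1)]
    for i, a in enumerate(provisional, 1):
        curr = [i * indel_cost]
        for j, b in enumerate(candidate, 1):
            curr.append(min(
                prev[j] + indel_cost,                  # delete a
                curr[j - 1] + indel_cost,              # insert b
                prev[j - 1] + substitution_cost(a, b)  # substitute a -> b
            ))
        prev = curr
    return prev[-1]


def correct(provisional: str, lexicon: List[str]) -> str:
    """Return the lexicon entry closest to the provisional string."""
    return min(lexicon, key=lambda w: weighted_distance(provisional, w))
```

With these illustrative costs, a provisional string in which "m" was heard as "n" stays close to the intended lexicon entry, while unrelated entries remain far away.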
- A word classifier is entered by the user on the small device together with the letter or character string and is transmitted to the server together with the provisional written string, where it serves as additional information for the recognition process running there (check and, if necessary, correction).
- In particular, a so-called word hypothesis graph is formed from the letter string search and transmitted to the server, where the graph is searched against a text lexicon database with several memory areas or against several lexicon databases.
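A strongly simplified sketch of searching a word hypothesis graph against several lexica is shown below. It assumes the graph is a plain sequence of alternative letters per position; a real word hypothesis graph may also carry scores, insertions and deletions. The names `matches` and `search_lexica` are hypothetical.

```python
from typing import Dict, List, Set

# Simplified graph: one set of alternative letters per position.
Graph = List[Set[str]]


def matches(graph: Graph, word: str) -> bool:
    """True if the word is a path through the graph."""
    return len(word) == len(graph) and all(
        ch in alternatives for ch, alternatives in zip(word, graph)
    )


def search_lexica(graph: Graph,
                  lexica: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Search every lexicon and report the matching entries per lexicon."""
    return {
        name: [word for word in words if matches(graph, word)]
        for name, words in lexica.items()
    }
```

For example, a graph whose first position allows "m" or "n" and whose third position allows "i" or "y" would match "meier", "meyer" and "neyer", but not "maier".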
- The word classes specified by the word classifier can be, for example, personal names, street or place names, Internet addresses, or technical terms of a certain field; a directory or lexicon is kept on the server for each class.
- Centralized processing also makes it easy to update and maintain the data stock, which is very important for Internet addresses in particular, given the rapidly increasing number of domain names.
- The proposed method is implemented as a service of a telecommunications company or service provider and is offered to users in particular for a fee, but occasionally also free of charge.
- The most advanced transmission resources available are preferably used to transmit the newly entered words to the server.
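The use of the word classifier to narrow the server-side search can be sketched as a lookup of one lexicon per word class. The class names and lexicon contents below are invented for illustration, and `difflib.get_close_matches` from the Python standard library is only a stand-in for the actual check and correction step.

```python
import difflib
from typing import Dict, List, Optional

# Hypothetical per-class lexica kept on the server.
LEXICA: Dict[str, List[str]] = {
    "person": ["mueller", "schmidt", "meier"],
    "place": ["muenchen", "muenster", "hamburg"],
    "internet": ["www.example.com", "www.example.org"],
}


def check_on_server(provisional: str, word_class: str) -> Optional[str]:
    """Check the provisional string against the lexicon of its word class.

    If the class is unknown, all lexica are searched (an assumption).
    Returns the best match, or None if nothing is similar enough.
    """
    candidates = LEXICA.get(word_class)
    if candidates is None:
        candidates = [w for words in LEXICA.values() for w in words]
    found = difflib.get_close_matches(provisional, candidates, n=1, cutoff=0.6)
    return found[0] if found else None
```

Restricting the search to one class keeps the candidate set small, which is the benefit the word classifier is meant to provide.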
- the transmission is preferably carried out as a text short message by SMS, and in the case of a WAP-capable mobile phone preferably as a text message according to the WAP standard.
- Future mobile radio standards will offer corresponding possibilities through their protocols; in a UMTS network in particular, transmission using a standard Internet protocol (HTTP) will be possible.
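The preference for the most capable available transmission path can be sketched as a simple ranking. The ordering HTTP over WAP over SMS is an assumed reading of the text above, and the `class:string` payload format and the function names are hypothetical. The 160-character limit is that of a single GSM short message using the 7-bit default alphabet.

```python
from enum import IntEnum
from typing import Set

SMS_MAX_CHARS = 160  # single GSM short message, 7-bit default alphabet


class Transport(IntEnum):
    # Higher value = more capable transport (assumed ranking).
    SMS = 1
    WAP = 2
    HTTP = 3


def choose_transport(available: Set[Transport]) -> Transport:
    """Pick the most capable transport the device and network support."""
    if not available:
        raise ValueError("no transport available")
    return max(available)


def pack_request(word_class: str, provisional: str,
                 transport: Transport) -> str:
    """Combine word classifier and provisional string into one text payload."""
    payload = f"{word_class}:{provisional}"
    if transport is Transport.SMS and len(payload) > SMS_MAX_CHARS:
        raise ValueError("payload does not fit into a single short message")
    return payload
```

A WAP-capable phone without UMTS would report `{Transport.SMS, Transport.WAP}` and end up sending the payload as a WAP text message.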
- the transmission takes place via a data channel of the ISDN network. The entry is preferably made (as with a mobile phone) using an alphanumeric keyboard or DTMF.
- The small device can also be designed in particular as a handheld PC or PDA for connection to a telecommunications and/or data network, or as a mobile input unit of a remote control system. It has in particular a display device designed to show several letter or character strings and a confirmation device for confirming a word recognized on the server. The confirmation device can be implemented in particular as a softkey in connection with menu control, or on a touch screen.
- The figure shows an ISDN fixed-network telephone T and a GSM mobile telephone MS, connected to a wired telephone network TN and a mobile radio network GSM respectively, in cooperation with a letter sequence recognizer CSR that is shared by both communication networks TN and GSM.
- the fixed network telephone T and the mobile telephone MS are each connected to an exchange SC or MSC of their network via an ISDN telephone line ISDN or an air interface (not specifically designated) and a base station BTS / BSC.
- Via this exchange, a connection is established to a common administration and service center PRO of a service provider, either directly (in the case of the fixed network) or indirectly via an additional gateway server GS. The service provider offers a transcription service for a fee in both the fixed network TN and the GSM mobile network.
- The letter sequence recognizer CSR has available several text lexicon databases PDB1 to PDB3 and (shown schematically as two function blocks) a letter confusion matrix CMA and a letter language model SMO. A charging device BM is also assigned to the letter sequence recognizer to bill for use of the transcription service.
- An ISDN interface device IF is installed in the fixed network telephone T and is only shown symbolically in the figure as a separate block.
- the ISDN line between the fixed network telephone T and the associated switching center SC has a voice channel A and an independent data channel B in a known manner.
- provisional character string recognition for words spelled by the user is carried out.
- the result of the recognition is transmitted via the letter chain transmission stage CCT together with the word classifier entered by the user via the keyboard to the administration and service center PRO of the provider and the associated letter sequence recognizer CSR.
- Using the reference lexicon databases PDB1 to PDB3, the letter confusion matrix CMA and the letter language model SMO, the letter sequence recognizer CSR checks the letter string output by the mobile phone against comprehensive linguistic background and context knowledge of the user's national language.
- The national language is selected on the basis of the user data stored in the SIM card and/or a selection made by the user at the beginning of the corresponding menu. Pronunciations of characters, spelling habits and the like that are typical of the national language are of course taken into account.
- the checked letter string recognition works analogously for letter strings spoken on the fixed network telephone T.
- Here, the checked and possibly corrected letter string or strings are transmitted back in particular via the B channel of the ISDN network.
- the user can preselect or confirm the knowledge sources to be used in the central check for the character string recognizer CSR, or these are corresponding to the
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP01991834A EP1352388B1 (de) | 2000-12-14 | 2001-12-12 | Verfahren und anordnung zur spracherkennung für ein kleingerät |
DE50106056T DE50106056D1 (de) | 2000-12-14 | 2001-12-12 | Verfahren und anordnung zur spracherkennung für ein kleingerät |
US10/450,580 US20040049386A1 (en) | 2000-12-14 | 2001-12-12 | Speech recognition method and system for a small device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP00127457.0 | 2000-12-14 | ||
EP00127457 | 2000-12-14 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002049004A2 (de) | 2002-06-20 |
WO2002049004A3 (de) | 2002-09-19 |
Family
ID=8170671
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/EP2001/014616 WO2002049004A2 (de) | 2000-12-14 | 2001-12-12 | Verfahren und anordnung zur spracherkennung für ein kleingerät |
Country Status (5)
Country | Link |
---|---|
US (1) | US20040049386A1 (de) |
EP (1) | EP1352388B1 (de) |
DE (1) | DE50106056D1 (de) |
ES (1) | ES2238054T3 (de) |
WO (1) | WO2002049004A2 (de) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7418381B2 (en) * | 2001-09-07 | 2008-08-26 | Hewlett-Packard Development Company, L.P. | Device for automatically translating and presenting voice messages as text messages |
US7117153B2 (en) * | 2003-02-13 | 2006-10-03 | Microsoft Corporation | Method and apparatus for predicting word error rates from text |
US20070016420A1 (en) * | 2005-07-07 | 2007-01-18 | International Business Machines Corporation | Dictionary lookup for mobile devices using spelling recognition |
US10540957B2 (en) * | 2014-12-15 | 2020-01-21 | Baidu Usa Llc | Systems and methods for speech transcription |
US10049198B2 (en) * | 2015-03-18 | 2018-08-14 | International Business Machines Corporation | Securing a device using graphical analysis |
US10049199B2 (en) * | 2015-03-18 | 2018-08-14 | International Business Machines Corporation | Securing a device using graphical analysis |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5677990A (en) * | 1995-05-05 | 1997-10-14 | Panasonic Technologies, Inc. | System and method using N-best strategy for real time recognition of continuously spelled names |
EP0848536A2 (de) * | 1996-12-13 | 1998-06-17 | AT&T Corp. | Statistische Datenbank-Korrektur von alphanumerischen Kontennummern unter Verwendung von Spracherkennung und Wahltonerkennung |
WO1999021171A1 (en) * | 1997-10-21 | 1999-04-29 | Bell Canada | A method and apparatus for improving the utility of speech recognition |
US5995928A (en) * | 1996-10-02 | 1999-11-30 | Speechworks International, Inc. | Method and apparatus for continuous spelling speech recognition with early identification |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5303299A (en) * | 1990-05-15 | 1994-04-12 | Vcs Industries, Inc. | Method for continuous recognition of alphanumeric strings spoken over a telephone network |
FR2696067B1 (fr) * | 1992-09-21 | 1994-11-25 | France Telecom | Installation de télécommunication à téléchargement sécurisé de moyens de pré-paiement et procédé de téléchargement correspondant. |
WO1994014270A1 (en) * | 1992-12-17 | 1994-06-23 | Bell Atlantic Network Services, Inc. | Mechanized directory assistance |
US5812639A (en) * | 1994-12-05 | 1998-09-22 | Bell Atlantic Network Services, Inc. | Message communication via common signaling channel |
US6161082A (en) * | 1997-11-18 | 2000-12-12 | At&T Corp | Network based language translation system |
US20020055351A1 (en) * | 1999-11-12 | 2002-05-09 | Elsey Nicholas J. | Technique for providing personalized information and communications services |
-
2001
- 2001-12-12 DE DE50106056T patent/DE50106056D1/de not_active Expired - Fee Related
- 2001-12-12 WO PCT/EP2001/014616 patent/WO2002049004A2/de active IP Right Grant
- 2001-12-12 EP EP01991834A patent/EP1352388B1/de not_active Expired - Lifetime
- 2001-12-12 ES ES01991834T patent/ES2238054T3/es not_active Expired - Lifetime
- 2001-12-12 US US10/450,580 patent/US20040049386A1/en not_active Abandoned
Non-Patent Citations (2)
Title |
---|
GILLOIRE A ET AL: "Innovative speech processing for mobile terminals: an annotated bibliography" SIGNAL PROCESSING, NL, AMSTERDAM, vol. 80, no. 7, July 2000 (2000-07), pages 1149-1166, XP004200934 ISSN: 0165-1684 * |
LAMEL L ET AL: "The LIMSI Arise system" SPEECH COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 31, no. 4, August 2000 (2000-08), pages 339-353, XP004210025 ISSN: 0167-6393 * |
Also Published As
Publication number | Publication date |
---|---|
ES2238054T3 (es) | 2005-08-16 |
US20040049386A1 (en) | 2004-03-11 |
WO2002049004A3 (de) | 2002-09-19 |
EP1352388A2 (de) | 2003-10-15 |
DE50106056D1 (de) | 2005-06-02 |
EP1352388B1 (de) | 2005-04-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE60217241T2 (de) | Fokussierte Sprachmodelle zur Verbesserung der Spracheingabe von strukturierten Dokumenten | |
DE60219943T2 (de) | Verfahren zum komprimieren von wörterbuchdaten | |
DE10235548B4 (de) | Verfahren und Vorrichtung für die Prädiktion einer Textnachrichteneingabe | |
DE69725761T2 (de) | System und verfahren zur kodierung und zur aussendung von sprachdaten | |
US8392453B2 (en) | Nonstandard text entry | |
DE60021761T2 (de) | System zur speicherung und bereitstellung von mobilkommunikations - adress - informationen | |
US20060025999A1 (en) | Predicting tone pattern information for textual information used in telecommunication systems | |
US20080281582A1 (en) | Input system for mobile search and method therefor | |
US6526292B1 (en) | System and method for creating a digit string for use by a portable phone | |
DE112005000924T5 (de) | Stimme über Short Message Service | |
DE60304246T2 (de) | Einstellung der Betriebsartauswahl in Abhängigkeit von Sprachinformation | |
DE60114759T2 (de) | Verfahren und vorrichtung zur konvertierung von addressbücheintragen in einem drahtlosen kommunikationsgerät | |
EP2815396A1 (de) | Verfahren zum phonetisieren einer datenliste und sprachgesteuerte benutzerschnittstelle | |
DE112007000728T5 (de) | Tragbare elektronische Vorrichtung zum Vorsehen einer vorgeschlagenen korrigierten Eingabe als Reaktion auf eine fehlerhafte Texteingabe in einer Umgebung eines Textes, der mehrere sequentielle Betätigungen derselben Taste erfordert, und zugehöriges Verfahren | |
EP1352388B1 (de) | Verfahren und anordnung zur spracherkennung für ein kleingerät | |
EP1220200B1 (de) | Verfahren und Anordnung zur sprecherunabhängigen Spracherkennung für ein Telekommunikations- bzw. Datenendgerät | |
DE19851287A1 (de) | Datenverarbeitungssystem oder Kommunikationsendgerät mit einer Einrichtung zur Erkennugn gesprochener Sprache und Verfahren zur Erkennung bestimmter akustischer Objekte | |
WO2002005263A1 (de) | Verfahren zur spracheingabe und -erkennung | |
DE10211777A1 (de) | Erzeugung von Nachrichtentexten | |
WO2002049325A1 (de) | Verfahren zur konfigurierung einer benutzeroberfläche | |
EP1414223B1 (de) | Texteingabe für ein Endgerät | |
DE10003529A1 (de) | Verfahren und Vorrichtung zum Erstellen einer Textdatei mittels Spracherkennung | |
WO2006061266A1 (de) | Automatische spracheinstellung für die beantwortung einer empfangenen sms-nachricht | |
EP1215653B1 (de) | Verfahren und Anordnung zur Spracherkennung für ein Kleingerät | |
EP4203449A1 (de) | Verbindungsdienst für ein mobilfunknetz |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR |
|
AK | Designated states |
Kind code of ref document: A3 Designated state(s): US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A3 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2001991834 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 10450580 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 2001991834 Country of ref document: EP |
|
WWG | Wipo information: grant in national office |
Ref document number: 2001991834 Country of ref document: EP |