WO2010025441A3 - Distributed speech recognition using one way communication - Google Patents

Info

Publication number
WO2010025441A3
Authority
WO
WIPO (PCT)
Prior art keywords
speech
server
recognizer
speech recognition
client
Prior art date
Application number
PCT/US2009/055480
Other languages
French (fr)
Other versions
WO2010025441A2 (en)
Inventor
Eric Carraux
Detlef Koll
Original Assignee
Multimodal Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2008-08-29
Filing date
2009-08-31
Publication date
2010-06-03
Application filed by Multimodal Technologies, Inc.
Priority to EP09810710.5A (patent EP2321821B1)
Priority to PL09810710T (patent PL2321821T3)
Priority to CA2732256A (patent CA2732256C)
Priority to JP2011525266A (patent JP5588986B2)
Priority to ES09810710.5T (patent ES2446667T3)
Priority to DK09810710.5T (patent DK2321821T3)
Publication of WO2010025441A2
Publication of WO2010025441A3

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G10L15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G10L15/32: Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

Abstract

A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
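
The interaction described in the abstract can be illustrated with a small in-process sketch. This is not the patented implementation: the class names, the `set_vocabulary` control message, and the string-tagging "recognition" are all hypothetical stand-ins, and the unreliable network is not modeled. It only shows the shape of the idea: speech and control travel on two separate one-way queues, the server keeps recognizing without waiting for acknowledgements, and results are returned only when the client asks for them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ServerRecognizer:
    """Toy stand-in for a server-side recognizer (hypothetical design)."""
    vocabulary: str = "default"                        # reconfigurable state (assumed)
    speech_in: deque = field(default_factory=deque)    # one-way speech stream
    control_in: deque = field(default_factory=deque)   # one-way control stream
    results: list = field(default_factory=list)        # buffered until requested

    def step(self):
        """Apply pending control messages, then recognize one speech chunk."""
        while self.control_in:
            kind, value = self.control_in.popleft()
            if kind == "set_vocabulary":                # hypothetical message type
                self.vocabulary = value
        if self.speech_in:
            chunk = self.speech_in.popleft()
            # Placeholder "recognition": tag the chunk with the active state.
            self.results.append(f"{self.vocabulary}:{chunk}")

    def fetch_results(self):
        """Return and clear buffered results; called only on client request."""
        out, self.results = self.results, []
        return out


@dataclass
class Client:
    server: ServerRecognizer

    def send_speech(self, chunk):
        self.server.speech_in.append(chunk)             # no reply expected

    def send_control(self, kind, value):
        self.server.control_in.append((kind, value))    # no reply expected

    def request_results(self):
        return self.server.fetch_results()


def demo():
    server = ServerRecognizer()
    client = Client(server)
    client.send_speech("chunk-1")
    server.step()
    client.send_control("set_vocabulary", "medical")    # reconfigure mid-stream
    client.send_speech("chunk-2")
    server.step()
    return client.request_results()
```

In this sketch the second chunk is processed under the reconfigured state because the control message is applied before the next recognition step; a real system would need its own rule for ordering control messages relative to speech.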
PCT/US2009/055480 2008-08-29 2009-08-31 Distributed speech recognition using one way communication WO2010025441A2 (en)

Priority Applications (6)

Application Number Publication Number Priority Date Filing Date Title
EP09810710.5A EP2321821B1 (en) 2008-08-29 2009-08-31 Distributed speech recognition using one way communication
PL09810710T PL2321821T3 (en) 2008-08-29 2009-08-31 Distributed speech recognition using one way communication
CA2732256A CA2732256C (en) 2008-08-29 2009-08-31 Distributed speech recognition using one way communication
JP2011525266A JP5588986B2 (en) 2008-08-29 2009-08-31 Distributed speech recognition using one-way communication
ES09810710.5T ES2446667T3 (en) 2008-08-29 2009-08-31 Distributed voice recognition using unidirectional communication
DK09810710.5T DK2321821T3 (en) 2008-08-29 2009-08-31 Distributed voice recognition that use one-way communications

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US9322108P 2008-08-29 2008-08-29
US61/093,221 2008-08-29
US12/550,381 US8019608B2 (en) 2008-08-29 2009-08-30 Distributed speech recognition using one way communication
US12/550,381 2009-08-30

Publications (2)

Publication Number Publication Date
WO2010025441A2 (en) 2010-03-04
WO2010025441A3 (en) 2010-06-03

Family

ID=41722339

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/US2009/055480 WO2010025441A2 (en) 2008-08-29 2009-08-31 Distributed speech recognition using one way communication

Country Status (8)

Country Link
US (5) US8019608B2 (en)
EP (1) EP2321821B1 (en)
JP (2) JP5588986B2 (en)
CA (1) CA2732256C (en)
DK (1) DK2321821T3 (en)
ES (1) ES2446667T3 (en)
PL (1) PL2321821T3 (en)
WO (1) WO2010025441A2 (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8019608B2 (en) * 2008-08-29 2011-09-13 Multimodal Technologies, Inc. Distributed speech recognition using one way communication
US7933777B2 (en) * 2008-08-29 2011-04-26 Multimodal Technologies, Inc. Hybrid speech recognition
US9570078B2 (en) * 2009-06-19 2017-02-14 Microsoft Technology Licensing, Llc Techniques to provide a standard interface to a speech recognition platform
EP2513774A4 (en) * 2009-12-18 2013-09-04 Nokia Corp Method and apparatus for projecting a user interface via partition streaming
US20110184740A1 (en) 2010-01-26 2011-07-28 Google Inc. Integration of Embedded and Network Speech Recognizers
US9634855B2 (en) 2010-05-13 2017-04-25 Alexander Poltorak Electronic personal interactive device that determines topics of interest using a conversational agent
US8812321B2 (en) * 2010-09-30 2014-08-19 At&T Intellectual Property I, L.P. System and method for combining speech recognition outputs from a plurality of domain-specific speech recognizers via machine learning
US8959102B2 (en) 2010-10-08 2015-02-17 Mmodal Ip Llc Structured searching of dynamic structured document corpuses
KR101208166B1 (en) * 2010-12-16 2012-12-04 엔에이치엔(주) Speech recognition client system, speech recognition server system and speech recognition method for processing speech recognition in online
US9009041B2 (en) * 2011-07-26 2015-04-14 Nuance Communications, Inc. Systems and methods for improving the accuracy of a transcription using auxiliary data such as personal data
US8924219B1 (en) 2011-09-30 2014-12-30 Google Inc. Multi hotword robust continuous voice command detection in mobile devices
US8775175B1 (en) * 2012-06-01 2014-07-08 Google Inc. Performing dictation correction
US8996374B2 (en) * 2012-06-06 2015-03-31 Spansion Llc Senone scoring for multiple input streams
US9430465B2 (en) * 2013-05-13 2016-08-30 Facebook, Inc. Hybrid, offline/online speech translation system
US20140379334A1 (en) * 2013-06-20 2014-12-25 Qnx Software Systems Limited Natural language understanding automatic speech recognition post processing
US9747899B2 (en) 2013-06-27 2017-08-29 Amazon Technologies, Inc. Detecting self-generated wake expressions
US10885918B2 (en) 2013-09-19 2021-01-05 Microsoft Technology Licensing, Llc Speech recognition using phoneme matching
CN105793923A (en) * 2013-09-20 2016-07-20 亚马逊技术股份有限公司 Local and remote speech processing
EP2866153A1 (en) * 2013-10-22 2015-04-29 Agfa Healthcare Speech recognition method and system with simultaneous text editing
US9601108B2 (en) 2014-01-17 2017-03-21 Microsoft Technology Licensing, Llc Incorporating an exogenous large-vocabulary model into rule-based speech recognition
US10878721B2 (en) 2014-02-28 2020-12-29 Ultratec, Inc. Semiautomated relay method and apparatus
US20180034961A1 (en) 2014-02-28 2018-02-01 Ultratec, Inc. Semiautomated Relay Method and Apparatus
US10389876B2 (en) 2014-02-28 2019-08-20 Ultratec, Inc. Semiautomated relay method and apparatus
US10748523B2 (en) 2014-02-28 2020-08-18 Ultratec, Inc. Semiautomated relay method and apparatus
US20180270350A1 (en) 2014-02-28 2018-09-20 Ultratec, Inc. Semiautomated relay method and apparatus
US10749989B2 (en) 2014-04-01 2020-08-18 Microsoft Technology Licensing Llc Hybrid client/server architecture for parallel processing
JP6150077B2 (en) * 2014-10-31 2017-06-21 マツダ株式会社 Spoken dialogue device for vehicles
WO2016129188A1 (en) * 2015-02-10 2016-08-18 Necソリューションイノベータ株式会社 Speech recognition processing device, speech recognition processing method, and program
US9910840B2 (en) 2015-04-03 2018-03-06 Microsoft Technology Licensing, Llc Annotating notes from passive recording with categories
EP3089159B1 (en) 2015-04-28 2019-08-28 Google LLC Correcting voice recognition using selective re-speak
EP3323126A4 (en) * 2015-07-17 2019-03-20 Nuance Communications, Inc. Reduced latency speech recognition system using multiple recognizers
US9715498B2 (en) 2015-08-31 2017-07-25 Microsoft Technology Licensing, Llc Distributed server system for language understanding
US9443519B1 (en) 2015-09-09 2016-09-13 Google Inc. Reducing latency caused by switching input modalities
CN107452383B (en) * 2016-05-31 2021-10-26 华为终端有限公司 Information processing method, server, terminal and information processing system
US10971157B2 (en) 2017-01-11 2021-04-06 Nuance Communications, Inc. Methods and apparatus for hybrid speech recognition processing
US10410635B2 (en) * 2017-06-09 2019-09-10 Soundhound, Inc. Dual mode speech recognition
CN109285548A (en) * 2017-07-19 2019-01-29 阿里巴巴集团控股有限公司 Information processing method, system, electronic equipment and computer storage medium
US10796687B2 (en) 2017-09-06 2020-10-06 Amazon Technologies, Inc. Voice-activated selective memory for voice-capturing devices
KR102552486B1 (en) * 2017-11-02 2023-07-06 현대자동차주식회사 Apparatus and method for recoginizing voice in vehicle
US11017778B1 (en) 2018-12-04 2021-05-25 Sorenson Ip Holdings, Llc Switching between speech recognition systems
US10388272B1 (en) 2018-12-04 2019-08-20 Sorenson Ip Holdings, Llc Training speech recognition systems using word sequences
US10573312B1 (en) 2018-12-04 2020-02-25 Sorenson Ip Holdings, Llc Transcription generation from multiple speech recognition systems
US11170761B2 (en) 2018-12-04 2021-11-09 Sorenson Ip Holdings, Llc Training of speech recognition systems
US11398238B2 (en) * 2019-06-07 2022-07-26 Lg Electronics Inc. Speech recognition method in edge computing device
US11539900B2 (en) 2020-02-21 2022-12-27 Ultratec, Inc. Caption modification and augmentation systems and methods for use by hearing assisted user
US11488604B2 (en) 2020-08-19 2022-11-01 Sorenson Ip Holdings, Llc Transcription of audio

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001337695A (en) * 2000-05-24 2001-12-07 Canon Inc Speech processing system, device, method and storage medium
JP2002162988A (en) * 2000-11-27 2002-06-07 Canon Inc Voice recognition system and its control method, and computer-readable memory
KR20020049150A (en) * 2000-12-19 2002-06-26 이계철 Protocol utilization to control speech recognition and speech synthesis
US6487534B1 (en) * 1999-03-26 2002-11-26 U.S. Philips Corporation Distributed client-server speech recognition system

Family Cites Families (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6078886A (en) * 1997-04-14 2000-06-20 At&T Corporation System and method for providing remote automatic speech recognition services via a packet network
JP2001508200A (en) 1997-11-14 2001-06-19 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Method and system for selective hardware sharing voice processing at multiple levels in a voice-based intercommunication system
US6298326B1 (en) 1999-05-13 2001-10-02 Alan Feller Off-site data entry system
US7330815B1 (en) * 1999-10-04 2008-02-12 Globalenglish Corporation Method and system for network-based speech recognition
US6963837B1 (en) 1999-10-06 2005-11-08 Multimodal Technologies, Inc. Attribute-based word modeling
US7725307B2 (en) 1999-11-12 2010-05-25 Phoenix Solutions, Inc. Query engine for processing voice based queries including semantic decoding
US9076448B2 (en) * 1999-11-12 2015-07-07 Nuance Communications, Inc. Distributed real time speech recognition system
US6728677B1 (en) * 2001-01-31 2004-04-27 Nuance Communications Method and system for dynamically improving performance of speech recognition or other speech processing systems
CN1266625C (en) * 2001-05-04 2006-07-26 微软公司 Server for identifying WEB invocation
US6996525B2 (en) * 2001-06-15 2006-02-07 Intel Corporation Selecting one of multiple speech recognizers in a system based on performance predections resulting from experience
US6801604B2 (en) 2001-06-25 2004-10-05 International Business Machines Corporation Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources
JP2003140691A (en) * 2001-11-07 2003-05-16 Hitachi Ltd Voice recognition device
US7035797B2 (en) 2001-12-14 2006-04-25 Nokia Corporation Data-driven filtering of cepstral time trajectories for robust speech recognition
JP3826032B2 (en) * 2001-12-28 2006-09-27 株式会社東芝 Speech recognition apparatus, speech recognition method, and speech recognition program
US7266127B2 (en) 2002-02-08 2007-09-04 Lucent Technologies Inc. Method and system to compensate for the effects of packet delays on speech quality in a Voice-over IP system
JP2004118325A (en) * 2002-09-24 2004-04-15 Sega Corp Data communication method and data communication system
US7092880B2 (en) * 2002-09-25 2006-08-15 Siemens Communications, Inc. Apparatus and method for quantitative measurement of voice quality in packet network environments
US7016844B2 (en) 2002-09-26 2006-03-21 Core Mobility, Inc. System and method for online transcription services
US7539086B2 (en) 2002-10-23 2009-05-26 J2 Global Communications, Inc. System and method for the secure, real-time, high accuracy conversion of general-quality speech into text
US7774694B2 (en) 2002-12-06 2010-08-10 3M Innovation Properties Company Method and system for server-based sequential insertion processing of speech recognition results
US7444285B2 (en) 2002-12-06 2008-10-28 3M Innovative Properties Company Method and system for sequential insertion of speech recognition results to facilitate deferred transcription services
TWI245259B (en) * 2002-12-20 2005-12-11 Ibm Sensor based speech recognizer selection, adaptation and combination
EP1493993A1 (en) * 2003-06-30 2005-01-05 Harman Becker Automotive Systems GmbH Method and device for controlling a speech dialog system
US7418392B1 (en) * 2003-09-25 2008-08-26 Sensory, Inc. System and method for controlling the operation of a device by voice commands
US20050102140A1 (en) 2003-11-12 2005-05-12 Joel Davne Method and system for real-time transcription and correction using an electronic communication environment
US8412521B2 (en) 2004-08-20 2013-04-02 Multimodal Technologies, Llc Discriminative training of document transcription system
US7584103B2 (en) 2004-08-20 2009-09-01 Multimodal Technologies, Inc. Automated extraction of semantic content and generation of a structured document from speech
US7844464B2 (en) 2005-07-22 2010-11-30 Multimodal Technologies, Inc. Content-based audio playback emphasis
US20130304453A9 (en) 2004-08-20 2013-11-14 Juergen Fritsch Automated Extraction of Semantic Content and Generation of a Structured Document from Speech
US20060095266A1 (en) * 2004-11-01 2006-05-04 Mca Nulty Megan Roaming user profiles for speech recognition
US7502741B2 (en) 2005-02-23 2009-03-10 Multimodal Technologies, Inc. Audio signal de-identification
US7640158B2 (en) 2005-11-08 2009-12-29 Multimodal Technologies, Inc. Automatic detection and application of editing patterns in draft documents
JP4882537B2 (en) * 2006-06-20 2012-02-22 株式会社日立製作所 Request control method by timer cooperation
US8560314B2 (en) 2006-06-22 2013-10-15 Multimodal Technologies, Llc Applying service levels to transcripts
JP4875752B2 (en) 2006-11-22 2012-02-15 マルチモーダル・テクノロジーズ・インク Speech recognition in editable audio streams
JP2008145676A (en) * 2006-12-08 2008-06-26 Denso Corp Speech recognition device and vehicle navigation device
CN101030994A (en) * 2007-04-11 2007-09-05 华为技术有限公司 Speech discriminating method system and server
US8099289B2 (en) * 2008-02-13 2012-01-17 Sensory, Inc. Voice interface and search for electronic devices including bluetooth headsets and remote systems
US8019608B2 (en) * 2008-08-29 2011-09-13 Multimodal Technologies, Inc. Distributed speech recognition using one way communication
US7933777B2 (en) 2008-08-29 2011-04-26 Multimodal Technologies, Inc. Hybrid speech recognition
CA2680304C (en) 2008-09-25 2017-08-22 Multimodal Technologies, Inc. Decoding-time prediction of non-verbalized tokens
US9280541B2 (en) * 2012-01-09 2016-03-08 Five9, Inc. QR data proxy and protocol gateway
US8880398B1 (en) * 2012-07-13 2014-11-04 Google Inc. Localized speech recognition with offload

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6487534B1 (en) * 1999-03-26 2002-11-26 U.S. Philips Corporation Distributed client-server speech recognition system
JP2001337695A (en) * 2000-05-24 2001-12-07 Canon Inc Speech processing system, device, method and storage medium
JP2002162988A (en) * 2000-11-27 2002-06-07 Canon Inc Voice recognition system and its control method, and computer-readable memory
KR20020049150A (en) * 2000-12-19 2002-06-26 이계철 Protocol utilization to control speech recognition and speech synthesis

Also Published As

Publication number Publication date
JP2012501481A (en) 2012-01-19
US20110288857A1 (en) 2011-11-24
ES2446667T3 (en) 2014-03-10
EP2321821B1 (en) 2013-11-13
EP2321821A2 (en) 2011-05-18
PL2321821T3 (en) 2014-04-30
CA2732256A1 (en) 2010-03-04
JP2014056258A (en) 2014-03-27
US9502033B2 (en) 2016-11-22
DK2321821T3 (en) 2014-02-17
EP2321821A4 (en) 2012-11-28
CA2732256C (en) 2017-11-07
US20100057451A1 (en) 2010-03-04
US8504372B2 (en) 2013-08-06
US20140163974A1 (en) 2014-06-12
US8019608B2 (en) 2011-09-13
WO2010025441A2 (en) 2010-03-04
JP5588986B2 (en) 2014-09-10
US20120296645A1 (en) 2012-11-22
JP5883841B2 (en) 2016-03-15
US8249878B2 (en) 2012-08-21
US20150170647A1 (en) 2015-06-18

Similar Documents

Publication Publication Date Title
WO2010025441A3 (en) Distributed speech recognition using one way communication
EP3621068A4 (en) Portable smart voice interaction control device, method and system
WO2008083176A3 (en) Voice search-enabled mobile device
AU2017236021A1 (en) Transactional conversation - based computing system
EP3926623A4 (en) Speech recognition method and apparatus, and neural network training method and apparatus
WO2007140047A3 (en) Grammar adaptation through cooperative client and server based speech recognition
WO2007147042A3 (en) Voice-based multimodal speaker authentication using adaptive training and applications thereof
EP3282445A4 (en) Voice recognition method, voice wake-up device, voice recognition device and terminal
WO2014144395A3 (en) User training by intelligent digital assistant
WO2010005852A3 (en) Techniques for enhanced persistent scheduling with efficient link adaptation capability
WO2008118195A3 (en) System and method for a cooperative conversational voice user interface
WO2008147622A3 (en) Customizing haptic effects on an end user device
WO2015026933A3 (en) Devices and methods for interacting with an hvac controller
GB0712277D0 (en) Voice recognition device and method, and program
WO2006084144A3 (en) Methods and apparatus for automatically extending the voice-recognizer vocabulary of mobile communications devices
WO2007133716A3 (en) Multimodal communication and command control systems and related methods
WO2011068372A3 (en) Mobile device and control method thereof
WO2012155079A3 (en) Adaptive voice recognition systems and methods
WO2010144476A3 (en) Method and system for performing multi-stage virtual sim provisioning and setup on mobile devices
WO2012009619A3 (en) Hierarchical device type recognition, caching control and enhanced cdn communication in a wireless mobile network
MY179900A (en) Speech recognition method and speech recognition apparatus
WO2005022295A3 (en) Media center controller system and method
WO2008106431A3 (en) Technique for providing data objects prior to call establishment
WO2007030413A3 (en) Vxml browser control channel
WO2012135229A3 (en) Conversational dialog learning and correction

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 09810710

Country of ref document: EP

Kind code of ref document: A2

ENP Entry into the national phase

Ref document number: 2732256

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2011525266

Country of ref document: JP

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 2009810710

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

WWE WIPO information: entry into national phase

Ref document number: 2061/CHENP/2011

Country of ref document: IN