CA2588604A1 - A system and method for improving recognition accuracy in speech recognition applications - Google Patents


Info

Publication number
CA2588604A1
Authority
CA
Canada
Prior art keywords
voice command
store
argument
verb
interpretations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA002588604A
Other languages
French (fr)
Other versions
CA2588604C (en)
Inventor
Robert E. Shostak
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Stryker Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of CA2588604A1 publication Critical patent/CA2588604A1/en
Application granted granted Critical
Publication of CA2588604C publication Critical patent/CA2588604C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Abstract

A speech recognition system and method are provided to correctly distinguish among multiple interpretations of an utterance (202). The system is particularly useful when the set of possible interpretations is large, changes dynamically, and/or contains items that are not phonetically distinctive (203). The speech recognition system extends the capabilities of mobile wireless communication devices that are voice operated after their initial activation (206).
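The claims below describe the disambiguation step in detail: the recognizer's candidate interpretations are partitioned into "isomorphic sets" sharing a common verb, and each set is filtered against a per-user "inner circle" store of entities, with the whole set retained when no argument matches (claim 5). A minimal sketch of that logic, using hypothetical names (`best_fit`, `inner_circle`) not taken from the patent itself:

```python
from collections import defaultdict

def best_fit(interpretations, inner_circle):
    """Filter a recognizer's n-best list against a user's inner circle.

    interpretations: list of (verb, argument) tuples from the engine.
    inner_circle: set of entity names associated with the user.
    """
    # Partition into "isomorphic sets": groups with a common verb portion.
    by_verb = defaultdict(list)
    for verb, arg in interpretations:
        by_verb[verb].append((verb, arg))

    preferred = []
    for verb, group in by_verb.items():
        # Keep interpretations whose argument matches the inner circle;
        # if none match, keep the whole group unchanged (per claim 5).
        matching = [item for item in group if item[1] in inner_circle]
        preferred.extend(matching if matching else group)
    return preferred
```

For example, if the user's inner circle contains "Dr. Smith" but not the phonetically similar "Dr. Smyth", a "call" set containing both candidates collapses to the inner-circle match, while a set with no matches is passed through untouched.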

Claims (36)

1. A wireless communications system, comprising:
a controlling computer;
one or more wireless access points connected to the controlling computer by a network;
a badge that communicates using a wireless protocol with one of the wireless access points; and wherein the controlling computer further comprises a speech recognition system that receives a voice command from a particular user through the badge and interprets the voice command of the user to generate a set of voice command interpretations, the speech recognition system further comprising an inner circle mechanism having an inner circle store containing a list of entities associated with the particular user, the inner circle mechanism improving the accuracy of the set of voice command interpretations to generate a best fit voice command corresponding to the received voice command of the user.
2. The system of Claim 1, wherein the inner circle store further comprises one or more entities which are capable of being arguments to a voice command of the user, the entities further comprise a name of one or more of an individual person, a group of people, an organization and a group of organizations.
3. The system of Claim 1, wherein each voice command further comprises a verb portion and wherein the inner circle mechanism further comprises a partitioner that separates the set of the voice command interpretations into one or more isomorphic sets, each isomorphic set containing one or more voice command interpretations having a common verb portion and a filter that filters the one or more voice command interpretations in each isomorphic set to generate a preferred voice command interpretation for each isomorphic set.
4. The system of Claim 3, wherein the voice command further comprises an argument portion.
5. The system of Claim 4, wherein the filter further comprises a comparator that compares the argument of each voice command interpretation in each isomorphic set to the inner circle store and removes a particular voice command interpretation when the argument of the particular voice command interpretation does not match the inner circle store, unless none of the interpretations match the inner circle store, in which case none of the interpretations are removed.
6. The system of Claim 1, wherein the speech recognition system further comprises a say or spell mechanism that permits the user to spell a voice command, making it sufficiently phonetically distinct from other voice commands allowed in the grammar, in order to reduce the chance of an incorrect voice command interpretation.
7. The system of Claim 6, wherein each voice command further comprises a verb portion and an argument portion and wherein the say or spell mechanism further comprises a grammar store, the grammar store having for each verb of each voice command, a spelling of the verb of the voice command and for each argument of each voice command, a spelling of the argument so that the grammar store contains the combination of each spoken or spelled verb and each spoken or spelled argument for the voice commands.
8. The system of Claim 7, wherein the speech recognition system further comprises a mechanism that permits the user to spell a voice command verb and argument using the grammar store of the say or spell mechanism.
9. A wireless communications system, comprising:
a controlling computer;
one or more wireless access points connected to the controlling computer by a network;
a badge that communicates using a wireless protocol with one of the wireless access points; and wherein the controlling computer further comprises a speech recognition system that receives a voice command from a particular user through the badge and interprets the voice command of the user to generate a set of resulting voice command interpretations, the speech recognition system further comprising a say or spell mechanism that permits the user to spell a voice command, making it sufficiently phonetically distinct from other voice commands allowed in the grammar, in order to reduce the chance of an incorrect voice command interpretation.
10. The system of Claim 9, wherein each voice command further comprises a verb portion and an argument portion and wherein the say or spell mechanism further comprises a grammar store, the grammar store having for each verb of each voice command, a spelling of the verb of the voice command and for each argument of each voice command, a spelling of the argument so that the grammar store contains the combination of each verb and each argument for the voice commands.
11. The system of Claim 10, wherein the speech recognition system further comprises a spelling mechanism that permits the user to spell a voice command verb and argument using the grammar store of the say or spell mechanism.
12. A method for improving the accuracy of a computer-implemented speech recognition system, the method comprising:
obtaining a set of voice command interpretations from a speech recognition engine, each voice command interpretation having a verb portion, partitioning the set of voice command interpretations into one or more isomorphic sets, each isomorphic set containing one or more voice command interpretations having a common verb portion, filtering the one or more voice command interpretations in each isomorphic set to generate a preferred voice command interpretation; and outputting the preferred voice command interpretation.
13. The method of Claim 12, wherein the voice command interpretation further comprises an argument portion.
14. The method of Claim 13, wherein the filtering the one or more voice command interpretations in each isomorphic set further comprises comparing the argument of each voice command interpretation in each isomorphic set to a store of arguments and removing a particular voice command interpretation when the argument of the particular voice command interpretation does not match an entity in the store of arguments, unless none of the interpretations match the store of arguments, in which case none of the interpretations are removed.
15. The method of Claim 14, wherein the store of arguments further comprises an entity store for a particular user from which a voice command has been issued, the store of entities further comprises a name of one or more of an individual person, a group of people, an organization and a group of organizations.
16. The method of Claim 15 further comprising generating an entity store for each user.
17. The method of Claim 16, wherein generating the entity store further comprises one or more of manually setting an entity store for a user, setting an entity store of the user based on a buddy list of the user, setting an entity store of the user based on the department of the user and automatically setting an entity store for the user based on the arguments of prior voice commands issued by the user.
18. The method of Claim 16 further comprising removing an entity from the entity store for the user.
19. The method of Claim 18, wherein removing an entity from the entity store further comprises one or more of manually removing an entity from the entity store and automatically removing an entity from the entity store when the entity has not appeared as an argument for a voice command of the user for a predetermined time period.
20. The method of Claim 13, wherein obtaining the set of voice command interpretations further comprises spelling the verb portion of the voice command to distinguish similar voice command interpretations and generate a set of voice command interpretations and spelling the argument portion of the voice command to further distinguish similar voice command interpretations to generate a reduced set of voice command interpretations.
21. The method of Claim 20, wherein obtaining the set of voice command interpretations further comprises storing a grammar store wherein the grammar store further comprises, for each verb of each voice command, a spelling of the verb of the voice command and for each argument of each voice command, a spelling of the argument so that the grammar store contains the combination of each verb and each argument for the voice commands.
22. The method of Claim 21, wherein the spelling steps further comprise comparing a spelled voice command to the grammar store.
23. A method for improving the accuracy of a computer-implemented speech recognition system, the method comprising:
receiving a voice command from a user, the voice command having a verb portion and an argument portion;
spelling the verb portion of the voice command to distinguish similar voice command interpretations and generate a set of voice command interpretations; and spelling the argument portion of the voice command to further distinguish similar voice command interpretations to generate a reduced set of voice command interpretations.
24. The method of Claim 23 further comprising storing a grammar store wherein the grammar store further comprises, for each verb of each voice command, a spelling of the verb of the voice command and for each argument of each voice command, a spelling of the argument so that the grammar store contains the combination of each verb and each argument for the voice commands.
25. The method of Claim 24, wherein the spelling steps further comprise comparing a spelled voice command to the grammar store.
26. A computer implemented speech recognition system, comprising:
a speech recognition engine that generates a set of voice command interpretations based on a voice command of a user; and an inner circle mechanism, connected to the speech recognition engine, the inner circle mechanism having an inner circle store containing a list of entities associated with the user, the inner circle mechanism improving the accuracy of the set of voice command interpretations to generate a best fit voice command corresponding to the voice command of the user.
27. The system of Claim 26, wherein the inner circle store further comprises one or more entities which are arguments to a voice command of the user, the entities further comprise a name of one or more of an individual person, a group of people, an organization and a group of organizations.
28. The system of Claim 27, wherein each voice command further comprises a verb portion, the inner circle mechanism further comprises a partitioner that separates the set of the voice command interpretations into one or more isomorphic sets, each isomorphic set containing one or more voice command interpretations having a common verb portion and a filter that filters the one or more voice command interpretations in each isomorphic set to generate one or more preferred voice command interpretations for each isomorphic set.
29. The system of Claim 28, wherein the voice command further comprises an argument portion.
30. The system of Claim 29, wherein the filter further comprises a comparator that compares the argument of each voice command interpretation in each isomorphic set to the inner circle store and removes a particular voice command interpretation when the argument of the particular voice command interpretation does not match the entities in the inner circle store.
31. The system of Claim 26, wherein the speech recognition engine further comprises a say or spell mechanism that permits the user to spell a voice command, making it sufficiently phonetically distinct from other voice commands allowed in the grammar, in order to reduce the chance of incorrect voice command interpretations.
32. The system of Claim 31, wherein each voice command further comprises a verb portion and an argument portion and wherein the say or spell mechanism further comprises a grammar store, the grammar store having for each verb of each voice command, a spelling of the verb of the voice command and for each argument of each voice command, a spelling of the argument so that the grammar store contains the combination of each verb and each argument for the voice commands.
33. The system of Claim 32, wherein the speech recognition system further comprises a spelling mechanism that permits the user to spell a voice command verb and argument using the grammar store of the say or spell mechanism.
34. A computer implemented speech recognition system, comprising:
a speech recognition engine that generates a set of voice command interpretations based on a voice command of a user; and the speech recognition engine further comprises a say or spell mechanism that permits the user to spell a voice command, making it sufficiently phonetically distinct from other voice commands allowed in the grammar, in order to reduce the chance of incorrect voice command interpretations.
35. The system of Claim 34, wherein each voice command further comprises a verb portion and an argument portion and wherein the say or spell mechanism further comprises a grammar store, the grammar store having for each verb of each voice command, a spelling of the verb of the voice command and for each argument of each voice command, a spelling of the argument so that the grammar store contains the combination of each verb and each argument for the voice commands.
36. The system of Claim 35, wherein the speech recognition system further comprises a spelling mechanism that permits the user to spell a voice command verb and argument using the grammar store of the say or spell mechanism.
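The "say or spell" claims (7, 10, 21, 24, 32, 35) describe a grammar store that pairs every spoken or spelled verb with every spoken or spelled argument, so a spelled utterance like "c a l l  s m i t h" is phonetically distinct from similar commands. A minimal sketch of constructing such a grammar, with hypothetical names (`build_say_or_spell_grammar`) not drawn from the patent:

```python
def build_say_or_spell_grammar(verbs, arguments):
    """Enumerate every combination of spoken-or-spelled verb and
    spoken-or-spelled argument, as described in the say-or-spell claims."""
    def forms(word):
        # A token may be spoken as-is ("call") or spelled letter by
        # letter ("c a l l"); both forms enter the grammar.
        return [word.lower(), " ".join(word.lower())]

    grammar = set()
    for verb in verbs:
        for arg in arguments:
            for verb_form in forms(verb):
                for arg_form in forms(arg):
                    grammar.add(f"{verb_form} {arg_form}")
    return grammar
```

With one verb and one argument this yields four entries (spoken/spoken, spoken/spelled, spelled/spoken, spelled/spelled); a recognizer constrained to this grammar can then match a spelled command exactly, as claim 22 describes.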
CA2588604A 2004-11-30 2005-11-29 A system and method for improving recognition accuracy in speech recognition applications Active CA2588604C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US11/000,590 2004-11-30
US11/000,590 US7457751B2 (en) 2004-11-30 2004-11-30 System and method for improving recognition accuracy in speech recognition applications
PCT/US2005/043235 WO2006060443A2 (en) 2004-11-30 2005-11-29 A system and method for improving recognition accuracy in speech recognition applications

Publications (2)

Publication Number Publication Date
CA2588604A1 (en) 2006-06-08
CA2588604C CA2588604C (en) 2012-07-17

Family

ID=36565667

Family Applications (1)

Application Number Title Priority Date Filing Date
CA2588604A Active CA2588604C (en) 2004-11-30 2005-11-29 A system and method for improving recognition accuracy in speech recognition applications

Country Status (4)

Country Link
US (2) US7457751B2 (en)
EP (1) EP1820182A4 (en)
CA (1) CA2588604C (en)
WO (1) WO2006060443A2 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8498865B1 (en) 2004-11-30 2013-07-30 Vocera Communications, Inc. Speech recognition system and method using group call statistics
US20080084312A1 (en) * 2006-10-10 2008-04-10 Daily Michael A Radio frequency identification layered foam tag
US8589162B2 (en) * 2007-09-19 2013-11-19 Nuance Communications, Inc. Method, system and computer program for enhanced speech recognition of digits input strings
US8224656B2 (en) * 2008-03-14 2012-07-17 Microsoft Corporation Speech recognition disambiguation on mobile devices
US9280971B2 (en) 2009-02-27 2016-03-08 Blackberry Limited Mobile wireless communications device with speech to text conversion and related methods
ATE544291T1 (en) 2009-02-27 2012-02-15 Research In Motion Ltd MOBILE RADIO COMMUNICATION DEVICE WITH SPEECH TO TEXT CONVERSION AND ASSOCIATED METHODS
US9183834B2 (en) * 2009-07-22 2015-11-10 Cisco Technology, Inc. Speech recognition tuning tool
US10097880B2 (en) * 2009-09-14 2018-10-09 Tivo Solutions Inc. Multifunction multimedia device
US8447261B2 (en) 2009-10-27 2013-05-21 At&T Mobility Ii Llc Integrating multimedia and voicemail
US20110137976A1 (en) * 2009-12-04 2011-06-09 Bob Poniatowski Multifunction Multimedia Device
US8682145B2 (en) 2009-12-04 2014-03-25 Tivo Inc. Recording system based on multimedia content fingerprints
US8626511B2 (en) * 2010-01-22 2014-01-07 Google Inc. Multi-dimensional disambiguation of voice commands
US20110282942A1 (en) * 2010-05-13 2011-11-17 Tiny Prints, Inc. Social networking system and method for an online stationery or greeting card service
WO2013071305A2 (en) * 2011-11-10 2013-05-16 Inventime Usa, Inc. Systems and methods for manipulating data using natural language commands
US9317605B1 (en) 2012-03-21 2016-04-19 Google Inc. Presenting forked auto-completions
US9646606B2 (en) 2013-07-03 2017-05-09 Google Inc. Speech recognition using domain knowledge
US9466296B2 (en) 2013-12-16 2016-10-11 Intel Corporation Initiation of action upon recognition of a partial voice command
US20180358004A1 (en) * 2017-06-07 2018-12-13 Lenovo (Singapore) Pte. Ltd. Apparatus, method, and program product for spelling words
US11798544B2 (en) * 2017-08-07 2023-10-24 Polycom, Llc Replying to a spoken command
US10957445B2 (en) 2017-10-05 2021-03-23 Hill-Rom Services, Inc. Caregiver and staff information system

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5067149A (en) 1987-10-05 1991-11-19 Ambassador College Telephone line communications control system with dynamic call streaming
US5267305A (en) 1987-10-05 1993-11-30 Ambassador College Transparent inband signaling
DE69230905T2 (en) 1991-06-28 2000-08-17 Ericsson Telefon Ab L M USER MODULES IN TELECOMMUNICATION SYSTEMS
US5515426A (en) * 1994-02-28 1996-05-07 Executone Information Systems, Inc. Telephone communication system having a locator
US5596634A (en) * 1994-12-13 1997-01-21 At&T Telecommunications system for dynamically selecting conversation topics having an automatic call-back feature
US6381341B1 (en) * 1996-05-16 2002-04-30 Digimarc Corporation Watermark encoding method exploiting biases inherent in original signal
US5987408A (en) * 1996-12-16 1999-11-16 Nortel Networks Corporation Automated directory assistance system utilizing a heuristics model for predicting the most likely requested number
US6085976A (en) * 1998-05-22 2000-07-11 Sehr; Richard P. Travel system and methods utilizing multi-application passenger cards
US5987410A (en) * 1997-11-10 1999-11-16 U.S. Philips Corporation Method and device for recognizing speech in a spelling mode including word qualifiers
US6480597B1 (en) 1998-06-12 2002-11-12 Mci Communications Corporation Switch controller for a telecommunications network
CA2361726A1 (en) * 1999-02-04 2000-08-10 Apion Telecoms Limited A telecommunications gateway
DE19944608A1 (en) * 1999-09-17 2001-03-22 Philips Corp Intellectual Pty Recognition of spoken speech input in spelled form
US6761637B2 (en) * 2000-02-22 2004-07-13 Creative Kingdoms, Llc Method of game play using RFID tracking device
US6711414B1 (en) * 2000-02-25 2004-03-23 Charmed Technology, Inc. Wearable computing device capable of responding intelligently to surroundings
US6937986B2 (en) * 2000-12-28 2005-08-30 Comverse, Inc. Automatic dynamic speech recognition vocabulary based on external sources of information
US6728681B2 (en) * 2001-01-05 2004-04-27 Charles L. Whitham Interactive multimedia book
US6731737B2 (en) * 2001-02-27 2004-05-04 International Business Machines Corporation Directory assistance system
US6901255B2 (en) * 2001-09-05 2005-05-31 Vocera Communications Inc. Voice-controlled wireless communications system and method
US7172113B2 (en) * 2002-09-16 2007-02-06 Avery Dennison Corporation System and method for creating a display card
US20040162724A1 (en) 2003-02-11 2004-08-19 Jeffrey Hill Management of conversations
US7606714B2 (en) 2003-02-11 2009-10-20 Microsoft Corporation Natural language classification within an automated response system
US7302392B1 (en) 2003-10-07 2007-11-27 Sprint Spectrum L.P. Voice browser with weighting of browser-level grammar to enhance usability
US20050089150A1 (en) 2003-10-28 2005-04-28 Birkhead Mark W. Voice enabled interactive drug and medical information system
US7319386B2 (en) * 2004-08-02 2008-01-15 Hill-Rom Services, Inc. Configurable system for alerting caregivers

Also Published As

Publication number Publication date
US20090043587A1 (en) 2009-02-12
EP1820182A4 (en) 2009-12-09
US7457751B2 (en) 2008-11-25
EP1820182A2 (en) 2007-08-22
US8175887B2 (en) 2012-05-08
CA2588604C (en) 2012-07-17
WO2006060443A3 (en) 2007-05-10
WO2006060443A2 (en) 2006-06-08
US20060116885A1 (en) 2006-06-01

Similar Documents

Publication Publication Date Title
CA2588604A1 (en) A system and method for improving recognition accuracy in speech recognition applications
DE102017121086B4 (en) INTERACTIVE VOICE ACTIVATED DEVICES
WO2007140047A3 (en) Grammar adaptation through cooperative client and server based speech recognition
DE102019111529A1 (en) AUTOMATED LANGUAGE IDENTIFICATION USING A DYNAMICALLY ADJUSTABLE TIME-OUT
WO2004023455A3 (en) Methods, systems, and programming for performing speech recognition
US9817809B2 (en) System and method for treating homonyms in a speech recognition system
CN106502649A (en) A kind of robot service awakening method and device
WO2002054033A3 (en) Hierarchical language models for speech recognition
CN103903627A (en) Voice-data transmission method and device
WO2004072926A3 (en) Management of conversations
EP1435605A3 (en) Method and apparatus for speech recognition
WO2007118100A3 (en) Automatic language model update
WO2004090866A3 (en) Phonetically based speech recognition system and method
WO2008083173A3 (en) Local storage and use of search results for voice-enabled mobile communications devices
EP1211873A3 (en) Advanced voice recognition phone interface for in-vehicle speech recognition applications
EP1933301A3 (en) Speech recognition method and system with intelligent speaker identification and adaptation
CN109725523A (en) A kind of method and wrist-watch of smartwatch speech recognition controlled
CN106601250A (en) Speech control method and device and equipment
CN105047196B (en) Speech artefacts compensation system and method in speech recognition system
EP1352390B1 (en) Automatic dialog system with database language model
WO2019075829A1 (en) Voice translation method and apparatus, and translation device
CN111179903A (en) Voice recognition method and device, storage medium and electric appliance
CN107767860A (en) A kind of voice information processing method and device
WO2002103675A8 (en) Client-server based distributed speech recognition system architecture
CN107886940B (en) Voice translation processing method and device

Legal Events

Date Code Title Description
EEER Examination request