WO1995006309A1 - Voice operated remote control system - Google Patents

Publication number
WO1995006309A1
WO1995006309A1 PCT/US1994/009544 US9409544W WO9506309A1 WO 1995006309 A1 WO1995006309 A1 WO 1995006309A1 US 9409544 W US9409544 W US 9409544W WO 9506309 A1 WO9506309 A1 WO 9506309A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
signal
remote control
accordance
analog
Application number
PCT/US1994/009544
Other languages
French (fr)
Inventor
George H. Fischer
Mark P. Fortunato
Original Assignee
Voice Powered Technology International, Inc.
Application filed by Voice Powered Technology International, Inc.
Priority to AU76371/94A
Publication of WO1995006309A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C23/00 Non-electrical signal transmission systems, e.g. optical systems
    • G08C23/02 Non-electrical signal transmission systems, e.g. optical systems using infrasonic, sonic or ultrasonic waves
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B11/00 Transmission systems employing sonic, ultrasonic or infrasonic waves
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/06 Receivers
    • H04B1/16 Circuits
    • H04B1/20 Circuits for coupling gramophone pick-up, recorder output, or microphone to receiver
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • the present invention relates to systems for controlling a device by using voice recognition of spoken commands, and more particularly to systems of that type in which the commands are communicated to the controlled device by wireless transmission.
  • the wireless remote controls transmit signals to the controlled appliance or other device in response to the pressing of appropriate buttons by a user.
  • the controlled devices receive such signals, decode them and perform a function based on the specific signal received.
  • the transmitted signals are usually digital in nature, and typically are transmitted using infrared light transmission. Alternatively, radio frequency transmission is used in some systems.
  • Co-pending application Serial No. 07/915,112 describes a system utilizing a voice operated, hand-held, battery powered portable remote control device for controlling remote electronic devices by the transmission of infrared control signals.
  • the voice operated remote control device accepts voice commands spoken by the user, performs voice recognition pattern matching on the spoken word by comparing the same against pretrained templates to determine the appropriate corresponding command, determines the specific infrared remote control code or set of codes that represent the function corresponding to the command, and transmits the infrared remote control codes to the electronic device to be controlled.
  • the voice operated remote control device can be operated by multiple users, either by voice or by manual control of the selected function on the remote control device.
  • the voice commands can represent either real time commands, which are transmitted immediately after the determination of the voice command, or programmed delay events which delay the transmission of the remote control codes or set of remote control codes for a preset period of time specified by the user.
  • Co-pending application Serial No. 07/915,938 describes a method and apparatus for improving the quality of voice templates for speaker dependent voice recognition devices, through the use of two word templates for each word in a vocabulary and qualification of the word templates.
  • the apparatus takes the result of the comparison and performs an evaluation to determine the relative quality of the word templates. If word template quality is deemed to be less than an acceptable level, the apparatus requests the user to retrain the specific word in order to improve the recognition accuracy rate of the apparatus. Both resulting templates are later used during voice recognition.
  • the system includes a noise rejection system using noise templates during a pattern matching process to significantly reduce the required processing power over commonly used digital noise filters.
  • the system further includes means for correcting voice recognition errors through use of a voice command. Correcting errors by means of voice commands enables the user to correct errors while speaking into the apparatus, without interrupting the flow of the voice process. The fact that an error occurred on a particular word within a particular word group is used to assist in replacing the word template.
  • Co-pending application Serial No. 07/915,114 describes a method and apparatus whereby a remote control device executes system control functions in response to a single user command request.
  • the apparatus self-configures based on the configuration of the equipment to be used.
  • System control functions, which can consist of one or more controls from a single remote control or from multiple remote controls, are controlled by transmitting a string or sequence of commands from the apparatus to single or to multiple pieces of equipment for control of that equipment.
  • the system control functions are selected by the user by activation of a single key representing the specific function.
  • the apparatus self-configures, determining what system functions are applicable based on the equipment which the user has and based on the functions learned from the user's
  • the approach exemplified by the three co-pending applications identified above allows the use of voice recognition technology with controlled devices that were not designed for such technology.
  • the voice remote control mimics the traditional remote control in the way in which it communicates with the controlled device.
  • this requires the addition of a significantly higher level of technology in the remote control device itself, much of which is already present in the controlled device.
  • virtually all television sets and VCRs utilize 8-bit microcontrollers with on-board ROM and RAM memory capacity.
  • Such devices often include some type of display. Many newer television sets have on-screen displays. All such devices have power supplies for powering the electronics.
  • a voice activated remote control must duplicate all such components.
  • In order to mimic the digital remote control signals, such voice activated remote control devices must either be able to "learn" the control signals or have a substantial library of existing codes built into the ROM thereof.
  • the remote control device would only have to transmit the voice signal without the need for the sophisticated voice activated remote controls.
  • the remote control device would only have to convert the voice signal from the microphone into a format that can be transmitted. The result would be a relatively simple remote control device in which multiple buttons and the microcontroller used to interpret button operation would not be needed.
  • the controlled device would desirably include the voice recognition apparatus, so as to utilize the existing microprocessing and memory capacities of the controlled device with only minor additions thereto and so as to greatly simplify the remote control device.
  • the present invention provides a voice operated remote control system in which a remote control device responds to voice commands from the user to transmit representations of the voice commands to a controlled device, in wireless fashion.
  • the controlled device responds to the transmitted representations by undergoing a voice recognition process in an attempt to recognize the voice commands of the user. Recognized voice commands are used to produce action routines for specific functions to be performed by the controlled device.
  • voice recognition apparatus within the controlled device enables the remote control device to be of simple and inexpensive design.
  • the remote control device need only include apparatus such as a microphone for converting the voice commands into analog audio signals and apparatus for modulating or otherwise processing such signals for transmission to the controlled device.
  • an audio wireless remote control device includes a microphone for converting voice commands into analog audio signals. Following amplification of such signals by an audio amplifier, the signals are applied to a voltage controlled oscillator or other appropriate circuit for frequency modulation, prior to application to a driver and an infrared transmitter.
  • the infrared transmitter transmits the frequency modulated audio signal to the remotely located controlled device, in wireless fashion.
  • the remote control device is battery powered and includes a push-to-talk switch for conserving battery power by turning on the circuitry only when the user desires to enter a voice command into the remote control device.
  • An audio receiver located within the controlled device includes an infrared sensor for sensing the transmitted signals from the remote control device.
  • the sensed signals are amplified and filtered prior to application to a phase-locked loop receiver for demodulation of the frequency modulated signal.
  • the demodulated output of the phase-locked loop receiver is filtered and amplified to provide a voice signal.
  • the phase-locked loop receiver also provides a lock detect output to indicate when a transmitted signal has been received.
  • the voice signals from the audio receiver are applied to voice recognition apparatus within the controlled device.
  • the voice recognition apparatus comprises a microcontroller which may include an analog-to-digital converter, together with a microprocessor and ROM and RAM memories.
  • the analog-to-digital converter converts each voice signal produced by the audio receiver into an incoming digital voice signal.
  • the memories include a reference memory for storing a plurality of reference digital voice templates, and a program memory for storing a control program.
  • the microprocessor generates an incoming digital voice template from the incoming digital voice signal, executes the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates, and determines what action to take based on the incoming digital voice template.
  • the controlled device may comprise any appliance or other device capable of being electronically controlled.
  • An example of such a device is a television set having functions such as on/off, channel selection, volume and mute which are controlled in response to voice commands from the user entered at the remote control device.
  • the lock detect signal from the phase-locked loop receiver within the audio receiver may be used with such device to determine when a voice command is no longer being transmitted from the remote control device.
  • Fig. 1 is a basic block diagram of a voice operated remote control system in accordance with the invention;
  • Fig. 2 is a detailed block diagram of the remote control device of Fig. 1;
  • Fig. 3 is a detailed block diagram of the audio receiver of Fig. 1;
  • Fig. 4 is a detailed block diagram of the controlled device of Fig. 1;
  • Fig. 5 is a basic flow chart illustrating the operation of the system of Fig. 1;
  • Fig. 6 is a detailed flow chart illustrating the operation of the system of Fig. 1 in connection with the television set shown in Fig. 4.
  • Detailed Description
  • Fig. 1 shows a voice operated remote control system 10 in accordance with the invention.
  • the remote control system 10 of Fig. 1 includes a remote control device 12 for transmitting voice signals to a remotely located controlled device 14 in response to voice commands.
  • the remote control device 12 includes a microphone 16 for receiving voice commands from the user in the form of spoken words.
  • the microphone 16 converts the spoken words into analog audio signals for application to an audio wireless remote control circuit 18.
  • the audio wireless remote control circuit 18 processes the analog audio signals from the microphone 16 into an appropriate form for transmission to the controlled device 14, using an infrared transmitter 20.
  • the audio wireless remote control circuit 18 includes circuitry for frequency modulating (FM) the analog audio signal prior to application to a transmitting device in the form of the infrared transmitter 20.
  • the infrared transmitter 20 transmits the voice commands from the user to the controlled device 14 in wireless fashion.
  • While the voice operated remote control system 10 is described herein in connection with frequency modulation and demodulation of the voice commands and transmission in the form of an infrared signal, it should be understood that other signal processing and transmission techniques can be used.
  • For example, forms of modulation other than frequency modulation, such as amplitude modulation (AM), can be used.
  • modulated or otherwise modified signals can be transmitted in other than infrared signal form, such as in the form of a radio frequency signal.
  • the remote control device 12 comprises a small, portable, hand-held unit into which the user speaks the voice commands.
  • the remote control device 12 is remotely located relative to the controlled device 14, but nevertheless is close enough to the controlled device 14 so that the voice signals transmitted by the remote control device 12 are received by the controlled device 14.
  • the controlled device 14 comprises a television set, in the present example.
  • the controlled device 14 can comprise any appliance or device capable of being controlled electronically.
  • the controlled device may comprise a video cassette recorder (VCR) in which the spoken commands dictate the date, time and channel selection for future recording by the VCR, or simple commands such as play or rewind.
  • the voice signals transmitted by the remote control device 12 are received by the controlled device 14.
  • the controlled device 14 includes an infrared sensor 22 for sensing the infrared signals transmitted by the infrared transmitter 20.
  • the sensed signals are processed by an audio receiver 24 which demodulates the transmitted signal as well as amplifying and filtering such signal to provide to a voice recognition circuit 26 the voice signal represented thereby.
  • the voice recognition circuit 26 performs a voice recognition process in which the voice signal is compared with pre-stored representations of different words of a vocabulary. If substantial similarity is found, then an action routine corresponding to the pre-stored word found to be substantially similar to the voice signal is provided to the controlled device 14 by the voice recognition circuit 26, so that a desired function is performed within the controlled device 14.
  • A detailed example of the remote control device 12 of Fig. 1 is shown in Fig. 2.
  • the microphone 16 responds to a word spoken by the user by producing an analog audio signal. Such signal is amplified in an audio amplifier 28.
  • the audio amplifier 28 conditions the signal for proper application to a voltage controlled oscillator (VCO) 30 to frequency modulate (FM) a signal at the output of the VCO 30 in accordance with the analog audio signal from the amplifier 28.
  • the output of the VCO 30 is conditioned by a driver circuit 32 prior to driving the infrared transmitter 20.
  • the remote control device 12 is powered by batteries 34. Because there is no memory or microprocessor in the remote control device 12, there is no need to keep the remote control device 12 powered when not in use. Accordingly, a push-to-talk switch 36 is used to provide the power from the batteries 34 only when the user wishes to speak into the remote control device 12. Pushing a button on the outside of the remote control device 12 closes the push-to-talk switch 36 to couple the batteries 34 to a voltage regulator 38.
  • the voltage regulator 38, which is coupled between a common ground line 40 and the push-to-talk switch 36, has an output terminal 42 thereof coupled to provide a supply voltage (+V) at a terminal 44. This regulated supply voltage is applied to the microphone 16, the audio amplifier 28 and the voltage controlled oscillator 30.
  • the batteries 34 are directly coupled to the driver 32 by the push-to-talk switch 36.
  • the regulated supply voltage +V optimizes the operating conditions of the voltage controlled oscillator 30.
  • the user pushes the button to close the push-to-talk switch 36 and speaks the voice command into the microphone 16.
  • the resulting analog audio signal is amplified by the amplifier 28 and applied to the VCO 30 to frequency modulate the output of the VCO 30.
  • This FM signal is applied via the driver 32 to the infrared transmitter 20, for transmission as an infrared signal to the controlled device 14.
  • the push-to-talk switch 36 is open, and the batteries 34 are not coupled to the voltage regulator 38 or to the driver 32.
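The frequency modulation performed by the VCO 30 can be illustrated numerically. The following is a minimal sketch of an ideal voltage controlled oscillator, not the circuit of Fig. 2. The 40 kHz carrier, 5 kHz peak deviation and 400 kHz simulation rate are assumptions chosen for illustration; the source gives no such values, and the names `vco_fm` and `count_zero_crossings` are hypothetical.

```python
import math


def vco_fm(audio, carrier_hz=40_000.0, deviation_hz=5_000.0, sample_rate=400_000.0):
    """Frequency modulate a carrier with audio samples in [-1.0, 1.0].

    Models an ideal voltage controlled oscillator: the instantaneous
    frequency is carrier_hz + deviation_hz * sample, and the output is
    the cosine of the accumulated phase. All default values are assumed.
    """
    phase = 0.0
    out = []
    for sample in audio:
        inst_freq = carrier_hz + deviation_hz * sample
        phase += 2.0 * math.pi * inst_freq / sample_rate
        out.append(math.cos(phase))
    return out


def count_zero_crossings(signal):
    """Count sign changes, a rough proxy for the frequency of the signal."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))


# A constant positive input raises the output frequency; a negative one lowers it.
high = count_zero_crossings(vco_fm([1.0] * 4000))
low = count_zero_crossings(vco_fm([-1.0] * 4000))
```

With these assumed values, a full-scale positive input yields more zero crossings over the same interval than a full-scale negative input, which is the property the receiver relies on to recover the audio.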
  • the remote control device 12 transmits a frequency modulated infrared signal from the infrared transmitter 20.
  • the infrared transmitter 20 may comprise an infrared diode or other appropriate infrared transmitting device.
  • Such infrared signals are sensed by the infrared sensor 22 and applied to the audio receiver 24.
  • a detailed example of the audio receiver 24 is shown in Fig. 3.
  • the infrared sensor 22, which may comprise any appropriate form of infrared sensor such as those typically used in remote control devices, provides the received signal for amplification by an amplifier 46 and filtering by a filter 48.
  • the output of the filter 48 is applied to a phase-locked loop receiver 50 which demodulates the frequency modulated signal by converting it back to a voice signal.
  • the phase-locked loop receiver 50 is of conventional phase-locked loop (PLL) configuration, and includes a phase detector 52 coupled to the filter 48 and having an output and a second input coupled in a loop which includes a loop filter 54 and a voltage controlled oscillator (VCO) 56.
  • the output of the phase-locked loop receiver 50 is filtered by an audio filter 58 and amplified by an audio amplifier 60 to provide the demodulated voice signal.
  • the phase-locked loop receiver 50 also provides a lock detect signal, which indicates when a transmitted signal is present in the audio receiver 24.
  • the phase-locked loop receiver 50 is described herein by way of example only, and it should be understood that other demodulation receivers can be used as desired. For example, a super-heterodyne receiver can be used in applications posing more stringent requirements.
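Since other demodulation receivers can be used, the recovery of the voice signal and of a lock indication can be sketched with a simpler technique than a phase-locked loop. The example below swaps in a zero-crossing frequency discriminator; it is not a PLL and is not the receiver 50. The carrier, deviation, sample rate, window length and the rule used for the lock flag are all assumptions, and the function names are hypothetical.

```python
import math


def tone(freq_hz, n_samples, sample_rate):
    """Generate a fixed-frequency test signal (stand-in for the sensed signal)."""
    return [math.cos(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def estimate_frequency(chunk, sample_rate):
    """Estimate frequency in Hz from the number of zero crossings in the chunk."""
    crossings = sum(1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0))
    duration = (len(chunk) - 1) / sample_rate
    return crossings / (2.0 * duration)


def demodulate(signal, sample_rate=400_000.0, carrier_hz=40_000.0,
               deviation_hz=5_000.0, window=400):
    """Return one (audio_value, lock) pair per window of samples.

    audio_value is the frequency offset from the carrier scaled to roughly
    [-1, 1]; lock is True when the estimate falls near the expected band,
    a crude analogue of a lock detect signal.
    """
    results = []
    for start in range(0, len(signal) - window + 1, window):
        freq = estimate_frequency(signal[start:start + window], sample_rate)
        lock = abs(freq - carrier_hz) <= 2.0 * deviation_hz
        audio = (freq - carrier_hz) / deviation_hz if lock else 0.0
        results.append((audio, lock))
    return results


up = demodulate(tone(45_000.0, 400, 400_000.0))
down = demodulate(tone(35_000.0, 400, 400_000.0))
silent = demodulate([0.0] * 400)
```

The resolution of such a discriminator is limited by the window length, which is one reason a phase-locked loop or super-heterodyne receiver would be preferred in practice.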
  • Fig. 4 shows the voice recognition circuit 26 of Fig. 1 in greater detail in conjunction with a television set 62.
  • the television set 62 comprises the appliance or device being controlled by the voice commands from the remote control device 12.
  • the television set 62 is of conventional configuration and includes a cathode ray tube (CRT) 64 for providing the TV screen.
  • the CRT 64 is driven by a tuner 66 through a video circuit 68.
  • An on-screen display circuit 70, which is coupled to the video circuit 68, provides additional screen displays such as menus of the type which invite the viewer to make selections by speaking further voice commands into the remote control device 12.
  • the on-screen display circuit 70 is used to provide on-screen displays requesting the user of the remote control device 12 to enter further voice commands.
  • the television set 62 also includes an audio circuit 72 which is coupled to speakers 74 for the television set 62.
  • the voice recognition circuit 26 comprises a microcontroller which includes an analog-to-digital converter (A/D) 76 coupled to receive the voice signal from the audio receiver 24.
  • the microcontroller of the voice recognition circuit 26 also includes a microprocessor (MP) 78 coupled to the A/D converter 76, and having access to both a read only memory (ROM) 80 and a random access memory (RAM) 82.
  • the microcontroller comprising the voice recognition circuit 26 is coupled to a digital IR input 84 and to an external DRAM 86.
  • the remaining components shown in Fig. 4, including those within the microcontroller of the voice recognition circuit 26, are standard components found in most modern television sets.
  • the external DRAM 86 is utilized for voice recognition storage requirements; primarily storage of voice templates. Therefore, the conventional television set 62 is easily modified by addition of the audio receiver 24 and the external DRAM 86 in order to make it responsive to voice commands transmitted by the remote control device 12.
  • the microcontroller comprising the voice recognition circuit 26 is part of the existing circuitry within the conventional television set 62.
  • voice operated remote control systems 10 in accordance with the invention do not require the addition of a microprocessor but only the modification of an existing one within the controlled device 14, principally through the addition of the external DRAM 86.
  • the microcontroller comprising the voice recognition circuit 26 operates in the same manner as described in the previously referred to co-pending applications, Serial Nos. 07/915,112, 07/915,938 and
  • the A/D converter 76 of Fig. 4 may comprise an 8-bit converter which samples incoming data at a preassigned frequency such as 9.6 kHz.
  • the A/D converter 76 outputs a digital signal representing the input analog voice signal from the audio receiver 24.
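The example figures of 8 bits and 9.6 kHz imply a data rate of 9,600 bytes per second of speech. The sketch below shows that arithmetic together with a simple uniform quantizer. The offset-binary mapping of the range [-1.0, 1.0] onto codes 0 to 255 is an assumption, since the source does not describe the converter's coding, and both function names are hypothetical.

```python
def quantize_8bit(samples):
    """Map analog samples in [-1.0, 1.0] to unsigned 8-bit codes 0..255.

    Out-of-range inputs are clipped. The offset-binary mapping is an
    assumption made for illustration.
    """
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))
        codes.append(min(255, int((x + 1.0) * 128.0)))
    return codes


def bytes_for_utterance(seconds, sample_rate_hz=9600, bits_per_sample=8):
    """Storage needed for raw samples of an utterance of the given length."""
    return int(seconds * sample_rate_hz) * bits_per_sample // 8


codes = quantize_8bit([-1.0, 0.0, 1.0])
one_second = bytes_for_utterance(1.0)
```

Raw samples at this rate would consume memory quickly, which is consistent with the description's use of compact voice templates rather than stored waveforms.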
  • the microprocessor 78 processes the digital voice signal together with a voice recognition software routine forming part of a control program stored in the ROM 80.
  • the digital voice signal is converted into an incoming voice template that is compared against previously stored voice templates of the user's voice, stored in the external DRAM 86.
  • the program decodes the voice templates.
  • the RAM 82 comprises a reference memory for temporary storage of data.
  • the analog voice signal from the audio receiver 24 is applied to the A/D converter 76 for conversion into an incoming digital voice signal.
  • the reference memory, comprised of the external DRAM 86 in conjunction with the RAM 82, stores a plurality of reference digital voice templates.
  • the ROM 80 stores the control program.
  • the microprocessor 78, which is coupled to the A/D converter 76, the ROM 80 and the RAM 82, generates an incoming digital voice template from the incoming digital voice signal at the output of the A/D converter 76.
  • the microprocessor 78 then executes the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates, stored in the reference memory comprised of the RAM 82 and the external DRAM 86.
  • the microprocessor 78 determines what action to take corresponding to a reference digital voice template, if the incoming digital voice template is found to have substantial similarity to the reference digital voice template.
  • the microcontroller causes the appropriate components of the television set to perform the desired function within the controlled device 14.
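The comparison of an incoming template against the stored reference templates can be sketched as a nearest-match search with a rejection threshold. This is a minimal illustration, not the matching method of the co-pending applications, which is not reproduced in this text. The representation of a template as a short list of numbers, the mean absolute difference metric, the threshold value and all names below are assumptions.

```python
def template_distance(incoming, reference):
    """Mean absolute difference between two equal-length feature vectors."""
    if len(incoming) != len(reference) or not incoming:
        raise ValueError("templates must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(incoming, reference)) / len(incoming)


def recognize(incoming, references, threshold):
    """Return the word whose reference template is closest, or None if no
    reference lies within `threshold` (no substantial equivalent found)."""
    best_word, best_distance = None, None
    for word, reference in references.items():
        d = template_distance(incoming, reference)
        if best_distance is None or d < best_distance:
            best_word, best_distance = word, d
    if best_distance is not None and best_distance <= threshold:
        return best_word
    return None


# Hypothetical three-element templates and action routine descriptions.
REFERENCES = {"Mute": [0, 10, 0], "Off": [10, 0, 10]}
ACTIONS = {"Mute": "toggle audio", "Off": "turn set off"}

word = recognize([1, 9, 0], REFERENCES, threshold=2.0)
action = ACTIONS.get(word)
```

Returning `None` when nothing is close enough matters as much as finding the best match, because an unrecognized utterance should produce no action.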
  • Voice control is made possible, in the voice operated remote control system 10, by first voice training the collection of reference digital voice templates in accordance with the user's voice. Such templates are collected in the same manner as described in co-pending application Serial No. 07/915,112.
  • the screen of the television set 62 (CRT 64) is used to provide an on-screen display which prompts the user by requesting the needed words. The user responds by pressing the button to close the push-to-talk switch 36 and speaking the prompted word into the microphone 16 of the remote control device 12.
  • the basic flow chart of Fig. 5 illustrates the manner in which the controlled device 14 operates in conjunction with the remote control device 12.
  • the controlled device 14 is in an idle mode 88, waiting for commands.
  • the lock detect signal causes an exit from the idle mode 88, with a visual cue to be presented to the user.
  • the absence (No) of a lock detect 90 causes the system to return to the idle mode 88.
  • visual feedback 92 is provided the user. This comes in the form of an on-screen display.
  • the user responds by speaking the desired command or string of commands, and voice recognition 94 thereof produces selected action 96 within the controlled device 14 through execution of an action routine by the microprocessor 78. Afterward, the idle mode 88 is returned to.
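The loop of Fig. 5 can be sketched as a single function that is called once per pass. This is an illustrative outline under stated assumptions: the lock detect decision is reduced to a boolean, recognition is supplied as a callable, and the name `run_cycle` and its parameters are hypothetical.

```python
def run_cycle(lock_detected, recognize, actions, display):
    """One pass through the idle loop; returns the recognized word or None.

    lock_detected: bool standing in for the lock detect 90 decision.
    recognize: callable returning a recognized word or None (recognition 94).
    actions: dict mapping words to zero-argument action routines (action 96).
    display: callable that shows a visual cue to the user (feedback 92).
    """
    if not lock_detected:
        return None  # no carrier: remain in idle mode 88
    display("listening")
    word = recognize()
    routine = actions.get(word)
    if routine is None:
        return None  # nothing recognized: fall back to idle
    routine()
    return word


shown, done = [], []
actions = {"Mute": lambda: done.append("Mute")}
idle_result = run_cycle(False, lambda: "Mute", actions, shown.append)
active_result = run_cycle(True, lambda: "Mute", actions, shown.append)
```

Passing the recognizer and display in as callables keeps the control flow separate from the hardware, so the same outline applies to any controlled device.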
  • Fig. 6 comprises a detailed flow chart illustrating the operation of the circuit of Fig. 4 in conjunction with the remote control device 12.
  • the starting point in the flow chart of Fig. 6 is an idle mode 98.
  • the system always returns to the idle mode 98.
  • the idle mode 98 is exited and a check is made to determine if the television set 62 is turned on ("TV on" block 102). If the television set 62 is not turned on (No), then the set is turned on ("turn on" block 104).
  • the system proceeds to an initial visual cue ("cue" block 106).
  • This feature allows the user to turn on the television set 62 without using a voice command or a separate button therefor on the remote control device 12.
  • the visual cue 106 can simply be a symbol indicative of speech. Because the controlled device 14 is designed with voice recognition in mind, it is also possible for the volume of the television set 62 to automatically be decreased to a moderate level to facilitate better voice recognition, at this point. Once the visual cue 106 is displayed, the user knows that the television set 62 is "listening".
  • the user has a choice of turning the television set 62 off (“Off”), changing the channel (“Channel”), adjusting the volume (“Volume”), or muting the audio (“Mute”), as represented by a function block 108 in Fig. 6.
  • the on-screen display shows the word recognized or an appropriate symbol indicative of the word or the action denoted thereby.
  • the television set 62 is turned off ("turn off" block 110) and the idle mode 98 is returned to.
  • the word “Channel” is displayed on the television screen and a second group of words is possible. This includes the possibility of either “Up” or “Down” commands, or entry of the digits 0 through 9. “Up” or “Down” commands result in changing the channel by one channel and returning to the same choices. This allows the user to single-step through the channels by speaking the words “Up” or “Down” several times. If a channel number is spoken and recognized, then the user can either release the push-to-talk switch 36, which causes the channel to change to the appropriate single-digit channel, or the user can speak another digit causing the channel to be changed to the appropriate 2-digit channel.
  • a block 112 which represents the choices of entering a particular channel number (0-9) or an “Up” or “Down” command. Again, an indication of lock lost results in returning to the idle mode 98.
  • the system increments in the “Up” or “Down” direction, as represented by a "channel up" block 114 or a "channel down" block 116, respectively.
  • the user can speak a number in the 0-9 range. If the spoken number is recognized, the user can release the push-to-talk switch 36 and this causes the channel to be changed to the appropriate single-digit channel, as represented by a block 118 shown in Fig. 6.
  • the user can continue pushing the push-to-talk switch 36 and speak a second digit, causing the channel to be changed to the appropriate 2-digit channel, as represented by a block 120 in Fig. 6.
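The channel selection rules can be summarized in a short sketch that takes the words recognized while the push-to-talk switch is held and returns the resulting channel. The function name, the representation of recognized words as strings, and the absence of any wrap-around or range check are assumptions; the source does not specify behavior at the ends of the channel range.

```python
def apply_channel_words(current, words):
    """Return the channel after the push-to-talk switch is released.

    words: recognized words spoken while the switch was held, for example
    ["Up", "Up"] or ["1", "2"]. "Up" and "Down" step by one channel; one
    spoken digit selects a single-digit channel and two digits select a
    2-digit channel. Digits beyond the second are ignored.
    """
    digits = ""
    for word in words:
        if word == "Up":
            current += 1
        elif word == "Down":
            current -= 1
        elif len(word) == 1 and word.isdigit() and len(digits) < 2:
            digits += word
    if digits:
        current = int(digits)  # applied when the switch is released
    return current


stepped = apply_channel_words(5, ["Up", "Up"])
single = apply_channel_words(5, ["7"])
double = apply_channel_words(5, ["1", "2"])
```

Releasing the switch serves as the terminator for digit entry, which is why the digits are only applied after the whole word list has been read.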
  • the word “Volume” is displayed together with a visual indication of the present volume on the screen of the television set 62, and a second word consisting of either “Up” or “Down” is accessed, as represented by a block 122 in Fig. 6.
  • Speaking the word “Up” causes the volume to be increased by one level, as represented by a block 124.
  • speaking the word “Down” causes the volume to decrease by one level, as represented by a block 126.
  • the microcontroller checks the state of the audio output, in a "muted" block 126. If the audio output is on (No), then the audio signal is turned off, as represented by a mute sound block 128. Conversely, if the audio output is off (Yes), then the audio output is turned back on to the level where it was prior to being muted ("enable sound" block 130).
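The volume and mute behavior can be sketched as a small state holder in which muting does not alter the stored volume, so that the previous level is restored when sound is re-enabled. The class name `TvAudio`, the 0 to 20 volume range and the starting level are assumptions made for illustration.

```python
class TvAudio:
    """Hypothetical model of the volume and mute state of the television set."""

    def __init__(self, volume=10, max_volume=20):
        self.volume = volume
        self.max_volume = max_volume
        self.muted = False

    def volume_up(self):
        """Increase the volume by one level, up to the assumed maximum."""
        self.volume = min(self.max_volume, self.volume + 1)

    def volume_down(self):
        """Decrease the volume by one level, down to zero."""
        self.volume = max(0, self.volume - 1)

    def toggle_mute(self):
        """Mute if the audio is on; otherwise re-enable it at the prior level."""
        self.muted = not self.muted

    @property
    def output_level(self):
        """Level actually sent to the speakers."""
        return 0 if self.muted else self.volume


tv = TvAudio(volume=10)
tv.volume_up()
tv.toggle_mute()
muted_level = tv.output_level
tv.toggle_mute()
restored_level = tv.output_level
```

Keeping `muted` as a separate flag rather than setting the volume to zero is what allows the level prior to muting to be recovered.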
  • the voice operated remote control system 10 in accordance with the invention is described herein in terms of transmission of an infrared signal for purposes of example only. In actual practice, other transmission media such as FM or AM radio frequency (RF) signals can be used. VCRs, audio tape and disc players, and radio receivers, can all be controlled using the techniques described herein. Also, other appliances or devices such as household appliances, automotive accessories, home security systems, and office equipment can be controlled in accordance with the techniques described herein.
  • multiple users can utilize a single remote control device 12.
  • the voice of each user is trained, and the templates which result therefrom are stored in the controlled device 14 separately from templates of the other users.
  • each user speaks his or her name at the appropriate word group, and this causes the controlled device 14 to change to the new user's voice templates.
  • the on-screen display shows the user's name and proceeds to use that user's templates, until the name is changed.
  • the remote control device 12 does not have any components added to support multiple voices. In fact, the number of user voices supported is practically limited by the choice of the system software. Also, this method does not require that the user remember a switch number or other identification corresponding to his or her voice.
  • the disadvantage of such method relates to the need to add more words to the word groups, as well as possible confusion as to whose voice templates are presently being used.
  • a second, multiple position switch is incorporated into the remote control device 12.
  • a series of pulses is transmitted to the controlled device 14 for interpretation to identify the user.
  • the controlled device 14 then changes to the new user's templates.

Abstract

A voice operated remote control system (10) transmits representations of the voice commands to a controlled device (14). The device produces voice signals in response to the transmitted representations. The remote control device, which includes a push-to-talk switch for conserving battery power, receives the voice commands via a microphone (16) and an audio amplifier to produce an audio signal which is applied to an oscillator for frequency modulation of a signal transmitted by an infrared transmitter (20). At the controlled device, an audio receiver (24) receives the transmitted signal and applies it to a receiver for demodulation. Following filtering and amplification, the voice signal is used to generate a voice template for comparison with a plurality of digital reference voice templates. If a substantial equivalent is found, a corresponding action routine is executed to achieve a desired action within the controlled device (14).

Description

VOICE OPERATED REMOTE CONTROL SYSTEM
Background of the Invention
1. Field of the Invention
The present invention relates to systems for controlling a device by using voice recognition of spoken commands, and more particularly to systems of that type in which the commands are communicated to the controlled device by wireless transmission.
2. History of the Prior Art
It is well known to provide electronic appliances, particularly video devices such as television sets and video cassette recorders (VCRs), which are supplied with wireless remote controls. The wireless remote controls transmit signals to the controlled appliance or other device in response to the pressing of appropriate buttons by a user. The controlled devices receive such signals, decode them and perform a function based on the specific signal received. The transmitted signals are usually digital in nature, and typically are transmitted using infrared light transmission. Alternatively, radio frequency transmission is used in some systems.
Recent developments have led to voice activated remote controls. Such remote controls utilize sophisticated electronics to recognize spoken commands, translate the commands into the traditional digital remote control signals, and transmit the control signals to the controlled device. Examples of such systems are provided by co-pending application Serial No. 07/915,112 of Bissonnette et al., entitled Voice Operated Remote Control Device, by co-pending application Serial No. 07/915,938 of Bissonnette et al., entitled Voice Recognition Apparatus and Method, and by co-pending application Serial No. 07/915,114 of Fischer, entitled Remote Control Device. All three applications were filed on July 17, 1992 and are commonly assigned with the present application.
Co-pending application Serial No. 07/915,112 describes a system utilizing a voice operated, hand-held, battery powered portable remote control device for controlling remote electronic devices by the transmission of infrared control signals. The voice operated remote control device accepts voice commands spoken by the user, performs voice recognition pattern matching on the spoken word by comparing the same against pretrained templates to determine the appropriate corresponding command, determines the specific infrared remote control code or set of codes that represent the function corresponding to the command, and transmits the infrared remote control codes to the electronic device to be controlled. The voice operated remote control device can be operated by multiple users, either by voice or by manual control of the selected function on the remote control device. The voice commands can represent either real time commands, which are transmitted immediately after the determination of the voice command, or programmed delay events which delay the transmission of the remote control codes or set of remote control codes for a preset period of time specified by the user.
Co-pending application Serial No. 07/915,938 describes a method and apparatus for improving the quality of voice templates for speaker dependent voice recognition devices, through the use of two word templates for each word in a vocabulary and qualification of the word templates. By comparing two word templates corresponding to the same word to each other, the apparatus takes the result of the comparison and performs an evaluation to determine the relative quality of the word templates. If word template quality is deemed to be less than an acceptable level, the apparatus requests the user to retrain the specific word in order to improve the recognition accuracy rate of the apparatus. Both resulting templates are later used during voice recognition. The system includes a noise rejection system using noise templates during a pattern matching process to significantly reduce the required processing power over commonly used digital noise filters. The system further includes means for correcting voice recognition errors through use of a voice command. Correcting errors by means of voice commands enables the user to correct errors while speaking into the apparatus, without interrupting the flow of the voice process. The fact that an error occurred on a particular word within a particular word group is used to assist in replacing the word template.
Co-pending application Serial No. 07/915,114 describes a method and apparatus whereby a remote control device executes system control functions in response to a single user command request. The apparatus self-configures based on the configuration of the equipment to be used. System control functions, which can consist of one or more controls from a single remote control or from multiple remote controls, are controlled by transmitting a string or sequence of commands from the apparatus to single or to multiple pieces of equipment for control of that equipment. The system control functions are selected by the user by activation of a single key representing the specific function. The apparatus self-configures, determining what system functions are applicable based on the equipment which the user has and based on the functions learned from the user's remote control.
The approach exemplified by the three co-pending applications identified above allows the use of voice recognition technology with controlled devices that were not designed for such technology. The voice remote control mimics the traditional remote control in the way in which it communicates with the controlled device. Unfortunately, this requires the addition of a significantly higher level of technology in the remote control device itself, much of which is already present in the controlled device. For example, virtually all television sets and VCRs utilize 8-bit microcontrollers with on-board ROM and RAM memory capacity. Such devices often include some type of display. Many newer television sets have on-screen displays. All such devices have power supplies for powering the electronics. A voice activated remote control must duplicate all such components.
Moreover, in order to mimic the digital remote control signals, such voice activated remote control devices must have the ability to either "learn" the control signals or have a substantial library of existing codes built into the ROM thereof. The only components added to the voice activated remote control device that are not duplicated in the controlled device or that are added only for the purpose of mimicking the standard remote controls, are a microphone, audio amplifier/filter circuit and an analog-to-digital converter.
If the controlled device were to be capable of supporting voice control itself, then the remote control device would only have to transmit the voice signal without the need for the sophisticated voice activated remote controls. In such an arrangement, the remote control device would only have to convert the voice signal from the microphone into a format that can be transmitted. The result would be a relatively simple remote control device in which multiple buttons and the microcontroller used to interpret button operation would not be needed.
Accordingly, it would be desirable to provide a voice operated remote control system in which a relatively simple and inexpensive remote control device is used to transmit voice commands spoken by the user to the controlled device, in wireless fashion. The controlled device would desirably include the voice recognition apparatus, so as to utilize the existing microprocessing and memory capacities of the controlled device with only minor additions thereto and so as to greatly simplify the remote control device.
Brief Description of the Invention
The present invention provides a voice operated remote control system in which a remote control device responds to voice commands from the user to transmit representations of the voice commands to a controlled device, in wireless fashion. The controlled device responds to the transmitted representations by undergoing a voice recognition process in an attempt to recognize the voice commands of the user. Recognized voice commands are used to produce action routines for specific functions to be performed by the controlled device. The incorporation of voice recognition apparatus within the controlled device enables the remote control device to be of simple and inexpensive design. The remote control device need only include apparatus such as a microphone for converting the voice commands into analog audio signals and apparatus for modulating or otherwise processing such signals for transmission to the controlled device.
In a preferred embodiment of a voice operated remote control system in accordance with the invention, an audio wireless remote control device includes a microphone for converting voice commands into analog audio signals. Following amplification of such signals by an audio amplifier, the signals are applied to a voltage controlled oscillator or other appropriate circuit for frequency modulation, prior to application to a driver and an infrared transmitter. The infrared transmitter transmits the frequency modulated audio signal to the remotely located controlled device, in wireless fashion. The remote control device is battery powered and includes a push-to-talk switch for conserving battery power by turning on the circuitry only when the user desires to enter a voice command into the remote control device.
An audio receiver located within the controlled device includes an infrared sensor for sensing the transmitted signals from the remote control device. The sensed signals are amplified and filtered prior to application to a phase-locked loop receiver for demodulation of the frequency modulated signal. The demodulated output of the phase-locked loop receiver is filtered and amplified to provide a voice signal. The phase-locked loop receiver also provides a lock detect output to indicate when a transmitted signal has been received.
The voice signals from the audio receiver are applied to voice recognition apparatus within the controlled device. The voice recognition apparatus comprises a microcontroller which may include an analog-to-digital converter, together with a microprocessor and ROM and RAM memories. The analog-to-digital converter converts each voice signal produced by the audio receiver into an incoming digital voice signal. The memories include a reference memory for storing a plurality of reference digital voice templates, and a program memory for storing a control program. The microprocessor generates an incoming digital voice template from the incoming digital voice signal, executes the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates, and determines what action to take based on the incoming digital voice template. Typically, one or more action routines are carried out by the microprocessor to accomplish the action or actions dictated by the voice commands of the user.
The controlled device may comprise any appliance or other device capable of being electronically controlled. An example of such a device is a television set having functions such as on/off, channel selection, volume and mute which are controlled in response to voice commands from the user entered at the remote control device. The lock detect signal from the phase-locked loop receiver within the audio receiver may be used with such device to determine when a voice command is no longer being transmitted from the remote control device.
Brief Description of the Drawings
A better understanding of the invention may be had by reference to the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a basic block diagram of a voice operated remote control system in accordance with the invention;
Fig. 2 is a detailed block diagram of the remote control device of Fig. 1;
Fig. 3 is a detailed block diagram of the audio receiver of Fig. 1;
Fig. 4 is a detailed block diagram of the controlled device of Fig. 1;
Fig. 5 is a basic flow chart illustrating the operation of the system of Fig. 1; and
Fig. 6 is a detailed flow chart illustrating the operation of the system of Fig. 1 in connection with the television set shown in Fig. 4.
Detailed Description
Fig. 1 shows a voice operated remote control system 10 in accordance with the invention. The remote control system 10 of Fig. 1 includes a remote control device 12 for transmitting voice signals to a remotely located controlled device 14 in response to voice commands. The remote control device 12 includes a microphone 16 for receiving voice commands from the user in the form of spoken words. The microphone 16 converts the spoken words into analog audio signals for application to an audio wireless remote control circuit 18. The audio wireless remote control circuit 18 processes the analog audio signals from the microphone 16 into an appropriate form for transmission to the controlled device 14, using an infrared transmitter 20.
As described hereafter in connection with Figs. 2 and 3, the audio wireless remote control circuit 18 includes circuitry for frequency modulating (FM) the analog audio signal prior to application to a transmitting device in the form of the infrared transmitter 20. The infrared transmitter 20 transmits the voice commands from the user to the controlled device 14 in wireless fashion.
While the voice operated remote control system 10 is described herein in connection with frequency modulation and demodulation of the voice commands and transmission in the form of an infrared signal, it should be understood that other signal processing and transmission techniques can be used. For example, forms of modulation other than frequency modulation, such as amplitude modulation (AM), can be used to produce a representation of the analog audio signal appropriate for transmission. Also, modulated or otherwise modified signals can be transmitted in other than infrared signal form, such as in the form of a radio frequency signal.
The remote control device 12 comprises a small, portable, hand-held unit into which the user speaks the voice commands. The remote control device 12 is remotely located relative to the controlled device 14, but nevertheless is close enough to the controlled device 14 so that the voice signals transmitted by the remote control device 12 are received by the controlled device 14. As described in detail hereafter, the controlled device 14 comprises a television set, in the present example. However, it should be understood that the controlled device 14 can comprise any appliance or device capable of being controlled electronically. For example, the controlled device may comprise a video cassette recorder (VCR) in which the spoken commands dictate the date, time and channel selection for future recording by the VCR, or simple commands such as play or rewind. Still other examples of appliances or devices which may comprise the controlled device 14 include kitchen appliances such as ovens and ranges, other household appliances, security systems, automotive accessories and office business equipment.
Referring again to Fig. 1, the voice signals transmitted by the remote control device 12 are received by the controlled device 14. The controlled device 14 includes an infrared sensor 22 for sensing the infrared signals transmitted by the infrared transmitter 20. The sensed signals are processed by an audio receiver 24 which demodulates the transmitted signal as well as amplifying and filtering such signal to provide to a voice recognition circuit 26 the voice signal represented thereby. The voice recognition circuit 26 performs a voice recognition process in which the voice signal is compared with pre-stored representations of different words of a vocabulary. If substantial similarity is found, then an action routine corresponding to the pre-stored word found to be substantially similar to the voice signal is provided to the controlled device 14 by the voice recognition circuit 26, so that a desired function is performed within the controlled device 14.
A detailed example of the remote control device 12 of Fig. 1 is shown in Fig. 2. As described above in connection with Fig. 1, the microphone 16 responds to a word spoken by the user by producing an analog audio signal. Such signal is amplified in an audio amplifier 28. The audio amplifier 28 conditions the signal for proper application to a voltage controlled oscillator (VCO) 30 to frequency modulate (FM) a signal at the output of the VCO 30 in accordance with the analog audio signal from the amplifier 28. The output of the VCO 30 is conditioned by a driver circuit 32 prior to driving the infrared transmitter 20.
The remote control device 12 is powered by batteries 34. Because there is no memory or microprocessor in the remote control device 12, there is no need to keep the remote control device 12 powered when not in use. Accordingly, a push-to-talk switch 36 is used to provide the power from the batteries 34 only when the user wishes to speak into the remote control device 12. Pushing a button on the outside of the remote control device 12 closes the push-to-talk switch 36 to couple the batteries 34 to a voltage regulator 38. The voltage regulator 38, which is coupled between a common ground line 40 and the push-to-talk switch 36, has an output terminal 42 thereof coupled to provide a supply voltage (+V) at a terminal 44. This regulated supply voltage is applied to the microphone 16, the audio amplifier 28 and the voltage controlled oscillator 30. The batteries 34 are directly coupled to the driver 32 by the push-to-talk switch 36. The regulated supply voltage +V optimizes the operating conditions of the voltage controlled oscillator 30.
Whenever the user wishes to transmit a voice command to the controlled device 14, the user pushes the button to close the push-to-talk switch 36 and speaks the voice command into the microphone 16. The resulting analog audio signal is amplified by the amplifier 28 and applied to the VCO 30 to frequency modulate the output of the VCO 30. This FM signal is applied via the driver 32 to the infrared transmitter 20, for transmission as an infrared signal to the controlled device 14.
When the user is not transmitting voice commands, the push-to-talk switch 36 is open, and the batteries 34 are not coupled to the voltage regulator 38 or to the driver 32. When the user pushes the button to close the push-to-talk switch 36 and speaks a command word into the microphone 16, the remote control device 12 transmits a frequency modulated infrared signal from the infrared transmitter 20. The infrared transmitter 20 may comprise an infrared diode or other appropriate infrared transmitting device. Such infrared signals are sensed by the infrared sensor 22 and applied to the audio receiver 24.
A detailed example of the audio receiver 24 is shown in Fig. 3. Referring to Fig. 3, the infrared sensor 22, which may comprise any appropriate form of infrared sensor such as those typically used in remote control devices, provides the received signal for amplification by an amplifier 46 and filtering by a filter 48. The output of the filter 48 is applied to a phase-locked loop receiver 50 which demodulates the frequency modulated signal by converting it back to a voice signal. The phase-locked loop receiver 50 is of conventional phase-locked loop (PLL) configuration, and includes a phase detector 52 coupled to the filter 48 and having an output and a second input coupled in a loop which includes a loop filter 54 and a voltage controlled oscillator (VCO) 56. The output of the phase-locked loop receiver 50 is filtered by an audio filter 58 and amplified by an audio amplifier 60 to provide the demodulated voice signal. The phase-locked loop receiver 50 also provides a lock detect signal, which indicates when a transmitted signal is present in the audio receiver 24. The phase-locked loop receiver 50 is described herein by way of example only, and it should be understood that other demodulation receivers can be used as desired. For example, a super-heterodyne receiver can be used in applications posing more stringent requirements.
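The frequency modulation performed by the VCO 30 and the recovery of the voice signal in the audio receiver 24 can be illustrated numerically. The following is a minimal sketch only, not the circuit of Figs. 2 and 3. The sample rate, carrier frequency and deviation are assumed values that do not appear in this description, the carrier is modeled as a complex signal for brevity, and the demodulator is a simple phase-difference discriminator substituted for the phase-locked loop receiver 50.

```python
import cmath
import math

FS = 400_000        # simulation sample rate in Hz (assumed)
F_CARRIER = 40_000  # VCO centre frequency in Hz (assumed)
DEVIATION = 5_000   # peak frequency deviation in Hz (assumed)


def fm_modulate(audio):
    """VCO model: instantaneous frequency = F_CARRIER + DEVIATION * sample."""
    phase = 0.0
    carrier = []
    for sample in audio:
        phase += 2 * math.pi * (F_CARRIER + DEVIATION * sample) / FS
        carrier.append(cmath.exp(1j * phase))  # complex carrier sample
    return carrier


def fm_demodulate(carrier):
    """Recover audio from the phase step between consecutive carrier samples."""
    audio = []
    for prev, cur in zip(carrier, carrier[1:]):
        step = cmath.phase(cur * prev.conjugate())  # radians per sample
        freq = step * FS / (2 * math.pi)            # instantaneous frequency
        audio.append((freq - F_CARRIER) / DEVIATION)
    return audio


# A 1 kHz test tone standing in for the amplified microphone signal.
tone = [0.5 * math.sin(2 * math.pi * 1_000 * n / FS) for n in range(2_000)]
recovered = fm_demodulate(fm_modulate(tone))
```

Because the discriminator works on sample pairs, `recovered` is one sample shorter than `tone` and lines up with `tone[1:]`.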
Fig. 4 shows the voice recognition circuit 26 of Fig. 1 in greater detail in conjunction with a television set 62. In the present example, the television set 62 comprises the appliance or device being controlled by the voice commands from the remote control device 12. The television set 62 is of conventional configuration and includes a cathode ray tube (CRT) 64 for providing the TV screen. The CRT 64 is driven by a tuner 66 through a video circuit 68. An on-screen display circuit 70, which is coupled to the video circuit 68, provides additional screen displays such as menus of the type which invite the viewer to make selections by speaking further voice commands into the remote control device 12. As described hereafter, the on-screen display circuit 70 is used to provide on-screen displays requesting the user of the remote control device 12 to enter further voice commands. The television set 62 also includes an audio circuit 72 which is coupled to speakers 74 for the television set 62.
The voice recognition circuit 26 comprises a microcontroller which includes an analog-to-digital converter (A/D) 76 coupled to receive the voice signal from the audio receiver 24. The microcontroller of the voice recognition circuit 26 also includes a microprocessor (MP) 78 coupled to the A/D converter 76, and having access to both a read only memory (ROM) 80 and a random access memory (RAM) 82.
The microcontroller comprising the voice recognition circuit 26 is coupled to a digital IR input 84 and to an external DRAM 86. With the exception of the external DRAM 86 and the audio receiver 24, the remaining components shown in Fig. 4, including those within the microcontroller of the voice recognition circuit 26, are standard components found in most modern television sets. The external DRAM 86 is utilized for voice recognition storage requirements, primarily storage of voice templates. Therefore, the conventional television set 62 is easily modified by addition of the audio receiver 24 and the external DRAM 86 in order to make it responsive to voice commands transmitted by the remote control device 12. Again, the microcontroller comprising the voice recognition circuit 26 is part of the existing circuitry within the conventional television set 62. Accordingly, voice operated remote control systems 10 in accordance with the invention do not require the addition of a microprocessor but only the modification of an existing one within the controlled device 14, principally through the addition of the external DRAM 86. The microcontroller comprising the voice recognition circuit 26 operates in the same manner as described in the previously referred to co-pending applications, Serial Nos. 07/915,112, 07/915,938 and 07/915,114. Such applications are incorporated herein by reference.
As described in detail in co-pending application Serial No. 07/915,112, for example, the A/D converter 76 of Fig. 4 may comprise an 8-bit converter which samples incoming data at a preassigned frequency such as 9.6 kHz. The A/D converter 76 outputs a digital signal representing the input analog voice signal from the audio receiver 24. The microprocessor 78 processes the digital voice signal together with a voice recognition software routine forming part of a control program stored in the ROM 80. The digital voice signal is converted into an incoming voice template that is compared against previously stored voice templates of the user's voice, stored in the external DRAM 86. The program decodes the voice templates. Together with the external DRAM 86, the RAM 82 comprises a reference memory for temporary storage of data.
Thus, the analog voice signal from the audio receiver 24 is applied to the A/D converter 76 for conversion into an incoming digital voice signal. The reference memory, comprised of the external DRAM 86 in conjunction with the RAM 82, stores a plurality of reference digital voice templates. The ROM 80 stores the control program. The microprocessor 78, which is coupled to the A/D converter 76, the ROM 80 and the RAM 82, generates an incoming digital voice template from the incoming digital voice signal at the output of the A/D converter 76. The microprocessor 78 then executes the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates stored in the reference memory comprised of the RAM 82 and the external DRAM 86. If the incoming digital voice template is found to have substantial similarity to a reference digital voice template, the microprocessor 78 determines what action to take corresponding to that reference digital voice template. The microcontroller then causes the appropriate components of the television set to perform the desired function within the controlled device 14.
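The comparison of an incoming template against the stored reference templates can be sketched as follows. This is an illustrative assumption rather than the matching method of the co-pending applications: the template representation (equal-length lists of feature values), the mean absolute difference measure, and the `threshold` value are all hypothetical, since this description does not define what "substantially equivalent" means numerically.

```python
def template_distance(a, b):
    """Mean absolute difference between two equal-length templates (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("templates must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def recognize(incoming, references, threshold=10.0):
    """Return the word whose reference template is closest to `incoming`,
    or None when even the closest one is farther away than `threshold`."""
    best_word = None
    best_distance = float("inf")
    for word, reference in references.items():
        distance = template_distance(incoming, reference)
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word if best_distance <= threshold else None


# Hypothetical reference templates for two command words.
references = {
    "Channel": [10, 200, 30, 40],
    "Volume": [200, 20, 180, 90],
}
```

A recognized word would then be used to look up the corresponding action routine; a result of `None` corresponds to the case in which no substantially equivalent template is found.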
Voice control is made possible, in the voice operated remote control system 10, by first voice training the collection of reference digital voice templates in accordance with the user's voice. Such templates are collected in the same manner as described in co-pending application Serial No. 07/915,112. In the example of Fig. 4, however, the screen of the television set 62 (CRT 64) is used to provide an on-screen display which prompts the user by requesting the needed words. The user responds by pressing the button to close the push-to-talk switch 36 and speaking the prompted word into the microphone 16 of the remote control device 12.
When voice training is complete, the voice operated remote control system 10 is ready for use. The basic flow chart of Fig. 5 illustrates the manner in which the controlled device 14 operates in conjunction with the remote control device 12. To begin with, the controlled device 14 is in an idle mode 88, waiting for commands. When the audio receiver 24 detects a signal and locks on to accomplish demodulation, the lock detect signal causes an exit from the idle mode 88, and a visual cue is presented to the user. As shown in Fig. 5, the absence (No) of a lock detect 90 causes the system to return to the idle mode 88. However, when a lock detect (Yes) is present, visual feedback 92 is provided to the user in the form of an on-screen display. The user responds by speaking the desired command or string of commands, and voice recognition 94 thereof produces selected action 96 within the controlled device 14 through execution of an action routine by the microprocessor 78. Afterward, the system returns to the idle mode 88.
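One pass through the flow of Fig. 5 can be sketched as a single function. The function name `run_cycle`, its parameters, and the "listening" cue text are hypothetical and serve only to illustrate the sequence of lock detect 90, visual feedback 92, voice recognition 94 and selected action 96.

```python
def run_cycle(lock_detected, recognize_command, actions, display):
    """Perform one pass through the Fig. 5 flow.

    Returns the recognized word whose action routine was executed,
    or None when the system simply remains in (or returns to) idle.
    """
    if not lock_detected:          # lock detect 90 is "No": stay idle
        return None
    display("listening")           # visual feedback 92
    word = recognize_command()     # voice recognition 94
    if word in actions:
        actions[word]()            # selected action 96
        return word
    return None                    # nothing recognized: back to idle


# Example wiring with stand-in callbacks.
shown = []
performed = []
actions = {"Off": lambda: performed.append("turn off")}
result = run_cycle(True, lambda: "Off", actions, shown.append)
```

Passing the recognizer and display as callbacks keeps the sketch independent of any particular hardware interface.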
Fig. 6 comprises a detailed flow chart illustrating the operation of the circuit of Fig. 4 in conjunction with the remote control device 12. As in the case of the flow chart of Fig. 5, the starting point in the flow chart of Fig. 6 is an idle mode 98. When a lock detect 100 is not present (No), the system always returns to the idle mode 98. When a lock detect 100 is present (Yes), the idle mode 98 is exited and a check is made to determine if the television set 62 is turned on ("TV on" block 102). If the television set 62 is not turned on (No), then the set is turned on ("turn on" block 104). If the television set 62 is turned on (Yes), then the system proceeds to an initial visual cue ("cue" block 106). This feature allows the user to turn on the television set 62 without using a voice command or a separate button therefor on the remote control device 12. With on-screen display, the visual cue 106 can simply be a symbol indicative of speech. Because the controlled device 14 is designed with voice recognition in mind, it is also possible at this point for the volume of the television set 62 to be decreased automatically to a moderate level to facilitate better voice recognition. Once the visual cue 106 is displayed, the user knows that the television set 62 is "listening".
At this point, the user has a choice of turning the television set 62 off ("Off"), changing the channel ("Channel"), adjusting the volume ("Volume"), or muting the audio ("Mute"), as represented by a function block 108 in Fig. 6. When a voice command word is recognized, the on-screen display shows the word recognized or an appropriate symbol indicative of the word or the action denoted thereby.
If the user says the word "Off", then the television set 62 is turned off ("turn off" block 110) and the idle mode 98 is returned to.
If the user says the word "Channel", then the word "Channel" is displayed on the television screen and a second group of words is possible. This includes the possibility of either "Up" or "Down" commands, or entry of the digits 0 through 9. "Up" or "Down" commands result in changing the channel by one channel and returning to the same choices. This allows the user to single-step through the channels by speaking the words "Up" or "Down" several times. If a channel number is spoken and recognized, then the user can either release the push-to-talk switch 36, which causes the channel to change to the appropriate single-digit channel, or the user can speak another digit causing the channel to be changed to the appropriate 2-digit channel.
This is shown in Fig. 6 by a block 112 which represents the choices of entering a particular channel number (0-9) or an "Up" or "Down" command. Again, an indication of lock lost results in returning to the idle mode 98. If the user decides to single-step through the channels by saying "Up" or "Down" several times, then the system increments in the "Up" or "Down" direction, as represented by a "channel up" block 114 or a "channel down" block 116, respectively. Alternatively, the user can speak a number in the 0-9 range. If the spoken number is recognized, the user can release the push-to-talk switch 36 and this causes the channel to be changed to the appropriate single-digit channel, as represented by a block 118 shown in Fig. 6. Alternatively, the user can continue pushing the push-to-talk switch 36 and speak a second digit, causing the channel to be changed to the appropriate 2-digit channel, as represented by a block 120 in Fig. 6.
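The channel selection logic of blocks 112 through 120 can be sketched as follows. The function `resolve_channel` is hypothetical. It assumes that the list of recognized words ends when the user releases the push-to-talk switch 36, it ignores unrecognized words, and it applies no channel limits or wrap-around, since none are specified in this description.

```python
def resolve_channel(words, current):
    """Return the channel selected by the words recognized after "Channel".

    `words` holds the words recognized while the push-to-talk switch stayed
    closed; reaching the end of the list models the release of the switch.
    """
    digits = ""
    channel = current
    for word in words:
        if word == "Up":
            channel += 1              # single step up (block 114)
        elif word == "Down":
            channel -= 1              # single step down (block 116)
        elif len(word) == 1 and word.isdigit():
            digits += word
            if len(digits) == 2:      # second digit spoken (block 120)
                return int(digits)
    # Switch released: a lone digit selects a single-digit channel (block 118).
    return int(digits) if digits else channel
```

For example, `resolve_channel(["Up", "Up"], 5)` steps from channel 5 to channel 7, while `resolve_channel(["1", "2"], 5)` selects channel 12.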
If the user speaks the word "Volume", then the word "Volume" is displayed together with a visual indication of the present volume on the screen of the television set 62, and a second word consisting of either "Up" or "Down" is accessed, as represented by a block 122 in Fig. 6. Speaking the word "Up" causes the volume to be increased by one level, as represented by a block 124. Conversely, speaking the word "Down" causes the volume to decrease by one level, as represented by a block 126.
If the user speaks the word "Mute", the microcontroller checks the state of the audio output, in a "muted" block 126. If the audio output is on (No), then the audio signal is turned off, as represented by a mute sound block 128. Conversely, if the audio output is off (Yes), then the audio output is turned back on to the level where it was prior to being muted ("enable sound" block 130).
As previously discussed, the voice operated remote control system 10 in accordance with the invention is described herein in terms of transmission of an infrared signal for purposes of example only. In actual practice, other transmission media such as FM or AM radio frequency (RF) signals can be used. VCRs, audio tape and disc players, and radio receivers, can all be controlled using the techniques described herein. Also, other appliances or devices such as household appliances, automotive accessories, home security systems, and office equipment can be controlled in accordance with the techniques described herein.
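Returning to the "Volume" and "Mute" branches of Fig. 6, the behavior described for them can be sketched as a small piece of state. The class `AudioState`, its field names, and the volume range of 0 to 20 are assumptions made for illustration; the source states only that volume moves one level at a time and that unmuting restores the earlier level.

```python
from dataclasses import dataclass


@dataclass
class AudioState:
    """Hypothetical model of the volume and mute branches of Fig. 6."""

    volume: int = 10
    max_volume: int = 20  # assumed; the source gives no volume range
    muted: bool = False

    def on_volume_word(self, word: str) -> None:
        """Handle "Up" or "Down" after "Volume" has been recognized."""
        if word == "Up":
            self.volume = min(self.volume + 1, self.max_volume)
        elif word == "Down":
            self.volume = max(self.volume - 1, 0)

    def on_mute(self) -> None:
        """Toggle mute; the stored volume is untouched, so unmuting
        restores the level that was in effect before muting."""
        self.muted = not self.muted

    @property
    def output_level(self) -> int:
        """Level actually sent to the audio output."""
        return 0 if self.muted else self.volume
```

Keeping the mute flag separate from the volume level is one simple way to return the audio to its prior level when "Mute" is spoken a second time.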
In accordance with the invention, multiple users can utilize a single remote control device 12. The voice of each user is trained, and the templates which result therefrom are stored in the controlled device 14 separately from templates of the other users.
In a first method for accommodating multiple users, each user speaks his or her name at the appropriate word group, and this causes the controlled device 14 to change to the new user's voice templates. Once the name is recognized, the on-screen display shows the user's name and the device proceeds to use that user's templates until the name is changed. Even if the controlled device 14 is turned off, it retains the current user identification and indicates which user it has retained when the system is turned on. This method has the advantage that the remote control device 12 does not need any added components to support multiple voices. In fact, the number of user voices supported is limited in practice by the choice of the system software. Also, this method does not require that the user remember a switch number or other identification corresponding to his or her voice. The disadvantages of this method are the need to add more words to the word groups and possible confusion as to whose voice templates are presently being used.
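A minimal sketch of this first method, assuming that templates are stored per user and keyed by word, might look like the following. The class `TemplateStore`, its methods, and the representation of a template as a list of floats are all hypothetical; the source does not describe a storage layout.

```python
from typing import Dict, List, Optional


class TemplateStore:
    """Hypothetical per-user template storage in the controlled device."""

    def __init__(self) -> None:
        # user name -> {command word -> template}; layout is an assumption.
        self._templates: Dict[str, Dict[str, List[float]]] = {}
        # Not cleared on power-off, so the last user is retained.
        self.current_user: Optional[str] = None

    def train(self, user: str, word: str, template: List[float]) -> None:
        """Store one trained template separately for each user."""
        self._templates.setdefault(user, {})[word] = list(template)
        if self.current_user is None:
            self.current_user = user

    def on_word(self, word: str) -> bool:
        """Switch users if the recognized word is a trained user's name."""
        if word in self._templates:
            self.current_user = word
            return True
        return False

    def active_templates(self) -> Dict[str, List[float]]:
        """Templates to match against for the current user."""
        if self.current_user is None:
            return {}
        return self._templates[self.current_user]
```

In this sketch a user's name is treated as just another recognizable word, which mirrors the stated disadvantage that names must be added to the word groups.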
In accordance with a second method of accommodating multiple users, a second, multiple position switch is incorporated into the remote control device 12. When the position of the switch is changed by the user, a series of pulses is transmitted to the controlled device 14 for interpretation to identify the user. The controlled device 14 then changes to the new user's templates.
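The source does not specify how the series of pulses encodes the switch position. Assuming, purely for illustration, that the number of pulses equals the switch position, the interpretation step in the controlled device could be sketched as below; the function name `user_from_pulses` and the ordered user list are hypothetical.

```python
from typing import Optional, Sequence


def user_from_pulses(pulse_count: int, users: Sequence[str]) -> Optional[str]:
    """Map a received pulse count to a user.

    Assumes switch position 1 sends one pulse and selects users[0],
    position 2 sends two pulses, and so on. Returns None when the count
    does not correspond to a known position.
    """
    if 1 <= pulse_count <= len(users):
        return users[pulse_count - 1]
    return None
```

A count outside the known positions is ignored here, so a corrupted transmission leaves the current user unchanged.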
In situations where multiple voice operated remote control systems 10 are being simultaneously used, it is necessary to be able to distinguish the transmitted signals from the different systems so that confusion does not occur. One solution to this problem is to use multiple frequencies. Because the receiving circuitry is frequency selective, multiple communication channels can be supported. It is best to design the controlled device 14 with the ability to have the channel thereof changed by the user with a switch or other means. In that instance, the remote control devices can be simple, fixed-channel devices.
While the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims

IN THE CLAIMS:
1. A voice operated remote control system comprising the combination of: a remote control device responsive to a voice command for transmitting a representation of the voice command; and a control device including means responsive to the transmitted representation of the voice command for producing a voice signal corresponding to the transmitted representation and voice recognition means for recognizing the voice signal and producing an action routine denoted thereby.
2. A voice operated remote control system in accordance with claim 1, wherein the remote control device includes means for converting the voice command into an analog signal and means for modulating the analog signal to produce a transmission signal, and the means for producing a voice signal includes means for demodulating the transmission signal.
3. A voice operated remote control system in accordance with claim 2, wherein the means for modulating performs FM modulation of the analog signal to produce a transmission signal and the means for demodulating performs FM demodulation of the transmission signal.
4. A voice operated remote control system in accordance with claim 1, wherein the voice recognition means includes a reference memory for storing a plurality of reference voice templates, a program memory for storing a control program, and a processor coupled to the reference memory and the program memory for generating an incoming voice template in response to each voice signal produced by the means for producing a voice signal corresponding to the transmitted representation, and for executing the control program to determine whether the incoming voice template is substantially equivalent to one of the reference voice templates, and for selecting one of a plurality of action routines based on the incoming voice template.
5. A voice operated remote control system in accordance with claim 4, wherein the voice signal is an analog signal, and the voice recognition means further includes an analog-to-digital converter for converting each analog voice signal into an incoming digital voice signal, and the reference memory stores a plurality of reference digital voice templates.
6. A method of controlling by voice command a controlled device, comprising the steps of: responding to a voice command at a location remote from the controlled device by converting the voice command and transmitting the converted voice command; and responding to a transmitted converted voice command at the controlled device by converting the transmitted converted voice command into a voice signal, performing voice recognition of the voice signal, and providing an action routine to the control device in accordance with the voice recognition performed on the voice signal.
7. A method of controlling in accordance with claim 6, wherein converting the voice command comprises frequency modulating the voice command, and converting the transmitted converted voice command comprises frequency demodulating the transmitted converted voice command.
8. A remote control device comprising the combination of: means responsive to a voice command for producing a corresponding analog voice signal; means for modulating the analog voice signal; and means for transmitting the modulated analog voice signal.
9. A remote control device in accordance with claim 8, wherein the means for producing a corresponding analog voice signal comprises a microphone, and further including an audio amplifier for amplifying analog voice signals provided by the microphone.
10. A remote control device in accordance with claim 8, wherein the means for modulating comprises a voltage controlled oscillator for frequency modulating the analog voice signal.
11. A remote control device in accordance with claim 8, wherein the means for transmitting comprises a driver coupled to an infrared transmitter.
12. A remote control device in accordance with claim 8, further including a battery and a push-to-talk switch for selectively coupling the battery to power the means for producing, the means for modulating and the means for transmitting.
13. A device for operating in response to a transmitted audio signal comprising the combination of: means for sensing a transmitted audio signal; means responsive to a sensed audio signal for converting the audio signal into a voice signal; means responsive to the voice signal for performing voice recognition thereon; and means responsive to the voice recognition for producing an action routine to determine operation of the device.
14. A device in accordance with claim 13, wherein the means for sensing comprises an infrared sensor.
15. A device in accordance with claim 13, further including means for amplifying a sensed transmitted audio signal and means for filtering the amplified sensed transmitted audio signal, and wherein the means for converting comprises a phase-locked loop receiver for frequency demodulating the filtered amplified sensed transmitted audio signal.
16. A device in accordance with claim 15, further including an audio filter for filtering a frequency demodulated output of the phase-locked loop receiver and an audio amplifier for amplifying an output of the audio filter to produce the voice signal.
17. A device in accordance with claim 13, wherein the means for performing voice recognition comprises: an analog-to-digital converter for converting the voice signal into an incoming digital voice signal; a reference memory for storing a plurality of reference digital voice templates; a program memory for storing a control program; and a processor coupled to the converter, the reference memory and the program memory for generating an incoming digital voice template from the incoming digital voice signal, for executing the control program to determine whether the incoming digital voice template is substantially equivalent to one of the reference digital voice templates, and for selecting one of a plurality of action routines based on the incoming digital voice template.
18. A device in accordance with claim 13, further including a television set coupled to be controlled by execution of the selected one of a plurality of action routines.
19. A device in accordance with claim 18, wherein the television set includes means for producing on-screen displays, in response to execution of the selected one of a plurality of action routines requesting a user to transmit a further audio signal.
20. A voice operated wireless remote control system comprising the combination of: a remote control device, including means responsive to a voice command for producing an analog audio signal, means for modulating the analog audio signal, and means for wireless transmission of the modulated analog audio signal; and a controlled device, including means for sensing a transmitted modulated analog audio signal, means for demodulating a sensed signal to produce a voice signal, means for performing voice recognition of the voice signal, and means responsive to performance of voice recognition for providing an action routine to the controlled device.
21. A voice operated wireless remote control system in accordance with claim 20, wherein the means for wireless transmission comprises an infrared transmitter and the means for sensing comprises an infrared sensor.
22. A voice operated wireless remote control system in accordance with claim 21, wherein the means for modulating comprises a frequency modulator and the means for demodulating comprises a frequency demodulator.
23. A voice operated wireless remote control system in accordance with claim 22, wherein the frequency modulator comprises a voltage controlled oscillator and the frequency demodulator comprises a phase-locked loop receiver.
24. A voice operated wireless remote control system in accordance with claim 23, wherein the means for producing an analog audio signal comprises a microphone and an audio amplifier, the means for sensing a transmitted modulated analog audio signal comprises an infrared sensor, an amplifier and a filter, and the means for demodulating a sensed signal includes an audio filter coupled to the phase-locked loop receiver and an audio amplifier coupled to the audio filter for producing the voice signal.
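The template comparison recited in claims 4 and 17 can be illustrated with a short sketch. The claims do not say how "substantially equivalent" is decided; the version below assumes fixed-length numeric templates, Euclidean distance, and a distance threshold, all of which are illustrative choices. The function name `match_template` is hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def match_template(
    incoming: Sequence[float],
    references: Dict[str, Sequence[float]],
    threshold: float,
) -> Optional[str]:
    """Return the word whose reference template is nearest to the incoming
    template, or None if no reference is within the threshold.

    Euclidean distance and the threshold test are assumptions standing in
    for the "substantially equivalent" determination.
    """
    best_word: Optional[str] = None
    best_dist = math.inf
    for word, reference in references.items():
        if len(reference) != len(incoming):
            continue  # skip templates of a different length
        dist = math.dist(incoming, reference)
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word if best_dist <= threshold else None
```

The returned word would then be used to select one of the action routines, for example through a dictionary that maps each word to a function.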

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU76371/94A AU7637194A (en) 1993-08-27 1994-08-23 Voice operated remote control system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11339493A 1993-08-27 1993-08-27
US08/113,394 1993-08-27

Publications (1)

Publication Number Publication Date
WO1995006309A1 (en) 1995-03-02

Family

ID=22349142

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1994/009544 WO1995006309A1 (en) 1993-08-27 1994-08-23 Voice operated remote control system

Country Status (2)

Country Link
AU (1) AU7637194A (en)
WO (1) WO1995006309A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5794205A (en) * 1995-10-19 1998-08-11 Voice It Worldwide, Inc. Voice recognition interface apparatus and method for interacting with a programmable timekeeping device
DE19709990A1 (en) * 1997-03-11 1998-09-24 Philips Patentverwaltung System for speech recognition of digits
EP0921508A2 (en) * 1997-12-08 1999-06-09 PCS PC-Systeme Entwicklungs- und Produktionsgesellschaft mbH & Co. KG Television and computer part combination with access to a communication network and remote control therefor
WO1999028897A1 (en) * 1997-12-04 1999-06-10 Voquette Networks, Ltd. A personal audio system
EP1079352A1 (en) * 1999-08-27 2001-02-28 Deutsche Thomson-Brandt Gmbh Remote voice control system
WO2001037262A1 (en) * 1999-11-12 2001-05-25 Telefonaktiebolaget Lm Ericsson (Publ) Wireless voice-activated remote control device
WO2002011122A1 (en) * 2000-07-28 2002-02-07 Koninklijke Philips Electronics N.V. System for controlling an apparatus with speech commands
US6978475B1 (en) 1999-11-24 2005-12-20 Ecable, Llc Method and apparatus for internet TV
WO2008000052A2 (en) * 2006-06-26 2008-01-03 Leonardo Senna Da Silva Bracelet with an attached command device
US8886541B2 (en) 2010-02-04 2014-11-11 Sony Corporation Remote controller with position actuatated voice transmission
EP2881941A1 (en) * 2013-12-09 2015-06-10 Thomson Licensing Method and apparatus for watermarking an audio signal
US10257576B2 (en) 2001-10-03 2019-04-09 Promptu Systems Corporation Global speech user interface

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4641292A (en) * 1983-06-20 1987-02-03 George Tunnell Voice controlled welding system
US5199080A (en) * 1989-12-29 1993-03-30 Pioneer Electronic Corporation Voice-operated remote control system

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5794205A (en) * 1995-10-19 1998-08-11 Voice It Worldwide, Inc. Voice recognition interface apparatus and method for interacting with a programmable timekeeping device
US6078887A (en) * 1997-03-11 2000-06-20 U.S. Philips Corporation Speech recognition system for numeric characters
DE19709990A1 (en) * 1997-03-11 1998-09-24 Philips Patentverwaltung System for speech recognition of digits
DE19709990C2 (en) * 1997-03-11 2000-03-02 Philips Corp Intellectual Pty System for recognizing spoken sequences of digits
WO1999028897A1 (en) * 1997-12-04 1999-06-10 Voquette Networks, Ltd. A personal audio system
EP0921508A3 (en) * 1997-12-08 2005-04-13 Fujitsu Siemens Computers GmbH Television and computer part combination with access to a communication network and remote control therefor
EP0921508A2 (en) * 1997-12-08 1999-06-09 PCS PC-Systeme Entwicklungs- und Produktionsgesellschaft mbH & Co. KG Television and computer part combination with access to a communication network and remote control therefor
EP1079352A1 (en) * 1999-08-27 2001-02-28 Deutsche Thomson-Brandt Gmbh Remote voice control system
WO2001037262A1 (en) * 1999-11-12 2001-05-25 Telefonaktiebolaget Lm Ericsson (Publ) Wireless voice-activated remote control device
US6339706B1 (en) 1999-11-12 2002-01-15 Telefonaktiebolaget L M Ericsson (Publ) Wireless voice-activated remote control device
US6978475B1 (en) 1999-11-24 2005-12-20 Ecable, Llc Method and apparatus for internet TV
US7086079B1 (en) 1999-11-24 2006-08-01 Ecable, Llc Method and apparatus for internet TV
WO2002011122A1 (en) * 2000-07-28 2002-02-07 Koninklijke Philips Electronics N.V. System for controlling an apparatus with speech commands
US10257576B2 (en) 2001-10-03 2019-04-09 Promptu Systems Corporation Global speech user interface
US10932005B2 (en) 2001-10-03 2021-02-23 Promptu Systems Corporation Speech interface
US11070882B2 (en) 2001-10-03 2021-07-20 Promptu Systems Corporation Global speech user interface
US11172260B2 (en) 2001-10-03 2021-11-09 Promptu Systems Corporation Speech interface
WO2008000052A3 (en) * 2006-06-26 2008-03-20 Leonardo Senna Da Silva Bracelet with an attached command device
WO2008000052A2 (en) * 2006-06-26 2008-01-03 Leonardo Senna Da Silva Bracelet with an attached command device
US8886541B2 (en) 2010-02-04 2014-11-11 Sony Corporation Remote controller with position actuatated voice transmission
EP2881941A1 (en) * 2013-12-09 2015-06-10 Thomson Licensing Method and apparatus for watermarking an audio signal
WO2015086360A1 (en) * 2013-12-09 2015-06-18 Thomson Licensing Method and apparatus for watermarking an audio signal

Also Published As

Publication number Publication date
AU7637194A (en) 1995-03-21

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA DK JP KR NO

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA