WO2002041664A2 - Automatically adjusting audio system - Google Patents

Automatically adjusting audio system

Info

Publication number
WO2002041664A2
WO2002041664A2 PCT/EP2001/013304
Authority
WO
WIPO (PCT)
Prior art keywords
user
speakers
audio
image
generating system
Prior art date
Application number
PCT/EP2001/013304
Other languages
French (fr)
Other versions
WO2002041664A3 (en)
Inventor
Miroslav Trajkovic
Srinivas Gutta
Antonio Colmenarez
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Priority to JP2002543259A (JP2004514359A)
Priority to EP01989480A (EP1393591A2)
Publication of WO2002041664A2
Publication of WO2002041664A3

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04S - STEREOPHONIC SYSTEMS
    • H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30 - Control circuits for electronic adaptation of the sound field
    • H04S 7/302 - Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S 7/303 - Tracking of listener position or orientation


Abstract

An audio generating system that outputs audio through two or more speakers. The audio output of each of the two or more speakers is adjustable based upon the position of a user with respect to the location of the two or more speakers. The system includes at least one image capturing device (such as a video camera) that is trainable on a listening region and coupled to a processing section having image recognition software. The processing section uses the image recognition software to identify the user in an image generated by the image capturing device. The processing section also has software that generates at least one measurement of the position of the user based upon the position of the user in the image.

Description

Automatically adjusting audio system
FIELD OF THE INVENTION
The invention relates to audio systems, such as stereo systems, television audio systems and home theater systems. In particular, the invention relates to systems and methods for adjusting audio systems.
BACKGROUND OF THE INVENTION
Particular systems for adjusting the output of various audio systems based on the position of a listener ("user") are known. For example, UK Patent Application GB
2,228,324 describes a system that adjusts the balance of a stereo system as a user moves, in order to maintain the stereo effect for the listener. A signal emitter carried by the user emits signals to two separate receivers that are adjacent to two stereo speakers. The signal emitted may be an ultrasonic signal, infra-red signal or radio signal and may be emitted in response to an initiating signal. (It may also be a wired electrical signal.) The system uses the time it takes a respective receiver (adjacent a speaker) to receive the signal from the signal emitter to determine the distance between the user and the speaker. A distance between the user and each of the two speakers is so calculated. Based on the principle that sound intensity decreases with the cube of the distance from a source, the system uses the distance between each speaker and the user to adjust each speaker so that substantially equal sound intensities are presented to the user from each speaker. GB 2,228,324 refers to the system determining the position of the user by determining the point where the user's distance from each speaker overlaps, but notes that determining position is not necessary for adjusting stereo balance.
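For concreteness, the time-of-flight arithmetic attributed to GB 2,228,324 above can be sketched as follows; the ultrasonic (speed-of-sound) case is assumed, the function names are illustrative, and the intensity exponent is left as a parameter because free-field intensity falls off with the square of the distance rather than the cube quoted above.

```python
# Sketch of the time-of-flight ranging and equal-intensity balancing described
# for GB 2,228,324 above. Ultrasonic propagation and the gain normalization
# are illustrative assumptions.
SPEED_OF_SOUND_M_S = 343.0

def distance_m(flight_time_s: float) -> float:
    """User-to-receiver distance from the emitter signal's time of flight."""
    return SPEED_OF_SOUND_M_S * flight_time_s

def equalizing_gains(d_left_m: float, d_right_m: float, exponent: float = 2.0):
    """Relative gains so both speakers present equal intensity at the user.

    Received intensity ~ gain / d**exponent, so gain must scale as d**exponent.
    (The text quotes a cube law; free-field intensity follows a square law,
    hence the exponent is a parameter.)
    """
    g_left, g_right = d_left_m ** exponent, d_right_m ** exponent
    norm = max(g_left, g_right)
    return g_left / norm, g_right / norm

# Example: flight times of 6 ms and 9 ms put the user about 2.06 m and 3.09 m
# from the left and right receivers, so the closer (left) speaker is attenuated.
gains = equalizing_gains(distance_m(0.006), distance_m(0.009))
```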
Japanese Patent Abstract 5-137200 detects the position of a viewer in one of five angular zones with respect to the front of a television by pointing a separate infra-red detector at each zone. The balance of the stereo speakers flanking the television screen is said to be adjusted based on the zone the viewer is in.
Japanese Patent Abstract 4-130900 uses elapsed time of light transmission to calculate the distances between a listener and two light emitting and detecting parts. The distances between the user and the two parts and the distance between the two parts are used to calculate the position of the listener and to adjust the balance of the audio signal.
Similarly, Japanese Patent Abstract 7-302210 uses an infra-red signal to measure the distance between a listening position and a series of speakers and to adjust an appropriate delay time for each speaker based on the distance between the speaker and the listening position.
SUMMARY OF THE INVENTION
One obvious difficulty with the prior art systems is that they either require a user to wear or carry a signal emitter (as in GB 2,228,324) in order to enjoy automatic adjustment of the balance of a stereo system, or rely on sensors (such as infra-red sensors) that are unreliable and/or crude in detecting the position of a listener. For example, use of infra-red detectors may fail to detect the listener, resulting in the above-mentioned systems failing to balance properly for the user's position. Moreover, other people (or other items, such as pets) may be sensed by the sensors, resulting in an adjustment in the balance to someone or something other than the listener.
In addition, the above-mentioned systems are not well suited for audio systems more complex than a simple stereo system, for example, a home theater system. A home theater system typically has a multiplicity of speakers positioned about a room that are used to project audio, including audio effects, to a listener. The audio is not simply "balanced" between speakers. Rather, the output of a particular speaker location may be raised and lowered or otherwise coordinated based on the audio effect to be projected to the listener at his or her location. For example, two speakers of a multiplicity of speakers may be driven in phase or out of phase, in order to project a particular audio effect to a listener at the listener's position.
Thus, an accurate determination of the location of each of a multiplicity of speakers with respect to the position of the listener is highly important to certain entertainment experiences. In addition, in order to adjust the required output of a multiplicity of speakers to a changed or changing position of a listener, a more reliable and accurate determination of the listener's position is needed.
Accordingly, the invention provides an audio system (including an audiovisual system) that can automatically adjust to the position of the listener or user of the system, including a change in position of the user. The system uses image capturing and recognition that recognizes all or part of the contours of a human body, i.e., the user. Based on the position of the user in the field of view, the system determines position information of the user. In one embodiment of the system, for example, the angular position of the user is determined based on the location of the image of the user in the field of view of an image capturing device, and the system may adjust the output of two or more speakers based on the determined angle.
The image capturing device may be, for example, a video camera connected to a control unit or CPU that has image recognition software programmed to recognize all or part of the shape of a human body. Various methods of detecting and tracking active contours such as the human body have been developed. For example, a "person finder" that finds and follows people's bodies (or heads or hands, for example) in a video image is described in "Pfinder: Real-Time Tracking of the Human Body" by Wren et al., M.I.T. Media Laboratory Perceptual Computing Section Technical Report No. 353, published in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780-85 (July 1997), the contents of which are hereby incorporated by reference. Detection of a person (a pedestrian) within an image using a template matching approach is described in "Pedestrian Detection From A Moving Vehicle" by D.M. Gavrila (Image Understanding Systems, DaimlerChrysler Research), Proceedings of the European Conference on Computer Vision, 2000 (available at www.gavrila.net), the contents of which are hereby incorporated by reference. Use of a statistical sampling algorithm for detection of a static object in an image and a stochastic model for detection of object motion is described in "Condensation - Conditional Density Propagation For Visual Tracking" by Isard and Blake (Oxford Univ. Dept. of Engineering Science), Int. J. Computer Vision, vol. 29, 1998 (available at www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/ISARD1/condensation.html, along with the "Condensation" source code), the contents of which are hereby incorporated by reference. Alternatively, the control unit or CPU may be programmed to recognize the contours of a human head or even the contours of a particular user's face. Software that can recognize faces in images (including digital images) is commercially available, such as the "FaceIt" software sold by Visionics and described at www.faceit.com. Software incorporating such algorithms, which may be used to detect human bodies, faces, etc., will be referred to generally as image recognition software, an image recognition algorithm and the like in the description below. The position of the recognized body or head relative to the field of view of the camera may be used, for example, to determine the angle of the user's location with respect to the camera. The determined angle may be used to balance or otherwise adjust the audio output and effects to be projected by each speaker to the user's location. The use of an image capturing device and related image sensing software that identifies the contour of a human body or a particular face makes the detection of the user more accurate and reliable.
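As a concrete stand-in for such image recognition software, the sketch below uses OpenCV's stock Haar-cascade face detector; this is an illustrative assumption (the references above describe Pfinder, template matching and Condensation, none of which is assumed available). It returns the image coordinates of the center of a detected face, the quantity used later to locate the user.

```python
# Stand-in for the "image recognition software" discussed above: OpenCV's
# bundled Haar-cascade face detector (an illustrative choice, not the
# Pfinder/FaceIt software named in the text). Returns the pixel coordinates
# of the center of the largest detected face, measured from the image's
# upper left-hand corner as in Figs. 2a and 2b.
import cv2

def find_head_center(frame):
    """Return (x, y) image coordinates of the detected head center, or None."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # take the largest face
    return (x + w / 2.0, y + h / 2.0)
```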
Two or more such programmed image capturing devices having overlapping fields of view may be used to accurately determine the location of the user. For example, two separate cameras as described above may be separately located and each may be used to determine the user's position in a reference coordinate system. The user's location may be used by the audio system, for example, to determine the distance between the user's present location and the fixed (known) position of each speaker in the reference coordinate system and to make the appropriate adjustments to the speaker output to provide the proper audio mix to the user's location, such as audio effects in a home theater system.
Thus, in general, the invention comprises an audio generating system that outputs audio through two or more speakers. The audio output of each of the two or more speakers is adjustable based upon the position of a user with respect to the positions of the two or more speakers. The system includes at least one image capturing device (such as a video camera) that is trainable on a listening region and coupled to a processing section having image recognition software. The processing section uses the image recognition software to identify the user in an image generated by the image capturing device. The processing section also has software that generates at least one measurement of the position of the user based upon the position of the user in the image.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a perspective view of a home theater system including automatic detection and locating of a user and adjustment of output in accordance with a first embodiment of the invention;
Fig. 1a is a diagram of portions of the control system of the system of Fig. 1;
Fig. 2a is an image that includes an image of a user captured by a first camera of the system of Fig. 1;
Fig. 2b is an image that includes an image of the user captured by a second camera of the system of Fig. 1;
Fig. 3 is a representative view of a stereo system including automatic detection and locating of a user and adjustment of output in accordance with a second embodiment of the invention; and Fig. 3a is an image that includes an image of the user captured by a camera of the system of Fig. 3.
DETAILED DESCRIPTION
Referring to Fig. 1, a user 10 is shown positioned amongst audio and visual components of a home theater system. The home theater system is comprised of a video display screen 14 and a series of audio speakers 18a-e surrounding the perimeter of a comfortable viewing area for the display screen 14. The system is also comprised of a control unit 22, shown in Fig. 1 positioned atop the display screen 14. Of course, the control unit 22 may be positioned elsewhere or may be incorporated within the display unit 14 itself. The control unit 22, display screen 14 and speakers 18a-e are all electrically connected with electrical wires and connectors. The wires are typically run beneath carpet in a room or within an adjacent wall, so they are not shown in Fig. 1.
The home theater system of Fig. 1 includes electrical components that produce visual output from display screen 14 and corresponding audio output from speakers 18a-e. The audio and video processing for the home theater output typically occurs in the control unit 22, which may include a processor, memory and related processing software. Such control units and related processing components are known and available in various commercial formats. Audio and video input provided to the control unit 22 may come from a television signal, a cable signal, a satellite signal, a DVD and/or a VCR. The control unit 22 processes the input signal and provides appropriate signals to the driving circuitry of the display screen 14, resulting in a video display, and also processes the input signal and provides appropriate driving signals to the speakers 18a-e, as shown in Fig. 1a.
The audio portion of the signal input to the control unit 22 may be a stereophonic signal or may support more complex audio processing, such as audio effects processing by the control unit 22. For example, the control unit 22 may drive speakers 18b, 18c, 18d in an overlapping sequence in order to simulate a car passing by on the right hand portion of the display. The control unit 22 drives the amplitude and phase of each speaker 18b, 18c, 18d based on the received audio signal, as well as on the position of the speaker 18b, 18c, 18d relative to the user 10 as stored in the memory of the control unit 22.
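A minimal sketch of such an overlapping drive sequence appears below; the triangular gain envelopes, timing and function name are illustrative assumptions, not taken from the patent.

```python
# Illustrative crossfade for the pass-by effect described above: triangular
# gain envelopes that peak in turn at speakers 18b, 18c and 18d. Ramp shape
# and timing are assumptions.
def pass_by_gains(t: float, duration: float = 3.0) -> dict:
    """Gain for each of the three speakers at time t seconds into the effect."""
    peaks = {"18b": 0.25 * duration, "18c": 0.50 * duration, "18d": 0.75 * duration}
    width = 0.25 * duration  # each speaker is audible within +/- width of its peak
    return {name: max(0.0, 1.0 - abs(t - peak) / width) for name, peak in peaks.items()}

# Example: halfway through the effect only the middle speaker 18c is at full gain.
mid_gains = pass_by_gains(1.5)  # {'18b': 0.0, '18c': 1.0, '18d': 0.0}
```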
The control unit 22 may receive and store the positions of the speakers 18a-e and the position of the user 10 with respect to a common reference system, such as the one defined by origin O and unit vectors (x,y,z) in Fig. 1. The x, y and z coordinates of each speaker 18a-e and the user 10 in the reference coordinate system may be physically measured or otherwise determined and input to the control unit 22. The position of user 10 in Fig. 1 is shown to have coordinates (Xp, Yp, Zp) in the reference coordinate system. The reference coordinate system in general may be located in positions other than shown in Fig. 1. (As described further below, the reference coordinate system in Fig. 1 is chosen to be at the location of a camera in order to facilitate automatic location of the user 10 in accordance with the invention.) Once the coordinates of the speakers 18a-e and user 10 in the reference coordinate system are received by the control unit 22, the control unit 22 may alternatively translate the coordinates to an internal reference coordinate system.
The position of the user 10 and the speakers 18a-e in such a common reference coordinate system enables the control unit 22 to determine the position of the user 10 with respect to each speaker 18a-e. (It is well known that subtracting the coordinates of the user 10 from the coordinates of the speaker 18a determines their relative positions in the reference coordinate system.) Software within the control unit 22 electronically adjusts the driving signals for the audio output (such as volume, frequency, phase) of each speaker based upon the received audio signal, as well as the position of the user 10 relative to the speaker. Electronic adjustment of the audio output by the control unit 22 based on the relative positions of the speakers 18a-e with respect to the user 10 is known in the art. Alternatively, the control system may allow the user to manually adjust the audio output of each speaker 18a-e. Such manual control of the audio components via the control unit 22 is also known in the art. In both cases, input may be provided to the control unit 22 through a remote that wirelessly interfaces with the control unit 22 and projects a menu on the display screen 14 that allows, for example, input of positional data.
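A minimal sketch of how per-speaker corrections might be derived from these relative positions is given below; the 1/d level model and the time-alignment of arrivals are assumptions for illustration, since the text specifies only that volume, frequency and phase are adjusted.

```python
# Per-speaker corrections derived from relative positions, as sketched above.
# The 1/d level compensation and arrival-time alignment are assumptions; the
# text says only that volume, frequency and phase are adjusted per speaker.
import math

SPEED_OF_SOUND_M_S = 343.0

def speaker_corrections(user_xyz, speakers_xyz):
    """Return one {'gain', 'delay_s'} entry per speaker for a user at user_xyz."""
    dists = [math.dist(user_xyz, s) for s in speakers_xyz]
    d_max = max(dists)
    return [
        {"gain": d / d_max,                            # compensate 1/d level loss
         "delay_s": (d_max - d) / SPEED_OF_SOUND_M_S}  # align arrival times at the user
        for d in dists
    ]

# Example: user at (1.0, 2.0, 1.2) m and two speakers in the reference frame.
corr = speaker_corrections((1.0, 2.0, 1.2), [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0)])
```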
The home theater system of Fig. 1 can also automatically identify the user and the user's location in the reference coordinate system. In the description above, the locations of the user 10 and the speakers 18a-e in the reference coordinate system at origin O were presumed to be known by the control unit 22 based, for example, on manual input provided by the user. Where the position of the user 10 is not known or varies, or an automatic detection and determination of the user's location is otherwise desired, the positions of the speakers 18a-e will still normally be known to the control unit 22, since they usually will remain fixed after they are placed. Thus, the positions of the speakers 18a-e in the reference coordinate system are each manually input to the control unit 22 during the initial system set-up and generally remain fixed thereafter. (The speaker locations may be changed, of course, and new positions may be input, but this does not occur during normal usage of the system.) Once the user's location is automatically determined by the system, as described in more detail below, the control unit 22 adjusts the audio output to each speaker 18a-e based on the relative locations of the user 10 and the speakers 18a-e, as in the case of manual input of positions, as previously described.
In order to automatically detect the presence and, if present, the location of the user 10 in Fig. 1, the system is further comprised of two video cameras 26a, 26b located atop display screen 14 and directed toward the normal viewing area of the display screen 14. Camera 26a is located at the origin O of the common reference coordinate system. As evident from the description below, video cameras 26a, 26b may be positioned at other locations; the reference coordinate system may likewise be re-positioned to a different camera location or elsewhere. Video cameras 26a, 26b interface with the control unit 22 and provide it with images captured in the viewing area. Image recognition software is loaded in control unit 22 and is used by a processor therein to process the video images received from the cameras 26a, 26b. The components, including memory, of the control unit 22 used for image recognition may be separate or may be shared with the other functions of the control unit 22, such as those shown in Fig. 1a. Alternatively, the image recognition may take place in a separate unit.
Fig. 2a depicts the image in the field of view of camera 26a on one side of the display screen of Fig. 1. The image of Fig. 2a is transmitted to control unit 22, where it is processed using, for example, known image recognition software loaded therein. An image recognition algorithm may be used to recognize the contours of a human body, such as the user 10. Alternatively, image recognition software may be used that recognizes faces or may be programmed to recognize a particular face or faces, such as the face of user 10.
Once the image recognition software identifies the contour of a human body or a particular face, the control unit 22 is programmed to determine the point P1' at the center of the head of the user 10 in the image and its coordinates (x',y') with respect to the point O1' in the upper left-hand corner of the image. As seen, the point O1' in the image of Fig. 2a corresponds approximately to the point (0,0,Zp) in the reference coordinate system of Fig. 1. Similarly, Fig. 2b depicts the image in the field of view of camera 26b on the other side of the display screen of Fig. 1. In like manner, the image of Fig. 2b is transmitted to control unit 22, where it is processed using image recognition software to recognize the user 10 or the image of the user's face. Because camera 26b is located on the other side of the display screen, the image of the user 10 is located in a different part of the field of view compared to Fig. 2a. The control unit determines the point P1" at the center of the head of the user 10 in the image of Fig. 2b and its coordinates (x",y") with respect to the point O1" in the upper left-hand corner of the image.
Having identified the positions P1' and P1" of the user 10 in the camera images shown in Figs. 2a and 2b as having image coordinates (x',y') and (x",y"), respectively, the coordinates (Xp, Yp, Zp) of the position P of the user 10 in the reference coordinate system of Fig. 1 may be uniquely determined using standard techniques of computer vision known as the "stereo problem". Basic stereo techniques of three dimensional computer vision are described, for example, in "Introductory Techniques for 3-D Computer Vision" by Trucco and Verri (Prentice Hall, 1998) and, in particular, Chapter 7 of that text entitled "Stereopsis", the contents of which are hereby incorporated by reference. Using such well-known techniques, the relationship between the user's position P in Fig. 1 (having unknown coordinates (Xp, Yp, Zp)) and the image position P1' of the user in Fig. 2a (having known image coordinates (x',y')) is given by the equations:
x' = f · Xp / Zp    (1)
y' = f · Yp / Zp    (2)
Similarly, the relationship between the user's position P in Fig. 1 and the image position P1" of the user in Fig. 2b (having known image coordinates (x",y")) is given by the equations:
x" = f · (Xp - D) / Zp    (3)
y" = f · Yp / Zp    (4)
where f is the focal length of the cameras and D is the distance between cameras 26a, 26b. One skilled in the art will recognize that the terms given in Eqs. 1-4 hold up to linear transformations defined by the camera geometry.
Equations 1-4 have three unknown variables (the coordinates Xp, Yp, Zp); their simultaneous solution gives the values of Xp, Yp and Zp and thus the position of the user 10 in the reference coordinate system of Fig. 1. If required, the coordinates (Xp, Yp, Zp) may be translated to another internal coordinate system of the control unit 22. The processing required to determine the position (Xp, Yp, Zp) of the user and to translate the coordinates to another reference coordinate system, if necessary, may also take place in a processing unit other than control unit 22. For example, it may take place in a processing unit that also supports the image recognition processing, thus comprising a separate processing unit dedicated to the tasks of image detection and location.
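A direct solution of Eqs. 1-4 as reconstructed above takes only a few lines; the parallel-axis pinhole form, the common focal length f and the averaging of the two y measurements are assumptions of this sketch.

```python
# Direct solution of Eqs. 1-4 as reconstructed above (parallel-axis pinhole
# cameras with common focal length f and baseline D; these modelling choices
# are assumptions of this sketch).
def triangulate(x1, y1, x2, y2, f, D):
    """Return (Xp, Yp, Zp) from image coordinates (x', y') and (x'', y'')."""
    disparity = x1 - x2              # Eqs. 1 and 3 give x' - x'' = f * D / Zp
    if disparity == 0:
        raise ValueError("zero disparity: detection failure or user at infinity")
    Zp = f * D / disparity           # depth from disparity
    Xp = x1 * Zp / f                 # invert Eq. 1
    Yp = 0.5 * (y1 + y2) * Zp / f    # Eqs. 2 and 4 agree; average against noise
    return Xp, Yp, Zp
```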
As noted above, the fixed positions of speakers 18a-e are known to the control unit 22 based on prior input. For example, once each speaker 18a-e is placed in the room as shown in Fig. 1, the coordinates (x,y,z) of each speaker 18a-e in the reference coordinate system, and the distance D between cameras 26a, 26b, may be measured and stored in the memory of the control unit 22. The coordinates (Xp, Yp, Zp) of the user 10 as determined using the image recognition software (along with the post-recognition processing of the stereo problem described above) and the pre-stored coordinates of each speaker may then be used to determine the position of the user 10 with respect to each speaker 18a-e. As previously described, the audio processing of the control unit 22 may then appropriately adjust the audio output (including amplitude, frequency and phase) of each speaker 18a-e based upon the input audio signal and the position of the user 10 with respect to the speakers 18a-e.
The use of the video cameras 26a, 26b, image recognition software, and post-recognition processing to determine a detected user's position thus allows the location of the user of the home theater system of Fig. 1 to be automatically detected and determined. If the user moves, the processing is repeated and a new position is determined for the user, and the control unit 22 uses the new location to adjust the audio signals output by speakers 18a-e. The automatic detection feature may be turned off so that the output of the speakers is based on a default or a manual input of the location of the user 10. The image recognition software may also be programmed to recognize, for example, a number of different faces, and the face of a particular user may be selected for recognition and automatic adjustment. Thus, the system may adjust to the position of a particular user in the viewing area. Alternatively, the image recognition software may be used to detect all faces or human bodies in the viewing area and the processing may then automatically determine each of their respective locations. The adjustment of the audio output of each speaker 18a-e may be determined by an algorithm that attempts to optimize the aural experience at the location of each detected user.
Although the embodiment of Fig. 1 depicted a home theater system, the automatic detection and adjustment may be used by other audiovisual systems or other purely audio systems. It may be used, for example, with a stereo system having a number of speakers to adjust the volume at each speaker location based on the determined location of the user with respect to the speakers in order to maintain a proper (or pre-determined) balance of the stereophonic sound at the location of the user. Thus, a simpler embodiment of the invention applied to a two-speaker stereo system is shown in Fig. 3. The basic components of the stereo system comprise a stereo amplifier 130 attached to two speakers 100a, 100b. A camera 110 is used to detect an image of a listening region, including the image of a listener 140 in the listening region. The relative positions of the speakers 100a, 100b, camera 110 and user 140 are shown from above, or projected into the plane of the floor. Fig. 3 also shows a simple reference coordinate system in the plane, having an origin O at the camera and comprised of the angle of an object with respect to the axis A of the camera 110. Thus, the angle α is the angular position of speaker 100a, the angle β is the angular position of speaker 100b and the angle θ is the angular position of the user 140. (Fig. 3 shows the top of the user's head.)
In the system of Fig. 3, the user 140 is assumed to listen to the stereo in the central region of Fig. 3 at an approximate distance D from the origin O. The speakers 100a, 100b have a default balance at the position D along the axis A, which is approximately at the center of the listening area. The angles α and β of the positions of speakers 100a, 100b are measured and pre-stored in processing unit 120. The image captured by the camera 110 is transferred to the processing unit 120, which includes image recognition software that detects the contour of a human body, a particular face, etc., as described in the embodiment above. The location of the detected body or face in the image is used by the processing unit to determine the angle θ corresponding to the position of the user 140 in the reference coordinate system. For example, referring to Fig. 3a, a first order determination of the angle θ is:
θ = (x / W) · φ
where x is the horizontal image distance of the detected user from the center C of the image as measured by the processing unit 120, W is the total horizontal width of the image, and φ is the field of view or, equivalently, the angular width of the scene, as fixed by the camera.
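A direct transcription of this first order estimate follows, using the symbols as repaired above; the pixel and angle values in the example are illustrative.

```python
# First order angular position of the user, per the formula above:
# theta = (x / W) * phi, with x measured from the image center C.
def user_angle(x_from_center: float, image_width: float, fov: float) -> float:
    return (x_from_center / image_width) * fov

# Example: a user detected 80 px right of center in a 640 px wide image from
# a camera with a 60 degree field of view sits at theta = (80/640)*60 = 7.5
# degrees off the camera axis A.
theta = user_angle(80.0, 640.0, 60.0)
```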
The processing unit 120 in turn sends a signal to the amplifier 130 that adjusts the balance of speakers 100a, 100b based on the relative angular positions of the user 140 and the speakers 100a, 100b. For example, the output of speaker 100a is adjusted using a factor (α - θ) and the output of speaker 100b is adjusted using a factor (β + θ). The balance of speakers 100a, 100b is thus automatically adjusted based upon the position of the user 140 with respect to the speakers 100a, 100b. As previously noted, it is assumed in the system of Fig. 3 that the user 140 remains in a central listening region, at an approximate distance D from the origin O. Thus, an adjustment of the balance based on the angular position θ of the user is an acceptable first order adjustment. Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, but rather it is intended that the scope of the invention is as defined by the scope of the appended claims.

Claims

CLAIMS:
1. An audio generating system that outputs audio through two or more speakers (18a-e, 100a, 100b), the audio output of each of the two or more speakers (18a-e, 100a, 100b) being adjustable based upon the location of a user with respect to the location of the two or more speakers (18a-e, 100a, 100b), the system comprising at least one image capturing device (26a, 26b, 110) trainable on a listening region and coupled to a processing section (22, 120) having image recognition software that identifies the user in an image generated by the image capturing device (26a, 26b, 110), the processing section (22, 120) having additional software that generates at least one measurement of the position of the user based upon the position of the user in the image.
2. The audio generating system of Claim 1, wherein the system is part of an audiovisual system.
3. The audio generating system of Claim 2, wherein the audiovisual system is a home theater system.
4. The audio generating system of Claim 1, wherein the processing section (22, 120) adjusts the audio output of at least one of the speakers (18a-e, 100a, 100b) based upon the at least one measurement of the position of the user.
5. The audio generating system of Claim 4, wherein the processing section (22, 120) is comprised of a single processing unit that identifies the user in the image, generates the at least one measurement of the position of the user and adjusts the audio output of at least one of the speakers (18a-e, 100a, 100b) based upon the at least one measurement of the position of the user.
6. The audio generating system of Claim 4, wherein the processing section is comprised of a first processing unit that identifies the user in the image and generates the at least one measurement of the position of the user and a second processing unit that adjusts the audio output of at least one of the speakers (18a-e, 100a, 100b) based upon the at least one measurement of the position of the user.
7. The audio generating system of Claim 1, wherein the at least one image capturing device is a video camera (26a, 26b, 110).
8. The audio generating system of Claim 7, wherein the at least one measurement of the position of the user is an angle in a reference coordinate system.
9. The audio generating system of Claim 8, wherein the processing section (120) uses the angle to adjust the output of at least one speaker (100a, 100b).
10. The audio generating system of Claim 1, wherein the at least one image capturing device is two or more video cameras (26a, 26b, 110).
11. The audio generating system of Claim 10, wherein the processing section (22) determines a position of the user in a reference coordinate system using the positions of the user in the images generated by each of the two or more video cameras (26a, 26b).
12. The audio generating system of Claim 11, wherein the processing section (22) uses a stereo technique of three dimensional computer vision to determine the position of the user in the reference coordinate system using the positions of the user in the images generated by each of the two or more video cameras (26a, 26b).
13. The audio generating system of Claim 11, wherein the processing section (22) uses the position of the user in the reference coordinate system and the positions of the two or more speakers in the reference coordinate system to determine the distance between the user and each of the two or more speakers (18a-e).
14. The audio generating system of Claim 13, wherein the distance between the user and each of the two or more speakers (18a-e) is used to adjust the audio output of at least one of the two or more speakers.
PCT/EP2001/013304 2000-11-16 2001-11-14 Automatically adjusting audio system WO2002041664A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2002543259A JP2004514359A (en) 2000-11-16 2001-11-14 Automatic tuning sound system
EP01989480A EP1393591A2 (en) 2000-11-16 2001-11-14 Automatically adjusting audio system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US71389800A 2000-11-16 2000-11-16
US09/713,898 2000-11-16

Publications (2)

Publication Number Publication Date
WO2002041664A2 true WO2002041664A2 (en) 2002-05-23
WO2002041664A3 WO2002041664A3 (en) 2003-12-18

Family

ID=24867986

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2001/013304 WO2002041664A2 (en) 2000-11-16 2001-11-14 Automatically adjusting audio system

Country Status (3)

Country Link
EP (1) EP1393591A2 (en)
JP (1) JP2004514359A (en)
WO (1) WO2002041664A2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8015590B2 (en) 2004-12-30 2011-09-06 Mondo Systems, Inc. Integrated multimedia signal processing system using centralized processing of signals
JP4789145B2 (en) * 2006-01-06 2011-10-12 サミー株式会社 Content reproduction apparatus and content reproduction program
TWI510106B (en) * 2011-01-28 2015-11-21 Hon Hai Prec Ind Co Ltd System and method for adjusting output voice

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04351197A (en) * 1991-05-29 1992-12-04 Matsushita Electric Ind Co Ltd Directivity control speaker system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4027338A1 (en) * 1990-08-29 1992-03-12 Drescher Ruediger Automatic balance control for stereo system - has sensors to determine position of person and adjusts loudspeaker levels accordingly
JP2001054200A (en) * 1999-08-04 2001-02-23 Mitsubishi Electric Inf Technol Center America Inc Sound delivery adjustment system and method to loudspeaker

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 017, no. 213 (E-1356), 26 April 1993 (1993-04-26) -& JP 04 351197 A (MATSUSHITA ELECTRIC IND CO LTD), 4 December 1992 (1992-12-04) *
PATENT ABSTRACTS OF JAPAN vol. 2000, no. 19, 5 June 2001 (2001-06-05) -& JP 2001 054200 A (MITSUBISHI ELECTRIC INF TECHNOL CENTER AMERICA INC), 23 February 2001 (2001-02-23) *

Cited By (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102004004068A1 (en) * 2004-01-20 2005-08-04 Deutsche Telekom Ag Control and loudspeaker setup for multimedia installation in room in building has CD recorder-player and other equipment connected to computer via amplifying input stage
GB2431066A (en) * 2004-07-13 2007-04-11 1 Ltd Portable speaker system
WO2006005938A1 (en) * 2004-07-13 2006-01-19 1...Limited Portable speaker system
GB2431066B (en) * 2004-07-13 2007-11-28 1 Ltd Portable speaker system
US7860260B2 (en) 2004-09-21 2010-12-28 Samsung Electronics Co., Ltd Method, apparatus, and computer readable medium to reproduce a 2-channel virtual sound based on a listener position
NL1029844C2 (en) * 2004-09-21 2007-07-06 Samsung Electronics Co Ltd Virtual sound reproducing method for speaker system, involves sensing listener position with respect to speakers, and generating compensation value by calculating output levels and time delays of speakers based on sensed position
FR2877534A1 (en) * 2004-11-03 2006-05-05 France Telecom DYNAMIC CONFIGURATION OF A SOUND SYSTEM
WO2006048537A1 (en) * 2004-11-03 2006-05-11 France Telecom Dynamic sound system configuration
WO2006073990A2 (en) * 2004-12-30 2006-07-13 Mondo Systems, Inc. Integrated multimedia signal processing system using centralized processing of signals
EP1677515A3 (en) * 2004-12-30 2007-05-30 Mondo Systems, Inc. Integrated audio video signal processing system using centralized processing of signals
EP1677574A3 (en) * 2004-12-30 2006-09-20 Mondo Systems, Inc. Integrated multimedia signal processing system using centralized processing of signals
US7653447B2 (en) 2004-12-30 2010-01-26 Mondo Systems, Inc. Integrated audio video signal processing system using centralized processing of signals
WO2006073990A3 (en) * 2004-12-30 2009-04-23 Mondo Systems Inc Integrated multimedia signal processing system using centralized processing of signals
US7561935B2 (en) 2004-12-30 2009-07-14 Mondo System, Inc. Integrated multimedia signal processing system using centralized processing of signals
EP1677574A2 (en) * 2004-12-30 2006-07-05 Mondo Systems, Inc. Integrated multimedia signal processing system using centralized processing of signals
WO2006100644A3 (en) * 2005-03-24 2007-02-15 Koninkl Philips Electronics Nv Orientation and position adaptation for immersive experiences
WO2006100644A2 (en) * 2005-03-24 2006-09-28 Koninklijke Philips Electronics, N.V. Orientation and position adaptation for immersive experiences
WO2007004134A3 (en) * 2005-06-30 2007-07-19 Philips Intellectual Property Method of controlling a system
US9465450B2 (en) 2005-06-30 2016-10-11 Koninklijke Philips N.V. Method of controlling a system
US8120713B2 (en) 2006-03-08 2012-02-21 Sony Corporation Television apparatus
EP1833276A3 (en) * 2006-03-08 2009-12-02 Sony Corporation Television apparatus
CN101416235B (en) * 2006-03-31 2012-05-30 皇家飞利浦电子股份有限公司 A device for and a method of processing data
EP2005414A1 (en) * 2006-03-31 2008-12-24 Koninklijke Philips Electronics N.V. A device for and a method of processing data
US8675880B2 (en) 2006-03-31 2014-03-18 Koninklijke Philips N.V. Device for and a method of processing data
EP2005414B1 (en) * 2006-03-31 2012-02-22 Koninklijke Philips Electronics N.V. A device for and a method of processing data
WO2007113718A1 (en) 2006-03-31 2007-10-11 Koninklijke Philips Electronics N.V. A device for and a method of processing data
EP2031905A3 (en) * 2007-08-31 2010-02-17 Samsung Electronics Co., Ltd. Sound processing apparatus and sound processing method thereof
US20090060235A1 (en) * 2007-08-31 2009-03-05 Samsung Electronics Co., Ltd. Sound processing apparatus and sound processing method thereof
WO2009124773A1 (en) * 2008-04-09 2009-10-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Sound reproduction system and method for performing a sound reproduction using a visual face tracking
CN102484688A (en) * 2009-06-03 2012-05-30 传斯伯斯克影像有限公司 Multimedia projection management
WO2010141149A3 (en) * 2009-06-03 2011-02-24 Transpacific Image, Llc Multimedia projection management
US8269902B2 (en) 2009-06-03 2012-09-18 Transpacific Image, Llc Multimedia projection management
WO2010141149A2 (en) 2009-06-03 2010-12-09 Transpacific Image, Llc Multimedia projection management
US8976986B2 (en) 2009-09-21 2015-03-10 Microsoft Technology Licensing, Llc Volume adjustment based on listener position
EP2517478B1 (en) * 2009-12-24 2017-11-01 Nokia Technologies Oy An apparatus
EP2464127A1 (en) * 2010-11-18 2012-06-13 LG Electronics Inc. Electronic device generating stereo sound synchronized with stereographic moving picture
US9100633B2 (en) 2010-11-18 2015-08-04 Lg Electronics Inc. Electronic device generating stereo sound synchronized with stereographic moving picture
US9420373B2 (en) 2012-05-25 2016-08-16 Samsung Electronics Co., Ltd. Display apparatus, hearing level control apparatus, and method for correcting sound
EP2667636A1 (en) * 2012-05-25 2013-11-27 Samsung Electronics Co., Ltd. Display apparatus, hearing level control apparatus, and method for correcting sound
EP2731360B1 (en) * 2012-11-09 2020-02-19 Harman International Industries, Inc. Automatic audio enhancement system
US9544679B2 (en) 2014-12-08 2017-01-10 Harman International Industries, Inc. Adjusting speakers using facial recognition
EP3032847A3 (en) * 2014-12-08 2016-06-29 Harman International Industries, Incorporated Adjusting speakers using facial recognition
US9866951B2 (en) 2014-12-08 2018-01-09 Harman International Industries, Incorporated Adjusting speakers using facial recognition
CN107318071A (en) * 2016-04-26 2017-11-03 音律电子股份有限公司 Loudspeaker device, control method thereof and playing control system
US10552115B2 (en) 2016-12-13 2020-02-04 EVA Automation, Inc. Coordination of acoustic sources based on location
WO2018149275A1 (en) * 2017-02-16 2018-08-23 深圳创维-Rgb电子有限公司 Method and apparatus for adjusting audio output by speaker
US10171054B1 (en) 2017-08-24 2019-01-01 International Business Machines Corporation Audio adjustment based on dynamic and static rules
US10440473B1 (en) 2018-06-22 2019-10-08 EVA Automation, Inc. Automatic de-baffling
US10484809B1 (en) 2018-06-22 2019-11-19 EVA Automation, Inc. Closed-loop adaptation of 3D sound
US10511906B1 (en) 2018-06-22 2019-12-17 EVA Automation, Inc. Dynamically adapting sound based on environmental characterization
US10524053B1 (en) 2018-06-22 2019-12-31 EVA Automation, Inc. Dynamically adapting sound based on background sound
US10531221B1 (en) 2018-06-22 2020-01-07 EVA Automation, Inc. Automatic room filling
US10708691B2 (en) 2018-06-22 2020-07-07 EVA Automation, Inc. Dynamic equalization in a directional speaker array
CN111782045A (en) * 2020-06-30 2020-10-16 歌尔科技有限公司 Equipment angle adjusting method and device, intelligent sound box and storage medium
CN116736982A (en) * 2023-06-21 2023-09-12 惠州中哲尚蓝柏科技有限公司 Automatic multimedia output parameter adjusting system and method for home theater
CN116736982B (en) * 2023-06-21 2024-01-26 惠州中哲尚蓝柏科技有限公司 Automatic multimedia output parameter adjusting system and method for home theater

Also Published As

Publication number Publication date
EP1393591A2 (en) 2004-03-03
WO2002041664A3 (en) 2003-12-18
JP2004514359A (en) 2004-05-13

Similar Documents

Publication Publication Date Title
WO2002041664A2 (en) Automatically adjusting audio system
US9980040B2 (en) Active speaker location detection
Ribeiro et al. Using reverberation to improve range and elevation discrimination for small array sound source localization
JP5091857B2 (en) System control method
US20180020312A1 (en) Virtual, augmented, and mixed reality
US9485556B1 (en) Speaker array for sound imaging
EP2031418A1 (en) Tracking system using RFID (radio frequency identification) technology
WO2014162554A1 (en) Image processing system and image processing program
KR20020094011A (en) Automatic positioning of display depending upon the viewer's location
CN112188368A (en) Method and system for directionally enhancing sound
CN114208209B (en) Audio processing system, method and medium
CN101006492A (en) Hrizontal perspective display
JPH1141577A (en) Speaker position detector
Mulder et al. An affordable optical head tracking system for desktop VR/AR systems
JP2023508002A (en) Audio device automatic location selection
WO2013172768A2 (en) Input system
US20170123037A1 (en) Method for calculating angular position of peripheral device with respect to electronic apparatus, and peripheral device with function of the same
Łopatka et al. Application of vector sensors to acoustic surveillance of a public interior space
JP2017512327A (en) Control system and control system operating method
US7599502B2 (en) Sound control installation
Deldjoo et al. A low-cost infrared-optical head tracking solution for virtual 3d audio environment using the nintendo wii-remote
US20220210588A1 (en) Methods and systems for determining parameters of audio devices
Piérard et al. I-see-3d! an interactive and immersive system that dynamically adapts 2d projections to the location of a user's eyes
JP2005295181A (en) Voice information generating apparatus
US9915528B1 (en) Object concealment by inverse time of flight

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

WWE Wipo information: entry into national phase

Ref document number: 2001989480

Country of ref document: EP

ENP Entry into the national phase in:

Ref country code: JP

Ref document number: 2002 543259

Kind code of ref document: A

Format of ref document f/p: F

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 2001989480

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 2001989480

Country of ref document: EP