WO2001010065A1 - Acoustic communication system


Info

Publication number
WO2001010065A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
toy
toy system
audio track
output
Application number
PCT/GB2000/002961
Other languages
French (fr)
Inventor
David Bartlett
Scott Hommel
Michael Reynolds
David Alexander Butler
Peter John Kelly
Original Assignee
Scientific Generics Limited
Application filed by Scientific Generics Limited filed Critical Scientific Generics Limited
Priority to AT00949759T (ATE270478T1)
Priority to AU63032/00A (AU6303200A)
Priority to JP2001513842A (JP2003506918A)
Priority to DE60011909T (DE60011909D1)
Priority to US09/856,734 (US7505823B1)
Priority to EP00949759A (EP1205045B1)
Priority to AU12904/01A (AU1290401A)
Priority to PCT/GB2000/004326 (WO2001034264A1)
Publication of WO2001010065A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04JMULTIPLEX COMMUNICATION
    • H04J13/00Code division multiplex systems
    • H04J13/10Code generation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/018Audio watermarking, i.e. embedding inaudible data in the audio signal
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B11/00Transmission systems employing sonic, ultrasonic or infrasonic waves
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04JMULTIPLEX COMMUNICATION
    • H04J13/00Code division multiplex systems
    • H04J13/0007Code type
    • H04J13/0022PN, e.g. Kronecker

Definitions

  • This invention relates to an acoustic communication system in which data is transmitted in an audio signal output by a loudspeaker.
  • the invention has particular, but not exclusive, relevance to an electronic toy which detects an audio signal with an electroacoustic transducer and reacts in accordance with data conveyed by the audio signal.
  • a toy system in which a data signal is encoded using spread spectrum technology to form a spread signal and the spread signal is transmitted to a toy as an acoustic signal.
  • the toy has an electroacoustic transducer for converting the acoustic signal into an electrical signal which is then despread in order to regenerate the data signal, and the toy responds to the data signal.
  • the spread signal is combined with an audio track to form a modified audio track.
  • the modified audio track is then transmitted as the acoustic signal to the toy.
  • Figure 1 schematically shows a signalling system for communicating data between a television studio and an interactive toy located in a house via the audio track of a television signal
  • Figure 2 schematically shows an encoder system for mixing data from a data source with the audio track of a television signal for use in the signalling system described with reference to Figure 1;
  • Figure 3 is a plot comparing the power spectrum of a typical audio track of a television signal with that of a modulated data signal with and without spread spectrum encoding
  • Figure 4 shows in more detail decoding circuitry in the toy of the signalling system described with reference to Figure 1;
  • Figure 5 schematically shows a first alternative encoder to the encoder illustrated in Figure 2;
  • Figure 6 schematically shows a first alternative decoder to the decoder illustrated in Figure 4
  • Figure 7 schematically shows a second alternative encoder to the encoder illustrated in Figure 2;
  • Figure 8 schematically shows a second alternative decoder to the decoder illustrated in Figure 4.
  • Figure 9 schematically shows a third alternative encoder to the encoder illustrated in Figure 2;
  • Figure 10 is a plot of a power spectrum of the sensitivity of a human ear with and without the presence of a narrowband tone
  • Figure 11 schematically shows a third alternative decoder to the decoder illustrated in Figure 4.
  • Figure 12 schematically shows a fourth alternative encoder to the encoder illustrated in Figure 2;
  • Figure 13 schematically shows a fourth alternative decoder to the decoder illustrated in Figure 4;
  • Figure 14 schematically shows a fifth alternative encoder to the encoder illustrated in Figure 2;
  • Figure 15a is a timing diagram illustrating a pseudo-random noise code sequence
  • Figure 15b is a timing diagram illustrating a carrier signal which has been phase modulated by the pseudo-noise code sequence illustrated in Figure 15a;
  • Figure 15c is a timing diagram illustrating a sampled signal obtained by sampling the modulated signal illustrated in Figure 15b;
  • Figure 16 schematically shows a fifth alternative decoder to the decoder illustrated in Figure 4;
  • Figure 17 shows in more detail an acquisition unit of the decoder illustrated in Figure 16;
  • Figure 18 shows in more detail a normalisation circuit of the acquisition unit illustrated in Figure 17;
  • Figure 19 shows in more detail an averaging circuit of the normalisation circuit illustrated in Figure 18;
  • Figure 20 is a plot of the output of the normalisation circuit of the acquisition unit illustrated in Figure 17 in the presence of a single-path signal
  • Figure 21 is a plot of the output of a cross-correlator of the acquisition unit shown in Figure 17;
  • Figure 22 shows in more detail the components of a processor used in the fifth alternative decoder
  • Figure 23 shows in more detail a power comparator of the fifth alternative decoder
  • Figure 24 is a plot of the output of the normalisation circuit of the acquisition unit illustrated in Figure 17 in the presence of a multi-path signal
  • Figure 25 schematically shows a sixth alternative decoder to the decoder illustrated in Figure 4.
  • Figure 26 schematically shows a seventh alternative decoder to the decoder illustrated in Figure 4.
  • Figure 27 illustrates a first alternative acquisition unit to the acquisition unit illustrated in Figure 17;
  • Figure 28 illustrates a second alternative acquisition unit to the acquisition unit illustrated in Figure 17;
  • Figure 29 is a plot of the power spectrum of a first alternative data signal to be mixed with an audio track of a television signal
  • Figure 30 is a plot of the power spectrum of a second alternative data signal to be mixed with the audio track of a television signal
  • Figure 31 is a plot of the power spectrum of a third alternative data signal to be mixed with the audio track of a television signal
  • Figure 32 schematically shows a signalling system for communicating data stored in the audio track of a compact disk to a set of lights
  • Figure 33 schematically shows a duplex signalling system for communicating information between a computer and an interactive toy
  • Figure 34 schematically shows the circuitry used in the interactive toy of the signalling system described with reference to Figure 33.
  • Figure 1 shows a television studio 1 in which a video track and an audio track are generated in a conventional manner.
  • an encoder (not shown) at the television studio 1 combines a data signal with the conventional audio track to form a modified audio track.
  • a television signal including the video track and the modified audio track is then sent along a cable 3 to a transmitter 5 which generates a broadcast signal 7 conveying the television signal.
  • the broadcast signal 7 is detected by an aerial 9 on a house 11 and fed, via a cable 13, to a television set 15 located within the house 11.
  • the video track is displayed as images on a screen 17 and the modified audio signal is output through a loudspeaker 19 as an audio signal 21.
  • the broadcast signal 7 can in fact be detected by many different aerials provided they are within the range of the transmitter 5.
  • An interactive toy 23 in the house 11 has a microphone 25 to detect the audio signal 21 output by the loudspeaker 19 and to convert it into a corresponding electrical signal.
  • the microphone 25 is connected to a decoder (not shown) which processes the electrical signal from the microphone 25 to retrieve the data signal encoded in the audio signal 21.
  • the data encoded within the audio track is related to the television programme being broadcast and causes the toy 23 to appear to interact with the television programme by causing it to output sounds relating to the television programme which can be heard by a viewer 27.
  • the data signal in the modified audio track can be used to make the toy 23 also repeat the phrases in order to encourage the child to do the same.
  • an advantageous feature of the above-described communication system is that communication between the television studio 1 and the interactive toy 23 can be achieved using a conventional television set 15 and therefore the consumer does not need to buy a new television set or an additional "set top box".
  • Figure 2 shows the audio encoder 31 which combines the audio track, AUDIO, of the television programme (which is supplied from an audio source 33) with the data signal F(t) (which is supplied from a data source 35) to be transmitted to the interactive toy 23.
  • the data signal F(t) is a binary signal having a bit rate of 21.5 bits per second in which each binary 1 is represented as +1 and each binary 0 is represented as -1, e.g. +1 volt and -1 volt respectively.
  • An advantageous feature of the first embodiment is that a spread spectrum encoding technique is used to spread the energy of the data signal F(t) over a wide range of frequencies. This has the effect of making the data signal less noticeable in the audio signal 21 heard by the viewer 27. In particular, if the data signal F(t) is directly combined with the audio track without such coding, then it is more likely to be heard by the viewer 27 of the television programme.
  • the spreading is performed using direct sequence spread spectrum (DSSS) encoding, in which the data signal F(t) is multiplied by a pseudo-noise (PN) code.
  • PN codes are binary codes which appear to be completely random in nature, but which are in fact deterministic, i.e. they can be reproduced.
  • these codes are generated by exclusive-OR feedback from synchronously clocked registers. By continually clocking the registers, the PN code is cyclically reproduced. The number of registers, the registers used in the feedback path and the initialisation state of the registers determines the length of the code and the specific code produced.
  • the pseudo-noise code generator 37 has eight registers and generates a PN code having 255 bits (which will hereinafter be referred to as chips using the standard nomenclature in the art to distinguish the bits of the PN code from the bits of the data signal to be spread) in a stream with no sequence of more than 8 chips repeated in the 255 chips.
  • a PN code is conventionally referred to as an 8 bit code after the number of registers used to generate it.
  • a binary 0 is added to make the total length of the stream 256 chips.
  • the PN code is generated at a rate of 5512.5 chips per second and each binary 1 is represented as +1 and each binary 0 is represented as -1, e.g. +1 volt and -1 volt.
  • the data signal F(t) and the PN code signal are input to a multiplier 39 where they are multiplied together.
  • each bit of the data signal F(t) is multiplied by a pseudo-random sequence of 256 chips which has the effect of spreading the energy of the data signal F(t) over a broad range of frequencies.
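The code generation and spreading steps described above can be illustrated with a short sketch. The feedback tap positions, the register initialisation state and the example data bits below are assumptions made purely for illustration; the patent does not specify them.

```python
import numpy as np

def lfsr_pn_code(taps=(8, 6, 5, 4), n_stages=8):
    """Pseudo-noise code generator (sketch): exclusive-OR feedback from
    synchronously clocked shift registers.  255 clockings of an 8-stage
    register give the 255-chip code; the tap positions here are an
    illustrative choice, not taken from the patent."""
    state = [1] * n_stages                    # assumed initialisation state
    chips = []
    for _ in range(2 ** n_stages - 1):        # 255 chips
        chips.append(state[-1])               # output the last register
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]          # XOR of the tapped registers
        state = [feedback] + state[:-1]       # shift and feed back
    return np.array(chips)

# Map {0,1} -> {-1,+1} and append one extra 0 (here -1) to give 256 chips.
pn = np.append(2 * lfsr_pn_code() - 1, -1)

# Spreading: each data bit of F(t) (+1/-1) multiplies one full 256-chip sequence.
data_bits = np.array([+1, -1, +1])                        # example data signal F(t)
spread = np.concatenate([bit * pn for bit in data_bits])  # spread signal S(t)
```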
  • the spread signal S(t) output from the multiplier 39 is then input to a modulator 41 which performs a conventional modulation technique, namely continuous phase frequency shift keying (CPFSK), to centre the frequency of the spread signal S(t) at 5512.5 Hz to form a broadband signal H(t).
  • the broadband signal H(t) and the audio track from the audio source 33 are both input to an audio mixer 43 where they are combined in a simple adding operation.
  • the output of the audio mixer 43 then forms the modified audio track to be transmitted, along with the corresponding video track, as the broadcast signal 7.
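A minimal sketch of the modulation and mixing steps follows. The centre frequency and chip rate come from the description above, but the frequency deviation, the mixing gain and the sample rate used at the encoder are illustrative assumptions.

```python
import numpy as np

FS = 22050.0          # sample rate (assumed for this sketch)
CHIP_RATE = 5512.5    # chips per second (from the description above)
FC = 5512.5           # centre frequency of the spread signal H(t)
DELTA_F = 1378.125    # frequency deviation per chip value (illustrative, not from the patent)

def cpfsk_modulate(chips, fs=FS, chip_rate=CHIP_RATE, fc=FC, delta_f=DELTA_F):
    """Continuous phase frequency shift keying: the instantaneous frequency is
    fc + chip * delta_f and the phase is accumulated so it never jumps."""
    samples_per_chip = int(round(fs / chip_rate))        # 4 samples per chip here
    inst_freq = np.repeat(chips, samples_per_chip) * delta_f + fc
    phase = 2 * np.pi * np.cumsum(inst_freq) / fs        # integrate the frequency
    return np.cos(phase)

def mix_with_audio(audio, broadband, gain=0.05):
    """Audio mixer 43: a simple adding operation (gain chosen for illustration)."""
    n = min(len(audio), len(broadband))
    return audio[:n] + gain * broadband[:n]

# Example: modulate a stand-in spread signal S(t) and add it to an audio track.
spread = np.random.choice([-1.0, 1.0], size=256)
h_t = cpfsk_modulate(spread)
audio = np.zeros(len(h_t))                               # stand-in for the audio track
modified_audio = mix_with_audio(audio, h_t)
```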
  • Figure 3 shows a typical audio signal 51 that is in the frequency range of 0 to 18 kHz with, as is normally the case, the power being predominantly concentrated at the lower frequencies. Beyond 15 kHz the sensitivity of the human ear deteriorates and the majority of people cannot hear frequencies above 20 kHz.
  • Figure 3 also shows the modulated data signal 53 which would result if no spreading was carried out and the data signal F(t) was directly modulated by the modulator 41. As shown, this modulated data signal 53 is a narrowband signal centred at approximately 5.5 kHz and having a peak power significantly above the power level of the audio signal 51 at that frequency.
  • the spread signal 55 is obtained which has a power spectrum with a main band spread between 0 and 11 kHz and harmonic bands at higher frequencies.
  • the peak power level is significantly reduced.
  • the spread signal 55 is not noticeable to the listener 27 or is heard only as background white noise.
  • the majority of the energy of the main band is in a frequency range for which most conventional television loudspeakers work satisfactorily. Thus, there is no requirement for a user to obtain a new television set to take advantage of the invention.
  • Figure 4 shows the circuitry which is provided, in this embodiment, in the toy 23.
  • the toy 23 includes the microphone 25 which picks up the audio signal 21 emitted by the loudspeaker 19 of the television set 15 and converts it into an electrical signal D(t).
  • This electrical signal D(t) is then input to a decoder 63 in which it first passes through a filter 65 to remove high frequency components including the higher harmonic bands of the broadband signal H(t).
  • the filtered signal is then input to an analogue to digital convertor (ADC) 67 where it is converted into a digital signal.
  • the ADC 67 has a sampling rate of 22.05 kHz, which is twice the highest frequency of the main energy band of the broadband signal H(t).
  • the digital signal is then input to a demodulator 69 to demodulate the CPFSK modulated signal.
  • the demodulated signal B(t) output by the demodulator 69 is then input to a correlator 71 which correlates the demodulated signal B(t) with the same binary PN code used to spread the spectrum of the data signal F(t) in the encoder 31.
  • the correlator 71 includes a digital matched filter which is matched to the PN code used to spread the spectrum of the data signal in the encoder 31.
  • a pseudo-noise code generator 73 which generates this PN code is connected to the correlator 71 and generates a pseudo-noise code which is used to set the parameters of the digital matched filter.
  • the digital matched filter will output relatively sharp positive and negative peaks when there is a match between the pseudo-noise code and the demodulated signal.
  • a positive peak is generated when the received signal matches the pseudo-noise code and a negative peak is generated when the received signal matches the inverse of the pseudo-noise code.
  • the correlator 71 also includes circuitry to convert the peaks emitted by the digital matched filter into a binary signal F'(t) which represents a regenerated version of the original data signal F(t).
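The correlation step can be sketched as a sliding correlation against the 256-chip PN code: a strong positive peak marks a "1" and a strong negative peak marks a "0". The peak threshold and the assumption of one sample per chip are illustrative simplifications.

```python
import numpy as np

def despread(demodulated, pn, threshold=0.5):
    """Correlate the demodulated signal B(t) with the PN code and convert the
    positive/negative correlation peaks into a regenerated bit stream F'(t).
    The threshold is an illustrative value, not taken from the patent."""
    n = len(pn)
    # Sliding correlation, normalised so a perfect match scores +/-1.
    scores = np.correlate(demodulated, pn, mode="valid") / n
    bits = []
    for i in range(0, len(scores), n):       # one peak expected per 256-chip bit period
        window = scores[i:i + n]
        peak = window[np.argmax(np.abs(window))]
        if abs(peak) >= threshold:
            bits.append(1 if peak > 0 else 0)
    return bits

# Example: three bits spread by a stand-in PN code are recovered as [1, 0, 1].
pn = np.random.choice([-1.0, 1.0], size=256)
received = np.concatenate([b * pn for b in (+1, -1, +1)])
print(despread(received, pn))
```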
  • the regenerated data signal F'(t) is then input to a processor 75 which, in accordance with a program stored in a memory 77, identifies from the regenerated data signal F'(t) a sound file stored in the memory 77 which is to be played to the viewer 27 via a loudspeaker 79 provided on the toy 23.
  • the toy 23 can be made to emit sounds in response to the television programme being shown on the television set 15.
  • the memory 77 is a detachable memory so that a different memory 77 can be placed in the toy 23 for each television programme. In this way, the sound files output by the toy 23 can be updated.
  • the energy of the main band of the broadband signal H(t) is predominantly positioned in the frequency range 2kHz to 9kHz.
  • the sensitivity of the human ear in this frequency range increases with frequency and for a typical audio signal, in which the energy is concentrated at low frequencies, the higher frequencies of the main band of the broadband signal H(t) can become noticeable.
  • a second embodiment of the invention will now be described with reference to Figures 5 and 6 in which the frequency spectrum of the broadband signal H(t) is adjusted, prior to combining with the audio track, to concentrate the energy of the broadband signal H(t) more at low frequencies so that the broadband signal H(t) is less noticeable to the human ear.
  • the broadband signal H(t) output from the modulator 41 is passed through a pre-emphasis circuit 85 before being mixed with the audio track in the audio mixer 43.
  • the pre-emphasis circuit 85 applies a shaping algorithm which multiplies the broadband signal H(t) by a frequency-dependent scaling factor.
  • the pre-emphasis circuit 85 is formed by an appropriate digital filter.
  • the output H1(t) of the pre-emphasis circuit 85 is then input to the audio mixer 43 where it is combined with the audio track as before.
  • Pre-emphasis has been conventionally applied to radio frequency spread spectrum communication systems to amplify the higher frequencies of a signal because at higher frequencies noise becomes more of a problem.
  • the shaping algorithm applied in the pre-emphasis circuit 85 reduces the amplitude of the broadband signal H(t) at higher frequencies to make the broadband signal H(t) less noticeable to the listener 27.
  • the shaping filter used in this embodiment tapers the amplitude of the broadband signal H(t) by a function whose variation with frequency f lies between a 1/f and a 1/f² function.
  • the frequency variation is approximately inverse to the sensitivity of the human ear.
  • Figure 6 shows the decoder 89 (which is located in the toy 23) which is used in this embodiment.
  • the digital signal output by the ADC 67 is input to a de-emphasis circuit 91 which applies an inverse shaping algorithm to the shaping algorithm used in the pre-emphasis circuit 85 of the encoder 83.
  • the de-emphasis circuit is formed by an appropriate digital filter.
  • the output of the de-emphasis circuit 91 is then input to the demodulator 69 and processed as before.
  • the broadband signal H(t) is shaped so that the energy is concentrated more at the lower frequencies. While this may increase errors in the regeneration of the original data signal, the broadband signal is less noticeable to the listener 27 when combined with the audio track and output as audio signal 21. Standard error detection or error detection and correction techniques can be used to help ensure that the original data signal is recovered even if the data signal regenerated by the correlator 71 includes occasional errors.
  • a third embodiment of the invention will now be described with reference to Figures 7 and 8.
  • in the previously described embodiments, the broadband signal H(t) combined with the audio track in the audio mixer 43 was generated independently of the audio track.
  • a dynamic shaping algorithm is used which adjusts the energy of the broadband signal H(t) in dependence upon the energy of the audio track.
  • FIGs 7 and 8 features which are identical to those of the first embodiment have been referenced with the same numerals and will not be described again.
  • the audio track generated by the audio source 33 is input to a power monitor 97.
  • the power monitor 97 continually monitors the power of the audio signal and outputs a signal indicative of this power to a pre-emphasis circuit 99.
  • the broadband signal H(t) output by the modulator 41 is also input to the pre-emphasis circuit 99 where it is multiplied by a time-varying scaling factor which is determined using the signal from the power monitor 97.
  • the value of the scaling factor at any time is calculated so that the power of the broadband signal H(t) is a fixed amount below the power of the portion of the audio track with which it will be combined, unless this results in the power of the broadband signal H(t) falling below a threshold level in which case the power of the broadband signal H(t) is set equal to that threshold level.
  • the scaled signal H2(t) output by the pre-emphasis circuit 99 is then combined in the audio mixer 43 with the audio track, the audio track having passed through a time delay unit 101.
  • the time delay introduced by the time delay unit 101 corresponds to the time required by the power monitor 97 to analyse the audio track and for the pre-emphasis circuit 99 to generate and apply the scaling factor to the broadband signal H(t).
  • each portion of the audio track is combined in the audio mixer 43 with the portion of the broadband signal H(t) which has been scaled in accordance with the energy in that portion of the audio track.
  • a dynamic shaping algorithm is advantageous because a fixed shaping algorithm cannot ensure that the level of the broadband signal H(t) is both sufficiently high that it can be decoded during loud passages of the audio track and sufficiently low so that it is unobtrusive during quiet passages of the audio track.
  • with a dynamic shaping algorithm it is possible to ensure that the scaled signal H2(t) can be satisfactorily decoded at all times.
  • a minimum level of the scaled signal H2(t) must be maintained in order to prevent information from being lost. In this embodiment, this is achieved by setting the threshold level below which the power of the scaled signal H2(t) is not allowed to fall.
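The dynamic shaping can be sketched as a block-by-block scaling operation. The block length, the margin below the audio power and the floor threshold are illustrative values, since the patent does not give figures for them.

```python
import numpy as np

def dynamic_scale(broadband, audio, block=1024, margin_db=12.0, floor=1e-3):
    """Pre-emphasis circuit 99 (sketch): for each block, scale the broadband
    signal so its power sits a fixed amount (margin_db) below the power of the
    corresponding audio block, but never below a minimum threshold (floor).
    Block length, margin and floor are assumptions made for illustration."""
    out = np.copy(broadband)
    for start in range(0, len(broadband), block):
        a = audio[start:start + block]
        b = broadband[start:start + block]
        if len(b) == 0:
            break
        audio_power = np.mean(a ** 2) if len(a) else 0.0
        target_power = max(audio_power * 10 ** (-margin_db / 10.0), floor)
        b_power = np.mean(b ** 2)
        if b_power > 0:
            out[start:start + block] = b * np.sqrt(target_power / b_power)
    return out
```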
  • the electrical signal D(t) input to the decoder 105 of the toy 23 is input to the ADC 67 (which includes an anti-aliasing filter for removing unwanted high frequencies).
  • the digital signal output by the ADC 67 is then input to a power monitor 107 which continually monitors the power of the digital signal and outputs a signal indicative of this power to a de-emphasis circuit 109.
  • the de-emphasis circuit 109 generates a time-varying scaling factor which is approximately inverse to the scaling factor applied in the encoder 95.
  • the digital signal output by the ADC 67 is also input, via a time delay unit 111, to the de-emphasis circuit 109 where it is multiplied by the scaling factor generated by the de-emphasis circuit 109.
  • the time delay unit 111 introduces a time delay which ensures that the value of the scaling factor for each portion of the digital signal output by the ADC 67 is determined by the power of that portion.
  • the power of the broadband signal is maintained a fixed amount below that of the audio signal unless the power of the broadband signal falls below a threshold value in which case it is set at the threshold value. In this way the broadband signal can be made less noticeable to the listener 27 when combined with the audio track and output as the audio signal 21.
  • Figure 9 shows the encoder 115 of the fourth embodiment in which the audio track is input to a psycho-acoustic analysis unit 117 to determine scaling information for the broadband signal H(t).
  • the audio track is digitally sampled at a sampling rate of 22.05kHz and for each sample the frequency spectrum of the continuous stream of 1024 samples ending with that sample is determined. In this way a "sliding window" containing 1024 samples is analysed.
  • the psycho-acoustic analysis unit 117 calculates the energy in ten non-overlapping frequency sub-bands spanning 1kHz to 11kHz and applies a psycho-acoustic algorithm to generate frequency-dependent scaling factors. For each window of samples, the psycho-acoustic algorithm calculates, for each frequency sub-band of the window, a theoretical level below which the human ear cannot distinguish any sound given the content of the audio track.
  • Figure 10 shows the sensitivity of a typical human ear for different frequencies (in other words, the minimum sound levels for different frequencies which can be heard by a typical human ear) without any background noise (the plot referenced as 123) and in the presence of a narrow band signal 125 (the dashed plot referenced as 127).
  • the ability of the human ear to distinguish sound in the frequency range of the narrow band signal 125 and in a range of frequencies both above and below the frequency range of the narrow band signal 125 is significantly reduced. There are therefore audio signals which cannot be heard by the human ear in the presence of the narrowband signal 125, even though they would be heard if the narrowband signal is not present.
  • a similar effect to that described above with reference to the frequency domain also exists in the time domain in that after a loud sound stops, the human ear does not immediately recover the sensitivity indicated by the plot 123.
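The sub-band analysis can be sketched as follows. The real psycho-acoustic algorithm derives masking thresholds from the behaviour of the ear shown in Figure 10; the stand-in below simply places each threshold a fixed number of decibels below the measured sub-band energy, which is only a crude approximation of that behaviour.

```python
import numpy as np

FS = 22050.0                                   # sampling rate from the description above
SUBBANDS = np.linspace(1000.0, 11000.0, 11)    # ten non-overlapping sub-bands, 1-11 kHz

def masking_thresholds(window, offset_db=10.0):
    """Crude stand-in for the psycho-acoustic algorithm of unit 117: estimate the
    energy of a 1024-sample window in each sub-band and place the masking
    threshold a fixed number of dB below it (offset_db is an assumed value)."""
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window)))) ** 2
    freqs = np.fft.rfftfreq(len(window), d=1.0 / FS)
    thresholds = []
    for lo, hi in zip(SUBBANDS[:-1], SUBBANDS[1:]):
        band_energy = spectrum[(freqs >= lo) & (freqs < hi)].sum()
        thresholds.append(band_energy * 10 ** (-offset_db / 10.0))
    return np.array(thresholds)                # one scaling target per sub-band

window = np.random.randn(1024)                 # one "sliding window" of audio samples
print(masking_thresholds(window))
```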
  • the frequency-dependent scaling factors generated by the psycho-acoustic analysis unit are input to a scaling unit 119.
  • the broadband signal H(t) output by the modulator 41 is also input to the scaling unit 119 where the broadband signal H(t) is scaled based on the frequency-dependent scaling factors. In this embodiment, this is achieved by passing the broadband signal H(t) through a filter, whose frequency response has been set in accordance with the current frequency-dependent scaling factors.
  • the output P(t) of the scaling unit 119 is then combined in the audio mixer 43 with the audio track, the audio track having passed through a time delay unit 101 to ensure that each portion of the shaped broadband signal P(t) is combined with the portion of the audio track which was analysed to provide the scaling factors.
  • the scaling factors output by the psycho-acoustic analysis unit 117 are set to ensure that if the audio track output from the audio mixer 43 is subsequently encoded in an audio encoder which utilises a psycho-acoustic evaluation of the audio track to remove redundant information, then the broadband signal H(t) will still be maintained in the encoded audio track.
  • the level of the broadband signal H(t) in each frequency sub-band is set to be on or just above the minimum theoretical sound level that can be distinguished by the ear.
  • the broadband signal H(t) will not be removed by such encoders while still being hardly noticeable to a listener.
  • the electrical signal D(t) from the microphone 25 is digitally sampled by an ADC 67 at a sampling rate of 22.05kHz.
  • the digitally sampled signal is input to a psycho-acoustic analysis unit 133 which applies the same psycho-acoustic analysis algorithm as applied in the psycho-acoustic analysis unit 117 in the encoder 115 to generate estimates of the minimum audible levels calculated in the encoder 115.
  • the psycho-acoustic analysis unit 133 then generates inverse frequency-dependent scaling factors, based on these estimated minimum audible levels, which are designed to reverse the effect of the scaling factors applied in the encoder 115, and outputs the inverse scaling factors to a scaling unit 135.
  • the digital signal output by the ADC 67 is input, via a time delay unit 137, to the scaling unit 135 where it is scaled by the frequency-dependent inverse scaling factors, again by passing the delayed signal through an appropriate filter whose frequency response has been set using the current set of frequency-dependent inverse scaling factors.
  • the time delay unit 137 introduces a time delay which ensures that each portion of the digital signal is scaled by the inverse scaling factors generated for that portion.
  • the use of a psycho-acoustic algorithm as described above has the advantage that the energy distribution of the broadband signal can be adjusted to reduce its obtrusiveness when combined with the audio track. Further, by a suitable setting of the frequency-dependent scaling factors, if the modified audio track is subsequently encoded using a psycho-acoustic algorithm the possibility of the broadband signal H(t) being removed to such an extent that the data signal F(t) cannot be regenerated is reduced.
  • the frequency-dependent scaling factors are generated for a sliding window which moves in steps of a single sample. It has been found, however, that the bit error rate (BER) of the decoder is significantly reduced if the same frequency-dependent scaling factors are applied throughout each segment of the broadband signal H(t) corresponding to one data bit of the data signal F(t).
  • BER bit error rate
  • Figure 12 shows the encoder of the fifth embodiment in which the audio track is input to a segmentation unit 143 which separates the audio track into segments whose duration is equal to the duration of a single data bit (which in this embodiment is approximately 46 ms).
  • the audio track is then input segment-by-segment into a psycho-acoustic analysis unit 145 which analyses the frequency spectrum of each segment to generate frequency-dependent scaling factors which are output to a scaling unit 147.
  • the psycho-acoustic algorithm for generating the scaling factors for a segment is identical to that used in the fourth embodiment described above.
  • the broadband signal H(t) is also input, via a segmentation unit 149, to the scaling unit 147.
  • the segmentation unit 149 separates the broadband signal H(t) into segments with each segment containing 256 chips which correspond to a single data bit of the data signal F(t).
  • each segment output by the segmentation unit 149 is scaled using a filter whose frequency response has been set in accordance with the current set of frequency-dependent scaling factors output by the psycho-acoustic analysis unit 145.
  • the shaped broadband signal P'(t) output by the scaling unit is then input to the audio mixer 43 where it is combined with the audio track, the audio track having passed through a time delay unit 101 to ensure that each segment of the shaped broadband signal P'(t) is combined with the segment of the audio track which was analysed to provide the scaling factors for that segment of the shaped broadband signal P'(t).
  • the digital signal output by the ADC 67 is input to a segmentation unit 153 which separates the digital signal into segments each containing 1024 samples.
  • the output of the correlator 71 is fed back to the segmentation unit 153 to provide timing information from which the segmentation unit 153 can determine where the start and end of each segment is positioned.
  • Each segment is then input to a psycho-acoustic analysis unit 155 which analyses the energy content of the segment in the same manner as the psycho-acoustic analysis unit 145 in the encoder 141, but generates frequency-dependent scaling factors which are approximately inverse to those generated in the encoder 141.
  • the generated scaling factors are then input to a scaling unit 157.
  • Each segment output by the segmentation unit 153 is also input, via a time delay unit 159, to the scaling unit 157 where the delayed samples are filtered by a filter having a frequency response set by the corresponding frequency-dependent scaling factors output by the psycho-acoustic analysis unit 155.
  • the time delay unit 159 introduces a time delay to allow for the time required for the psycho-acoustic analysis unit 155 to generate the scaling factors and therefore each segment is scaled using scaling factors generated by analysing the energy distribution of that segment.
  • the signal output by the scaling unit 157 is then demodulated and correlated with the pseudo-noise code sequence to regenerate the data signal F(t) .
  • the pseudo-noise code generator 37 generates an 8 bit code having 255 bits.
  • the broadband signal H(t) generated using an 8 bit code, while often satisfactory, may still be noticeable in the audio signal output by the loudspeaker 19.
  • in this embodiment, therefore, a longer pseudo-noise code sequence is used.
  • the signal detected by the microphone 25 is despread by synchronously multiplying the detected signal D(t) with the same pseudo-noise code as was used for encoding the data signal F(t), because this enables a more reliable regeneration of the original data signal to be achieved.
  • Despreading by synchronous multiplication is commonly referred to as coherent despreading.
  • Figure 14 shows the encoder 161 used in this embodiment.
  • the data signal F(t) is a logic signal which, as in the previous embodiments, is generated at approximately 21.5 bits per second.
  • a phase shift keying (PSK) modulation technique is used to modulate the spread data signal I(t) to form a spread signal G(t) centred at about 5.5kHz.
  • the encoder 161 includes two pseudo-noise code generators 163, 165 for spreading the data signal F(t) .
  • the first pseudo-noise code generator 163 generates a code sequence PN1, by repeating a first 12 bit PN code with an extra binary 0 added to the end of each sequence of 4095 chips, at a chip rate of 5512.5 Hz.
  • the second pseudo-noise code generator 165 generates a code sequence PN2, by repeating a second different 12 bit PN code with an extra binary 0 added to the end of each sequence of 4095 chips, at a chip rate of 5512.5 Hz.
  • the first and second PN codes are orthogonal to each other and therefore if they are multiplied together chip by chip another pseudo-noise sequence is generated.
  • the data signal F(t) and PN1 are input to a first AND gate 167a while the inverse of the data signal F(t) and PN2 are input to a second AND gate 167b, with the outputs of the first and second AND gates connected together to form a common output.
  • a logic signal I(t) is formed at the common output in which every "1" of the data signal F(t) has been converted into a 256 chip sequence from PN1 and every "0" of the data signal F(t) has been converted into a 256 chip sequence from PN2.
  • the logic signal I(t) is input to a modulator 169 which uses phase shift keying to modulate a 5512.5 Hz carrier signal generated by an oscillator 171.
  • Figures 15A and 15B show an eight chip sequence of the logic signal I(t) and the corresponding modulated signal G(t) output by the modulator 169 respectively.
  • each time the logic signal I(t) changes value, a phase shift of 180° is introduced into the carrier signal.
  • the spread signal G(t) is generated with a main energy band from baseband to 11kHz and higher frequency sub-bands.
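The bit-to-code mapping and the phase shift keying can be sketched as follows. The stand-in PN1 and PN2 sequences below are random ±1 arrays rather than genuine 12-bit register outputs, and the carrier generation is a simplified model; both are assumptions made for illustration.

```python
import numpy as np

CHIP_RATE = 5512.5
FS = 4 * CHIP_RATE                 # 22.05 kHz, four samples per chip
FC = 5512.5                        # carrier frequency

def select_chips(bits, pn1, pn2, chips_per_bit=256):
    """Equivalent of the AND-gate arrangement 167a/167b: a '1' takes the next
    256 chips of PN1, a '0' takes the next 256 chips of PN2."""
    out = []
    for i, bit in enumerate(bits):
        seq = pn1 if bit == 1 else pn2
        start = (i * chips_per_bit) % len(seq)
        out.append(seq[start:start + chips_per_bit])
    return np.concatenate(out)     # the logic signal I(t) as +/-1 chips

def psk_modulate(chips):
    """Phase shift keying: a chip of -1 inverts (180 degree phase shift) the carrier."""
    samples_per_chip = int(FS / CHIP_RATE)
    t = np.arange(len(chips) * samples_per_chip) / FS
    carrier = np.cos(2 * np.pi * FC * t)
    return np.repeat(chips, samples_per_chip) * carrier   # spread signal G(t)

# Illustrative 4096-chip codes standing in for PN1 and PN2.
pn1 = np.random.choice([-1.0, 1.0], size=4096)
pn2 = np.random.choice([-1.0, 1.0], size=4096)
g_t = psk_modulate(select_chips([1, 0, 1], pn1, pn2))
```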
  • the spread signal G(t) output from the modulator 169 is then input to an audio mixer 173 where it is combined with the audio track using a simple adding operation to form the modified audio track.
  • Figure 16 shows the decoder 181 of the sixth embodiment.
  • the electrical signal D(t) from the microphone 25 is input to an ADC 183 (which includes an anti-aliasing filter to remove unwanted high frequencies) where it is sampled at a rate of about 22.05 kHz (4 times the chip rate), the exact rate being determined by a clock signal clk, to form a digital signal J(t).
  • Figure 15C shows an example of the samples obtained by sampling an electrical signal D(t) containing just the spread signal G(t) illustrated in Figure 15B.
  • the digital signal J(t) is separately multiplied by the code sequences PN1 and PN2. It is, however, necessary to ensure that the chip sequence in the electrical signal D(t) and the chip sequences of the codes PN1 and PN2 are time-synchronised.
  • the digital signal J(t) is input to an acquisition unit 185.
  • the acquisition unit 185 generates signals which are analysed by a processor 187 which generates a signal F to control the rate of the clock signal clk generated by a clock 189, and signals X and Y for controlling the timing of the generation of the codes by the first and second pseudo-noise code generators 191 and 193 respectively.
  • Figure 17 shows in more detail the contents of the acquisition unit 185 and the signals that are generated by the acquisition unit 185 and supplied to the processor 187.
  • the processor 187 first removes any frequency offset between the chip rate of the chip sequence in the electrical signal D(t) and the chip rate of the first and second pseudo-noise code generators 191, 193 by adjusting the clock rate. Such an offset is unavoidable because there will always be a slight difference between the clock frequencies used to generate the pseudo-noise codes in the encoder 161 and in the decoder 181 respectively.
  • the samples of the digital signal J(t) from the microphone 25 are input to a series of four digital matched filters 211a to 211d which are arranged in series so that the cascade output (indicated in Figure 17 by a) of the first matched filter is input to the second matched filter and so on.
  • Each filter has 2048 taps, with the taps of the first matched filter 211a being matched to the first two bits of a SYNC byte, the taps of the second matched filter 211b being matched to the third and fourth bits of the SYNC byte, the taps of the third matched filter 211c being matched to the fifth and sixth bits of the SYNC byte and the taps of the fourth matched filter being matched to the seventh and eighth bits of the SYNC byte (1024 taps are required for each data bit because each data bit is multiplied by 256 chips and each chip is to be sampled four times).
  • the time window over which each of the matched filters performs the correlation is much smaller than that of the larger single matched filter.
  • the effect of lack of synchronisation will be less noticeable for each of the individual smaller matched filters.
  • larger frequency offsets between the chip rate in the received electrical signal D(t) and the chip rate of the codes PN1 and PN2 can be tolerated by using the four matched filters 211 rather than a single matched filter.
  • the score output by each of the matched filters 211 (which is indicated by output b and which is updated at each clock pulse as the samples of J(t) are clocked through the matched filters) is input to a corresponding one of four normalisation circuits 213a to 213d.
  • the normalisation circuits 213 provide a normalised output for a wide dynamic signal range of the input electrical signal D(t). This enables the output of the normalisation unit to be analysed by a simple thresholding operation.
  • Figure 18 shows schematically the contents of each normalisation circuit 213.
  • the current score from the corresponding matched filter 211 is input to a time delay unit 221 where it is delayed for 1024 clock periods, which corresponds to the time taken for the samples of the digital signal J(t) to propagate halfway through the corresponding one of the matched filters 211.
  • the current score is also input to an averaging circuit 223 which uses the current score to update a running average of the last 2048 scores.
  • the output of the time delay unit 221 is then input to a divider 225 which divides the delayed score by the current value of the running average, to produce the normalised output.
  • the above processing makes the normalisation circuit particularly well suited to systems where a spread spectrum signal is hidden in an acoustic signal, because the acoustic signal will typically vary over a large dynamic range.
  • Figure 19 shows in more detail the contents of the averaging circuit 223.
  • the current score is input to a time delay unit 231, where it is delayed for 2048 clock periods, and an adder 233 where the inverse of the time delayed score is added to the current score.
  • the output of the adder 233 is then input to a second adder 235 where it is added to the current value of the running average (delayed by one clock cycle) output by the time delay unit 237, to generate a new current value of the running average which is used by the divider circuit 225.
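The normalisation can be sketched in software as a delayed score divided by a running average of the most recent scores, in place of the clocked delay lines and adders of Figures 18 and 19.

```python
import numpy as np

def normalise_scores(scores, window=2048, delay=1024):
    """Normalisation circuit 213 (sketch): divide each score, delayed by half the
    averaging window, by the running average of the most recent `window` scores."""
    scores = np.asarray(scores, dtype=float)
    out = np.zeros_like(scores)
    running_sum = 0.0
    history = np.zeros(window)                 # stands in for the 2048-stage delay line
    for i, s in enumerate(scores):
        running_sum += s - history[i % window] # add the new score, drop the oldest
        history[i % window] = s
        average = running_sum / window
        j = i - delay                          # the delayed score being normalised
        if j >= 0 and average != 0:
            out[j] = scores[j] / average
    return out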
  • sequences of SYNC bytes are repeated intermittently within the data signal F(t) output by the data source 35.
  • Figure 20 shows a typical output of one of the normalisation circuits 213, when two consecutive SYNC bytes pass through the corresponding matched filter 211.
  • reference timings 241 have been illustrated which are separated by 8192 clock periods (nominally corresponding to the time required for the samples corresponding to one SYNC byte to pass through the matched filter).
  • the period between two adjacent reference timings 241 will hereinafter be referred to as a frame.
  • a first peak 243 in the normalised score, corresponding to the first SYNC byte, occurs a time τ1 after the nearest preceding reference timing 241, while a second peak 245, corresponding to the second SYNC byte, occurs a time τ2 after the nearest preceding reference timing 241.
  • each normalisation circuit 213 is input to a corresponding cross-correlator 215a to 215d where it is cross-correlated with the output from the same normalisation circuit for the immediately preceding frame. This is achieved by passing the output score from each normalisation unit 213 through a corresponding time delay unit 217a to 217d which delays the scores by one frame. The output from the normalisation circuit 213 is then cross correlated with the corresponding delayed output, by the cross-correlator 215. In this embodiment, a maximum frequency offset of three chips per SYNC byte is anticipated.
  • the cross-correlators 215 only look for a cross-correlation peak over a range of time offsets between the two frames, varying between a three chip lead and a three chip lag. This results in a significant reduction in the amount of processing required by the cross-correlators 215.
  • Figure 21 shows a typical output of one of the cross-correlators 215.
  • the x-axis corresponds to the time offset between the two frames output by the normalisation circuit 213 and the y-axis corresponds to the score output by the cross-correlator 215.
  • twenty-five values of the cross-correlator score are obtained to allow for a maximum time offset of ± three chips.
  • a cross-correlation peak 251 occurs at a time offset τoff which is equal to τ2 - τ1.
  • the time offset for each of the matched filters 211a-211d should be identical and therefore the position of the cross-correlation peak 251 in the output of each of the cross-correlators 215 should be the same.
  • the outputs of the four cross-correlators 215 are therefore added together in an adder 219 and the output of the adder 219, labelled OFFSET in Figure 17, is input to the processor 187.
  • the processor 187 calculates the frequency offset (from τoff and the size of the correlation window of the matched filters 211) and sends a control signal F to the clock 189 to adjust the clock frequency in order to reduce the frequency offset to a manageable amount.
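The frame-to-frame cross-correlation can be sketched as a search over a small range of lags. The conversion of the lag into a fractional chip-rate error shown at the end is an illustrative reading of the description above rather than the exact calculation performed by the processor 187.

```python
import numpy as np

FRAME = 8192                 # clock periods per SYNC byte
MAX_LAG = 12                 # +/- 3 chips at 4 samples per chip

def frame_offset(prev_frame, this_frame, max_lag=MAX_LAG):
    """Cross-correlator 215 (sketch): correlate the normalised scores of the
    current frame against those of the previous frame for a small range of lags
    and return the lag giving the largest score."""
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        shifted = np.roll(prev_frame, lag)
        score = float(np.dot(this_frame, shifted))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag          # tau_off in clock periods per frame

# A drift of `lag` clock periods over one 8192-period frame corresponds to a
# fractional chip-rate error of lag / 8192 (illustrative interpretation).
prev = np.zeros(FRAME); prev[1000] = 1.0
curr = np.zeros(FRAME); curr[1003] = 1.0
lag = frame_offset(prev, curr)
fractional_offset = lag / FRAME
```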
  • the processor 187 is a microprocessor based system which is schematically illustrated in Figure 22.
  • the processor includes an interface circuit 255 for interfacing a central processing unit (CPU) 257 with the normalised scores Ai, Bi, Ci and Di output from the normalisation circuits 213 and for interfacing the CPU with the adder 219 and the two pseudo-noise generators 191 and 193.
  • the interface circuit 255 also receives a signal (TRACK) which is used in tracking which will be described in more detail below.
  • the processor 187 processes the values received from the interface 255 in accordance with predetermined instructions stored in a program memory 259.
  • a working memory (RAM) 261 is also provided in which the CPU 257 can perform the various calculations.
  • a user interface 263 is also provided to allow a user to adjust the settings of the processor 187, for example in order to change or alter the program instructions stored in the program memory 259 so that the decoder can be reconfigured.
  • data encoded in the digital signal J(t) can be extracted by inputting the digital signal J(t) to a correlator unit 195 (indicated by dotted lines) in which the digital signal J(t) is synchronously multiplied by PN1 and PN2.
  • the correlator unit 195 comprises three channels labelled late, on-time and early. As will be explained in detail, the three channels enable the time synchronisation to be tracked while data other than the SYNC byte is being transmitted.
  • the digital signal J(t) is input into each of the three channels of the correlator unit 195 and in each channel it is separately multiplied by PN1 and PN2.
  • in the late channel, the digital signal J(t) is input to a multiplier 199a, where it is multiplied by PN1 time-delayed by two clock periods by a time delay unit 197a, and to a multiplier 201a, where it is multiplied by PN2 time-delayed by two clock periods by a time delay unit 197c.
  • in the on-time channel, the digital signal is input to a multiplier 199b, where it is multiplied by PN1 time-delayed by one clock period by a time delay unit 197b, and to a multiplier 201b, where it is multiplied by PN2 time-delayed by one clock period by a time delay unit 197d.
  • in the early channel, the digital signal is input to a multiplier 199c, where it is multiplied by PN1, and to a multiplier 201c, where it is multiplied by PN2.
  • the outputs of the two multipliers 199 and 201 are input to a corresponding power comparator 203a to 203c, which is shown in more detail in Figure 23.
  • the outputs of the two multipliers 199 and 201 are input to respective bandpass filters 267a and 267b which are centred on the carrier frequency.
  • the output of each bandpass filter 267 is then input to a respective power monitor 269a or 269b, which determines the power of the signal output from the corresponding bandpass filter 267.
  • when the received data bit is a "0", so that the incoming chip sequence matches PN2, the output from the power monitor 269b should be greater than the output from the power monitor 269a.
  • conversely, when the received data bit is a "1", so that the incoming chip sequence matches PN1, the output from the power monitor 269a should be greater than the output from the power monitor 269b. Therefore, the outputs from the power monitors 269 are input to a comparator 271 which outputs a value in dependence upon the difference between the outputs of the two power monitors 269.
  • the output from the power monitor 269a is input to the positive terminal of the comparator 271 and the output from power monitor 269b is input to the negative input of the comparator 271. Therefore, if the received data bit is a "1", then the output of the comparator 271 will be positive and if the received data bit is a "0", then the output of the comparator 271 will be negative.
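The bit decision of the power comparator can be sketched with an FFT-based band-power measurement standing in for the analogue bandpass filters 267 and power monitors 269; the bandwidth value is an assumption.

```python
import numpy as np

FS = 22050.0
FC = 5512.5
BANDWIDTH = 200.0            # bandpass width around the carrier (illustrative value)

def band_power(signal, fc=FC, bw=BANDWIDTH):
    """Power of the signal in a narrow band around the carrier frequency
    (an FFT stand-in for bandpass filter 267 followed by power monitor 269)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / FS)
    band = (freqs > fc - bw / 2) & (freqs < fc + bw / 2)
    return spectrum[band].sum()

def decide_bit(despread_pn1, despread_pn2):
    """Comparator 271: positive decision (bit '1') when the PN1 branch carries the
    stronger carrier component, bit '0' otherwise."""
    return 1 if band_power(despread_pn1) > band_power(despread_pn2) else 0
```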
  • the output of the power comparator 203b in the on-time channel is input to a data regeneration circuit where it is converted into the regenerated data signal F'(t).
  • the output of the power comparator 203b in the on-time channel is also input, together with the outputs of the power comparators 203a and 203c of the late and early channels, into an analysis unit 207.
  • the analysis unit 207 determines which of the power comparators 203 provides the largest output, or in other words in which channel there is the best match between the chip sequence in the digital signal J(t) and the codes PN1 and PN2.
  • if the power comparator of the late channel or of the early channel produces the largest output, the analysis unit 207 sends a signal (on the control line labelled TRACK) to the processor 187 indicating that the clock should skip a sample, or make a double sample, as appropriate, so that the power comparator 203b in the on-time channel once more produces the largest output.
  • Figure 20 shows an exemplary output of the normalisation circuit in which only a single peak 245 is present in each frame, corresponding to a single acoustic path between the loudspeaker 19 and the microphone 25.
  • the television set 15 is located within a room and generally different paths will exist between the loudspeaker 19 and the microphone 25 (due to reflections off walls etc.) and the signals travelling along these different paths arrive at the microphone 25 at different times.
  • the electrical signal D(t) generated by the microphone 25 has several components, with each component corresponding to a path with a different time of flight, and a peak is formed corresponding to each of these components .
  • in the output shown in Figure 24, the strongest component appears a time τ1 after the reference timing 241, followed by further peaks at times τ2, τ3, τ4 and τ5.
  • the acquisition unit 185 described above is robust in the presence of these multi-path peaks because, although the signals output by the normalisation circuits will include several peaks per frame, the output of each cross-correlator 215 should still contain just a single peak as the intervals between the peaks should remain constant. Further, the determination of the time synchronisation in this embodiment is only based on the largest peak output by a normalisation circuit 213 and therefore the smaller peaks corresponding to the other paths are ignored. However, this means that the information contained in the other components is not used when regenerating the data signal F(t).
  • a seventh embodiment will now be described with reference to Figure 25 in which some of these other components are used as well as the strongest component to regenerate the data signal. In particular, in this embodiment a "four-pronged rake receiver" is used which allows information contained in four components of the received signal D(t) to be used.
  • the encoder of the seventh embodiment is identical to that of the sixth embodiment and will not, therefore, be described again.
  • the decoder 281 of the seventh embodiment is shown in Figure 25 in which features which are identical to those of the decoder 181 of the sixth embodiment have been referenced with the same numerals and will not be described again.
  • the digital signal J(t) output by the ADC 183 is input to the acquisition unit 185 and to four time delay units 283a, 283b, 283c and 283d, each time delay unit being in a different prong of the rake receiver.
  • the processor 187 determines, from the outputs Ai, Bi, Ci and Di of the normalisation circuits 213 of the acquisition unit 185, the timings of the four strongest components of the digital signal J(t) relative to the reference timing 241. For example, for the output illustrated in Figure 24, the four timings are τ1, τ2, τ3 and τ4. Four control signals are then generated by the processor 187, each corresponding to a respective one of these four timings.
  • each control signal is input to a corresponding one of the four time delay units 283a, 283b, 283c and 283d so that each time delay unit outputs a signal for which a respective one of the four strongest components of the digital signal J(t) will be time synchronised with PN1 and PN2.
  • the output of each time delay unit 283a, 283b, 283c and 283d is then input to a corresponding one of four correlator units 195a, 195b, 195c and 195d, each of which is identical to the correlator unit 195 in the sixth embodiment.
  • the outputs of the four correlator units 195a, 195b, 195c and 195d should be in phase.
  • the outputs of the multipliers 199a in the late channels of the correlator units 195a to 195d are input to a first adder 285a where they are added together, and the outputs of the multipliers 201a in the late channels of the correlator units 195a to 195d are input to a second adder 285b where they are added together.
  • the outputs of the on-time channels in each correlator are input to third and fourth adders 285c and 285d and the outputs of the early channel in each correlator are input to fifth and sixth adders 285e and 285f respectively.
  • the output of each adder 285 then forms a respective one of the inputs to the three power comparators 203a, 203b and 203c and the processing proceeds as in the sixth embodiment.
  • a higher signal to noise ratio is achieved compared to using only the strongest component, thereby improving the ability of the decoder 181 to regenerate the original data signal F(t) .
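The combining performed by the adders 285 can be sketched as a sum of despread prongs, one per multipath delay. The despreading here is simplified to a direct multiplication by the code at baseband, whereas the decoder of Figure 25 works on the modulated carrier; the parameters are assumptions for illustration.

```python
import numpy as np

def rake_combine(digital_signal, delays, pn, samples_per_chip=4):
    """Four-pronged rake combining (sketch): align the received signal with the
    local code for each of the four strongest multipath components, despread
    each prong, and add the despread prongs together before the power
    comparison.  `delays` holds the timings found by the acquisition unit."""
    chips = np.repeat(pn, samples_per_chip)
    combined = np.zeros(len(chips))
    for d in delays:
        aligned = digital_signal[d:d + len(chips)]
        if len(aligned) < len(chips):          # not enough samples for this prong
            continue
        combined += aligned * chips            # one despread prong
    return combined
```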
  • the processor 187 controls a clock signal clk which is used to determine the timings at which the ADC 183 samples the electrical signal D(t).
  • the encoder of the eighth embodiment is identical to that of the sixth embodiment and will not, therefore, be described again.
  • the decoder 291 of the eighth embodiment is shown in Figure 26 and features which are identical to those of the decoder 181 of the sixth embodiment have been referenced with the same numerals and will not be described again.
  • the ADC 183 samples the electrical signal D(t) from the microphone 25 at a fixed rate of 22.05 kHz.
  • the digital signal J(t) output by the ADC 183 is then input to the acquisition unit 185 and to a re-sampling circuit 293.
  • the clock signal clk output by the clock 189 is at a frequency of approximately 44.10kHz, the exact frequency being determined by the chip rate of the chip sequence conveyed by the digital signal J(t).
  • clock pulses can be skipped or doubled during the tracking operation.
  • the pseudo-noise code generators 191 and 193 generate codes PN1 and PN2 respectively, at a rate of one chip every eight clock pulses of the clock signal clk.
  • the digital signal J(t) is stored in blocks of 8192 samples in the re-sampling circuit 293.
  • the processor 187 determines from the outputs of the cross-correlators 215 of the acquisition unit 185, the chip rate of the chip sequence in the digital signal J(t), and then outputs a signal S to the re-sampling circuit which indicates the rate at which the digital signal J(t) needs to be re-sampled. For example, if the determined chip rate in the digital signal is 5567.625 Hz, which corresponds to an increase of 1% over the nominal chip rate of 5512.5 Hz, then the re-sampling rate has to be 22.2705 kHz to allow for the additional chips present.
  • the re-sampled data is determined in the re-sampling circuit 293 from the 8192 stored samples using interpolation techniques to give, for the exemplary 1% increase in chip rate, 8274 samples.
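The re-sampling can be sketched as follows; linear interpolation is used here, though the description only says "interpolation techniques".

```python
import numpy as np

def resample_block(block, chip_rate_estimate, nominal_rate=5512.5):
    """Re-sampling circuit 293 (sketch): stretch or shrink a block of 8192 stored
    samples by linear interpolation so that it again contains the nominal number
    of samples per chip.  For a 1% high chip rate this yields 8274 samples."""
    ratio = chip_rate_estimate / nominal_rate
    n_out = int(round(len(block) * ratio))
    old_positions = np.arange(len(block))
    new_positions = np.linspace(0, len(block) - 1, n_out)
    return np.interp(new_positions, old_positions, block)

block = np.random.randn(8192)
resampled = resample_block(block, 5567.625)     # 1% fast chip rate -> 8274 samples
print(len(resampled))
```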
  • the clock rate is also adjusted to be eight times the chip rate of the chip sequence in the digital signal J(t) and the re-sampled data is then input to the correlator unit 195 at the clock rate where it is multiplied by PN1 and PN2 and the remaining processing proceeds as in the sixth embodiment.
  • the first and second pseudo-noise code generators 191 and 193 are controlled by the processor 187 to operate only when re-sampled data is input to the correlator unit 195. It will be appreciated by the skilled person that the clock signal clk is run at a faster rate than the sampling rate of the ADC 183 to give time for the re-sampling to take place and for any additional chips to be processed.
  • the cross-correlators calculate cross-correlation scores for timing offsets between the two frames which are varied in steps corresponding to a single clock cycle. This is not essential and the cross-correlation could, for example, be calculated for timing offsets of the frames which vary in steps corresponding to 2 or 4 clock cycles. Further, it will be appreciated from Figure 21 that the frequency offset could be determined with a sub-sample resolution by interpolating the position of the peak from the values of the points neighbouring the maximum cross-correlation score. Another way of obtaining a more accurate measure of the frequency offset is to accumulate the summed outputs of the cross-correlators over more than two frames .
  • the matched filters 211a-211d of the acquisition circuit 185 are arranged in series.
  • Figure 27 illustrates an alternative acquisition unit 301 in which the matched filters are arranged in parallel.
  • the digital signal J(t) is input into four parallel channels. In the first channel the digital signal J(t) is input directly into the first matched filter 211a. In the second channel the digital signal J(t) is input, via a time delay unit 303 which introduces a delay of 2048 clock cycles, into the second matched filter 211b.
  • in the third channel the digital signal J(t) is input, via a time delay unit 305 which introduces a delay of 4096 clock cycles, into the third matched filter 211c.
  • in the fourth channel the digital signal J(t) is input, via a time delay unit 307 which introduces a delay of 6144 clock cycles, into the fourth matched filter 211d.
  • the normalisation circuit 213 makes the acquisition circuit capable of analysing input digital signals over a wide range of input signal levels.
  • the amount of processing required to determine the frequency offset for multi-path signals is identical to that required for single- path signals.
  • if the frequency offset is known to lie within a given tolerance, the cross-correlation need only be carried out for time offsets between the output of the normalisation circuit for neighbouring frames which fall within this tolerance, thereby reducing the amount of processing performed.
  • the normalisation circuit is not essential to achieve some of the advantages discussed above.
  • An automatic gain control (AGC) circuit at the input to the decoder could be used instead, but the normalisation circuit is preferred because it gives a significant improvement in performance.
  • the cross-correlation technique, in which the score outputs of neighbouring frames are cross-correlated, could also be applied to the output of a single filter to determine the frequency offset, although using plural filters and adding the results together increases the signal magnitude.
  • Figure 28 illustrates an alternative acquisition circuit 309 which corresponds to the acquisition circuit 185 illustrated in Figure 17 with the cross-correlators 215, delay units 217 and adder 219 removed, so that the OFFSET signal is not generated and only the scores Ai, Bi, Ci and Di of the normalisation circuits 213 for each sample i are input to the processor 187.
  • the processor can then use these scores to keep a number of accumulated scores for different values of frequency offset.
  • the frequency offset can then be determined by comparing the accumulated scores and identifying which has the best score.
  • the processor 187 may perform a routine in which for each sample the following sums, which correspond to varying timing offsets, are calculated:
  • sum (1) corresponds to no timing offset
  • sums (2) and (3) correspond to a timing offset of ⁇ 3 clock cycles
  • sums (4) and (5) correspond to a timing offset of ⁇ 6 clock cycles
  • sums (6) and (7) correspond to a timing offset of ⁇ 9 clock cycles.
  • the sums (1) to (7) are compared to a threshold figure which is set such that, when one of the sums exceeds the threshold figure, the frequency offset can be estimated from which of the sums (1) to (7) exceeded it and the time synchronisation can be estimated from the timing of the sample i which caused the sum to exceed the threshold.
  • this synchronisation technique is very efficient in terms of the processing power required in comparison to conventional synchronisation techniques because the correlation only has to be performed once. It will also be appreciated by the skilled person that the form of the sums used above can be varied to obtain different timing offsets if desired.
  • the output of the matched filters 211 is not always optimal.
  • a complex filter can be used instead of each matched filter in order to remove the dependence on the phase of the incoming signal.
  • the complex filter could be formed by a parallel pair of matched filters each having 2048 taps and being matched to a quadrature pair of signals. The outputs of the matched filters would then be squared, summed and divided by the average output value over one data bit period, to generate a score output which can be compared with a threshold value to give an output which is independent of the input signal phase.
  • each of the matched filters has 256 taps matched to a respective quarter of a 256 chip sequence corresponding to a SYNC bit.
  • the described embodiments could easily be adapted to allow other modulation techniques to be used. If a technique is used where the receiver does not precisely know the phase and frequency of the received signal, for example standard binary phase shift keying, conventional circuitry such as a Costas loop can be used to extract estimators for these parameters from the received signal.
  • the data signal is first spread and then subsequently modulated. It will be appreciated by the person skilled in the art that the invention could equally be applied to systems in which the data signal is modulated and then subsequently spread. Similarly, in the decoder the received signal may be demodulated then despread or despread and then demodulated.
  • the data signal F(t) is spread using DSSS encoding.
  • the energy of a data signal can be spread over a wide range of frequencies by using techniques other than DSSS encoding.
  • an orthogonal frequency division modulation (OFDM) technique can be used in which, for example, 256 narrow-band orthogonal carriers 321 carry identical data. These 256 narrow-band carriers are evenly distributed in the frequency range of 1 to 11kHz and thus spreading of the energy of the data signal is achieved.
  • the original data signal can then be reconstructed by demodulating and recombining each of the narrow-band signals.
  • Another alternative to spreading the energy of a single narrow-band data signal over a desired broad range of frequencies is to generate a small number of narrow-band carriers at the centre of the desired frequency range and then to spread each of these narrow-band carriers over the entirety of the desired frequency range using orthogonal PN codes.
  • the power spectrum for such a scheme is illustrated in Figure 30 in which 16 narrowband carriers 323 are evenly spaced between 5512.5Hz and 6300Hz and each of the narrow-band carriers is spread using DSSS encoding with a chip rate of 5512.5Hz to form a broadband signal 325.
  • a number of narrow-band signals can be evenly spaced in the desired frequency range and each of these narrow-band signals can be individually spread to cover a sub-band of the desired frequency range.
  • Such a system is shown in Figure 31 in which eight narrowband signals 331, each transmitting a data signal at 5 bits per second, are spaced evenly throughout the desired range.
  • Each bit of each of the narrowband signals is multiplied by a corresponding PN code using a 256 chips per bit ratio, to form eight broadband signals 333 spread over a corresponding sub-band of the desired range.
  • the PN codes used to modulate the eight different signals form an orthogonal set so that each broadband signal 333 can be despread separately.
  • each broadband signal 333 can be adjusted as a whole for the energy of the corresponding segment of audio track which reduces errors caused by non-linear filtering across the entire desired range of frequencies.
  • using a longer pseudo-noise code sequence, for example a 10 bit code or a 12 bit code generated by the pseudo-noise code generator, has the effect of reducing the obtrusiveness of the broadband signal H(t) in the modified audio track.
  • the output of each multiplier 199, 201 is input to a bandpass filter followed by a power monitor.
  • the output of each multiplier 199, 201 could be input to a quadrature mixer and the output of the quadrature mixer then input to a low-pass filter with a time constant of the order of the duration of a data bit of the data signal F(t).
  • the output of the low-pass filter can then be used as an input signal to the comparator 271. This provides a computationally efficient implementation.
  • after every clock cycle the processor decides, based on the signal TRACK, which of the early, on-time and late channels produces the best signal.
  • the processor could alternatively make this decision over longer intervals, for example intervals corresponding to one chip, one data bit or one repetition of the chip sequences PN1 and PN2.
  • multiple paths between the loudspeaker 19 and microphone 25 cause multiple peaks to appear in each frame of the output of the normalisation circuit, as shown in Figure 24.
  • multiple peaks can also be formed by deliberately combining two time offset broadband signals with the audio track. For example, if the audio track conveys a stereo signal then two identical broadband signals with a time offset of, for example, 150ms can be generated and each broadband signal can be added to a respective one of the two channels. This has the advantage of adding an additional level of time diversity which enables a more robust regeneration of the data signal.
  • two different broadband signals could be generated with each one being added to a respective channel of the audio track.
  • each of the signal components is input to a separate correlator unit 195 and the corresponding outputs of each correlator unit 195 are added together.
  • a different rake receiver arrangement could be used in which the signals output by the four time delay units 283 can be added together and input to a single correlator unit where despreading takes place.
  • each of the adders 285 weights the output of the corresponding correlator in accordance with the strength of the signal component processed by that correlator. In this way the strongest signal component, which should provide the most accurate data, is given more weight than the weaker components (a sketch of this weighted combination is given after this list).
  • the peak scores for each of the components calculated by the acquisition unit can be used to generate these weighting factors.
  • rake receivers having two or six prongs could be used.
  • psycho-acoustic encoding is performed in which a psycho-acoustic algorithm is used to determine minimum audible levels for a number of frequency sub-bands of a segment of the audio track based on the energy in each frequency sub-band of that segment and preceding segments, and this information is used to obtain scaling factors for the frequency sub-bands.
  • a simpler algorithm can be applied, although this will result in a more noticeable broadband signal. For example the energy in the preceding segments could be ignored.
  • the scaling factors could be calculated so that the power ratio in each frequency band between the audio track and the broadband signal H(t) is held constant.
  • the frequency analysis could be removed so that the algorithm calculates scaling factors based on the entire energy in preceding segments.
  • the decoder 105 includes a power monitor 107 for detecting the power in the electrical signal D(t) and a de-emphasis circuit for scaling the electrical signal in response to the detected power to reverse the scaling carried out in the encoder 95.
  • This conforms with the conventional communications principle that, for any shaping performed in the transmitter stage, a corresponding inverse reshaping is performed in the receiver stage.
  • the power monitor 107 and de-emphasis circuit 109 are not essential and the digital signal output by the ADC 67 can be input directly to the demodulator 69.
  • a simpler decoder, such as the decoder 89 of the first embodiment, could be used.
  • the decoders will generally greatly outnumber the encoders and it is therefore desirable to keep the cost of each decoder as low as possible.
  • the psycho-acoustic analysis unit 133, scaling unit 135 and time delay unit 137 in the decoder 131 of the fourth embodiment and the psycho-acoustic analysis unit 155, scaling unit 157 and time delay unit 159 in the decoder 151 of the fifth embodiment are not essential. By removing the need for the decoder to carry out psycho-acoustic analysis, the amount of processing required to be done by the decoder is significantly reduced, which substantially reduces the cost of each decoder.
  • the psycho-acoustic analysis unit 155 of the decoder 151 determines scale factors for each frequency sub-band of a segment of the electrical signal D(t) which are inverse to those used in the encoder. It has been found that if the encoder 141 splits the broadband signal H(t) into segments corresponding to an integer number of the original data bits in the data signal F(t), then the original data signal can be regenerated with a high performance level without carrying out psycho-acoustic analysis and de-emphasis in the decoder.
  • the psycho-acoustic analysis unit calculates theoretical minimum audible levels for ten frequency sub-bands.
  • a larger number of frequency sub-bands can be used, for example 2048.
  • increasing the number of frequency sub-bands will increase the processing load to be performed by the psycho-acoustic analysis unit.
  • in the described embodiments the data signal is a binary signal, but the data signal could be any narrowband signal, for example a modulated signal in which frequency shift keying has been used to represent a "1" data bit by a first frequency and a "0" data bit by a second, different frequency.
  • an interactive toy 23 decodes a data signal encoded in an audio signal output by a television set 15 and, in response to the data signal, outputs a sound stored in a memory in the interactive toy 23 which can be heard by a user.
  • the data signal could convey information enabling a speech synthesiser located in the interactive toy 23 to produce a desired sound, for example a word or phrase.
  • the interactive toy 23 could display information on a screen or part of the interactive toy 23 could move in response to the encoded data signal.
  • the television signal need not be broadcast using a transmitter 5 but could be sent to the television set 15 along a cable network. It will also be appreciated that the same techniques could be applied to a radio signal, whether broadcast using a transmitter or sent along a cable network. Further these techniques can be applied to a point-to-point communication system as well as broadcast systems. Further, conventional encryption techniques could be used so that the television or radio signal could only be reproduced after processing by decryption circuitry.
  • the television signal could be stored on a video cassette, a digital video disk (DVD) or the like.
  • the television signal is stored on a recording medium which can subsequently be replayed to a user on the user's television set 15.
  • a user can buy a video cassette to be played on a television set 15 using a video player together with a detachable memory for the interactive toy 23 which stores instructions for the interactive toy 23 which are related to the television signal stored on the video cassette.
  • data could be encoded in a purely audio signal stored on a recording medium such as an audio cassette, a compact disc (CD) or the like.
  • the sampling rate of 22.05kHz used in the decoders of the first to sixth embodiments matches that used for compact discs and therefore the encoders and decoders described for these embodiments are suitable for use in systems where a data signal is conveyed by an audio track recorded on a compact disc.
  • All the above-described embodiments are simplex communication systems in which a data signal is transmitted from a transmitter to a receiver without a signal being returned from the receiver to the transmitter.
  • the present invention can equally be applied to a duplex communication system in which data signals are transmitted in both directions between two audio transmitter-and-receiver circuits (which will hereinafter be referred to as audio transceivers).
  • An example of such a duplex communication system is illustrated in Figure 33, in which data signals are transmitted in both directions between an interactive toy 221 and a multimedia computer 223 while the interactive toy outputs background music (which reduces the audibility of the audio signals transmitted between the interactive toy 221 and the computer 223).
  • the multimedia computer 223 has a display 225, keyboard 227, and an audio transceiver unit 229.
  • a computer memory 231 storing a computer program containing instructions can be inserted into the computer 223 via a slot 233.
  • the audio transceiver unit 229 contains an encoder circuit for spread spectrum encoding and modulating a data signal to form a broadband signal at audio frequencies, a loudspeaker for outputting the broadband signal as an audio signal 243, a microphone for detecting an audio signal transmitted by the interactive toy 221 and a decoder circuit for extracting a data signal encoded in the audio signal detected by the microphone.
  • the interactive toy 221 also has an audio transceiver unit 235 having an encoder circuit, a loudspeaker, a microphone and a decoder circuit identical to those in the audio transceiver unit 229.
  • the toy 221 also has a user input device 237 having four buttons 239 which can be independently lit up.
  • a user 241 inserts the computer memory 231 into the slot 233 of the computer 223 and runs the stored computer program, which causes the computer 223 to generate a data signal indicating a sequence in which the buttons 239 of the user input device 237 of the interactive toy 221 are to be lit up.
  • This data signal is sent to the audio transceiver unit 229 where it is encoded and output as an audio signal 243.
  • the audio signal 243 output by the audio transceiver unit 229 of the computer 223 is detected and decoded by the audio transceiver unit 235 of the interactive toy 221.
  • the buttons 239 of the user input device 237 are then lit up in the order indicated by the data signal encoded in the audio signal 243 transmitted from the computer 223 to the interactive toy 221.
  • once the buttons 239 have been lit up in the indicated order, the user 241 attempts to press the buttons 239 in the same order as they were lit up.
  • An audio signal 243 is then output by the audio transceiver unit 235 of the interactive toy 221 having encoded therein details of the order in which the user 241 pressed the buttons 239.
  • This audio signal 243 is detected and decoded by the audio transceiver unit 229 of the computer 223 and the resulting data signal is processed by the computer 223.
  • the computer 223 is able to keep a record of the success rate of the user 241 and obtain statistical data as to how the user 241 improves over time and whether there are any particular sequences of buttons which the user finds difficult to reproduce.
  • Figure 34 shows in more detail the audio transceiver unit 235 in the interactive toy 221.
  • an audio signal received by the microphone 251 is converted into an electrical signal which is input to a filter 255 to remove unwanted frequencies.
  • the filtered signal is then input to a demodulator 257 and the output of the demodulator 257 is input to an analogue-to-digital converter (ADC) 259 which converts the demodulated signal into a digital signal.
  • the digital signal output by the ADC 259 is then input to a correlator 261 which despreads the digital signal.
  • the correlator 261 is a digital matched filter, as in the first to fourth embodiments, whose parameters are set by the output of a pseudo-noise code generator 263.
  • the regenerated data signal output by the correlator 261 is input to a processor 265.
  • a memory 267 is connected to the processor 265 for storing process instructions and to provide working memory.
  • the processor 265 determines from the regenerated data signal the sequence in which the buttons 239 are to be lit up and outputs a signal to the user input device 237 to light up the buttons 239 accordingly.
  • the user input device 237 sends details of the buttons 239 pressed to the processor 265 which generates a data signal conveying this information.
  • This data signal is then input to a multiplier 269 where it is multiplied by the pseudo-noise code output by the pseudo-noise code generator 263 in order to spread the power of the data signal over a broad range of frequencies.
  • the spread signal output by the multiplier 269 is then input to a modulator 271 which centres the power of the main band of the spread signal at 5512.5 Hz.
  • the modulated signal output by the modulator 271 is then input to the audio mixer 273 where it is combined in a simple adding operation with background music output by an audio source 275.
  • the combined signal output by the audio mixer 273 is then converted into an audio signal by the loudspeaker 253.
  • the audio transceiver unit 229 connected to the computer 223 includes circuitry identical to that contained within the dashed block in Figure 34 and a more detailed description of this transceiver unit 229 will therefore be omitted.
  • the background music could be generated by the audio transceiver unit 229 of the computer 223 with the encoded version of the data signal produced by the computer 223 being combined with the background music.
  • preferably, the background music is output by the interactive toy 221, as the interactive toy 221 will generally be nearer to the user 241 in use and therefore the power of the background music signal can be reduced.
  • a conventional CD 205 has recorded thereon an audio track with a data signal encoded in the audio track.
  • the audio track is output by a pair of speakers 207 of a music centre 209 in the form of audio signal 211 which is received by a microphone 213 which forms part of a lighting system 215.
  • Incorporated in the lighting system 215 is a decoder like those described above for decoding the data signal encoded in the audio signal 211.
  • This data signal conveys instructions to determine which of the lights 217a, 217b and/or 217c of the lighting system 215 are turned on at any one time. Therefore the lights 217 of the lighting system 215 can be made to react to the music in accordance with data programmed on the compact disc 205.
  • the audio communication techniques could also be used to distribute information to intelligent home appliances. For example, if a television programme about food is on the television or radio, recipes discussed could be encoded in the audio track and detected by a microphone of a computer which stores the recipe for future access by a user. Alternatively, news headlines or sports results could be encoded into the audio track of a television or radio signal to be detected by the computer and displayed on a screen either automatically or on command of a user.
  • the audio communication system could also provide a distribution channel for paging information.
  • the term audio track refers to information which is intended to be reproduced as an audio signal by a loudspeaker in the audible range of frequencies, which typically spans from 20 Hz to 20,000 Hz.
  • An audio signal is formed by pressure waves in the air that can be detected by an electro-acoustic transducer. Such pressure waves can alternatively be referred to as acoustic waves and audio signals can alternatively be referred to as acoustic signals.
  • the audio track can be conveyed to the appliance which reproduces it using any of a number of techniques, for example via a wireless broadcast, a cable network or a recording medium. It is envisaged that the transmission over the internet of audio tracks encoded with data using the audio communication techniques described hereinbefore will have many applications.
  • a particular advantage of encoding data onto an audio track is that the bandwidth required to transmit the audio track together with the data signal is no more than that required to transmit the audio track alone.
  • the audio encoding techniques described hereinbefore could therefore be used for applications where the appliance which converts the audio track into an audio signal also decodes the data signal embedded in the audio track and reacts in some manner to the decoded data signal.
  • subtitles could be encoded in the audio track of a television signal, and a television set having a suitable decoder can decode the subtitles and display them on the screen. There is no need to remove the data signal from the audio track before converting the audio track into an audio signal because the data signal is not noticeable to a listener of the audio signal.
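For illustration only, the re-sampling step referred to in the list above (the exemplary 1% chip-rate increase) can be sketched as follows. The 8192-sample block size and the nominal 5512.5 Hz chip rate are taken from the description; the use of linear interpolation and all function and variable names are assumptions made purely for this sketch.

```python
import numpy as np

NOMINAL_CHIP_RATE_HZ = 5512.5   # nominal chip rate from the description
BLOCK_SIZE = 8192               # samples stored per block in the re-sampling circuit

def resample_block(block, measured_chip_rate_hz):
    """Re-sample one block of ADC output so that the chip rate seen by the
    correlator matches the nominal chip rate (linear interpolation sketch)."""
    ratio = measured_chip_rate_hz / NOMINAL_CHIP_RATE_HZ   # e.g. 1.01 for a 1% fast chip rate
    out_len = int(round(len(block) * ratio))                # 8192 -> 8274 for the 1% example
    positions = np.linspace(0.0, len(block) - 1, out_len)   # where the new samples are taken
    return np.interp(positions, np.arange(len(block)), block)

block = np.random.default_rng(0).standard_normal(BLOCK_SIZE)
resampled = resample_block(block, 5567.625)                 # the 1% example gives 8274 samples
```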
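Similarly, the weighted rake combination mentioned in the list above can be sketched as follows. The four-path structure follows the description of the four-pronged receiver, but the weighting rule (normalised acquisition peak scores) and all names are illustrative assumptions rather than the exact arrangement of the adders 285.

```python
import numpy as np

def rake_combine(correlator_outputs, peak_scores):
    """Weight each path's correlator output by its acquisition peak score and
    sum across paths, so that stronger multi-path components carry more weight."""
    weights = np.asarray(peak_scores, dtype=float)
    weights = weights / weights.sum()
    combined = np.zeros_like(np.asarray(correlator_outputs[0], dtype=float))
    for output, weight in zip(correlator_outputs, weights):
        combined += weight * np.asarray(output, dtype=float)
    return combined

# four paths, as in the described four-pronged receiver; the strongest path dominates
outputs = [np.array([+0.9, -1.1, +1.0]), np.array([+0.4, -0.5, +0.6]),
           np.array([+0.2, -0.1, +0.3]), np.array([+0.1, -0.2, +0.1])]
data_bits = rake_combine(outputs, peak_scores=[10.0, 5.0, 2.0, 1.0]) > 0
```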

Abstract

A toy system is described in which a data signal is encoded using spread spectrum technology to form a spread signal which is combined with an audio track to form a modified audio track. The modified audio track is then transmitted as an acoustic signal to a toy having an electro-acoustic transducer for converting the acoustic signal into an electrical signal which is then despread in order to regenerate the data signal, and the toy responds to the data signal. By using spread spectrum technology to spread the data signal over a wide frequency range, the spread signal in the modified audio track can be made virtually inaudible to a listener. The techniques described have relevance to acoustic communication systems other than toy systems.

Description

ACOUSTIC COMMUNICATION SYSTEM
This invention relates to an acoustic communication system in which data is transmitted in an audio signal output by a loudspeaker. The invention has particular, but not exclusive, relevance to an electronic toy which detects an audio signal with an electroacoustic transducer and reacts in accordance with data conveyed by the audio signal.
Articles which react to sound are known. For example, room lights are available which turn on or off in response to the sound of hands clapping. However, the complexity of these articles has generally been limited to a single form of reaction in response to a specific sound.
It is an object of the present invention to generate electronic technology which can be incorporated in articles to enable the articles to respond to sound in many different ways depending on the content of the sound.
According to an aspect of the invention, there is provided a toy system in which a data signal is encoded using spread spectrum technology to form a spread signal and the spread signal is transmitted to a toy as an acoustic signal. The toy has an electroacoustic transducer for converting the acoustic signal into an electrical signal which is then despread in order to regenerate the data signal, and the toy responds to the data signal. By using spread spectrum technology to spread the data signal over a wide frequency range, the spread signal in the acoustic signal can be made virtually inaudible to a listener.
Preferably, the spread signal is combined with an audio track to form a modified audio track. The modified audio track is then transmitted as the acoustic signal to the toy. By combining the spread signal with an audio track, the spread signal can be made less noticeable to a listener.
Exemplary embodiments of the invention will now be described with reference to the accompanying drawings, in which:
Figure 1 schematically shows a signalling system for communicating data between a television studio and an interactive toy located in a house via the audio track of a television signal; Figure 2 schematically shows an encoder system for mixing data from a data source with the audio track of a television signal for use in the signalling system described with reference to Figure 1;
Figure 3 is a plot comparing the power spectrum of a typical audio track of a television signal with that of a modulated data signal with and without spread spectrum encoding;
Figure 4 shows in more detail decoding circuitry in the toy of the signalling system described with reference to Figure 1; Figure 5 schematically shows a first alternative encoder to the encoder illustrated in Figure 2;
Figure 6 schematically shows a first alternative decoder to the decoder illustrated in Figure 4; Figure 7 schematically shows a second alternative encoder to the encoder illustrated in Figure 2;
Figure 8 schematically shows a second alternative decoder to the decoder illustrated in Figure 4;
Figure 9 schematically shows a third alternative encoder to the encoder illustrated in Figure 2;
Figure 10 is a plot of a power spectrum of the sensitivity of a human ear with and without the presence of a narrowband tone;
Figure 11 schematically shows a third alternative decoder to the decoder illustrated in Figure 4;
Figure 12 schematically shows a fourth alternative encoder to the encoder illustrated in Figure 2;
Figure 13 schematically shows a fourth alternative decoder to the decoder illustrated in Figure 4; Figure 14 schematically shows a fifth alternative encoder to the encoder illustrated in Figure 2;
Figure 15a is a timing diagram illustrating a pseudo-random noise code sequence;
Figure 15b is a timing diagram illustrating a carrier signal which has been phase modulated by the pseudo-noise code sequence illustrated in Figure 15a;
Figure 15c is a timing diagram illustrating a sampled signal obtained by sampling the modulated signal illustrated in Figure 15b; Figure 16 schematically shows a fifth alternative decoder to the decoder illustrated in Figure 4; Figure 17 shows in more detail an acquisition unit of the decoder illustrated in Figure 16;
Figure 18 shows in more detail a normalisation circuit of the acquisition unit illustrated in Figure 17; Figure 19 shows in more detail an averaging circuit of the normalisation circuit illustrated in Figure 18;
Figure 20 is a plot of the output of the normalisation circuit of the acquisition unit illustrated in Figure 17 in the presence of a single-path signal; Figure 21 is a plot of the output of a cross- correlator of the acquisition unit shown in Figure 17;
Figure 22 shows in more detail the components of a processor used in the fifth alternative decoder;
Figure 23 shows in more detail a power comparator of the fifth alternative decoder;
Figure 24 is a plot of the output of the normalisation circuit of the acquisition unit illustrated in Figure 17 in the presence of a multi-path signal;
Figure 25 schematically shows a sixth alternative decoder to the decoder illustrated in Figure 4;
Figure 26 schematically shows a seventh alternative decoder to the decoder illustrated in Figure 4;
Figure 27 illustrates a first alternative acquisition unit to the acquisition unit illustrated in Figure 17;
Figure 28 illustrates a second alternative acquisition unit to the acquisition unit illustrated in Figure 17;
Figure 29 is a plot of the power spectrum of a first alternative data signal to be mixed with an audio track of a television signal; Figure 30 is a plot of the power spectrum of a second alternative data signal to be mixed with the audio track of a television signal;
Figure 31 is a plot of the power spectrum of a third alternative data signal to be mixed with the audio track of a television signal;
Figure 32 schematically shows a signalling system for communicating data stored in the audio track of a compact disk to a set of lights; Figure 33 schematically shows a duplex signalling system for communicating information between a computer and an interactive toy; and
Figure 34 schematically shows the circuitry used in the interactive toy of the signalling system described with reference to Figure 33.
Several embodiments of the invention will now be described with reference to the communication system shown in Figure 1. Figure 1 shows a television studio 1 in which a video track and an audio track are generated in a conventional manner. However, an encoder (not shown) at the television studio 1 combines a data signal with the conventional audio track to form a modified audio track. A television signal including the video track and the modified audio track is then sent along a cable 3 to a transmitter 5 which generates a broadcast signal 7 conveying the television signal.
The broadcast signal 7 is detected by an aerial 9 on a house 11 and fed, via a cable 13, to a television set 15 located within the house 11. When the television set 15 is switched on, the video track is displayed as images on a screen 17 and the modified audio signal is output through a loudspeaker 19 as an audio signal 21. It will be appreciated that although only one aerial 9 and one television set 15 have been shown for illustrative purposes, the broadcast signal 7 can in fact be detected by many different aerials provided they are within the range of the transmitter 5.
An interactive toy 23 in the house 11 has a microphone 25 to detect the audio signal 21 output by the loudspeaker 19 and to convert it into a corresponding electrical signal. The microphone 25 is connected to a decoder (not shown) which processes the electrical signal from the microphone 25 to retrieve the data signal encoded in the audio signal 21. In this embodiment the data encoded within the audio track is related to the television programme being broadcast and causes the toy 23 to appear to interact with the television programme by causing it to output sounds relating to the television programme which can be heard by a viewer 27. For example, for a television programme which aims to improve the ability of a child to speak by encouraging the child to repeat phrases, the data signal in the modified audio track can be used to make the toy 23 also repeat the phrases in order to encourage the child to do the same.
As those skilled in the art will appreciate, an advantageous feature of the above-described communication system is that communication between the television studio 1 and the interactive toy 23 can be achieved using a conventional television set 15 and therefore the consumer does not need to buy a new television set or an additional "set top box".
A more detailed description of a first embodiment of the above communication system will now be given with reference to Figures 2 to 4. In particular, Figure 2 shows the audio encoder 31 which combines the audio track, AUDIO, of the television programme (which is supplied from an audio source 33) with the data signal F(t) (which is supplied from a data source 35) to be transmitted to the interactive toy 23. In this embodiment the data signal F(t) is a binary signal having a bit rate of 21.5 bits per second in which each binary 1 is represented as +1 and each binary 0 is represented as -1, e.g. +1 volt and -1 volt respectively.
An advantageous feature of the first embodiment is that a spread spectrum encoding technique is used to spread the energy of the data signal F(t) over a wide range of frequencies. This has the effect of making the data signal less noticeable in the audio signal 21 heard by the viewer 27. In particular, if the data signal F(t) is directly combined with the audio track without such coding, then it is more likely to be heard by the viewer 27 of the television programme.
In this embodiment, direct sequence spread spectrum (DSSS) encoding is utilised to spread the energy of the data signal over a wide band of frequencies. In order to perform the DSSS encoding a pseudo-noise code generator 37 is used to generate a pseudo-noise (PN) code. As those skilled in the art of telecommunications will appreciate, PN codes are binary codes which appear to be completely random in nature, but which are in fact deterministic, i.e. they can be reproduced. In particular, these codes are generated by exclusive-OR feedback from synchronously clocked registers. By continually clocking the registers, the PN code is cyclically reproduced. The number of registers, the registers used in the feedback path and the initialisation state of the registers determines the length of the code and the specific code produced.
In this embodiment, the pseudo-noise code generator 37 has eight registers and generates a PN code having 255 bits (which will hereinafter be referred to as chips using the standard nomenclature in the art to distinguish the bits of the PN code from the bits of the data signal to be spread) in a stream with no sequence of more than 8 chips repeated in the 255 chips. Such a PN code is conventionally referred to as an 8 bit code after the number of registers used to generate it. At the end of each stream of 255 chips a binary 0 is added to make the total length of the stream 256 chips. In this embodiment the PN code is generated at a rate of 5512.5 chips per second and each binary 1 is represented as +1 and each binary 0 is represented as -1, e.g. +1 volt and -1 volt.
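Purely as an illustrative sketch of the pseudo-noise code generation just described, the following shows an 8-register shift register with exclusive-OR feedback. The tap positions and initial state are not specified in the text and are assumptions made here; assuming they form a maximal-length configuration, the register produces a 255-chip sequence, to which a 0 is appended and the chips are mapped to ±1 as described.

```python
def lfsr_pn_sequence(taps=(8, 6, 5, 4), seed=0b00000001, n_registers=8, length=255):
    """Generate a PN chip sequence from an n-register shift register with
    exclusive-OR feedback (Fibonacci LFSR).  The tap set and seed here are
    illustrative assumptions; a maximal-length tap set repeats only after
    2**n_registers - 1 chips (255 chips for 8 registers)."""
    state = seed
    chips = []
    for _ in range(length):
        chips.append(state & 1)                       # output chip from the last register
        feedback = 0
        for tap in taps:                              # exclusive-OR of the tapped registers
            feedback ^= (state >> (n_registers - tap)) & 1
        state = (state >> 1) | (feedback << (n_registers - 1))
    chips.append(0)                                   # pad to 256 chips, as described
    return [1 if c else -1 for c in chips]            # binary 1 -> +1, binary 0 -> -1

pn_chips_example = lfsr_pn_sequence()                 # 256 chips of +1 / -1
```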
The data signal F(t) and the PN code signal are input to a multiplier 39 where they are multiplied together.
Thus, each bit of the data signal F(t) is multiplied by a pseudo-random sequence of 256 chips which has the effect of spreading the energy of the data signal F(t) over a broad range of frequencies. The spread signal S(t) output from the multiplier 39 is then input to a modulator 41 which performs a conventional modulation technique, namely continuous phase frequency shift keying (CPFSK), to centre the frequency of the spread signal S(t) at 5512.5 Hz to form a broadband signal H(t).
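As a minimal sketch of the spreading operation performed by the multiplier 39 (the CPFSK modulation step of the modulator 41 is omitted for brevity), the following multiplies each ±1 data bit by the full 256-chip PN sequence. The placeholder PN sequence and all names are illustrative only.

```python
import numpy as np

def spread(data_bits, pn):
    """DSSS spreading: multiply each +/-1 data bit by the 256-chip PN sequence,
    so each data bit becomes 256 chips (the chip rate is 256 times the bit rate)."""
    pn = np.asarray(pn, dtype=float)
    return np.concatenate([bit * pn for bit in data_bits])

# placeholder +/-1 chip sequence; in practice the pseudo-noise code generator 37 supplies it
pn_chips = np.random.default_rng(0).choice([-1.0, 1.0], size=256)
s_t = spread([+1, -1, +1], pn_chips)   # three data bits -> 768 chips
```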
The broadband signal H(t) and the audio track from the audio source 33 are both input to an audio mixer 43 where they are combined in a simple adding operation. The output of the audio mixer 43 then forms the modified audio track to be transmitted, along with the corresponding video track, as the broadcast signal 7.
The effect of the spread spectrum encoding is illustrated in Figure 3 which shows a typical audio signal 51 that is in the frequency range of 0 to 18 kHz with, as is normally the case, the power being predominantly concentrated at the lower frequencies. Beyond 15 kHz the sensitivity of the human ear deteriorates and the majority of people cannot hear frequencies above 20 kHz. Figure 3 also shows the modulated data signal 53 which would result if no spreading was carried out and the data signal F(t) was directly modulated by the modulator 41. As shown, this modulated data signal 53 is a narrowband signal centred at approximately 5.5 kHz and having a peak power significantly above the power level of the audio signal 51 at that frequency. However, if spreading is performed as well as modulating, the spread signal 55 is obtained which has a power spectrum with a main band spread between 0 and 11 kHz and harmonic bands at higher frequencies. As the power of the spread signal 55 is spread over a wider range of frequencies the peak power level is significantly reduced. For many applications the spread signal 55 is not noticeable to the listener 27 or is heard only as background white noise. Further, the majority of the energy of the main band is in a frequency range for which most conventional television loudspeakers work satisfactorily. Thus, there is no requirement for a user to obtain a new television set to take advantage of the invention.
Figure 4 shows the circuitry which is provided, in this embodiment, in the toy 23. As shown, the toy 23 includes the microphone 25 which picks up the audio signal 21 emitted by the loudspeaker 19 of the television set 15 and converts it into an electrical signal D(t). This electrical signal D(t) is then input to a decoder 63 in which it first passes through a filter 65 to remove high frequency components including the higher harmonic bands of the broadband signal H(t). The filtered signal is then input to an analogue to digital convertor (ADC) 67 where it is converted into a digital signal. In this embodiment, the ADC 67 has a sampling rate of 22.05 kHz, which is twice the highest frequency of the main energy band of the broadband signal H(t). It will be appreciated by a person skilled in the art that this is the minimum sampling frequency enabling full use to be made of the main energy band of the broadband signal H(t) without aliasing. The digital signal is then input to a demodulator 69 to demodulate the CPFSK modulated signal. The demodulated signal B(t) output by the demodulator 69 is then input to a correlator 71 which correlates the demodulated signal B(t) with the same binary PN code used to spread the spectrum of the data signal F(t) in the encoder 31.
In this embodiment the correlator 71 includes a digital matched filter which is matched to the PN code used to spread the spectrum of the data signal in the encoder 31. A pseudo-noise code generator 73 which generates this PN code is connected to the correlator 71 and generates a pseudo-noise code which is used to set the parameters of the digital matched filter. As the pseudo-noise binary code appears to be random, the digital matched filter will output relatively sharp positive and negative peaks when there is a match between the pseudo-noise code and the demodulated signal. In particular, a positive peak is generated when the received signal matches the pseudo-noise code and a negative peak is generated when the received signal matches the inverse of the pseudo-noise code. In this embodiment the correlator 71 also includes circuitry to convert the peaks emitted by the digital matched filter into a binary signal F'(t) which represents a regenerated version of the original data signal F(t).
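A simplified, time-aligned sketch of the despreading performed by the correlator 71 is given below: correlate the demodulated signal with the PN code and take the sign of the peak at each bit boundary. Real operation must also locate the peaks (time synchronisation), which is not shown here, and the names are illustrative.

```python
import numpy as np

def matched_filter_despread(received, pn):
    """Correlate the demodulated signal with the PN code; a positive peak at a
    bit boundary indicates a +1 data bit and a negative peak a -1 data bit.
    Assumes the received signal is already time-aligned with the PN code."""
    pn = np.asarray(pn, dtype=float)
    corr = np.correlate(np.asarray(received, dtype=float), pn, mode="valid")
    peaks = corr[::len(pn)]                     # correlator output at each bit boundary
    return [+1 if p > 0 else -1 for p in peaks]

# recovers [+1, -1, +1] from the spread signal of the previous sketch:
# bits = matched_filter_despread(s_t, pn_chips)
```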
The regenerated data signal F'(t) is then input to a processor 75 which, in accordance with a program stored in a memory 77, identifies from the regenerated data signal F'(t) a sound file stored in the memory 77 which is to be played to the viewer 27 via a loudspeaker 79 provided on the toy 23.
Therefore, the toy 23 can be made to emit sounds in response to the television programme being shown on the television set 15. In this embodiment the memory 77 is a detachable memory so that a different memory 77 can be placed in the toy 23 for each television programme. In this way, the sound files output by the toy 23 can be updated.
In the first embodiment, the energy of the main band of the broadband signal H(t) is predominantly positioned in the frequency range 2kHz to 9kHz. However, the sensitivity of the human ear in this frequency range increases with frequency and for a typical audio signal, in which the energy is concentrated at low frequencies, the higher frequencies of the main band of the broadband signal H(t) can become noticeable. A second embodiment of the invention will now be described with reference to Figures 5 and 6 in which the frequency spectrum of the broadband signal H(t) is adjusted, prior to combining with the audio track, to concentrate the energy of the broadband signal H(t) more at low frequencies so that the broadband signal H(t) is less noticeable to the human ear. The components of Figures 5 and 6 which are identical to those of the first embodiment have been referenced with the same numerals and will not be described again. As shown in Figure 5, in the encoder 83 of the second embodiment, the broadband signal H(t) output from the modulator 41 is passed through a pre-emphasis circuit 85 before being mixed with the audio track in the audio mixer 43. The pre-emphasis circuit 85 applies a shaping algorithm which multiplies the broadband signal H(t) by a frequency-dependent scaling factor. In this embodiment the pre-emphasis circuit 85 is formed by an appropriate digital filter. The output H1(t) of the pre-emphasis circuit 85 is then input to the audio mixer 43 where it is combined with the audio track as before.
Pre-emphasis has been conventionally applied to radio frequency spread spectrum communication systems to amplify the higher frequencies of a signal because at higher frequencies noise becomes more of a problem. In this embodiment, however, the shaping algorithm applied in the pre-emphasis circuit 85 reduces the amplitude of the broadband signal H(t) at higher frequencies to make the broadband signal H(t) less noticeable to the listener 27. The shaping filter used in this embodiment tapers the amplitude of the broadband signal H(t) by a function whose variation with frequency f is between a 1/f and a 1/f² function. In particular, the frequency variation is approximately inverse to the sensitivity of the human ear.
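A crude frequency-domain sketch of such a fixed shaping is given below, assuming a taper proportional to 1/f^1.5 above a reference frequency (i.e. between the 1/f and 1/f² limits mentioned above). The exact filter approximating the inverse of the ear's sensitivity is not specified in the text, so the exponent and reference frequency used here are arbitrary placeholders.

```python
import numpy as np

def pre_emphasise(h_t, sample_rate_hz=22050.0, exponent=1.5, f_ref_hz=1000.0):
    """Attenuate the higher frequencies of the broadband signal by roughly
    1/f**exponent above f_ref_hz (illustrative fixed shaping only)."""
    h_t = np.asarray(h_t, dtype=float)
    spectrum = np.fft.rfft(h_t)
    freqs = np.fft.rfftfreq(len(h_t), d=1.0 / sample_rate_hz)
    gain = np.ones_like(freqs)
    above = freqs > f_ref_hz
    gain[above] = (f_ref_hz / freqs[above]) ** exponent   # unity below f_ref, tapering above
    return np.fft.irfft(spectrum * gain, n=len(h_t))
```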
Figure 6 shows the decoder 89 (which is located in the toy 23) which is used in this embodiment. As shown, the digital signal output by the ADC 67 is input to a de-emphasis circuit 91 which applies an inverse shaping algorithm to the shaping algorithm used in the pre-emphasis circuit 85 of the encoder 83. In this embodiment the de-emphasis circuit is formed by an appropriate digital filter. The output of the de-emphasis circuit 91 is then input to the demodulator 69 and processed as before.
As described above, in the second embodiment the broadband signal H(t) is shaped so that the energy is concentrated more at the lower frequencies. While this may increase errors in the regeneration of the original data signal, the broadband signal is less noticeable to the listener 27 when combined with the audio track and output as audio signal 21. Standard error detection or error detection and correction techniques can be used to help ensure that the original data signal is recovered even if the data signal regenerated by the correlator 71 includes occasional errors.
A third embodiment of the invention will now be described with reference to Figures 7 and 8. In the first and second embodiments the broadband signal H(t) combined with the audio track in the audio mixer 43 was generated independently from the audio track. In this embodiment, a dynamic shaping algorithm is used which adjusts the energy of the broadband signal H(t) in dependence upon the energy of the audio track. In Figures 7 and 8 features which are identical to those of the first embodiment have been referenced with the same numerals and will not be described again. As shown in Figure 7, in the encoder 95 of the third embodiment the audio track generated by the audio source 33 is input to a power monitor 97. The power monitor 97 continually monitors the power of the audio signal and outputs a signal indicative of this power to a pre- emphasis circuit 99.
The broadband signal H(t) output by the modulator 41 is also input to the pre-emphasis circuit 99 where it is multiplied by a time-varying scaling factor which is determined using the signal from the power monitor 97. In this embodiment, the value of the scaling factor at any time is calculated so that the power of the broadband signal H(t) is a fixed amount below the power of the portion of the audio track with which it will be combined, unless this results in the power of the broadband signal H(t) falling below a threshold level in which case the power of the broadband signal H(t) is set equal to that threshold level.
The scaled signal H2(t) output by the pre-emphasis circuit 99 is then combined in the audio mixer 43 with the audio track, the audio track having passed through a time delay unit 101. The time delay introduced by the time delay unit 101 corresponds to the time required by the power monitor 97 to analyse the audio track and for the pre-emphasis circuit 99 to generate and apply the scaling factor to the broadband signal H(t). Thus, each portion of the audio track is combined in the audio mixer 43 with the portion of the broadband signal H(t) which has been scaled in accordance with the energy in that portion of the audio track.
The use of a dynamic shaping algorithm is advantageous because a fixed shaping algorithm cannot ensure that the level of the broadband signal H(t) is both sufficiently high that it can be decoded during loud passages of the audio track and sufficiently low so that it is unobtrusive during quiet passages of the audio track. By employing a dynamic shaping algorithm it is possible to ensure that the scaled signal H2(t) can be satisfactorily decoded at all times. However, a minimum level of the scaled signal H2(t) must be maintained in order to prevent information from being lost. In this embodiment, this is achieved by setting the threshold level below which the power of the scaled signal H2(t) is not allowed to fall.
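A minimal sketch of this dynamic scaling rule is given below, interpreting the fixed amount below the audio power as a fixed power ratio with a floor at the threshold level. The margin and threshold values here are arbitrary placeholders, not figures taken from the text.

```python
import numpy as np

def dynamic_scale(h_t, audio_portion, margin=0.1, threshold_power=1e-4):
    """Scale the broadband signal so its power sits a fixed ratio below the
    power of the audio portion it will be combined with, never falling below
    a threshold level (all values illustrative)."""
    h_t = np.asarray(h_t, dtype=float)
    audio_power = np.mean(np.asarray(audio_portion, dtype=float) ** 2)
    target_power = max(margin * audio_power, threshold_power)
    current_power = np.mean(h_t ** 2)
    if current_power == 0.0:
        return h_t
    return h_t * np.sqrt(target_power / current_power)
```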
As shown in Figure 8, in this embodiment the electrical signal D(t) input to the decoder 105 of the toy 23 is input to the ADC 67 (which includes an anti-aliasing filter for removing unwanted high frequencies). The digital signal output by the ADC 67 is then input to a power monitor 107 which continually monitors the power of the digital signal and outputs a signal indicative of this power to a de-emphasis circuit 109. The de-emphasis circuit 109 generates a time-varying scaling factor which is approximately inverse to the scaling factor applied in the encoder 95. The digital signal output by the ADC 67 is also input, via a time delay unit 111, to the de-emphasis circuit 109 where it is multiplied by the scaling factor generated by the de-emphasis circuit 109. As those skilled in the art will appreciate, the time delay unit 111 introduces a time delay which ensures that the value of the scaling factor for each portion of the digital signal output by the ADC 67 is determined by the power of that portion.
As described above, in the third embodiment the power of the broadband signal is maintained a fixed amount below that of the audio signal unless the power of the broadband signal falls below a threshold value in which case it is set at the threshold value. In this way the broadband signal can be made less noticeable to the listener 27 when combined with the audio track and output as the audio signal 21.
A fourth embodiment will now be described, with reference to Figures 9 to 11, in which psycho-acoustic analysis techniques are used to scale the broadband signal H(t) before it is combined with the audio track. In Figures 9 to 11 features that are identical to those of the third embodiment have been referenced with the same numerals and will not be described again.
Figure 9 shows the encoder 115 of the fourth embodiment in which the audio track is input to a psycho-acoustic analysis unit 117 to determine scaling information for the broadband signal H(t). In the psycho-acoustic analysis unit 117, the audio track is digitally sampled at a sampling rate of 22.05kHz and for each sample the frequency spectrum of the continuous stream of 1024 samples ending with that sample is determined. In this way a "sliding window" containing 1024 samples is analysed.
In this embodiment, the psycho-acoustic analysis unit 117 calculates the energy in ten non-overlapping frequency sub-bands spanning 1kHz to 11kHz and applies a psycho- acoustic algorithm to generate frequency-dependent scaling factors. For each window of samples, the psycho- acoustic algorithm calculates, for each frequency sub- band of the window, a theoretical level below which the human ear cannot distinguish any sound given the content of the audio track. This will be explained further with reference to Figure 10 which shows the sensitivity of a typical human ear for different frequencies (in other words, the minimum sound levels for different frequencies which can be heard by a typical human ear) without any background noise (the plot referenced as 123) and in the presence of a narrow band signal 125 (the dashed plot referenced as 127). As can be seen from Figure 10, the ability of the human ear to distinguish sound in the frequency range of the narrow band signal 125 and in a range of frequencies both above and below the frequency range of the narrow band signal 125 is significantly reduced. There are therefore audio signals which cannot be heard by the human ear in the presence of the narrowband signal 125, even though they would be heard if the narrowband signal is not present. A similar effect to that described above with reference to the frequency domain also exists in the time domain in that after a loud sound stops, the human ear does not immediately recover the sensitivity indicated by the plot 123.
As shown in Figure 9, the frequency-dependent scaling factors generated by the psycho-acoustic analysis unit are input to a scaling unit 119. The broadband signal H(t) output by the modulator 41 is also input to the scaling unit 119 where the broadband signal H(t) is scaled based on the frequency-dependent scaling factors. In this embodiment, this is achieved by passing the broadband signal H(t) through a filter, whose frequency response has been set in accordance with the current frequency-dependent scaling factors. The output P(t) of the scaling unit 119 is then combined in the audio mixer 43 with the audio track, the audio track having passed through a time delay unit 101 to ensure that each portion of the shaped broadband signal P(t) is combined with the portion of the audio track which was analysed to provide the scaling factors.
This type of psycho-acoustic analysis is now being performed in many audio encoding systems in order to reduce the amount of encoded data by removing information corresponding to sounds which would not be heard (and are therefore redundant) because of high level signals in either neighbouring frequency sub-bands or neighbouring time periods. Therefore, in this embodiment, the scaling factors output by the psycho-acoustic analysis unit 117 are set to ensure that if the audio track output from the audio mixer 43 is subsequently encoded in an audio encoder which utilises a psycho-acoustic evaluation of the audio track to remove redundant information, then the broadband signal H(t) will still be maintained in the encoded audio track. This is achieved by setting the level of the broadband signal H(t) in each frequency sub- band to be on or just above the minimum theoretical sound level that can be distinguished by the ear. As psycho- acoustic algorithms calculate conservative estimates of the minimum theoretical levels, the broadband signal H(t) will not be removed by such encoders while still being hardly noticeable to a listener.
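The shaping applied by the psycho-acoustic analysis unit 117 and scaling unit 119 can be reduced to the following simplified sketch: measure the energy of a 1024-sample window in ten sub-bands between 1 kHz and 11 kHz and derive a per-band level for the broadband signal. The masking model here is a crude stand-in (a fixed fraction of the band energy), not the psycho-acoustic algorithm of the text, and all parameter values are placeholders.

```python
import numpy as np

def subband_scale_factors(audio_window, sample_rate_hz=22050.0, n_bands=10,
                          band_lo_hz=1000.0, band_hi_hz=11000.0, masking_fraction=0.01):
    """Estimate per-sub-band scaling factors for the broadband signal from the
    energy of a 1024-sample audio window (crude stand-in for the masking model)."""
    window = np.asarray(audio_window, dtype=float)
    spectrum = np.abs(np.fft.rfft(window)) ** 2
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate_hz)
    edges = np.linspace(band_lo_hz, band_hi_hz, n_bands + 1)
    factors = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_energy = spectrum[(freqs >= lo) & (freqs < hi)].sum()
        # place the broadband signal a small fraction below the band energy
        factors.append(np.sqrt(masking_fraction * band_energy))
    return np.array(factors)
```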
As shown in Figure 11, in the decoder 131 of this embodiment, the electrical signal D(t) from the microphone 25 is digitally sampled by an ADC 67 at a sampling rate of 22.05kHz. The digitally sampled signal is input to a psycho-acoustic analysis unit 133 which applies the same psycho-acoustic analysis algorithm as applied in the psycho-acoustic analysis unit 117 in the encoder 115 to generate estimates of the minimum audible levels calculated in the encoder 115. The psycho-acoustic analysis unit 133 then generates inverse frequency-dependent scaling factors, based on these estimated minimum audible levels, which are designed to reverse the effect of the scaling factors applied in the encoder 115, and outputs the inverse scaling factors to a scaling unit 135.
The digital signal output by the ADC 67 is input, via a time delay unit 137, to the scaling unit 135 where it is scaled by the frequency-dependent inverse scaling factors, again by passing the delayed signal through an appropriate filter whose frequency response has been set using the current set of frequency-dependent inverse scaling factors. As before, the time delay unit 137 introduces a time delay which ensures that each portion of the digital signal is scaled by the inverse scaling factors generated for that portion.
To summarise, the use of a psycho-acoustic algorithm as described above has the advantage that the energy distribution of the broadband signal can be adjusted to reduce its obtrusiveness when combined with the audio track. Further, by a suitable setting of the frequency-dependent scaling factors, if the modified audio track is subsequently encoded using a psycho-acoustic algorithm, the possibility of the broadband signal H(t) being removed to such an extent that the data signal F(t) cannot be regenerated is reduced.
In the fourth embodiment, the frequency-dependent scaling factors are generated for a sliding window which moves in steps of a single sample. It has been found, however, that the bit error rate (BER) of the decoder is significantly reduced if the same frequency-dependent scaling factors are applied throughout each segment of the broadband signal H(t) corresponding to one data bit of the data signal F(t). A fifth embodiment will now be described, with reference to Figures 12 and 13, in which this type of processing is performed. In Figures 12 and 13, features which are identical to those of the fourth embodiment have been referenced with the same numerals and will not be described again.
Figure 12 shows the encoder 141 of the fifth embodiment in which the audio track is input to a segmentation unit 143 which separates the audio track into segments whose duration is equal to the duration of a single data bit (which in this embodiment is approximately 46 ms). The audio track is then input segment-by-segment into a psycho-acoustic analysis unit 145 which analyses the frequency spectrum of each segment to generate frequency-dependent scaling factors which are output to a scaling unit 147. The psycho-acoustic algorithm for generating the scaling factors for a segment is identical to that used in the fourth embodiment described above.
The broadband signal H(t) is also input, via a segmentation unit 149, to the scaling unit 147. The segmentation unit 149 separates the broadband signal H(t) into segments with each segment containing 256 chips which correspond to a single data bit of the data signal F(t). In the scaling unit each segment output by the segmentation unit 149 is scaled using a filter whose frequency response has been set in accordance with the current set of frequency-dependent scaling factors output by the psycho-acoustic analysis unit 145. The shaped broadband signal P'(t) output by the scaling unit is then input to the audio mixer 43 where it is combined with the audio track, the audio track having passed through a time delay unit 101 to ensure that each segment of the shaped broadband signal P'(t) is combined with the segment of the audio track which was analysed to provide the scaling factors for that segment of the shaped broadband signal P'(t).
As shown in Figure 13, in the decoder 151 of this embodiment, the digital signal output by the ADC 67 is input to a segmentation unit 153 which separates the digital signal into segments each containing 1024 samples. In order to ensure that these 1024 samples correspond to a single data bit, the output of the correlator 71 is fed back to the segmentation unit 153 to provide timing information from which the segmentation unit 153 can determine where the start and end of each segment is positioned. Each segment is then input to a psycho-acoustic analysis unit 155 which analyses the energy content of the segment in the same manner as the psycho-acoustic analysis unit 145 in the encoder 141, but generates frequency-dependent scaling factors which are approximately inverse to those generated in the encoder 141. The generated scaling factors are then input to a scaling unit 157.
Each segment output by the segmentation unit 153 is also input, via a time delay unit 159, to the scaling unit 157 where the delayed samples are filtered by a filter having a frequency response set by the corresponding frequency-dependent scaling factors output by the psycho-acoustic analysis unit 155. As before, the time delay unit 159 introduces a time delay to allow for the time required for the psycho-acoustic analysis unit 155 to generate the scaling factors and therefore each segment is scaled using scaling factors generated by analysing the energy distribution of that segment. The signal output by the scaling unit 157 is then demodulated and correlated with the pseudo-noise code sequence to regenerate the data signal F(t).
In the first to fifth embodiments, the pseudo-noise code generator 37 generates an 8 bit code having 255 bits. The broadband signal H(t) generated using an 8 bit code, while often satisfactory, may still be noticeable in the audio signal output by the loudspeaker 19. In order to reduce the audibility of the broadband signal H(t) it is preferred that a longer pseudo-noise code sequence is used. In particular, using a 10 bit code (which forms a sequence of 1023 chips in a stream with no sequence of more than 10 chips repeated in the 1023 chips) or a 12 bit code (which forms a sequence of 4095 chips in a stream with no sequence of more than 12 chips repeated in the 4095 chips), while still multiplying each bit of the data signal F(t) by a 256 chip sequence so that neighbouring bits are multiplied by different chip sequences, provides a significant improvement in the unobtrusiveness of the broadband signal H(t) in the audio signal.
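A pseudo-noise code of the kind referred to above can be produced by a maximal-length linear feedback shift register. The Python sketch below is one possible illustration; the particular feedback tap positions are commonly tabulated primitive-polynomial choices and are not taken from the present description.

```python
import numpy as np

def lfsr_msequence(taps, n_bits):
    """One period of a maximal-length PN sequence from a Fibonacci LFSR.

    taps   -- feedback stage positions (1-indexed) of a primitive polynomial,
              e.g. (8, 6, 5, 4) for an 8 bit code, (12, 11, 10, 4) for a 12 bit code.
    n_bits -- register length; the period is 2**n_bits - 1 chips.
    """
    state = [1] * n_bits                          # any non-zero seed
    chips = []
    for _ in range(2 ** n_bits - 1):
        chips.append(state[-1])                   # output the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]           # shift the register
    return np.array(chips)

pn255 = lfsr_msequence((8, 6, 5, 4), 8)           # 8 bit code: 255 chips
pn4095 = lfsr_msequence((12, 11, 10, 4), 12)      # 12 bit code: 4095 chips
```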
An arrangement using correlators formed by digital matched filters, as used in the first to fifth embodiments, could be used to despread the audio signal conveying a data signal spread using such longer PN code sequences. This is commonly referred to as incoherent despreading. However, it is preferred that the signal detected by the microphone 61 is despread by synchronously multiplying the detected signal D(t) with the same pseudo-noise code as was used for encoding the data signal F(t), because this enables a more reliable regeneration of the original data signal to be achieved. Despreading by synchronous multiplication is commonly referred to as coherent despreading.
A sixth embodiment which uses coherent despreading will now be described with reference to Figures 14 to 23. Figure 14 shows the encoder 161 used in this embodiment. In this embodiment, the data signal F(t) is a logic signal which, as in the previous embodiments, is generated at approximately 21.5 bits per second. In this embodiment, a phase shift keying (PSK) modulation technique is used to modulate the spread data signal I(t) to form a spread signal G(t) centred at about 5.5kHz. Further, as shown in Figure 14, the encoder 161 includes two pseudo-noise code generators 163, 165 for spreading the data signal F(t). The first pseudo-noise code generator 163 generates a code sequence PN1, by repeating a first 12 bit PN code with an extra binary 0 added to the end of each sequence of 4095 chips, at a chip rate of 5512.5 Hz. Similarly, the second pseudo-noise code generator 165 generates a code sequence PN2, by repeating a second, different 12 bit PN code with an extra binary 0 added to the end of each sequence of 4095 chips, at a chip rate of 5512.5 Hz. In this embodiment, the first and second PN codes are orthogonal to each other and therefore if they are multiplied together chip by chip another pseudo-noise sequence is generated. The data signal F(t) and PN1 are input to a first AND gate 167a while the inverse of the data signal F(t) and PN2 are input to a second AND gate 167b, with the outputs of the first and second AND gates connected together to form a common output. In this way a logic signal I(t) is formed at the common output in which every "1" of the data signal F(t) has been converted into a 256 chip sequence from PN1 and every "0" of the data signal F(t) has been converted into a 256 chip sequence from PN2. The logic signal I(t) is input to a modulator 169 which uses phase shift keying to modulate a 5512.5 Hz carrier signal generated by an oscillator 171. This will be explained further with reference to Figures 15A and 15B, which show an eight chip sequence of the logic signal I(t) and the corresponding modulated signal G(t) output by the modulator 169 respectively. As can be seen from Figures 15A and 15B, whenever the logic signal I(t) undergoes a change of state, a phase shift of 180° is introduced into the carrier signal. In this way the spread signal G(t) is generated with a main energy band from baseband to 11kHz and higher frequency sub-bands. The spread signal G(t) output from the modulator 169 is then input to an audio mixer 173 where it is combined with the audio track using a simple adding operation to form the modified audio track.
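The spreading and phase shift keying steps described above can be summarised by the following Python sketch, which maps each data bit onto 256 chips of PN1 or PN2 and introduces a 180° phase shift in the 5512.5 Hz carrier on every change of chip state. The function and variable names are illustrative assumptions only.

```python
import numpy as np

FS = 22050.0                 # output sample rate (4 samples per chip)
CHIP_RATE = 5512.5
CARRIER = 5512.5
CHIPS_PER_BIT = 256
SAMPLES_PER_CHIP = int(round(FS / CHIP_RATE))    # = 4

def spread_and_modulate(bits, pn1, pn2):
    """Form the logic signal I(t) and the PSK-modulated spread signal G(t)."""
    chips, pos = [], 0
    for b in bits:
        code = pn1 if b else pn2                  # '1' -> PN1 segment, '0' -> PN2 segment
        chips.extend(code[(pos + k) % len(code)] for k in range(CHIPS_PER_BIT))
        pos += CHIPS_PER_BIT                      # neighbouring bits use different segments
    chips = np.repeat(np.array(chips), SAMPLES_PER_CHIP)
    t = np.arange(len(chips)) / FS
    carrier = np.sin(2.0 * np.pi * CARRIER * t)
    return np.where(chips == 1, carrier, -carrier)   # 180° phase shift on a change of state
```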
Figure 16 shows the decoder 181 of the sixth embodiment. The electrical signal D(t) from the microphone 25 is input to an ADC 183 (which includes an anti-aliasing filter to remove unwanted high frequencies) where it is sampled at a rate of about 22.05 kHz (4 times the chip rate), the exact rate being determined by a clock signal clk, to form a digital signal J(t). Figure 15C shows an example of the samples obtained by sampling an electrical signal D(t) containing just the spread signal G(t) illustrated in Figure 15B.
In order to perform coherent despreading, the digital signal J(t) is separately multiplied by the code sequences PN1 and PN2. It is, however, necessary to ensure that the chip sequence in the electrical signal D(t) and the chip sequences of the codes PN1 and PN2 are time-synchronised. To achieve an initial synchronisation, the digital signal J(t) is input to an acquisition unit 185. The acquisition unit 185 generates signals which are analysed by a processor 187 which generates a signal F to control the rate of the clock signal clk generated by a clock 189, and signals X and Y for controlling the timing of the generation of the codes by the first and second pseudo-noise code generators 191 and 193 respectively.
Figure 17 shows in more detail the contents of the acquisition unit 185 and the signals that are generated by the acquisition unit 185 and supplied to the processor 187. In this embodiment, the processor 187 first removes any frequency offset between the chip rate of the chip sequence in the electrical signal D(t) and the chip rate of the first and second pseudo-noise code generators 191, 193 by adjusting the clock rate. This is necessary for two main reasons. The first reason is that, in practice, there will always be a slight difference between the clock frequencies used to generate the pseudo-noise codes in the encoder 161 and in the decoder 181 respectively. The second reason is that even if identical clock frequencies are used in the encoder and the decoder, frequency shifts caused by Doppler effects can occur which affect the chip rate in the detected signal. Therefore, without control of the rate at which the codes PN1 and PN2 are generated in the decoder 181 it would be necessary to perform re-synchronisation on a frequent basis.
As shown in Figure 17, the samples of the digital signal J(t) from the microphone 25 are input to a series of four digital matched filters 211a to 211d which are arranged in series so that the cascade output (indicated in Figure 17 by a) of the first matched filter is input to the second matched filter and so on. Each filter has 2048 taps, with the taps of the first matched filter 211a being matched to the first two bits of a SYNC byte, the taps of the second matched filter 211b being matched to the third and fourth bits of the SYNC byte, the taps of the third matched filter 211c being matched to the fifth and sixth bits of the SYNC byte and the taps of the fourth matched filter 211d being matched to the seventh and eighth bits of the SYNC byte (1024 taps are required for each data bit because each data bit is multiplied by 256 chips and each chip is to be sampled four times).
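Each of these matched filters is, in effect, a sliding correlation of the incoming samples against the expected waveform of part of the SYNC byte. A minimal Python sketch of one such filter follows; the helper names are assumptions, and the reference waveform is built from the PSK modulation described for this embodiment.

```python
import numpy as np

FS = 22050.0
CARRIER = 5512.5

def matched_filter_taps(chip_segment, samples_per_chip=4):
    """Taps matched to the PSK waveform of one 512-chip segment (two SYNC bits)."""
    chips = np.repeat(np.asarray(chip_segment), samples_per_chip)      # 2048 taps
    t = np.arange(len(chips)) / FS
    return np.where(chips == 1, 1.0, -1.0) * np.sin(2.0 * np.pi * CARRIER * t)

def matched_filter_scores(samples, taps):
    """Score output at each clock pulse as the samples are clocked through."""
    return np.correlate(samples, taps, mode="valid")
```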
The reason why a single matched filter having 8192 taps is not used rather than the four series connected matched filters 211 will now be described. In particular, if a single large matched filter was used in order to detect the SYNC byte, and if the rate at which the codes PN1 and PN2 are generated is different to the chip rate in the received electrical signal D(t), then this lack of synchronisation will be more noticeable. This is because a large single matched filter performs the correlation over a larger time window and consequently the effects of the lack of synchronisation can build up over a longer period of time, thereby degrading the score output by the single matched filter. In contrast, by using a number of smaller series connected matched filters, the time window over which each of the matched filters performs the correlation is much smaller than that of the larger single matched filter. Hence, the effect of lack of synchronisation will be less noticeable for each of the individual smaller matched filters. As a result, larger frequency offsets between the chip rate in the received electrical signal D(t) and the chip rate of the codes PN1 and PN2 can be tolerated by using the four matched filters 211 rather than a single matched filter.
The score output by each of the matched filters 211 (which is indicated by output b and which is updated at each clock pulse as the samples of J(t) are clocked through the matched filters) is input to a corresponding one of four normalisation circuits 213a to 213d. The normalisation circuits 213 provide a normalised output for a wide dynamic signal range of the input electrical signal D(t). This enables the output of the normalisation unit to be analysed by a simple thresholding operation. Figure 18 shows schematically the contents of each normalisation circuit 213. As shown, the current score from the corresponding matched filter 211 is input to a time delay unit 221 where it is delayed for 1024 clock periods, which corresponds to the time taken for the samples of the digital signal J(t) to propagate halfway through the corresponding one of the matched filters 211. The current score is also input to an averaging circuit 223 which uses the current score to update a running average of the last 2048 scores. The output of the time delay unit 221 is then input to a divider 225 which divides the delayed score by the current value of the running average, to produce the normalised output. The above processing makes the normalisation circuit particularly well suited to systems where a spread spectrum signal is hidden in an acoustic signal, because the acoustic signal will typically vary over a large dynamic range.
Figure 19 shows in more detail the contents of the averaging circuit 223. As shown, the current score is input to a time delay unit 231, where it is delayed for 2048 clock periods, and an adder 233 where the inverse of the time delayed score is added to the current score. The output of the adder 233 is then input to a second adder 235 where it is added to the current value of the running average (delayed by one clock cycle) output by the time delay unit 237, to generate a new current value of the running average which is used by the divider circuit 225.

In this embodiment, sequences of SYNC bytes are repeated intermittently within the data signal F(t) output by the data source 35. Figure 20 shows a typical output of one of the normalisation circuits 213, when two consecutive SYNC bytes pass through the corresponding matched filter 211. In Figure 20 reference timings 241 have been illustrated which are separated by 8192 clock periods (nominally corresponding to the time required for the samples corresponding to one SYNC byte to pass through the matched filter). The period between two adjacent reference timings 241 will hereinafter be referred to as a frame. A first peak 243 in the normalised score, corresponding to the first SYNC byte, occurs a time τ1 after the nearest preceding reference timing 241, while a second peak 245, corresponding to the second SYNC byte, occurs a time τ2 after the nearest preceding reference timing 241. If there is no frequency offset in the chip rates, then τ1 will be equal to τ2 (since in 8192 clock periods the samples corresponding to a SYNC byte will pass completely through the four matched filters 211) and the matched filters 211a-211d will all output peaks at the same time. However, if there is a frequency offset in the chip rates, then there will be a timing offset τoff, defined by τ2 - τ1, between the peaks in neighbouring frames which is dependent on the frequency offset. Further, since there is a frequency offset in the chip rates, the peaks output by the four matched filters 211a to 211d will not occur simultaneously. However, the timing offset (τoff) for the output of each of the normalisation circuits 213 should be identical. In this embodiment, the acquisition unit 185 makes use of this, in order to quantify the frequency offset and hence to correct for it. The way in which it does this will now be described.
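The behaviour of each normalisation circuit 213 can be sketched as follows in Python: the raw matched filter score is delayed by half the window and divided by a running average of the last 2048 scores, so that the normalised output can be analysed by a simple threshold even when the input level varies widely. The function name and the small floor value used to avoid division by zero are assumptions of this sketch.

```python
import numpy as np

def normalise_scores(scores, delay=1024, window=2048):
    """Normalisation circuit sketch: delayed score / running average of scores."""
    scores = np.abs(np.asarray(scores, dtype=float))   # magnitude scores assumed
    out = np.zeros_like(scores)
    running_sum = 0.0
    for i, s in enumerate(scores):
        running_sum += s                               # averaging circuit 223
        if i >= window:
            running_sum -= scores[i - window]
        average = running_sum / min(i + 1, window)
        if i >= delay:                                 # time delay unit 221
            out[i] = scores[i - delay] / max(average, 1e-12)
    return out
```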
As shown in Figure 17, in this embodiment, the output of each normalisation circuit 213 is input to a corresponding cross-correlator 215a to 215d where it is cross-correlated with the output from the same normalisation circuit for the immediately preceding frame. This is achieved by passing the output score from each normalisation unit 213 through a corresponding time delay unit 217a to 217d which delays the scores by one frame. The output from the normalisation circuit 213 is then cross correlated with the corresponding delayed output, by the cross-correlator 215. In this embodiment, a maximum frequency offset of three chips per SYNC byte is anticipated. Therefore, the cross-correlators 215 only look for a cross-correlation peak over a range of time offsets between the two frames, varying between a three chip lead and a three chip lag. This results in a significant reduction in the amount of processing required by the cross-correlators 215.
Figure 21 shows a typical output of one of the cross-correlators 215. The x-axis corresponds to the time offset between the two frames output by the normalisation circuit 213 and the y-axis corresponds to the score output by the cross-correlator 215. In this embodiment, twenty-five values of the cross-correlator score are obtained to allow for a maximum time offset of ± three chips. A cross-correlation peak 251 occurs at a time offset τoff which is equal to τ2 - τ1. As mentioned above, the time offset for each of the matched filters 211a-211d should be identical and therefore the position of the cross-correlation peak 251 in the output of each of the cross-correlators 215 should be the same. The outputs of the four cross-correlators 215 are therefore added together in an adder 219 and the output of the adder 219, labelled OFFSET in Figure 17, is input to the processor 187. The processor 187 then calculates the frequency offset (from τoff and the size of the correlation window of the matched filters 211) and sends a control signal F to the clock 189 to adjust the clock frequency in order to reduce the frequency offset to a manageable amount.
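The frame-to-frame cross-correlation can be sketched in Python as below. Each frame is 8192 clock periods long and, with four samples per chip, a tolerance of ± three chips corresponds to lags of ±12 samples, i.e. the twenty-five cross-correlation values mentioned above. The function name is an assumption of this sketch.

```python
import numpy as np

def frame_timing_offset(prev_frame, curr_frame, max_lag=12):
    """Cross-correlate the normalised scores of two adjacent 8192-sample frames.

    Only lags of -max_lag..+max_lag samples are evaluated (± three chips); the
    lag of the peak is the timing drift per frame, from which the frequency
    offset between the chip rates can be calculated.
    """
    lags = np.arange(-max_lag, max_lag + 1)
    scores = []
    for lag in lags:
        if lag >= 0:
            scores.append(np.dot(curr_frame[lag:], prev_frame[:len(prev_frame) - lag]))
        else:
            scores.append(np.dot(curr_frame[:lag], prev_frame[-lag:]))
    scores = np.array(scores)
    return int(lags[np.argmax(scores)]), scores
```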
Once the frequency offset has been reduced in this way, it is then necessary to synchronise the codes PN1 and PN2 generated by the first and second pseudo-noise code generators 191 and 193 respectively, with the chip sequence in the detected electrical signal D(t). In this embodiment, this is achieved by inputting the output scores Ai, Bi, Ci and Di from the four normalisation circuits 213 directly into the processor 187 which determines, from the largest peak present in the four outputs, the timing of the chip sequence in the detected electrical signal D(t). The processor then outputs control signals X and Y to the first and second pseudo-noise code generators 191 and 193 respectively, in accordance with the timing of the largest peak to achieve time synchronisation. In this embodiment, the processor 187 is a microprocessor based system which is schematically illustrated in Figure 22. As shown, the processor includes an interface circuit 255 for interfacing a central processing unit (CPU) 257 with the normalised scores Ai, Bi, Ci and Di output from the normalisation circuits 213 and for interfacing the CPU with the adder 219 and the two pseudo-noise code generators 191 and 193. As shown in Figure 22, the interface circuit 255 also receives a signal (TRACK) which is used in tracking which will be described in more detail below. In carrying out the calculations described above, the processor 187 processes the values received from the interface 255 in accordance with predetermined instructions stored in a program memory 259. A working memory (RAM) 261 is also provided in which the CPU 257 can perform the various calculations. A user interface 263 is also provided to allow a user to adjust the settings of the processor 187, for example in order to change or alter the program instructions stored in the program memory 259 so that the decoder can be reconfigured.
Returning to Figure 16, when synchronisation has been achieved, data encoded in the digital signal J(t) can be extracted by inputting the digital signal J(t) to a correlator unit 195 (indicated by dotted lines) in which the digital signal J(t) is synchronously multiplied by PN1 and PN2. As shown in Figure 16, the correlator unit 195 comprises three channels labelled late, on-time and early. As will be explained in detail, the three channels enable the time synchronisation to be tracked while data other than the SYNC byte is being transmitted.
The digital signal J(t) is input into each of the three channels of the correlator unit 195 and in each channel it is separately multiplied by PN1 and PN2. In the late channel, the digital signal J(t) is input to a multiplier 199a, where it is multiplied by PN1 time-delayed by two clock periods by a time delay unit 197a, and to a multiplier 201a, where it is multiplied by PN2 time-delayed by two clock periods by a time delay unit 197c. Similarly, in the on-time channel the digital signal is input to a multiplier 199b, where it is multiplied by PN1 time-delayed by one clock period by a time delay unit 197b, and to a multiplier 201b, where it is multiplied by PN2 time-delayed by one clock period by a time delay unit 197d. In the early channel, the digital signal is input to a multiplier 199c, where it is multiplied by PN1, and to a multiplier 201c, where it is multiplied by PN2.
When the digital signal J(t) is multiplied by PN1, if the chip sequence of the signal J(t) matches PN1, then a narrowband signal at about the carrier frequency of 5512.5 Hz will be generated. Similarly, when the digital signal J(t) is multiplied by PN2, if the chip sequence of the signal J(t) matches PN2, then a narrowband signal at the carrier frequency will be generated. Thus, for each channel, if the received data bit is a "1", then the output of the multipliers 199 will contain a narrowband signal at the carrier frequency and, because PN1 and PN2 are orthogonal, the output of the multipliers 201 will not contain the narrowband signal. Similarly, if the received data bit is a "0", then the output of the multipliers 201 will contain the narrowband signal at the carrier frequency and the output of the multipliers 199 will not.
For each channel, the outputs of the two multipliers 199 and 201 are input to a corresponding power comparator 203a to 203c, which is shown in more detail in Figure 23. As shown, in the power comparator 203, the outputs of the two multipliers 199 and 201 are input to respective bandpass filters 267a and 267b which are centred on the carrier frequency. The output of each bandpass filter 267 is then input to a respective power monitor 269a or 269b, which determines the power of the signal output from the corresponding bandpass filter 267. As mentioned above, when the received data bit is a "1", the output from the power monitor 269a should be greater than the output from the power monitor 269b. In contrast, when the received data bit is a "0", the output from the power monitor 269b should be greater than the output from the power monitor 269a. Therefore, the outputs from the power monitors 269 are input to a comparator 271 which outputs a value in dependence upon the difference between the outputs of the two power monitors 269. In this embodiment, the output from the power monitor 269a is input to the positive terminal of the comparator 271 and the output from the power monitor 269b is input to the negative input of the comparator 271. Therefore, if the received data bit is a "1", then the output of the comparator 271 will be positive and if the received data bit is a "0", then the output of the comparator 271 will be negative.
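A compact way of sketching the power comparator 203 in Python is to measure the energy of each despread branch in a narrow band around the carrier and take the difference, so that a positive result indicates a "1" and a negative result a "0". The quadrature-projection power estimate used here is an illustrative stand-in for the bandpass filter and power monitor pair; the function names are assumptions.

```python
import numpy as np

FS = 22050.0
CARRIER = 5512.5

def narrowband_power(x, f0=CARRIER, fs=FS):
    """Approximate power of x in a narrow band around f0 over one bit period."""
    t = np.arange(len(x)) / fs
    i = np.dot(x, np.cos(2.0 * np.pi * f0 * t))
    q = np.dot(x, np.sin(2.0 * np.pi * f0 * t))
    return (i * i + q * q) / len(x)

def power_comparator(pn1_branch, pn2_branch):
    """Positive output -> data bit '1'; negative output -> data bit '0'.

    pn1_branch and pn2_branch are one data-bit period of the products of the
    digital signal with PN1 and PN2 respectively.
    """
    return narrowband_power(pn1_branch) - narrowband_power(pn2_branch)
```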
Returning to Figure 16, the output of the power comparator 203b in the on-time channel is input to a data regeneration circuit where it is converted into the regenerated data signal F'(t). The output of the power comparator 203b in the on-time channel is also input, together with the outputs of the power comparators 203a and 203c of the late and early channels, into an analysis unit 207. The analysis unit 207 determines which of the power comparators 203 provides the largest output, or in other words in which channel there is the best match between the chip sequence in the digital signal J(t) and PN1 and PN2. If the power comparator 203a in the late channel provides the largest output, then the analysis unit 207 sends a signal (on the control line labelled TRACK) to the processor 187 indicating that the clock should skip a sample so that the power comparator 203b in the on-time channel once more produces the largest output. Similarly, if the power comparator 203c in the early channel produces the largest output, then the analysis unit 207 outputs a signal to the processor 187 which causes the clock 189 to make a double sample so that the comparator 203b of the on-time channel once more produces the largest output. In this way a tracking operation is accomplished in which the synchronisation of PN1 and PN2 with the chip sequence encoded in the signal J(t) is checked on a sample-by-sample basis and, if necessary, the timing of PN1 and PN2 is adjusted to correct for a reduction in synchronisation.

Figure 20 shows an exemplary output of the normalisation circuit in which only a single peak 245 is present in each frame corresponding to a single acoustic path between the loudspeaker 19 and the microphone 25. However, as shown in Figure 24, typically a group 275 of several peaks is present in each frame because in the present application, as shown in Figure 1, the television set 15 is located within a room and generally different paths will exist between the loudspeaker 19 and the microphone 25 (due to reflections off walls etc) and the signals travelling along these different paths arrive at the microphone 25 at different times. Thus, the electrical signal D(t) generated by the microphone 25 has several components, with each component corresponding to a path with a different time of flight, and a peak is formed corresponding to each of these components. As shown in Figure 24, the strongest component appears a time τ1 after the reference time 241, followed by further peaks at times τ2, τ3, τ4 and τ5.
It will be appreciated by the skilled person that the acquisition unit 185 described above is robust in the presence of these multi-path peaks because, although the signals output by the normalisation circuits will include several peaks per frame, the output of each cross-correlator 215 should still contain just a single peak as the intervals between the peaks should remain constant. Further, the determination of the time synchronisation in this embodiment is only based on the largest peak output by a normalisation circuit 213 and therefore the smaller peaks corresponding to the other paths are ignored. However, this means that the information contained in the other components is not used when regenerating the data signal F(t). A seventh embodiment will now be described with reference to Figure 25 in which some of these other components are used as well as the strongest component to regenerate the data signal. In particular, in this embodiment a "four-pronged rake receiver" is used which allows information contained in four components of the received signal D(t) to be used.
The encoder of the seventh embodiment is identical to that of the sixth embodiment and will not, therefore, be described again. The decoder 281 of the seventh embodiment is shown in Figure 25 in which features which are identical to those of the decoder 181 of the sixth embodiment have been referenced with the same numerals and will not be described again.
As shown in Figure 25, the digital signal J(t) output by the ADC 183 is input to the acquisition unit 185 and to four time delay units 283a, 283b, 283c and 283d, each time delay unit being in a different prong of the rake receiver. In this embodiment, the processor 187 determines from the outputs Ai, Bi, Ci and Di of the normalisation circuits 213 of the acquisition unit 185, the timings of the four strongest components of the digital signal J(t) relative to the reference timing 241. For example, for the output illustrated in Figure 24, the four timings are τ1, τ2, τ3 and τ4. Four control signals are then generated by the processor 187, each corresponding to a respective one of these four timings. As shown in Figure 25, each control signal is input to a corresponding one of the four time delay units 283a, 283b, 283c and 283d so that each time delay unit outputs a signal for which a respective one of the four strongest components of the digital signal J(t) will be time synchronised with PN1 and PN2.
The signal output from each time delay unit 283a, 283b, 283c and 283d is then input to a corresponding one of four correlator units 195a, 195b, 195c and 195d, each of which is identical to the correlator unit 195 in the sixth embodiment. As a result of the time delays introduced by the time delay units 283, the outputs of the four correlator units 195a, 195b, 195c and 195d should be in phase. Therefore, the output of the multiplier 199a in the late channel of each correlator 195a to 195d is input to a first adder 285a where they are added together and the output of the multiplier 201a in the late channel of each correlator unit 195a to 195d is input to a second adder 285b where they are added together. Similarly, the outputs of the on-time channels in each correlator are input to third and fourth adders 285c and 285d and the outputs of the early channel in each correlator are input to fifth and sixth adders 285e and 285f respectively.
The output of each adder 285 then forms a respective one of the inputs to the three power comparators 203a, 203b and 203c and the processing proceeds as in the sixth embodiment. As those skilled in the art will appreciate, by using the four strongest components in the electrical signal D(t) output by the microphone 25, a higher signal to noise ratio is achieved compared to using only the strongest component, thereby improving the ability of the decoder 281 to regenerate the original data signal F(t).
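The rake combining of the seventh embodiment can be sketched as follows in Python: each of the four strongest components is time-aligned, despread against PN1 and PN2, and the corresponding branch outputs are summed before the power comparison. The helper names, the use of simple sample slicing for the time delay units and the sample-rate ±1 PN waveforms are all assumptions of this sketch.

```python
import numpy as np

def despread(aligned, pn1_samples, pn2_samples):
    """Multiply a time-aligned copy of J(t) by the sample-rate PN1/PN2 waveforms (+/-1)."""
    n = len(aligned)
    return aligned * pn1_samples[:n], aligned * pn2_samples[:n]

def rake_combine(j, delays, pn1_samples, pn2_samples):
    """Align, despread and sum the four strongest multipath components of J(t)."""
    n = len(j) - max(delays)                 # common length after alignment
    acc1, acc2 = np.zeros(n), np.zeros(n)
    for d in delays:                         # e.g. the timings tau1..tau4 in samples
        aligned = j[d:d + n]                 # time delay unit for this prong
        b1, b2 = despread(aligned, pn1_samples, pn2_samples)
        acc1 += b1                           # adders 285: corresponding outputs summed
        acc2 += b2
    return acc1, acc2                        # passed on to the power comparators
```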
In the sixth and seventh embodiments described above, the processor 187 controls a clock signal clk which is used to determine the timings at which the ADC 183 samples the electrical signal D(t). An eighth embodiment will now be described in which the sampling rate of the ADC 183 is fixed, but in which the output of the ADC 183 is re-sampled to overcome the frequency offset problem.
The encoder of the eighth embodiment is identical to that of the sixth embodiment and will not, therefore, be described again. The decoder 291 of the eighth embodiment is shown in Figure 26 and features which are identical to those of the decoder 181 of the sixth embodiment have been referenced with the same numerals and will not be described again. In the decoder 291 of the eighth embodiment, the ADC 183 samples the electrical signal D(t) from the microphone 25 at a fixed rate of 22.05 kHz. The digital signal J(t) output by the ADC 183 is then input to the acquisition unit 185 and to a re-sampling circuit 293. In this embodiment, the clock signal clk output by the clock 189 is at a frequency of approximately 44.10kHz, the exact frequency being determined by the chip rate of the chip sequence conveyed by the digital signal J(t). As in the sixth and seventh embodiments, clock pulses can be skipped or doubled during the tracking operation. The pseudo-noise code generators 191 and 193 generate codes PN1 and PN2 respectively, at a rate of one chip every eight clock pulses of the clock signal clk.
The digital signal J(t) is stored in blocks of 8192 samples in the re-sampling circuit 293. The processor 187 determines from the outputs of the cross-correlators 215 of the acquisition unit 185, the chip rate of the chip sequence in the digital signal J(t), and then outputs a signal S to the re-sampling circuit which indicates the rate at which the digital signal J(t) needs to be re-sampled. For example, if the determined chip rate in the digital signal is 5567.625 Hz, which corresponds to an increase of 1% over the nominal chip rate of 5512.5 Hz, then the re-sampling rate has to be 22.2705 kHz to allow for the additional chips present. The re-sampled data is determined in the re-sampling circuit 293 from the 8192 stored samples using interpolation techniques to give, for the exemplary 1% increase in chip rate, 8274 samples. The clock rate is also adjusted to be eight times the chip rate of the chip sequence in the digital signal J(t) and the re-sampled data is then input to the correlator unit 195 at the clock rate where it is multiplied by PN1 and PN2 and the remaining processing proceeds as in the sixth embodiment.
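The re-sampling step can be sketched in Python using linear interpolation; for the 1% chip-rate increase given above, a block of 8192 stored samples is interpolated onto 8274 points. The function name is an assumption, and a practical implementation might well use a higher-order interpolator.

```python
import numpy as np

def resample_block(block, measured_chip_rate, nominal_chip_rate=5512.5):
    """Re-sample a stored block so that it again contains four samples per chip."""
    ratio = measured_chip_rate / nominal_chip_rate      # e.g. 5567.625 / 5512.5 = 1.01
    n_out = int(round(len(block) * ratio))              # 8192 -> 8274 for a 1% increase
    new_positions = np.linspace(0, len(block) - 1, n_out)
    return np.interp(new_positions, np.arange(len(block)), np.asarray(block, dtype=float))
```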
In this embodiment the first and second pseudo-noise code generators 191 and 193 are controlled by the processor 187 to operate only when re-sampled data is input to the correlator unit 195. It will be appreciated by the skilled person that the clock signal clk is run at a faster rate than the sampling rate of the ADC 183 to give time for the re-sampling to take place and for any additional chips to be processed.
Modifications and Further Embodiments
The person skilled in the art will appreciate that the re-sampling techniques of the eighth embodiment could also be applied to a decoder employing a rake receiver arrangement.
In the sixth to eighth embodiments a specific synchronisation and tracking technique is described. However, it will be appreciated that many modifications could be made to this synchronisation and tracking technique. Parameters, such as the number of matched filters and the number of taps in each matched filter, will depend upon the operational tolerance required for the device. It is desirable to use a long synchronisation sequence in the matched filter because, when there is no frequency offset, increasing the number of taps increases the size of the peak in the output of the matched filter. However, increasing the number of taps in a single matched filter reduces the frequency offset which that matched filter can tolerate.
It has been found that if it is desired to match the taps of the matched filter with a synchronisation sequence comprising N chips, but a frequency offset resulting in a time drift of e chips of the synchronisation sequence must be tolerated across the matched filter, then it is preferred to split the matched filter into (e + 1) matched filters, each matched to N/(e + 1) chips of the synchronisation sequence. For example, in the sixth embodiment the SYNC byte corresponds to N = 2048 chips and a drift of e = 3 chips is tolerated, giving the four matched filters each matched to 512 chips. If the above-described cross-correlation technique is then used to measure the frequency offset, then sample offsets spanning from -e chips to +e chips need to be calculated in the cross-correlator.
In the sixth to eighth embodiments, the cross-correlators calculate cross-correlation scores for timing offsets between the two frames which are varied in steps corresponding to a single clock cycle. This is not essential and the cross-correlation could, for example, be calculated for timing offsets of the frames which vary in steps corresponding to 2 or 4 clock cycles. Further, it will be appreciated from Figure 21 that the frequency offset could be determined with a sub-sample resolution by interpolating the position of the peak from the values of the points neighbouring the maximum cross-correlation score. Another way of obtaining a more accurate measure of the frequency offset is to accumulate the summed outputs of the cross-correlators over more than two frames.
In the sixth to eighth embodiments the matched filters 211a-211d of the acquisition circuit 185 are arranged in series. Figure 27 illustrates an alternative acquisition unit 301 in which the matched filters are arranged in parallel. In Figure 27 features which are identical to corresponding features in Figure 17 have been referenced with the same numerals and will not be described again. The digital signal J(t) is input into four parallel channels. In the first channel the digital signal J(t) is input directly into the first matched filter 211a. In the second channel the digital signal J(t) is input, via a time delay unit 303 which introduces a delay of 2048 clock cycles, into the second matched filter 211b. In the third channel the digital signal J(t) is input, via a time delay unit 305 which introduces a delay of 4096 clock cycles, into the third matched filter 211c. In the fourth channel the digital signal J(t) is input, via a time delay unit 307 which introduces a delay of 6144 clock cycles, into the fourth matched filter 211d. In this way the score outputs Ai, Bi, Ci and Di of the first to fourth matched filters 211a to 211d are identical to those for the acquisition unit 185 of the sixth embodiment and therefore the remainder of the acquisition unit 301 functions in the same manner as the acquisition unit 185 of the sixth embodiment.
The acquisition circuits described with reference to Figures 17 and 27 have a number of advantages. These advantages include:
(1) The normalisation circuit 213 makes the acquisition circuit capable of analysing input digital signals J(t) whose power spans a wide dynamic range.

(2) Replacing a single long matched filter with four short matched filters improves the ability to generate peaks in the presence of large frequency offsets.

(3) By cross-correlating the output of each normalisation circuit with the output one frame earlier, the output of the cross-correlator for each matched filter is nominally identical and hence the outputs of the cross-correlators can simply be added together.

(4) By using a cross-correlation to determine the frequency offset, the amount of processing required to determine the frequency offset for multi-path signals is identical to that required for single-path signals.

(5) As the length of the matched filters used will be determined based on a required tolerance of frequency offset for the decoder, the cross-correlation need only be carried out for time offsets between the outputs of the normalisation circuit for neighbouring frames which fall within this tolerance, thereby reducing the amount of processing performed.
It will be appreciated by the person skilled in the art that the normalisation circuit is not essential to achieve some of the advantages discussed above. An automatic gain control (AGC) circuit at the input to the decoder could be used instead, but the normalisation circuit is preferred because it gives a significant improvement in performance. It will also be appreciated that the cross-correlation technique, in which the score outputs of neighbouring frames are cross-correlated, could also be applied to the output of a single filter to determine the frequency offset, although using plural filters and adding the results together increases the signal magnitude.
Further, the increased frequency offset tolerance achieved by splitting a single matched filter into a number of smaller matched filters can be achieved without using the cross-correlation technique. Figure 28 illustrates an alternative acquisition circuit 309 which corresponds to the acquisition circuit 185 illustrated in Figure 17 with the cross-correlators 215, delay units 217 and adder 219 removed so that the OFFSET signal is not generated and only the scores Ai, Bi, Ci and Di of the normalisation circuits 213 for each sample i are input to the processor 187. The processor can then use these scores to keep a number of accumulated scores for different values of frequency offset. The frequency offset can then be determined by comparing the accumulated scores to determine which has the best score.
For example, the processor 187 may perform a routine in which for each sample the following sums, which correspond to varying timing offsets, are calculated:
Ai + Bi + Ci + Di           (1)
Ai + Bi+1 + Ci+2 + Di+3     (2)
Ai + Bi-1 + Ci-2 + Di-3     (3)
Ai + Bi+2 + Ci+4 + Di+6     (4)
Ai + Bi-2 + Ci-4 + Di-6     (5)
Ai + Bi+3 + Ci+6 + Di+9     (6)
Ai + Bi-3 + Ci-6 + Di-9     (7)
The skilled person will recognise that sum (1) corresponds to no timing offset, sums (2) and (3) correspond to a timing offset of ±3 clock cycles, sums (4) and (5) correspond to a timing offset of ±6 clock cycles, and sums (6) and (7) correspond to a timing offset of ±9 clock cycles. For each sample i, the sums (1) to (7) are compared to a threshold figure which is set such that when one of the sums is above the threshold figure, the frequency offset can be estimated from which of the sums (1) to (7) was used and the time synchronisation can be estimated from the timing of the sample i which caused the sum to exceed the threshold. The skilled person will appreciate that this synchronisation technique is very efficient in terms of the processing power required in comparison to conventional synchronisation techniques because the correlation only has to be performed once. It will also be appreciated by the skilled person that the form of the sums used above can be varied to obtain different timing offsets if desired.
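A Python sketch of this accumulated-sum synchronisation follows. For a per-filter drift of s clock cycles the sum is A[i] + B[i+s] + C[i+2s] + D[i+3s], so the steps s = 0, ±1, ±2, ±3 reproduce sums (1) to (7); the first sum to exceed the threshold gives both the timing and an estimate of the frequency offset. The function name and the threshold handling are assumptions of this sketch.

```python
import numpy as np

def detect_sync(A, B, C, D, threshold, steps=(0, 1, -1, 2, -2, 3, -3)):
    """Return (sample index, per-filter drift) for the first sum above threshold."""
    A, B, C, D = (np.asarray(x, dtype=float) for x in (A, B, C, D))
    n = len(A)
    for i in range(n):
        for s in steps:
            j1, j2, j3 = i + s, i + 2 * s, i + 3 * s       # offsets for B, C, D
            if 0 <= min(j1, j2, j3) and max(j1, j2, j3) < n:
                if A[i] + B[j1] + C[j2] + D[j3] > threshold:
                    return i, s
    return None
```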
It will be appreciated by those skilled in the art that, because the acoustic signal undergoes an arbitrary phase shift between source and receiver, the output of the matched filters 211 is not always optimal. A complex filter can be used instead of each matched filter in order to remove the dependence on the phase of the incoming signal. As in this application it is the amplitude of the filter output which is most important, the complex filter could be formed by a parallel pair of matched filters each having 2048 taps and being matched to a quadrature pair of signals. The outputs of the matched filters would then be squared, summed and divided by the average output value over one data bit period, to generate a score output which can be compared with a threshold value to give an output which is independent of the input signal phase.
In another alternative acquisition circuit, the configuration illustrated in Figure 17 is used but each of the matched filters has 256 taps matched to a respective quarter of a 256 chip sequence corresponding to a SYNC bit.
In the seventh embodiment it is assumed that the chip rates of the four strongest components of the electrical signal D(t) are identical so that the same pseudo-noise code generators 191 and 193 can be used for each channel.
The person skilled in the art will, however, recognise that if the chip rates were different then the different chip rates would appear as different peaks in the output of the cross-correlators 215, and the processor could control different pseudo-noise code generators for each prong of the rake receiver to operate at different chip rates. This would, however, substantially increase the complexity of the decoder circuitry.

In the first to fifth embodiments, continuous phase frequency shift keying (CPFSK) is used to modulate the data signal onto a carrier in the centre of the audible range of frequencies, while in the sixth to eighth embodiments phase shift keying is used. It will be appreciated that the described embodiments could easily be adapted to allow other modulation techniques to be used. If a technique is used where the receiver does not precisely know the phase and frequency of the received signal, for example standard binary phase shift keying, conventional circuitry such as a Costas loop can be used to extract estimators for these parameters from the received signal.
In all the described embodiments the data signal is first spread and then subsequently modulated. It will be appreciated by the person skilled in the art that the invention could equally be applied to systems in which the data signal is modulated and then subsequently spread. Similarly, in the decoder the received signal may be demodulated then despread or despread and then demodulated.
In the first to eighth embodiments, the data signal F(t) is spread using DSSS encoding. However, the energy of a data signal can be spread over a wide range of frequencies by using techniques other than DSSS encoding. For example, as illustrated by the power spectrum shown in Figure 29, an orthogonal frequency division modulation (OFDM) technique can be used in which, for example, 256 narrow-band orthogonal carriers 321 carry identical data. These 256 narrow-band carriers are evenly distributed in the frequency range of 1 to 11kHz and thus spreading of the energy of the data signal is achieved. The original data signal can then be reconstructed by demodulating and recombining each of the narrow-band signals.
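A minimal Python sketch of this kind of multi-carrier spreading is given below: the same data bit modulates every carrier, and the carriers are spaced evenly across the band. The spacing and symbol duration used here are illustrative assumptions; in a strict OFDM scheme the carrier spacing would be the reciprocal of the symbol duration to guarantee orthogonality.

```python
import numpy as np

FS = 22050.0

def multicarrier_symbol(bit, n_carriers=256, f_low=1000.0, f_high=11000.0,
                        duration=0.0464, fs=FS):
    """Spread one data bit over n_carriers narrow-band carriers (sketch)."""
    n = int(duration * fs)
    t = np.arange(n) / fs
    freqs = np.linspace(f_low, f_high, n_carriers)    # carriers spread over the band
    sign = 1.0 if bit else -1.0                       # BPSK: identical data on every carrier
    symbol = np.zeros(n)
    for f in freqs:
        symbol += sign * np.cos(2.0 * np.pi * f * t)
    return symbol / n_carriers
```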
It will be appreciated by a person skilled in the art that still further techniques could be used to spread the energy of the data signal. For example, frequency hopping could be used in which the frequency of the modulated data signal is changed in a random manner.
Another alternative to spreading the energy of a single narrow-band data signal over a desired broad range of frequencies is to generate a small number of narrow-band carriers at the centre of the desired frequency range and then to spread each of these narrow-band carriers over the entirety of the desired frequency range using orthogonal PN codes. The power spectrum for such a scheme is illustrated in Figure 30 in which 16 narrowband carriers 323 are evenly spaced between 5512.5Hz and 6300Hz and each of the narrow-band carriers is spread using DSSS encoding with a chip rate of 5512.5Hz to form a broadband signal 325.
Alternatively, a number of narrow-band signals can be evenly spaced in the desired frequency range and each of these narrow-band signals can be individually spread to cover a sub-band of the desired frequency range. Such a system is shown in Figure 31 in which eight narrowband signals 331, each transmitting a data signal at 5 bits per second, are spaced evenly throughout the desired range. Each bit of each of the narrowband signals is multiplied by a corresponding PN code using a 256 chips per bit ratio, to form eight broadband signals 333 spread over a corresponding sub-band of the desired range. This has the advantage that the resulting spread spectrum is significantly flatter than those of the other embodiments. The PN codes used to modulate the eight different signals form an orthogonal set so that each broadband signal 333 can be despread separately. Further, each broadband signal 333 can be adjusted as a whole for the energy of the corresponding segment of audio track which reduces errors caused by non-linear filtering across the entire desired range of frequencies.
In the systems described with reference to Figures 29 to 31 a number of narrowband signals are generated. These signals can be used to carry identical data streams which are staggered in time. This has the advantage that if a segment of the data signal is lost, for example due to a loud noise which does not form part of the audio signal 21 output by the loudspeaker 19, then the data in the lost segment will be repeated in different channels subsequently.
As mentioned above, the use of a longer PN code sequence has the effect of reducing the obtrusiveness of the broad band signal H(t) in the modified audio track. Instead of using a pseudo-noise code generator which generates a long sequence, for example a 10 bit code or a 12 bit code, it is possible to use a number of orthogonal shorter sequences, for example 8 bit codes, linked end to end to form a long sequence.
In the sixth to eighth embodiments, the output of each multiplier 199, 201 is input to a bandpass filter followed by a power monitor. Alternatively, the output of each multiplier 199, 201 could be input to a quadrature mixer and the output of the quadrature mixer then input to a low-pass filter with a time constant of the order of the duration of a data bit of the data signal F(t). The output of the low-pass filter can then be used as an input signal to the comparator 271. This provides a computationally efficient implementation.
In the sixth to eighth embodiments, the processor decides, based on the signal TRACK, after every clock cycle which of the early, on-time and late channels produces the best signal. However, in order to reduce the processing load, the processor could alternatively make this decision over longer intervals, for example intervals corresponding to one chip, one data bit or one repetition of the chip sequences PN1 and PN2.
As explained in the sixth embodiment, multiple paths between the loudspeaker 19 and microphone 25 cause multiple peaks to appear in each frame of the output of the normalisation circuit, as shown in Figure 24. However multiple peaks can also be formed by deliberately combining two time offset broadband signals with the audio track. For example, if the audio track conveys a stereo signal then two identical broadband signals with a time offset of, for example, 150ms can be generated and each broadband signal can be added to a respective one of the two channels. This has the advantage of adding an additional level of time diversity which enables a more robust regeneration of the data signal. Alternatively, two different broadband signals could be generated with each one being added to a respective channel of the audio track.
In the seventh embodiment each of the signal components is input to a separate correlator unit 195 and the corresponding outputs of each correlator unit 195 are added together. Alternatively, a different rake receiver arrangement could be used in which the signals output by the four time delay units 283 can be added together and input to a single correlator unit where despreading takes place.
In a preferred alternative implementation of the rake receiver, each of the adders 285 weights the output of the corresponding correlator in accordance with the strength of the signal component processed by that correlator. In this way the strongest signal component, which should provide the most accurate data, is given more weight than the weaker components. The peak scores for each of the components calculated by the acquisition unit can be used to generate these weighting factors.
It will be appreciated that the exact number of prongs in the rake receiver can be varied. For example, rake receivers having two or six prongs could be used.

In the fourth and fifth embodiments, psycho-acoustic encoding is performed in which a psycho-acoustic algorithm is used to determine minimum audible levels for a number of frequency sub-bands of a segment of the audio track based on the energy in each frequency sub-band of that segment and preceding segments, and this information is used to obtain scaling factors for the frequency sub-bands. In order to reduce the required processing a simpler algorithm can be applied, although this will result in a more noticeable broadband signal. For example the energy in the preceding segments could be ignored. Further, instead of calculating minimum audible levels, the scaling factors could be calculated so that the power ratio in each frequency band between the audio track and the broadband signal H(t) is held constant. Alternatively, the frequency analysis could be removed so that the algorithm calculates scaling factors based on the entire energy in preceding segments.
In the third embodiment, the decoder 105 includes a power monitor 107 for detecting the power in the electrical signal D(t) and a de-emphasis circuit for scaling the electrical signal in response to the detected power to reverse the scaling carried out in the encoder 95. This conforms with the conventional communications principle that for any shaping performed in the transmitter stage, a corresponding reshaping, inverse to the shaping, is performed in the receiver stage. However, the inventors have found that the power monitor 107 and de-emphasis circuit 109 are not essential and the digital signal output by the ADC 67 can be input directly to the demodulator 69. Thus, a simpler decoder such as the decoder 89 of the first embodiment could be used. This is a significant advantage because in most commercial embodiments of the system, the number of decoders will greatly outnumber the number of encoders and therefore it is desirable to keep the cost of each decoder as low as possible. Similarly, the psycho-acoustic analysis unit 133, scaling unit 135 and time delay unit 137 in the decoder 131 of the fourth embodiment and the psycho-acoustic analysis unit 155, scaling unit 157 and time delay unit 159 in the decoder 151 of the fifth embodiment are not essential. By removing the need for the decoder to carry out psycho-acoustic analysis, the amount of processing required to be done by the decoder is significantly reduced which substantially reduces the cost of each decoder.
In the fifth embodiment, the psycho-acoustic analysis unit 155 of the decoder 151 determines scale factors for each frequency sub-band of a segment of the electrical signal D(t) which are inverse to those used in the encoder. It has been found that if the encoder 141 splits the broadband signal H(t) into segments corresponding to an integer number of the original data bits in the data signal F(t), then the original data signal can be regenerated with a high performance level without carrying out psycho-acoustic analysis and de-emphasis in the decoder.
In the fourth and fifth embodiments, the psycho-acoustic analysis unit calculates theoretical minimum audible levels for ten frequency sub-bands. In practice, a larger number of frequency sub-bands can be used, for example two thousand and forty-eight. However, increasing the number of frequency sub-bands will increase the processing load to be performed by the psycho-acoustic analysis unit.
It will be appreciated that all the above-described shaping techniques, including those in the second to fifth embodiments, could be applied to systems employing synchronous multiplication to despread the broadband signal such as those described in the sixth to eighth embodiments.
It will also be appreciated that conventional equalisation techniques can be applied in the decoder to improve the bit error rate in the presence of multi-path components. Further, an automatic gain control circuit could be included at the input of the decoder. It will also be appreciated that standard techniques of error management could be applied in the encoders and the decoders according to the invention.
The precise values of the bit rates, chip rates, sampling rates and modulation frequencies described in the detailed embodiments are not essential features of the invention and can be varied without departing from the invention. Further, while in the described embodiments the data signal is a binary signal, the data signal could be any narrowband signal, for example a modulated signal in which frequency shift keying has been used to represent a "1" data bit by a first frequency and a "0" data bit by a second, different frequency.
Although digital signal processing techniques have been described as the preferred implementation of the invention, analog processing techniques could be used instead. It will be appreciated by the person skilled in the art that if in the encoder a digital processing operation is performed to combine the audio track and the broadband signal, it is preferable that the sampling rate be as high as possible to preserve the original quality of the audio.
All the above embodiments have been described with reference to the communication system illustrated in
Figure 1, in which an interactive toy 23 decodes a data signal encoded in an audio signal output by a television set 15 and, in response to the data signal, outputs a sound stored in a memory in the interactive toy 23 which can be heard by a user. Alternatively, the data signal could convey information enabling a speech synthesiser located in the interactive toy 23 to produce a desired sound, for example a word or phrase. Alternatively, the interactive toy 23 could display information on a screen or part of the interactive toy 23 could move in response to the encoded data signal.
It will be appreciated that the television signal need not be broadcast using a transmitter 5 but could be sent to the television set 15 along a cable network. It will also be appreciated that the same techniques could be applied to a radio signal, whether broadcast using a transmitter or sent along a cable network. Further these techniques can be applied to a point-to-point communication system as well as broadcast systems. Further, conventional encryption techniques could be used so that the television or radio signal could only be reproduced after processing by decryption circuitry.
As another alternative, the television signal could be stored on a video cassette, a digital video disk (DVD) or the like. In this way, no signal is transmitted through the atmosphere or through a cable network but rather the television signal is stored on a recording medium which can subsequently be replayed to a user on the user's television set 15. In one application a user can buy a video cassette to be played on a television set 15 using a video player together with a detachable memory for the interactive toy 23 which stores instructions for the interactive toy 23 which are related to the television signal stored on the video cassette. Similarly, data could be encoded in a purely audio signal stored on a recording medium such as an audio cassette, a compact disc (CD) or the like.
As those skilled in the art will appreciate, the sampling rate of 22.05kHz used in the decoders of the first to sixth embodiments is exactly half that used for compact discs and therefore the encoders and decoders described for these embodiments are suitable for use in systems where a data signal is conveyed by an audio track recorded on a compact disc.

All the above-described embodiments are simplex communication systems in which a data signal is transmitted from a transmitter to a receiver without a signal being returned from the receiver to the transmitter. However, the present invention can equally be applied to a duplex communication system in which data signals are transmitted in both directions between two audio transmitter-and-receiver circuits (which will hereinafter be referred to as audio transceivers). An example of such a duplex communication system is illustrated in Figure 33 in which data signals are transmitted in both directions between an interactive toy 221 and a multimedia computer 223 while the interactive toy outputs background music (which reduces the audibility of the audio signals transmitted between the interactive toy 221 and the computer 223).
As shown in Figure 33, the multimedia computer 223 has a display 225, keyboard 227, and an audio transceiver unit 229. A computer memory 231 storing a computer program containing instructions can be inserted into the computer 223 via a slot 233. The audio transceiver unit 229 contains an encoder circuit for spread spectrum encoding and modulating a data signal to form a broadband signal at audio frequencies, a loudspeaker for outputting the broadband signal as an audio signal 243, a microphone for detecting an audio signal transmitted by the interactive toy 221 and a decoder circuit for extracting a data signal encoded in the audio signal detected by the microphone. The interactive toy 221 also has an audio transceiver unit 235 having an encoder circuit, a loudspeaker, a microphone and a decoder circuit identical to those in the audio transceiver unit 229. The toy 221 also has a user input device 237 having four buttons 239 which can be independently lit up.
In use, a user 241 inserts the computer memory 231 into the slot 233 of the computer 223 and runs the stored computer program, which causes the computer 223 to generate a data signal indicating a sequence in which the buttons 239 of the user input device 237 of the interactive toy 221 are to be lit up. This data signal is sent to the audio transceiver unit 229, where it is encoded and output as an audio signal 243. The audio signal 243 output by the audio transceiver unit 229 of the computer 223 is detected and decoded by the audio transceiver unit 235 of the interactive toy 221. The buttons 239 of the user input device 237 are then lit up in the order indicated by the data signal encoded in the audio signal 243 transmitted from the computer 223 to the interactive toy 221.
When the buttons 239 have been lit up in the indicated order, the user 241 attempts to press the buttons 239 in the same order as they were lit up. An audio signal 243 is then output by the audio transceiver unit 235 of the interactive toy 221 having encoded therein details of the order in which the user 241 pressed the buttons 239. This audio signal 243 is detected and decoded by the audio transceiver unit 229 of the computer 223 and the resulting data signal is processed by the computer 223. In this way the computer 223 is able to keep a record of the success rate of the user 241 and obtain statistical data as to how the user 241 improves over time and whether there are any particular sequences of buttons which the user finds difficult to reproduce.
Figure 34 shows in more detail the audio transceiver unit 235 in the interactive toy 221. As shown, an audio signal received by the microphone 251 is converted into an electrical signal which is input to a filter 255 to remove unwanted frequencies. The filtered signal is then input to a demodulator 257 and the output of the demodulator 257 is input to an analogue-to-digital converter (ADC) 259 which converts the demodulated signal into a digital signal. The digital signal output by the ADC 259 is then input to a correlator 261 which despreads the digital signal. In this embodiment the correlator 261 is a digital matched filter, as in the first to fourth embodiments, whose parameters are set by the output of a pseudo-noise code generator 263. The regenerated data signal output by the correlator 261 is input to a processor 265. A memory 267 is connected to the processor 265 for storing processing instructions and to provide working memory.
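The despreading performed by a matched-filter correlator of this kind can be illustrated, purely as a sketch and not as the circuit of Figure 34 itself, by correlating the digitised signal against the pseudo-noise chip sequence; the 256-chip code, the noise level and the sample alignment below are assumptions chosen for the illustration, and a peak in the correlation output marks each correctly aligned data element.

    import numpy as np

    def matched_filter_despread(samples, pn_chips):
        # A matched filter for the chip sequence is the time-reversed code;
        # convolving the received samples with it computes the correlation.
        return np.convolve(samples, pn_chips[::-1], mode="valid")

    rng = np.random.default_rng(0)
    pn_chips = rng.choice([-1.0, 1.0], size=256)                   # illustrative chip sequence
    data_bits = np.array([1.0, -1.0, 1.0])
    spread = np.concatenate([bit * pn_chips for bit in data_bits])
    received = spread + 0.5 * rng.standard_normal(spread.size)     # additive channel noise

    correlation = matched_filter_despread(received, pn_chips)
    print(np.sign(correlation[::256]))                             # recovers [ 1. -1.  1.]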
The processor 265 determines from the regenerated data signal the sequence in which the buttons 239 are to be lit up and outputs a signal to the user input device 237 to light up the buttons 239 accordingly. When the user 241 subsequently attempts to repeat the sequence, the user input device 237 sends details of the buttons 239 pressed to the processor 265, which generates a data signal conveying this information. This data signal is then input to a multiplier 269 where it is multiplied by the pseudo-noise code output by the pseudo-noise code generator 263 in order to spread the power of the data signal over a broad range of frequencies. The spread signal output by the multiplier 269 is then input to a modulator 271 which centres the power of the main band of the spread signal at 5512.5Hz. The modulated signal output by the modulator 271 is then input to the audio mixer 273 where it is combined, in a simple adding operation, with background music output by an audio source 275. The combined signal output by the audio mixer 273 is then converted into an audio signal by the loudspeaker 253.
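This transmit path can be sketched in outline as follows; the sketch is illustrative only, and the sampling rate, number of samples per chip, chip sequence and mixing level are assumptions rather than parameters of the embodiment.

    import numpy as np

    FS = 22050.0                  # assumed sampling rate
    CARRIER_HZ = 5512.5           # frequency at which the modulator centres the main band
    SAMPLES_PER_CHIP = 4

    rng = np.random.default_rng(1)
    pn_code = rng.choice([-1.0, 1.0], size=256)                        # illustrative chip sequence

    def transmit(data_bits, background_music, level=0.05):
        chips = np.concatenate([bit * pn_code for bit in data_bits])   # spreading (multiplier 269)
        baseband = np.repeat(chips, SAMPLES_PER_CHIP)                  # hold each chip for several samples
        t = np.arange(baseband.size) / FS
        passband = baseband * np.cos(2 * np.pi * CARRIER_HZ * t)       # modulation (modulator 271)
        out = background_music[:passband.size].copy()
        out += level * passband                                        # simple adding operation (mixer 273)
        return out

    music = 0.3 * np.sin(2 * np.pi * 440.0 * np.arange(int(FS)) / FS)  # stand-in background music
    tx = transmit(np.array([1.0, -1.0, 1.0, 1.0]), music)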
The audio transceiver unit 229 connected to the computer 223 includes circuitry identical to that contained within the dashed block in Figure 34 and a more detailed description of this transceiver unit 229 will therefore be omitted.
It will be appreciated that the background music could be generated by the audio transceiver unit 229 of the computer 223, with the encoded version of the data signal produced by the computer 223 being combined with the background music. However, it is preferred that the background music be output by the interactive toy 221 because the interactive toy 221 will generally be nearer to the user 241 in use and therefore the power of the background music signal can be reduced.
The above embodiments have all been described in relation to an application in which an interactive toy 23 responds to a signal encoded in an audio track. However, the audio communication techniques described hereinbefore are broadly applicable to a wide range of applications, further examples of which are given below.
In one application, as shown in Figure 32, a conventional CD 205 has recorded thereon an audio track with a data signal encoded in the audio track. The audio track is output by a pair of speakers 207 of a music centre 209 in the form of an audio signal 211 which is received by a microphone 213 forming part of a lighting system 215. Incorporated in the lighting system 215 is a decoder, like those described above, for decoding the data signal encoded in the audio signal 211. This data signal conveys instructions determining which of the lights 217a, 217b and/or 217c of the lighting system 215 are turned on at any one time. Therefore the lights 217 of the lighting system 215 can be made to react to the music in accordance with data programmed on the compact disc 205.
The audio communication techniques could also be used to distribute information to intelligent home appliances. For example, if a programme about food is broadcast on television or radio, the recipes discussed could be encoded in the audio track and detected by a microphone of a computer, which stores the recipes for future access by a user. Alternatively, news headlines or sports results could be encoded into the audio track of a television or radio signal to be detected by the computer and displayed on a screen, either automatically or at the command of a user. The audio communication system could also provide a distribution channel for paging information.
It will be appreciated that the term audio track refers to information which is intended to be reproduced as an audio signal by a loudspeaker in the audible range of frequencies, which typically spans from 20Hz to 20,000Hz. An audio signal is formed by pressure waves in the air that can be detected by an electro-acoustic transducer. Such pressure waves can alternatively be referred to as acoustic waves and audio signals can alternatively be referred to as acoustic signals.
As described previously, prior to an appliance converting the audio track to an audio signal, the audio track is conveyed to that appliance using any of a number of techniques, for example via a wireless broadcast, a cable network or a recording medium. It is envisaged that the transmission over the internet of audio tracks encoded with data using the audio communication techniques described hereinbefore will have many applications.
A particular advantage of encoding data onto an audio track is that the bandwidth required to transmit the audio track together with the data signal is no more than that required to transmit the audio track alone. The audio encoding techniques described hereinbefore could therefore be used for applications where the appliance which converts the audio track into an audio signal also decodes the data signal embedded in the audio track and reacts in some manner to the decoded data signal. For example, subtitles could be encoded in the audio track of a television signal, and a television set having a suitable decoder can decode the subtitles and display them on the screen. There is no need to remove the data signal from the audio track before converting the audio track into an audio signal because the data signal is not noticeable to a listener of the audio signal.
The present invention is not intended to be limited to the above described embodiments. Other modifications and embodiments would be apparent to those skilled in the art.

Claims

1. A toy system comprising an encoder for encoding a data signal to form a spread signal, an electro-acoustic transducer for converting the spread signal into a corresponding acoustic signal, and a toy responsive to the data signal within the spread signal, wherein: the encoder comprises: (i) means for receiving the data signal; (ii) means for spreading the received data signal to form a spread signal; and (iii) output means for outputting the spread signal, and wherein the toy comprises: (i) an electro-acoustic transducer for receiving and for converting the acoustic signal into a corresponding electrical signal; (ii) a decoder for de-spreading the electrical signal in order to regenerate the data signal; and (iii) response means responsive to the data signal.
2. A toy system according to claim 1, wherein the encoder further comprises: means for receiving an audio track; means for combining the spread signal with the audio track to generate a modified audio track; and means for outputting the modified audio track.
3. A toy system according to either claim 1 or 2, wherein the encoder further comprises modulating means for modulating the data signal before it is spread or for modulating the spread data signal.
4. A toy system according to any of claims 1 to 3, wherein the means for spreading the received data signal comprises a first pseudo-noise code generator operable to generate a first pseudo-noise code comprising a sequence of chips, and is operable to perform direct sequence spread spectrum encoding using the first pseudo-noise code.
5. A toy system according to claim 4, wherein the first pseudo-noise code generator is operable to generate a 12 bit code having 4095 chips.
6. A toy system according to either claim 4 or 5, wherein the spreading means is operable to combine each data element of the data signal with a part of the first pseudo-noise code.
7. A toy system according to claim 6, wherein the spreading means is arranged to multiply each data element of the data signal by a sequence of two hundred and fifty-six chips of the first pseudo-noise code.
8. A toy system according to any of claims 4 to 7, wherein the spreading means further comprises a second pseudo-noise code generator operable to generate a second pseudo-noise code which is different to the first pseudo-noise code, and the modulating and spreading means is arranged to combine each data element of the data signal with a chip sequence from either the first pseudo-noise code or the second pseudo-noise code in dependence upon the value of the data element.
9. A toy system according to claim 8, wherein the second pseudo-noise code generator is operable to generate a second pseudo-noise code orthogonal to the first pseudo-noise code.
10. A toy system according to any preceding claim, wherein the encoder further comprises scaling means for scaling the spread signal.
11. A toy system according to claim 10, wherein the decoder further comprises de-scaling means for removing the scaling applied by the scaling means.
12. A toy system according to either claim 10 or 11, wherein the scaling means is operable to perform a frequency dependent scaling.
13. A toy system according to claim 12, wherein the scaling means is operable to increase the proportion of the energy at lower frequencies.
14. A toy system according to claim 13, wherein the scaling means is arranged to apply a frequency-dependent scaling function having a frequency dependence between 1/f and 1/f², where f is the frequency.
15. A toy system according to claim 13, wherein the scaling function is approximately inverse to the sensitivity of a human ear.
16. A toy system according to any of claims 10 to 15, wherein the encoder further comprises a power monitor operable to output a signal indicative of the power in the audio track to the scaling means, and wherein the scaling means is operable to vary the applied scaling in dependence upon the power signal output by the power monitor.
17. A toy system according to claim 16, wherein the scaling means is operable to adjust the power in the spread signal to be a fixed amount below the power in the audio track, unless the power of the audio track is below a threshold in which case the power of the spread signal is set at a predetermined level.
18. A toy system according to either claim 10 or 11, wherein the encoder further comprises a psycho-acoustic analysis system for determining theoretical minimum audible sound levels in the presence of the audio track, and wherein the scaling means is operable to scale the spread signal in accordance with the determined theoretical minimum audible sound levels.
19. A toy system according to claim 18, wherein the scaling means is operable to scale the power of the spread signal to be at or above the determined minimum audible sound level.
20. A toy system according to claim 19, wherein the scaling means is operable to scale the power of the spread signal to be a predetermined amount above the theoretical minimum audible level.
21. A toy system according to any of claims 18 to 20, wherein the psycho-acoustic analysis system is operable to analyse the audio track in segments whose duration corresponds to the duration of an integer number of data elements of the data signal, and wherein the encoder is operable: (i) to scale a portion of the spread signal corresponding to one data element of the data signal in accordance with the minimum audible sound level calculated for a segment of the audio track; and (ii) to subsequently combine said portion of the spread signal with said segment of the audio track.
22. A toy system according to claim 21 when dependent on claim 9, wherein the decoder does not include a de-scaling unit.
23. A toy system according to any of claims 18 to 22, wherein the psycho-acoustic analysis unit is operable to generate frequency-dependent scaling factors corresponding to a segment of the audio track in accordance with the frequency spectrum of that segment.
24. A toy system according to any of claims 4 to 23, further comprising demodulating means for demodulating the electrical signal, wherein the de-spreading means comprises a third pseudo-noise code generator operable to generate a third pseudo-noise code identical to the first pseudo-noise code, and the de-spreading means is operable to synchronously multiply the demodulated signal by the third pseudo-noise code to form the de-spread signal.
25. A toy system according to any of claims 4 to 23, wherein the decoder of the toy comprises: demodulating means for demodulating the de-spread signal to form a demodulated signal, wherein the de-spreading means comprises a third pseudo-noise code generator operable to generate a third pseudo-noise code identical to the first pseudo-noise code, and the de-spreading means is operable to synchronously multiply the electrical signal by the third pseudo-noise code to form the de-spread signal.
26. A toy system according to claim 24 or 25, wherein the decoder comprises a rake receiver having a plurality of prongs, and the decoder is operable to introduce different time delays between the electrical signal and the third pseudo-noise code in each prong of the rake receiver, in order to de-spread different components of the electrical signal.
27. A toy system according to any of claims 24 to 26, wherein the decoder further comprises a synchronisation circuit for synchronising the third pseudo-noise code with a code sequence conveyed by the electrical signal.
28. A toy system according to claim 27, wherein the synchronisation circuit comprises: a correlator for generating a time-varying output dependent on the similarity of a chip sequence conveyed by the electrical signal and a predetermined chip sequence; and a normalisation circuit operable to scale the time-varying output of the correlator by a normalisation factor determined from the average value of the time-varying output over a predetermined period of time.
29. A toy system according to claim 28, wherein the normalisation circuit comprises means for calculating a running average of the time-varying output over the predetermined period of time.
30. A toy system according to claim 27, wherein the synchronisation circuit comprises: a correlator for generating a time-varying output by correlating a chip sequence conveyed by the electrical signal and a predetermined chip sequence; a cross-correlator for cross-correlating the output of the correlator over a first time period with the output of the correlator over a second time period; and means for determining a frequency offset between the frequency at which the third pseudo-noise code generator generates the third pseudo-noise code and the frequency of the code sequence conveyed by the electrical signal from the output of the cross-correlator.
31. A toy system according to claim 27, wherein the synchronisation circuit comprises: a correlator for generating a time-varying output by correlating a chip sequence conveyed by the electrical signal and a predetermined chip sequence; a cross-correlator for cross-correlating the output of the correlator over a first time period with the output of the correlator over a second time period; and means for determining the difference between the chip rate of the predetermined chip sequence and the chip rate of the chip sequence conveyed by the electrical signal from the output of the cross-correlator.
32. A toy system according to either claim 30 or 31, further comprising a normalisation circuit operable to scale the time-varying output of the correlator by a normalisation factor determined from the average value of the time-varying output over a predetermined period of time.
33. A toy system according to claim 32, wherein the normalisation circuit comprises means for calculating a running average of the time-varying output over the predetermined period of time.
34. A toy system according to claim 27, wherein the synchronisation circuit comprises: a plurality of correlators, each correlator arranged to generate a time-varying output by correlating a chip sequence conveyed by the electrical signal and a respective predetermined chip sequence; and means for controlling the third pseudo-noise code generator in accordance with the outputs of the plurality of correlators, wherein the respective predetermined chip sequences have the same chip rate.
35. A toy system according to claim 34, wherein the plurality of correlators are cascaded in series.
36. A toy system according to claim 34, wherein the plurality of correlators are connected in parallel.
37. A toy system according to any of claims 34 to 36, further comprising a plurality of normalisation circuits, each normalisation circuit being operable to scale the time-varying output of a respective one of the plurality of correlators by a normalisation factor determined from the average of the time-varying output of that correlator over a predetermined period of time.
38. A toy system according to claim 37, wherein the normalisation circuit comprises means for calculating a running average of the time-varying output over the predetermined period of time.
39. A toy system according to any of claims 27 to 38, wherein the synchronisation circuit further comprises: a plurality of cross-correlators, each cross-correlator being arranged to cross-correlate the output of a respective one of the plurality of correlators over a first time period with the output of that respective correlator over a second time period; means for adding the outputs of each of the cross-correlators; and means for determining a frequency offset between the frequency at which the third pseudo-noise code generator generates the third pseudo-noise code and the frequency of the spreading code in the electrical signal from the output of the adding means.
40. A toy system according to any of claims 28 to 39, wherein the or each correlator is formed by a matched filter.
41. A toy system according to any of claims 1 to 40, wherein the response means is operable to generate an output that is discernible to human beings.
42. A toy system according to claim 41, wherein the response means is operable to cause the toy to output an acoustic signal determined using the data signal.
43. A toy system according to claim 42, wherein the response means comprises a processor operable to select one of a plurality of sound files stored in a memory, and to output the selected sound file via an electro-acoustic transducer.
44. A toy system according to claim 43, wherein the memory is detachable.
45. A toy system according to claim 41, wherein the response means is arranged to cause the toy to display a visual signal determined using the data signal.
46. A toy system according to claim 41, wherein the response means is operable to cause a movement of the toy.
47. A toy system according to claim 41, wherein the response means is operable to cause a movement of part of the toy relative to the rest of the toy.
48. A toy system according to any preceding claim, wherein the toy further comprises: means for generating a data signal; means for spreading the generated data signal to form a spread signal; and an electro-acoustic transducer for receiving and converting the spread signal into an acoustic signal.
49. A toy system according to any preceding claim, wherein the encoder forms part of a television broadcast system, and the electro-acoustic transducer is formed by a loudspeaker of a television set.
50. A toy system according to claim 49 when dependent on claim 2, wherein the audio track is the audio track of a television programme, and the data signal is operable to enable the toy to interact with the television programme.
51. A toy system according to any of claims 1 to 48 when dependent on claim 2, in which the modified audio track is recorded on a recording medium, and the toy system further comprises a reproducing apparatus, including the electro-acoustic transducer, for reproducing the modified audio track stored on the recording medium.
52. A toy system according to claim 51, wherein the recording medium is a compact disc.
53. A toy system according to claim 51, wherein the recording medium is a video cassette.
54. An encoder for the toy system as claimed in any preceding claim, the encoder comprising: means for receiving the data signal; means for spreading the received data signal to form a spread signal; and output means for outputting the spread signal.
55. An encoder according to claim 54, further comprising any of the technical encoder features claimed in claims 2 to 50.
56. A toy comprising: an electro-acoustic transducer for receiving and for converting an acoustic signal into an electrical signal; and a decoder for decoding a data signal conveyed by the acoustic signal, the decoder comprising means for de-spreading the electrical signal and means for regenerating the data signal from the de-spread signal.
57. A toy according to claim 56, further comprising any of the toy features of claims 2 to 53.
58. A method of controlling a toy, the method comprising the steps of: generating a control signal at a control centre; spreading the control signal; combining the control signal with an audio track to form a modified audio track; transmitting the modified audio track to an electro-acoustic transducer in the vicinity of the toy; converting the modified audio track into an acoustic signal at the electro-acoustic transducer; converting the acoustic signal received by the toy into an electrical signal; de-spreading the electrical signal to generate a de-spread signal; regenerating the control signal from the de-spread signal; and controlling the toy in accordance with the control signal.
59. A method of transferring data to a domestic appliance, the method comprising the steps of: generating a data signal; spreading the data signal to form a spread signal; generating an acoustic signal conveying the spread signal; detecting the acoustic signal at the domestic appliance; converting the acoustic signal into an electrical signal in the domestic appliance; and de-spreading the electrical signal in order to regenerate the data signal.
60. A communication system comprising an encoder for combining a data signal with an audio track to form a modified audio track, an electro-acoustic transducer for converting the modified audio track into a corresponding acoustic signal, and a receiver responsive to the data signal within the modified audio track, wherein: the encoder comprises: (i) first receiving means for receiving the audio track; (ii) second receiving means for receiving the data signal; (iii) means for spreading the received data signal to form a spread signal; (iv) means for combining the received audio track and the spread signal to generate the modified audio track; and (v) output means for outputting the modified audio track, and wherein the receiver comprises: (i) an electroacoustic transducer for receiving and for converting the acoustic signal into a corresponding electrical signal; (ii) a decoder for de-spreading the electrical signal in order to regenerate the data signal; and (iii) response means responsive to the data signal.
PCT/GB2000/002961 1999-07-30 2000-07-31 Acoustic communication system WO2001010065A1 (en)

Priority Applications (8)

Application Number Priority Date Filing Date Title
AT00949759T ATE270478T1 (en) 1999-07-30 2000-07-31 ACOUSTIC TRANSMISSION SYSTEM
AU63032/00A AU6303200A (en) 1999-07-30 2000-07-31 Acoustic communication system
JP2001513842A JP2003506918A (en) 1999-07-30 2000-07-31 Acoustic communication system
DE60011909T DE60011909D1 (en) 1999-07-30 2000-07-31 ACOUSTIC TRANSMISSION SYSTEM
US09/856,734 US7505823B1 (en) 1999-07-30 2000-07-31 Acoustic communication system
EP00949759A EP1205045B1 (en) 1999-07-30 2000-07-31 Acoustic communication system
AU12904/01A AU1290401A (en) 1999-11-11 2000-11-13 Acoustic location system
PCT/GB2000/004326 WO2001034264A1 (en) 1999-11-11 2000-11-13 Acoustic location system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB9917985.5A GB9917985D0 (en) 1999-07-30 1999-07-30 Acoustic communication system
GB9917985.5 1999-07-30

Publications (1)

Publication Number Publication Date
WO2001010065A1 true WO2001010065A1 (en) 2001-02-08

Family

ID=10858282

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2000/002961 WO2001010065A1 (en) 1999-07-30 2000-07-31 Acoustic communication system

Country Status (8)

Country Link
US (1) US7505823B1 (en)
EP (1) EP1205045B1 (en)
JP (1) JP2003506918A (en)
AT (1) ATE270478T1 (en)
AU (1) AU6303200A (en)
DE (1) DE60011909D1 (en)
GB (1) GB9917985D0 (en)
WO (1) WO2001010065A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2889347A1 (en) * 2005-09-20 2007-02-02 Jean Daniel Pages Composite signal producing method for producer, involves encoding sound signal into encoded sound signal, by frequency shifting former signal towards high inaudible frequencies, so that encoded signal is diffusible in inaudible manner
EP1947471A1 (en) * 2007-01-16 2008-07-23 Harman Becker Automotive Systems GmbH Tracking system using audio signals below hearing threshold
CN101601076A (en) * 2007-01-11 2009-12-09 小山有 Warning display system
US7796978B2 (en) 2000-11-30 2010-09-14 Intrasonics S.A.R.L. Communication system for receiving and transmitting data using an acoustic data channel
EP2362387A1 (en) * 2010-02-26 2011-08-31 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. Watermark generator, watermark decoder, method for providing a watermark signal in dependence on binary message data, method for providing binary message data in dependence on a watermarked signal and computer program using a differential encoding
US8498860B2 (en) 2005-10-07 2013-07-30 Ntt Docomo, Inc. Modulation device, modulation method, demodulation device, and demodulation method
US8560913B2 (en) 2008-05-29 2013-10-15 Intrasonics S.A.R.L. Data embedding system
WO2014072742A1 (en) 2012-11-09 2014-05-15 Camelot Strategic Solutions Limited Improvements relating to audio visual interfaces
US8737292B2 (en) 2007-11-06 2014-05-27 Icera, Inc. Frequency offset estimation in a CDMA system
WO2017055434A1 (en) * 2015-09-30 2017-04-06 British Telecommunications Public Limited Company Call recording
US9928728B2 (en) 2014-05-09 2018-03-27 Sony Interactive Entertainment Inc. Scheme for embedding a control signal in an audio signal using pseudo white noise

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7502759B2 (en) * 1999-08-30 2009-03-10 Digimarc Corporation Digital watermarking methods and related toy and game applications
EP1279239B1 (en) * 2000-05-01 2005-08-10 Telefonaktiebolaget LM Ericsson (publ) Matched filter and receiver for mobile radio communication system
US7412294B1 (en) * 2001-12-21 2008-08-12 Woolfork C Earl Wireless digital audio system
EP1542227A1 (en) * 2003-12-11 2005-06-15 Deutsche Thomson-Brandt Gmbh Method and apparatus for transmitting watermark data bits using a spread spectrum, and for regaining watermark data bits embedded in a spread spectrum
JP4896455B2 (en) * 2005-07-11 2012-03-14 株式会社エヌ・ティ・ティ・ドコモ Data embedding device, data embedding method, data extracting device, and data extracting method
JP2007228175A (en) * 2006-02-22 2007-09-06 Ntt Docomo Inc Acoustic signal transmission system, modulation device, demodulation device, and acoustic signal transmitting method
JP4024285B1 (en) * 2007-01-11 2007-12-19 有 小山 Alarm display system
WO2010016589A1 (en) * 2008-08-08 2010-02-11 ヤマハ株式会社 Modulation device and demodulation device
JP5504727B2 (en) * 2009-05-12 2014-05-28 ヤマハ株式会社 Modulation and demodulation method and modulation and demodulation system
US8548613B2 (en) * 2009-04-20 2013-10-01 Disney Enterprises, Inc. System and method for an interactive device for use with a media device
WO2010138776A2 (en) * 2009-05-27 2010-12-02 Spot411 Technologies, Inc. Audio-based synchronization to media
JP5446567B2 (en) * 2009-08-07 2014-03-19 ヤマハ株式会社 Approach warning sound generator
JP5485641B2 (en) * 2009-10-05 2014-05-07 株式会社バンダイナムコゲームス Program and information storage medium
JP5304593B2 (en) * 2009-10-28 2013-10-02 ヤマハ株式会社 Acoustic modulation device, transmission device, and acoustic communication system
JP5532900B2 (en) * 2009-12-16 2014-06-25 ヤマハ株式会社 Information transmission device using sound
JP5487989B2 (en) * 2010-01-19 2014-05-14 ヤマハ株式会社 Information transmission device using sound
US8821208B2 (en) 2010-03-26 2014-09-02 Generalplus Technology Inc. Apparatus, method and system for interacting amusement
JP5573298B2 (en) * 2010-03-31 2014-08-20 ヤマハ株式会社 Sound emission device
WO2012098432A1 (en) * 2011-01-20 2012-07-26 Nokia Corporation An audio alignment apparatus
JP5664419B2 (en) * 2011-04-05 2015-02-04 ヤマハ株式会社 Acoustic communication system
US20140135965A1 (en) * 2011-05-02 2014-05-15 Re-10 Ltd. Apparatus, systems and methods for production, delivery and use of embedded content delivery
US8930005B2 (en) * 2012-08-07 2015-01-06 Sonos, Inc. Acoustic signatures in a playback system
US9641363B2 (en) 2013-08-29 2017-05-02 Wham City Lights, Inc System for triggering actions on computing devices via audio signals
WO2015171452A1 (en) * 2014-05-09 2015-11-12 Sony Computer Entertainment Inc. Scheme for embedding a control signal in an audio signal using pseudo white noise
EP2950467B1 (en) 2014-05-26 2018-07-25 Pepperl + Fuchs GmbH Acoustic data transmission system
US9319096B1 (en) * 2014-05-27 2016-04-19 Google Inc. Ultrasonic communication between devices
JP5887446B1 (en) 2014-07-29 2016-03-16 ヤマハ株式会社 Information management system, information management method and program
JP5871088B1 (en) 2014-07-29 2016-03-01 ヤマハ株式会社 Terminal device, information providing system, information providing method, and program
JP6484958B2 (en) 2014-08-26 2019-03-20 ヤマハ株式会社 Acoustic processing apparatus, acoustic processing method, and program
GB201421607D0 (en) * 2014-12-04 2015-01-21 Gill Corporate Ltd Apparatus and a method for providing a time measurement
JP6586514B2 (en) * 2015-05-25 2019-10-02 ▲広▼州酷狗▲計▼算机科技有限公司 Audio processing method, apparatus and terminal
FI3696813T3 (en) * 2016-04-12 2023-01-31 Audio encoder for encoding an audio signal, method for encoding an audio signal and computer program under consideration of a detected peak spectral region in an upper frequency band
FR3052614B1 (en) * 2016-06-13 2018-08-31 Raymond MOREL RANDOM ACOUSTIC SIGNAL ENCODING METHOD AND TRANSMISSION METHOD THEREOF
JP6805819B2 (en) * 2016-12-28 2020-12-23 日本電気株式会社 Terminal devices, information provision systems, terminal device processing methods, and programs
BR102017017799A2 (en) * 2017-08-18 2019-03-19 Nearbytes Tecnologia Da Informação Ltda. PERFECT METHOD FOR COMMUNICATION OF DATA BETWEEN DEVICES THROUGH SOUND WAVES

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5136613A (en) * 1990-09-28 1992-08-04 Dumestre Iii Alex C Spread Spectrum telemetry
US5412620A (en) * 1993-03-24 1995-05-02 Micrilor, Inc. Hydroacoustic communications system robust to multipath
US5539705A (en) * 1994-10-27 1996-07-23 Martin Marietta Energy Systems, Inc. Ultrasonic speech translator and communications system
WO1998032248A1 (en) * 1997-01-16 1998-07-23 Scientific Generics Limited Signalling system

Family Cites Families (100)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2660662A (en) 1947-10-24 1953-11-24 Nielsen A C Co Search signal apparatus for determining the listening habits of wave signal receiver users
GB875107A (en) 1959-05-08 1961-08-16 British Oxygen Co Ltd Guide means for flexible hose, cables and the like connected to moving bodies
US3845391A (en) 1969-07-08 1974-10-29 Audicom Corp Communication including submerged identification signal
US3651471A (en) 1970-03-02 1972-03-21 Nielsen A C Co Data storage and transmission system
US3742463A (en) 1970-03-02 1973-06-26 Nielsen A C Co Data storage and transmission system
US3732536A (en) 1970-09-18 1973-05-08 Gte Sylvania Inc Passive object detection apparatus including ambient noise compensation
US4025851A (en) 1975-11-28 1977-05-24 A.C. Nielsen Company Automatic monitor for programs broadcast
US4237449A (en) 1978-06-16 1980-12-02 Zibell J Scott Signalling device for hard of hearing persons
JPS5869536A (en) 1981-10-20 1983-04-25 株式会社東芝 Ultrasonic wave sending and receiving device
US4425642A (en) 1982-01-08 1984-01-10 Applied Spectrum Technologies, Inc. Simultaneous transmission of two information signals within a band-limited communications channel
DE3229405C2 (en) 1982-08-06 1984-08-30 Werner 8000 München Janz Device for testing the functionality of remote control transmitters
US4514725A (en) 1982-12-20 1985-04-30 Bristley Barbara E Window shade mounted alarm system
GB2135536A (en) 1982-12-24 1984-08-30 Wobbot International Limited Sound responsive lighting system and devices incorporating same
GB8314468D0 (en) 1983-05-25 1983-06-29 Agb Research Plc Television monitoring
AU3290084A (en) 1983-09-16 1985-03-21 Audicom Corp. Encoding of transmitted program material
FR2568432B1 (en) 1984-07-30 1990-06-29 Baranoff Rossine Dimitri METHOD AND DEVICE FOR TRANSMITTING CODED INFORMATION OVER THE WIRE OVERLAY ON A TRADITIONAL FREQUENCY MODULATED TRANSMISSION
GB8609524D0 (en) 1986-04-18 1986-05-21 British Broadcasting Corp Video receivers & recorders
US4718106A (en) 1986-05-12 1988-01-05 Weinblatt Lee S Survey of radio audience
US5108341A (en) 1986-05-28 1992-04-28 View-Master Ideal Group, Inc. Toy which moves in synchronization with an audio source
GB2196167B (en) 1986-10-01 1991-01-02 Emi Plc Thorn Apparatus for marking a recorded signal
US4846693A (en) 1987-01-08 1989-07-11 Smith Engineering Video based instructional and entertainment system using animated figure
US4750034A (en) 1987-01-21 1988-06-07 Cloeck En Moedigh Bioscoopreclame B.V. Apparatus for monitoring the replay of audio/video information carriers
US4840602A (en) * 1987-02-06 1989-06-20 Coleco Industries, Inc. Talking doll responsive to external signal
US4807031A (en) 1987-10-20 1989-02-21 Interactive Systems, Incorporated Interactive video method and apparatus
FR2626731B3 (en) 1988-01-28 1990-08-03 Informatique Realite SELF-CONTAINED ELECTRONIC DEVICE FOR ALLOWING PARTICIPATION IN A RADIO OR TELEVISION TRANSMISSION
US4923428A (en) 1988-05-05 1990-05-08 Cal R & D, Inc. Interactive talking toy
US4945412A (en) 1988-06-14 1990-07-31 Kramer Robert A Method of and system for identification and verification of broadcasting television and radio program segments
US5090936A (en) * 1988-07-30 1992-02-25 Takara Co., Ltd. Movable decoration
GB8824969D0 (en) 1988-10-25 1988-11-30 Emi Plc Thorn Identification codes
US5499265A (en) 1989-08-07 1996-03-12 Omnipoint Data Company, Incorporated Spread spectrum correlator
US5191615A (en) * 1990-01-17 1993-03-02 The Drummer Group Interrelational audio kinetic entertainment system
US5446756A (en) 1990-03-19 1995-08-29 Celsat America, Inc. Integrated cellular communications system
US5085610A (en) 1991-05-16 1992-02-04 Mattel, Inc. Dual sound toy train set
FI88237C (en) 1991-05-24 1993-04-13 Nokia Mobile Phones Ltd Programming the functions of a mobile phone
JPH05145515A (en) 1991-11-19 1993-06-11 Canon Inc Spread spectrum communication equipment
US5319735A (en) 1991-12-17 1994-06-07 Bolt Beranek And Newman Inc. Embedded signalling
US5314336A (en) * 1992-02-07 1994-05-24 Mark Diamond Toy and method providing audio output representative of message optically sensed by the toy
US5353352A (en) 1992-04-10 1994-10-04 Ericsson Ge Mobile Communications Inc. Multiple access coding for radio communications
US5301167A (en) 1992-08-05 1994-04-05 Northeastern University Apparatus for improved underwater acoustic telemetry utilizing phase coherent communications
JP2561232B2 (en) 1992-08-31 1996-12-04 双葉電子工業株式会社 Spectrum spread receiver and spectrum spread transmitter / receiver using this device
ES2229214T3 (en) 1992-11-16 2005-04-16 Arbitron Inc. METHOD AND APPARATUS FOR CODING / DECODING BROADCASTED OR RECORDED SEGMENTS AND TO MONITOR THE EXHIBITION OF THE HEARING TO THEM.
US5570349A (en) 1994-06-07 1996-10-29 Stanford Telecommunications, Inc. Wireless direct sequence spread spectrum digital cellular telephone system
US5630203A (en) 1993-01-12 1997-05-13 Weinblatt; Lee S. Technique for surveying a radio or a television audience
US5442343A (en) 1993-06-21 1995-08-15 International Business Machines Corporation Ultrasonic shelf label method and apparatus
US5436941A (en) 1993-11-01 1995-07-25 Omnipoint Corporation Spread spectrum spectral density techniques
US5574773A (en) 1994-02-22 1996-11-12 Qualcomm Incorporated Method and apparatus of providing audio feedback over a digital channel
US5457807A (en) 1994-03-21 1995-10-10 Weinblatt; Lee S. Technique for surveying a radio or a television audience
US5450490A (en) 1994-03-31 1995-09-12 The Arbitron Company Apparatus and methods for including codes in audio signals and decoding
US5519779A (en) 1994-08-05 1996-05-21 Motorola, Inc. Method and apparatus for inserting signaling in a communication system
JPH07336460A (en) 1994-06-03 1995-12-22 Hitachi Ltd Data communication equipment
US5555258A (en) 1994-06-17 1996-09-10 P. Stuckey McIntosh Home personal communication system
CA2129925A1 (en) 1994-08-11 1996-02-12 Hendrik Adolf Eldert Zwaneveld Audio synchronization of subtitles
US6021432A (en) 1994-10-31 2000-02-01 Lucent Technologies Inc. System for processing broadcast stream comprises a human-perceptible broadcast program embedded with a plurality of human-imperceptible sets of information
EP0710022A3 (en) 1994-10-31 1998-08-26 AT&T Corp. System and method for encoding digital information in a television signal
DE19539538A1 (en) 1994-10-31 1996-05-02 Tektronix Inc Inaudible insertion of information into an audio signal
US5774452A (en) 1995-03-14 1998-06-30 Aris Technologies, Inc. Apparatus and method for encoding and decoding information in audio signals
US5978413A (en) 1995-08-28 1999-11-02 Bender; Paul E. Method and system for processing a plurality of multiple access transmissions
US5822360A (en) 1995-09-06 1998-10-13 Solana Technology Development Corporation Method and apparatus for transporting auxiliary data in audio signals
US5937000A (en) * 1995-09-06 1999-08-10 Solana Technology Development Corporation Method and apparatus for embedding auxiliary data in a primary data signal
CA2184949C (en) 1995-09-28 2000-05-30 Ingemar J. Cox Secure spread spectrum watermarking for multimedia data
US5752880A (en) 1995-11-20 1998-05-19 Creator Ltd. Interactive doll
US5687191A (en) 1995-12-06 1997-11-11 Solana Technology Development Corporation Post-compression hidden data transport
US5719937A (en) 1995-12-06 1998-02-17 Solana Technology Develpment Corporation Multi-media copy management system
US6035177A (en) 1996-02-26 2000-03-07 Donald W. Moses Simultaneous transmission of ancillary and audio signals by means of perceptual coding
US6584138B1 (en) * 1996-03-07 2003-06-24 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Coding process for inserting an inaudible data signal into an audio signal, decoding process, coder and decoder
US5828325A (en) 1996-04-03 1998-10-27 Aris Technologies, Inc. Apparatus and method for encoding and decoding information in analog signals
US5918223A (en) 1996-07-22 1999-06-29 Muscle Fish Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information
US5960398A (en) 1996-07-31 1999-09-28 Victor Company Of Japan, Ltd. Copyright information embedding apparatus
US6061793A (en) 1996-08-30 2000-05-09 Regents Of The University Of Minnesota Method and apparatus for embedding data, including watermarks, in human perceptible sounds
US5848155A (en) 1996-09-04 1998-12-08 Nec Research Institute, Inc. Spread spectrum watermark for embedded signalling
JP3690043B2 (en) 1997-03-03 2005-08-31 ソニー株式会社 Audio information transmission apparatus and method, and audio information recording apparatus
CA2225060A1 (en) 1997-04-09 1998-10-09 Peter Suilun Fong Interactive talking dolls
US6125172A (en) 1997-04-18 2000-09-26 Lucent Technologies, Inc. Apparatus and method for initiating a transaction having acoustic data receiver that filters human voice
GB2326572A (en) 1997-06-19 1998-12-23 Softsound Limited Low bit rate audio coder and decoder
IL121642A0 (en) * 1997-08-27 1998-02-08 Creator Ltd Interactive talking toy
US5945932A (en) 1997-10-30 1999-08-31 Audiotrack Corporation Technique for embedding a code in an audio signal and for detecting the embedded code
JP3673664B2 (en) 1998-01-30 2005-07-20 キヤノン株式会社 Data processing apparatus, data processing method, and storage medium
AUPP170298A0 (en) 1998-02-06 1998-03-05 Pracas, Victor Manuel Electronic interactive puppet
US6389055B1 (en) * 1998-03-30 2002-05-14 Lucent Technologies, Inc. Integrating digital data with perceptible signals
FI982363A (en) 1998-10-30 2000-05-01 Nokia Mobile Phones Ltd A method and system for limiting the operation of a radio device in a particular area
CA2288366A1 (en) * 1998-11-05 2000-05-05 Akira Ogino Additional information transmission method, additional information transmission system, information signal output apparatus, information signal processing apparatus, information signal recording apparatus and information signal recording medium
US6370666B1 (en) * 1998-12-02 2002-04-09 Agere Systems Guardian Corp. Tuning scheme for error-corrected broadcast programs
US6876623B1 (en) * 1998-12-02 2005-04-05 Agere Systems Inc. Tuning scheme for code division multiplex broadcasting system
GB9828101D0 (en) 1998-12-21 1999-02-17 Roke Manor Research Acoustic marketing device
US6442283B1 (en) 1999-01-11 2002-08-27 Digimarc Corporation Multimedia data embedding
GB9902235D0 (en) 1999-02-01 1999-03-24 Emuse Corp Interactive system
US6765950B1 (en) * 1999-04-01 2004-07-20 Custom One Design, Inc. Method for spread spectrum communication of supplemental information
US6674993B1 (en) 1999-04-30 2004-01-06 Microvision, Inc. Method and system for identifying data locations associated with real world observations
US6298322B1 (en) 1999-05-06 2001-10-02 Eric Lindemann Encoding and synthesis of tonal audio signals using dominant sinusoids and a vector-quantized residual tonal signal
US7031271B1 (en) 1999-05-19 2006-04-18 Motorola, Inc. Method of and apparatus for activating a spread-spectrum radiotelephone
GB2345779B (en) 1999-10-12 2004-08-11 Roke Manor Research Apparatus for controlling a remote item
US20010030710A1 (en) 1999-12-22 2001-10-18 Werner William B. System and method for associating subtitle data with cinematic material
US6438117B1 (en) 2000-01-07 2002-08-20 Qualcomm Incorporated Base station synchronization for handover in a hybrid GSM/CDMA network
US6737957B1 (en) * 2000-02-16 2004-05-18 Verance Corporation Remote control signaling using audio watermarks
US6773344B1 (en) * 2000-03-16 2004-08-10 Creator Ltd. Methods and apparatus for integration of interactive toys with interactive television and cellular communication systems
US6708214B1 (en) 2000-04-21 2004-03-16 Openwave Systems Inc. Hypermedia identifier input mode for a mobile communication device
EP1158800A1 (en) 2000-05-18 2001-11-28 Deutsche Thomson-Brandt Gmbh Method and receiver for providing audio translation data on demand
US6782253B1 (en) 2000-08-10 2004-08-24 Koninklijke Philips Electronics N.V. Mobile micro portal
US20020069263A1 (en) 2000-10-13 2002-06-06 Mark Sears Wireless java technology
US6892175B1 (en) 2000-11-02 2005-05-10 International Business Machines Corporation Spread spectrum signaling for speech watermarking

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5136613A (en) * 1990-09-28 1992-08-04 Dumestre Iii Alex C Spread Spectrum telemetry
US5412620A (en) * 1993-03-24 1995-05-02 Micrilor, Inc. Hydroacoustic communications system robust to multipath
US5539705A (en) * 1994-10-27 1996-07-23 Martin Marietta Energy Systems, Inc. Ultrasonic speech translator and communications system
WO1998032248A1 (en) * 1997-01-16 1998-07-23 Scientific Generics Limited Signalling system

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7796978B2 (en) 2000-11-30 2010-09-14 Intrasonics S.A.R.L. Communication system for receiving and transmitting data using an acoustic data channel
EP2288121A2 (en) 2000-11-30 2011-02-23 Intrasonics S.A.R.L. Telecommunications apparatus operable to interact with an audio transmission
US8185100B2 (en) 2000-11-30 2012-05-22 Intrasonics S.A.R.L. Communication system
WO2007068833A1 (en) * 2005-09-20 2007-06-21 Djv Cryptex Sound broadcasting system
FR2889347A1 (en) * 2005-09-20 2007-02-02 Jean Daniel Pages Composite signal producing method for producer, involves encoding sound signal into encoded sound signal, by frequency shifting former signal towards high inaudible frequencies, so that encoded signal is diffusible in inaudible manner
US8498860B2 (en) 2005-10-07 2013-07-30 Ntt Docomo, Inc. Modulation device, modulation method, demodulation device, and demodulation method
CN101601076A (en) * 2007-01-11 2009-12-09 小山有 Warning display system
EP1947471A1 (en) * 2007-01-16 2008-07-23 Harman Becker Automotive Systems GmbH Tracking system using audio signals below hearing threshold
US8121319B2 (en) 2007-01-16 2012-02-21 Harman Becker Automotive Systems Gmbh Tracking system using audio signals below threshold
US8737292B2 (en) 2007-11-06 2014-05-27 Icera, Inc. Frequency offset estimation in a CDMA system
US8560913B2 (en) 2008-05-29 2013-10-15 Intrasonics S.A.R.L. Data embedding system
CN102859587A (en) * 2010-02-26 2013-01-02 弗兰霍菲尔运输应用研究公司 Watermark generator, watermark decoder, method for providing a watermark signal in dependence on binary message data, method for providing binary message data in dependence on a watermarked signal and computer program using a differential encoding
WO2011104239A1 (en) * 2010-02-26 2011-09-01 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Watermark generator, watermark decoder, method for providing a watermark signal in dependence on binary message data, method for providing binary message data in dependence on a watermarked signal and computer program using a differential encoding
EP2362387A1 (en) * 2010-02-26 2011-08-31 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. Watermark generator, watermark decoder, method for providing a watermark signal in dependence on binary message data, method for providing binary message data in dependence on a watermarked signal and computer program using a differential encoding
KR101419162B1 (en) 2010-02-26 2014-07-11 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Watermark Generator, Watermark Decoder, Method for Providing a Watermark Signal in Dependence on Binary Message Data, Method for Providing Binary Message Data in Dependence on a Watermarked Signal and Computer Program Using a Differential Encoding
AU2011219835B2 (en) * 2010-02-26 2014-08-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Watermark generator, watermark decoder, method for providing a watermark signal in dependence on binary message data, method for providing binary message data in dependence on a watermarked signal and computer program using a differential encoding
US9350700B2 (en) 2010-02-26 2016-05-24 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Watermark generator, watermark decoder, method for providing a watermark signal in dependence on binary message data, method for providing binary message data in dependence on a watermarked signal and computer program using a differential encoding
WO2014072742A1 (en) 2012-11-09 2014-05-15 Camelot Strategic Solutions Limited Improvements relating to audio visual interfaces
US9928728B2 (en) 2014-05-09 2018-03-27 Sony Interactive Entertainment Inc. Scheme for embedding a control signal in an audio signal using pseudo white noise
WO2017055434A1 (en) * 2015-09-30 2017-04-06 British Telecommunications Public Limited Company Call recording

Also Published As

Publication number Publication date
AU6303200A (en) 2001-02-19
EP1205045B1 (en) 2004-06-30
EP1205045A1 (en) 2002-05-15
GB9917985D0 (en) 1999-09-29
ATE270478T1 (en) 2004-07-15
DE60011909D1 (en) 2004-08-05
US7505823B1 (en) 2009-03-17
JP2003506918A (en) 2003-02-18

Similar Documents

Publication Publication Date Title
EP1205045B1 (en) Acoustic communication system
US5404377A (en) Simultaneous transmission of data and audio signals by means of perceptual coding
US6333947B1 (en) Interference cancellation system and method and CDMA receiver including an interference cancellation circuit
US5615227A (en) Transmitting spread spectrum data with commercial radio
US6154484A (en) Method and apparatus for embedding auxiliary data in a primary data signal using frequency and time domain processing
US5937000A (en) Method and apparatus for embedding auxiliary data in a primary data signal
EP1135969B1 (en) Digital wireless loudspeaker system
US5940429A (en) Cross-term compensation power adjustment of embedded auxiliary data in a primary data signal
AU707270B2 (en) Method and apparatus for transporting auxiliary data in audio signals
US20040223622A1 (en) Digital wireless loudspeaker system
AU4832700A (en) Multistep rake combining method and apparatus
IL147211A0 (en) Rake combining methods and apparatus using weighting factors derived from knowledge of spread spectrum signal characteristics
MX2010013076A (en) Data embedding system.
JP2002521702A (en) System and apparatus for encoding an audible signal by adding an inaudible code to an audio signal for use in a broadcast program identification system
WO2001034264A1 (en) Acoustic location system
JPH07221734A (en) Code division multiple access equipment
JP6072791B2 (en) Digital switch signal sequence for switching, device for including digital switch signal sequence in digital audio information signal, and device for receiving information signal with switch signal sequence
JP3639120B2 (en) Spread modulation signal receiver
KR100882828B1 (en) Wireless Multi-channel Speaker System Based UWB For Transmitting 3-Dementional Sound
JP3320234B2 (en) Spread spectrum receiver
JP2001053645A (en) Data reception device tracing optimum reception state by controlling demodualtion timing and its control method
JPH11275061A (en) Interference eliminating receiver
GB2462588A (en) Data embedding system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 09856734

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2000949759

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2000949759

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWG Wipo information: grant in national office

Ref document number: 2000949759

Country of ref document: EP