WO1994003988A2

WO1994003988A2 - Dithered digital signal processing system

Info

Publication number: WO1994003988A2
Application number: PCT/GB1993/001644
Authority: WO
Inventors: Michael Anthony Gerzon; Peter Graham Craven
Original assignee: Michael Anthony Gerzon; Peter Graham Craven
Priority date: 1992-08-05
Filing date: 1993-08-04
Publication date: 1994-02-17
Also published as: WO1994003988A3; GB9216659D0

Abstract

Digital signal processing systems for converting a digital signal of a first resolution to a digital signal of a second, different resolution include a generator for dither noise. The dither noise is used in such a way as to mask or reduce the effect of non-linearities in the system. The generator generates the dither noise for use in processing a current sample of a lower resolution signal in functional dependence upon previous samples of the lower resolution signal. In one example, the dither noise generator is formed as a look-up table addressed by the least significant bits of the previous (e.g. 16) samples. The least significant bits of those samples are stored in a buffer. In a digital encoder which encodes a high resolution digital signal to produce a lower resolution signal, the dither noise output from the generator is added to the lower resolution signal at the input to a quantizer. In a complementary digital decoder, the output of a matching dither noise generator is subtracted from the lower resolution digital signal. The encoder and decoder may be combined in a transmission system which transmits a higher resolution signal via a lower resolution digital channel formed, e.g., by a digital storage medium such as a CD, DAT or other digital tape format.

Description

DITHERED DIGITAL SIGNAL PROCESSING SYSTEM

BACKGROUND TO THE INVENTION

The present invention relates to digital waveform encoding and decoding systems, and in particular to systems in which dither is used to counter the effect of low-level non-linearities on signals.

Such systems might be used, for example, in recording and reproducing compact discs. While the CD 16 bit standard appeared to be virtually "noise free" by comparison with previous analogue media, experience has shown that its noise floor is still too high for many applications. It was realised early on that the apparent quietness of silent passages was not due to any inherent freedom from noise in digital systems, but arose because the staircase non-linear transfer characteristic of the undithered 16 bit quantizer acted as a ferocious instantaneous non-linear "noise gate" for low-level signals, of a kind that would not have been acceptable in analogue systems, and that it was necessary to add a "dither noise" signal to remove the effect of this non-linearity.

Such dither noise, when of a "Gaussian" white noise form, degraded the signal-to-noise ratio of the 16 bit digital channel to not better than 92 dB. Moreover, the peak signal level used to calculate signal-to-noise ratio is a sharp limit, rather than the soft overload of analogue media, so that in practice, the signal-to-noise ratio of CD is often comparable to that of an analogue medium with around a 75 dB S/N (signal to noise ratio). Both in the case of music with exceptionally wide dynamic range, and for recordings where a safety margin must be left in recording level to prevent clipping, the 16 bit medium has proved to be inadequate. The problem of inadequate S/N is compounded if a recording has to go through several generations of signal processing (even if the signal processing is as simple as a small change of gain), since each stage of processing requires that the signal be requantized and redithered to prevent non-linearity. The result is that for digital signals, exactly the same kind of "generation loss" in S/N occurs whenever a signal is processed in any way other than exact cloning as for analogue media.

In ref. [1], the present inventors pointed out that the inherent performance of a 16 bit channel could be improved substantially by a combination of two strategies: 1) the use of noise shaping around the quantization/dither error process, so as to shape the spectrum of the noise error to make it subjectively less audible, and 2) the use of subtractive dither, which allowed an improvement of 6 dB (relative to Gaussian nonsubtractive dither) or 4.8 dB (relative to triangular nonsubtractive dither) for suitably designed reproducers, while retaining compatibility with existing nonsubtractive reproducers.

The optimization of the noise shaping strategy to match human psychoacoustics has been taken further by Lipshitz et al [2] and by Stuart and Wilson [3], whose researches have indicated that about 18 dB improvement in the perceived S/N is achievable by noise shaping. Combined with subtractive dither, this represents approximately 24 dB (i.e. 4 bits) improvement in effective S/N performance as compared with conventionally Gaussian dithered 16 bit systems.
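The decibel figures quoted above can be checked with a short calculation. The sketch below assumes the usual uniform-quantizer error model (error power of step²/12), triangular dither of 2 lsb peak-to-peak (power step²/6), and Gaussian dither at three times the rectangular-dither power (about 4.8 dB above it). These modelling values and all variable names are assumptions chosen for illustration.

```python
import math

def db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)

STEP = 1.0                      # quantizer step size (1 lsb)
P_QUANT = STEP ** 2 / 12.0      # error power remaining with subtractive dither
P_TRI = STEP ** 2 / 6.0         # triangular dither, 2 lsb peak-to-peak
P_GAUSS = 3.0 * P_QUANT         # assumed Gaussian dither level (~4.8 dB above P_QUANT)

# Nonsubtractive playback hears quantizer error plus the dither itself.
gain_vs_triangular = db((P_QUANT + P_TRI) / P_QUANT)
gain_vs_gaussian = db((P_QUANT + P_GAUSS) / P_QUANT)

# Combine with the quoted 18 dB noise-shaping figure and convert to bits.
db_per_bit = 20.0 * math.log10(2.0)
bits_gained = (18.0 + gain_vs_gaussian) / db_per_bit

print(round(gain_vs_triangular, 2), round(gain_vs_gaussian, 2), round(bits_gained, 2))
```

Under these assumptions the calculation gives roughly 4.8 dB and 6 dB for the two subtractive-dither advantages, and close to 4 bits for the combined improvement.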

However, while a noise-shaped subtractively dithered system can improve S/N by 4 bits, and in the case of a nonsubtractive player with an otherwise "perfect" D/A converter can still give a 3 bit improvement, hitherto, such subtractively dithered systems required the transmission of a synchronizing signal to allow the regeneration of the dither signal at the D/A converter. This need for synchronization created various problems, including: 1) the need to standardize the synchronization process for all 16 bit media, including CD, DAT and other digital tape media, 2) the need to recover the synchronisation signal from the digital data stream in existing players, 3) the problem of resynchronization across a tape edit, or if a signal is shifted in time even by as little as one sample. Because a synchronised subtractive dither system uses a dither that is dependent on the relative timing of a signal sample with the synchronizing clock, any time-shift of the signal would require a requantization of the signal, subtracting the "old" dither signal, and requantizing with the "new" dither signal, resulting in a 3 dB loss of S/N. Only two such retimings within a signal processing chain would cause a loss of all the S/N advantages of subtractive dither over nonsubtractive dither. The alternative was to find a way of "retiming" the synchronizing clock quickly whenever necessary, which further complicated the design of a synchronization standard.

SUMMARY OF THE INVENTION

According to a first aspect of the present invention, there is provided a digital signal processing system comprising means for processing a signal of a first resolution to produce a signal of a second, different, resolution, the said means including a generator for dither noise, the system using the dither noise in processing the signal to ameliorate the effect of system non-linearities, characterised in that the generator for dither noise includes means for generating in functional dependence upon samples of the lower resolution signal a pseudo-random dither noise output for use in processing a current sample of said lower resolution signal.

The present invention uses a new approach to subtractive dither that requires no synchronizing clock signal, but which uses the signal samples themselves to compute the dither signal. This has a first important advantage that no extra data need be transmitted or recovered to synchronize the subtractive dither process, which means that subtractive dither can be used with existing media and chips. But a second advantage is that retiming of the subtractive dither can be achieved in as few as 16 samples across an edit or signal retiming, thereby allowing such processing as time delays or "slipping" between simultaneous or consecutive tracks to be achieved without any loss of S/N. Also, the process of generating and regenerating dither from the dithered signal samples themselves means that other processes which conventionally do not cause loss of S/N in the digital domain, such as polarity inversion and/or the swapping of stereo channels, still do not cause a loss of S/N.

The dither generator may produce a dither output dependent on, e.g., the value or polarity of the presently processed sample as well as previous samples of the lower resolution signal. It is much preferred, however, that for the current sample the output of the generator depends on previous samples only.

In contrast to this invention, subtractive dither synchronized by a clock signal requires requantization and redithering whenever any of these signal manipulations occur, resulting in a 3 dB loss of S/N in the first "generation".

The new approach, which we term "autodither", generates a dither signal for each sample that is dependent only on a finite number (typically 16 or 24) of the previously quantized/dithered samples. The autodither process itself requires standardization, and the standard must include a specification of the noise-shaping being applied. However, once such a standardization has been made, autodithered signals may be sent via existing linear digital channels in a manner fully compatible with existing uses, although only subtractively dithered playback will recover the full available dynamic range, with nonsubtractively dithered playback being about 4.8 dB worse.

Only when a signal is subjected to the kind of signal processing that conventionally requires requantization will it be necessary to adopt modified signal processing strategies, which consist of subtracting the dither out before doing the signal processing, and requantizing using the standardized autodither process before going back to 16 bits.

The autodither process is thus operationally robust, and does not cause unnecessary losses of S/N in ordinary studio operation. The theoretically achievable subjective S/N of about 116 dB is far better than existing A/D converters, and so leaves adequate margins for loss of S/N due to subsequent signal manipulations.

Even with existing converters, 116 dB S/N allows fade-outs to near silence in the digital domain, and provides a reasonable safety margin for conservative recording levels to avoid peak overload in live recording of music with a wide dynamic range. It also allows compact disc recorders to be used for live recording applications, while obtaining the same dynamic range as currently achieved using 20 bit recording media. A lowered noise floor also means that there is less need to push up the levels recorded on CDs so that peaks run to near the peak 0 dB level. Rather, it is possible to record inherently quiet music (e.g. clavichords or lutes) at their natural levels even if this means peaks at -30 dB or so. Both inventors have long been of the view that a perfect "high fidelity" medium would not need any level adjustment anywhere in the chain, but would preserve the original absolute sound pressure level. 116 dB dynamic range, and the limitations of existing A/D converters, do not yet permit achievement of this ideal, but the need for "gain riding" and high channel modulation is no longer so pressing.

The digital signal processing system may be implemented using some stages operating in the analogue domain. For example, as further described below, noise subtraction in a decoder might be carried out in the analogue domain.

Preferably the digital signal processing system further comprises filter means for applying noise-shaping to the dither noise to reduce the perceptibility of the dither noise in the reproduced signal. The combination of autodither with noise-shaping makes it possible to increase the effective dynamic range of, e.g., 16 bit audio media to the equivalent of around 19.5-20 bits. With a 20 bit system, such as those used for studio recordings, it makes possible a 24 bit performance.

The digital signal processing system may be a digital encoder for encoding a digital signal of a higher resolution to produce a digital signal of a lower resolution, in which case the encoder comprises a quantizer for rounding or truncating the higher resolution signal to produce the lower resolution signal, the output of the quantizer being connected to an input of the generator for dither noise, and an addition means for adding the dither noise output by the generator to the higher resolution signal input to the quantizer. Alternatively, the digital signal processing system may be a digital decoder for decoding a lower resolution digital signal to reproduce a digital signal of higher resolution, the decoder including subtraction means for subtracting dither noise from the lower resolution signal, the dither noise generator receiving the lower resolution digital signal at its input and outputting dither noise to the subtraction means.

Where reference is made herein to "subtraction" this includes the addition of a signal of reversed polarity, and vice versa for "addition".

According to a further aspect of the present invention, a transmission system for transmitting a higher resolution signal via a lower resolution digital channel comprises in combination an autodither encoder and an autodither decoder as defined above.

The lower resolution digital channel will typically be provided by a recording medium such as CD, DAT or other digital tape media, or may be a digital storage or memory medium such as a RAM or ROM or a hard disc, floppy disc or optical disc.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention, and the theoretical background to the invention, will now be described in further detail, by way of example only, with reference to the accompanying drawings, in which:

Figure 1 shows a schematic of nonsubtractive dither.

Figure 2 shows a schematic of subtractive dither.

Figure 3 shows the use of a transmitted clock signal to synchronize subtractive dither.

Figure 4 shows noise shaping around a dithered quantizer.

Figure 5 shows an alternative form of noise shaped dithered quantizer equivalent to figure 4.

Figure 6 shows the subtractive reconstruction of signals for noise-shaped dither.

Figure 7 shows the autodither encode process.

Figure 8 shows the autodither decode process.

Figure 9 shows autodither encoding with noise shaping.

Figure 10 shows an alternative autodither encoding with noise shaping equivalent to figure 9.

Figure 11 shows autodither decoding with noise shaping.

Figure 12 shows the cross-fade splicing of autodithered signals.

Figure 13 shows the generation of pseudo-random dither noise with a triangular pdf by means of adding two uniform pdf's.

Figure 14 shows the use of a first noise shaping characteristic on dither and a second different characteristic on quantizer error.

Figure 15 shows a schematic of noise-shaped dithered encoding with adaptive noise shaping around the quantizer.

Figure 16 shows a scheme for ensuring that autodither decoding works correctly when signal polarity is inverted.

DETAILED DESCRIPTION OF EXAMPLES

Figure 7 shows a first example of an encoder embodying the present invention. A high resolution digital signal is input on line 1 via an addition node 2 to a quantizer 3. After truncation by the quantizer a low resolution signal is output on the line 4. At the same time the low resolution signal is fed to a buffer memory 5. M successive samples in the buffer memory 5 are output in parallel to address a look-up table 6. The look-up table 6 then outputs a value which is added to the incoming high resolution signal at the node 2 as dither noise.

Figure 8 shows a complementary autodither decoder structure. An incoming low resolution digital signal on line 7 is input to a level reconstruction circuit 8, of the type well known in the art. At the same time the samples from the line 7 are used to address a look-up table 10 via a buffer memory 9 to generate autodither noise in the same manner as described for the encoder. The contents of the look-up table 10 are matched to those of the look-up table 6 in the encoder. The output from the look-up table 10 is combined with the output of the level reconstruction circuit 8 at a subtractive node 11 so that the dither noise is subtracted from the output signal.

The invention may conveniently be implemented using digital signal processing algorithms on general-purpose digital signal processing (DSP) chips, such as those of the Motorola DSP 56000 family or those of the Texas Instruments TMS320 family, in which a digital signal representing a waveform at one resolution is input to the DSP chip and the required digital signal is output from the chip. The individual signal-processing blocks described in this description will, in such convenient implementations, be realized as signal processing sub-algorithms of the digital signal processing algorithm programmed on the DSP chip. In particular the signal processing block referred to as a quantizer may consist of an algorithm rounding or truncating a digital signal word to a digital word having fewer bits at a lower resolution.

Where an algorithm requires small amounts of memory, the memory on board the DSP chip may be used, but where larger amounts of memory are required, as for example in look-up tables, an external RAM or ROM chip or chips may be provided, addressed by the DSP chip. For example, external RAM may be provided by a high speed static RAM such as the Integrated Device Technology IDT 71256 when 32 K of 8 bit memory is required. However, when using a fixed look-up table, it is generally cheaper to store the table in ROM using an EPROM chip.

Further developments and refinements of these circuits and algorithms, and details of the look-up table and alternatives, are described in further detail in the discussion below.

We will now describe the principles of subtractive dither and noise-shaping as known in the prior art and as modified for use in the present invention.

Subtractive Dither

The basic principles of noise-shaped subtractive dither were given in ref. [1]. Dither is the process of adding a random (or pseudo-random) noise signal before a quantizer, as shown in Figure 1, so as to eliminate the effect of the "staircase" transfer-function non-linearity of the quantizer. Dither achieves this by replacing a single input signal level which is quantized by a range of such signal levels determined statistically by the added noise signal, so that the transfer-function non-linearity is averaged over the possible range of input signal levels created by adding the noise. This averaging or smoothing of the transfer-function non-linearity can be achieved only if the statistics of the added noise signal are appropriately chosen [4] and has the side effect of causing the output signal in Figure 1 to have an added noise component. However, the added noise caused by dither can be eliminated by using subtractive dither, as shown in Figure 2. Here, the noise added before the quantizer is simply subtracted again afterwards.
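The difference between the arrangements of Figures 1 and 2 can be illustrated numerically. The following is a minimal sketch, assuming a rounding quantizer with a step of 1 lsb, rectangular pdf dither of 1 lsb peak-to-peak, and a constant input of 0.3 lsb; the function and variable names are hypothetical and chosen only for this illustration.

```python
import random

def quantize(x, step=1.0):
    """Round x to the nearest multiple of step."""
    return step * round(x / step)

def mean_square(values):
    return sum(v * v for v in values) / len(values)

rng = random.Random(1)
signal = [0.3] * 20000          # constant low-level input, in units of 1 lsb

# Without dither the quantizer acts as a "noise gate": the input vanishes.
undithered = [quantize(x) - x for x in signal]

nonsub, sub = [], []
for x in signal:
    d = rng.uniform(-0.5, 0.5)  # rectangular pdf dither, 1 lsb peak-to-peak
    y = quantize(x + d)
    nonsub.append(y - x)        # Figure 1: the dither remains in the output
    sub.append(y - d - x)       # Figure 2: the dither is subtracted again

mean_nonsub = sum(nonsub) / len(nonsub)   # close to 0: non-linearity averaged out
power_nonsub = mean_square(nonsub)        # includes the added dither noise
power_sub = mean_square(sub)              # close to 1/12 lsb^2
```

In this sketch the undithered error is a constant -0.3 lsb, the nonsubtractive error averages to zero but carries extra noise power, and the subtractive error power settles near the step²/12 value of an ideal uniform quantization error.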

Conventionally in subtractive dither systems, the noise is reconstituted at the output for subtraction by means of a synchronizing clock, to which is locked a pseudo-random noise generator as shown in Figure 3. As noted in the introduction, the use of such synchronizing clocks has severe operational problems in existing audio systems.

Noise-shaping

Conventional pseudo-random dither signals generally have independent statistics for each sample of the noise signal, and hence have a flat ("white") power spectrum. Subtractive and nonsubtractive dither systems such as those of Figures 2 and 1 thus produce a white noise spectrum. The ears are not equally sensitive to all frequencies, but as described in refs. [2] and [3] are most sensitive to noise in the 3 kHz region and are much less sensitive to very low and very high frequencies within the audio band. It is therefore preferred that the spectrum of the added noise signal, in both the subtractive and nonsubtractive cases, is altered, with lower energy near 3 kHz and compensatory higher energy at very high (and possibly very low) frequencies, so that the total amount of perceived noise is reduced. As shown in refs. [2] and [3], such noise-shaping of the power spectrum of the noise can yield a subjective reduction of noise by as much as 18 dB.

Figure 4 shows how such noise-shaping is achieved for subtractive dither systems. Essentially, one treats the dithered quantizer of Figure 1, with white pseudo-random noise added before the quantizer, as a single cause of "quantization error", and puts noise-shaping around the whole system. This noise-shaping consists of taking the error signal around the quantizer dithered by adding dither noise beforehand, passing this error signal through a filter $H(z^{-1})$ including a one-sample delay factor $z^{-1}$, and subtracting the resulting filtered error signal from the input. This has the effect [1] on the dithered quantized noise spectrum of filtering the original unshaped error noise signal spectrum by the action of the filter

$$1 - H(z^{-1}) \tag{1}$$

as shown in ref. [1]. It was shown in ref. [1] that, provided that the filter of Eq. (1) is minimum phase, the averaged decibel frequency response up to the Nyquist (half sampling) frequency is unchanged by noise-shaping, and that the noise energy, weighted by a frequency weighting curve chosen to represent the perceptual audibility of human hearing, can be minimised by choosing the filter $H(z^{-1})$ to be that unique filter such that:

1) $1 - H(z^{-1})$ is minimum phase, and

2) the frequency response of $1 - H(z^{-1})$ is inverse to the chosen perceptual weighting frequency response.

Lipshitz et al [2] determined practical approximations to such noise-shaping filters inverse to perceptual weighting curves to ensure minimum noise audibility. The unweighted rms noise produced via such filters is increased, and they showed that constraining this increase to not more than, say, 20 dB also limited the degree of perceptual improvement in S/N, but showed that for typical realistic model perceptual weighting curves, an improvement in perceived S/N of around 18 dB was possible. Stuart and Wilson [3], using a more accurate perceptual model for the audibility of spectrally weighted noise based on detection theory and probabilistic thresholds of hearing within critical bands, arrived at a similar figure for the perceptual improvement produced by noise-shaping, but with slightly different noise-shaping frequency responses.

Figure 5 shows an alternative form of noise-shaping of a dithered quantizer in which the noise-shaping of the quantizer and the dither is separated. It is easily shown [1] that Figures 4 and 5 have exactly equivalent performance.
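The error-feedback arrangement of Figure 4 can be sketched in a few lines. The example below is illustrative only: it assumes a first-order filter $H(z^{-1}) = z^{-1}$ rather than a psychoacoustically optimized one, triangular dither of 2 lsb peak-to-peak, and hypothetical function names. It checks that the error appearing at the output equals the unshaped dithered-quantizer error filtered by $1 - H(z^{-1})$.

```python
import random

def noise_shaped_quantize(x, h, rng):
    """Error-feedback quantizer; h[k] is the coefficient of z^-(k+1) in H."""
    errors = [0.0] * len(h)              # past errors, most recent first
    out, err_log = [], []
    for sample in x:
        # Subtract the filtered past error from the input.
        v = sample - sum(c * e for c, e in zip(h, errors))
        d = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)  # triangular dither
        y = float(round(v + d))          # quantize to 1 lsb steps
        e = y - v                        # total error of the dithered quantizer
        errors = [e] + errors[:-1]
        out.append(y)
        err_log.append(e)
    return out, err_log

rng = random.Random(7)
x = [0.25] * 5000
h = [1.0]                                # assumed H(z^-1) = z^-1, illustration only
y, e = noise_shaped_quantize(x, h, rng)

# The output error should equal e filtered by 1 - H(z^-1) = 1 - z^-1.
shaped = [e[n] - (e[n - 1] if n else 0.0) for n in range(len(e))]
max_mismatch = max(abs((y[n] - x[n]) - shaped[n]) for n in range(len(x)))
mean_output = sum(y) / len(y)
```

With this first-order choice the shaped error sums to a bounded value, so the long-term average of the output tracks the 0.25 lsb input closely even though every output sample is an integer number of lsb's.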

Although Figure 5 is more complicated than Figure 4 to implement, involving two copies of the filter $H(z^{-1})$, it has the advantage of separating the effect of the noise-shaping of the dither from that of the quantizer. From Figure 5, it is clear that the subtractive dither decoding of the noise-shaped dithered quantizer of Figures 4 or 5 is as shown in Figure 6, where the reconstructed dither signal is first noise shaped by the filter of Eq. (1) and then subtracted from the reconstructed levels output from the quantization process. It is thus clear that for any noise-shaped subtractive dither system, the noise-shaping $1 - H(z^{-1})$ must be standardized in order that the correct dither be subtracted.

However, Figure 5 reveals that it is possible, as shown schematically in Figure 14, to use a different filter, say $H'(z^{-1})$, around the quantizer than for shaping the dither noise spectrum, making the quantization noise error have a different spectrum from that of the added noise. The final noise error spectrum from the subtractive dither process with the dither noise-shaped by $H(z^{-1})$ is thus shaped by $H'(z^{-1})$. This means that if a subjectively better noise-shaping filter $H'(z^{-1})$ is determined after the decoding noise-shaping filter has already been standardized as $H(z^{-1})$, the subtractively dithered decoded results will incorporate the improved noise-shaping. Thus the choice of a standardized noise-shaping $H(z^{-1})$ for subtractively dithered decoding as in Figure 6 in no way limits the potential for improved noise-shaping in the future, but only limits the noise-shaping heard by the user who is not using subtractive dither.

One potential future improvement that is left open is the possibility that the noise-shaping filter $H'(z^{-1})$ around the quantizer might vary adaptively with the input signal, possibly mimicking the general spectral shape of the signal, so as to increase masking of the noise by the signal. Ref. [1] gave a design procedure for giving $1 - H'(z^{-1})$ any desired spectral shape. Such an adaptive noise-shaping quantizer is shown schematically in Figure 15. Nevertheless, the standardized choice of noise-shaping used with the subtractive dither decoding of Figure 6 needs careful optimization in order to assure the best possible results for nonsubtractive listeners, and we propose the use of the kind of optimized noise-shaping proposed by Stuart and Wilson [3], as a preferred option. This noise-shaping will give about 18 dB improvement in S/N subjectively as compared to no use of noise-shaping. The use of a ninth order noise-shaping filter $H(z^{-1})$ as described in references [3] or [2] is preferred.

As shown in ref. [1], there are various equivalent ways of performing the noise shaping encoding in which the noise-shaping $1 - H(z^{-1})$ of the original white dither signal is different from the noise-shaping $1 - H'(z^{-1})$ of the quantizer, besides the architecture shown in Figure 5. For example, the architecture of Figure 4 can be used, where the feedback filter shown is made to equal $H'(z^{-1})$ and the dither noise signal is subjected to an initial filtering

$$\frac{1 - H(z^{-1})}{1 - H'(z^{-1})} \tag{2}$$

before being fed to the addition node in Figure 4.
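The role of the pre-filter in Eq. (2) can be checked with a short z-domain calculation. This is a sketch under an additive error model; the symbols $X$, $Y$, $D$, $D'$ and $Q$ (input, output, white dither, pre-filtered dither and quantizer error) are introduced here for illustration and do not appear in the original text.

```latex
% Figure 4 architecture with feedback filter H'(z^{-1}):
% the total error injected at the quantizer is shaped by 1 - H'(z^{-1}).
Y(z) = X(z) + \bigl[1 - H'(z^{-1})\bigr]\,\bigl[D'(z) + Q(z)\bigr]

% Pre-filter the white dither D(z) according to Eq. (2):
D'(z) = \frac{1 - H(z^{-1})}{1 - H'(z^{-1})}\, D(z)

% Substituting: the dither is shaped by 1 - H, the quantizer error by 1 - H'.
Y(z) = X(z) + \bigl[1 - H(z^{-1})\bigr]\, D(z) + \bigl[1 - H'(z^{-1})\bigr]\, Q(z)
```

A decoder that subtracts the dither shaped by $1 - H(z^{-1})$, as in Figure 6, is then left with the input plus quantizer error shaped by $1 - H'(z^{-1})$.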

While it can be shown that the noise heard by the user of a subtractively dithered decoder, as in Figure 6, will always be statistically uncorrelated with the input signal if the dither noise (before noise-shaping filtering) has independent samples with a rectangular probability distribution function (pdf) with a peak-to-peak level of a least significant bit (lsb) step size, it can be shown [1,4] that this does not eliminate audible modulation noise effects for nonsubtractive listeners. It is therefore strongly preferred that the dither noise used, before noise-shaping, should have a triangular pdf with peak-to-peak amplitude of 2 lsb's, as described in refs. [1] and [4], although any other pdf comprising this triangular probability distribution convolved with any other pdf will also avoid variations in the mean square noise level heard by nonsubtractive listeners. Such a pdf may be achieved by adding to a triangular pdf dither a second statistically independent dither signal with an arbitrary pdf. For example, a quadratic B-spline pdf with peak-to-peak amplitude of 3 lsb's may be used. Such a quadratic B-spline pdf dither may be formed by adding three statistically independent rectangular pdf dithers with peak levels ±½ lsb. The form of the noise and its generation has to be standardized in a subtractively dithered system in order that the dither signal reconstructed at the decoding stage should exactly match that used at the encoding stage. Alternatively, a dither noise with Gaussian statistics, at a level of about 4.8 dB or more above the rms (root mean square) level of rectangular pdf noise with 1 lsb peak-to-peak level will also give satisfactory results both for subtractive and nonsubtractive decoding. However, our preferred option of triangular pdf dither described above will minimize the noise energy heard with nonsubtractive decoding.
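The construction of triangular pdf dither from rectangular pdf dithers can be checked empirically. The following minimal sketch uses Python's standard pseudo-random generator as a stand-in noise source, which is an assumption made for illustration and not the generator of the described embodiments.

```python
import random

rng = random.Random(3)
N = 50000
HALF_LSB = 0.5

# Two statistically independent rectangular pdf dithers, peaks at +/- 1/2 lsb.
r1 = [rng.uniform(-HALF_LSB, HALF_LSB) for _ in range(N)]
r2 = [rng.uniform(-HALF_LSB, HALF_LSB) for _ in range(N)]
tri = [a + b for a, b in zip(r1, r2)]    # triangular pdf, peaks at +/- 1 lsb

mean_sq = sum(t * t for t in tri) / N    # expected to be about 1/6 lsb^2
peak = max(abs(t) for t in tri)          # cannot exceed 1 lsb
# Fraction of samples within +/- 1/2 lsb; a triangular pdf gives 3/4.
central = sum(1 for t in tri if abs(t) <= HALF_LSB) / N
```

Adding a third independent rectangular dither in the same way would give the quadratic B-spline pdf with a peak-to-peak amplitude of 3 lsb's mentioned above.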

Autodither

Rather than generate pseudo-random noise from a clock, the autodither process of the presently described embodiments instead generates dither from the least significant bits of the quantized digital data stream in the last M samples, where M is a number that is typically chosen equal to 14 or 16 or 24. In the simplest examples, without noise-shaping, the encode process of turning a high resolution digital signal (which may have a much larger number of bits than the final number of, say, 16) into a quantized data stream is shown schematically in Figure 7. This differs from the process shown in Figure 3 only in that the pseudo-random dither signal is now generated from the least significant bits (lsb's) of the previous M samples of the quantised digital output, by means of a buffer memory that stores the M previous lsb's, and a look-up table that converts these values of the lsb's into a sample of dither noise. By way of example, we consider the case where M = 16, i.e. the buffer store uses the lsb's of the previous 16 output samples. The following considerations apply equally well to any other number M.

There are $2^{16}$ possible values of "state variable" stored in the buffer, and the look-up table converts these into one of $2^{16}$ values of dither noise. In order that the dither noise be as close to random in behaviour as possible, the mapping from the $2^{16}$ states of the state variables in the buffer to the noise signal should appear as random or "chaotic" as possible, and may be chosen by random or Monte-Carlo methods, or by any known method of generating pseudo-random sequences of length $2^{16}$.

The generated dither samples may have any probability distribution function that is suitable for subtractive dither, but for our purposes a triangular probability distribution function with peak levels ±1 lsb is preferred, since such dither also performs optimally without subtractive decoding as shown by Lipshitz et al [4]. A look-up table with $2^{16}$ entries is expensive to implement, and for triangular dither may alternatively be implemented as two look-up tables each having $2^8$ entries, producing rectangular probability distribution function (pdf) dither with peaks ±½ lsb, whose outputs are added to form the triangular dither. The mapping from the original $2^{16}$ states to a product of two state variables with $2^8$ values may be any deterministic mapping, which may be implemented by means of any one-to-one function followed by a separation of the 16 binary digits into two subsets (e.g. even and odd samples after the mapping).

Preferably, the mapping from the $2^8$ states to the rectangular dither should generate each of $2^8$ values of rectangular dither quantized to 8 bits once only. Also, in order that the pseudo-random generation process be unchanged by signal polarity inversion, the mapping should be such that if a given set of 8 digits goes to a dither value d, then the digits of opposite polarity go to -d; this assumes, as is preferred, that the mapping from $2^{16}$ to $2^8 \times 2^8$ state variables is preserved under polarity inversions, as is the case when this mapping is done by selecting 2 subsets of the 16 digits in the buffer. (We assume here that polarity inversion negates all digits, i.e. that analog zero is halfway between quantization levels.)
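One way of building a pair of 256-entry tables with the properties just described is sketched below. It is an illustration under stated assumptions, not a standardized table: the identity is used as the one-to-one function before splitting into even and odd digits, the 256 dither levels are taken as (2k - 255)/512 lsb, the table contents come from Python's seeded generator, and the function names are hypothetical.

```python
import random

def build_rect_table(seed):
    """Return 256 rectangular dither values (in lsb), one per 8-bit address.

    Each of the 256 levels (2k - 255)/512 appears exactly once, and the
    bitwise-complemented address maps to the negated value.
    """
    rng = random.Random(seed)
    magnitudes = [(2 * j + 1) / 512.0 for j in range(128)]
    rng.shuffle(magnitudes)
    table = [0.0] * 256
    # One representative address from each {address, complement} pair.
    pairs = [a for a in range(256) if a < (a ^ 0xFF)]
    for a, mag in zip(pairs, magnitudes):
        sign = rng.choice((-1.0, 1.0))
        table[a] = sign * mag
        table[a ^ 0xFF] = -sign * mag
    return table

def autodither_value(lsb_bits, table_even, table_odd):
    """Map 16 buffered lsb's to triangular dither via two 256-entry tables."""
    even = sum(b << i for i, b in enumerate(lsb_bits[0::2]))
    odd = sum(b << i for i, b in enumerate(lsb_bits[1::2]))
    return table_even[even] + table_odd[odd]

table_even = build_rect_table(seed=11)
table_odd = build_rect_table(seed=12)
bits = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0]
inverted = [1 - b for b in bits]         # polarity inversion negates all digits
d = autodither_value(bits, table_even, table_odd)
d_inv = autodither_value(inverted, table_even, table_odd)
```

Because each table is antisymmetric under complementing its 8-bit address, inverting all 16 buffered digits negates the resulting triangular dither value, which is the property needed for polarity inversion to leave the process unchanged.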

The subtractive dither decoding process used to recover a higher-resolution digital signal with more than (say) 16 bits from the encoded signal is shown schematically in Figure 8. This uses the same buffer memory and look-up table arrangement as in the encoder, but now operating on the last M samples of the input digital data stream, and the pseudo-random dither signal produced is now subtracted from the digital signal recovered from the data stream by reconstructing the quantization levels. This operates exactly as the subtractive dither scheme shown in Figure 3, but with the specialized scheme for generating and recovering the dither signal using the lsb's of the digital data stream.
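The encode and decode processes of Figures 7 and 8 can be sketched together. The example below is a simplified illustration, assuming M = 16, no noise-shaping, integer output samples in units of 1 lsb, and a hypothetical look-up table filled from Python's seeded generator rather than any standardized table.

```python
import math
import random

M = 16                                   # number of previous lsb's used
MASK = (1 << M) - 1

def make_table(seed):
    """Hypothetical look-up table: 2^16 triangular-pdf dither values in lsb."""
    rng = random.Random(seed)
    return [rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
            for _ in range(1 << M)]

def encode(x, table, state=0):
    """Figure 7: add dither derived from previous output lsb's, then quantize."""
    out = []
    for sample in x:
        y = round(sample + table[state])         # quantize to integer lsb steps
        out.append(y)
        state = ((state << 1) | (y & 1)) & MASK  # buffer the newest lsb
    return out

def decode(y, table, state=0):
    """Figure 8: regenerate the same dither from received samples, subtract it."""
    out = []
    for sample in y:
        out.append(sample - table[state])
        state = ((state << 1) | (sample & 1)) & MASK
    return out

table = make_table(seed=5)
x = [40.0 * math.sin(0.05 * n) for n in range(2000)]
coded = encode(x, table)
decoded = decode(coded, table)
worst = max(abs(a - b) for a, b in zip(decoded, x))

# A decoder joining mid-stream with an empty buffer resynchronizes after M samples.
late = decode(coded[500:], table)
worst_late = max(abs(a - b) for a, b in zip(late[M:], x[500 + M:]))
```

In this sketch the decoded error never exceeds half an lsb, and a decoder started part-way through the stream reaches the same accuracy once its buffer has filled with M received lsb's, without any separate synchronizing signal.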

In the numerical examples considered here, the digital signal may be quantized to 16 bits with digit 0 (most significant bit or msb) to digit 15 (least significant bit or lsb), and the rectangular dither signal operates at digits 16 to 23, so that the resulting triangular dither obtained by adding two rectangular dithers operates at digits 15 to 23. The resulting reconstructed signal is of 24 bit length.

Preferably, the encoding process will similarly operate on input digital signals expressed to well over 16 bit accuracy, e.g. 24 bits, even if the inherent noise in that signal is well above the 24 bit digital quantization noise floor. Thus, for example, if the digital signal is derived from an oversampling analog-to-digital converter (ADC) or as a result of earlier digital signal processing, the results ideally should not be "rounded off" below 24 bits before being encoded by the autodither encoding of Figure 6. By avoiding unnecessary rounding off or truncation before autodither encoding, any build up of perceived quantization noise or distortion is thereby minimized.

The invention is not confined to being implemented purely in the digital domain. In particular, the level reconstruction blocks in various implementations of the invention (for example as described with respect to decoders) may be realised as digital to analogue converters (DAC's) and the subtracted dither noise may be realised in the analogue domain or may also be converted to the analogue domain by a second DAC, so that the subtraction of dither from the reconstructed signal may be implemented in the analogue domain.

"Look-up Tables"

Although we have described the autodither encoding and decoding in terms of a look-up table, any non-linear mapping approximating the required pseudo-random and polarity-inversion properties may be used, including methods of deriving such a mapping by means of a length L pseudo-random logic sequence generator, of the kind well known in the mathematical and signal processing literature for generating pseudo-random number sequences. One problem is the possibility, especially for a constant-level input signal such as a zero "digital black" silence signal, that the look-up table or one-to-one logic mapping may "lock up" into a non-random dither output signal, and a proposed look-up table needs to be tested to check that it avoids subjective correlations between low-level audio signals and the generated error signal across the autodither encode/decode process. In experiments by one of the inventors (PGC), it has been found that such correlation effects are very low for look-up tables generated by a purely random procedure, so that in practice this problem may not be serious. Nevertheless, in any system to be standardized for universal use, it is advisable to test extensively a proposed look-up table or deterministic pseudo-random one-to-one mapping for generating dither noise to ensure that it is subjectively free of such correlation effects. In cases where such low-level correlations between the signal and the error noise exist, or where audibly non-random "lock-up" effects occur, they can be broken up by adding a further low-level random noise signal, more than, say, 12 dB down from the dither noise, at the input of the autodither encoding process.

Although in the above we have suggested that the look-up table (or the substitute of a pseudo-random mapping implemented by logic means) should operate on the lsb's of the previous signal samples, it is clear that it could operate, say, on the n least significant bits for n = 2 or more, or on the least significant digit of the digital samples expressed to base b for a number b greater than two. (This least significant digit is simply the remainder term if the number representing the sample is divided by b.) Such use of extra information from each sample will generally increase the number of states for a given number of samples in the buffer, and so possibly complicate the look-up table or logic, but can be used to provide a higher approximation to chaotic or random behaviour of the dither noise.
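
By way of illustration only, a history built from base-b least significant digits might be updated as in the following Python sketch. The function names push_digit and push_bits, and the default history length of 16 samples, are assumptions made for the example.

```python
def push_digit(history, word, base=2, length=16):
    """Append the least significant base-`base` digit of `word` to the history."""
    digit = word % base                 # remainder on division by the base
    return (history * base + digit) % (base ** length)

def push_bits(history, word, n=2, length=16):
    """Append the n least significant bits of `word` to the history."""
    return push_digit(history, word, base=1 << n, length=length)
```

With these assumptions the number of possible histories is base ** length, which shows how quickly a directly addressed table grows as more information is taken from each sample.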

Autodither with Noise-shaping

Noise-shaping can be added to autodither systems in exactly the same manner as already described in connection with Figures 4 to 6 for the general subtractively dithered case. Figure 9 shows the use of noise-shaping with autodither encoding analogous to that shown in Figure 4. The only difference is that here, the unshaped dither noise is derived from the digits of the previous samples of the encoded quantized signal via a buffer memory and look-up table (or a pseudo-random logic means) as described above. If it is desired that the noise shaping of the quantizer be different, the feedback filter in Figure 9 can be replaced by another filter H'(z^-1) provided that the dither noise is filtered by the filter of Eq. (2) before being fed to the addition node, again as described earlier in connection with Figure 4.

Figure 10 is an alternative means, equivalent to that of Figure 9, of encoding autodithered signals with noise-shaping. This is the version of Figure 5 using the buffer and look-up table (or a pseudo-random logic means) operating from the digits of the previous sample outputs of the quantizer to derive the dither noise. Again, the feedback filter H may be replaced by an alternative H' to alter the quantization noise spectrum.

Signals quantized and encoded by any of the autodither methods described in connection with Figures 9 and 10 may be decoded as shown in Figure 11. This simply derives the dither noise by means of a buffer for the digits of previous samples, followed by a look-up table (or pseudo-random logic means) having identical performance to that used in the encoder, and then noise-shapes the dither with the filter 1 - H(z^-1) of equation (1) before subtracting it from the signal reconstructed from or represented by the input digits.

By this means, the noise-shaped dither noise added in the encoding process is removed again in the decoding process, leaving only noise-shaped quantization noise which, before shaping, has rectangular pdf of ±½ lsb with statistically independent samples that are also uncorrelated with the input signal to the encoder. This gives a noise level that is 4.8 dB below that of a noise-shaped nonsubtractive system with triangular dither, and 6 dB below that of a noise-shaped nonsubtractive system with Gaussian dither. Additionally, as noted earlier, noise-shaping itself can give a perceptual improvement of around 18 dB as compared to a system with spectrally flat noise, using the published noise-shaping filters described in refs. [2] or [3]. This leads to a perceptual improvement of around 24 dB as compared with properly dithered conventional quantization using nonsubtractive Gaussian noise - i.e. a subjective improvement in performance equivalent to 4 additional bits of resolution.
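
The figures quoted above can be checked with a few lines of arithmetic, sketched here in Python. The sketch assumes the usual noise power of 1/12 lsb^2 for a rectangular pdf one lsb wide, and takes the 18 dB and 6 dB figures directly from the text; the variable names are illustrative.

```python
import math

def db(power_ratio):
    """Express a power ratio in decibels."""
    return 10 * math.log10(power_ratio)

N = 1 / 12                          # rectangular quantization noise power, in lsb^2
triangular = 2 * N                  # sum of two independent rectangular dithers
nonsubtractive = N + triangular     # dither left in place at playback
subtractive = N                     # dither removed at playback

advantage_db = db(nonsubtractive / subtractive)   # about 4.8 dB
total_db = 18 + 6                   # noise-shaping gain plus the Gaussian figure, from the text
bits = total_db / (20 * math.log10(2))            # about 6.02 dB per bit
```

On these assumptions the triangular-dither advantage is 10 log10(3), and 24 dB corresponds to very nearly 4 bits.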

Although the embodiments so far described are particularly suitable for CD-quality audio, the same kind of autodither technique can be used for other kinds of waveform coding using subtractive dither and noise-shaping. In particular, it can be used for quantizing the subbands of low-bit-rate subband encoding systems, such as aptX-100 or MUSICAM, with subtractive dither (and possible noise-shaping) in each subband that has a signal present, without the need for any synchronizing clock. In this application, the use of rectangular pdf dither, rather than triangular, may give better compatibility with nonsubtractive decoders, since lower noise energy may here be more important than modulation noise effects. Autodither can also be used with companding systems of the NICAM type. This method may also be used for subtractively dithered video signals, such as described by Roberts [5], and again some degree of noise-shaping can improve perceptual results (although the best noise-shaping characteristic depends on viewing distance - [6]). In this application also, rectangular pdf dither may be preferable to triangular for maximum compatibility with nonsubtractive decoding.

In the case that autodither is used with two-dimensional images, it is preferred that both the buffer memory and the noise-shaping be two-dimensional. This means that the buffer memory should store the lsb's of data about recent previous image samples in both (say) the horizontal and vertical image directions, and that the noise-shaping filter should be of the form H(z_1^-1, z_2^-1), where z_1^-1 is a one-sample delay in (say) the horizontal direction and z_2^-1 is a one-sample delay in the vertical direction, and where H has no constant term. Such two-dimensional noise-shaping is well known in the image processing literature. For moving or video images, the buffer memory and noise shaping can be three-dimensional, also involving memory in the time dimension involving previously presented images.

By using autodither and noise-shaping with still or moving images, a significant improvement can be obtained in image quality and perceived noise for any given number of bits, as in the audio case. In particular, the subjective "generation losses" due to several stages of image manipulation can be significantly reduced.

OPERATIONAL CONSIDERATIONS

Imperfect Channels

The above has so far assumed that the digits received by the user of a decoder are an exact replica of the digits leaving the encoder. While the CD medium is remarkably reliable, with relatively few uncorrected errors, this cannot be guaranteed, and it must also be recognised that other media such as digital audio tape typically have higher rates of uncorrected errors. It is important to design the autodither system so that the effect of such errors is not too audible. Typically, when there is an uncorrectable error, existing systems are designed to go into an error concealment mode, which loses some samples and replaces them with "interpolated" estimates of the missing samples obtained from adjacent samples that are not in error. Such erroneous samples will have a 50% chance of having the wrong lsb's, which will affect the dither recovery. The effect of such errors on the recovered dither before noise shaping will be of limited duration, typically 16 samples or whatever the length of the buffer memory used is, after the occurrence of an error. During this error duration, the noise energy N will be increased to 5N (where N is the noise that would have been heard in the absence of error, 2N is the noise energy that is no longer being subtracted, and 2N is the new noise that erroneously is being subtracted).

This noise error energy after the sample error is noise shaped, and so should not be particularly serious as compared with the erroneous sample itself. However, the erroneous sample itself multiplies the noise-shaped nonsubtractive noise signal by a one-sample duration delta function impulse response. This is more serious than it might seem, since the noise-shaping itself increases the unweighted power of the noise, which can be as much as 38 dB higher than the perceptually weighted noise power, meaning that a sharp click may be heard for an isolated sample error, especially with low wanted signals.

This problem can be minimised only by limiting the increase in unweighted noise power produced by the noise-shaping. Such limitation of the unweighted power restricts the subjective improvement in S/N given by noise-shaping as shown in refs. [2] and [3].

Occasional small "clicks" may be acceptable with a medium such as CD where the error rate is low, and the problem can be mitigated by using an interpolation procedure itself incorporating noise shaping or spectral weighting in playback.

More serious is the error mode in the case of large data losses, particularly prevalent with DAT tape when one head is clogged, where every alternate sample is lost. This has the effect of aliasing the noise spectrum about the frequency (1/4)F, where F is the sampling frequency, which brings the boosted noise near the upper bandlimit down to lower audio frequencies at which it becomes much more audible. This problem can be reduced by altering the noise-shaping around (1/2)F - 2.5 kHz so as to reduce the noise energy then aliased down to the psychoacoustically critical region around 3.5 kHz.

It is emphasised that this problem is not specific to autodither, or even subtractively dithered systems, but is common to all noise-shaped dithering systems. In practice, these problems have generally not proved too troublesome, but it must be recognized that noise-shaped dither can degrade the performance of error concealment strategies. One strategy that may be useful to cope with losses of alternate samples, particularly with DAT tape, is to use a look up table to generate dither that operates only on every second sample of the history of the signal, so that even if all odd samples are lost, the correct dither is generated for even samples, and vice-versa.

Splices

A prime advantage of autodither is that it recovers quickly in the presence of splices between signals, typically taking the duration of the buffer to recover the correct dither signal for subsequent samples.

Although dither recovery is quick, typically 16 samples, and although noise is only increased to 5N during this recovery period (except in the last few samples where it can be shown to reduce slightly) , a sharp or so called "butt" edit is generally not recommended, since the splicing of the two noise signals can produce a click sideband effect that may be audible, since the unweighted noise power of noise-shaped noise is increased so much.

It is generally preferable to use a cross-fade splice, with a cross-fade sufficiently slow that modulation sidebands from the frequencies near (1/2)F do not move into lower frequency regions at which they would be much more audible. In practice, typical cross-fade times widely used in audio are sufficient to prevent modulation sidebands of the noise from becoming audible.

The original signals may be used before and after the cross-fade, so that no redithering or requantization is necessary. During the cross-fade, however, it is best if the original signals are autodither decoded to a higher word length (e.g. 24 bits) , cross-faded at that word length and then re-coded. This re-encoding will give a noise level of 2N during the cross fade (depending on the precise cross-fade law) . Figure 12 shows a schematic of the basics of cross-fade splicing for autodithered signals. However, the transition at the end of the cross-fade also needs careful handling, since a sudden switch at the end of the transition from the noise-shaped autodithered signal to the original post-fade autodithered signal can cause a modulation sideband "click".

There are several different strategies for avoiding this "click" problem with noise shaping around a splice. All of these strategies involve modified quantization around the ends of the splice, with a somewhat increased noise level near those points.

An alternative, and less preferred, approach to splices is to continue autodither decoding the original post-splice signal and to re-encode it using noise-shaped autodither based on the new lsbs in the buffer memory. This strategy is less preferred since it continues to cause a generation loss (3 dB in the first generation) even well after the splice is over.

Signal Processing

With the preferred form of autodither, certain signal processing operations need no special precautions, including swapping of stereo channels, polarity inversion, replacement of one stereo channel by another autodithered signal (including the correction of an interchannel polarity error), and "time slipping", i.e. the incorporation of a time delay or advance into one or both stereo channels. However, other forms of signal processing, including simple gain changes, require that the signals first be autodither decoded, resulting in a longer word length that typically may be 24 bits, followed by the signal processing at the increased word length, followed by a new stage of autodither encoding when it is required to reduce the word length back down to say 16 bits for storage, rerecording or transmission elsewhere.

While there is a generation loss for each extra stage of autodither decoding and encoding used in the signal processing chain, this loss is subjectively lower than when autodither is not used, due to the lower noise floor of autodither. The extra 4 bit advantage of noise-shaped autodither over unshaped nonsubtractively dithered systems in current use means that 256 generations of re-encoding are required to equal the noise build up of just one generation of processing with conventional unshaped dither. In practice, in all but the very most critical applications, the noise penalties of a decoding/recording cycle are relatively small in terms of most current audio applications.
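
The figure of 256 generations follows from simple arithmetic, sketched here in Python. The sketch assumes that each generation adds an equal amount of uncorrelated noise power, so that noise power grows in proportion to the number of generations; the variable names are illustrative.

```python
import math

extra_bits = 4                               # advantage quoted in the text
power_ratio = (2 ** extra_bits) ** 2         # amplitude ratio of 2^4, squared to give power
generations = power_ratio                    # equal uncorrelated noise power per generation (assumed)
buildup_db = 10 * math.log10(generations)    # noise rise after that many generations
```

On this assumption 256 generations raise the noise by about 24 dB, which matches the 4-bit advantage.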

Flagging Autodither

While noise-shaped autodither is, by design, backward compatible with playback without subtractive dither, giving only a 4.8 dB weighted noise penalty for the case of triangular dither noise, there is the question of forward compatibility, i.e. of the response of an autodithered decoder to a signal that is not autodither encoded. In this case, the autodither decoder will subtract a spurious noise-shaped dither signal from the original signal, thereby adding noise. However, this added noise is at a weighted level much lower than the noise found in current nonsubtractively dithered systems, and will not in any case increase the perceived noise level by more than 3 dB even in the worst case. If a standard is adopted for autodither, it is likely that the results of the best converters will be encoded with autodither in order to keep the noise down in any case. Thus in most practical cases, autodither should prove to be practically compatible with most older material, the added noise being in any case well below the weighted noise level added by the imperfections in most existing DAC's (digital-to-analog converters) . However, the potential for degradation exists, and in the case of existing signals that are not properly dithered, the reconstructed dither noise may not be random due to systematic patterns in the lsb's, especially at low signal levels, so that the subtracted "dither noise" may have undesirable perceptual properties.

For this reason, it is also desirable to convey an autodither "flag" in the digital data stream to indicate when the autodither should be subtracted. Where this flag should be placed in different digital media is still under study. For example, in CD or DAT media, the flag could be placed in the sub-code data stream, and similarly room can be found for the flag in the AES/EBU or MADI data stream. The data rate for this flag need not be high, since no synchronization is involved, and a temporary error in the flag of a few milliseconds is relatively unimportant. One flag per CD frame, for example, should prove adequate. This flag does not have to be used by an autodither decoder, as just noted, but is preferably used to minimise the potential degradation.

It is in principle possible to determine whether a signal is autodithered purely from its average statistical properties over a period of time, simply by comparing the energy spectrum of the signal with and without autodither subtracted. If the energy of the spectrum in the least energetic frequency bands is systematically lower, by an amount equal to the energy in the noise-shaped autodither noise, then it is likely that the signal is indeed autodither encoded. The problem with this test is that the noise energy averaged over finite intervals is subject to statistical fluctuations, so that the determination of autodither on this basis can only be done with an imperfect confidence level of less than 100%. This confidence level will in general be low in the case when the signal has large energies in all frequency bands, but will be high in the case when some or all audio frequency bands have low spectral energy.

Thus it is in principle possible for an autodither decoder to determine from a comparison of the spectral statistics with and without autodither decoding, whether the signal is in fact autodither encoded, and to subtract the noise-shaped dither signal when it is. Other equivalent methods of determining the presence of autodither encoding from signal statistics exist, such as measuring the cross-correlation of the regenerated noise-shaped dither noise and the signal, and comparing this to the autocorrelation of the shaped dither noise - the two should be similar over a sufficiently long measurement time interval. In the Fourier domain, this is equivalent to the spectrum of the noise-shaped regenerated dither signal being the same as its cross spectrum with the signal. The equality will be easiest to determine at frequencies at which the signal's spectral energy is smallest.

The relative complication and statistical unreliability of determining the presence of the autodither from signal statistics alone means that the use of a flag is a preferred solution provided that such a flag can be handled without operational problems.

"Look up table" blocks

The construction of the "look up table" block in the algorithms in this description and the accompanying figures will now be described in more detail. Constructing a "look up table" block that achieves a high degree of pseudo-randomness for the dither requires a degree of empirical experimentation, but the following describes methods of construction which give results among which it is easy to find, by trial and error, algorithms achieving a high degree of pseudo-randomness.

A conceptually simple form of "look up table" block is simply to use a look-up table in which the history h of the LSB's of previous samples stored in the buffer memory is used as an address in a look-up table memory, at which is stored a value of pseudo-random dither corresponding to that history h of LSB's. Thus the "look up table" block in this case comprises a memory addressed by h, and achieves a mapping f : h -> f(h) from possible histories h to dither signal values f(h). The function f mapping histories h of the LSB's to dither signal values is termed the "scrambling function". For a dither signal with a uniform probability distribution function (pdf), f may be chosen by using a random number generator with uniform pdf (such as is available in the libraries of many computer programs and languages) and storing successive random numbers at addresses corresponding to successive values of the history h regarded as a digital number. For a triangular pdf dither noise, one may store the sum of two successive random numbers at addresses corresponding to successive values of the history h regarded as a digital number. It has been found in most cases tried that this "Monte Carlo" method of constructing a look-up table gives a high degree of pseudo-randomness in the dither.
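
By way of illustration only, the "Monte Carlo" construction might be sketched in Python as follows. The function name monte_carlo_table, the 8-bit dither word and the use of a seeded generator are assumptions made for the example.

```python
import random

def monte_carlo_table(history_bits, dither_bits=8, triangular=False, seed=0):
    """Fill a table, address by address, with successive random dither values."""
    rng = random.Random(seed)
    half = 1 << (dither_bits - 1)

    def uniform():
        # One uniform pdf value, roughly symmetric about zero.
        return rng.randrange(-half, half)

    table = []
    for _h in range(1 << history_bits):      # successive values of the history h
        if triangular:
            table.append(uniform() + uniform())   # sum of two successive random numbers
        else:
            table.append(uniform())
    return table
```

A fixed seed is used so that an encoder and a decoder built from the same sketch would hold identical tables. Any table produced this way would still need the listening and correlation tests discussed earlier before being adopted.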

However, especially when the number of possible histories h is large, such look-up tables can be very expensive in memory, and in such cases it may be preferred to achieve the scrambling function f of the "look up table" block not by means of a look-up table based on reading the contents of addresses of a memory, but by means of a pseudo-random logic technique, and a method of doing this is to use linear or polynomial congruence methods, as now described. As is well known, a uniform pdf pseudo-random sequence y_i of words of length N bits can be generated by a congruence relationship

y_{i+1} = a y_i + b (mod 2^N),

where a and b are suitably chosen integers (which if necessary can be chosen by trial and error to ensure good pseudo-randomness). In a similar way, one can generate pseudo-random sequences by polynomial congruence relationships of the form

y_{i+1} = a_0 + a_1 y_i + a_2 y_i^2 + ... + a_n y_i^n (mod 2^N),

where the polynomial coefficients a_0 to a_n are integers chosen (if necessary by trial and error) to ensure pseudo-randomness.

In our application, a similar congruence method can be applied to generate a pseudo-random dither from the word h representing the least significant digits of the previous N samples. A uniform pdf dither signal d_i can be generated by a linear congruence relationship

d = a h + b (mod 2^N),

or by a polynomial congruence relationship

d = a_0 + a_1 h + a_2 h^2 + ... + a_n h^n (mod 2^N),

where the coefficients of the linear or polynomial expressions are selected by trial and error experimentation to ensure pseudo-randomness. In practice, one may choose to form a uniform pdf dither not by taking the whole word generated by the linear or polynomial congruence relationship, but by selecting a subset of N_1 < N out of its N digits. For example, if N = 24, one might select the least or most significant 12 digits to form d_i. The problem of generating a triangular pdf dither d can be solved by generating two uniform pdf dithers d_1 and d_2 from h by two different polynomial congruence relationships and adding the results together, thus

d_1 = a_0 + a_1 h + a_2 h^2 + ... + a_n h^n (mod 2^N),
d_2 = b_0 + b_1 h + b_2 h^2 + ... + b_n h^n (mod 2^N),

to form d = d_1 + d_2. Again d_1 and d_2 can be formed by selecting a subset of N_1 < N out of the N digits formed by the polynomial congruence relationships.
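
By way of illustration only, a quadratic congruence generator of this kind might be sketched in Python as follows. The word length N = 24 and the choice of the most significant 12 digits follow the example in the text; the coefficient values A and B are arbitrary placeholders that have not been tested for pseudo-randomness, and the function names are hypothetical.

```python
N = 24          # word length of the congruence arithmetic (example from the text)
N1 = 12         # number of digits kept for each uniform dither (example from the text)

def poly_congruence(h, coeffs, n_bits=N):
    """Evaluate a0 + a1*h + a2*h^2 + ... modulo 2^n_bits by Horner's rule."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * h + a) % (1 << n_bits)
    return acc

def uniform_dither(h, coeffs, n_bits=N, kept=N1):
    """Keep the most significant `kept` digits and centre them about zero."""
    top = poly_congruence(h, coeffs, n_bits) >> (n_bits - kept)
    return top - (1 << (kept - 1))

# Placeholder coefficients (a0, a1, a2) and (b0, b1, b2); the text leaves
# their selection to trial and error.
A = (12345, 1664525, 69069)
B = (54321, 214013, 1103515)

def triangular_dither(h):
    """Sum of two uniform dithers from two different congruence relationships."""
    return uniform_dither(h, A) + uniform_dither(h, B)
```

Such a generator replaces a table of 2^N entries by a handful of multiplications per sample, at the cost of the trial and error needed to find coefficients that behave acceptably.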

In general, as in this example or the example earlier in this description using two 8-bit look-up tables, triangular pdf dither may be generated as in Figure 13, using a different look-up table or polynomial congruence relation to generate each of two uniform pdf dither signals that are substantially statistically independent of each other, and then adding the resulting uniform or rectangular pdf dithers.

We have found in practice that quadratic congruence relationships with n = 2 often work well in deriving a pseudo-random dither with triangular pdf. By replacing 2 by B for an integer base B in the above congruence formulas, one can also generate uniform pdf pseudo-random words of length N to base B, so that the methods here are not confined to least significant digits to base 2, but may also be applied to residues to base B.

It is found in practice that polynomial congruence techniques work more effectively in pseudo-randomising the dither when the autodither algorithm incorporates noise shaping. This can be understood by noting that autodither without noise shaping generates a pseudo-random noise with an algorithm containing only N digits of state variables in its feedback loop, giving only 2^N possible states. However, with a noise shaping loop, the feedback in the noise shaping loop generally has many more state variables, so that the overall feedback loop has a much larger number of state variables, meaning that the likely repetition length of the pseudo-random sequence will generally be much larger and much more likely not to have obviously non-random short term patterns. The higher the order of noise shaping, and the higher the word-length of the arithmetic used for the noise shaping, the larger the number of the state variables, and the more random the dither is likely to be.

Polarity Inversion

Here we describe in more detail methods of ensuring that the dither signal is polarity inverted when the signal word is polarity inverted, so that polarity inversions of the digital word do not affect the effectiveness of the autodither process. This problem is complicated by the fact that in the digital domain, there are two conventions, negation and complementation, for polarity inversion, and a method that ensures the correct inversion behaviour of autodither for one convention will in general not work for the other. Although it is desirable to standardise the convention used in digital systems, we also describe methods that will cope for most signals with both types of polarity inversion occurring in cascade.

If a signal processing device inverts polarity by using a NEGATE operator x -> -x on a digital word x, the LSB is unchanged by this process, hence a simple autodither encode/decode process that uses the LSB only will generate the same dither signal in the decoder as if no inversion had taken place. However, in order to cancel the encoded dither noise, the decoded dither signal should be inverted in this case.

At least two ways exist to tackle this problem. One is to use buried data so that the decoder can explicitly recognise when polarity inversion has taken place. For example, one can commandeer every 50th LSB and force this to be the same as the MSB (most significant bit) of the same sample. Except when digital 0 or digital 1 is being transmitted, this relationship is broken by polarity inversion. (We adopt the convention here that the digital words are in units of LSB's, so that 1 represents one LSB.)

It will be noted that when samples are modified in order to bury data, this modification must take place inside the autodither feedback loop, so that the autodither decoder will recover the correct dither in reproduction.

The other way is to modify the autodither encoding and decoding algorithm so that the dither signal will automatically reverse polarity when the signal is inverted. One such modified dither signal would be the standard autodither signal as discussed previously, multiplied by the signum of the most recently transmitted non-zero signal. The corresponding modified encoder is shown in fig. 16.

Here the transmitted signal is compared to zero; if non-zero, it forces the state of a flip-flop according to whether it is positive or negative, and this causes the dither noise signal as obtained previously (e.g. with reference to fig. 7) to be multiplied by either +1 or -1. (We assume here that the dither noise takes values symmetrically distributed about zero.) The corresponding decoder will likewise multiply the dither noise by +1 or -1, according to the sign of the most recent non-zero signal word. It is evident that polarity inversion in the transmission chain will thus cause the reconstituted dither also to be inverted. It will be noted that:

1. In interpreting fig. 16, and indeed the earlier figures, it is assumed that a "latch" is included, so that the feedback loops are not delay-free. Typically, the "buffer memory" and "flip-flop" will each be clocked, so that the current dither noise sample will be dependent only on previous transmitted samples.

2. The addition of the polarity-inversion modification will, in general, affect the randomness of the dither signal, although it is difficult to predict whether a given look-up table will yield better or worse results with this modification. A degree of trial and error is required in choosing a look-up table for use with this modification to ensure adequate pseudo-randomness.
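
By way of illustration only, the regeneration of a sign-tracked dither from a stream of transmitted words might be sketched in Python as follows. The sketch assumes a short history of M = 8 LSB's, a randomly filled table with values symmetric about zero, and a flip-flop that starts in the positive state; the names TABLE and signed_dither are hypothetical and the sketch is not a transcription of fig. 16.

```python
import random

M = 8                                   # history length, kept short for illustration
_rng = random.Random(5)
TABLE = [_rng.randrange(-127, 128) for _ in range(1 << M)]   # symmetric about zero

def signed_dither(words):
    """Regenerate the dither for a stream of transmitted words."""
    history, sign, out = 0, 1, []
    for w in words:
        # The current dither depends only on previously transmitted words.
        out.append(sign * TABLE[history])
        history = ((history << 1) | (w & 1)) & ((1 << M) - 1)
        if w != 0:                      # flip-flop follows the last non-zero word
            sign = 1 if w > 0 else -1
    return out
```

Negating every word leaves the LSB history unchanged but reverses the flip-flop, so that in this sketch the regenerated dither is inverted from the sample following the first non-zero word onwards.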

There are various other ways of generating a dither which preserves polarity inversion, especially if the look-up table is responsive to the last few LSB's instead of the LSB only. (E.g. one might replace the scrambling function f : h -> f(h) of the look-up table block by h -> f(h)-f(-h).)

However, we consider that the algorithm presented above is one of the more robust ones, taking into account the possibility of small additive offsets as a result of combined negation and complementation, as discussed below.

Sometimes, when "polarity inversion" is accomplished by hardware, this complements the digital word rather than negating it. Now, comp(x) = -x-1, where comp(x) is the complement of x. Complementation of the transmitted word implies complementation of the LSB history h in the buffer: h -> comp(h) = -h-1. Hence if the look-up table scrambling function f is arranged to provide a value which is invariant under this transformation, complementation and negation of the transmitted signal will be equally catered for. In particular, if the look-up table scrambling function f is a function of h x (-h-1) this property will be satisfied.

Complementation followed by negation is equivalent to adding 1 to the signal, hence the comments about "small offsets" above. A dither noise responsive to h x (-h-1) , where h is the LSB history, with multiplication by ±1 dependent on the sign of the most recent non-zero signal, will give correct results even when the signal has been through several negations and/or complementations, except when the transmitted signal is very close to zero (under which conditions, the received signal's sign may be changed by the small offset) .
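
The invariance of h x (-h-1) under complementation can be checked directly, as in the following Python sketch. The 16-bit history length and the function names comp and key are assumptions made for the example.

```python
N = 16   # number of digits in the LSB history (assumed)

def comp(x, n_bits=N):
    """Complement of an n-bit word, equal to -x-1 modulo 2^n."""
    return (-x - 1) % (1 << n_bits)

def key(h, n_bits=N):
    """The quantity h x (-h-1) modulo 2^n, on which the scrambling function may depend."""
    return (h * (-h - 1)) % (1 << n_bits)
```

Substituting -h-1 for h turns the product h x (-h-1) into (-h-1) x h, which is the same product with its factors exchanged, so any scrambling function of key(h) returns the same value for h and for comp(h).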

This problem of what happens after several negations and complementations when the transmitted signal is close to zero can, if desired, be transferred to a limited range of higher signal values by making the output of the flip-flop in fig. 16 responsive not to the most recent non-zero word, but to the most recent non-zero word whose magnitude is above a predetermined threshold value larger than any likely value of the offset. This may be achieved by the comparator being preceded by a rounding operation that discards the last n bits of the incoming word, with perhaps n = 3, 4 or 5, so that any signal below the level of 2^(n-1) will be seen by the comparator as a zero signal and so will not trigger the flip-flop.
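The thresholded polarity detection can be sketched in Python as follows. This is an illustrative model only: the function name `polarity_track`, the initial state of +1 and the particular round-to-nearest rule are assumptions, not details taken from fig. 16.

```python
def polarity_track(samples, n=4):
    """Model of a flip-flop that follows the sign of the most recent sample
    that is still non-zero after rounding away its last n bits.

    Returns the flip-flop state (+1 or -1) after each sample.
    """
    state = 1  # assumed initial state
    states = []
    for x in samples:
        # Round to nearest and discard the last n bits (assumed rounding rule).
        rounded = (x + (1 << (n - 1))) >> n
        if rounded != 0:
            state = 1 if rounded > 0 else -1
        states.append(state)
    return states


signal = [100, 3, -5, -100, 2]
shifted = [x + 1 for x in signal]  # same signal with a small additive offset

states = polarity_track(signal)
states_shifted = polarity_track(shifted)
```

With n = 4, samples of magnitude below 8 are seen as zero, so the small samples 3, -5 and 2 do not change the state, and the offset of 1 leaves the tracked polarity sequence unchanged for this input.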

When a polynomial congruence method is used to implement the "look-up table" block, one may ensure that the generated signals d or d₁ and d₂ above are invariant under negation by using polynomials with no odd power terms, and that they negate under negation by using, in the congruence relationships, polynomials with no even power terms.
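The parity argument can be illustrated with a short Python sketch. The word length N = 16 and all coefficients below are arbitrary placeholders chosen for illustration; they are not values from the disclosure and are not claimed to give good pseudo-random statistics.

```python
N = 16          # assumed word length of the congruence, in bits
MOD = 1 << N


def even_poly(h: int) -> int:
    """Polynomial with no odd power terms (a constant term is allowed)."""
    return (40503 * h**2 + 12345 * h**4 + 777) % MOD


def odd_poly(h: int) -> int:
    """Polynomial with no even power terms (hence no constant term)."""
    return (40503 * h + 12345 * h**3) % MOD


# Invariant under negation of h.
even_invariant = all(even_poly(h) == even_poly(-h) for h in range(-200, 200))

# Negates under negation of h: the two outputs sum to zero modulo 2**N.
odd_negates = all(
    (odd_poly(h) + odd_poly(-h)) % MOD == 0 for h in range(-200, 200)
)
```

The odd case is tested modulo 2^N because the outputs are unsigned residues; interpreted as N-bit two's-complement words they are negatives of one another.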

One may ensure that the generated signals d or d₁ and d₂ above are invariant under complementation by using, in the congruence relationships, polynomials in h × (-h-1), by for example having d₁ = a₁h(-h-1) + b₁ (mod 2^N) and d₂ = a₂h(-h-1) + b₂ (mod 2^N) in the generation of a triangular pdf dither d = d₁ + d₂, where the coefficients aᵢ and bᵢ are selected, if necessary by trial and error experimentation, to give a pseudo-random dither. As noted earlier, this may be combined with the strategy shown with reference to fig. 16 to ensure that the dither remains correct both under complementation and negation.
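A minimal Python sketch of this complement-invariant congruence follows. The word length N = 8 and the coefficients a₁, b₁, a₂, b₂ are hypothetical placeholders; as the text notes, suitable values would have to be found by experiment, and no claim is made about the statistics these particular values produce.

```python
N = 8           # assumed word length of the congruence, in bits
MOD = 1 << N


def u(h: int) -> int:
    """The complement-invariant quantity h * (-h - 1), reduced modulo 2**N."""
    return (h * (-h - 1)) % MOD


def dither(h: int, a1: int = 37, b1: int = 11, a2: int = 101, b2: int = 53) -> int:
    """d = d1 + d2 with d_i = a_i * h * (-h - 1) + b_i (mod 2**N).

    The coefficient defaults are illustrative placeholders only.
    """
    d1 = (a1 * u(h) + b1) % MOD
    d2 = (a2 * u(h) + b2) % MOD
    return d1 + d2


# Complementing the LSB history h leaves the dither value unchanged.
complement_invariant = all(dither(h) == dither(~h) for h in range(MOD))
```

Here `~h` is Python's bitwise complement, equal to -h-1, so the check confirms the invariance stated above for every N-bit history word.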

Dither Probability Distribution Functions

The invention is preferably used with pseudo-random dither noise having statistical properties known substantially to eliminate the low-level non-linear distortion produced by a quantizer truncation or rounding operation. It is known [7] [8] that a necessary and sufficient condition for the elimination of non-linear distortion by a dither signal, both for non-subtractive and subtractive reproduction, is that the dither noise should have a pdf that, for each sample, is the convolution of any pdf function with a uniform pdf having a peak-to-peak argument range substantially equal to one quantization step size or one least significant digit of the lower resolution signal.

Dither signals with such convoluted pdf's may be produced by adding a first pseudo-random dither signal with any statistics to a second statistically independent pseudo-random dither signal with uniform pdf, the second dither signal's peak-to-peak range of values being substantially equal to one quantization step size or one least significant digit of said lower resolution signal. The second dither signal may easily be produced by the linear or polynomial congruence method described above.

If, in addition to no distortion, it is desired that the non-subtractive reproduction has no modulation noise, then this may be achieved using dither with appropriate statistics as described in references [7] [8] [9]. A sufficient condition for this is that the dither noise should have a pdf that, for each sample, is the convolution of any pdf function with a triangular pdf having a peak-to-peak argument range substantially equal to twice the quantization step size or twice the least significant digit of the lower resolution signal, where the second dither signal may have statistically independent samples or be of the form known as high-pass dither [7] [8] [9] or be any other triangular dither whose cross-correlation between successive samples is time-invariant. Dither signals with such convoluted pdf's may be produced by adding a first pseudo-random dither signal with any statistics to a second statistically independent pseudo-random dither signal with triangular pdf having a peak-to-peak argument range substantially equal to twice the quantization step size or twice the least significant digit of the lower resolution signal. The second dither signal may easily be produced by adding two statistically independent signals with uniform pdf's derived by the linear or polynomial congruence method.
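That the sum of two independent uniform signals has a triangular pdf can be verified by exhaustive enumeration. The Python sketch below is illustrative only; the function name `sum_histogram` and the choice of four discrete levels per uniform signal are assumptions.

```python
from collections import Counter
from itertools import product


def sum_histogram(levels: int) -> Counter:
    """Count how often each sum a + b occurs when a and b range independently
    and uniformly over the integers 0 .. levels - 1."""
    return Counter(a + b for a, b in product(range(levels), repeat=2))


hist = sum_histogram(4)
counts = [hist[s] for s in range(7)]  # sums run from 0 to 6
```

The counts rise linearly to a peak at the centre and fall linearly again (1, 2, 3, 4, 3, 2, 1), and the sum spans twice the range of either uniform signal, matching the triangular pdf of peak-to-peak width two quantization steps described above.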

Noise shaping architectures

Although the signal processing architectures for noise shaping shown in the figures are the most widely used, there are many other known architectures for noise shaping around a dithered quantizer that give equivalent results, and reference [9] provides a detailed review. Any of these alternative architectures may be used with the invention, provided the dither noise continues to be generated only from previous samples of the lower resolution digital signal output from the quantizer.

References

[1] M.A. Gerzon & P.G. Craven, "Optimal Noise Shaping and Dither of Digital Signals", Preprint 2822 of the 87th Audio Engineering Society Convention, New York (1989 Oct. 18-21)

[2] S.P. Lipshitz, J. Vanderkooy & R.A. Wannamaker, "Minimally Audible Noise Shaping", J. Audio Eng. Soc, vol. 39 no. 11, pp. 836-852 (1991 Nov.). Corrections to be published in JAES

[3] J.R. Stuart & R.J. Wilson, "A Search for Efficient Dither for DSP Applications", Preprint 3334 of the 92nd Audio Engineering Society Convention, Vienna (1992 March)

[4] S.P. Lipshitz, R.A. Wannamaker & J. Vanderkooy, "Quantization and Dither: A Theoretical Survey", J. Audio Eng. Soc, vol. 40 no. 5, pp. 355-375 (1992 May)

[5] L.G. Roberts, "Picture Coding Using Pseudo-Random Noise", IRE Trans. Inform. Theory, vol. IT-18, pp. 145-154 (1962 Feb.)

[6] R.J. Clarke, "Transform Coding of Images", Academic Press, 1985

[7] S.P. Lipshitz, R.A. Wannamaker & J. Vanderkooy, "Quantization and Dither: A Theoretical Survey", J. Audio Eng. Soc, vol. 40 no. 5, pp. 355-375 (1992 May)

[8] R.A. Wannamaker, S.P. Lipshitz, J. Vanderkooy & J.N. Wright, "A Theory of Non-Subtractive Dither", submitted to IEEE Trans. Sig. Proc. (1991)

[9] M.A. Gerzon, P.G. Craven, J.R. Stuart & R.J. Wilson, "Psychoacoustic Noise Shaped Improvements in CD and other Linear Digital Media", Preprint 3501 of the 94th Audio Engineering Society Convention, Berlin (1993 March 16-19), sections 1 & 2

Claims

1. A digital signal processing system comprising means for processing a signal of a first resolution to produce a signal of a second, different, resolution, the said means including a generator for dither noise, the system using the dither noise in processing the signal to ameliorate the effect of system non-linearities, characterised in that the generator for dither noise includes means for generating in functional dependence upon samples of the lower resolution digital signal a pseudo-random dither noise output for use in processing a current sample of said lower resolution signal.

2. A system according to claim 1, in which the system is a digital encoder for encoding a signal of a higher resolution to produce a digital signal of a lower resolution, the digital encoder comprising a quantizer for rounding or truncating the higher resolution signal to produce the lower resolution signal, the output of the quantizer being connected to an input of the generator for dither noise, and an addition means for adding the dither noise output by the generator to the higher resolution signal input to the quantizer.

3. A system according to claim 2, further comprising filter means arranged to apply noise-shaping to noise generated in the encoder to reduce the perceptibility of the noise in the reproduced signal.

4. A system according to claim 3 in which the filter means include a filter connected to an output of said quantizer and arranged to feed back to an input of said quantizer a signal depending on the difference between the input and the output of the quantizer and delayed by the duration of at least one digital sample.

5. A system according to claim 3 or 4, in which the filter means comprise at least two filters, a first filter arranged to filter the output of the dither noise generator and a second filter connected to the output of the quantizer to filter the quantizer error noise.

6. A system according to claim 5, in which the first and second filters have different filter characteristics.

7. A system according to claim 6, in which the second filter is an adaptive filter arranged to vary in response to the properties of an input signal.

8. A system according to claim 1, in which the system is a digital decoder for decoding a lower resolution digital signal to reproduce a signal of higher resolution, the decoder including subtraction means for subtracting dither noise from the lower resolution signal, the dither noise generator receiving the lower resolution digital signal at its input and outputting dither noise to the subtracting means.

9. A system according to any one of the preceding claims, in which the dither noise generator includes means for deriving one or more least significant digits from the lower resolution digital signal, and means for generating a pseudo-random output in response to the least significant digits derived from a plurality of preceding samples of the lower resolution digital signal.

10. A system according to claim 9, in which the means for generating a pseudo-random output include a look-up table programmed with data having desired pseudo-random statistics.

11. A system according to claim 9, in which the means for generating a pseudo-random output include processor means arranged to receive the least significant digits of the plurality of preceding samples and to apply a pseudo-random function to the said digits to generate the output.

12. A system according to claim 11, in which the pseudo-random function is a linear or polynomial congruence function.

13. A system according to any one of the preceding claims, in which the generator for dither noise is arranged to generate noise comprising substantially statistically independent samples having a uniform probability distribution function (pdf), the noise having a peak-to-peak amplitude substantially equal to one quantization step size or a least significant digit of said lower resolution signal.

14. A system according to any one of claims 1 to 12, in which the generator for dither noise is arranged to generate noise comprising samples having a triangular probability distribution function, the noise having a peak-to-peak amplitude substantially equal to twice the quantization step size or twice the least significant digit size.

15. A system according to any one of claims 1 to 12, in which the generator for dither noise is arranged to generate noise comprising substantially statistically independent samples having a probability distribution function corresponding to any probability distribution function convoluted with a uniform probability distribution function, said uniform probability distribution function having a peak-to-peak argument range substantially equal to one quantization step size or one least significant digit of said lower resolution signal.

16. A system according to any one of claims 1 to 12, in which the generator for dither noise is arranged to generate noise comprising substantially statistically independent samples having a probability distribution function corresponding to any probability distribution function convoluted with a triangular probability distribution function, said triangular pdf having a peak-to-peak argument range substantially equal to two quantization step sizes or twice the least significant digit of said lower resolution signal.

17. A transmission system for transmitting a higher resolution signal via a lower resolution digital channel comprising, in combination, an encoder according to claim 2, or any one of claims 3 to 16 when directly or indirectly dependent on claim 2, and a complementary decoder according to claim 8 or any one of claims 9 to 16 when directly or indirectly dependent on claim 8, the dither noise generators in the encoder and decoder being substantially identical in effect, the encoder outputting the lower resolution signal onto the digital channel and the decoder receiving the lower resolution signal at its input via the digital channel.

18. A transmission system according to claim 17, in which the encoder includes noise-shaping filter means and the decoder includes a filter arranged to apply to the output of the dither noise generator a filter function equivalent to the effect of the encoder noise-shaping filter on the dither noise.

19. A system for transmitting a higher resolution signal via a lower resolution digital channel comprising: an encoder for encoding a higher resolution signal into a lower resolution digital signal for transmission; a decoder for decoding a lower resolution digital signal thereby producing an improved approximation to said high resolution signal; and a lower resolution digital channel linking said encoder to said decoder; wherein said encoder comprises a quantizer for rounding or truncating said higher resolution signal, a first dither noise generator for generating a pseudo-random dither noise output of predetermined statistics from previous samples of an output of said quantizer, and means for adding said dither noise output to an input of said quantizer; and wherein said decoder comprises a second dither noise generator substantially identical in effect to said first dither noise generator for generating a pseudo-random dither noise output of said predetermined statistics from previous samples of said lower resolution signal, and means for subtracting said dither noise output from said lower resolution signal.

20. A system for transmitting a higher resolution signal via a lower resolution digital channel comprising: an encoder for encoding a higher resolution signal into a lower resolution digital signal for transmission; a decoder for decoding a lower resolution digital signal thereby recovering an improved approximation to said high resolution signal; and a lower resolution digital channel linking said encoder to said decoder; wherein said encoder comprises a quantizer for rounding or truncating said higher resolution signal, a first dither noise generator for generating a pseudo-random dither noise output of predetermined statistics from samples of an output of said quantizer, and means for adding said dither noise output to an input of said quantizer, and noise shaping means arranged to feed back to said quantizer a signal dependent on an error across said quantizer and having a delay of at least one digital sample duration; and wherein said decoder comprises a second dither noise generator substantially identical in effect to said first dither noise generator for generating a pseudo-random dither noise output of said predetermined statistics from samples of said lower resolution signal, filtering means equivalent in effect on said dither noise to said encoder noise shaping means for filtering said output of said dither noise generator, and means for subtracting said filtered dither noise output from said lower resolution signal.

21. A method of processing a signal of a first resolution to produce a waveform of a second, different resolution including the step of generating dither noise for use in processing the signal to ameliorate the effect of system non-linearities, characterised in that the dither noise is generated in functional dependence upon samples of the lower resolution signal.

22. A method according to claim 21, including the steps of rounding or truncating a high resolution digital signal to produce a lower resolution signal, generating dither noise in response to the results of previous iterations of the step of rounding or truncating, and adding the dither noise to the higher resolution signal, thereby encoding the higher resolution digital signal into a lower resolution signal.

23. A method according to claim 21, including the steps of generating dither noise in response to previous samples of a lower resolution waveform and subtracting dither noise from a current sample of said waveform.

24. A system or method according to any one of the preceding claims, in which a dither noise generator is arranged to output a signal of reversed polarity if the polarity of the lower resolution signal is reversed.

25. A system according to claim 21, wherein said dither generator implements a pseudo-random mapping from a digital word h formed from the least significant bits of a predetermined number of previous samples of said lower resolution signal, such that the mapping is a function of h × (-1-h), and where additionally the polarity of the dither signal is multiplied by the polarity of a previous sample of the lower resolution signal.

26. A digital signal processing system for editing or splicing signals encoded for a lower resolution from a higher resolution by the addition of dither before a quantizer comprising: a first means for receiving a pre-splice lower resolution signal preceding the splice; a second means for receiving a post-splice lower resolution signal following the splice; cross-fading means for cross-fading at a higher resolution from said pre-splice signal to said post-splice signal, and a re-encoder to encode said cross-faded high resolution signal to a lower resolution signal, wherein said re-encoder is operative during the period of said cross-fade and where the output at times greatly before the cross-fade period is the pre-fade lower resolution signal, and the output at times greatly after the cross-fade period is the post-fade lower resolution signal.

27. A system according to claim 26, in which the re-encoder comprises a signal processing system according to claim 2.

28. A system or method according to any one of the preceding claims in which the dither noise generator produces an output functionally dependent on previous samples only of the lower resolution signal.