WO2002019708A1

WO2002019708A1 - Dual priority video transmission for mobile applications

Info

Publication number: WO2002019708A1
Application number: PCT/US2001/026318
Authority: WO
Inventors: Hamid Gharavi
Original assignee: Hamid Gharavi
Priority date: 2000-08-25
Filing date: 2001-08-23
Publication date: 2002-03-07
Also published as: WO2002019709A1; AU2001285225A1; AU2001286682A1

Abstract

Apparatus, methods, and data structures for robust partitioning and reassembling of video transmission over multipath fading channels are presented. The system is based on a separation of variable-length-coded discrete cosine transform coefficients within each block and is suitable for constant bit rate transmission, where the data rate for each proportion is controlled in accordance with its buffer fullness. It is shown that variable-length-coding-based partitioning for INTRA frame blocks can lend itself to an accumulation of distortion due to a loss of the second layer, which is overcome by the invention (Figure 6). The propagation of such distortion is shown to be negligible when applied to the ITU-T H.263 video coding standard.

Description

DUAL PRIORITY VIDEO TRANSMISSION FOR MOBILE APPLICATIONS

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority from U.S. Patent Application Serial No. 09/645,622, filed August 25, 2000.

BACKGROUND OF THE INVENTION

The invention relates in general to apparatus, methods, and data structures for robust partitioning and reassembling compressed video or other data over multipath fading channels. More particularly, the invention relates to an apparatus, method, and data structure for adaptively separating compressed video data consisting of header information, and blocks of data of discrete-cosine transform (DCT) coefficients generated by variable-length coding (VLC) of run-level symbols and reassembling the separated data.

The growing demand for wireless multimedia communications has presented a new challenge in dealing with the problems related to image and video transmission. As most video compression standards have been developed for relatively error-free environments, they cannot be directly applied to a hostile mobile domain. In addition, while third-generation mobile systems for wideband applications are currently under development, second-generation mobile communication systems offer only narrowband data transmission that is not suitable for transmission of video information. To accommodate higher bit rates, system enhancements using multiple data-time slots for a single-data connection are expected to be introduced (i.e., multiple 14.4/9.6 Kb/s in the Global System for Mobile Communications (GSM)) based on the High Speed Circuit Switched Data Service (HSCSD). This is discussed in "ETSI Technical Recommendation for HSCSD," ETR51 02.34, V5.2.0, 1, the contents of which are incorporated by reference in their entirety; and T1TR3GPP 22.034-310: "3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; High Speed Circuit Switched Data (HSCSD)," Version 3.1.0, the contents of which are incorporated by reference in their entirety. HSCSD is a new technology that is currently being implemented in some GSM networks. HSCSD has been developed as the evolutionary route toward third-generation mobile communication systems. HSCSD, due to its multi-slot capability, can be viewed as the most attractive mobile technology currently available for real-time video services. More importantly, if the compressed video is efficiently partitioned, it can effectively utilize the multi-slot capabilities of GSM/HSCSD as well as GSM/General Packet Radio Service (GPRS), to offer a reliable video service. GSM/GPRS is discussed in T1TR3GPP 22.060-310: "3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; General Packet Radio Service (GPRS); Stage 1," the contents of which are incorporated by reference in their entirety. Thus, the ability to use multiple slots requires splitting the compressed video signal into separate bitstreams. This can be accomplished by taking into consideration the perceptual significance of coded video signals, where better protection can be provided for the higher priority bitstreams. Video data prioritization schemes for various applications are discussed in U.S. Patent Nos. 6,052,150, 5,742,343, 5,253,058, 5,231,384, and 5,821,885, the contents of which are incorporated by reference in their entirety.

Video partitioning utilizing unequal error protection may also be considered when spectrally efficient modulation techniques such as multilevel-quadrature amplitude modulation (QAM) are utilized. These techniques are discussed in R. Steadman, H. Gharavi, L. Hanzo, and R. Steele, "Transmission of Subband Coded Image via Mobile Channels," IEEE Trans. on Circuits and Systems for Video Technology, vol. 3, no. 1, pp. 15-26, Feb. 1993, the contents of which are incorporated by reference in their entirety; and H. Gharavi and C. I. Richards, "Partitioning of MPEG Coded Video Bitstreams for Wireless Transmission," IEEE Signal Processing Letters, June 1997, the contents of which are incorporated by reference in their entirety. For example, Gray-coded 16-QAM can form two data channels: the first channel is represented by the most significant bit (MSB) of both the in-phase and quadrature-phase codes, and the second channel is formed by the remaining two bits (i.e., the least significant bits). The main advantage of this strategy is the multipath fading resistance of the first channel, which makes it possible to protect the most error-sensitive information for transmission over mobile channels. On the other hand, the second channel has been shown to be extremely susceptible to the multipath fading effect. This can undermine the spectral efficiency of 16-QAM unless accompanied by pilot-assisted fade estimation and compensation. These characteristics are discussed in S. Sampei and T. Sunaga, "Rayleigh Fading Compensation for QAM in Land Mobile Radio Communications," IEEE Trans. on Vehicular Technology, vol. 42, no. 2, pp. 137-46, May 1993, the contents of which are incorporated by reference in their entirety; J.K. Cavers, "An Analysis of Pilot Symbol Assisted Modulation for Rayleigh Fading Channels," IEEE Trans. on Vehicular Technology, vol. 40, no. 4, pp. 686-93, Nov. 1991, the contents of which are incorporated by reference in their entirety; and H. Gharavi, "Pilot Assisted 16-level QAM for Wireless Video Transmission," accepted for publication in IEEE Trans. on Circuits and Systems for Video Technology, the contents of which are incorporated by reference in their entirety.
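The two sub-channels of Gray-coded 16-QAM described above can be sketched as follows. This is a minimal illustration only: the particular Gray labelling in `GRAY_TO_LEVEL` and the function names are assumptions made here and are not taken from the cited references. It shows that the first-channel bits (the MSBs) are decided by the sign of each component alone, which is why that channel is the more robust of the two.

```python
# Assumed Gray labelling of a (MSB, LSB) pair onto four amplitude levels.
# Adjacent levels differ in exactly one bit, and the MSB selects the sign.
GRAY_TO_LEVEL = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def map_16qam(hi_pair, lo_pair):
    """Map first-channel bits (MSB of I, MSB of Q) and second-channel bits
    (LSB of I, LSB of Q) onto one 16-QAM point (I level, Q level)."""
    i_level = GRAY_TO_LEVEL[(hi_pair[0], lo_pair[0])]
    q_level = GRAY_TO_LEVEL[(hi_pair[1], lo_pair[1])]
    return i_level, q_level


def demap_first_channel(i_level, q_level):
    """Recover the first-channel bits from the signs of I and Q only."""
    return (1 if i_level > 0 else 0, 1 if q_level > 0 else 0)


if __name__ == "__main__":
    point = map_16qam((1, 0), (0, 1))
    print(point, demap_first_channel(*point))
```

Because the MSB decision depends only on which side of zero the received component falls, it tolerates larger amplitude disturbances than the LSB decision, which must distinguish the inner level from the outer level.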

For the above-mentioned applications, there is a concern related to developing an error-resilient partitioning/reassembling scheme based on constant bit rate (CBR) transmission. It is noted, however, that there are no fundamental requirements for the partitions to be of equal size. Indeed, the partitions must be of different sizes when unequal error protection parity bits are applied to any of the partitions. Thus, depending on the size of additional overheads, such as parity check bits as well as other synchronization bits, a fixed bit rate splitting factor should be defined in the partitioning process for dividing the original bitstream into appropriate proportions. For dual priority partitioning this can be arranged by taking into consideration the visual importance as well as the sensitivity of the coded bitstream against transmission errors. For instance, in a subjective sense, the visual importance is best described by the order of the frequency representations of image signals. Since this important property is inherently exploited in subband and DCT-based coding, their partitioning is relatively straightforward. For video applications the latter approach, when supplemented with INTER frame prediction, forms what is known as an INTER frame hybrid DCT/differential pulse-code modulation (DPCM) coding technique; the DCT/DPCM technique is a combination of INTER frame DPCM and DCT coding. An efficient hybrid DCT/DPCM method is discussed in U.S. Patent No. 4,821,119, the contents of which are incorporated by reference in their entirety. This method has also been widely considered for most practical video coding applications including the existing video coding standards. This method has been discussed in ITU-T H.261 Recommendation, "Video Codec for Audio Visual Services at p×64 kbit/s," Mar. 1993, the contents of which are incorporated by reference in their entirety; ITU-T Recommendation H.263, "Video Coding for Low Bitrate Communication," Feb. 1998, the contents of which are incorporated by reference in their entirety; Draft Text of Recommendation H.263 Version 2 ("H.263+") for Decision, COM-16-26, 1998, the contents of which are incorporated by reference in their entirety; ISO/IEC 11172-2, "Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to 1.5 Mb/s, Part 2: Video," Aug. 1993, the contents of which are incorporated by reference in their entirety; and ISO/IEC 13818-2, "Information Technology-Generic Coding of Moving Pictures and Associated Audio Information, Part 2: Video," Jan. 20, 1995, the contents of which are incorporated by reference in their entirety. Its partitioning has also received considerable attention in recent years for asynchronous transfer mode (ATM) networks, both for combating cell loss and for providing signal-to-noise ratio (SNR) scalability, as provided for in the MPEG-2 and H.263 standards. These issues are discussed in S. Tubaro, "Two Layers Video Coding Scheme for ATM Networks," Signal Processing: Image Communication, vol. 3, pp. 129-41, June 1991, the contents of which are incorporated by reference in their entirety; R. Aravind et al., "Packet Loss Resilience of MPEG-2 Video Coding Algorithms," IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, pp. 426-35, Oct. 1996, the contents of which are incorporated by reference in their entirety; and H. Sun et al., "Architectures for MPEG Compressed Bitstream Scaling," IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, pp. 191-99, Apr. 1996, the contents of which are incorporated by reference in their entirety.

In an interframe hybrid DCT/DPCM video coding scheme, a video frame is first divided into non-overlapping blocks where each block is transformed via a DCT, quantized, and entropy encoded. The entropy coder consists of run-length, variable-length, and end-of-block coding, a combination that is termed "3-D VLC." Except for the first frame, which has to be INTRA frame coded, the remaining frames may use INTER frame prediction; such frames are referred to as P-frames (i.e., frames that are predicted from the previous decoded frame). With additional frame delays, both previous and future reconstructed frames may also be considered for prediction; this scheme is known as bi-directional prediction frame (B-frame).
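The symbol-forming step of such an entropy coder can be sketched as follows. This is an illustrative sketch, not the codec's actual implementation: the function name and the `(run, level, last)` tuple layout are assumptions made here, and the lookup of each event in a VLC codebook is omitted.

```python
def run_level_last(zigzag_coeffs):
    """Turn quantized coefficients (already in zig-zag order) into
    (run, level, last) events: `run` zeros precede each non-zero `level`,
    and `last` is 1 only for the final event of the block."""
    events = []
    run = 0
    for coeff in zigzag_coeffs:
        if coeff == 0:
            run += 1          # extend the current zero run
        else:
            events.append([run, coeff, 0])
            run = 0           # a new run starts after each non-zero level
    if events:
        events[-1][2] = 1     # flag the end of the block on the final event
    return [tuple(event) for event in events]


if __name__ == "__main__":
    print(run_level_last([5, 0, 0, -2, 1, 0, 0, 0]))
```

Each event would then be mapped to one variable-length codeword, so the number of codewords in a block equals the number of non-zero quantized coefficients.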

For INTER frame prediction, a larger block consisting of four neighboring luminance DCT blocks, called a macroblock (MB), is used to perform block matching motion estimation and compensation. Note that an MB also contains two more blocks representing the color difference (chrominance) components; these additional blocks are not involved in the process of motion estimation. The estimated displacement motion vectors are multiplexed with the DCT-coded data and transmitted as a part of the hierarchically dependent MB header. The multiplexing structure of all existing video standards is generally based on the same concept. For example, in the H.263 standard, the video-coded information for each frame is arranged in four hierarchical layers. The top layer is the picture layer, followed by a group of blocks (GOB) layer comprising a number of consecutive MBs, then the MB layer, and, finally, a block layer.

Each layer is furnished with some header information which may include synchronization bits for the two top layers (i.e., the picture start code (PSC) and the GOB start code (GBSC)). In addition, each layer includes other important parameters defining the nature of the information associated with it. At the picture layer, the parameters may include temporal reference (TR), picture type (i.e., Sub-Quarter Common Intermediate Format (Sub-QCIF), QCIF, CIF, 4CIF, 16CIF, or any other suitable format), and quantizer information (PQUANT); however, any suitable parameters may be included. At the GOB layer, the parameters may include group number (GN), GOB frame ID (GFID), and quantizer information (GQUANT); however, any suitable parameters may be included. At the MB layer, the parameters may include coded macroblock indication (COD), block type and coded block pattern for chrominance (MCBPC), coded block pattern for luminance (CBPY), and motion vector data (MVD). It is noted that the actual video information, in the form of VLC-coded data, is transmitted at the block layer.

If the header information for the frame is lost during transmission, the decoder will have no indication as to how the frame, GOB, or MB had been coded and so any further data received will be useless. The actual DCT coefficients are transmitted at the block layer and errors occurring in these data mean that some, or all, of the data are lost.

In addition, due to the extreme sensitivity of compressed video to transmission errors, most prior art schemes have relied on the INTRA frame update to avoid the propagation of distortion. In this method, the encoder will resort to an INTRA frame mode at regular intervals. The period in which the INTRA frame resets is based on the condition of the transmission channel. For example, for error-prone channels, the intra-reset period may be selected to be fewer than ten frames. One drawback with INTRA frame coding is that its compression efficiency is significantly less than that of INTER frame coding. Therefore, to maintain the overall coding efficiency, it is important to select a larger interval for INTRA frame resets. This approach has drawbacks in the areas of error recovery, resynchronization, and concealment.

In addition, one of the main challenges in video partitioning is reassembling the partitioned bitstreams corrupted by error. The prior art solutions to such issues are unsatisfactory.

Therefore, it would be desirable to provide alternative apparatus, methods, and data structures that overcome at least some of the shortcomings noted above for both encoding and decoding.

SUMMARY OF THE INVENTION

The invention satisfies the need and avoids the drawbacks of the prior art by providing apparatus, methods, and data structures for robust partitioning of video or other data over multipath fading channels and reassembling of the same. For example, in the case of lost information during transmission, if errors affect only higher-frequency coefficients, the damage would be less catastrophic if the block can be reconstructed with minimal distortion. Thus, protection of the most error-sensitive header information and as many lower-frequency DCT coefficients as possible would be a valuable feature. In one aspect of the invention, an apparatus for and a method of dynamically partitioning video data for a fixed-rate transmission is disclosed in which encoding the DCT coefficients of the video data via VLC takes place prior to partitioning. A video data stream is partitioned with this technique, and each partitioned bitstream is then transmitted via a separate transmission channel having different error-protection characteristics from another channel. For example, for a given total bit rate budget, the first bitstream, which conveys the most error-sensitive information, will use a much higher number of parity check bits for better protection. Performing the variable-length coding on the DCT coefficients prior to partitioning is of great use because the partitioning efficiency and the suitability for protecting video signals against error bursts typical of the mobile radio environment are enhanced. In this embodiment, the use of CBR transmission is presented. However, this apparatus may also be used for smoothing variation of a video signal for applications using a variable bit rate (VBR) transmission.

In another aspect of the invention, a system for partitioning video data for transmission over multipath channels contains a computer-readable memory for storing data for access by an application program and includes a data structure stored in the computer-readable memory. The data structure may include an image field, a reference field, a group-of-blocks field, an MB field, one or more block fields, one or more DCT fields, one or more VLC fields, one or more buffer fields, a buffer control unit field, a first partition field, a second partition field, a splitting percentage factor field, a cut-off value field, a pre-decoder temporary buffer field, a total bit rate field, an inner-level forward error correction (FEC) field, or a second partition and synchronization field.

In another aspect of the invention, an apparatus for and a method of pre-decoding partitioned data is provided. The pre-decoder may be able to combine two bitstreams at the receiver. In addition, the pre-decoder may provide robust self-error detection and concealment capabilities that enable recovery of the corrupted video bitstream with minimal visual distortion. Synchronization bits that are in the second partition may be used by the pre-decoder unit at the receiver end in order to align and reassemble the two partitions. Thus, an important property of the pre-decoder is its ability to preserve the integrity of the compressed bitstream, so that its output may be directly forwarded to a standard decoder.

In another aspect of the invention, a system for pre-decoding partitioned data contains a computer-readable memory for storing data for access by an application program and includes a data structure stored in the computer-readable memory. The data structure may include a first bitstream field, a second bitstream field, a PSC detector field, a GBSC detector field, a cut-off value field, a GN field, a TR field, a temporary buffer field, an inner-level FEC decoder field, an outer-level FEC decoder field, a VLC codebook field, and an error concealment field.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates a prior art block diagram for a hybrid DCT/DPCM scheme.

Figure 2a illustrates the zone difference in a VLC-based partitioning scheme for a sample block of data.

Figure 2b illustrates the zone difference in a VLC-based partitioning scheme for a reference block of data.

Figure 3a illustrates a first-partition-percentage ratio for a sample data set.

Figure 3b illustrates a first-partition-percentage ratio for a sample data set.

Figure 4 illustrates a block diagram for a buffer-controlled VLC-based partitioning scheme.

Figure 5a illustrates inner-level FEC protection for a TR codeword.

Figure 5b illustrates inner-level FEC protection for a GN codeword.

Figure 6 illustrates a two-layer partitioning scheme for a selection value of 3.

Figure 7 illustrates a percentage difference between two bitstreams coded at various fixed bit rates for two sample data sets.

Figure 8 illustrates a sample data set in which the macroblock header is corrupted by error.

Figure 9a illustrates the original syntax of two bitstreams of a sample data set for an error concealment scheme with a cut-off value of 1.

Figure 9b illustrates the syntax of the output bitstream after concealment for the data set shown in Figure 9a.

Figure 10a illustrates the original syntax of two bitstreams of a sample data set for the reconstruction of a GOB for a cut-off value of 3.

Figure 10b illustrates the output bitstream after concealment for the data set shown in Figure 10a.

Figure 11 illustrates the interpolation of INTRA DCs in a macroblock using the nearest INTRA DC from the previous GOB.

Figure 12a illustrates the original syntax for a sample data set of a macroblock in a GOB for an error concealment method for I-frames with a cut-off value of 2.

Figure 12b illustrates the modified syntax of the macroblock after error concealment for the data set shown in Figure 12a.

Figures 13a and 13b illustrate the last decoded frames at 15 frames/s and 10 frames/s, respectively, of a sample data set coded at 32 Kb/s with the second bitstream being corrupted by errors.

Figures 13c and 13d illustrate the last decoded frames at 15 frames/s and 10 frames/s, respectively, of the same sample data set shown in Figures 13a and 13b, respectively, coded at 32 Kb/s without the second bitstream being corrupted by errors.

Figures 13e and 13f illustrate the last decoded frames at 15 frames/s and 10 frames/s, respectively, of a sample data set coded at 32 Kb/s with the second bitstream being corrupted by errors.

Figures 13g and 13h illustrate the last decoded frames at 15 frames/s and 10 frames/s, respectively, of the same sample data set shown in Figures 13e and 13f, respectively, coded at 32 Kb/s without the second bitstream being corrupted by errors.

Figure 14 illustrates the average PSNR of a first video data set versus channel SNR.

DETAILED DESCRIPTION

Figure 1 depicts a prior art block diagram of an interframe hybrid DCT/DPCM video coding system. As discussed above, a video frame is first divided into non-overlapping blocks, $S_f$, and then each block is transformed by a DCT, quantized, and entropy coded. In this scheme, the difference between the input block, $S_f$, and its corresponding motion-compensated (MC) predicted block, $\hat{S}_{f-1}$ (i.e., the reference block), is transformed via a DCT. The output of the DCT function, $\mathrm{DCT}(S_f - \hat{S}_{f-1})$, is then quantized by sending it through a quantizer, $Q$. The output of the quantizer, $Q\{\mathrm{DCT}(S_f - \hat{S}_{f-1})\}$, is then coded via VLC. The output of the VLC is then the output of the hybrid DCT system.

The separation of the quantized DCT coefficients may be arranged either before or after entropy coding. In the former, which is referred to as fixed-zone partitioning, the same number of lower frequency coefficients (in a zigzag scanning order) is selected as the upper zone for transmission over the first channel. In the latter case, the first partition takes up a fixed number of VLC codewords instead. To distinguish between the two splitting methods, the effect that a loss of the second partition would have on the reconstruction of the first partition is presented. Neglecting the effect of quantization noise (by assuming that all DCT coefficients are quantized by the same N-level uniform quantizer where $N \gg 1$), the prediction error signal (before VLC coding) may be shown as,

$$\mathrm{DCT}(S_f - \hat{S}_{f-1}) = \mathrm{DCT}(S_f) - \mathrm{DCT}(\hat{S}_{f-1}) \tag{1}$$

where $S_f$ and $\hat{S}_{f-1}$ represent the current block of picture elements and its corresponding motion compensated (MC) predicted block (i.e., reference block), respectively. It is noted that $\hat{S}_{f-1} = S_{f-1}$ due to the negligible effect of quantizing noise. However, this technique is also effective on data for which the quantization noise is not negligible.
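Equation (1) relies only on the linearity of the DCT. A quick numerical check is sketched below, assuming an unnormalized one-dimensional DCT-II and made-up sample values chosen here for illustration; the codec itself applies a two-dimensional 8-by-8 transform, for which the same property holds.

```python
import math


def dct_1d(samples):
    """Unnormalized 1-D DCT-II of a list of samples."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


# Illustrative values only: one row of a current block and of its reference block.
current = [52, 55, 61, 66, 70, 61, 64, 73]
reference = [50, 54, 60, 68, 69, 60, 66, 70]

# Left-hand side of equation (1): transform of the prediction error.
lhs = dct_1d([c - r for c, r in zip(current, reference)])
# Right-hand side of equation (1): difference of the two transforms.
rhs = [a - b for a, b in zip(dct_1d(current), dct_1d(reference))]

max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

if __name__ == "__main__":
    print(max_gap < 1e-9)
```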

In a preferred embodiment, a system for partitioning data is disclosed. In this embodiment, a fixed number of top VLC codewords may be selected for the first partition for each 8-by-8 block. In this case the number of coefficients that a single VLC codeword can represent depends on the number of quantized zero coefficients (i.e., a zero run) preceding a non-zero value (level). It is assumed that the coefficients are read in a zig-zag scanning order (from the top left corner to the bottom right corner). For instance, the first VLC codeword may represent one or more quantized DCT coefficients of the lowest order depending on the nature of the run-level symbol. Therefore, for a given number of selected top VLC codewords for the first partition, different blocks may result in covering a different number of DCT coefficients. To assist with the analysis, a zone, $Z_{x,y}$ ($y > x$), is defined as a region in which the DCT block covers a specific number of DCT coefficients, where the coefficients are identified by the zone's subscript indices, $x$ and $y$. For instance, $Z_{x,y}$ corresponds to the DCT coefficients $C_x, C_{x+1}, \ldots, C_y$ (in a zig-zag order). Therefore, for a given number of VLC codewords selected for the first partition, it may be assumed that for the current coding block, this would correspond to the upper zone $Z_{1,m}$ as indicated in Figure 2a. As the effect of quantization noise is being neglected, as stated above, it may safely be assumed that $\hat{S}_{f-1} = S_{f-1}$. Thus, the prediction error signal for selected VLC codewords may be shown as:

First partition:

$$Z_{1,m}\{\mathrm{DCT}(S_f - \hat{S}_{f-1})\} = Z_{1,m}\{\mathrm{DCT}(S_f)\} - Z_{1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{2}$$

Second partition:

$$Z_{m+1,64}\{\mathrm{DCT}(S_f - \hat{S}_{f-1})\} = Z_{m+1,64}\{\mathrm{DCT}(S_f)\} - Z_{m+1,64}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{3}$$

Assume that when the reference block (as shown in Figure 2b) was encoded (based on the same number of VLC codewords), the number of its DCT coefficients for the first partition was represented by a zone that was smaller than that in the current block (i.e., $Z_{1,n} < Z_{1,m}$). Thus, the last term in equation (2) may be divided as,

$$Z_{1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} = Z_{1,n}\{\mathrm{DCT}(\hat{S}_{f-1})\} + Z_{n+1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{4}$$
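The way the same number of VLC codewords can cover zones of different extent in two blocks may be sketched as follows. The function name and the sample coefficient lists are hypothetical, chosen only to illustrate the case $n < m$ discussed above; each non-zero coefficient is taken to produce one run-level codeword.

```python
def zone_end(zigzag_coeffs, num_codewords):
    """Return the 1-based index of the last coefficient covered by the first
    `num_codewords` run-level codewords (i.e., m in the upper zone Z_{1,m}).
    If the block has fewer codewords than requested, the whole block is covered."""
    if num_codewords <= 0:
        return 0
    seen = 0
    for index, coeff in enumerate(zigzag_coeffs, start=1):
        if coeff != 0:
            seen += 1                 # one codeword per non-zero level
            if seen == num_codewords:
                return index
    return len(zigzag_coeffs)


# Illustrative quantized coefficients in zig-zag order (shortened to 8 entries).
current_block = [9, 0, 0, 4, 0, 3, 0, 0]
reference_block = [7, 5, 0, 0, 2, 0, 0, 0]

m = zone_end(current_block, 2)    # upper zone of the current block
n = zone_end(reference_block, 2)  # upper zone of the reference block

if __name__ == "__main__":
    print(m, n)
```

With these values the two codewords of the current block reach the fourth coefficient, while the two codewords of the reference block reach only the second, so coefficients three and four of the reference block travelled in its second partition.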

It can be easily deduced from equation (2) that in order to recover the current DCT block, the prediction error signal should be added to the reference block. Therefore, in the absence of any distortion caused by quantization or transmission errors, the VLC codewords for the first partition can be expressed as,

$$Z_{1,m}\{\mathrm{DCT}(S_f)\} = Z_{1,m}\{\mathrm{DCT}(S_f - \hat{S}_{f-1})\} + Z_{1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{5}$$

Using equation (4), equation (5) may be expressed as,

$$Z_{1,m}\{\mathrm{DCT}(S_f)\} = Z_{1,m}\{\mathrm{DCT}(S_f - \hat{S}_{f-1})\} + Z_{1,n}\{\mathrm{DCT}(\hat{S}_{f-1})\} + Z_{n+1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{6}$$

In the case of transmission errors, it is important to note that the coefficients within the zone $Z_{n+1,m}$ belonged to the second partition at the time when the reference block was transmitted.

Next, the situation in which this partition ($Z_{n+1,m}$) was received erroneously at the time the reference block was transmitted will be considered (this is a realistic assumption, as the first partition is expected to be well protected against transmission errors). For the current block, however, in order to decode its first partition, the receiver is required to provide the reference block represented by the same upper zone (i.e., $Z_{1,m}$) when it was decoded. Under these conditions, the reconstructed reference block, after DCT conversion, with the same upper zone may be expressed as,

$$\tilde{Z}_{1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} = Z_{1,n}\{\mathrm{DCT}(\hat{S}_{f-1})\} + \tilde{Z}_{n+1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{7}$$

where $\tilde{Z}$ represents a zone that is affected by errors. Note that in many practical applications the receiver is designed to force these corrupted coefficients to zero. Subsequently, to decode the DCT coefficients covered by the first partition of the current block, the prediction error signal in equation (2) should be added to the reconstructed reference block with the same upper zone, as shown in equation (7). Thus, from equations (2) and (7), the DCT coefficients of the first partition at the receiver may be expressed as,

$$\tilde{Z}_{1,m}\{\mathrm{DCT}(S_f)\} = Z_{1,m}\{\mathrm{DCT}(S_f - \hat{S}_{f-1})\} + Z_{1,n}\{\mathrm{DCT}(\hat{S}_{f-1})\} + \tilde{Z}_{n+1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \tag{8}$$

The absolute difference between the reconstructed upper zone with and without transmission errors at the second partition represents the distortion value for the coding block. Thus, from equations (6) and (8), the following expression can be shown,

$$\left| Z_{1,m}\{\mathrm{DCT}(S_f)\} - \tilde{Z}_{1,m}\{\mathrm{DCT}(S_f)\} \right| = \left| Z_{n+1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} - \tilde{Z}_{n+1,m}\{\mathrm{DCT}(\hat{S}_{f-1})\} \right| \tag{9}$$

The above distortion represents the amount of drift between the local decoder (equation (6)) shown in Figure 1 (feedback loop) and the remote decoder (equation (8)) for the first partition of the current block. This indicates that the first partition has to rely on the channel condition in which the second partition is received. This may make VLC partitioning unsuitable for SNR scalability. Nevertheless, since the objective here is not to provide SNR scalability, VLC-based partitioning, with its superb efficiency, is found to be more suitable for wireless/mobile applications. The visual impact of such distortion, however, depends on the number as well as the magnitudes of the non-zero coefficients that fall within the differential zone, $Z_{n+1,m}$. However, the visual effect of such distortion may not be a concern as long as the number of selected VLC codewords for the first partition remains the same. It is understood that, in order to achieve a fixed-rate transmission for each partition, the splitting of the VLC codewords between the two partitions may need to be changed from time to time.
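The drift in equation (9) can be illustrated numerically by working directly on zig-zag-ordered coefficient lists. This is a sketch under stated assumptions: the coefficient values, the zone limits `n` and `m`, and the helper names are all made up here, quantization is ignored, and the receiver is assumed to force the corrupted zone of the reference block to zero.

```python
def zone(coeffs, start, end):
    """Keep coefficients C_start..C_end (1-based, inclusive); zero the rest."""
    return [c if start <= i <= end else 0 for i, c in enumerate(coeffs, start=1)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


# Hypothetical DCT coefficients in zig-zag order (shortened to 8 entries).
reference = [40, 12, -6, 5, 3, 0, 1, 0]   # DCT of the reference block
current = [42, 10, -4, 9, 2, 1, 0, 0]     # DCT of the current block
n, m = 2, 4                               # upper zones Z_{1,n} and Z_{1,m}

# First-partition prediction error of the current block, as in equation (2).
residual = zone(sub(current, reference), 1, m)

# Local decoder: the full upper zone of the reference is available (equation (6)).
local = add(residual, zone(reference, 1, m))

# Remote decoder: zone n+1..m of the reference was lost and set to zero (equation (8)).
remote = add(residual, zone(reference, 1, n))

# Drift between the two reconstructions (equation (9)).
drift = [abs(x - y) for x, y in zip(local, remote)]

if __name__ == "__main__":
    print(drift)
```

With the corrupted zone forced to zero, the drift equals the magnitudes of the reference coefficients inside the differential zone and is zero elsewhere.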

As an example, two video sequences, known as "Suzie" and "Salesman," each formatted with QCIF (i.e., 176 pixels by 144 lines per frame) and a frame rate of 30 frames/second, were encoded at four constant bit rates: 16 kilobits/s (kb/s), 32 kb/s, 64 kb/s, and 128 kb/s. Figures 3a and 3b illustrate the bit rate percentage ratio of the first partitions for "Suzie" and "Salesman," respectively, when they include the following information: (1) header, (2) header and first VLC codeword, (3) header and first and second VLC codewords. For each of the video sequences, frames 1-100 were encoded using INTER frame prediction (i.e., P-frames only), except frame 1, which was encoded as an INTRA frame. After the last frame was encoded (frame 100), the sequence was repeated and the first frame of the repeated sequence was INTRA frame coded, thus resulting in an INTRA frame reset period of 100 frames.

The results shown in Figures 3a and 3b indicate that for a given splitting ratio, the number of VLC codewords that may be allocated to the first partition may depend on the bit rate as well as the nature of the video signal. For example, for the "Suzie" sequence the header information far outweighs the coefficient data as the bit rate drops below 45 kb/s. This indicates that some of the header information may need to be split (in the case of P-frames). In this situation, the picture and the GOB headers may remain in the first bitstream, but the MB information may be transferred to the second bitstream.

In a preferred embodiment of the invention, illustrated in Figure 4, the splitting mechanism takes into consideration the instantaneous variations of the coded video in order to develop a robust partitioning scheme. As the techniques discussed herein deal with CBR transmission, the splitting may be arranged in accordance with buffer fullness; however, any suitable type of data transmission may be used. In the preferred embodiment shown in Figure 4, each partition may be equipped with a separate buffer. The relative size of each buffer may be set in accordance with splitting percentages. For example, for a splitting percentage of 50, the same size buffer for both partitions may be employed. To balance the distribution of data between the two partitions, the number of VLC codewords, as well as part of the header information carried by each partition, may change from time to time. This is to ensure that the amount of fixed information transported by each partition and set by the pre-defined splitting percentage, will remain almost constant throughout the splitting process. In this embodiment, the control management of the buffers may be handled by the buffer control unit (BCU). The BCU may calculate and compare the occupancies between the two buffers. The BCU may instruct the bitstream splitter to select one of the following options for the first partition:

(1) split-header (i.e., picture and GOB headers),

(2) all headers (picture, GOB header, and MB header),

(3) all headers and the first VLC codeword (note that for INTRA blocks, the first codeword is not a VLC codeword but a fixed-length 8-bit codeword representing the first coefficient and is referred to as a DC coefficient), or

(4) all headers and the first and second VLC codewords.

The selected option, which may be referred to as a "cut-off value," may be represented by 0, 1, 2, or 3 and, after being binary coded (e.g., 2-bit fixed-length in this case), is preferably included in the header of the particular transmission layer on which it will be updated. It should be noted that the number of options for the first partition may be increased (e.g., all headers and more than two VLC codewords, or a further breakdown of the header information). Increasing the number of options can widen the dynamic range over which the data is distributed, but this may come at the expense of a slight increase in the number of bits representing the cut-off value.

In order to avoid frequent buffer overflow/underflow, the updating of the cut-off value may be accomplished at the GOB level. In addition, the cut-off value has to be transmitted in advance to notify the receiver of the selected option. It is noted that the cut-off value may be embedded into the group number (GN), where the GN is a fixed-length codeword of 5 bits for the ITU-T H.263 standard; however, any suitable standard or manner of transmitting the cut-off values may be employed. According to ITU-T H.263, the bits of the GN are the binary representation of the GOB numbers in a frame. For a QCIF signal (176 pixels by 144 lines per frame), there exist a total of 9 GOBs. Because the first GOB does not require GOB information, as it is located immediately after the picture information, three bits may be sufficient to transmit the GN (excluding the first GOB number). In the example presented herein, this has been arranged by sending the GN for the first two GOBs as "0" and the remainder in sequential order. With this arrangement, the two most significant bits of the 5-bit GN will be free and may be utilized for transmitting the cut-off values. In order to preserve the integrity of the H.263 syntax, in the reassembling process (i.e., in the pre-decoder), the GNs may be returned to their original format before being decoded by the H.263 standard decoder. Under this arrangement, no additional bits will be needed for partitioning the H.263 bitstream. To allow for more than four options for splitting, and/or to be able to use higher spatial resolution video (e.g., CIF format or higher), a separate field may be allocated at the GOB header of the first partition for transporting the cut-off value. It is noted that the video-coded information in each frame may be arranged in four hierarchical layers, i.e., picture layer, GOB layer, MB layer, and block layer. However, similar to the MPEG-I and MPEG-II hierarchical layer structure, a slice layer may be considered to update the cut-off value. A slice may consist of multiple MBs.

In this embodiment, the second partition may be furnished with some synchronization bits to help the re-alignment of the two partitions in the presence of transmission errors. At the picture layer, the synchronization bits consist of the PSC, GN, and TR. At the GOB layer, the synchronization bits for the second partition include GBSC and GN. (It is noted that to deal with long error bursts the TR codeword may also be included.) This information, which may already be included in the first partition, may be essential for the second partition to help re-alignment of both bitstreams in the presence of transmission errors. Since the correct recovery of TR, GN, and cut-off (if not embedded in the GN) codewords is useful in re-aligning the two bitstreams, parity check bits for providing extra protection may be considered. In this disclosure, this extra FEC protection will be referred to as the inner-level protection; this is shown in Figures 5a and 5b for inner-level FEC protection for TR and GN codewords, respectively.

It is important to point out that any extra information added to the input data during the splitting process may also be taken into consideration. For instance, if the video is encoded at a fixed rate "R" and the average amount of additional information (such as the cut-off value, synchronization bits for the second partition, and inner FEC for both partitions) is denoted by "R_e," the overall bit rate "R_o," after splitting, may be written as

R_o = R + R_e (10)

For the splitting percentage "Y," the bit rate for each partition is

R_1 = (R_o x Y)/100 First partition (11)

R_2 = [R_o x (100 - Y)]/100 Second partition (12)

In many practical applications each partition may have to be transmitted via a separate channel with an equal bandwidth (e.g., each partition using one slot of HSCSD/GSM). For unequal error protection we can write

R_1 + FEC1 = R_2 + FEC2 = R_m (13)

where FEC1 and FEC2 represent the parity bits associated with the first and second partitions, respectively, and R_m corresponds to the channel capacity that is available for transmitting each partitioned bitstream. It is noted that, for this stage, error protection will be referred to as the outer-level FEC. It should be pointed out that the first partition would require much better error protection, i.e., FEC1 > FEC2. Under this condition, the splitting percentage, Y, which would be less than 50, may be calculated from equations (11), (12), and (13).
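As a worked illustration of equations (10) through (13), the following minimal sketch computes the partition rates and then solves for Y. Since equations (11) and (12) give R_1 + R_2 = R_o, and equation (13) gives R_1 = R_m - FEC1 and R_2 = R_m - FEC2, it follows that Y = 100 x R_1/(R_1 + R_2). The function names and the numeric values in the usage note are assumptions chosen only for illustration, and FEC1 and FEC2 are treated here as parity rates in the same units as R_m.

```python
def partition_rates(r, r_e, y):
    """Equations (10)-(12): overall rate and the rate of each partition."""
    r_o = r + r_e                  # (10)
    r_1 = r_o * y / 100            # (11) first partition
    r_2 = r_o * (100 - y) / 100    # (12) second partition
    return r_o, r_1, r_2


def splitting_percentage(r_m, fec_1, fec_2):
    """Solve (11)-(13) for Y given equal channel capacity R_m per partition."""
    r_1 = r_m - fec_1              # from (13)
    r_2 = r_m - fec_2              # from (13)
    return 100 * r_1 / (r_1 + r_2)
```

For example, with an assumed R_m of 14400 b/s, FEC1 of 7200 b/s, and FEC2 of 2400 b/s, `splitting_percentage(14400, 7200, 2400)` gives Y = 37.5, which is below 50 as expected when FEC1 > FEC2.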

The manner in which the two partitions may be formed for this preferred embodiment is now discussed. For this purpose, an example where the cut-off value indicates that at least two VLC codewords should be selected for the upcoming GOB will be considered. In this case, as shown for a sample data set in Figure 6, the first partition may begin with the GOB header followed by the MB header and the first two VLC codewords from each block in the transmitting order. This process may continue until the selection value is updated at the next GOB. The remaining VLC codewords may subsequently be transferred to the second partition in the same order. It is noted that the second bitstream will not carry any VLC codewords from blocks in which the last VLC coefficient is included in the first partition (note that the last VLC codeword will be referred to as "VLC-LAST" or "VLC-L") or identified as zero blocks by the MB header, as can be seen for B2, B5, and B6 in Figure 6.
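The separation of codewords described above can be sketched in a few lines. This is a simplified model under stated assumptions: each codeword is represented as a hypothetical `(run, level, last)` tuple rather than as bits, headers are omitted, and a zero block is modeled as an empty list. The function name `split_blocks` is not from the disclosure.

```python
def split_blocks(blocks, n_first):
    """Send the first `n_first` codewords of each block to the first
    partition and the remainder to the second, in transmitting order.

    blocks:  list of blocks; each block is a list of (run, level, last)
             tuples, with last=True on the final codeword of the block.
    n_first: number of VLC codewords per block kept in the first partition.
    """
    first, second = [], []
    for block in blocks:
        first.extend(block[:n_first])
        second.extend(block[n_first:])
    return first, second
```

A block whose final codeword falls within its first `n_first` codewords, or a zero block, contributes nothing to the second partition, which matches the behavior noted for B2, B5, and B6.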

In another preferred embodiment, a system having a computer-readable memory may perform a robust partitioning of video data. The system may perform adaptive splitting similar to that discussed above. In this system, a user may define a splitting percentage factor by entering information into the percentage factor field as shown in Figure 4. The percentage factor field sets the transmission rates for the first and second bitstreams at (Y x R)/100 and [(100 - Y) x R]/100, respectively, where R represents the transmission rate before splitting (it should be noted that in Figure 4, R_e is assumed to be zero, thus R = R_o; however, this need not be the case). An image field contains the video image to be partitioned. The reference field is a picture which temporally precedes the current picture. The MB field is four neighboring luminance blocks and two chrominance blocks in a picture. The block field is a two-dimensional array of n-by-n (e.g., 8-by-8) picture elements. The DCT field contains the DCT coefficients for each block of video data.

The cut-off value field describes the partitioning of the video data. For example, a cut-off value of 1 may indicate that the MB header is included in the first partition and the VLC codewords are included in the second partition, while a cut-off value of 2 may indicate that all header information and the first VLC codeword are included in the first partition. The total bit rate field is the video encoder (e.g., ITU-T H.263) output rate plus the synchronization bits added to the second partition and the inner-level parity bits. The inner-level FEC field is the FEC parity check bits added to some of the most error-sensitive fixed-length codewords such as TR, GN, and the cut-off value (if it is not embedded in the GN field). The first partition field contains the first partition of the video data in the manner described by the value in the cut-off value field. Similarly, the second partition field contains the second partition of the video data in the manner described by the value in the cut-off value field. The synchronization field in the second partition is the PSC and TR at the picture layer, and the GBSC and GN at the GOB layer.

In another preferred embodiment, a system for robust pre-decoding to combine the two bitstreams at the receiver is presented. In this embodiment, the pre-decoder will preferably provide robust self-error detection and concealment capabilities that enable recovery of a corrupted video bitstream with minimal visual distortion. An important property of the pre-decoder is its ability to preserve the integrity of the compressed bitstream, so that the output of the pre-decoder may be directly forwarded to a standard decoder. To preserve the integrity of the H.263 syntax during the reassembling process of the two partitions, GNs (if the cut-off value is embedded in the GN field) may be reorganized into the original sequence before being decoded by the H.263 standard decoder. The synchronization bits that may be added to the second partition are the PSC, the TR at the picture level, the GBSC, and the GN (3 bits for QCIF and 5 bits for CIF; however, any suitable interface format may be used). These may be used by the pre-decoder unit at the receiver end in order to align and reassemble the two partitions. It is understood that although the H.263 standard is presented, any suitable transmission standard is contemplated by the invention. Moreover, this disclosure contemplates dividing or partitioning the signal into more than two partitions (i.e., three or more) and subsequently pre-decoding and decoding three or more partitions to provide enhanced capabilities according to the principles of the invention. In addition, according to principles of the invention, error detection and concealment methods presented herein may also be applied to non-partitioned data.

In this embodiment, the system may preferably begin with the initial cut-off. For the INTRA frame, the VLC codeword data are expected to far outweigh the header information as the frame is encoded without utilizing any previous frame. In this case a cut-off value of 2 or 3 (depending on the target fixed bit rate) may be used. For the first GOB of the second frame (i.e., the P-frame), an initial cut-off value of 1 may be used. In this embodiment, the two bitstreams may be joined at the receiver to form the original H.263 bitstream via a pre-decoder unit.

If the cut-off value is embedded in the GN codeword, the pre-decoder will preferably read the two most significant bits of the GN to extract the cut-off value. It is noted that the GN is a 5-bit codeword in H.263, which represents the number of the GOB (up to 32 GOBs). For QCIF, there are altogether 9 GOBs. If a GOB number is not used for the first GOB, there would be 8 GOBs remaining. Thus, three bits would be sufficient for GOBs. The remaining two bits may then be allocated for cut-off values consisting of four options and thus requiring only two bits. With this arrangement it is possible that, in the presence of errors, both top GOBs may be lost regardless of which one has been corrupted by errors. To avoid this occurrence, a separate field at the GOB syntax (first partition) may be used to transport the cut-off value. This alternative would be at the expense of a slight increase in the bit rate. From the cut-off value the pre-decoder may determine which of the two bitstreams should obtain the MB header if a split header has been identified.
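The embedding of the 2-bit cut-off value in the two most significant bits of the 5-bit GN may be sketched as below. The mapping of GOB numbers onto the three low bits is an assumption based on one reading of the text: the first GOB (number 0) carries no GOB header, so GOBs 1 through 8 of a QCIF frame are sent as 0 through 7. The function names are hypothetical.

```python
def pack_gn(gob_number, cutoff):
    """Build the 5-bit GN field: 2 MSBs = cut-off value, 3 LSBs = GOB index.

    Assumes QCIF, where GOBs 1..8 carry a GOB header and are sent as 0..7.
    """
    if not 1 <= gob_number <= 8:
        raise ValueError("QCIF GOB headers exist only for GOBs 1..8")
    if not 0 <= cutoff <= 3:
        raise ValueError("cut-off value must fit in two bits")
    return (cutoff << 3) | (gob_number - 1)


def unpack_gn(field):
    """Pre-decoder side: recover the original GOB number and the cut-off."""
    cutoff = (field >> 3) & 0b11
    gob_number = (field & 0b111) + 1
    return gob_number, cutoff
```

After `unpack_gn`, the pre-decoder would write the recovered GOB number back as an ordinary 5-bit GN so that the output remains valid H.263 syntax.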

For the other cut-off values, all the headers will preferably be obtained from the first partition and the pre-decoder may then establish the number of VLC codewords that should be read from the first partition. If the "VLC-LAST" codeword of a block is not found in the first partition, the second partition may be searched until the "VLC-LAST" codeword is found. It is noted that in the ITU-T H.263, a separate codebook is used for "VLC-LAST." If all the VLC codewords in a block use the same codebook (including the "VLC-LAST" codeword), then a unique codeword known as the end of block (EOB) codeword may be used instead once the last run-level symbol in the block is coded (e.g., MPEG-I or MPEG-II; however, any suitable scheme may be used). Under this condition, the pre-decoder needs to identify the EOB codeword instead of the last "VLC-LAST" codeword.

When used with the H.263 standard, the output bitstream of the disclosed pre-decoder should satisfy the correct H.263 syntax; otherwise, it cannot be recognized by the H.263 standard decoder and may consequently result in the failure of video decoding. This situation may happen if the pre-decoder combines the two bitstreams in which the proper syntax has been corrupted by errors. Therefore, according to principles of the invention, the pre-decoder should be configured to detect the uncorrectable errors and recover the corrupted bitstreams using an error concealment technique.

As the second bitstream is more sensitive to errors due to relatively less protection, the employment of an error concealment technique in the second bitstream is useful, although the techniques presented here may be equally applied to the corrupted first bitstream. Four different possibilities of error occurrence in the second bitstream will be discussed below: MB header; VLC codewords; GBSC and GN; and PSC and TR.

The general approach to employing error concealment is that the pre-decoder first combines the data from the two bitstreams on a GOB-by-GOB basis. Initially, the combined bitstream may be temporarily stored in a buffer until the end of a GOB is reached. In addition, some important information (e.g., header information and VLC codewords) may also be saved in the memory during the reassembling process.
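The per-block joining rule described earlier (read up to the cut-off number of codewords from the first partition, then continue in the second partition until the "VLC-LAST" codeword is found) may be sketched as follows. As a simplifying assumption, codewords are modeled as hypothetical `(run, level, last)` tuples instead of bits, and the list `coded_flags`, which says whether each block carries any coefficients, stands in for the information the MB header would provide.

```python
def reassemble_blocks(first, second, coded_flags, n_first):
    """Rebuild the per-block codeword lists from the two partitions.

    first, second: flat codeword sequences of the two partitions.
    coded_flags:   one boolean per block (False for a zero block).
    n_first:       codewords per block carried by the first partition.
    """
    first_it, second_it = iter(first), iter(second)
    blocks = []
    for coded in coded_flags:
        block = []
        if coded:
            # Read from the first partition, stopping early at VLC-LAST.
            while len(block) < n_first:
                codeword = next(first_it)
                block.append(codeword)
                if codeword[2]:
                    break
            # Continue in the second partition until VLC-LAST is found.
            while not block or not block[-1][2]:
                block.append(next(second_it))
        blocks.append(block)
    return blocks
```

In this sketch, running out of codewords raises `StopIteration`; a pre-decoder along the lines described here would treat that condition as a detected error and switch to concealment.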

In the description contained herein, it is assumed that the first bitstream is error free; however, this approach may also be used if the first bitstream were not error free. In addition, the BCU used in the partitioning scheme may be configured in such a way that the cut-off option 0 will not be selected in the INTRA frame. This is a very realistic assumption, as the VLC codeword data in the blocks of an INTRA frame outweigh the header information. This may serve to ensure that all the header information (picture header, GOB header, and MB header) in the INTRA frame will be included in the first bitstream.

If errors are detected in the second bitstream during the reassembling process, the pre-decoder may ignore the upcoming data from the second bitstream for that particular GOB. Then a suitable concealment technique may be applied to restore the corrupted GOB based on the data of the first bitstream and the saved information. It is noted that errors may not be detected and pinpointed at the exact location in which they occur. This is due to the nature of variable-length codes, which may be incorrectly decoded even when only a single bit error occurs. Therefore, in most cases, errors propagate into the bitstream before they can be detected by an appropriate scheme. Conversely, if no error is found in the second bitstream, the content of the buffer will be output to the H.263 standard decoder. The same process may continue in every GOB until the end of sequence indicator is detected in each of the bitstreams.

As it is unlikely that the MB headers in the INTRA frame will be included in the second bitstream, the case where the MB header is corrupted will usually only be presented for the P-frames; however, the invention is not limited to this situation, as it may be employed in a situation when a very low splitting percentage factor has been selected. According to H.263 specifications, an INTER MB consists of the following elements: COD, MCBPC, CBPY, and MVD.

The occurrence of errors in any of these codewords may change the nature of the MB. For example, in the situation in which the COD of the current MB is 1, the MB is uncoded, and the following bit would be the COD of the next MB. However, if the COD is changed from 1 to 0 because of a single bit error, the current MB will be treated as a coded MB. Consequently, the pre-decoder may not expect the next bit to be a COD, but may continue to search for the MCBPC, CBPY, MVD, and VLC codewords from the upcoming bits, without realizing that these are actually the data for the next MB. Such an example is depicted in Figure 8. In this situation, the error may be detected if the upcoming bits fail to form a valid codeword for the MCBPC, CBPY, MVD, or VLC codewords. Otherwise, the pre-decoder may continue to reassemble the two bitstreams and try to search for the "VLC-LAST" of the first block in the MB. In most cases, the error may eventually be realized when the pre-decoder reads the next GBSC (16 bits of "0" and 1 bit of "1") of the next GOB, thus forming an invalid VLC codeword.

Another possibility is that the reassembling process for all the MBs in the current GOB (e.g., for QCIF, there are 11 MBs) may be completed before actually reaching the next GBSC. This may happen if the upcoming bits in the second bitstream coincidentally form one of the codewords of "VLC-LAST" (or EOB code) that indicates the end of the GOB. In this case, the GOB is considered "prematurely" completed and thus the error cannot be detected.

The occurrence of errors on the other bits in the header may also result in the above two possible outcomes. Errors may change the MB type if the MCBPC is affected, or the pattern of the coded blocks for luminance if CBPY is affected. A wrong motion vector may be obtained if MVD is corrupted. In other words, the two possible outcomes where errors occur in the MB header are that errors may be detected before they propagate into the next GOB or errors cannot be detected if the current GOB is prematurely closed.

The concealment method for the first outcome may be performed by first deleting the contents of the temporary buffer, and then constructing a new GOB by setting the COD of every MB to 1. This method is preferably adopted because the MB header which was transported via the second bitstream may be lost and, thus, no information on how each MB is to be coded would be available. Therefore, to satisfy the H.263 syntax, one feasible solution would be to create a GOB in which all its 11 MBs (in the case of QCIF) are considered uncoded (COD = 1). This implies that the standard decoder will use the information from the same GOB in the previous reconstructed frame to recover the video data. On the other hand, the second outcome may be prevented by employing a better scheme for detecting the GBSCs.

As mentioned previously, most of the DCT coefficients are variable-length coded, except the less commonly occurring run-level symbols, which are coded by escape codes. The two possibilities that a VLC codeword is incorrectly decoded at the pre-decoder are that the VLC codeword is corrupted by errors and immediately detected as an unrecognizable codeword at its location or that the VLC codeword is detected as an invalid codeword due to the propagation of previously corrupted data. In either case, errors may be detected if a VLC codeword is changed into an invalid code that cannot be found in the VLC codebook.

Another possible outcome may be that the decoding process of VLC codewords may continue and eventually reach the GBSC of the next GOB and be treated as a VLC codeword. The pre-decoder may then detect an error before it propagates into the next GOB. As mentioned earlier, the other possibility is that the current GOB may be closed "prematurely." This means that the pre-decoder may face the situation of how to deal with the remaining bits in the current GOB. However, the pre-decoder may identify this problem by correctly detecting the next GBSC.

The case where errors may be detected is next considered. The concealment method used to reconstruct the corrupted GOB may be determined by the cut-off value associated with that GOB, and also by the coding mode (INTER or INTRA) of the current frame. The methods used in the INTER-frame (i.e., P-frame) are summarized as follows:

For a cut-off value of 0, the MB header and all the VLC codewords are transmitted via the second bitstream, which has to be ignored during the reconstruction of the current GOB. Upon detection of errors, the reassembling process is terminated and a new GOB is created with 11 uncoded MBs (in the case of QCIF format). This method is the same in the case where the MB header is in error.

For a cut-off value of 1, the MB header is included in the first partition, which is assumed to be error free, while all the VLC codewords are transmitted in the second bitstream. After errors are detected, the pre-decoder will continue to read and save the macroblock header (COD, MCBPC, CBPY, MVD) in the first bitstream while ignoring all the information on the second bitstream that belongs to the current GOB. Upon completion of reading all the macroblock headers in that GOB, a new GOB is created based on the saved header information. For every coded macroblock (i.e., COD = 0) in that GOB, the COD is still set to 0, and the associated MCBPC and CBPY are set to the codes "1" and "11," respectively. MCBPC = "1" indicates that block numbers 4 and 5 (C_B and C_R, as defined in the ITU-T H.263 specifications) do not contain any AC coefficients (as defined in the ITU-T H.263 specifications). According to the ITU-T H.263 specifications, for the INTER block, all the coefficients are considered AC; for the INTRA block, the first coefficient is called DC and the rest are called AC. CBPY = "11" indicates that all the luminance blocks in that macroblock do not contain any AC coefficients (as described in the ITU-T H.263 specifications). The motion vector, MVD, which has been saved, is retrieved and then appended to the end of the reconstructed MB header. This process is illustrated in Figures 9a and 9b. It is noted that any distortion resulting from this error concealment scheme stems from the assumption that every coded MB (COD = 0) in the GOB has a perfect match with its MC reference MB in the previous frame. This assumption is made because all the VLC codewords in the second bitstream are lost and only the MVDs are available.
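The header rewriting for a cut-off value of 1 in a P-frame may be sketched as below. The dictionary layout of a saved MB header and the function name are hypothetical; the MVD is carried as an opaque bit string, and only the codes "1" and "11" stated in the text are used.

```python
def conceal_inter_mb_headers(saved_headers):
    """Rebuild the MB headers of a corrupted GOB from the saved first-bitstream data.

    saved_headers: list of dicts with key "cod" and, for coded MBs
                   (cod == 0), the saved "mvd" bit string.
    Coded MBs keep COD = 0 and their MVD, but MCBPC and CBPY are replaced
    so that no block is signalled as carrying coefficients.
    """
    rebuilt = []
    for header in saved_headers:
        if header["cod"] == 1:
            rebuilt.append({"cod": 1})  # uncoded MB stays uncoded
        else:
            rebuilt.append({
                "cod": 0,
                "mcbpc": "1",   # no coefficients in blocks 4 and 5
                "cbpy": "11",   # no coefficients in the luminance blocks
                "mvd": header["mvd"],
            })
    return rebuilt
```

The decoder then reconstructs each coded MB by motion compensation alone, which is exact only if the prediction residual was zero.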

For a cut-off value of 2, all the header information, followed by the first VLC codeword of every block, is included in the first bitstream. It is noted that there is no INTRA DC in the MBs of the INTER frame, except for those INTRA macroblocks in the P-frame. During the reassembling process, all the header information and the first VLC codeword in the first bitstream are stored in the memory. After errors are detected, the reassembling process stops, but the pre-decoder will continue to capture all remaining header information and the first VLC codeword of every block in that GOB. In the meantime, the remaining VLC codewords (for every block) in the second bitstream will be ignored by the pre-decoder. To reconstruct the new GOB, the saved header information may be retrieved and the codeword for the first VLC codeword of every block may be changed to the one that represents the same run-level symbol, but coded as the "VLC-LAST" codeword. (It is noted that when the EOB code is used the pre-decoder only needs to insert this code after the first VLC codeword.) This assures that the first VLC codeword is always the last VLC codeword in every block. For example, if the first VLC codeword is coded by "10," which is not a "VLC-LAST" and represents a run of 0 and level of 1, the code has to be changed to "0111," which corresponds to the "VLC-LAST" in the codebook with the same run and level (run = 0, level = 1) during the reconstruction process of the GOB (according to the ITU-T H.263 specifications). If the codeword for the "VLC-LAST" codeword is not found in the codebook, an escape code may be used. For example, suppose the first VLC codeword corresponds to a non-"VLC-LAST" with a run = 0 and level = 9. The codeword that represents the "VLC-LAST" with a run = 0 and level = 9 is not available in the VLC codebook. In this case, an escape code "0000 011" followed by "1" (for being the last codeword), run = "000 000," level = "1111 0111" is used (according to the H.263 specifications).
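The substitution of a "VLC-LAST" codeword, with an escape-code fallback, may be sketched as follows. This is a sketch under several assumptions: the table holds only the single entry quoted in the text (a complete table would have to be taken from the H.263 specification), sign bits are ignored, and the 8-bit level field is filled with a two's-complement encoding, which is an assumption of this sketch rather than something the text spells out. The names `LAST_TABLE` and `close_with_codeword` are hypothetical.

```python
ESCAPE = "0000011"  # 7-bit escape prefix quoted in the text

# Only the (run, level) entry quoted in the text; illustrative, not complete.
LAST_TABLE = {(0, 1): "0111"}


def close_with_codeword(run, level, last_table=LAST_TABLE):
    """Return a codeword for (run, level) that also marks the end of block.

    Uses the "VLC-LAST" table entry when one exists; otherwise builds an
    escape sequence: prefix, LAST = "1", 6-bit run, 8-bit level field.
    """
    code = last_table.get((run, level))
    if code is not None:
        return code
    if not 0 <= run < 64:
        raise ValueError("run must fit in six bits")
    if level == 0 or not -128 < level < 128:
        raise ValueError("level must be a non-zero value that fits in eight bits")
    # Assumption: the level field is encoded as 8-bit two's complement.
    return ESCAPE + "1" + format(run, "06b") + format(level & 0xFF, "08b")
```

With this layout every escape sequence is 22 bits long, so the fallback costs more bits than a table codeword but always yields syntax that closes the block.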

For a cut-off value of 3, all the header information and two VLC codewords or fewer may be included in the first bitstream. Similarly, the header information may be saved in the memory during the reassembling process, while the combined bitstream may be temporarily stored in a buffer. In addition, in the case where there are more than two VLC codewords in a block, the first two VLC codewords in the first bitstream may also be saved in the memory. After errors are detected, the reassembling process stops and the pre-decoder continues to capture the remaining header information and the VLC codewords in the first bitstream for that GOB. The remaining data in the second bitstream then may be ignored and the content of the buffer may not be used to reconstruct the GOB. The method to reconstruct the GOB is similar to the one discussed above for a cut-off value of 2. First, the GOB header may be retrieved from the memory to the output bitstream. Second, the macroblock header may be retrieved to indicate how the MB is coded. Finally, the VLC codewords are restored depending on the number of VLC codewords which are saved from the first bitstream. The non-"VLC-LAST" codewords in every block may be restored exactly as they appeared in the first bitstream. For the last saved codeword, the codeword may be changed to the one that corresponds to the same run-level symbol, but its equivalent "VLC-LAST", as described above. If the combination of "VLC-LAST" with the same run-level symbol is not found in the codebook, an escape code is used. An example of the concealment method is given in Figures 10a and 10b. Both bitstreams are viewed as they are concatenated to each other in Figure 10a. The corrupted data in the second bitstream is ignored during the error concealment, as shown in Figure 10b. It is noted that the second VLC codeword becomes the "VLC-LAST" codeword, which indicates the end of the block.

The concealment methods employed in the INTRA frame are not entirely identical to those employed in the INTER frames, due to the presence of INTRA DC. The performance of the prediction algorithm used in the INTER frames greatly depends on the INTRA DC. The loss of INTRA DC will have a significant effect on the quality of the video. The following describes methods for the INTRA frame based on the decoded cut-off value of a GOB:

As mentioned above, the BCU in the splitting scheme presented is configured not to select the cut-off value 0 for the INTRA frame. There is no concealment method defined for this presented option. However, as contemplated by the invention, any suitable configuration may be used.

For a cut-off value of 1, all header information may be included in the first bitstream, while the INTRA DC and the VLC codewords may be transmitted in the second bitstream. Similar to the approach in INTER frames (i.e., P-frames), the header information in every GOB is always saved in the memory before and after errors are detected. Since the second bitstream may be ignored during error concealment, the INTRA DCs in the entire GOB may be lost. In order to reconstruct the corrupted GOB, INTRA DC is interpolated from the nearest available block. INTRA DCs from the previous decoded GOB may be used to replace the lost INTRA DC in the current GOB. It is noted that the interpolation accuracy may be significantly enhanced if the INTRA DCs in the following GOBs are also considered. The interpolation of INTRA DC from the previous GOB is shown in Figure 11. In Figure 11, blocks 0 through 3 are the luminance blocks, and blocks 4 and 5 are the chrominance blocks, C_B and C_R, respectively. The INTRA DC of block 2 in the previous GOB may be taken to replace the INTRA DCs of block 0 and block 2 in the current GOB. Similarly, the INTRA DC of block 3 in the previous GOB may replace the INTRA DCs of block 1 and block 3 in the current GOB. For the chrominance blocks, the INTRA DCs may be directly obtained from the corresponding blocks in the previous GOB. This process repeats for every macroblock in the GOB. However, the interpolation may utilize the previous GOB as well as the next GOB to improve interpolation accuracy.
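The replacement of lost INTRA DC values from the previous GOB may be sketched as below. The data layout (one list of six DC values per macroblock, indexed by block number 0 through 5) and the function name are assumptions made for illustration; only the previous GOB is used, not the following one.

```python
def conceal_intra_dc(prev_gob_dcs):
    """Derive replacement INTRA DCs for the current GOB from the GOB above it.

    prev_gob_dcs: one entry per macroblock of the previous GOB, each a list
                  of six DC values for blocks 0..5 (0-3 luminance, 4 C_B, 5 C_R).
    Blocks 2 and 3 form the bottom luminance row of the macroblock above,
    so they are the nearest neighbours of the current macroblock.
    """
    concealed = []
    for dc in prev_gob_dcs:
        concealed.append([
            dc[2],  # block 0 <- block 2 of previous GOB
            dc[3],  # block 1 <- block 3 of previous GOB
            dc[2],  # block 2 <- block 2 of previous GOB
            dc[3],  # block 3 <- block 3 of previous GOB
            dc[4],  # C_B copied from the corresponding block
            dc[5],  # C_R copied from the corresponding block
        ])
    return concealed
```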

For a cut-off value of 2, all the header information and the INTRA DC of every block may be included in the first bitstream. These data may be saved in the memory during the reassembling process so that they may be retrieved later to reconstruct the corrupted GOB. As all the VLC codewords in the second bitstream may be discarded after errors are detected, INTRA DC will be the only coefficient in every block that may be used for error concealment. Thus, it may be required that the codewords of MCBPC and CBPY change to "1" and "0011," respectively (as is shown in the VLC codeword tables in the H.263 specifications). MCBPC = "1" indicates that neither of the chrominance blocks contains any AC coefficients. CBPY = "0011" indicates that the four luminance blocks do not contain any AC coefficients. Accordingly, every block in the GOB may be closed after the INTRA DC is retrieved. An example of this concealment method in a particular macroblock is shown in Figures 12a and 12b. In Figure 12a, MCBPC "011" and CBPY "11" indicate that all the blocks in the MB contain AC coefficients, assuming that the VLC codewords transmitted in the second bitstream are corrupted by errors. Consequently, as shown in Figure 12b, a new macroblock may be constructed based on the macroblock header and the saved INTRA DC in every block. It is noted that MCBPC is changed from "011" to "1" and CBPY is changed from "11" to "0011" to signify the fact that no VLC codewords representing the AC coefficients are available in this MB.

For a cut-off value of 3, all the header information, INTRA DC and one VLC codeword (or fewer) in every block may be included in the first bitstream. The remaining VLC codewords may be transported by the second bitstream. The applied concealment technique, after errors are detected in the second bitstream, may be similar to the one discussed above for the INTER frame (i.e., P-frame) when the cut-off value is 3. The fundamental objective is to change the codeword of the last VLC codeword saved from the first bitstream to an appropriate codeword that signals the end of block according to the principles of the invention.

It is noted that in some cases an error may change a VLC codeword to another valid VLC codeword with the same length. In this situation, the pre-decoder may be unable to detect the corrupted VLC codeword and thus may continue with the process of realigning the bitstreams. However, such a change to the VLC codeword may affect the decoding of the run-level symbol and result in a different number of DCT coefficients, which may consequently cause the H.263 standard decoder to crash if the number of DCT coefficients exceeds 64 (for an 8-by-8 DCT block). To prevent this occurrence, the following may be done. The pre-decoder, before sending out the final bitstream, will count the number of DCT coefficients in every block (by decoding all its VLC codewords). If the number does not exceed 64, the block is considered safe (in the sense that the H.263 decoder will accept it). Otherwise, all the VLC codewords for AC coefficients in the block are removed by changing the MCBPC (if the block is a chrominance block) or CBPY (if the block is a luminance block) accordingly. It is noted that to implement this feature, the pre-decoder will require a VLC codebook that identifies the number of DCT coefficients represented by every VLC codeword.
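The coefficient-count safeguard may be sketched as follows. It assumes that each decoded codeword is available as a hypothetical `(run, level, last)` tuple and that a run-level symbol accounts for `run` zero coefficients plus one non-zero coefficient; the function names are not from the disclosure.

```python
def count_coefficients(codewords, has_intra_dc=False):
    """Count the DCT coefficient positions covered by a block's codewords.

    Each (run, level, last) symbol covers `run` zeros plus one non-zero
    coefficient; an INTRA DC, when present, occupies one more position.
    """
    count = 1 if has_intra_dc else 0
    for run, _level, _last in codewords:
        count += run + 1
    return count


def is_block_safe(codewords, has_intra_dc=False, block_size=64):
    """True if the block does not overrun an 8-by-8 (64-coefficient) block."""
    return count_coefficients(codewords, has_intra_dc) <= block_size
```

A block that fails this check would have its AC codewords dropped and the corresponding MCBPC or CBPY bit cleared, as described above.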

As mentioned previously, PSC, TR, GBSC, and GN may be included as the extra information added to the second bitstream during the video partitioning. They may be used to align the two bitstreams in order to reassemble them at the receiver. Therefore, it may be deduced that the loss of this information during transmission may cause misalignment between the two bitstreams, which in turn may result in a failure of the reassembling process. GBSC may also be used as a barrier to prevent errors in a GOB from propagating into the next GOB. This configuration makes the second bitstream more robust to transmission errors. The significance of PSC and TR will be discussed below. A good technique for detecting the GBSCs may improve the efficiency of error concealment. If all the GBSCs in a frame can be detected before the reassembling process starts, it may be easier for the pre-decoder to select an appropriate concealment method for a certain occurrence of errors. However, this approach may introduce additional delay in decoding and may require a larger buffer in order to store all the data in a frame before the reassembling process begins. An alternative approach may operate on a GOB-by-GOB basis. The pre-decoder may be required to always detect the next available GBSC before the start of the reassembling process at every GOB. For example, after the PSC and TR in the picture header are read, the next available GBSC in the second bitstream may be searched and the following GN may then be stored. If the GBSC of the second GOB is corrupted by errors, the search may continue until a valid GBSC is captured. Based on the GN, the pre-decoder can identify the GOB whose GBSC has been corrupted. For the first bitstream, this searching process may not be required as the first bitstream may be assumed to be error-free.

At the beginning of every GOB, the next available GBSC in the second bitstream may be searched. Then, the following GN, as well as the address of that GN, may be stored. This address may be used to align the two bitstreams for the next GOB in the case where the second bitstream for the current GOB is dropped after errors are found. After the searching process, the pre-decoder may return to the beginning of the current GOB and start the reassembling process. The data corresponding to one GOB may be combined from the two bitstreams and stored temporarily in a buffer. If errors are detected within the GOB, the second bitstream may be dropped and an appropriate error concealment method may be selected to reconstruct the corrupted GOB, as discussed above, so that the new GOB will then be in agreement with the H.263 standard syntax.

To proceed into the next GOB, the pre-decoder may move from the position where errors have been detected to the location specified by the address of the GN saved during the searching process.
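The look-ahead search for the next GBSC and its GN may, purely as an example, be sketched as shown below. The bitstream is modeled as a string of '0' and '1' characters, the GBSC is taken to be the 17-bit H.263 code (sixteen zero bits followed by a one), and the GN is assumed to occupy the five bits that follow. Byte alignment and start-code emulation are ignored, and the function name is hypothetical.

```python
GBSC = "0" * 16 + "1"  # 17-bit group-of-block start code
GN_BITS = 5            # assumed width of the group number field


def find_next_gbsc(bits, start=0):
    """Locate the next GBSC at or after `start` in the second bitstream.

    Returns (gn, gn_address), where gn_address is the bit position of the
    GN field, or None if no complete GBSC and GN can be found.
    """
    pos = bits.find(GBSC, start)
    if pos == -1:
        return None
    gn_address = pos + len(GBSC)
    gn_field = bits[gn_address:gn_address + GN_BITS]
    if len(gn_field) < GN_BITS:
        return None
    return int(gn_field, 2), gn_address
```

The stored `gn_address` plays the role of the saved address: if the second bitstream for the current GOB is dropped, the pre-decoder can jump directly to that position to begin the next GOB.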

If no error is found throughout the reassembling process, further confirmation may be made. It is noted that the reassembling process stops after the codeword that indicates the "VLC-LAST" codeword of a GOB is obtained (or the EOB code). This further confirmation may assist in deciding if the GOB is completely or prematurely closed. (The reasons for a GOB to be prematurely closed were discussed above.) This task may be carried out by checking the next 16 bits in the second bitstream immediately after the reassembling process stops. If the GOB is completely closed, the 16 bits would be the 16 zero bits of the next GBSC. The content of the buffer may then be sent to the H.263 standard decoder. On the other hand, if the next 16 bits are not all zero bits and the next GBSC is known to be error-free, the GOB may be considered to be prematurely closed. The advantage of always detecting one GBSC in advance is that the pre-decoder may anticipate any erroneous GBSCs. After the GOB is confirmed as being prematurely closed, the second bitstream is dropped and the GOB is reconstructed based on the first bitstream only. For a GOB whose GBSC or GN in the second bitstream is corrupted, the reassembling process will not be invoked. Instead, the second bitstream may be ignored and GOB reconstruction depends solely on the first bitstream. This may be repeated for every GOB in the video sequence.

PSC and TR may be embedded in the bitstreams at the beginning of every frame. As discussed previously, PSC is used for the synchronization of the picture frames. TR is a number that identifies the picture frame and takes the values 0, 3, 6, 9, and so on (based on 10 frames/s). As disclosed earlier, because of the error sensitivity of TR, inner-level error parity check bits may be added to the TR codeword to improve its recovery. This information may also be included in the second bitstream so that it can be aligned with the first bitstream during the reassembling process.
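The closure check on a reassembled GOB (inspecting the 16 bits that follow the point where reassembly stops) may be illustrated by the following minimal sketch. The string-of-bits representation, the function name, and the "unknown" outcome for the case in which the next GBSC is not known to be error-free are assumptions made for the example.

```python
def gob_closure(second_bits, stop_pos, next_gbsc_ok=True):
    """Classify a reassembled GOB as 'complete', 'premature' or 'unknown'.

    second_bits  -- second bitstream as a string of '0'/'1' characters
    stop_pos     -- bit position at which the reassembling process stopped
    next_gbsc_ok -- whether the look-ahead found the next GBSC error-free
    """
    window = second_bits[stop_pos:stop_pos + 16]
    if window == "0" * 16:
        return "complete"    # the 16 zero bits of the next GBSC follow
    if next_gbsc_ok:
        return "premature"   # drop the second bitstream for this GOB
    return "unknown"         # assumption: rely on the first bitstream only
```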

At the beginning of every frame, the pre-decoder may search for the available PSC and the protected TR in both bitstreams. Subsequently, the TR in the first bitstream, after being decoded, may be compared with the decoded TR in the second bitstream. If they do not match, it may be interpreted that either the PSC or the TR in the second bitstream is corrupted by errors (assuming an error-free first bitstream). Consequently, the data for the entire frame in both bitstreams may be discarded and the pre-decoder may proceed into the next available frame.

In order to ensure that the two bitstreams are aligned correctly in upcoming frames, one assumption is made based on the performance of error protection coding according to the principles of the invention. It is assumed that the maximum number of consecutive PSCs (or TRs) corrupted by errors in the second bitstream is three. In other words, if the PSC or TR of frame 3 is in error, the PSC and TR of frame 12 are always assumed error-free. (This assumption may be made to limit the pre-decoding delay. It is believed that this assumption is realistic based on the experiments under various channel conditions and where software programs had been constrained to avoid excessive delay. For example, for 10 frames/s the coded frames are: 0, 3, 6, 9, 12, and 15. This assumption contemplates that if errors corrupt frames 3, 6, and 9, then frame 12 is taken to be uncorrupted.) Two example cases will be considered. In the first example, TR = 3 is detected in the first bitstream and TR = 6 is detected in the second bitstream. This means that the PSC of frame 3 in the second bitstream is in error and cannot be detected by the pre-decoder. In this situation, frame 3 will be dropped and the pre-decoder starts reassembling the two bitstreams starting at frame 6. In the second example, it is assumed that TR = 3 for the first bitstream and TR = 14 for the second bitstream. Since 14 is not a valid number for TR, it can be deduced that TR has been corrupted by errors. Unfortunately, the exact frame whose TR is in error cannot be identified. However, by means of the assumption made, the PSC and TR of frame 12 are error-free. Therefore, frames 3, 6, and 9 will be dropped and the reassembling process restarts from frame 12. Errors occurring in the PSC or TR will thus result in the loss of at most three consecutive frames.

According to principles of the invention, the pre-decoder may be sufficiently robust to handle all possible error occurrences and produce an output bitstream that is decodable by the H.263 standard decoder; it is for this reason that the inner-level FEC may be considered for the TR codeword. Therefore, after the PSC code is detected, the process of checking and correcting the TR code may begin. It is noted that the inner FEC may also be considered for the GN codeword.
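The frame-level alignment rule illustrated by the two example cases above may be sketched, under stated assumptions, as follows. The sketch assumes a TR increment of 3 (10 frames/s), a maximum of three consecutive corrupted PSC or TR codes, and no wrap-around of the TR counter; the function name and return convention are hypothetical.

```python
TR_STEP = 3         # TR increment at 10 frames/s (0, 3, 6, 9, ...)
MAX_BAD_FRAMES = 3  # assumed maximum run of corrupted PSC/TR codes


def align_frames(tr_first, tr_second):
    """Decide which frames to drop and where reassembly resumes.

    tr_first  -- TR decoded from the (error-free) first bitstream
    tr_second -- TR decoded from the second bitstream
    Returns (dropped_trs, resume_tr).
    """
    if tr_second == tr_first:
        return [], tr_first
    valid = tr_second > tr_first and tr_second % TR_STEP == 0
    if valid and (tr_second - tr_first) // TR_STEP <= MAX_BAD_FRAMES:
        resume = tr_second
    else:
        # TR itself is unusable: fall back on the three-frame assumption.
        resume = tr_first + MAX_BAD_FRAMES * TR_STEP
    return list(range(tr_first, resume, TR_STEP)), resume
```

With TR = 3 in the first bitstream, a second-bitstream TR of 6 drops frame 3 only and resumes at frame 6, whereas an invalid TR of 14 drops frames 3, 6, and 9 and resumes at frame 12.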

In another preferred embodiment, a system having a computer-readable memory may perform pre-decoding of partitioned or unpartitioned video data. The system preferably may include a first bitstream field, a second bitstream field, a PSC detector field, a GBSC detector field, a cut-off value field, a GN field, a TR field, a temporary buffer field, an inner-level FEC decoder field, an outer-level FEC decoder field, a VLC codebook field, and an error concealment field. The system may perform reassembling similar to that discussed above.

A splitting scheme defined in one preferred embodiment was used to partition the first 300 frames of two video sequences: "Salesman" and "Clare." With these same data sets, QCIF was used in order to embed the cut-off value in the GN codeword. It is noted that a separate field in the GOB header in the first partition might also have been considered. The splitting algorithm was initially examined when the sample data sets were coded at 30 frames/s by considering equal-size partitioning (e.g., X = 50%). In these tests, no INTRA frame reset was considered and, therefore, except for the first frame which was INTRA frame coded, the remaining 299 frames were INTER frame coded. It is understood that the number of frames and the frame rate presented herein are merely illustrative; any suitable number of frames and frame rates may be used.

In Figure 7, the percentage differences between the two bitstreams coded at various fixed bit rates ranging from 16 kb/s to 128 kb/s are shown for the "Salesman" and "Clare" data sets. As is shown in Figure 7, the 50% splitting target was successfully met for both sequences. At 16 kb/s, the number of VLC codewords drops due to the coarse quantization and, therefore, does not leave the second bitstream with enough information to transport. This is due largely to utilizing full temporal resolution, which can result in fewer VLC codewords as a consequence of using coarser quantization (i.e., forcing more DCT coefficients to zero). It is for this reason that a smaller number of frames per second is normally used for very low bit rate applications. It is noted that the experiments presented herein have been applied to prediction frames (P-frames); however, application of the partitioning scheme to any form of bi-directional frames is also contemplated by the invention.

In another set of experiments, the first 300 frames of the "Salesman" and "Clare" sequences were encoded at a rate of 15 frames/s. In addition, in order to allow for uneven partitioning the splitting percentage was set at 46%. Table I shows the results which include detailed values of the coding and splitting parameters.

Table I: Results for "Salesman" and "Claire" sequences coded at 15 frames per second with 46% splitting.

As would be expected by lowering the frame rate, the partitioning may be successfully accomplished for all bit rates, even at a 46% splitting target. The results presented in Table I indicate that, as the bit rate goes up, the number of frames with split-header GOBs decreases. Thus, as bit rates increase, the encoder can afford to apply a finer quantization step size, which results in more VLC codewords being generated. For a given splitting percentage, the generation of a higher number of VLC codewords can reduce the possibility of resorting to the split-header option. Consequently, this may have a very positive effect on the progression of distortion, as none of the error-sensitive header information will be transported via the second partition. In order to provide a subjective evaluation with respect to distortion build-up, a situation was created where the entire second bitstream is corrupted by errors (except for the synchronization bits). Figures 13a and 13b and 13e and 13f depict the last decoded frame (i.e., frame 300) of the sample data sets encoded at 32 kb/s for the "Clare" and the "Salesman" sample data sets, respectively. For the sake of comparison, Figures 13c and 13d and 13g and 13h depict the reconstructed frames where the second bitstream is received error-free for the "Clare" and the "Salesman" sample data sets, respectively. In order to examine the effect of temporal resolution on the performance of VLC-based partitioning, these results are presented at frame rates of 15 frames/s and 10 frames/s. As can be seen from the quality of the last decoded frames, a considerable improvement has been gained by dropping the frame rate from 15 frames/s to 10 frames/s; in particular, the improvement is qualitatively better for the "Clare" sequence. This is due to the fact that, in this sequence, the number of GOBs with a split header tends to drop more sharply as the frame rate reduces to 10 frames/s.

It should be emphasized that such a quality assessment is confined to the progression of distortion due to the loss of the second bitstream. Therefore, it does not reflect any deterioration that may have been caused by lowering the temporal resolution (unless reviewed in real time). It should also be indicated that the relative percentage of synchronization overhead added to the second bitstream for aligning the two bitstreams at every GOB may become relatively less significant as the frame rate rises.

Another aspect of the experiments conducted with respect to one preferred embodiment of the invention concerns the transmission aspects of the partitioned video signal. In these experiments, error control parity bits (i.e., outer-level FEC1) were added only to the first bitstream, while the second bitstream was transmitted unprotected (i.e., without an outer-level FEC2). The partitioned video signal was then modulated and transmitted via additive white Gaussian noise (AWGN) channels under Rayleigh (i.e., flat fading) conditions. It is noted that a Rayleigh fading model is used in mobile communication systems to evaluate the effect of fading due to multiple scatterers in the vicinity of a mobile receiver unit.

The transmission model for these experiments consisted of BCH coding, Gray-coded 16-QAM modulation assisted with fade estimation and compensation, and space diversity (i.e., two-branch switched diversity). In conventional pilot symbol assisted modulation, a symbol representing a known phasor (note that 16-QAM consists of 16 phasors) is allocated at the beginning of each transmitting frame, which otherwise consists of K-1 data symbols. At the receiver, the demodulated pilot symbol, which was transmitted at the beginning of each frame, may then be used to interpolate the symbol-spaced samples of the received signal and thus reduce the effect of multipath fading. The Gray-coded 16-QAM scheme was selected for its superb spectral efficiency and its dual multipath resistance feature. It is noted that in this scheme, the multipath-resistant channel (i.e., the first channel) is represented by the most significant bit (MSB) of both the in-phase and quadrature-phase codes, and the second channel is formed by the remaining two bits (i.e., the least significant bits). As the significant difficulty with multilevel QAM is that neither the energy per symbol nor the distance between different symbol states is constant, a symbol-assisted fade estimation and compensation technique was employed to combat multipath fading. Thus, in the experiments presented herein, a symbol representing a known phasor was allocated at the beginning of each transmitting frame of K-1 data symbols. This means that for every K-1 data symbols there is one pilot symbol, which will be used to estimate fading at the receiver. The concept of using a pilot symbol is based on the fact that the receiver would know the amplitude and phase of the transmitted symbol as well as its location in the frame. The receiver, therefore, may use this information to estimate the fade on the received pilot symbol.
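As a non-limiting illustration, the Gray-coded 16-QAM mapping and the pilot-based fade compensation described above may be sketched as follows. The particular Gray labeling, the pilot phasor value, and all function names are assumptions made for the example. The sketch also assumes a noise-free channel whose fade is constant over one frame, so the interpolation between pilots used in the experiments is not shown.

```python
import cmath

# Assumed Gray labeling per axis: the MSB selects the sign (first channel),
# the LSB selects the magnitude (second channel).
GRAY_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
PILOT = complex(3, 3)  # assumed known pilot phasor


def modulate(i_bits, q_bits):
    """Map two (MSB, LSB) pairs to one 16-QAM symbol."""
    return complex(GRAY_LEVELS[i_bits], GRAY_LEVELS[q_bits])


def demap_axis(value):
    """Recover (MSB, LSB) from one compensated axis value."""
    msb = 1 if value >= 0 else 0      # sign decision, larger margin
    lsb = 1 if abs(value) < 2 else 0  # magnitude decision, smaller margin
    return msb, lsb


def build_frame(data_symbols):
    """Prefix the K-1 data symbols of a frame with one pilot symbol."""
    return [PILOT] + list(data_symbols)


def apply_flat_fade(frame, gain, phase):
    """Multiply every symbol by a single complex fade."""
    h = gain * cmath.exp(1j * phase)
    return [h * s for s in frame]


def receive_frame(rx_frame):
    """Estimate the fade from the pilot and compensate the data symbols."""
    fade = rx_frame[0] / PILOT
    return [s / fade for s in rx_frame[1:]]
```

In this labeling the MSB decision depends only on the sign of each axis, which suggests why the first channel is the more fade-resistant of the two.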

At the receiver, the estimated symbols for consecutive frames were then utilized to interpolate the symbol-spaced samples of the received envelope. For interpolation, a first-order Gaussian filter was applied. It is understood that the transmission model and other technical details described for this experiment are merely illustrative; the preferred embodiment presented is not limited to applications with the preceding technical details. For instance, HSCSD/GSM or GSM/GPRS, due to their multi-slot capabilities, may be effectively utilized for transporting the partitioned video.

Figure 14 illustrates the average peak-to-peak signal-to-root-mean-square-noise ratio (PSNR) of the reconstructed video frames for the "Salesman" sample data set versus the channel SNR (i.e., the average signal power to the average noise power) using BCH(63, 45, 3) and BCH(63, 51, 2) to protect the first bitstream. These results were obtained at a symbol rate of 24 kbaud/s with a frame length of eight symbols (i.e., K = 8) and a Doppler frequency of 120 Hz. In order to accommodate the differing amounts of parity check bits associated with each of the two BCH codes, the 300 frames of the "Salesman" sequence were encoded at 59 kb/s and 62.4 kb/s. This was to make sure that, after adding the BCH code for each of the experiments, the overall bit rate could be accommodated for transmission over the 24 kbaud/s of the 16-QAM. The video frame rate was then set at 15 frames/s, which resulted in 150 coded frames (except for the first frame, all the remaining frames were encoded as P-frames). For a thorough evaluation of the transmission system, the sequence was repeated 100 times to generate longer data.

After the last frame was encoded (frame 150), the first frame of the repeated sequence was INTRA frame coded (also referred to as an I-frame), thus resulting in an INTRA frame reset period of 150 frames. In addition, to minimize the effect of deep fades, a conventional block interleaver of 16x24 bits was applied to each channel. At the receiver, the output of each QAM channel was sent to the input of the pre-decoder unit to align the two outputs. These parameters were discussed above.
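A conventional 16x24 block interleaver of the kind mentioned above may be sketched as follows. The write-by-rows, read-by-columns orientation is an assumption, since the text does not state which dimension is written first, and the function names are hypothetical.

```python
ROWS, COLS = 16, 24  # interleaver dimensions in bits


def interleave(bits, rows=ROWS, cols=COLS):
    """Write the block row by row, then read it out column by column."""
    if len(bits) != rows * cols:
        raise ValueError("block must contain rows * cols bits")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(bits, rows=ROWS, cols=COLS):
    """Invert interleave(): restore the original row-by-row order."""
    if len(bits) != rows * cols:
        raise ValueError("block must contain rows * cols bits")
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]
```

Under this orientation, bits that are adjacent on the channel are 24 positions apart in the original order, so a burst of errors caused by a deep fade is spread across many codewords after de-interleaving.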

The results depicted in Figure 14 indicate that at low channel SNR, the average PSNR of the reconstructed frames suffers severely from the multipath fading effect which could extend even beyond the multipath resistant first channel. However, with increasing channel SNR, the bit-error rate (BER) performance of the data on the first channel is enhanced considerably. Consequently, this may lead to better recovery of the first channel data and thus make the contributions of the symbol-assisted fade estimation as well as the BCH code more effective.

For comparison purposes, the transmission of a non-partitioned H.263 coded bitstream under the same transmission conditions was also conducted and the results are included in Figure 14. In this instance, the BCH(63, 51, 2) code was applied to the entire bitstream. This required adjusting the coding rate to 58.35 kb/s to accommodate the additional BCH parity bits (for covering the entire bitstream) as well as taking into consideration the removal of the synchronization bits on the second bitstream, which were not needed in this case. The coded bitstream was then split into two separate bitstreams in sequential order via a serial-to-parallel converter and subsequently transported by each of the 16-QAM channels. As shown in Figure 14, transmission of the non-partitioned video via the same multipath fading channel is almost impossible at channel SNRs up to 25 dB. This is despite the fact that its error correcting overhead is almost doubled compared with its BCH counterpart on the partitioned video.

It is understood that the scope of the invention includes any combination of the elements of the embodiments disclosed herein.

Claims

What is claimed is:

1. A method of partitioning video data for transmission comprising: calculating at least one discrete cosine transform (DCT) coefficient from a block of said video data; performing a variable-length coding (VLC) operation on said at least one DCT coefficient to generate at least one VLC codeword; and splitting said at least one VLC codeword into a first partition and a second partition.

2. The method of claim 1, wherein said first partition comprises header information from said video data.

3. The method of claim 2, wherein said first partition further comprises a first VLC codeword from said at least one VLC codeword.

4. The method of claim 3, wherein said first partition further comprises a second VLC codeword from said at least one VLC codeword.

5. The method of claim 1, further comprising updating a cut-off value for said block of video data, wherein said cut-off value adaptively updates a splitting ratio of said first partition and said second partition.

6. The method of claim 5, wherein said cut-off value is transmitted in advance of said block of video data.

7. A method of partitioning video data for transmission comprising the steps of: calculating at least one discrete cosine transform (DCT) coefficient from a block of said video data; performing a variable-length coding (VLC) operation on said at least one DCT coefficient to generate at least one VLC codeword; and splitting said at least one VLC codeword into a first partition and a second partition.

8. A method of partitioning and reassembling video data comprising: calculating at least one discrete cosine transform (DCT) coefficient from a block of said video data; performing a variable-length coding (VLC) operation on said at least one DCT coefficient to generate at least one VLC codeword; splitting said at least one VLC codeword into a first partition and a second partition; transmitting said first partition and said second partition; receiving said first partition and said second partition; combining said first partition and said second partition of said video data into an aggregate bitstream; searching for at least one error in said aggregate bitstream; pre-decoding said aggregate bitstream to produce a pre-decoded bitstream; and decoding said pre-decoded bitstream; wherein said pre-decoding ignores a portion of upcoming data from said aggregate bitstream and conceals said portion of upcoming data in an existence of said at least one error in said aggregate bitstream.

9. An apparatus for partitioning and reassembling video data comprising: a discrete cosine transform (DCT) calculator; a processor for variable-length coding (VLC); a splitter; a transmitter; a receiver; a buffer; a detector; a pre-decoder; and a decoder; wherein said DCT calculator calculates at least one DCT coefficient from a block of said video data, said processor for VLC generates at least one VLC codeword from said at least one DCT coefficient, and said splitter splits said at least one VLC codeword into a first partition and a second partition, said buffer creates an aggregate bitstream from said first partition and said second partition, said detector senses the existence of at least one error in said aggregate bitstream, said pre-decoder ignores a portion of upcoming data from said aggregate bitstream and conceals said portion of upcoming data in an existence of said at least one error in said aggregate bitstream.

10. An apparatus for partitioning video data for transmission comprising: a discrete cosine transform (DCT) calculator; a processor for variable-length coding (VLC); and a splitter; and wherein said DCT calculator calculates at least one DCT coefficient from a block of said video data, said processor for VLC generates at least one VLC codeword from said at least one DCT coefficient, and said splitter splits said at least one VLC codeword into a first partition and a second partition.

11. The apparatus of claim 10, wherein said first partition comprises a header of said video data.

12. The apparatus of claim 11, wherein said first partition further comprises a first VLC codeword from said at least one VLC codeword.

13. The apparatus of claim 12, wherein said first partition further comprises a second VLC codeword from said at least one VLC codeword.

14. The apparatus of claim 10, wherein said splitter adaptively updates a splitting ratio.

15. An apparatus for partitioning video data for transmission comprising: means for calculating a discrete cosine transform (DCT); means for performing variable-length coding (VLC); and means for splitting; and wherein said means for calculating a DCT calculates at least one DCT coefficient from a block of video data, said means for performing VLC generates at least one VLC codeword from said at least one DCT coefficient, and said means for splitting divides said at least one VLC codeword into a first partition and a second partition.

16. In a system for bitstream partitioning of video data, a computer-readable memory for storing data for access by an application program comprising: a data structure stored in said computer-readable memory, said data structure including information used by said application program and including: a plurality of image fields; a plurality of block fields; a plurality of discrete cosine transform (DCT) fields; a plurality of variable-length coding (VLC) fields; a plurality of first partition fields; and a plurality of second partition fields.

17. The data structure of said computer-readable memory of claim 16 further comprising a plurality of reference fields.

18. The data structure of said computer-readable memory of claim 16 further comprising a plurality of group-of-block fields.

19. The data structure of said computer-readable memory of claim 16 further comprising a plurality of macroblock fields.

20. The data structure of said computer-readable memory of claim 16 further comprising a plurality of buffer fields.

21. The data structure of said computer-readable memory of claim 16 further comprising a plurality of buffer control unit fields.

22. The data structure of said computer-readable memory of claim 16 further comprising a plurality of percentage factor fields.

23. A method of reassembling video data comprising: combining a first bitstream and a second bitstream of said video data into an aggregate bitstream; searching for at least one error in said aggregate bitstream; pre-decoding said aggregate bitstream to produce a pre-decoded bitstream; and decoding said pre-decoded bitstream; wherein said pre-decoding ignores a portion of upcoming data from said aggregate bitstream and conceals said portion of upcoming data in an existence of said at least one error in said aggregate bitstream.

24. A method of reassembling video data comprising the steps of: combining a first bitstream and a second bitstream of said video data into an aggregate bitstream; searching for at least one error in said aggregate bitstream; pre-decoding said aggregate bitstream to produce a pre-decoded bitstream; and decoding said pre-decoded bitstream; wherein said step of pre-decoding ignores a portion of upcoming data from said aggregate bitstream and conceals said portion of upcoming data in an existence of said at least one error in said aggregate bitstream.

25. An apparatus for reassembling video data comprising: a buffer; a detector; a pre-decoder; and a decoder; wherein said buffer creates an aggregate bitstream from a first bitstream and a second bitstream, said detector senses the existence of at least one error in said aggregate bitstream, said pre-decoder ignores a portion of upcoming data from said aggregate bitstream and conceals said portion of upcoming data in an existence of said at least one error in said aggregate bitstream.

26. An apparatus for reassembling video data comprising: means for creating an aggregate bitstream from a first bitstream and a second bitstream; means for detecting an existence of at least one error in said aggregate bitstream; means for pre-decoding said aggregate bitstream by ignoring a portion of upcoming data from said aggregate bitstream and concealing said portion of upcoming data in an existence of said at least one error in said aggregate bitstream.

27. In a system for reassembling video data, a computer-readable memory for storing data for access by an application program comprising: a data structure stored in said computer-readable memory, said data structure including information used by said application program and including: a plurality of first bitstream fields; a plurality of picture start code (PSC) detector fields; a plurality of group of block start code (GBSC) detector fields; a plurality of group number (GN) fields; a plurality of temporal reference (TR) fields; a plurality of inner-level forward error correction (FEC) decoder fields; a plurality of outer-level FEC decoder fields; a plurality of variable-length coding (VLC) codebook fields; and a plurality of error concealment fields.

28. The data structure of said computer-readable memory of claim 27 further comprising a plurality of temporary buffer fields.

29. The data structure of said computer-readable memory of claim 27 further comprising a plurality of second bitstream fields.

30. The data structure of said computer-readable memory of claim 27 further comprising a plurality of cut-off value fields.