WO2002001848A2 - System and method for reducing the computational complexity of mpeg video decoding - Google Patents

System and method for reducing the computational complexity of mpeg video decoding Download PDF

Info

Publication number
WO2002001848A2
WO2002001848A2 (PCT/US2001/020661)
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
video
block
frame
decoded
Prior art date
Application number
PCT/US2001/020661
Other languages
French (fr)
Other versions
WO2002001848A3 (en)
Inventor
Shahab Layeghi
Andy Hung
Original Assignee
Intervideo, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intervideo, Inc.
Priority to AU2001273059A
Publication of WO2002001848A2
Publication of WO2002001848A3

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/523: Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/577: Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157: Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159: Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/172: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a picture, frame or field
    • H04N19/186: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals

Abstract

A system and corresponding method for reducing the computational complexity in decoding an MPEG encoded signal is disclosed. The disclosed system is a digital video disk player capable of decoding and constructing a previously encoded MPEG video signal based on a subset of information contained in the encoded input signal (Fig. 4). The system of the present invention also includes means for determining whether the particular block to be decoded is of a predetermined type and decoding such block accordingly.

Description

SYSTEM AND METHOD FOR REDUCING THE COMPUTATIONAL COMPLEXITY OF MPEG VIDEO DECODING
NOTICE OF COPYRIGHT A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever. FIELD OF THE INVENTION
The present invention generally relates to MPEG video decoders and, more particularly, to a system and corresponding algorithm for reducing the complexity and time associated with decoding MPEG video signals.
BACKGROUND OF THE INVENTION In the Moving Pictures Experts Group (MPEG) video standard, a sequence is comprised of a series of video frames. Each video frame in the sequence is subdivided into a number of rectangular information blocks, each containing a pixel portion of the image. These information blocks are referred to as macroblocks. The pixel portions of the image are represented by a series of bits of data. In the MPEG video standard, the data bits are encoded in a particular fashion. The encoded bitstream contains compressed information based on encoding each macroblock of an image.
An MPEG decoder is used to decode each of the compressed macroblocks in the MPEG encoded bitstream based on previously transmitted and decoded video frames, called reference frames. In order to accommodate motion in the video frames, each decoded macroblock refers to one or more image regions in a previously transmitted reference frame to use as a prediction source for decoding the current frame. The displacement between the macroblock and the image regions of interest is called a motion vector. For MPEG, the displacement motion vectors are computed with half pixel resolution, referred to as half-pel prediction. Thus, the motion vectors represent displacement on a half-pel grid.
Computing the motion prediction is one of the most time consuming steps in decoding MPEG video signals. As discussed above, motion vectors in MPEG are computed with half pixel (half-pel) resolution. In situations where the horizontal or vertical components of the motion vector are an even number of half-pels, the displacement in the corresponding decoded block is the entire pixel displacement value as provided in the reference frame. Consequently, prediction of the corresponding macroblock in the decoded frame does not have to be computed. In situations where the horizontal or vertical components of the motion vector are an odd number of half-pels, the prediction macroblock is reconstructed by taking the average of all the surrounding pixels. There are four cases of motion prediction associated with any video image, depending on whether the x (horizontal) and y (vertical) components of the motion vector are even or odd: (1) x is even and y is even, this represents the situation where the motion prediction does not have to be computed as both the horizontal and vertical components have even values; (2) x is even and y is odd, this represents half-pel vertical prediction as illustrated in Fig. 1(a); (3) x is odd and y is even, this represents half-pel horizontal prediction as illustrated in Fig. 1(b); and (4) x is odd and y is odd, this represents half-pel horizontal and vertical prediction as illustrated in Fig. 1(c).
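The four prediction cases can be summarized by a short routine. The following is only an illustrative sketch of the conventional half-pel interpolation, not text from this disclosure: the function and variable names are assumptions, an 8-bit reference plane addressed with a row stride is assumed, and MPEG-style rounded averaging is used. It shows how one, two or four reference pixels are averaged per output pixel depending on the parity of the motion vector components.

    #include <stdint.h>

    /* Conventional half-pel prediction of one 16x16 block (illustrative sketch).
     * mv_x and mv_y are motion vector components in half-pel units; ref points to
     * the reference frame plane, stride is its width in bytes, pred receives the
     * 16x16 prediction block. */
    void predict_block_halfpel(const uint8_t *ref, int stride,
                               int mv_x, int mv_y, uint8_t *pred)
    {
        int full_x = mv_x >> 1, full_y = mv_y >> 1;   /* integer-pel displacement    */
        int half_x = mv_x & 1,  half_y = mv_y & 1;    /* 1 when the component is odd */
        const uint8_t *src = ref + full_y * stride + full_x;

        for (int y = 0; y < 16; y++) {
            for (int x = 0; x < 16; x++) {
                const uint8_t *p = src + y * stride + x;
                int v;
                if (!half_x && !half_y)       /* case 1: x even, y even: plain copy         */
                    v = p[0];
                else if (!half_x && half_y)   /* case 2: x even, y odd: half-pel vertical   */
                    v = (p[0] + p[stride] + 1) >> 1;
                else if (half_x && !half_y)   /* case 3: x odd, y even: half-pel horizontal */
                    v = (p[0] + p[1] + 1) >> 1;
                else                          /* case 4: x odd, y odd: average four pixels  */
                    v = (p[0] + p[1] + p[stride] + p[stride + 1] + 2) >> 2;
                pred[y * 16 + x] = (uint8_t)v;
            }
        }
    }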
SUMMARY OF THE INVENTION The present invention is directed to a digital video system which incorporates an algorithm that adjusts the motion vector used to decode a corresponding macroblock of a digital video frame based on a subset of information contained in an encoded input video signal. The digital video system comprises means for providing an input signal, said input signal including a plurality of encoded macroblocks; means for constructing a plurality of decoded macroblocks in response to a subset of data present in said plurality of encoded macroblocks; and means for providing an output signal in response to said plurality of decoded macroblocks. The video system further includes a means for displaying the constructed video signal.
In an exemplary embodiment of the present invention, the constructing means is a decoder present within a larger digital video disk player which constructs a decoded video frame from an encoded reference frame by retrieving a motion vector from the reference frame; determining whether the block of the reference frame is of a particular type; constructing a modified motion vector from a subset of data contained within the reference frame; and applying the modified motion vector to a corresponding block within the reference frame. The algorithm performed by the decoder is most advantageously used in conjunction with motion vectors containing an odd number of half-pels, wherein the motion vector provided by the decoder is comprised of horizontal components and vertical components having values equal to the nearest even half-pel value of the corresponding reference value.
An advantage of the present invention is that it improves video decoding efficiency by reducing the number of calculations that need to be performed on a given frame.
Another advantage of the present invention is that it is straightforward to implement.
A feature of the present invention is that it has minimal effect on video image quality.
BRIEF DESCRIPTION OF THE DRAWINGS The aforementioned and related advantages and features of the present invention will become apparent upon review of the following detailed description of the invention, taken in conjunction with the following drawings, where like numerals represent like elements, in which:
Figures 1(a) - (c) are schematic representations of the calculations made to determine the motion vector components of a half-pel using conventional decoding techniques;
Figure 2 is a schematic representation of the structure of a video frame; Figures 3(a) - (c) are schematic representations of the types of macroblocks that comprise a video frame;
Figure 4 is a block diagram of a digital video disk player incorporating a decoder module that performs the improved computation algorithm according to the present invention;
Figure 5 is a flow chart illustrating the operating steps performed by the decoder module in decoding a macroblock according to the improved computation algorithm of the present invention; and Figures 6(a) - (b) are schematic representations of a macroblock being constructed using the motion vector calculated according to the improved computation algorithm of the present invention. DETAILED DESCRIPTION OF THE INVENTION
The system and corresponding method of decoding previously encoded MPEG video signals will now be described with reference to Figures 2 - 6. In the MPEG video standard, a digital image is comprised of a series of frames. The types of frames that make up the video image are illustrated in Figure 2. As shown, a video sequence is comprised of an I-frame 20, a number of B-frames 22 and a P-frame 24. I-frames, also referred to as intra-frames, are video frames that are encoded as stand-alone still images. I-frames allow random access points within the video stream. In application, I-frames are used where scene cuts occur. B-frames, or bi-directional frames, provide the most compression and decrease noise by averaging the pixel information contained in the frames that are used to decode (or predict) the contents of the B-frame. P-frames, or predicted frames, are frames that are encoded relative to the nearest I-frame or P-frame, resulting in forward prediction processing. The I-frame 20, B-frame 22 and P-frame 24 can be encoded using any one of several types (e.g., Huffman) of encoding schemes.
In application, the data (pixel representation) of the B-frame to be decoded is predicted by a preceding I-frame or P-frame and a subsequent I-frame or P-frame. For example, as shown in Figure 2, the pixel data of B-frame 22 is decoded by the data contained within the preceding I-frame 20 and a subsequent P-frame 24. The same decoding scheme may be applied to decode the contents of the second B-frame
23 in the frame series. The content of B-frame 23 is decoded by using the I-frame 20 and the subsequent P-frame 24. In an alternate embodiment of the present invention, the I-frame 20, alone, can be used to decode P-frame 24. Moreover, the P-frame 24 also can be used to decode either B-frame 23, B-frame 25 or a subsequent P-frame (not shown). In accordance with the present invention, the B-frame is not used to decode (predict the contents of) any other video frame. In a preferred embodiment of the present invention, the P-frame 24 is a fixed reference frame.
As shown in greater detail in Figure 3, each video frame is comprised of a series of blocks, referred to as macroblocks. Each macroblock is comprised of a luminance component and two chrominance components, referred to as U and V, respectively. Macroblocks contain a series of pixels (represented as dots in Figure 3(c)) which represent a larger image. In application, each pixel is comprised of N bits of information, where N is an integer. In a preferred embodiment of the present invention, N equals 8. Thus, each pixel is comprised of 8 bits of data. In a preferred embodiment, each macroblock is 16 pixels x 16 pixels in size. The luminance component of each macroblock contains 16 x 16 x 8 bits of information. The chrominance components of the macroblock contain two corresponding 8 x 8 x 8 bit blocks of information corresponding to the U and V portions, respectively. The decoding of the pixel information contained in a current macroblock, for example, current macroblock 32, is a combination of the changes present in a corresponding reference macroblock 32', plus the displacement of the reference macroblock 32' from a standard reference location. The displacement of the reference macroblock 32' from the standard reference location is referred to as the motion vector (Vm). Vm is comprised of two components: (1) the distance along the horizontal direction (x-axis) from the standard reference location (x-component or Vmx); and (2) the distance along the vertical direction (y-axis) from the standard reference location (y-component or Vmy). In application, the reference macroblock 32', as shown in Figure 3(a), consists of information from a plurality of macroblocks that are determined during frame encoding and the corresponding Vm which provides the displacement of the reference macroblock 32' from the corner of the reference frame. After encoding, Vmx and Vmy may have either an integer value or a non-integer value measured on a pel (pixel) unit scale. When motion vectors are measured in half-pel units, an integer pel value results in an even half-pel component and a fractional pel value results in an odd half-pel component. Half-pel reference motion vectors require additional calculations before a corresponding Vm can be obtained. In those situations where the pixel displacement of the reference macroblock falls midway between pixel locations in the base (reference) frame, the prediction of that pixel requires averaging all four surrounding pixel locations as illustrated in Figure 1(c).
This requires that four additional pixel average calculations be completed per pixel. As B-frames may have both forward and backward prediction, twice the amount of additional calculations may be required. This can significantly increase video frame decoding time.
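For orientation, the macroblock layout described above can be expressed as a small data structure. The field names below are illustrative assumptions only and do not appear in this disclosure; the sizes follow the 16 x 16 x 8 bit luminance block, the two 8 x 8 x 8 bit chrominance blocks, and the two motion vector components Vmx and Vmy measured in half-pel units.

    #include <stdint.h>

    /* Illustrative macroblock layout; names are assumptions, not the patent's. */
    typedef struct {
        uint8_t luma[16][16];    /* Y:  16 x 16 pixels, 8 bits each (N = 8)   */
        uint8_t chroma_u[8][8];  /* U:   8 x  8 pixels, 8 bits each           */
        uint8_t chroma_v[8][8];  /* V:   8 x  8 pixels, 8 bits each           */
        int     vm_x;            /* Vmx: horizontal displacement in half-pels */
        int     vm_y;            /* Vmy: vertical displacement in half-pels   */
    } Macroblock;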
The present invention is directed to a motion vector computation method which reduces the number of calculations that have to be performed when computing Vm when the reference displacement vector is a half-pel. The computed Vm is then used to generate a corresponding decoded macroblock. The computation method will now be described with reference to Figures 4-6.
Figure 4 is a block diagram of a digital versatile disk (DVD) player 40 that performs the improved computation algorithm according to the present invention. The DVD player 40 includes a navigation unit 42 and a corresponding video unit 44 which provides an output video signal to a display device 48 on line 47. In a preferred embodiment of the present invention, the display device 48 is a progressive display device, such as a computer monitor. In an alternate embodiment of the present invention, the display device 48 is an interlaced display device. The video unit 44 includes a video decoder module 45 and a video display module 46. The decoder module 45 decodes (constructs) the input video signal provided by the navigation unit 42 according to the improved computation algorithm of the present invention.
The navigation unit 42 accepts a digital media element such as, for example, a digital versatile disk 11 having digital information, i.e., audio, video and complementary information stored thereon. The navigation unit 42 is capable of differentiating between the different types of information stored on the disk 11 and providing the encoded video information on a first data line (VIDEO). The audio and other complementary information stored on the disk 11 are provided on an AUDIO line and a COMP line, respectively.
The encoded video information present on the VIDEO line is transferred to the video unit 44 through the video decoder module 45. The video decoder module 45 receives the encoded video bit stream from the navigation unit 42 and reconstructs the I-frames and P-frames of the reference frame 32' (Figure 3). Using the I-frame 20 and the P-frame 24, and the reference motion vector (Vm') from the reference frame 32', the individual macroblocks of the B-frame 22 are decoded by the video decoder module 45 based on the following representative algorithm:

    For each macroblock in the image
    {
        Find motion vector components mv_x, mv_y;
        if (picture_coding_type == B_TYPE)
        {
            if (luminance_block)
            {
                mv_x = mv_x & ~1;    /* force horizontal component to an even half-pel */
            }
            if (chrominance_block)
            {
                mv_x = mv_x & ~1;    /* force both components to even half-pels */
                mv_y = mv_y & ~1;
            }
        }
        Do motion compensation for the block;
    }

As illustrated by the pseudo-code provided above, the video decoder module 45 generates a modified motion vector (Vm) for each macroblock based on a subset of the displacement data (the horizontal (x) and vertical (y) components of the motion vector) present in the reference motion vector Vm'. After the motion vector for the current macroblock being decoded is generated, it is used to obtain the decoded macroblock and provide a modified (decoded) video signal. The modified video signal is then transferred to the video display module 46 on line 43.
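A self-contained rendering of the pseudo-code above might look as follows. This is a sketch under assumed names: the enum values and function name are illustrative and are not part of this disclosure. The bit-mask & ~1 clears the least-significant half-pel bit, i.e. it rounds an odd half-pel component down to the adjacent even value.

    typedef enum { I_TYPE, P_TYPE, B_TYPE } PictureType;
    typedef enum { LUMINANCE_BLOCK, CHROMINANCE_BLOCK } BlockType;

    /* Adjust the half-pel motion vector components of one block as described
     * above (illustrative sketch). Only blocks in B-frames are modified, since
     * B-frames are never used as references when decoding later frames. */
    void adjust_motion_vector(PictureType picture_coding_type, BlockType block_type,
                              int *mv_x, int *mv_y)
    {
        if (picture_coding_type != B_TYPE)
            return;                /* I- and P-frame blocks keep full half-pel accuracy */

        *mv_x &= ~1;               /* horizontal component forced to an even half-pel   */
        if (block_type == CHROMINANCE_BLOCK)
            *mv_y &= ~1;           /* chrominance: vertical component also forced even  */
    }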
The video display module 46 includes a detection unit (not shown) and a processing unit (not shown) which are capable of detecting the modified video signal provided by the video decoder module 45 and converting the modified video signal into the output video signal that is transferred to the computer monitor 48 on line 47.
The computation steps performed by the video decoder module 45 to construct the modified video signal from the encoded input video signal will now be described with reference to Figure 5. Figure 5 is a flow chart illustrating the improved computation algorithm according to the present invention. In a first step 60, the horizontal (x-component) and vertical (y-component) components of the reference motion vector are obtained. Next, in step 62, a determination is made as to whether the current macroblock to be decoded is within a B-frame. This is done by detecting whether the picture coding type of the current frame is the bi-directional type. This can be accomplished, for example, by detecting the presence of a flag bit preceding, within, or subsequent to the data bits that comprise the current macroblock. If the current macroblock to be decoded is not in a B-frame, then control is passed to step 68 where conventional motion compensation is performed on the macroblock using the reference motion vector. The computation algorithm employed by the present invention is only used on B-frames. The reason for applying the computation algorithm only to B-frames is that they are not reference frames used for later decoding. The I-frames and P-frames are reference frames and are used to decode B-frames. On the other hand, if the current frame to be decoded is a B-frame, control is passed to step 64 where a determination is made as to whether the current block is the luminance portion of the block. The luminance portion of the block includes data representative of the brightness of the corresponding image. For a luminance block, control is then passed to step 65 where the horizontal displacement (Vmx) of the modified motion vector is approximated to the nearest even number of half-pels relative to the value provided by the reference motion vector. For example, if the horizontal displacement of the reference motion vector has a value of 9, the algorithm of the present invention approximates Vmx to have a value of 8 or 10. The vertical displacement, Vmy, retains its current value as provided by the reference motion vector. Thus, no approximation is performed on the y-component (or vertical displacement) of the reference motion vector. By approximating the value of Vmx, the additional computation steps that are performed in conventional decoding schemes to determine the horizontal displacement of the current motion vector are eliminated. This results in increased decoding speed. After Vmx has been determined, motion compensation is performed on the current macroblock in step 68 where the modified motion vector is applied to the current macroblock to place the current macroblock in the correct position with respect to the frame being decoded.
If the current portion of the block being decoded is not a luminance portion, then the current portion is the chrominance portion of the macroblock, and the decoder module of the present invention approximates the horizontal displacement (x-component) and the vertical displacement (y-component) of the current motion vector to have a value equal to the nearest lower or higher integer of the corresponding values in the reference motion vector in step 66. More specifically, if the horizontal displacement of the reference motion vector has a value of 9, the algorithm of the present invention approximates Vmx to have a value of 8 or 10, the nearest integers. Correspondingly, if the vertical displacement of the reference motion vector has a value of 7, the algorithm of the present invention approximates Vmy to have a value of
6 or 8, the nearest integers. Thus, if the current macroblock to be decoded is a color block, the entire motion vector is calculated according to the present invention. In this fashion, decoding time is significantly reduced. In experiments performed by the inventors, it was determined that using the computation scheme of the present invention, decoding time is decreased by 25% as compared to conventional decoding schemes, with no significant degradation in resulting video image quality.
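Continuing the illustrative sketch given earlier (same assumed names), a chrominance block of a B-frame whose reference motion vector is (9, 7) in half-pel units would be adjusted as shown below; the & ~1 mask rounds each odd component down to the adjacent even value, here 8 and 6.

    int mv_x = 9, mv_y = 7;   /* odd half-pel components from the reference motion vector */
    adjust_motion_vector(B_TYPE, CHROMINANCE_BLOCK, &mv_x, &mv_y);
    /* mv_x is now 8 and mv_y is now 6: integer-pel positions, so no averaging is needed */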
After the motion vector of the current macroblock has been calculated in step 66, standard motion compensation is then performed to recover the pixel data from the macroblock in step 68 using the Vm calculated in step 66 as shown in Figure 6. As shown in Figure 6, the currently decoded frame 60' (Fig. 6(b)) contains the same pixel information contained in the reference frame 60 (Fig. 6(a)), shifted by the amount of the motion vector Vm.
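Once both components of Vm are even numbers of half-pels, motion compensation reduces to copying a block displaced by a whole number of pixels, as Figure 6 illustrates. The routine below is an illustrative sketch only; the names and the 8-bit, 16x16 assumptions are not taken from the patent.

    #include <stdint.h>

    /* Copy the 16x16 prediction block displaced by an even half-pel motion vector.
     * Because mv_x and mv_y are even, the displacement is a whole number of pixels
     * and no pixel averaging is required (illustrative sketch). */
    void motion_compensate_even(const uint8_t *ref, int stride,
                                int mv_x, int mv_y, uint8_t *dst)
    {
        const uint8_t *src = ref + (mv_y >> 1) * stride + (mv_x >> 1);
        for (int y = 0; y < 16; y++)
            for (int x = 0; x < 16; x++)
                dst[y * 16 + x] = src[y * stride + x];
    }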
The foregoing detailed description of the invention has been provided for the purposes of illustration and description. Although an exemplary embodiment of the present invention has been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiment disclosed, and that various changes and modifications to the invention are possible in light of the above teaching. Accordingly, the scope of the present invention is to be defined by the claims appended hereto.

Claims

WHAT IS CLAIMED IS:
1. A digital video system, comprising: means for providing an input video signal, said input video signal including a plurality of encoded blocks; means for constructing a plurality of decoded blocks on a block by block basis in response to a subset of data present in said plurality of encoded blocks; and means for providing an output video signal in response to said plurality of decoded blocks, wherein said output video signal is formatted to be displayed as a conventional video image.
2. The video system of Claim 1, wherein said video image is displayed on a progressive display device.
3. The video system of Claim 1, wherein said video image is displayed on an interlaced display device.
4. The video system of Claim 1, wherein said constructing means comprises a decoder capable of generating a decoded motion vector as a function of a subset of information present in said encoded block, said decoder including means for generating a decoded block in response to said motion vector.
5. The video system of Claim 4, wherein said motion vector includes a horizontal component and a vertical component, and said decoder constructs said video image by displacing said previously decoded blocks by an amount corresponding to said motion vector.
6. The video system of Claim 5, wherein said decoder detects whether said encoded block is a luminance block and in response to the detection of a luminance block, adjusting the horizontal half-pel component of said motion vector to the nearest even integer value.
7. The video system of Claim 5, wherein said decoder detects whether said encoded block is a chrominance block and in response to the detection of a chrominance block, adjusting the horizontal and vertical half-pel components of said motion vector to the nearest integer value.
8. The video system of Claim 1, wherein said providing means is a navigation unit for generating said input video signal in response to information read from a digital media element.
9. The video system of Claim 1, wherein said providing means comprises a video display module capable of combining said plurality of decoded blocks into an output video signal.
10. A method of constructing a video frame signal from a reference frame comprising a plurality of coded blocks, comprising the steps of:
(a) retrieving a reference motion vector from one of said plurality of coded blocks;
(b) detecting whether the current block to be decoded is a luminance block;
(c) constructing a modified motion vector based on a subset of information contained in said reference frame and said current frame; and
(d) applying the modified motion vector constructed in step (c) to a corresponding block within said current frame.
11. The method of Claim 10, wherein step (c) comprises the step of:
(c1) adjusting a component value of said motion vector to the nearest even value.
12. The method of Claim 11, wherein the component part of step (c1) is the horizontal half-pel component value of said motion vector.
13. A method of constructing a video frame signal from a reference frame comprising a plurality of coded blocks, comprising the steps of:
(a) retrieving a reference motion vector from one of said plurality of coded blocks; (b) detecting whether the current block to be decoded is a chrominance block;
(c) constructing a modified motion vector by adjusting the component values of said reference frame; and
(d) applying the modified motion vector constructed in step (c) to the corresponding block to be decoded.
14. The method of Claim 13, wherein step (c) comprises the steps of: (c1) adjusting the horizontal component value of said motion vector to the nearest integer; and (c2) adjusting the vertical component value of said motion vector to the nearest integer.
PCT/US2001/020661 2000-06-27 2001-06-27 System and method for reducing the computational complexity of mpeg video decoding WO2002001848A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001273059A AU2001273059A1 (en) 2000-06-27 2001-06-27 System and method for reducing the computational complexity of mpeg video decoding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60509200A 2000-06-27 2000-06-27
US09/605,092 2000-06-27

Publications (2)

Publication Number Publication Date
WO2002001848A2 true WO2002001848A2 (en) 2002-01-03
WO2002001848A3 WO2002001848A3 (en) 2002-04-11

Family

ID=24422222

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/020661 WO2002001848A2 (en) 2000-06-27 2001-06-27 System and method for reducing the computational complexity of mpeg video decoding

Country Status (3)

Country Link
AU (1) AU2001273059A1 (en)
TW (1) TW535441B (en)
WO (1) WO2002001848A2 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5317397A (en) * 1991-05-31 1994-05-31 Kabushiki Kaisha Toshiba Predictive coding using spatial-temporal filtering and plural motion vectors
US5510834A (en) * 1992-04-13 1996-04-23 Dv Sweden Ab Method for adaptive estimation of unwanted global picture instabilities in picture sequences in digital video signals

Also Published As

Publication number Publication date
TW535441B (en) 2003-06-01
WO2002001848A3 (en) 2002-04-11
AU2001273059A1 (en) 2002-01-08

Similar Documents

Publication Publication Date Title
US8358701B2 (en) Switching decode resolution during video decoding
EP1528813B1 (en) Improved video coding using adaptive coding of block parameters for coded/uncoded blocks
US8385427B2 (en) Reduced resolution video decode
US6301304B1 (en) Architecture and method for inverse quantization of discrete cosine transform coefficients in MPEG decoders
US6415055B1 (en) Moving image encoding method and apparatus, and moving image decoding method and apparatus
JP3302939B2 (en) Video signal decompressor for independently compressed even and odd field data
US20060072673A1 (en) Decoding variable coded resolution video with native range/resolution post-processing operation
US5739862A (en) Reverse playback of MPEG video
JP2003304542A (en) Video signal decompression apparatus
JP4875007B2 (en) Moving picture coding apparatus, moving picture coding method, and moving picture decoding apparatus
JPH09224254A (en) Device and method for estimating motion
US20020150159A1 (en) Decoding system and method for proper interpolation for motion compensation
KR100260475B1 (en) Methods and devices for encoding and decoding frame signals and recording medium therefor
US5991445A (en) Image processing apparatus
JPH09200695A (en) Method and device for decoding video data for high-speed reproduction
US7116718B2 (en) Unified memory address generation system and method for fetching and storing MPEG video data
JP3078991B2 (en) Low delay mode image decoding method and apparatus
JP2003333540A (en) Frame rate converting apparatus, video display apparatus using the same, and a television broadcast receiving apparatus
US6556714B2 (en) Signal processing apparatus and method
JP2006246277A (en) Re-encoding apparatus, re-encoding method, and re-encoding program
JP3061125B2 (en) MPEG image reproducing apparatus and MPEG image reproducing method
JPH0795536A (en) Device and method for reversely reproducing moving image
WO2002001848A2 (en) System and method for reducing the computational complexity of mpeg video decoding
KR100636465B1 (en) Data processing device and data processing method
JPH1032826A (en) Animation image processor

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP