5/02/2009

Video compression standards: MPEG-4 and H.264

I would like to share some knowledge of the H.264 and MPEG-4 standards with everyone, and to pass on a little experience to younger colleagues who work on set-top boxes (STB). Below I explain the video compression standards used in STBs.

The MPEG-4 standard supports seven new functionalities, which can be roughly divided into three categories: content-based interactivity, high compression, and universal access. They are introduced as follows.

1. Content-based interactivity

(1) Content-based manipulation and bitstream editing. Content-based manipulation and bitstream editing can be carried out without re-encoding the content. For example, a user can select a specific object in the image or bitstream (such as a person or a building) and then change some of its characteristics.

(2) Hybrid coding of natural and synthetic data. MPEG-4 provides an efficient way to combine natural video images with synthetic data (text, graphics) while supporting interactive operation.

(3) Improved temporal random access. MPEG-4 provides an efficient means of random access: within a limited time interval, frames or arbitrarily shaped objects can be accessed at random from an audio-visual sequence. An example is a "fast-forward" search for a target object in an audio-visual sequence.

2. High compression

(1) Improved coding efficiency. At comparable bit rates, MPEG-4 provides better subjective visual quality than existing or emerging standards. This feature is expected to find application in the rapidly growing mobile communication networks, but it is worth noting that improving coding efficiency is not the only main objective of MPEG-4.
(2) Coding of multiple concurrent data streams. MPEG-4 provides efficient coding of multiple views of a scene, together with multi-channel audio and effective audio-visual synchronization. In stereoscopic (3D) video applications, MPEG-4 exploits the information redundancy that arises from observing the same scene from several viewpoints; given enough viewpoints, this feature can describe a three-dimensional natural scene effectively.

3. Universal access

(1) Robustness in error-prone environments. "Universal" means that a variety of wired and wireless networks and all kinds of storage media may be used. MPEG-4 improves error robustness, particularly for low-bit-rate applications in severely error-prone environments (such as mobile communication links). Note that MPEG-4 is the first standard to take channel characteristics into account in its audio and video representation. The aim is not to replace the error control techniques provided by the communications network, but to provide resilience against residual errors, for example through selective forward error correction, error containment, or error concealment.

(2) Content-based scalability. Content-based scalability means assigning a priority to each object in the image; the more important the object, the higher the spatial and temporal resolution with which it is represented. Content-based scalability is at the core of MPEG-4, because once the objects contained in the image and their priority levels have been identified, the other content-based features are more easily achieved. For very low bit rate applications, scalability is a key factor, because it allows the stream to adapt to the available resources.
For example, this feature lets the user request that the highest-priority object be displayed at an acceptable quality, the second-priority object at a lower quality, and the remaining content (objects) not at all. This approach makes the most effective use of limited resources.

The H.264 standard in detail

The JVT (Joint Video Team) was established in December 2001 in Pattaya, Thailand. It is made up of video coding experts from ITU-T and from the International Organization for Standardization (ISO). The goal of the JVT is to develop a new video coding standard that achieves a high compression ratio, high image quality, and good network adaptability. The work of the JVT has been accepted by ITU-T, where the new video coding standard is known as H.264; it has also been accepted by ISO, where it is known as AVC (Advanced Video Coding) and forms Part 10 of MPEG-4.

H.264 defines three profiles: the Baseline profile (the simple version, with a wide range of applications); the Main profile (which adopts a number of technical measures to improve image quality and increase the compression ratio, and can be used for SDTV, HDTV, DVD, and so on); and the Extended profile (which can be used for video streaming over a variety of networks).

H.264 not only saves about 50 percent of the bit rate compared with H.263 and MPEG-4, but also has better network support. Its packet-oriented coding mechanism is well suited to packet networks and supports streaming video transmission. H.264 has strong resilience to bit errors and can cope with video transmission over channels with high packet loss and severe interference. H.264 supports layered coding and transmission under differing network resources, and thus obtains smooth image quality. H.264 adapts to video transmission over different networks and has good network friendliness.
First: the H.264 video compression system

The H.264 standard consists of two parts: the video coding layer (VCL) and the network abstraction layer (NAL). The VCL comprises the VCL encoder and the VCL decoder; its main function is the compression encoding and decoding of video data, and it includes the motion compensation, transform coding, and entropy coding units. The NAL provides the VCL with a uniform, network-independent interface. It is responsible for packetizing the video data for transmission over the network, and it uses a uniform data format comprising a single byte of header information, multiple bytes of video data, framing, logical channel signaling, timing information, and end-of-sequence signals. The header contains a storage flag and a type flag. The storage flag indicates whether the current data belongs to a reference frame. The type flag indicates the type of the image data. The VCL can adjust its encoding parameters according to the current state of the transmission network.

Second: characteristics of H.264

Like H.261 and H.263, H.264 combines DCT transform coding with DPCM differential coding, that is, it uses a hybrid coding structure. At the same time, H.264 introduces new coding methods within the hybrid coding framework to improve coding efficiency and to move closer to practical applications. H.264 avoids complicated options and aims for a concise, "back to basics" design; it has better compression performance than H.263++ and adapts to a wide range of channels. H.264 targets a wide range of applications, meeting the needs of video applications at different bit rates and on various occasions, and it handles errors and packet loss well.
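Returning to the single-byte NAL header described in the first part above, here is a minimal Python sketch of how that byte can be split into its fields. The field names forbidden_zero_bit, nal_ref_idc, and nal_unit_type follow the H.264 specification rather than this article, which calls the last two the storage flag and the type flag. The names NalHeader and parse_nal_header and the sample bytes are made up for illustration.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # 1 bit, expected to be 0 in a valid stream
    nal_ref_idc: int         # 2 bits, 0 means the data is not used for reference
    nal_unit_type: int       # 5 bits, identifies the kind of payload


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the one-byte NAL unit header into its three fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("the NAL header is a single byte")
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,  # top bit
        nal_ref_idc=(first_byte >> 5) & 0x03,         # next two bits
        nal_unit_type=first_byte & 0x1F,              # low five bits
    )


if __name__ == "__main__":
    # Illustrative header bytes, not taken from the article.
    for byte in (0x67, 0x41, 0x01):
        print(hex(byte), parse_nal_header(byte))
```

A nal_ref_idc of 0 corresponds to the case the article describes as data that does not belong to a reference frame, so a receiver under pressure could drop such units first.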
The basic H.264 system was intended to be usable without royalty payments and is open in nature. It adapts well to IP and wireless networks, which is of great significance for today's Internet transmission of multimedia information and for transmission over mobile broadband networks. Although the basic coding structure of H.264 is similar to that of H.261 and H.263, it has been improved in many areas, which are listed below.

1. More varied and better motion estimation

High-precision estimation. H.263 uses half-pixel motion estimation; H.264 goes further and uses 1/4-pixel or 1/8-pixel motion estimation. That is, the displacement given by a motion vector may use 1/4 or 1/8 of a pixel as its basic unit. Clearly, the higher the precision of the motion vector, the smaller the inter-frame residual, the lower the transmission bit rate, and hence the higher the compression ratio.

H.264 uses a 6-tap FIR filter to interpolate the values at 1/2-pixel positions. Once the 1/2-pixel values have been obtained, the 1/4-pixel values can be obtained by linear interpolation. For 4:2:0 video (sometimes written 4:1:1), 1/4-pixel precision for the luma signal corresponds to 1/8-pixel motion vectors for the chroma components, so the chroma signal requires 1/8-pixel interpolation.

In theory, each doubling of motion compensation precision (for example, from integer-pixel to 1/2-pixel precision) yields a coding gain of 0.5 bit/sample, but experiments show that once motion vector precision goes beyond 1/8 pixel there is essentially no noticeable gain. H.264 therefore uses only the 1/4-pixel motion vector mode for luma, rather than 1/8-pixel precision.
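The sub-pixel interpolation just described can be sketched in a few lines of Python. This is a simplified one-dimensional illustration, not a full implementation: the tap values (1, -5, 20, 20, -5, 1), the rounding, and the clipping to 8 bits follow the H.264 luma interpolation rules as I understand them, while the function names and the sample row are invented for the example.

```python
from typing import Sequence

# 6-tap FIR filter used for half-pixel luma positions; the taps sum to 32.
TAPS = (1, -5, 20, 20, -5, 1)


def clip_pixel(value: int) -> int:
    """Clamp a value to the 8-bit sample range 0..255."""
    return max(0, min(255, value))


def half_pel(samples: Sequence[int]) -> int:
    """Half-pixel value midway between samples[2] and samples[3]."""
    if len(samples) != 6:
        raise ValueError("the 6-tap filter needs exactly six integer samples")
    acc = sum(t * s for t, s in zip(TAPS, samples))
    return clip_pixel((acc + 16) >> 5)  # divide by 32 with rounding


def quarter_pel(a: int, b: int) -> int:
    """Quarter-pixel value: rounded average of two neighbouring samples."""
    return (a + b + 1) >> 1


if __name__ == "__main__":
    row = [10, 20, 30, 50, 60, 70]  # made-up luma samples on one row
    h = half_pel(row)               # position halfway between 30 and 50
    q = quarter_pel(row[2], h)      # position a quarter of the way from 30
    print(h, q)                     # prints: 40 35
```

In a real codec the same filter is applied both horizontally and vertically, and the quarter-pixel positions average the nearest integer and half-pixel samples; the sketch only shows the horizontal case.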
Multi-mode macroblock partitioning. In the H.264 prediction modes, a macroblock (MB) can be partitioned into blocks of seven different sizes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4). This flexible, fine-grained partitioning matches the shapes of the actual moving objects in the image much better, so each macroblock may contain 1, 2, 4, 8, or 16 motion vectors.

Multiple reference frames. In H.264, motion estimation can use several reference frames: the encoder's buffer holds more than one previously encoded frame, the encoder chooses the one that gives the best coding result as the reference, and it signals which frame was used for prediction. This can give a better prediction than using only the most recently encoded frame.

2. Small 4x4 integer transform

Earlier video compression coding typically used an 8x8 transform block. H.264 uses a small 4x4 block; because the transform block is smaller, moving objects are delineated more precisely. As a result, the transform requires less computation, and the errors where blocks meet at the edges of moving objects are greatly reduced. When the image contains large smooth areas, to avoid gray-level differences between blocks caused by the small transform size, H.264 applies a second 4x4 transform to the DC coefficients of the 16 4x4 luma blocks of an intra macroblock (one DC coefficient per block, 16 in total), and a 2x2 transform to the DC coefficients of the four 4x4 chroma blocks (one per block, four in total).

In H.264, not only is the transform block smaller, but the transform is also an integer operation rather than a real-number operation. The transform and inverse transform therefore have the same precision in the encoder and in the decoder, and there is no "inverse transform mismatch".

3. More accurate intra prediction

In H.264, each pixel of a 4x4 block can be intra-predicted from a differently weighted combination of the 17 nearest previously encoded pixels.

4. Unified VLC

H.264 has two entropy coding methods. The first is unified VLC (UVLC: Universal VLC). UVLC uses a single code table for encoding, and the decoder can easily identify the codeword prefix, so UVLC can resynchronize quickly after a bit error. The second is context-adaptive binary arithmetic coding (CABAC: Context Adaptive Binary Arithmetic Coding). Its coding performance is slightly better than that of UVLC, but its complexity is higher.

Third: performance advantages

The coding performance of H.264 was compared with that of MPEG-4 and H.263++ under the following six test conditions: 32 kbit/s, 10 f/s, QCIF; 64 kbit/s, 15 f/s, QCIF; 128 kbit/s, 15 f/s, CIF; 256 kbit/s, 15 f/s, QCIF; 512 kbit/s, 30 f/s, CIF; 1024 kbit/s, 30 f/s, CIF. The test results show that H.264 has better PSNR performance than MPEG-4 and H.263++: the PSNR of H.264 is on average 2 dB higher than that of MPEG-4 and on average 3 dB higher than that of H.263++.

Fourth: a new fast motion estimation algorithm

The new fast motion estimation algorithm UMHexagonS (patented in China) saves more than 90% of the computation compared with the fast full search algorithm originally used in H.264. Its full name is "Unsymmetrical-Cross Multi-Hexagon Search", and it is an integer-pixel motion estimation algorithm. Because it keeps computational complexity very low while maintaining good rate-distortion performance when coding high-bit-rate sequences with large motion, it has been formally adopted by the H.264 standard.

H.264 (MPEG-4 Part 10), developed jointly by ITU and ISO, may become a unified standard for broadcasting, communications, and storage media (CD, DVD), and it is the most likely candidate to become the standard for broadband interactive new media. China's source coding standard has not yet been formulated; while the development of H.264 is being followed closely, work on China's own source coding standard is being stepped up.
The H.264 standard lifts motion video compression technology to a higher stage; providing high-quality video transmission at relatively low bandwidth is the highlight of H.264 applications. The spread of H.264 places higher demands on systems such as video terminals, gatekeepers, gateways, and MCUs, and will greatly promote the continuous improvement of video conferencing software and hardware in all aspects.
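As a closing illustration of the 4x4 integer transform discussed earlier, the following Python sketch applies a forward core transform to one 4x4 block using only integer arithmetic. The matrix CF is the forward core transform matrix as I understand it from the H.264 specification; it does not appear in this article. The scaling step that the standard folds into quantization is left out, and the helper names and the sample block are hypothetical.

```python
# Forward core transform matrix (integer entries only).
CF = (
    (1, 1, 1, 1),
    (2, 1, -1, -2),
    (1, -1, -1, 1),
    (1, -2, 2, -1),
)


def matmul4(a, b):
    """Multiply two 4x4 integer matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose4(m):
    """Transpose a 4x4 matrix."""
    return [[m[j][i] for j in range(4)] for i in range(4)]


def forward_core_transform(block):
    """Compute CF * block * CF^T with integer arithmetic only."""
    if len(block) != 4 or any(len(row) != 4 for row in block):
        raise ValueError("expected a 4x4 block")
    return matmul4(matmul4(CF, block), transpose4(CF))


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]  # a perfectly smooth block
    coeffs = forward_core_transform(flat)
    for row in coeffs:
        print(row)
    # Only the DC coefficient (top-left) is non-zero for a flat block.
```

Because every step is an integer addition or multiplication, an encoder and a decoder running this arithmetic obtain bit-identical results, which is the point the article makes about avoiding inverse transform mismatch.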
