bestry's Blogger: 2:Basic knowledge of H.264 video coding

I. Video coding technology

The development of video coding technology has essentially followed the introduction of two series of international standards: the MPEG-x series from ISO/IEC and the H.26x series from ITU-T. From H.261 through H.262/H.263 and MPEG-1/2/4, all of these video coding recommendations pursue one common goal: the best possible image quality at the lowest possible bit rate (or storage capacity). Moreover, as market demand for image transmission grows, the problem of adapting to the transmission characteristics of different channels has become increasingly apparent. ISO/IEC and ITU-T, the two international standardization organizations, therefore jointly developed the new video standard H.264 to address these problems.

H.261 was the earliest video coding recommendation. Its purpose was to standardize video coding for videoconferencing and videotelephony over ISDN. Its algorithm is a hybrid coding method that combines inter-frame prediction, which reduces temporal redundancy, with DCT transform coding, which reduces spatial redundancy. To match ISDN channels, its output rate is p × 64 kbit/s. When p is small, only low-resolution images can be transmitted, which suits face-to-face videotelephony; when p is larger (e.g., p > 6), videoconferencing images of better definition can be transmitted.

H.263 was proposed as an image compression standard for low bit rates. Technically it is an improvement and extension of H.261 that supports applications below 64 kbit/s. In practice, however, H.263 and the later H.263+ and H.263++ have developed into recommendations that support applications at all rates, as can be seen from the wide range of image formats they support: Sub-QCIF, QCIF, CIF, 4CIF and even 16CIF.

The MPEG-1 standard targets a rate of about 1.2 Mbit/s and can provide 30 frames of CIF (352 × 288) quality images. It was designed for video storage and playback on CD-ROM.
The basic algorithm of the video coding part of MPEG-1 is similar to that of H.261/H.263: it uses motion-compensated inter-frame prediction, the two-dimensional DCT, and VLC run-length coding. In addition, it introduces the concepts of intra frames (I), predicted frames (P), bidirectionally predicted frames (B) and DC frames (D), which further improve coding efficiency.

Building on MPEG-1, the MPEG-2 standard makes improvements in image resolution and in compatibility with digital television. For example, its motion vectors have half-pixel accuracy; coding operations such as motion estimation and the DCT distinguish between "frame" and "field"; and it introduces scalable coding techniques such as spatial scalability, temporal scalability and signal-to-noise-ratio scalability.

The more recent MPEG-4 standard introduces coding based on audio-visual objects (AVO: Audio-Visual Object), which greatly improves the interactivity and coding efficiency of video communication. MPEG-4 also adopts some new techniques, such as shape coding, adaptive DCT, and coding of arbitrarily shaped video objects. The basic MPEG-4 video encoder, however, is still a hybrid encoder of the same kind as H.263.

In short, H.261 is the classic video coding recommendation. H.263 is its successor and will gradually replace it in practice; it is used mainly in communications, but its large number of options often leaves users at a loss. The MPEG family of standards evolved from applications for storage media toward applications suited to transmission media, and the basic framework of its core video coding is the same as that of H.261. The much-publicized "object-based coding" part of MPEG-4 still faces technical obstacles and is difficult to apply widely.
The new video coding recommendation H.264 was therefore developed on this basis. It overcomes the weaknesses of both families, introduces new coding methods within the hybrid coding framework, improves coding efficiency, and is aimed at practical applications. Because it was developed jointly by the two international standardization organizations, its application prospects should be self-evident.

II. Introduction to H.264

H.264 is a new digital video coding standard developed by the Joint Video Team (JVT) of ITU-T's VCEG (Video Coding Experts Group) and ISO/IEC's MPEG (Moving Picture Experts Group). It is both ITU-T H.264 and ISO/IEC MPEG-4 Part 10. The call for drafts began in January 1998, the first draft was completed in September 1999, the test model TML-8 was produced in May 2001, and the FCD version of H.264 was adopted at the 5th JVT meeting in June 2002. The standard was formally released in March 2003.

Like earlier standards, H.264 uses a hybrid coding scheme of DPCM plus transform coding. However, it adopts a simple, "back to basics" design without a multitude of options, and achieves much better compression performance than H.263++. It is better able to adapt to a variety of channels, using a "network-friendly" structure and syntax that help in handling errors and packet loss. Its application targets are broad, covering the needs of different rates, different resolutions and different transmission (or storage) scenarios. Its basic system is intended to be open and usable without royalty payments.

Technically, the H.264 standard has a number of highlights, such as unified VLC symbol coding, high-precision multi-mode displacement estimation, an integer transform based on 4 × 4 blocks, and a layered coding syntax. These measures give the H.264 algorithm very high coding efficiency: at the same reconstructed image quality, it can save around 50% of the bit rate compared with H.263.
The H.264 bitstream structure adapts well to networks and adds error recovery capability, so it suits IP and wireless network applications.

III. Technical highlights of H.264

1. Layered design

Conceptually, the H.264 algorithm can be divided into two layers. The video coding layer (VCL: Video Coding Layer) is responsible for an efficient representation of the video content. The network abstraction layer (NAL: Network Abstraction Layer) is responsible for packaging and transmitting the data in the manner the network requires. A packet-based interface is defined between the VCL and the NAL, and packaging and the corresponding signaling belong to the NAL. In this way, the tasks of high coding efficiency and network friendliness are handled by the VCL and the NAL respectively.

The VCL includes block-based motion-compensated hybrid coding and some new features. As with earlier video coding standards, H.264 does not include pre-processing and post-processing functions in the draft, which increases the flexibility of the standard.

The NAL is responsible for encapsulating data in the format of the underlying network, including framing, logical channel signaling, use of timing information, and end-of-sequence signals. For example, the NAL supports video transmission formats for circuit-switched channels, and supports the RTP/UDP/IP format for video transmission over the Internet. A NAL unit includes its own header information, segment structure information and the actual payload, that is, the VCL data from the layer above. (If data partitioning is used, the data may consist of several parts.)

2. High-precision, multi-mode motion estimation

H.264 supports motion vectors with 1/4 or 1/8 pixel accuracy. At 1/4 pixel accuracy a 6-tap filter can be used to reduce high-frequency noise; for motion vectors with 1/8 pixel accuracy a more complex 8-tap filter can be used. During motion estimation, the encoder can also select "enhanced" interpolation filters to improve the prediction.
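To make the sub-pixel interpolation concrete, here is a minimal one-dimensional sketch. The article does not list the filter taps; the values (1, -5, 20, 20, -5, 1)/32 used below are the luma half-sample taps of the published standard, and the function names and the 8-bit sample range are assumptions made for illustration. Quarter-pixel values are formed here by a rounded average of an integer sample and the neighbouring half-pixel value.

```python
# Assumed 6-tap half-sample filter; the taps sum to 32.
TAPS = (1, -5, 20, 20, -5, 1)


def clip8(value):
    """Clamp a value to the 8-bit sample range 0..255."""
    return max(0, min(255, value))


def half_pel(samples, i):
    """Interpolate the half-pixel position between samples[i] and samples[i + 1].

    Requires two samples to the left of i and three to the right.
    """
    if i < 2 or i + 3 >= len(samples):
        raise IndexError("need samples[i - 2] .. samples[i + 3]")
    window = samples[i - 2 : i + 4]  # six integer-position neighbours
    acc = sum(t * s for t, s in zip(TAPS, window))
    return clip8((acc + 16) >> 5)  # add half of 32 for rounding, then divide by 32


def quarter_pel(samples, i):
    """Quarter-pixel value between samples[i] and the half-pixel position to its right."""
    return (samples[i] + half_pel(samples, i) + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
print(half_pel(ramp, 3), quarter_pel(ramp, 3))
```

On a linear ramp the filter lands midway between the two neighbouring samples, and on a sharp edge the negative taps limit the smoothing that a plain two-tap average would introduce.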
In H.264 motion prediction, a macroblock (MB) can be divided into different sub-blocks as shown in Figure 2, giving seven modes with different block sizes. This flexible and fine-grained multi-mode partitioning matches the shapes of actual moving objects in the image much better and greatly improves the accuracy of motion estimation. Under this scheme, each macroblock can contain 1, 2, 4, 8 or 16 motion vectors.

In H.264, the encoder is allowed to use more than one previous frame for motion estimation. This is the so-called multi-frame reference technique. For example, if two or three frames have just been encoded as reference frames, the encoder chooses, for each target macroblock, the frame that gives the better prediction, and indicates for each macroblock which frame was used for prediction.

3. Integer transform of 4 × 4 blocks

Like earlier standards, H.264 uses block-based transform coding of the residual, but the transform is an integer operation rather than a real-valued one, and its process is basically similar to the DCT. The advantage of this approach is that the encoder and decoder can use transforms and inverse transforms of identical precision, which makes simple fixed-point arithmetic easy to use. In other words, there is no "inverse transform mismatch".

The transform unit is a 4 × 4 block rather than the 8 × 8 block commonly used in the past. Because the transform block is smaller, moving objects are partitioned more precisely, so not only is the transform computation smaller, but the errors at the edges of moving objects are also greatly reduced. To prevent the small-block transform from producing gray-level differences between blocks in larger smooth areas of the image, a second 4 × 4 block transform is applied to the DC coefficients of the 16 4 × 4 luminance blocks of an intra macroblock (one per block, 16 in total), and a 2 × 2 block transform is applied to the DC coefficients of the four 4 × 4 chrominance blocks (one per block, four in total).
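The claim that an integer transform avoids inverse-transform mismatch can be checked with a small sketch. The 4 × 4 core matrix below is the one used in the published standard, which is an assumption here because the article does not print it. The inverse shown is only a mathematical illustration using exact fractions; a real decoder folds the scaling factors into inverse quantization instead, as the article notes in its discussion of normalization.

```python
from fractions import Fraction

# Assumed 4x4 integer core transform matrix (rows are mutually orthogonal).
C = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]
# Squared norms of the rows of C, i.e. the diagonal of C * C^T.
NORMS = [4, 10, 4, 10]


def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Y = C * X * C^T, computed with integer additions and multiplications only."""
    return matmul(matmul(C, block), transpose(C))


def inverse_transform(coeffs):
    """Exact inverse X = C^T * (Y scaled by 1 / (norm_i * norm_j)) * C."""
    scaled = [[Fraction(coeffs[i][j], NORMS[i] * NORMS[j]) for j in range(4)]
              for i in range(4)]
    result = matmul(matmul(transpose(C), scaled), C)
    return [[int(v) for v in row] for row in result]


residual = [
    [5, 11, 8, 10],
    [9, 8, 4, 12],
    [1, 10, 11, 4],
    [19, 6, 15, 7],
]
coeffs = forward_transform(residual)
print(inverse_transform(coeffs) == residual)
```

Because every step is exact, encoder and decoder reach the same reconstruction bit for bit, which a floating-point DCT implemented independently on two platforms cannot guarantee.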
To improve its rate control capability, H.264 keeps the change between successive quantization step sizes at about 12.5%, rather than changing the step size by a constant increment. Normalization of the transform coefficient magnitudes is handled in the inverse quantization process to reduce computational complexity. To emphasize color fidelity, a smaller quantization step size is used for the chroma coefficients.

4. Unified VLC

H.264 has two entropy coding methods. One applies a unified VLC (UVLC: Universal VLC) to all symbols to be coded; the other is context-adaptive binary arithmetic coding (CABAC: Context-Adaptive Binary Arithmetic Coding). CABAC is optional; its coding performance is somewhat better than that of UVLC, but its computational complexity is also higher. UVLC uses a set of codewords of unlimited length with a very regular structure, so the same code table can be used to encode different objects. Codewords are easy to generate with this method, the decoder can easily recognize a codeword's prefix, and UVLC can resynchronize quickly when a bit error occurs.

5. Intra prediction

The earlier H.26x and MPEG-x series of standards all use inter-frame prediction. In H.264, intra prediction is available when coding Intra pictures. For each 4 × 4 block (apart from the special handling of blocks at the edges), each pixel can be predicted as a differently weighted sum of the 17 closest previously coded pixels (some of the weights may be 0), that is, the 17 pixels to the upper left of the block containing the pixel. Clearly this intra prediction is a predictive coding algorithm in the spatial domain rather than in time; it removes spatial redundancy between adjacent blocks and yields more effective compression. As shown in Figure 4, a, b, ..., p in the 4 × 4 box are the 16 pixels to be predicted, and A, B, ..., P are pixels that have already been coded.
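As a rough illustration of this kind of spatial prediction, the sketch below implements only the averaging form that the article quotes for this pixel layout, (A + B + C + D + I + J + K + L) / 8. It assumes that A to D are the four coded pixels directly above the block and I to L the four directly to its left; Figure 4 is not reproduced here, so that layout, the function names and the use of floor division are all assumptions.

```python
def dc_predict_4x4(above, left):
    """Predict every pixel of a 4x4 block as the mean of eight coded neighbours.

    above: assumed pixels A, B, C, D in the row above the block
    left:  assumed pixels I, J, K, L in the column left of the block
    """
    if len(above) != 4 or len(left) != 4:
        raise ValueError("expected four pixels above and four to the left")
    dc = (sum(above) + sum(left)) // 8  # floor division, as the formula is written
    return [[dc] * 4 for _ in range(4)]


def residual(block, prediction):
    """Difference between the actual block and its prediction; this is what gets transformed."""
    return [[block[r][c] - prediction[r][c] for c in range(4)] for r in range(4)]


above = [100, 102, 104, 106]
left = [98, 100, 102, 104]
block = [
    [101, 102, 103, 104],
    [101, 102, 103, 104],
    [102, 103, 104, 105],
    [102, 103, 104, 105],
]
prediction = dc_predict_4x4(above, left)
print(residual(block, prediction))
```

In a smooth area the residual values stay close to zero, which is why the prediction step makes the subsequent transform and entropy coding cheaper.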
For example, the value of pixel m can be predicted by the formula (J + 2K + L + 2) / 4, or by the formula (A + B + C + D + I + J + K + L) / 8, and so on. Depending on which reference pixels are selected for prediction, there are 9 different modes for luminance, but only 1 intra prediction mode for chrominance.

6. Support for IP and wireless environments

The H.264 draft contains tools for error resilience, which make the compressed video robust in environments prone to errors and packet loss, such as mobile channels or IP channels.

To withstand transmission errors, temporal synchronization in an H.264 video stream can be achieved by intra-picture refresh, while spatial synchronization is supported by slice structured coding. To make resynchronization after an error easier, a certain number of resynchronization points are also provided within the video data of a picture. In addition, intra macroblock refresh and multiple reference macroblocks allow the encoder, when deciding the macroblock mode, to consider not only coding efficiency but also the characteristics of the transmission channel.

Besides changing the quantization step size to adapt to the channel bit rate, H.264 also frequently uses data partitioning to cope with changes in the channel bit rate. In general, the idea of data partitioning is to generate video data with different priorities in the encoder in order to support quality of service (QoS) in the network. For example, syntax-based data partitioning divides the data of each frame into several parts according to importance, so that the less important information can be discarded when a buffer overflows. A similar temporal data partitioning method can also be used, accomplished by using multiple reference frames in P frames and B frames.
In wireless applications, large changes in the bit rate of the wireless channel can be supported by changing the quantization precision or the spatial/temporal resolution of each frame. In the multicast case, however, it is not possible to require the encoder to respond to each of the varying bit rates. Therefore, unlike the FGS (Fine Granular Scalability) layered coding method used in MPEG-4, which has low efficiency, H.264 uses stream-switching SP frames in place of layered coding.

IV. Performance comparison of H.264

TML-8 is the H.264 test model, and it was used to compare and test the video coding efficiency of H.264. The PSNR figures from the test results clearly show that H.264 is superior to both MPEG-4 (ASP: Advanced Simple Profile) and H.263++ (HLP: High Latency Profile).

The PSNR of H.264 is markedly better than that of MPEG-4 (ASP) and H.263++ (HLP). In comparative tests at six rates, the PSNR of H.264 was on average 2 dB higher than that of MPEG-4 (ASP) and on average 3 dB higher than that of H.263++ (HLP). The six test rates and their conditions were:

32 kbit/s, 10 f/s frame rate, QCIF format;
64 kbit/s, 15 f/s frame rate, QCIF format;
128 kbit/s, 15 f/s frame rate, CIF format;
256 kbit/s, 15 f/s frame rate, QCIF format;
512 kbit/s, 30 f/s frame rate, CIF format;
1024 kbit/s, 30 f/s frame rate, CIF format.
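For readers unfamiliar with the metric, PSNR is derived from the mean squared error between the original and the reconstructed picture. The sketch below is a generic definition for 8-bit samples, not the measurement procedure used in the TML-8 tests; the function name and the flat list representation of a picture are assumptions.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(peak ** 2 / mse)


original = [52, 55, 61, 66, 70, 61, 64, 73]
decoded = [54, 55, 60, 67, 69, 62, 64, 71]
print(round(psnr(original, decoded), 2))
```

Since the scale is logarithmic, a 2 dB gain corresponds to roughly 37% less mean squared error, and a 3 dB gain to roughly half.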

5/01/2009
