As HDTV and other high-definition sources spread, H.264 is showing up more and more often. HD DVD and Blu-ray Disc both use it as one of their standard codecs. Since the second half of 2005, both NVIDIA and ATI have supported hardware-accelerated H.264 decoding and promote it as a headline feature of their video technologies. Portable media players are riding the same wave of high definition and H.264: many manufacturers at home and abroad already support video files encoded this way, and H.264 content is becoming steadily more plentiful online. So what exactly is H.264, and in what ways is it more advanced than older encoders such as RMVB?
What is H.264? H.264, also known as MPEG-4 Part 10, is a high-compression digital video codec standard proposed by the Joint Video Team (JVT), which was formed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Two organizations currently develop video codec standards. One is the ITU-T, which produced H.261, H.263, H.263+ and others; the other is the International Organization for Standardization (ISO), which produced MPEG-1, MPEG-2, MPEG-4 and others. H.264 is the new digital video coding standard developed by the JVT that these two organizations set up together, so it is both ITU-T H.264 and ISO/IEC MPEG-4 Advanced Video Coding (AVC), which forms Part 10 of the MPEG-4 standard. MPEG-4 AVC, MPEG-4 Part 10, and ISO/IEC 14496-10 therefore all refer to H.264.
The biggest advantage of H.264 is its high compression ratio. At the same image quality, H.264 compresses more than twice as much as MPEG-2 and 1.5 to 2 times as much as MPEG-4. For example, if the original file is 88 GB, MPEG-2 compresses it to 3.5 GB, a ratio of 25:1, while H.264 compresses it to 879 MB, a ratio of about 102:1. Why is the ratio so high? The low bit rate plays an important role: compared with compression technologies such as MPEG-2 and MPEG-4 ASP, H.264 greatly reduces users' download time and data traffic charges. It is also worth noting that H.264 still delivers smooth, high-quality images at these high compression ratios.
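The ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, and treating 1 GB as 1024 MB is an assumption (it is the convention under which the 102:1 figure comes out).

```python
# Sizes taken from the example above. Treating 1 GB as 1024 MB is an assumption.
MB_PER_GB = 1024


def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """Return how many times smaller the compressed file is than the original."""
    if compressed_mb <= 0:
        raise ValueError("compressed size must be positive")
    return original_mb / compressed_mb


original_mb = 88 * MB_PER_GB                                   # 88 GB source file
mpeg2_ratio = compression_ratio(original_mb, 3.5 * MB_PER_GB)  # 3.5 GB MPEG-2 file
h264_ratio = compression_ratio(original_mb, 879)               # 879 MB H.264 file

print(f"MPEG-2: about {mpeg2_ratio:.1f}:1")
print(f"H.264:  about {h264_ratio:.1f}:1")
```

Both results land close to the rounded 25:1 and 102:1 figures given in the text.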
Advantages of the H.264 algorithm
H.264 builds on MPEG-4 technology. Its encoding and decoding process consists of five main parts: inter-frame and intra-frame prediction (Estimation), transform and inverse transform (Transform), quantization and inverse quantization (Quantization), the loop filter (Loop Filter), and entropy coding (Entropy Coding). H.264/MPEG-4 AVC (H.264) is the newest and most promising video compression standard since the MPEG-2 video compression standard was issued in 1995. It is the latest international video coding standard developed by the joint team of the ITU-T and ISO/IEC. At the same image quality, it compresses more than twice as efficiently as earlier standards, so H.264 is widely regarded as the most influential standard in the industry.
First, the development history of H.264
Work on H.264 began in 1997 in the ITU Video Coding Experts Group (VCEG), where it was known as H.26L. During the joint work of the ITU and ISO it became known as MPEG-4 Part 10 (MPEG-4 AVC) or H.264 (JVT).
Advanced technical background of H.264
The main goal of the H.264 standard is to provide better image quality than other existing video coding standards at the same bandwidth.
Compared with earlier international standards such as H.263 and MPEG-4, the main advantages of H.264 lie in the following four areas:
1. Each video frame is divided into blocks of pixels, so that the encoding of a frame can be carried out at the block level.
2. Spatial redundancy is exploited: some blocks of the original frame are spatially predicted, transformed, quantized, and entropy coded (variable-length coding).
3. Temporal redundancy between blocks of consecutive frames is exploited, so that only the parts that change from frame to frame are coded. This is done with motion estimation and motion compensation. For a given block, a search is carried out in one or more previously encoded frames to determine its motion vector, which is then used to predict that block during encoding and decoding.
4. The spatial redundancy that remains is exploited by coding the residual block of the video frame, that is, the difference between the source block and its prediction, again using transform, quantization, and entropy coding.
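The motion search and residual described in points 3 and 4 can be sketched as an exhaustive block-matching search. This is an illustrative toy, not the search an H.264 encoder actually uses: the function names, the 4 × 4 block size, the ±2 pixel window, and the sum of absolute differences (SAD) as the matching cost are all assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=4, search_range=2):
    """Try every displacement in the window; return (motion_vector, cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def residual(cur, ref, bx, by, mv, n=4):
    """Difference between the source block and its motion-compensated prediction."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]


# Toy frames: the current frame is the reference frame shifted right by one pixel.
SIZE = 8
ref_frame = [[10 * y + x for x in range(SIZE)] for y in range(SIZE)]
cur_frame = [[ref_frame[y][max(x - 1, 0)] for x in range(SIZE)] for y in range(SIZE)]

mv, cost = full_search(cur_frame, ref_frame, bx=2, by=2)
res = residual(cur_frame, ref_frame, 2, 2, mv)
print("motion vector:", mv, "cost:", cost)
```

Because the content moved one pixel to the right, the best match lies one pixel to the left in the reference frame, and the residual for that block is all zeros, so only the motion vector would need to be coded.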
Features and advantages of H.264
H.264 is the next-generation digital video compression format after MPEG-4, proposed jointly by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU). It keeps the strengths of earlier compression technologies and adds many advantages that they cannot match.
1. Low bit rate (Low Bit Rate): compared with compression technologies such as MPEG-2 and MPEG-4 ASP at the same image quality, H.264 produces only 1/8 of the data of MPEG-2 and 1/3 of the data of MPEG-4. Adopting H.264 therefore greatly reduces users' download time and data traffic charges.
2. High-quality images: H.264 can provide continuous, smooth, high-quality images (DVD quality).
3. Fault tolerance: H.264 provides the tools needed to cope with errors such as packet loss, which occur easily in unstable network environments.
4. Network adaptability: H.264 provides a Network Abstraction Layer, so that H.264 streams can easily be carried over different networks (such as the Internet, CDMA, GPRS, WCDMA, CDMA2000, etc.).
Second, an overview of the H.264 standard
Like earlier standards, H.264 uses a hybrid coding scheme that combines DPCM with transform coding. But it adopts a simple, "back to basics" design that achieves much better compression than H.263++ without a large number of options. It adapts better to a variety of channels, using a "network-friendly" structure and syntax that make errors and packet loss easier to handle. It also targets a broad range of applications, meeting the needs of different bit rates, different resolutions, and different transmission or storage settings.
Technically, it brings together the strengths of earlier standards and draws on the experience gained in setting them. Compared with H.263v2 (H.263+) or the MPEG-4 Simple Profile, H.264 with a comparably well-optimized encoder can save up to 50% of the bit rate at most bit rates. H.264 provides consistently high video quality at all bit rates. It can work in a low-latency mode to suit real-time communication applications such as video conferencing, and it works equally well in applications with no delay constraint, such as video storage and server-based video streaming. H.264 provides the tools needed to deal with packet loss in packet networks and with bit errors in error-prone wireless networks.
At the system level, H.264 introduces a conceptual split between the video coding layer (Video Coding Layer, VCL) and the network abstraction layer (Network Abstraction Layer, NAL). The former is the core compressed representation of the video content; the latter formats that representation for delivery over a particular type of network. This structure makes it easier to packetize information and to control its priority.
Summary: H.264 = excellent + universal
The terminology may be dizzying, but the summary is simple: H.264 is one part of the MPEG-4 standard that has become an internationally recognized standard by virtue of its own superiority and versatility.
5/11/2009
5/08/2009
6: On the MPEG video compression standards and their associated encoding formats
MPEG stands for the Moving Picture Experts Group, an organization devoted to developing international standards in the multimedia field. It was founded in 1988 and brings together some 300 multimedia experts from around the world. The MPEG standards consist of three parts: MPEG video, MPEG audio, and MPEG systems (video and audio synchronization). MPEG compression is designed for moving pictures. The basic approach is to capture and store the first frame within a unit of time, and then to store for the remaining frames only the parts that changed relative to the first frame, which is how the compression is achieved. MPEG compresses between frames, reaching an average compression ratio of up to 50:1, which is relatively high, and it has a unified format with good compatibility.
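The basic idea above, keeping the first frame whole and storing only what changed in later frames, can be sketched as follows. This is a lossless toy on flat lists of pixel values, with hypothetical function names; real MPEG coding adds motion compensation, transforms, and quantization, none of which appear here.

```python
def encode_sequence(frames):
    """Keep the first frame whole; for each later frame keep only
    (pixel_index, new_value) pairs where it differs from the first frame."""
    key = list(frames[0])
    deltas = []
    for frame in frames[1:]:
        changes = [(i, v) for i, (v, k) in enumerate(zip(frame, key)) if v != k]
        deltas.append(changes)
    return key, deltas


def decode_sequence(key, deltas):
    """Rebuild every frame by applying its stored changes to the first frame."""
    frames = [list(key)]
    for changes in deltas:
        frame = list(key)
        for i, v in changes:
            frame[i] = v
        frames.append(frame)
    return frames


# Three 8-pixel frames in which a single bright pixel moves across a flat background.
frames = [
    [10, 10, 10, 10, 10, 10, 10, 10],
    [10, 10, 99, 10, 10, 10, 10, 10],
    [10, 10, 10, 99, 10, 10, 10, 10],
]
key, deltas = encode_sequence(frames)
stored_values = len(key) + sum(2 * len(changes) for changes in deltas)
print("stored values:", stored_values, "instead of", sum(len(f) for f in frames))
```

The saving grows with the number of frames as long as most of each frame stays unchanged, which is the situation this kind of inter-frame compression relies on.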
Among multimedia data compression standards, the MPEG series is the most widely used, including MPEG-1, MPEG-2, MPEG-4 and so on. MPEG-1 (ISO/IEC 11172), put forward in 1992, was the MPEG organization's first international multimedia standard to have a broad impact. Its formal title describes it as a standard for coding moving pictures and associated audio for digital storage media, which shows that MPEG-1 concentrates on the problem of multimedia storage. Thanks to the success of MPEG-1, products built on it, with VCD and MP3 as the best-known examples, quickly became popular around the world.
After the success of MPEG-1, the MPEG organization launched the MPEG-2 standard in 1996 to address multimedia transmission. MPEG-2 is formally titled as a generic coding standard for moving pictures and associated audio. Its most prominent products are digital TV set-top boxes and DVD. MPEG did not stop there: in January 1999 ISO published the first version of the MPEG-4 standard (coding of audio-visual objects), followed by the second version in December 1999. The ISO designation of MPEG-4 is ISO/IEC 14496. MPEG-4 was first proposed in May 1991, work was launched in July 1993, and it became an international standard in January 1999, after about six years of research and discussion.
MPEG-1 targets the coding of moving pictures and sound for digital storage media at data rates of about 1.5 Mbps. Under the MPEG-1 standard, video is compressed to between 1/100 and 1/200 of its original size and audio to 1/6.5. MPEG-1 provides 30 frames per second at a resolution of 352 × 240; with suitable compression settings its quality is close to that of home video (VHS). MPEG-1 allows more than 70 minutes of high-quality video and audio to be stored on one CD-ROM. VCD uses the MPEG-1 standard, which is a video and audio compression standard aimed at home, TV-quality viewing.
MPEG-2 mainly targets the needs of high-definition television (HDTV), with a transmission rate of 10 Mbps; it is compatible with MPEG-1 and suits coding over a range of 1.5 to 60 Mbps or even higher. MPEG-2 provides 30 frames per second at a resolution of 704 × 480, four times the pixel count of MPEG-1's 352 × 240. It suits demanding broadcast and entertainment applications such as DSS satellite broadcasting and DVD, and its resolution is twice that of home video (VHS).
MPEG-4 is a compression standard for moving pictures and speech at very low bit rates, intended for real-time video transmission at rates below 64 kbps; it covers low-bandwidth use and also extends to high-bandwidth use. Compared with the previous two standards, MPEG-4 offers a much broader platform for multimedia data compression. It defines a format and a framework rather than a specific algorithm. It can make full use of a wide range of multimedia technologies, including compression tools and algorithms as well as image synthesis and speech synthesis. MPEG-4 has attracted attention ever since it was proposed; not everyone was clear about its specific goals, but great hopes were placed on it. Its biggest innovation is that it lets users build a system to suit their own application instead of applying one fixed standard. In addition, MPEG-4 integrates as many data types as possible, such as natural and synthetic data, so that content can be represented consistently across different transmission media. With MPEG-4 it becomes possible, for the first time, to build personalized audio-visual systems.
The MPEG-7 standard is known as the "Multimedia Content Description Interface". It provides a standardized description for all types of multimedia information; the description is tied to the content itself and allows users to search quickly and effectively for the information that interests them. It extends the limited capabilities of existing content-identification solutions, in particular by covering more data types. In other words, MPEG-7 specifies a standard set of descriptors for describing different types of multimedia information. The standard was put forward in October 1998.
MPEG-21 is the latest stage in the development of MPEG. It aims to let users make transparent and widespread use of multimedia resources across heterogeneous networks and devices, and its goal is to establish a framework for interactive multimedia. The MPEG-21 technical report paints a picture of a future multimedia environment that supports a variety of applications and in which different users can use and deliver all types of digital content. MPEG-21 can also be described as a technical standard for multimedia content that provides for the management and protection of intellectual property.
5/06/2009
5: H.264: new developments in video coding
The JVT (Joint Video Team) was established in December 2001 in Pattaya, Thailand. It is made up of video coding experts from the ITU-T and the International Organization for Standardization (ISO). Its goal is to develop a new video coding standard that achieves a high compression ratio, high image quality, and good network adaptability. The JVT's work has been accepted by the ITU-T, where the new video coding standard is known as H.264, and by ISO, where it is known as the AVC (Advanced Video Coding) standard and forms Part 10 of MPEG-4. The H.264 standard is divided into three profiles:
Baseline profile (a simple version with a wide range of applications); Main profile (adds a number of technical measures that improve image quality and increase the compression ratio, and can be used for SDTV, HDTV, DVD, etc.); Extended profile (can be used for video streaming over a variety of networks).
H.264 not only saves 50 percent of the bit rate compared with H.263 and MPEG-4, it also has better network support. Its mechanism for packing coded data into IP packets suits packet networks and supports streaming video over the network. H.264 is highly resistant to bit errors and can cope with video transmission over channels with high packet loss and severe interference. It supports layered coding and transmission according to the available network resources, which yields smooth image quality. H.264 adapts to video transmission over different networks and has good network affinity.
1. The H.264 video compression system
The H.264 compression system consists of two parts: the video coding layer (Video Coding Layer, VCL) and the network abstraction layer (Network Abstraction Layer, NAL). The VCL comprises the VCL encoder and the VCL decoder. Its main function is the compression encoding and decoding of video data, and it includes the motion compensation, transform coding, and entropy coding units. The NAL gives the VCL a unified interface that is independent of the network. It is responsible for packaging the video data for transmission over the network and uses a unified data format consisting of a single-byte header followed by multiple bytes of video data, together with framing, logical channel signaling, timing information, and end-of-sequence signals. The header contains a storage flag and a type flag. The storage flag indicates whether the current data belongs to a frame that is used for reference. The type flag indicates the type of the image data. The VCL can adjust its encoding parameters according to the current state of the network.
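The single-byte header described above can be unpacked with a few bit operations. The sketch below assumes the field layout and field names used in the H.264 specification's NAL unit syntax (one forbidden bit, a two-bit `nal_ref_idc`, and a five-bit `nal_unit_type`); the text's "storage flag" is taken to correspond to `nal_ref_idc` and its "type flag" to `nal_unit_type`. Treat that mapping as an assumption and check it against the specification before relying on it.

```python
def parse_nal_header(byte: int) -> dict:
    """Split a one-byte NAL unit header into its three fields.

    Assumed layout, most significant bit first:
      1 bit   forbidden_zero_bit
      2 bits  nal_ref_idc    (0 means the data is not used for reference)
      5 bits  nal_unit_type
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("header must be a single byte")
    return {
        "forbidden_zero_bit": (byte >> 7) & 0x1,
        "nal_ref_idc": (byte >> 5) & 0x3,
        "nal_unit_type": byte & 0x1F,
    }


# Two example header bytes chosen for illustration.
for header in (0x65, 0x01):
    fields = parse_nal_header(header)
    used_for_reference = fields["nal_ref_idc"] != 0
    print(f"0x{header:02X}: {fields}, used for reference: {used_for_reference}")
```

A decoder or a network element can read this one byte to decide how important a unit is without parsing the compressed video data that follows it.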
2. Characteristics of H.264
Like H.261 and H.263, H.264 combines DCT-style transform coding with differential (DPCM) coding, that is, it uses a hybrid coding structure. Within this hybrid framework, H.264 introduces new coding methods that improve coding efficiency and bring it closer to practical applications. H.264 avoids complicated options and aims for a concise, "back to basics" design; it compresses better than H.263++ and adapts to a wide range of channels. It targets a wide range of applications, meets the needs of video applications at different rates and in different settings, and copes well with bit errors and packet loss. The H.264 baseline system was designed to be open and usable without licence fees, and it is well suited to IP and wireless networks, which matters a great deal for multimedia transmission over today's Internet and for multimedia messaging over mobile broadband networks. Although the basic structure of H.264 coding is similar to that of H.261 and H.263, it improves on them in many areas, listed below.
1. Better motion estimation in several forms
High-precision estimation: H.263 uses half-pixel motion estimation, while H.264 goes further and uses 1/4-pixel or 1/8-pixel motion estimation. That is, the displacement of the true motion vector may be expressed in units of 1/4 or 1/8 of a pixel. The more precise the motion vector displacement, the smaller the inter-frame residual, the lower the transmission rate, and therefore the higher the compression ratio. H.264 uses a 6-tap FIR interpolation filter to obtain the values at 1/2-pixel positions. Once the 1/2-pixel values are available, the 1/4-pixel values can be obtained by linear interpolation. For 4:2:0 video, 1/4-pixel accuracy in the luminance signal corresponds to 1/8-pixel motion vectors in the chrominance signal, so the chrominance signal needs 1/8-pixel interpolation.
In theory, each doubling of motion compensation accuracy (for example, from whole-pixel to 1/2-pixel accuracy) yields a coding gain of 0.5 bit/sample. Experiments showed, however, that once motion vector accuracy goes beyond 1/8 pixel there is essentially no further gain, so H.264 uses only the 1/4-pixel motion vector mode rather than 1/8-pixel accuracy.
Multi-mode macroblock partitioning: in the H.264 prediction modes, a macroblock (MB) can be partitioned into blocks of seven different sizes. This flexible, fine-grained partitioning fits the shapes of the objects actually moving in the image, so each macroblock may contain 1, 2, 4, 8, or 16 motion vectors.
Multiple reference frames: in H.264, motion estimation can use several reference frames. The encoder's buffer holds more than one previously coded reference frame; the encoder chooses the reference frame that gives the better coding result and signals which frame was used for prediction. This gives better coding results than using only the most recently coded frame as the prediction frame.
2. Small 4 × 4 integer transform
Earlier video compression coding commonly used an 8 × 8 transform unit. H.264 uses a small 4 × 4 block. Because the transform block is smaller, moving objects are delineated more precisely. The transform requires less computation, and the errors where the edges of moving objects meet are greatly reduced. When the image contains large smooth areas, H.264 avoids the gray-level differences between blocks that a small transform would otherwise cause: it applies a second 4 × 4 transform to the DC coefficients of the 16 4 × 4 blocks of an intra macroblock's luminance data (one DC coefficient per block, 16 in total), and a 2 × 2 transform to the DC coefficients of the four 4 × 4 blocks of chrominance data (one DC coefficient per block, four in total). Compared with H.263, H.264 not only uses a smaller transform block, it also uses integer arithmetic rather than real-number arithmetic, so the transform and inverse transform have the same precision in the encoder and the decoder and there is no inverse-transform mismatch.
3. More accurate intra prediction
In H.264, each pixel of each 4 × 4 block can be intra-predicted from differently weighted sums of the 17 nearest previously encoded pixels.
4. Unified VLC
H.264 has two methods of entropy coding. The first is unified VLC (UVLC: Universal VLC). UVLC uses the same code table for all encoding, and the decoder can easily recognize the prefix of each code word, so UVLC can resynchronize quickly when a bit error occurs. The second is context-adaptive binary arithmetic coding (CABAC: Context Adaptive Binary Arithmetic Coding). Its coding performance is slightly better than that of UVLC, but its complexity is higher.
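The point about integer arithmetic in the 4 × 4 transform can be illustrated with a small sketch. The matrix below is the 4 × 4 integer core transform commonly given for H.264; the function names are hypothetical, and the scaling that a real codec folds into quantization is deliberately left out, so the coefficients here are unscaled.

```python
# Integer core transform matrix commonly given for H.264 (assumed here).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two 4 x 4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T using only integer additions and multiplications."""
    return matmul(matmul(CF, block), transpose(CF))


# A perfectly flat block: all of its energy should end up in the DC coefficient.
flat_block = [[7] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat_block)
for row in coeffs:
    print(row)
```

Every intermediate value is an integer, so an encoder and a decoder that perform the same integer steps obtain identical results. For the flat block only the top-left (DC) coefficient is non-zero, which is why the DC coefficients of neighbouring blocks are worth transforming a second time in smooth areas.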
3. Performance advantages
The coding performance of H.264 was compared with MPEG-4 and H.263++ at the following six test points: 32 kbit/s, 10 frames/s, QCIF; 64 kbit/s, 15 frames/s, QCIF; 128 kbit/s, 15 frames/s, CIF; 256 kbit/s, 15 frames/s, QCIF; 512 kbit/s, 30 frames/s, CIF; 1024 kbit/s, 30 frames/s, CIF. The results show that H.264 has better PSNR performance than MPEG-4 and H.263++: its PSNR is on average 2 dB higher than that of MPEG-4 and on average 3 dB higher than that of H.263++.
Fourth, a new fast motion estimation algorithm
The new fast motion estimation algorithm UMHexagonS (Chinese patent) reduces computation by more than 90% compared with the fast full search algorithm originally used in H.264. Its full name is "Unsymmetrical-Cross Multi-Hexagon Search", and it is a whole-pixel motion estimation algorithm. Because it maintains good rate-distortion performance when coding high-bit-rate image sequences with large motion while keeping computational complexity very low, it has been formally adopted by H.264.
H.264 (MPEG-4 Part 10), developed jointly by the ITU and ISO, may become a unified standard for broadcasting, communications, and storage media (CD, DVD), and it is the most likely candidate to become the standard for broadband interactive new media. China's own source coding standard has not yet been finalized; while the development of H.264 is being followed closely, work on that standard is being stepped up. The H.264 standard raises video compression technology to a higher stage, and providing high-quality video transmission at relatively low bandwidth is a highlight of its applications. The spread of H.264 places higher demands on video terminals, gatekeepers, gateways, MCUs, and similar systems, and it will greatly promote the continuous improvement of video conferencing software and hardware in every respect.
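The comparison above is expressed in PSNR (peak signal-to-noise ratio). For readers unfamiliar with the measure, the sketch below computes it in the usual way for 8-bit samples, as 10·log10(peak² / MSE). The function name and the sample values are made up for illustration and are not taken from the tests cited above.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals have no error
    return 10 * math.log10(peak * peak / mse)


# Hypothetical samples: two of the four decoded values are off by one.
original = [100, 100, 100, 100]
decoded = [101, 99, 100, 100]
print(f"PSNR: {psnr(original, decoded):.2f} dB")

# A gain of g dB corresponds to a mean squared error lower by a factor of 10**(g/10).
mse_factor_2db = 10 ** (2 / 10)
print(f"2 dB gain = MSE lower by a factor of about {mse_factor_2db:.2f}")
```

Because the scale is logarithmic, a gain of 2 to 3 dB at the same bit rate means the mean squared error is roughly 1.6 to 2 times lower.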
The basic level (the simple version of the application of a wide range); The main level (using a number of improve image quality and increase the compression ratio of technical measures can be used for SDTV, HDTV and DVD, etc.); Expansion level (can be used for a variety of network transmission of video streams).
Not only H.263 and H.264 than MPEG-4 of the 50 percent savings rate, but also has better network support. It into IP packets for encoding mechanism is conducive to the packet transmission network to support network streaming video transmission. H.264 has a strong anti-BER characteristics, can be adapted to the high rate of packet loss, a serious interference in the video transmission channel. H.264 support for various network resources under the classification code transmission, and thus obtain a smooth image quality. H.264 can adapt to different video transmission network, the network good affinity.
1, H.264 video compression system Compression standard H.264 video coding system layer (VCL) and network abstraction layer (Network Abstraction Layer, NAL) is composed of two parts. VCL, including VCL and VCL encoder decoder, the main functions of the video data compression encoding and decoding, which includes the motion compensation, transform coding, entropy coding compression unit. NAL is used for the VCL has nothing to do with the network to provide a unified interface, which is responsible for video data package after package to send in the network, it uses a unified data format, including a single byte of the header information, a number of words section with the group of video data frames, logical channel signaling, timing information, the end of signal sequence. Header contains the type of store signs and markers. Store signs used to indicate the current data does not belong to be a reference frame. The type of symbol used to indicate the type of image data. VCL can be transmitted by the network to adjust the current encoding parameters.
2, H.264 characteristics H.264 and H.261, H.263, the DCT transform coding is the increase in the use of DPCM coding of the difference, that is, hybrid coding structure. At the same time, H.264 hybrid coding in the framework of the introduction of a new coding method to improve the coding efficiency, closer to practical application. H.264 is not complicated options, but try to be brief the "return to basics", it is better than H.263 + + the compression performance, but also has to adapt to a wide range of channel capacity. H.264 target a wide range of applications, to meet a variety of different rate, video applications on various occasions, has good anti-error and anti-handling capacity of packet loss. H.264 basic system without the use of copyright, the nature of open, well adapted to IP and wireless networks use the Internet for the current transmission, multimedia messaging, mobile broadband network to transmit information of great significance to all. Although the basic structure of H.264 encoding with H.261, H.263 is similar, but it has made improvements in many areas, are listed below. 1. A variety of better motion estimation High-precision estimates H.263 is used in half pixel is estimated that in the further use of H.264 in the 1 / 4 pixel or 1 / 8 pixel motion estimation. That is, the real movement of the displacement vector may be based on 1 / 4 or 1 / 8 as the basic unit of pixels. Obviously, the motion vector accuracy of the higher displacement, the smaller the residual error frame, the lower the transmission rate, that is, the higher the compression ratio. H.264 is used in the 6-order FIR interpolation filter to obtain 1 / 2 pixel position value. When 1 / 2 pixel values obtained, the 1 / 4 pixel value can be obtained through linear interpolation, For 4:1:1 video format, the brightness signal 1 / 4 pixel accuracy corresponds to the color part of the 1 / 8 pixel motion vectors, it signals the need for color 1 / 8 pixel interpolation operator. 
In theory, each doubling of motion-compensation accuracy (for example from integer-pixel to 1/2-pixel accuracy) yields a coding gain of 0.5 bit/sample. In practice, however, tests showed essentially no further gain at motion vector accuracies of 1/8 pixel and finer, so H.264 uses only the 1/4-pixel motion vector mode and not the 1/8-pixel mode.
Multi-mode macroblock partitioning
In the H.264 prediction modes, a macroblock (MB) can be partitioned in seven different ways. This flexible, fine-grained partitioning fits the shapes of real moving objects in the image more closely, so each macroblock may contain 1, 2, 4, 8 or 16 motion vectors.
Multiple reference frames
In H.264, motion estimation can use several reference frames. The encoder's buffer holds more than one previously coded frame; the encoder selects the reference that gives the better coding result and signals which frame was used for prediction. This can give better results than predicting only from the single frame coded immediately before.
2. Small 4×4 integer transform
Earlier video compression coding commonly used 8×8 transform blocks. H.264 uses a small 4×4 block. Because the transform block is smaller, moving objects are delineated more precisely, the transform requires less computation, and the error at the edges of moving objects is greatly reduced. When the image contains large smooth areas, the small transform size could produce visible gray-level differences between blocks. To avoid this, H.264 can apply a second 4×4 transform to the DC coefficients of the 16 4×4 luminance blocks of an intra macroblock (one DC coefficient per block, 16 in total), and a 2×2 transform to the DC coefficients of the four 4×4 blocks of chrominance data (one per block, four in total). H.264 not only makes the transform block smaller; the transform is also an integer operation rather than a real-number one, so the encoder and decoder compute the transform and inverse transform with identical precision and there is no "inverse transform error".
3. More accurate intra prediction
In H.264, each pixel of a 4×4 block can be intra-predicted from a differently weighted combination of the 17 nearest previously coded pixels.
4. Unified VLC
H.264 has two methods of entropy coding. The first is the unified VLC (UVLC: Universal VLC). UVLC uses a single code table for encoding, and the decoder can easily recognize the codeword prefix, so UVLC resynchronizes quickly after a bit error. The second is context-adaptive binary arithmetic coding (CABAC: Context Adaptive Binary Arithmetic Coding). Its coding performance is slightly better than that of UVLC, but its complexity is higher.
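To illustrate what "a single code table with an easily recognized prefix" means, here is a minimal sketch of the unsigned Exp-Golomb code, the regular variable-length code used for many syntax elements in the final standard. It is offered as an illustration of the idea, not as the exact UVLC bit layout of the drafts, and the function names are invented for the example. Each codeword is a run of zeros, a one, and then as many information bits as there were zeros, so the decoder knows the codeword length as soon as it has counted the leading zeros.

```python
def ue_encode(code_num):
    """Unsigned Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    bits = bin(code_num + 1)[2:]           # binary form of code_num + 1
    return "0" * (len(bits) - 1) + bits    # prefix of zeros, then the value


def ue_decode(bitstring, pos=0):
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":   # count the zero prefix
        zeros += 1
    end = pos + 2 * zeros + 1              # prefix + marker bit + info bits
    value = int(bitstring[pos + zeros:end], 2) - 1
    return value, end


if __name__ == "__main__":
    stream = "".join(ue_encode(n) for n in (3, 0, 7))
    pos, decoded = 0, []
    while pos < len(stream):
        value, pos = ue_decode(stream, pos)
        decoded.append(value)
    print(stream, decoded)
```

Small values get short codewords ("1" for 0, "010" for 1, "011" for 2), and no lookup table is needed on either side.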
3, Performance advantages
The coding performance of H.264 was compared with that of MPEG-4 and H.263++ at the following six test points: 32 kbit/s, 10 f/s, QCIF; 64 kbit/s, 15 f/s, QCIF; 128 kbit/s, 15 f/s, CIF; 256 kbit/s, 15 f/s, QCIF; 512 kbit/s, 30 f/s, CIF; 1024 kbit/s, 30 f/s, CIF. The results show that H.264 has better PSNR than MPEG-4 and H.263++: on average its PSNR is 2 dB higher than that of MPEG-4 and 3 dB higher than that of H.263++.
Fourth, a new fast motion estimation algorithm
The new fast motion estimation algorithm UMHexagonS (a Chinese patent) cuts computation by more than 90% compared with the original fast full search algorithm in H.264. Its full name is "Unsymmetrical-Cross Multi-Hexagon Search", and it is an integer-pixel motion estimation algorithm. Because it keeps computational complexity very low while maintaining good rate-distortion performance when coding high-bit-rate sequences with large motion, it has been formally adopted for H.264.
H.264 (MPEG-4 Part 10), developed jointly by ITU and ISO, may become a unified standard for broadcasting, communications and storage media (CD, DVD), and is the most likely candidate for a new standard for broadband interactive media. China's own source coding standard has not yet been finalized; its developers are following H.264 closely while stepping up work on the national standard.
The H.264 standard takes motion video compression to a higher level. Delivering high-quality video at relatively low bandwidth is the highlight of H.264 applications. The spread of H.264 places higher demands on video terminals, gatekeepers, gateways, MCUs and other systems, and will strongly drive continued improvement of videoconferencing software and hardware.
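The PSNR figures quoted in the performance comparison are computed from the mean squared error between the original and the decoded picture. A minimal sketch for 8-bit samples follows; the function name psnr and the flat-list input are choices made for this example only.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    print(round(psnr([100, 100, 100, 100], [101, 101, 101, 101]), 2))
```

Because the scale is logarithmic, a 2 dB gain at the same bit rate corresponds to a mean squared error roughly 1.6 times smaller, and a 3 dB gain to one about half as large.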
5/03/2009
4:HDTV basics
HDTV basics: H.264 encoding technology
First, the development and applications of H.264
After completing the original H.263 standard, the ITU-T Video Coding Experts Group (VCEG) split its work into two parts. The "short-term" plan aimed to add new features to H.263 and later produced H.263+ and H.263++. The "long-term" plan set out to develop a new standard with twice the coding efficiency of the video coding standards of the time. This plan started in 1997 and produced H.26L (initially called H.263L), the predecessor of H.264. Near the end of 2001, in view of H.26L's superior performance, the ISO/IEC MPEG experts joined VCEG to form the Joint Video Team (JVT), which took over the development of H.26L. The team's stated goal was: "To study new video coding algorithms, with the aim of performing substantially better than the best standards developed so far." The standard was formally adopted as an international standard at the 7th JVT meeting, held in Pattaya, Thailand, in March 2003. Because two organizations developed it jointly, it has two names: in ITU-T it is called H.264, and in ISO/IEC it is called MPEG-4 Part 10, Advanced Video Coding (AVC).
H.264 has a wide range of applications, including video telephony (fixed or mobile), real-time videoconferencing, video surveillance, Internet video streaming and multimedia storage. Internationally, Canada's UB Video has developed a real-time H.26L video communication system based on the TMS320C64x series; at 160 kbit/s it achieves the same image quality as H.263+ at 320 kbit/s.
Another Canadian company, VideoLocus, has implemented a real-time H.264 codec on a P4 platform by adding an FPGA-based hardware expansion card.
Second, H.264 characteristics
The H.264 coding framework is still the MC-DCT structure of earlier standards, that is, a hybrid of motion compensation and transform coding, so it retains some features of its predecessors, such as unrestricted motion vectors and median prediction of motion vectors. The techniques it introduces, however, give H.264 much better performance than earlier standards. Note that this gain does not come from any single technique; it is the combined result of many small improvements.
1. Intra prediction
I-frames are coded by exploiting spatial rather than temporal correlation. Earlier standards exploited only the correlation within a macroblock and ignored the correlation between macroblocks, so the coded data was generally large. To exploit spatial correlation further, H.264 introduces intra prediction. In short, intra prediction predicts the current pixel values from neighboring pixels and then codes the prediction error. Prediction is block-based. For the luminance component (luma), the block size can be 16×16 or 4×4; there are four prediction modes for 16×16 blocks and nine for 4×4 blocks. For the chrominance component (chroma), prediction is applied to the whole 8×8 block, with four prediction modes. Apart from DC prediction, each mode corresponds to prediction in a different direction.
2. Inter-frame prediction
Like earlier standards, H.264 uses motion estimation and motion compensation to remove temporal redundancy, but with the following characteristics:
(1) Variable block size prediction
Block-based motion models assume that all pixels in a block undergo the same translation. Where motion is intense, or at the edges of moving objects, this assumption departs substantially from reality and the prediction error grows; using a smaller block size keeps the assumption valid within the block. Smaller blocks also cause less visible blocking, so in general they improve prediction. H.264 therefore partitions a macroblock in seven ways, each with a different block size and shape, so the encoder can choose the prediction mode that best fits the image content. Compared with prediction using 16×16 blocks only, using blocks of different sizes and shapes saves more than 15% of the bit rate.
(2) Finer prediction accuracy
In H.264 the luma motion vector (MV) has 1/4-pixel accuracy. The chroma MV is derived from the luma MV; because chroma resolution is half that of luma (for 4:2:0), its accuracy is 1/8 pixel, meaning that one unit of chroma MV represents a displacement of 1/8 of the distance between chroma samples. This finer accuracy saves more than 20% of the bit rate compared with integer-pixel accuracy.
(3) Multiple reference frames
H.264 supports prediction from multiple reference frames: more than one (up to 5) previously decoded frame can serve as a reference for motion-compensated prediction of the current frame. This suits video sequences that contain periodic motion.
Multi-reference prediction improves motion estimation (ME) performance and strengthens the error recovery of an H.264 decoder, but it also increases buffer size and codec complexity. H.264 was designed with the rapid progress of semiconductor technology in mind, however, so both burdens should become insignificant in the near future. Compared with a single reference frame, five reference frames save 5 to 10 percent of the bit rate.
(4) Deblocking filter
The deblocking filter removes the blocking artifacts, that is, the jumps in pixel value at block edges, that inverse quantization and inverse transform introduce into the reconstructed image because of prediction error. This first improves subjective image quality and second reduces prediction error. The H.264 deblocking filter also adapts to image content: it smooths only those pixel-value jumps caused by blocking and preserves the edges of real objects, so that edges are not blurred. Unlike earlier deblocking filters, the filtered image is stored in the buffer for inter-frame prediction when required, rather than being used only to improve the subjective quality of the output. In other words the filter sits inside the decoding loop rather than at its output, which is why it is also called a loop filter. Note that intra prediction uses the unfiltered reconstructed image.
3. Integer transform
H.264 applies DCT-based transform coding to the residual of intra or inter prediction.
To avoid the hardware complexity of floating-point arithmetic and, more importantly, the encoder/decoder mismatch caused by rounding errors, the new standard modifies the definition of the DCT so that the transform needs only integer additions, subtractions and shifts. Leaving quantization aside, the decoder output then reproduces the encoder input exactly. The price is a slight loss of compression performance. The transform also works on 4×4 blocks, which helps to reduce blocking. To exploit the spatial correlation of the image further, after the integer DCT has been applied to the chroma prediction residual and to the residual of 16×16 intra prediction, the standard collects the DC coefficients of the 4×4 transform blocks into a 2×2 or 4×4 block and applies a Hadamard transform to it.
4. Entropy coding
For the prediction residual at the slice layer, H.264 has two entropy coding methods: context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC). For data other than the prediction residual, H.264 uses Exp-Golomb codes or CABAC, depending on the encoder settings.
(1) CAVLC
The basic idea of VLC is to give frequent symbols short codewords and infrequent symbols long ones, which minimizes the average code length. CAVLC uses several VLC code tables, each corresponding to a different probability model.
The encoder selects among these tables automatically according to context, such as the number of non-zero coefficients in neighboring blocks or the magnitude of coefficients already coded, so as to match the probability model of the current data as closely as possible. This is what makes the scheme context-adaptive.
(2) CABAC
Arithmetic coding is a highly efficient entropy coding scheme in which the code length assigned to a symbol can be regarded as fractional. Because the coding of each symbol depends on the result of coding the previous ones, it reflects the probability characteristics of the whole source sequence rather than of individual symbols, so it comes closer to the entropy limit of the source and lowers the bit rate. To avoid infinite-precision fractions and to estimate source symbol probabilities, most modern arithmetic coders are implemented as finite state machines; CABAC in H.264 is one example, and the coder in JPEG2000 is another. In CABAC, each time a binary symbol is coded the encoder automatically updates its estimate of the probability model (represented by a "state"), and the next binary symbol is coded with the updated model. The encoder therefore needs no prior knowledge of the source statistics; it estimates them adaptively while coding. Compared with the handful of preset probability models in CAVLC, CABAC is more flexible and achieves better coding performance, with a bit rate about 10% lower.
The features described so far improve the coding performance of H.264. H.264 also has good error resilience and network adaptability; some of the relevant features are described below.
5. SP Slice
The main purpose of the SP slice is switching between different bitstreams. It can also be used for random access, fast forward, rewind and error recovery.
The different bitstreams referred to here are streams produced by coding the same source under different bit-rate constraints. Let A1 be the last frame transmitted from the stream before the switch, and B2 the first frame of the target stream after the switch (assumed to be a P-frame). Because the reference frame of B2 is not available, a direct switch obviously causes severe distortion, and the distortion propagates to later frames. A simple solution is to transmit an intra-coded B2, but I-frames are generally large, so this causes a sharp jump in the transmitted bit rate. Since, by assumption, the two streams come from the same source, the frames before and after the switch must be highly correlated even though the bit rates differ. The encoder can therefore use A1 as the reference frame for inter prediction of B2; the resulting prediction error is the SP slice, and the switch is completed by transmitting the SP slice. Unlike a conventional P-frame, the SP slice is formed from the prediction of A1 and B2 in the transform domain. An SP slice requires that the image B2 reconstructed after the switch be identical to the one obtained by transmitting the target stream directly. Clearly, SP slices are not suitable if the target of the switch is an unrelated stream.
6. Flexible macroblock ordering
Flexible macroblock ordering (FMO) means that the macroblocks of a picture are divided into several groups that are coded independently. The macroblocks of a group need not be consecutive in normal scan order and may be scattered across the picture. If a transmission error makes some group undecodable, the decoder can still recover its macroblocks from correctly decoded neighboring pixels, exploiting the spatial correlation of the image.
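The benefit of scattering macroblocks across groups can be seen in a toy model. The sketch below assumes a checkerboard assignment of macroblocks to two groups and represents each macroblock by a single number (for instance its mean brightness); both simplifications, and the function names, are assumptions for illustration and not the mapping rules or concealment method of the standard. When one group is lost, every lost macroblock still has correctly decoded neighbors to interpolate from.

```python
def checkerboard_groups(mb_width, mb_height):
    """Assign each macroblock to group 0 or 1 in a checkerboard pattern."""
    return [[(x + y) % 2 for x in range(mb_width)] for y in range(mb_height)]


def conceal(frame, groups, lost_group):
    """Replace macroblocks of the lost group by the mean of their received neighbors."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(height):
        for x in range(width):
            if groups[y][x] != lost_group:
                continue
            neighbors = [
                frame[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width
                and groups[ny][nx] != lost_group
            ]
            # With no received neighbor there is nothing to interpolate from.
            out[y][x] = round(sum(neighbors) / len(neighbors)) if neighbors else 0
    return out


if __name__ == "__main__":
    picture = [[10, 20], [30, 40]]
    print(conceal(picture, checkerboard_groups(2, 2), lost_group=0))
```

Had the same macroblocks been lost as one contiguous slice in scan order, the interior of the lost region would have no received neighbors at all.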
Third, H.264 in detail
From the above there is no doubt that H.264 outperforms other standards in compression, including MPEG-4 Part 2. As is well known, the most distinctive feature of MPEG-4 Part 2 is object-based coding. The object is an advanced concept: once objects have been extracted, very high compression ratios can be achieved, but how to extract the objects is a major unsolved problem. A true object extraction algorithm would need human-like intelligence, able to think and to learn like a person, and current technology is nowhere near that. Many methods for extracting objects have been published, but in my view they are stopgaps at best, a small step in the right direction. For this reason the object-based coding of MPEG-4 Part 2 is ahead of its time. ITU-T's VCEG gave up the impractical object concept and developed a video coding standard matched to the current state of technology, H.264 (MPEG-4 Part 10, formerly H.26L). This is valuable, and more importantly it achieves one of the goals of object-based coding in MPEG-4 Part 2: a high compression ratio.
A video signal contains a huge amount of data, and efficient compression must exploit every kind of redundancy. The redundancy in a video sequence falls into two categories. The first is statistical redundancy, which includes: (1) spectral redundancy, the correlation between color components; (2) spatial redundancy; (3) temporal redundancy. Temporal redundancy is the fundamental difference between video compression and still-image compression: video compression relies mainly on temporal redundancy to reach high compression ratios.
The second category is psychovisual redundancy, which arises from the properties of the human visual system (HVS). For example, the eye is less sensitive to high frequencies in the color components than in the luminance component, and it is insensitive to noise in high-frequency (detailed) parts of the image. Video compression algorithms use different methods for each kind of redundancy, but spatial and temporal redundancy are the main targets. Like earlier standards, H.264 uses a hybrid structure that handles spatial and temporal redundancy separately. Spatial redundancy is removed by transform and quantization; a frame coded this way is called an I-frame. Temporal redundancy is removed by inter-frame prediction, that is, motion estimation and compensation; a frame coded this way is called a P-frame or B-frame. Unlike earlier standards, H.264 uses intra prediction when coding an I-frame and then codes the prediction error, which takes full advantage of spatial correlation and improves coding efficiency. The H.264 intra-coding block diagram is shown in the figure (for details see "China Multimedia Video", issue 7). The basic unit of H.264 intra prediction is the 16×16 macroblock. The encoder first forms a prediction of the current macroblock from neighboring pixels of the same frame, then transforms and quantizes the prediction residual, and finally entropy-codes the transformed and quantized result. The entropy-coded output forms the bitstream.
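The first step of that pipeline, forming a prediction from neighboring pixels and keeping only the residual, can be sketched as follows. The example implements three of the block prediction modes (vertical, horizontal and DC) for a square block of any size and picks the one with the smallest sum of absolute differences; the plane mode is left out, and the selection criterion and all function names are assumptions for this illustration rather than the encoder's prescribed behavior.

```python
def predict(mode, top, left):
    """Build an n x n prediction from the row above (top) and the column to the left."""
    n = len(top)
    if mode == "vertical":      # copy the pixels above down each column
        return [list(top) for _ in range(n)]
    if mode == "horizontal":    # copy the pixels on the left across each row
        return [[left[y]] * n for y in range(n)]
    if mode == "dc":            # rounded mean of all neighboring pixels
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")


def best_intra_mode(block, top, left):
    """Return (mode, SAD, residual) for the mode with the smallest prediction error."""
    n = len(block)
    best = None
    for mode in ("vertical", "horizontal", "dc"):
        pred = predict(mode, top, left)
        sad = sum(abs(block[y][x] - pred[y][x]) for y in range(n) for x in range(n))
        if best is None or sad < best[1]:
            best = (mode, sad, pred)
    mode, sad, pred = best
    residual = [[block[y][x] - pred[y][x] for x in range(n)] for y in range(n)]
    return mode, sad, residual


if __name__ == "__main__":
    stripes = [[10, 20, 30, 40] for _ in range(4)]
    print(best_intra_mode(stripes, top=[10, 20, 30, 40], left=[10, 10, 10, 10]))
```

For a block of vertical stripes that continue the row above, vertical prediction leaves an all-zero residual, which is exactly the kind of data the later transform and entropy coding stages compress well.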
Because the decoder can only reconstruct the image from data that has been inverse-quantized and inverse-transformed, the encoder must predict from the same reference as the decoder to keep the two consistent; its reference is therefore also the image reconstructed after inverse quantization and inverse transform. Note that the data used for intra prediction is not passed through the deblocking filter, unlike the reference images used for inter-frame coding.
1, intra prediction
The four luma Intra 16×16 prediction modes are shown in the figure (for details see "China Multimedia Video", issue 7).
Figure: luma Intra 16×16 prediction modes
The four intra prediction modes for the 8×8 chroma components are shown in the figure (for details see "China Multimedia Video", issue 7).
Figure: the four intra prediction modes for 8×8 chroma components
The eight directional intra prediction modes for 4×4 luma blocks are shown in Figure 5.
Figure 5: the eight directional intra prediction modes for 4×4 luma blocks
2, transform and quantization
Subtracting the prediction from the current pixel values gives the prediction residual. The residual still contains spatial redundancy; to remove it, transform coding is normally used, that is, the three steps of transform, quantization and entropy coding. The transform itself does not compress the data. It only decorrelates the data, or rather rearranges the redundancy (correlation) into a form that the subsequent entropy coding can exploit. Compression takes place in the quantization and entropy coding steps. To reduce the amount of data further, the encoder quantizes the transform coefficients; the effect is to shrink the range of values and so lower the entropy of each symbol. Quantization loses information, so it is the lossy step of the coding chain, and it is one of the main means of controlling the rate-distortion (RD) behavior of the image. In H.264, transform and quantization are two closely linked steps.
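How quantization shrinks the range of values, and why it is lossy, can be shown with a plain scalar quantizer. The step sizes below are the commonly cited H.264 values (the step doubles for every increase of 6 in the quantization parameter QP) and are not stated in the text above; the real codec uses integer multiplier tables that also absorb the transform scaling, so this floating-point version is only a simplified sketch with invented function names.

```python
# Commonly cited step sizes for QP 0..5; the step doubles every 6 QP values.
QSTEP_BASE = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp):
    """Quantizer step size for a quantization parameter in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


def quantize(coefficient, qp):
    """Map a transform coefficient to an integer level (round to nearest)."""
    level = int(abs(coefficient) / qstep(qp) + 0.5)
    return -level if coefficient < 0 else level


def dequantize(level, qp):
    """Reconstruct an approximate coefficient from its level."""
    return level * qstep(qp)


if __name__ == "__main__":
    for qp in (10, 28, 40):
        level = quantize(100, qp)
        print(qp, qstep(qp), level, dequantize(level, qp))
```

At QP 28 the step is 16, so a coefficient of 100 becomes level 6 and is reconstructed as 96, while a coefficient of 7 becomes 0 and disappears entirely: fewer distinct symbols for the entropy coder, at the cost of an error that cannot be undone.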
Formula: H.264 integer DCT transform
Formula: H.264 inverse DCT transform
The transform most commonly used in image coding is the DCT, because under certain conditions it approximates the theoretically optimal KL transform. Using the DCT directly as defined, however, brings two problems. First, it requires floating-point arithmetic, which complicates system design. Second, the transform kernel consists of irrational numbers, which finite-precision floating point cannot represent exactly, and floating-point operations may add rounding errors. In a concrete implementation this leads to encoder/decoder mismatch: the output of the inverse transform differs from the input of the forward transform. To overcome these problems H.264 uses an integer DCT that needs only additions, subtractions and shifts. This reduces design complexity and avoids codec mismatch, at a minimal cost in coding performance. Note that this transform is no longer a true DCT; it is still called a DCT only because it is derived from the DCT and to distinguish it from the other transform (the Hadamard transform).
The transform and quantization process in the H.264 encoder is shown in a figure in issue 7 of the magazine. The input of the diagram is the prediction residual and the output is the data prepared for entropy coding, five kinds in total. To exploit spatial redundancy further, in Intra 16×16 prediction mode, after each of the 16 4×4 blocks of the 16×16 luma component has been DCT-transformed, H.264 extracts the DC coefficient of each 4×4 block (not yet quantized) to form a 4×4 luma DC block and applies a further 4×4 Hadamard transform to it.
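The integer-only property is easy to check numerically. The sketch below uses the 4×4 core matrix commonly given for the H.264 forward transform (it is not printed in the text above) and computes Y = C X Cᵀ with integer arithmetic only. The inverse shown is an exact rational inverse, made possible because the rows of C are orthogonal with squared norms 4, 10, 4 and 10; it is there to demonstrate that no information is lost, and it is not the standard's own inverse, which folds these scale factors into quantization. All function names are invented for the example.

```python
from fractions import Fraction

# Commonly cited 4x4 forward core matrix of H.264 (an assumption of this sketch).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]
ROW_NORMS = [4, 10, 4, 10]  # CF * CF^T is diag(4, 10, 4, 10)


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core(block):
    """Integer forward transform Y = CF * X * CF^T (no scaling, no quantization)."""
    return matmul(matmul(CF, block), transpose(CF))


def inverse_exact(coefficients):
    """Exact inverse X = CF^-1 * Y * (CF^-1)^T using rational arithmetic."""
    cf_inv = [[Fraction(CF[j][i], ROW_NORMS[j]) for j in range(4)] for i in range(4)]
    x = matmul(matmul(cf_inv, coefficients), transpose(cf_inv))
    return [[int(v) for v in row] for row in x]


if __name__ == "__main__":
    residual = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
    coefficients = forward_core(residual)
    print(coefficients[0][0], inverse_exact(coefficients) == residual)
```

A flat block of ones transforms to a single DC coefficient of 16 with every other coefficient zero, which shows how the transform concentrates the energy of smooth areas into one value.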
Similarly, after the four 4×4 blocks of each 8×8 chroma component have been DCT-transformed, the DC coefficient of each 4×4 block is extracted to form a 2×2 chroma DC block, to which a 2×2 Hadamard transform is applied, as shown in Figure 7. The numbers in the figure give the order of the blocks in the bitstream. The additional 4th-order (4×4) Hadamard transform of the luma DC coefficients and the additional 2nd-order (2×2) Hadamard transform of the chroma DC coefficients are shown in the figure (for details see "China Multimedia Video", issue 7).
In the decoder diagram, the input is the result of entropy decoding (CAVLC or CABAC). The output data is added to the prediction to give the reconstructed image, which is used for intra prediction, or passed through the deblocking filter for display and, where needed, stored in the buffer for inter-frame prediction. Note that for the DC coefficients (both Intra 16×16 luma DC and chroma DC) the decoder performs the inverse transform first and the inverse quantization afterwards; the reason is explained later. MUX denotes reassembling the DC coefficients with the AC coefficients, in the layout of Figure 8, into complete 4×4 blocks for the subsequent inverse DCT.
At present the main drawback of H.264 is its complexity. With continued progress in technology, especially semiconductor technology, chip processing power and memory capacity will improve greatly, so H.264 is sure to thrive and is gradually becoming the mainstream of the market.
5/02/2009
3:Video compression standard MPEG-4, H.264
Video compression standard MPEG-4, H.264
I would like to share some notes on the H.264 and MPEG-4 standards, and pass on a little experience to newcomers working on set-top boxes. Here I explain the video compression standards as they relate to the STB.
The MPEG-4 standard supports seven new features, which fall roughly into three categories: content-based interactivity, high compression, and flexible access. They are introduced below.
1. Content-based interactivity
(1) Content-based manipulation and bitstream editing: content-based manipulation and bitstream editing can be carried out without re-encoding. For example, a user can select a specific object in the image or bitstream (such as a person or a building) and then change some of its characteristics.
(2) Hybrid coding of natural and synthetic data: provides an effective way to combine natural video images with synthetic data (text, graphics) while supporting interactive operation.
(3) Improved temporal random access: MPEG-4 provides an effective means of random access. Within a limited time interval, a frame or an arbitrarily shaped object can be accessed at random from an audio-visual sequence. One example is a "fast-forward" search that targets a particular sound or video object in a sequence.
2. High compression
(1) Improved coding efficiency: at bit rates comparable to existing or emerging standards, MPEG-4 provides better subjective visual quality. This feature is expected to find application in rapidly growing mobile communication networks, but note that improving coding efficiency is not the sole main objective of MPEG-4.
(2) Coding of multiple concurrent data streams: MPEG-4 provides efficient multi-view coding of a scene, together with efficient coding of multi-channel sound and audio-visual synchronization. In stereoscopic video applications, MPEG-4 exploits the information redundancy that arises from observing the same scene from multiple viewpoints; given enough viewpoints, this feature can describe a three-dimensional natural scene effectively.
3. Flexible access (Universal access)
(1) Robustness in error-prone environments: "flexible" means allowing access over a variety of wired and wireless networks and storage media. MPEG-4 improves error robustness, particularly for low-bit-rate applications in severely error-prone environments (mobile communication links). Note that MPEG-4 is the first standard to take channel characteristics into account in its audio and video representation. It is not intended to replace the error control provided by the communication network, but to add resilience against residual errors, for example by selective forward error correction, error containment or error concealment.
(2) Content-based scalability: content scalability means assigning a priority to each object in the image, with more important objects represented at higher spatial and temporal resolution. Content-based scalability is the core of MPEG-4, because once the objects contained in the image and their priorities have been identified, the other content-based features are easier to achieve. For very low bit rate applications scalability is a key factor, because it provides the ability to adapt to the available resources.
For example, a user can ask that the highest-priority object be shown at acceptable quality, that second-priority objects be shown at lower quality, and that the remaining content (objects) not be shown at all. Evidently this approach makes the most effective use of limited resources.
The H.264 standard in detail:
The JVT (Joint Video Team) was established in Pattaya, Thailand, in December 2001. It is made up of video coding experts from ITU-T and ISO. The goal of the JVT is to develop a new video coding standard that achieves a high compression ratio, high image quality and good network adaptability. The work of the JVT has been accepted by ITU-T, where the new video compression standard is called H.264, and by ISO, where it is called AVC (Advanced Video Coding) and forms Part 10 of MPEG-4.
The H.264 standard is divided into three profiles:
The Baseline profile (a simple version with a wide range of applications);
The Main profile (adds a number of technical measures that improve image quality and compression ratio; usable for SDTV, HDTV, DVD and so on);
The Extended profile (usable for streaming video over various networks).
H.264 not only saves about 50% of the bit rate compared with H.263 and MPEG-4, it also has better network support. Its mechanism for coding into IP packets suits packet-switched networks and supports streaming video. H.264 is highly resistant to bit errors and can cope with video transmission over channels with high packet loss and severe interference. H.264 supports layered coding and transmission for different network resources and so delivers steady image quality. H.264 adapts to video transmission over different networks and is network-friendly.
First, the H.264 video compression system

The H.264 compression standard consists of two parts: the video coding layer (VCL) and the network abstraction layer (NAL). The VCL comprises the VCL encoder and the VCL decoder; its main function is compression encoding and decoding of video data, and it includes the motion compensation, transform coding, and entropy coding units. The NAL gives the VCL a unified, network-independent interface. It is responsible for packaging the video data for transmission over the network, using a unified data format that includes a single byte of header information, a number of bytes of video data, and framing, logical channel signaling, timing information, and end-of-sequence signals. The header contains a storage flag and a type flag. The storage flag indicates whether or not the current data belongs to a frame that is used for reference. The type flag indicates the type of the image data. The VCL can adjust its coding parameters according to the current network conditions.

Second, characteristics of H.264

Like H.261 and H.263, H.264 uses a hybrid coding structure: DCT transform coding combined with DPCM differential coding. At the same time, H.264 introduces new coding methods within the hybrid framework to improve coding efficiency and bring it closer to practical application. H.264 avoids a proliferation of options and aims for a concise "return to basics" design; it achieves better compression performance than H.263++ and adapts to a wide range of channels. H.264 targets a wide range of applications, meets the needs of video applications at various bit rates and in various settings, and handles errors and packet loss well.
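The single-byte NAL header mentioned above can be illustrated with a small parser. This is only a sketch, and it assumes the bit layout of the published H.264 standard (1-bit forbidden_zero_bit, 2-bit nal_ref_idc, 5-bit nal_unit_type) rather than the draft layout the text describes; the function and class names are hypothetical.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # must be 0 in a valid stream
    nal_ref_idc: int         # 0 means the data is not used for reference
    nal_unit_type: int       # e.g. 1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the one-byte NAL header into its three fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("NAL header must be a single byte")
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x1,
        nal_ref_idc=(first_byte >> 5) & 0x3,
        nal_unit_type=first_byte & 0x1F,
    )


# 0x65 = 0b0_11_00101: a reference picture carrying an IDR slice.
print(parse_nal_header(0x65))
```

Here nal_ref_idc plays roughly the role of the storage flag and nal_unit_type the role of the type flag.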
The baseline system of H.264 can be used without copyright fees and is open in nature. It adapts well to IP and wireless networks, which matters greatly for carrying multimedia information over today's Internet and broadband information over mobile networks. Although the basic structure of the H.264 encoder is similar to that of H.261 and H.263, it improves on them in many areas, listed below.

1. More varied and better motion estimation

High-precision estimation. H.263 uses half-pixel motion estimation; H.264 goes further and uses 1/4-pixel or 1/8-pixel motion estimation. That is, the true motion displacement may be expressed in units of 1/4 or 1/8 pixel. Clearly, the more precise the motion vector, the smaller the inter-frame residual, the lower the transmitted bit rate, and hence the higher the compression ratio.

H.264 uses a 6-tap FIR filter to interpolate the values at 1/2-pixel positions. Once the 1/2-pixel values are available, the 1/4-pixel values can be obtained by linear interpolation. For the 4:1:1 video format, 1/4-pixel accuracy in the luminance signal corresponds to 1/8-pixel motion vectors in the chrominance signal, so the chrominance signal needs 1/8-pixel interpolation.

In theory, each doubling of motion compensation accuracy (for example, from integer-pixel to 1/2-pixel accuracy) yields a coding gain of 0.5 bit/sample. In practice, however, verification showed that once motion vector accuracy goes beyond 1/8 pixel there is essentially no further gain, so H.264 uses only the 1/4-pixel motion vector mode and not 1/8-pixel accuracy.
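The two-step interpolation described above (a 6-tap FIR filter for half-pixel positions, then linear interpolation for quarter-pixel positions) can be sketched in one dimension. The text does not list the filter taps; the values (1, -5, 20, 20, -5, 1) / 32 are those of the published standard and are an assumption here, as are the function names and the edge-replication rule.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # assumed taps; they sum to 32


def clip_pixel(value: int) -> int:
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(samples: list[int], i: int) -> int:
    """Half-pixel value between samples[i] and samples[i + 1]."""
    acc = 0
    for k, tap in enumerate(TAPS):
        # Replicate the border sample when the filter window leaves the row.
        idx = min(max(i - 2 + k, 0), len(samples) - 1)
        acc += tap * samples[idx]
    return clip_pixel((acc + 16) >> 5)  # round, then divide by 32


def quarter_pel(samples: list[int], i: int) -> int:
    """Quarter-pixel value: rounded average of samples[i] and the half-pixel value to its right."""
    return (samples[i] + half_pel(samples, i) + 1) >> 1


row = [0, 10, 20, 30, 40, 50, 60, 70]
print(half_pel(row, 3), quarter_pel(row, 3))  # 35 33
```

On this linear ramp the half-pixel value falls midway between 30 and 40, which is what a well-behaved interpolator should produce.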
Multi-mode macroblock partitioning. In the prediction modes of H.264, a macroblock (MB) can be partitioned in seven different modes with different block sizes. This flexible, fine-grained partitioning matches the shapes of real moving objects in the image much better, so each macroblock may contain 1, 2, 4, 8, or 16 motion vectors.

Multiple reference frames. In H.264, motion estimation can use multiple reference frames. That is, the encoder's buffer holds several previously coded reference frames; the encoder chooses the one that gives the best coding result and signals which frame was used for prediction. This can give a better result than predicting only from the single most recently coded frame.

2. Small 4 × 4 integer transform

Earlier video compression standards used 8 × 8 as the transform unit. H.264 uses a small 4 × 4 block. Because the transform block is smaller, moving objects are delineated more precisely. As a result, the amount of computation in the transform is smaller and the transition errors at the edges of moving objects are greatly reduced. When the image contains large smooth areas, to avoid gray-level differences between blocks caused by the small transform size, H.264 applies a second 4 × 4 transform to the DC coefficients of the 16 4 × 4 luminance blocks of an intra macroblock (one per block, 16 in total), and a 2 × 2 transform to the DC coefficients of the four 4 × 4 chrominance blocks (one per block, four in total). H.264 not only uses a smaller transform block; the transform is also an integer operation rather than a real-number operation, so the transform and inverse transform in the encoder and decoder have exactly the same precision and there is no "inverse transform error."

3. More accurate intra prediction

In H.264, each pixel of each 4 × 4 block can be intra-predicted from a differently weighted sum of the 17 nearest previously coded pixels.

4. Unified VLC

H.264 has two methods of entropy coding. The first is unified VLC (UVLC: Universal VLC). UVLC uses a single code table for encoding, and the decoder can easily identify the codeword prefix, so UVLC resynchronizes quickly when a bit error occurs. The second is context-adaptive binary arithmetic coding (CABAC: Context Adaptive Binary Arithmetic Coding). Its coding performance is slightly better than that of UVLC, but its complexity is higher.

Third, performance advantages

The coding performance of H.264 was compared with MPEG-4 and H.263++ at the following six test rates: 32 kbit/s, 10 f/s, QCIF; 64 kbit/s, 15 f/s, QCIF; 128 kbit/s, 15 f/s, CIF; 256 kbit/s, 15 f/s, QCIF; 512 kbit/s, 30 f/s, CIF; 1024 kbit/s, 30 f/s, CIF. The test results show that H.264 has better PSNR than MPEG-4 and H.263++: on average its PSNR is 2 dB higher than that of MPEG-4 and 3 dB higher than that of H.263++.

Fourth, the new fast motion estimation algorithm

UMHexagonS (a Chinese patent) is a new fast motion estimation algorithm that saves more than 90% of the computation compared with the original fast full search algorithm in H.264. Its full name is "Unsymmetrical-Cross Multi-Hexagon Search", and it is an integer-pixel motion estimation algorithm. Because it keeps good rate-distortion performance when coding high-bit-rate image sequences with large motion while having very low computational complexity, it has been formally adopted by the H.264 standard.

The H.264 (MPEG-4 Part 10) standard jointly developed by the ITU and ISO may become a unified standard across broadcasting, communications, and storage media (CD, DVD), and is the most likely candidate for a broadband interactive new-media standard. China has not yet formulated its own source coding standard; it is following the development of H.264 closely while stepping up work on its own source coding standard.
The H.264 standard raises motion video compression technology to a higher level. Providing high-quality video transmission at relatively low bandwidth is a highlight of H.264 applications. The spread of H.264 places higher demands on the systems that make up video conferencing, such as video terminals, gatekeepers, gateways, and MCUs, and will greatly promote continuous improvement of video conferencing software and hardware in every respect.
5/01/2009
2:Basic knowledge of H.264 video coding
First, the development of video coding technology

The development of video coding technology is essentially the history of two families of international standards: the MPEG-x series developed by ISO/IEC and the H.26x series developed by the ITU-T. From the H.261 recommendation through H.262/H.263 and MPEG-1/2/4, all pursue a common goal: the best possible image quality at the lowest possible bit rate (or storage capacity). Moreover, as market demand for image transmission grows, the problem of adapting to the transmission characteristics of different channels has become increasingly apparent. The two international standards organizations, ISO/IEC and the ITU-T, therefore jointly developed the new video standard H.264 to solve these problems.

H.261 was the earliest video coding recommendation; its purpose was to standardize video coding for video conferencing and video telephony over ISDN. It uses a hybrid algorithm that combines inter-frame prediction, which reduces temporal redundancy, with DCT transform coding, which reduces spatial redundancy. To match the ISDN channel, its output rate is p × 64 kbit/s. When p is small, only low-resolution images of modest quality can be transmitted, which suits face-to-face video telephony; when p is larger (for example, p > 6), conference video with better definition can be transmitted.

H.263 was proposed as a low-bit-rate image compression standard; technically it improves and extends H.261 and supports applications at bit rates below 64 kbit/s. In practice, however, H.263 and the later H.263+ and H.263++ have developed into recommendations that support applications at all bit rates, as can be seen from the many image formats they support, such as Sub-QCIF, QCIF, CIF, 4CIF, and even 16CIF.

The MPEG-1 standard targets a bit rate of about 1.2 Mbit/s and can deliver 30 frames per second at CIF (352 × 288) image quality; it was designed for video storage and playback on CD-ROM.
The basic algorithm of the video coding part of MPEG-1 is similar to that of H.261/H.263: it uses motion-compensated inter-frame prediction, two-dimensional DCT, and VLC with run-length coding. In addition, it introduces the concepts of intra frames (I), predicted frames (P), bidirectionally predicted frames (B), and DC frames (D) to further improve coding efficiency.

Building on MPEG-1, the MPEG-2 standard makes improvements in image resolution and in compatibility with digital television. For example, its motion vectors have half-pixel accuracy; coding operations such as motion estimation and DCT distinguish between "frame" and "field"; and it introduces scalable coding techniques such as spatial scalability, temporal scalability, and signal-to-noise-ratio scalability.

The MPEG-4 standard introduced in recent years adds coding based on audio-visual objects (AVO: Audio-Visual Object), which greatly improves the interactivity and coding efficiency of video communication. MPEG-4 also adopts some new techniques, such as shape coding, adaptive DCT, and coding of arbitrarily shaped video objects. However, the basic MPEG-4 video encoder still belongs to the same class of hybrid encoders as H.263.

In short, the H.261 recommendation is the classic of video coding; H.263 is its development and is gradually replacing it in practice, mainly in communications, but the many options of H.263 often leave users at a loss. The MPEG family of standards has evolved from applications aimed at storage media to applications suited to transmission media. The basic framework of its core video coding is the same as that of H.261, and the much-noticed "object-based coding" part of MPEG-4 is still hard to apply widely because of remaining technical obstacles.
The new video coding recommendation H.264, developed on this basis, therefore overcomes the weaknesses of both: it introduces new coding methods within the hybrid coding framework, improves coding efficiency, and is aimed at practical application. Since it was developed jointly by the two international standards organizations, its application prospects should be self-evident.

Second, introduction to H.264

H.264 is a new digital video coding standard developed by the Joint Video Team (JVT) of the ITU-T's VCEG (Video Coding Experts Group) and ISO/IEC's MPEG (Moving Picture Experts Group). It is both the ITU-T's H.264 and Part 10 of ISO/IEC's MPEG-4. The call for proposals began in January 1998; the first draft was completed in September 1999; the test model TML-8 was developed in May 2001; the FCD of H.264 was adopted at the 5th JVT meeting in June 2002; and the standard was formally released in March 2003.

Like previous standards, H.264 uses a hybrid coding mode of DPCM plus transform coding. But it adopts a concise "return to basics" design without a multitude of options, and achieves much better compression performance than H.263++. It strengthens adaptability to various channels, using a "network-friendly" structure and syntax that help in handling errors and packet loss. It has broad application targets, meeting the needs of different bit rates, different resolutions, and different transmission (storage) settings. Its basic system is open and can be used without copyright fees.

Technically, the H.264 standard has many highlights, such as unified VLC symbol coding, high-precision multi-mode displacement estimation, an integer transform based on 4 × 4 blocks, and a layered coding syntax. These measures give the H.264 algorithm very high coding efficiency: at the same reconstructed image quality, it saves about 50% of the bit rate compared with H.263.
The H.264 bitstream structure adapts well to networks and adds error recovery capabilities, so it suits IP and wireless network applications.

Third, technical highlights of H.264

1. Layered design

The H.264 algorithm can conceptually be divided into two layers: the video coding layer (VCL: Video Coding Layer), which is responsible for efficient representation of the video content, and the network abstraction layer (NAL: Network Abstraction Layer), which is responsible for packaging and transmitting the data in the manner the network requires. A packet-based interface is defined between the VCL and the NAL; packaging and the corresponding signaling are part of the NAL. In this way, the tasks of high coding efficiency and network friendliness are handled by the VCL and the NAL respectively.

The VCL includes block-based motion-compensated hybrid coding and some new features. As with earlier video coding standards, H.264 does not include pre-processing and post-processing functions in the draft, which makes the standard more flexible.

The NAL is responsible for packaging the data in the format of the underlying network, including framing, logical channel signaling, use of timing information, and end-of-sequence signals. For example, the NAL supports video transmission formats on circuit-switched channels and supports video transmission over the Internet using RTP/UDP/IP. A NAL unit includes its own header information, segment structure information, and the actual payload, that is, the VCL data from the layer above. (If data partitioning is used, the data may consist of several parts.)

2. High-precision, multi-mode motion estimation

H.264 supports motion vectors with 1/4-pixel or 1/8-pixel accuracy. At 1/4-pixel accuracy a 6-tap filter can be used to reduce high-frequency noise; for motion vectors with 1/8-pixel accuracy a more complex 8-tap filter can be used. During motion estimation, the encoder can also select an "enhanced" interpolation filter to improve the prediction.
In H.264 motion prediction, a macroblock (MB) can be divided into different sub-blocks as shown in Figure 2, giving seven modes with different block sizes. This flexible, fine-grained partitioning matches the shapes of real moving objects in the image much better and greatly improves the accuracy of motion estimation. With this scheme, each macroblock can contain 1, 2, 4, 8, or 16 motion vectors.

In H.264 the encoder is allowed to use more than one previous frame for motion estimation; this is the so-called multi-frame reference technique. For example, if two or three frames have just been coded as reference frames, the encoder chooses, for each target macroblock, the frame that gives the better prediction, and indicates for each macroblock which frame was used for prediction.

3. 4 × 4 integer transform

Like earlier standards, H.264 applies block-based transform coding to the residual, but the transform is an integer operation rather than a real-number operation, and the process is basically similar to the DCT. The advantage is that the encoder and decoder can have transforms and inverse transforms of identical precision, and simple fixed-point arithmetic is easy to use. In other words, there is no "inverse transform error." The transform unit is a 4 × 4 block, not the 8 × 8 block commonly used in the past. Because the transform block is smaller, moving objects are delineated more precisely, so not only is the transform computation smaller, but the transition errors at the edges of moving objects are also greatly reduced. To prevent the small transform size from producing gray-level differences between blocks in larger smooth regions, a second 4 × 4 transform is applied to the DC coefficients of the 16 4 × 4 luminance blocks of an intra macroblock (one per block, 16 in total), and a 2 × 2 transform is applied to the DC coefficients of the four 4 × 4 chrominance blocks (one per block, four in total).
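The claim that an integer transform leaves no "inverse transform error" can be checked with a short sketch. The text does not give the transform matrix; the 4 × 4 core matrix below is the one used in the published standard and is an assumption here. The real codec folds the scaling into quantization and uses shift-based arithmetic, whereas this sketch uses exact fractions only to show that the round trip is lossless.

```python
from fractions import Fraction

# Assumed forward core transform matrix (rows are mutually orthogonal).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Y = CF * X * CF^T, using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


def inverse_transform(coeffs):
    """X = CF^T * D^-1 * Y * D^-1 * CF, where D holds the squared row norms (4, 10, 4, 10)."""
    norms = [sum(v * v for v in row) for row in CF]
    scaled = [[Fraction(coeffs[i][j], norms[i] * norms[j]) for j in range(4)]
              for i in range(4)]
    exact = matmul(matmul(transpose(CF), scaled), CF)
    return [[int(v) for v in row] for row in exact]


residual = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
coeffs = forward_transform(residual)
print(coeffs[0][0])                          # DC term: sum of all 16 samples
print(inverse_transform(coeffs) == residual)  # True
```

Because every forward coefficient is an integer, encoder and decoder can agree on it bit for bit, which is the property the text describes.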
To improve rate control in H.264, the quantization step size changes in increments of about 12.5%, rather than by a constant increment. Normalization of the transform coefficient amplitudes is handled in the inverse quantization process to reduce computational complexity. To emphasize color fidelity, a smaller quantization step size is used for the chrominance coefficients.

4. Unified VLC

H.264 has two entropy coding methods. One applies a unified VLC (UVLC: Universal VLC) to all symbols to be coded; the other is context-adaptive binary arithmetic coding (CABAC: Context-Adaptive Binary Arithmetic Coding). CABAC is optional; its coding performance is slightly better than that of UVLC, but its computational complexity is also higher. UVLC uses a codeword set of unlimited length with a very regular structure, so the same code table can encode different objects. Codewords are easy to generate with this method, the decoder can easily identify the codeword prefix, and UVLC resynchronizes quickly when a bit error occurs.

5. Intra prediction

The earlier H.26x and MPEG-x standards all used inter-frame prediction. In H.264, intra prediction can be used when coding intra pictures. For each 4 × 4 block (except for edge blocks, which are handled specially), each pixel can be predicted from a differently weighted sum (some weights may be 0) of the 17 closest previously coded pixels, namely the 17 pixels above and to the left of the block that contains the pixel. Clearly, this intra prediction is a predictive coding algorithm in the spatial domain rather than in time; it removes spatial redundancy between adjacent blocks and achieves more effective compression.

As shown in Figure 4, a, b, ..., p in the 4 × 4 block are the 16 pixels to be predicted, and A, B, ..., P are pixels that have already been coded.
For example, the value of pixel m can be predicted by (J + 2K + L + 2) / 4, or by (A + B + C + D + I + J + K + L) / 8, and so on. Depending on the reference pixels selected for prediction, there are 9 different modes for luminance, but only 1 intra prediction mode for chrominance.

6. For IP and wireless environments

The H.264 draft contains error resilience tools that make the compressed video robust when it is transmitted in environments prone to errors and packet loss, such as mobile channels or IP channels.

To withstand transmission errors, temporal synchronization in an H.264 video stream can be achieved by intra picture refresh, while spatial synchronization is supported by slice structured coding. To help resynchronization after an error, a certain number of resynchronization points are also provided within the video data of a picture. In addition, intra macroblock refresh and multiple reference macroblocks allow the encoder, when deciding the macroblock mode, to consider not only coding efficiency but also the characteristics of the transmission channel.

Besides using changes in the quantization step size to adapt to the channel bit rate, H.264 often uses data partitioning to cope with changes in the channel bit rate. Generally speaking, the idea of data partitioning is that the encoder generates video data with different priorities to support network quality of service (QoS). For example, syntax-based data partitioning divides the data of each frame into several parts according to their importance, so that the less important information can be discarded when a buffer overflows. A similar temporal data partitioning method can also be used, accomplished by using multiple reference frames in P frames and B frames.
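The idea of discarding less important partitions when a buffer overflows can be sketched as follows. This is an illustration only: the `Partition` type, the priority scale, the partition names, and the byte sizes are all hypothetical and are not taken from the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    priority: int    # hypothetical scale: 0 is the most important
    size_bytes: int


def fit_to_buffer(partitions: list[Partition], capacity_bytes: int) -> list[Partition]:
    """Keep partitions in priority order until the buffer is full.

    Stops at the first partition that does not fit, on the assumption that
    lower-priority data is of little use without the data ranked above it.
    """
    kept: list[Partition] = []
    used = 0
    for part in sorted(partitions, key=lambda p: p.priority):
        if used + part.size_bytes > capacity_bytes:
            break
        kept.append(part)
        used += part.size_bytes
    return kept


frame = [
    Partition("inter_residual", 2, 1500),
    Partition("headers_and_motion", 0, 400),
    Partition("intra_residual", 1, 900),
]
print([p.name for p in fit_to_buffer(frame, 1500)])
# ['headers_and_motion', 'intra_residual']
```

With a 1500-byte budget the two most important parts survive and the least important one is dropped, which is the graceful degradation the text has in mind.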
In wireless communication applications, large changes in the bit rate of the wireless channel can be supported by changing the quantization accuracy or the spatial/temporal resolution of each frame. In the multicast case, however, it is impossible to require the encoder to respond to a variety of changing bit rates. Therefore, unlike MPEG-4, which uses the (less efficient) FGS (Fine Granular Scalability) coding method, H.264 uses stream-switching SP frames in place of scalable coding.

Fourth, H.264 performance comparison

TML-8 is the H.264 test model, and it was used to compare and test the video coding efficiency of H.264. The PSNR figures from the tests clearly show that H.264 outperforms MPEG-4 (ASP: Advanced Simple Profile) and H.263++ (HLP: High Latency Profile). Across the six test rates, the PSNR of H.264 is on average 2 dB higher than that of MPEG-4 (ASP) and 3 dB higher than that of H.263++ (HLP). The six rates and their test conditions were: 32 kbit/s, 10 f/s frame rate, QCIF format; 64 kbit/s, 15 f/s frame rate, QCIF format; 128 kbit/s, 15 f/s frame rate, CIF format; 256 kbit/s, 15 f/s frame rate, QCIF format; 512 kbit/s, 30 f/s frame rate, CIF format; 1024 kbit/s, 30 f/s frame rate, CIF format.
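For readers unfamiliar with the metric used in this comparison, PSNR can be computed as below. This is a minimal sketch assuming 8-bit samples (peak value 255) and flat lists of pixel values; the function name and arguments are illustrative and not part of any test model.

```python
import math


def psnr(original: list[int], decoded: list[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


print(round(psnr([100, 100], [101, 99]), 2))  # 48.13
```

Because the scale is logarithmic, a 2 dB gain corresponds to roughly 37% lower mean squared error at the same bit rate, and a 3 dB gain to roughly half.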
